Keywords: why, where and when to use them

by Amber Swope of JustSystems

In DITA, the <keyword> element and the elements that are specialized from it contain text that has a unique or key-like value.

This article explains these elements in more detail and proposes guidelines for how to use them.

Keywords and the <keyword> element

DITA provides a variety of keyword elements, each specialized for a particular situation in which you need to introduce a special term to the reader.

The <keyword> element is used to specify any of several common types of terms in DITA. What do the <keyword> element and its specializations have in common? They are designed to contain text "that has a unique or key-like value." [1] By "text with a unique or key-like value," we mean text that has a specific purpose. You designate a term as a keyword when you want to indicate that the term has a special meaning.

For example, if you consistently apply the <wintitle> element for the names of windows in the graphical user interface, you can process the <wintitle> element differently from other text. If the corporate style is to format window names as bold text, then you can specify that the transform apply bold styling to all <wintitle> elements upon processing.
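As a minimal sketch, the markup for such a window name might look like the following. The window name and the sentence are invented for illustration.

```xml
<p>In the <wintitle>Print Setup</wintitle> window, select the printer
that you want to use.</p>
```

Because the window name is tagged rather than manually formatted, a change in the corporate style requires a change only in the transform, not in the topics.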

When used in the metadata of a topic, keywords are "Terms from a controlled or uncontrolled subject vocabulary that apply to the topic." [2] Whether used as metadata or in the body of a topic, "the <keyword> element identifies a keyword or token, such as a single value from an enumerated list, the name of a command or parameter, product name, or a lookup key for a message." [3]

The <keyword> element is a generic element designed to allow easy specialization. For example, the <apiname>, <kwd>, <option>, <parmname>, <cmdname>, <msgnum>, <varname>, <shortcut>, <wintitle>, and <shape> elements are all specializations of the <keyword> element.
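As a rough illustration, a sentence that uses two of these specialized elements might be marked up as follows. The command and option names are made up, and the markup assumes a topic type that includes the software and programming domains.

```xml
<p>Run the <cmdname>backup</cmdname> command with the <option>-v</option>
option to display progress messages.</p>
```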

Why use keywords

The ability to control output processing for specific types of text is one of the main reasons to use the <keyword> element.

Another reason for specifying keywords is to aid in Web content indexing and retrieval. When the DITA Open Toolkit (OT) processes DITA topics and maps to XHTML, it places any <keyword> and <indexterm> elements that appear in the prolog of a topic into the metadata of the generated Web page.

However, because of abuses of Web page metadata by sites attempting to optimize their ranking in search engine results, search engines that are designed for use on the public portions of the Web tend not to trust Web page metadata.

Where to use keywords

You can include keywords in different places in DITA topics. This ability is useful because it gives you flexibility in applying the keywords; however, you need to consider where the keywords are appropriate.

If you have a word or phrase to which you want to apply a semantic identification, make the inline text a keyword by enclosing it in the <keyword> element. The following example shows how to make "DITA" a keyword in a sentence from DITA XML.org:

<p><keyword>DITA</keyword> builds content reuse into the authoring process, defining an XML architecture for designing, writing, managing, and publishing many kinds of information in print and on the Web.</p>

You can also specify keywords in the prolog of a topic, which implies that the keyword applies to the full topic in every context in which the topic appears. The following example shows the XML markup:

<prolog><metadata>
<keywords><keyword>Whitepaper</keyword></keywords>
</metadata></prolog>

Note: You insert the <prolog> element between the closing <shortdesc> tag and the opening tag for the topic body element.

If a keyword applies to a topic but only in a specific context, you can specify the keyword in the DITA map. The following example shows the XML markup for specifying a keyword for a topic in a map:

<topicref href="Topic_a.xml">
<topicmeta><keywords><keyword>XML Whitepapers</keyword></keywords></topicmeta>
</topicref>

When to use keywords

There are several DITA elements that seem to serve the same or similar purposes. For example, when do you specify a word as an index term, term, phrase, or keyword?

<indexterm> or <keyword>?

Although the DITA OT processes <keyword> and <indexterm> elements the same way for Web content, it provides additional processing for <indexterm> elements and compiles them into an alphabetized index for print, help, or another medium that includes an index. In addition, unlike <keyword> elements, <indexterm> elements do not appear in output text. To illustrate the difference in the output, the following example shows the generated text with "DITA" as a keyword:

DITA builds content reuse into the authoring process, defining an XML architecture for designing, writing, managing, and publishing many kinds of information in print and on the Web.

In contrast, the following example shows the generated text with <indexterm>DITA</indexterm>:

builds content reuse into the authoring process, defining an XML architecture for designing, writing, managing, and publishing many kinds of information in print and on the Web.

In the second example, the word "DITA" is missing. To have a word or phrase appear in the text as well as in the index, you must apply both <indexterm> and <keyword>:

<indexterm>DITA</indexterm><keyword>DITA</keyword> builds content reuse into the authoring process, defining an XML architecture for designing, writing, managing, and publishing many kinds of information in print and on the Web.

Although the markup looks redundant when you view the XML, the above example generates the text with "DITA" as a keyword and includes "DITA" in the index.

<term> or <keyword>?

The important distinction between a keyword and a term is the purpose of the text. If you are defining a term in the text, use the <term> element rather than <keyword> for the inline element. "The <term> element identifies words that may have or require extended definitions or explanations." [4] If you consistently apply the <term> element when you are defining words or phrases, you can easily format the text according to your corporate style, such as formatting terms in italics.
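As a minimal sketch, a sentence that defines a word might be marked up as follows. The sentence is invented for illustration.

```xml
<p>A <term>specialization</term> is a new element type that is derived
from an existing DITA element type.</p>
```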

<ph> or <keyword>?

The phrase element, which is written <ph> in DITA markup, and the <keyword> element are valid in almost all of the same places in DITA topics, but they are not interchangeable. The primary difference is the purpose of the elements. If you want to classify text as semantically important, use the <keyword> element. If you simply want to encapsulate text in a generic element for reuse or conditional processing, use the <ph> element.

For example, if you want to reuse a word that does not have semantic meaning, you can enclose it in a <ph> element, assign an ID to that element, and then reference the ID using the conref attribute. Granted, you can achieve the same outcome with the <keyword> element, but the DITA OT also processes the <keyword> element for the Web metadata.
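The reuse pattern described above can be sketched as follows, using the phrase element as it is written in DITA markup, <ph>. The file name reuse.dita, the topic ID "reuse", the element ID "edition", and the sample text are all hypothetical.

```xml
<!-- In a hypothetical file reuse.dita, inside a topic whose id is "reuse":
     the phrase is given an ID so that other topics can refer to it. -->
<p>Install the <ph id="edition">Standard Edition</ph> first.</p>

<!-- In another topic: the empty <ph> element pulls in the same text
     by reference (file name, then topic ID, then element ID). -->
<p>This guide covers the <ph conref="reuse.dita#reuse/edition"/>.</p>
```

When the topics are processed, the referencing <ph> element is resolved to the content of the referenced element, so the text needs to be maintained in only one place.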

Another point to note is that the default CSS file that ships with the DITA OT applies the bold style to keywords. Of course, you can override this formatting with your own CSS file.

Summary

The rule when applying elements is to apply the element that provides the minimum processing that you need. If you do not need the semantic processing that the <keyword> element provides, do not use it; rather, use the element that provides the appropriate processing power and output.

If you are generating XHTML for Web output or need to identify text with semantic distinction, then the <keyword> element provides value. In many other cases, you can achieve the desired processing and output with other elements.

1. DITA Release 1.1 Language Specification, page 71.
2. DITA Release 1.1 Architectural Specification.
3. DITA Release 1.1 Language Specification, page 71.
4. DITA Release 1.1 Language Specification, page 204.

