FAQ: The Topic Architecture of DITA

What is a topic?

A topic is a chunk of information organized around a single subject. Structurally, it is a title followed by text and images, optionally organized into sections. Topics can be of many different types, the most common being concepts, tasks, and reference.

Why use topics?

Topics are the basis for high-quality information. They should be short enough to be easily readable, but long enough to make sense on their own.

By organizing your content into topics, you can achieve several goals simultaneously:

  • Your content is readable even when accessed from an index or search, not just when read in sequence as part of a chapter. Since most readers don't read information end-to-end, it's good information design to make sure each unit of information can be read on its own to give just-in-time help.
  • Your content can be organized differently for online and print purposes. You can create task flows and concept hierarchies for online orientation, and still have a print-friendly combined hierarchy that helps people who do want an organized reading flow.
  • Your content can be reused in different collections. Since the topic is written to make sense when accessed randomly (as by search), it should also make sense when included as part of different product deliverables, so you can refactor your information as needed, including just the topics that apply for each reuse scenario.

Topics are small enough to provide many opportunities for reuse, but large enough to be coherently authored and read. DITA also supports reuse below the topic level, but this requires considerably more thought and review, because topics assembled from smaller chunks often need editing to make them flow properly. By contrast, because each topic is already organized around a single subject, you can arrange a set of topics logically and get an acceptable flow between them: transitions from subject to subject don't need to be as seamless as the explanations within a single subject.
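
As a rough sketch of reuse below the topic level, the conref (content reference) attribute pulls an element from one topic into another. The file name warnings.dita and the ids warnings and unplug are invented for this illustration.

```xml
<!-- warnings.dita: a hypothetical topic that holds reusable paragraphs -->
<topic id="warnings">
  <title>Shared warnings</title>
  <body>
    <p id="unplug">Unplug the computer before you open the cover.</p>
  </body>
</topic>
```

```xml
<!-- In any other topic: an empty paragraph that is replaced by the shared one.
     The reference has the form file#topicid/elementid. -->
<p conref="warnings.dita#warnings/unplug"/>
```

The reused paragraph still has to read well in every topic that pulls it in, which is the extra review effort described above.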

What are the key principles of the DITA architecture?

  • Topic orientation: information covering one subject with a specific intent.
  • Topic granularity: discrete, self-contained units accessed independently.
  • Topic sets: deliverables assembled from a pool of available topics.
  • Strong typing: required structures with well-defined semantics for each kind of information, such as a Task with steps.
  • Specialization: extensibility by defining a new type as a special case of an existing type, for example API Reference derived from Reference.
  • Type hierarchy: a single generic Topic type from which all other types are specialized and to which all types can fall back.
  • Reuse: content reused through topic and content references; design and processing reused through specialization.

What is the topic structure in the architecture?

The topic structure has the following major parts:
  1. <topic> is the container for a single unit of typed information. Tightly-related topics may be nested for reuse as a single construct.
  2. <title> provides self-description, consistent with guidelines for authoring.
  3. <titlealts> provides optional title content optimized for search or for navigation.
  4. <shortdesc> provides an abstract-level description of the content of the topic. Shortdesc is optional at the topic level for generality when specializing. For User Assistance, the best practice is to use shortdesc as if it were a required first or only paragraph.
  5. <prolog> contains optional metadata, or information about the document or its content.
  6. <body> is the container for paragraph-level content and any number of non-nesting sections.
  7. <related-links> is an optional container for in-topic link relations. Links can also be maintained separately from topics by using a DITA map to express the linking relationships between topics.
  8. Nested topics of any allowed information type are optional; use them to develop internal hierarchy when necessary.

The DTD declaration for topic has the following structure:

<!ELEMENT topic (title, titlealts?, shortdesc?, prolog?, body,
related-links?, (%info-types;)*)>

which reads, "topic contains a required title, optional titlealts, optional shortdesc, optional prolog, required body, optional related-links, and any number of allowed child topics of various info-types."

A typical DITA topic utilizing all these elements may look like this:

<topic id="example1">
<title>Example topic</title>
<titlealts>
<searchtitle>Typical example DITA topic</searchtitle>
</titlealts>
<shortdesc>A typical DITA topic can be augmented by optional elements.</shortdesc>
<prolog>
<author>John Doe</author>
</prolog>
<body>
<p>The topic element must contain a title and body.</p>
</body>
<related-links>
<link href="required-elements.xml" scope="local">
<linktext>Required elements</linktext>
</link>
<link href="optional-elements.xml" scope="local">
<linktext>Optional elements</linktext>
</link>
</related-links>
<topic id="nest1">
<title>A nested topic</title>
<body>
<p>This topic is a very simple nested topic.</p>
</body>
</topic>
</topic>
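
The two links in the related-links section above could instead be kept outside the topic, in a DITA map. The sketch below assumes the example topic is stored in a hypothetical file named example1.xml; the two target file names are taken from the example.

```xml
<!-- links.ditamap (hypothetical): link relationships held outside the topics -->
<map>
  <reltable>
    <relrow>
      <!-- Topics in one cell are related to the topics in the other cell -->
      <relcell>
        <topicref href="example1.xml"/>
      </relcell>
      <relcell>
        <topicref href="required-elements.xml"/>
        <topicref href="optional-elements.xml"/>
      </relcell>
    </relrow>
  </reltable>
</map>
```

With the links held in the map, the topic itself carries no assumptions about which other topics ship alongside it, so it is easier to reuse in a different collection.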

What do "info-typed" DITA topic examples look like?

A small concept example

A minimal topic describing a field in a dialog might look like this:

<concept id="username">
<title><var>username</var> input field</title>
<shortdesc>Enter your name or the name of the user for whom you are creating a record.</shortdesc>
</concept>

A larger task example

A procedure for installing a hard drive might look like this:

<task id="installstorage">
<title>Installing a hard drive</title>
<shortdesc>You open the box and insert the drive.</shortdesc>
<prolog><metadata>
<audience type="administrator"/>
<keywords>
<indexterm>hard drive</indexterm>
<indexterm>disk drive</indexterm>
</keywords>
</metadata></prolog>
<taskbody>
<steps>
<step><cmd>Unscrew the cover.</cmd>
<stepresult>The drive bay is exposed.</stepresult>
</step>
<step><cmd>Insert the drive into the drive bay.</cmd>
<info>If you feel resistance, try another angle.</info>
</step>
</steps>
</taskbody>
<related-links>
<link href="formatstorage.dita"/>
<link href="installmemory.dita"/>
</related-links>
</task>

What is "progressive disclosure" in a topic?

Because each topic has a title and short description in addition to its full content, applications can provide progressive disclosure. For example, a user can hover over a link to see its short description and then decide whether to follow the link for the rest of the topic. Progressive disclosure also allows topics to be meaningfully browsed in a variety of viewing contexts, whether full-screen browsers, integrated help panes, infopops, or PDA screens. The application can disclose as much information as the context supports, letting the user decide where and how to drill down to more content.
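
To make the hover case concrete, here is one way a link to the hard drive task could be rendered, with the title as link text and the short description as hover text. This XHTML fragment is an assumption about what a stylesheet might emit, not the output of any particular tool, and the file name installstorage.html is hypothetical.

```xml
<!-- Hypothetical generated link: title becomes the link text,
     shortdesc becomes the hover text -->
<a href="installstorage.html"
   title="You open the box and insert the drive.">Installing a hard drive</a>
```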

Can topics be nested?

Topics can be nested to create larger document structures. However, the nesting always occurs outside the content boundary, so that child and parent topics can be easily separated and reused in different contexts. Here is a sample nesting structure:

<topic>
<title>A general topic</title>
<shortdesc>This general topic is pretty general.</shortdesc>
<body><p>General topics are not very specific. They are useful for
the big picture, but they don't get into details in the same way as
more specific topics.</p></body>
<topic>
<title>A specific topic</title>
<shortdesc>This is a more specific topic.</shortdesc>
<body><p>Specifically, this is more specific.</p></body>
</topic>
</topic>

You can author topics either as nested structures or as individual stand-alone documents. In the latter case, you assemble the documents into nested structures as required, such as when delivering printed or printable information that has a part and chapter hierarchy.

The nested structure gives a sequence and hierarchy of topics within a topic collection. In a Web environment you could disassemble this structure into individual topics and preserve the hierarchy in a generated navigation map or table of contents. However, if the Web is the main delivery vehicle, you might want to author the topics as separate documents and then apply several tables of contents to the same collection of topics.
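
As a sketch of several tables of contents over one collection, the two maps below organize the same stand-alone topics in different ways. The map file names, the map titles, and the overview topic storage.dita are made up for illustration; the other file names follow the task example earlier.

```xml
<!-- tasks.ditamap (hypothetical): a flat task flow for online navigation -->
<map title="Upgrading your computer">
  <topicref href="installstorage.dita"/>
  <topicref href="formatstorage.dita"/>
  <topicref href="installmemory.dita"/>
</map>
```

```xml
<!-- guide.ditamap (hypothetical): a nested hierarchy for a printable guide -->
<map title="Hardware guide">
  <topicref href="storage.dita">
    <topicref href="installstorage.dita"/>
    <topicref href="formatstorage.dita"/>
  </topicref>
  <topicref href="installmemory.dita"/>
</map>
```

Neither map changes the topic files; the hierarchy lives entirely in the nesting of the topicref elements.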

What is an information type?

An information type describes a category of topics, such as concepts, tasks, or reference. Typically, different information types support different kinds of content. For example, a task typically has a set of steps, whereas a reference topic has a set of customary sections, such as syntax, properties, and usage.
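
For comparison with the concept and task examples, a reference topic with the customary syntax, properties, and usage sections might be sketched as follows. The copy command and its parameter are invented for this illustration.

```xml
<reference id="copycmd">
  <title>copy command</title>
  <shortdesc>Copies a file to a new location.</shortdesc>
  <refbody>
    <!-- Syntax section -->
    <refsyn>copy source target</refsyn>
    <!-- Properties section: one row per parameter -->
    <properties>
      <property>
        <proptype>source</proptype>
        <propvalue>File path</propvalue>
        <propdesc>The file to copy.</propdesc>
      </property>
    </properties>
    <!-- Usage section -->
    <section>
      <title>Usage</title>
      <p>Run the command from the directory that contains the source file.</p>
    </section>
  </refbody>
</reference>
```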

Why information types?

With information types, you can divide topics into categories that you can manage and keep consistent more easily than without information types. Information types also make it easier for users to find the information that they are looking for: how-to information in a task versus background information in a concept versus detailed specifications in a reference topic.

What is specialization?

Specialization is the process of creating new categories of topics, or information types, as well as new categories of elements, or domain types. You can define these new types using the existing ones as a base. For example, a product group might identify three main types of reference topic - messages, utilities, and APIs - and define three domains - networking, programming, and databases. By creating a specialized topic type for each kind of reference information, and creating a domain type for each kind of subject, the product architect can ensure that each type of topic has the appropriate structures and content. In addition, the specialized topics make XML-aware search more useful, because users can make fine-grained distinctions. For example, a user could search for xyz only in messages or only in APIs, as well as searching for xyz across reference topics in general.

Rules govern how to specialize safely: Each new information type must map to an existing one, and new information types must be more restrictive than the existing one in the content that they allow. With such specialization, new information types can use generic processing streams for translation, print, and Web publishing. Although a product group can override or extend these processes, they get the full range of existing processes by default, without any extra work or maintenance. The DITA specialization articles outline the rules for each kind of specialization (topic type and domain type).
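
The mapping from a new type to an existing one is recorded in each element's class attribute, which lists the element's ancestry from the most general type to the most specific. The sketch below assumes a hypothetical apiref type specialized from reference. The class values are normally supplied as defaults by the DTD rather than typed by authors; they are written out here only to show the mapping.

```xml
<!-- Hypothetical specialized type "apiref", derived from reference -->
<apiref id="openfile"
        class="- topic/topic reference/reference apiref/apiref ">
  <title class="- topic/title ">openFile</title>
  <apibody class="- topic/body reference/refbody apiref/apibody ">
    <section class="- topic/section ">
      <p class="- topic/p ">Opens a file for reading.</p>
    </section>
  </apibody>
</apiref>
```

A generic process that matches on topic/topic or topic/body can handle this document without knowing anything about apiref, which is how specialized types inherit the existing processing by default.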
