Topic-based authoring

Topic-based authoring has been a mainstay of technical information development since we first began developing help systems. We learned quickly enough that we couldn't split our existing books into help topics by making every heading level a new help page. Information originally designed with a narrative flow no longer made sense, and it no longer helped users find exactly the content they needed. We had to rethink the type of information that our help systems should include and create a new set of standards for its development. The result is topic-based authoring.

Authoring in topics provides information developers with a way to create distinct modules of information that can stand alone for users. Each topic answers one question: "How do I ...?" "What is ...?" "What went wrong?" Each topic has a title to name its purpose and contains enough content for someone to begin and complete a task, grasp a basic concept, or look up critical reference information. Each topic has a carefully defined set of the basic content units that are required and accommodates other optional content. As information developers learn to author in topics and follow sound authoring guidelines consistently, we gain the ability to offer information written by many different experts that looks and feels the same to the users.
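The idea of required and optional content units can be sketched in a few lines of Python. This is only an illustration: the Topic class and the REQUIRED_UNITS table are hypothetical names, and the units listed are simplified stand-ins rather than the actual DITA content model.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified rules: which content units each
# information type must contain. Not the real DITA content model.
REQUIRED_UNITS = {
    "task": ("title", "steps"),
    "concept": ("title", "body"),
    "reference": ("title", "body"),
}


@dataclass
class Topic:
    info_type: str                      # "task", "concept", or "reference"
    units: dict = field(default_factory=dict)

    def missing_units(self):
        """Return the required content units that are absent or empty."""
        required = REQUIRED_UNITS[self.info_type]
        return [name for name in required if not self.units.get(name)]


task = Topic("task", {
    "title": "Replace the filter",
    "steps": ["Open the cover.", "Swap the filter."],
})
draft = Topic("concept", {"title": "What is a filter?"})

print(task.missing_units())   # []
print(draft.missing_units())  # ['body']
```

A check like this lets many authors work against the same rules, which is what makes their topics look and feel the same to users.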

Not only has topic-based authoring become the norm for well-designed help systems, but information architects have also learned that consistently structured topics improve readability and information access in traditional, more linear book structures. Readers are able to identify task-based topics within sections and chapters because the tasks look the same and contain the same essential content units. Readers learn that conceptual and background information is always located in the same position in the table of contents with respect to the tasks. Readers come to depend upon standard reference sections that contain similarly structured details for ease of lookup.

The core information types in DITA support the structures that underlie most well-designed technical information. Any organization that follows best practices in information architecture will find the core DITA structure a good fit. But the core information types also challenge us to become even more disciplined in structuring information according to a set of carefully defined business rules. The benefit of such disciplined information structuring is a consistent presentation that builds reader confidence and simplifies the reader's task of navigating and using your information.

Benefits of topic-based authoring

Authoring in structured topics provides you with a sophisticated and powerful way to deliver information to your user community. You will find benefits that decrease your development costs and time to market and that increase the value you provide to your customers.

If one of your business goals is to use information topics in multiple deliverables, you need to build a repository of topics that are clearly defined according to your standard set of information types. Your repository is also characterized by the metadata attributes you associate with your topics.

DITA provides you with such a standard as a starting point. DITA gives you the capability to expand upon its core information types when you need to accommodate the special needs of your customers and your information.
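A repository of typed topics with metadata can be pictured as follows. This is a minimal sketch, assuming an in-memory list of records; the field names (audience, product) and the select_topics function are illustrative choices, not a prescribed repository design.

```python
# Hypothetical repository: each record carries an information type
# plus metadata attributes chosen for illustration.
REPOSITORY = [
    {"id": "install", "type": "task", "audience": "admin", "product": "basic"},
    {"id": "backup", "type": "task", "audience": "admin", "product": "pro"},
    {"id": "overview", "type": "concept", "audience": "all", "product": "basic"},
    {"id": "error-codes", "type": "reference", "audience": "all", "product": "basic"},
]


def select_topics(repository, **criteria):
    """Return the ids of topics whose fields match every criterion."""
    return [
        topic["id"]
        for topic in repository
        if all(topic.get(key) == value for key, value in criteria.items())
    ]


# Two deliverables assembled from the same repository.
admin_tasks = select_topics(REPOSITORY, type="task", audience="admin")
basic_guide = select_topics(REPOSITORY, product="basic")

print(admin_tasks)  # ['install', 'backup']
print(basic_guide)  # ['install', 'overview', 'error-codes']
```

The same topic ("install") appears in both deliverables without being copied, which is the point of combining clear information types with consistent metadata.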

Defining information types for your topics

If your information is like most in the technical information industry, you have a great diversity of structures in your information, especially if those topics are embedded in the threaded, narrative sections and chapters of books. Your first job is to inventory your content to identify its range and diversity.

In most cases, you will find lots of tasks, containing step-by-step instructions for reaching a specific goal. The dominance of the task in technical information is why DITA includes the task as one of the three core information types.

Accompanying tasks, you are likely to find background, description, and conceptual information that explains what something is. DITA labels such supporting information "concepts". You will also find tables, lists, diagrams, process flows, and other information that can be labeled as "reference," the information that no one wants to memorize but must be easy to look up.

Once you have completed your content inventory, you need to carefully analyze the three core information types provided with DITA. The standard structure for task, concept, and reference is presented in the DITA specification. Experiment with accommodating your content to the standard structure. In most cases, your content easily fits into a standard DITA structure.
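One way to experiment is to mark up a sample from your inventory as a DITA task and check it for the main structural pieces. The sketch below uses Python's standard xml.etree.ElementTree module; the sample content and the check_task function are hypothetical, and the check covers only a few elements. It is not a substitute for validating against the DITA DTDs or schemas.

```python
import xml.etree.ElementTree as ET

# Sample task using standard DITA element names; the content is made up.
TASK_XML = """
<task id="replace-filter">
  <title>Replace the filter</title>
  <shortdesc>Swap the filter every six months.</shortdesc>
  <taskbody>
    <prereq>Turn off the power.</prereq>
    <steps>
      <step><cmd>Open the cover.</cmd></step>
      <step><cmd>Insert the new filter.</cmd></step>
    </steps>
    <result>The indicator light turns green.</result>
  </taskbody>
</task>
"""


def check_task(xml_text):
    """Return a list of structural problems found in a task topic."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    if root.tag != "task":
        problems.append("root element is not <task>")
    if root.find("title") is None:
        problems.append("missing <title>")
    steps = root.findall("taskbody/steps/step")
    if not steps:
        problems.append("no <step> elements")
    for number, step in enumerate(steps, start=1):
        if step.find("cmd") is None:
            problems.append(f"step {number} has no <cmd>")
    return problems


print(check_task(TASK_XML))  # []
```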

Where you may encounter difficulties is with the diversity of your own content rather than with the DITA information types. Some of the content in your inventory will not even meet your own guidelines. Often, that content was written by people long gone from your organization or was influenced by subject-matter experts who wanted it their way rather than following your authoring guidelines.

Our recommendation is to focus on the essential underlying structure of your content rather than the idiosyncrasies and accidents of individual writers over the years. If you find an odd structure in a task, for example, ask if that structure is the best way of conveying the information to the user or if the task can be rewritten following the structure of a standard DITA information type. Most of the time, you will find that the standard is the best solution.

One of the more common problems you will find with some of the content you examine is mixed structure. Tasks start out with long discussions of background information. Concepts end up including step-by-step procedures. Tables of reference material end up with concepts in the footnotes or tasks incorporated into table cells.
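A rough screen for mixed structure might look like the sketch below. The two rules are assumptions chosen for illustration: a concept that contains an ordered list may be hiding a procedure, and a task whose context runs past an arbitrary word limit may be carrying a concept. The function name and the threshold are hypothetical, and a human reviewer still has to make the final call.

```python
import xml.etree.ElementTree as ET

MAX_CONTEXT_WORDS = 50  # arbitrary threshold, chosen for illustration


def mixed_structure_warnings(xml_text):
    """Flag likely mixing of information types in one topic (heuristic)."""
    root = ET.fromstring(xml_text.strip())
    warnings = []
    if root.tag == "concept" and root.find(".//ol") is not None:
        warnings.append("concept contains an ordered list; "
                        "consider moving the procedure to a task")
    if root.tag == "task":
        for context in root.iter("context"):
            word_count = len("".join(context.itertext()).split())
            if word_count > MAX_CONTEXT_WORDS:
                warnings.append("task context is long; "
                                "consider moving background to a concept")
    return warnings


CONCEPT_XML = """
<concept id="filters">
  <title>About filters</title>
  <conbody>
    <p>Filters trap dust before it reaches the fan.</p>
    <ol><li>Open the cover.</li><li>Swap the filter.</li></ol>
  </conbody>
</concept>
"""

for warning in mixed_structure_warnings(CONCEPT_XML):
    print(warning)
```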

Although mixed information types are possible in DITA, we don't recommend using them. Consider that by separating information carefully and rigorously into the neat, consistent information-type buckets provided, you will have information that you can present much more dynamically and flexibly to users. If a user wants to know the steps of a task, they can skip background information that they don't want to think about yet. You can refer them to that conceptual and background information through a related-topic reference or a hypertext link rather than embed lengthy conceptual information in the task.

By chunking your information according to well-defined information types rather than combining types randomly, you gain flexibility in distributing your information to people who need it most. You also make the relationships between chunks of information more obvious. If you believe that users will profit from reading background information before performing a task, you can use related-topic links to ensure that they know about the relationship and why reviewing the concept or background is advantageous.
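To make the related-topic link concrete, the sketch below attaches a link from a task to a separate concept topic by using the DITA related-links, link, and linktext elements. The add_related_link helper and the file name filters.dita are hypothetical, and the task is abbreviated.

```python
import xml.etree.ElementTree as ET


def add_related_link(topic, href, link_text):
    """Append a <link> to the topic's <related-links>, creating it if needed."""
    links = topic.find("related-links")
    if links is None:
        links = ET.SubElement(topic, "related-links")
    link = ET.SubElement(links, "link", href=href)
    ET.SubElement(link, "linktext").text = link_text
    return topic


task = ET.fromstring(
    "<task id='replace-filter'>"
    "<title>Replace the filter</title>"
    "<taskbody><steps><step><cmd>Open the cover.</cmd></step></steps></taskbody>"
    "</task>"
)

# Point to the background concept instead of embedding it in the task.
add_related_link(task, "filters.dita", "About filters")
print(ET.tostring(task, encoding="unicode"))
```

The task stays short, and the user who wants the background can follow the link to the concept.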

Adding new information types

Although we find that most technical information fits neatly into the standard DITA information types, you may discover special information types that the standard content units cannot accommodate, or you may want to label those content units with more descriptive XML tag names. At that point, you need to pursue specialization.

Consider an example in the semiconductor industry. A great deal of detailed information about a chip design is contained in an information type called a register description. Although a register description falls into the class of reference information types, it has some very specific and detailed content. By specializing the standard reference information type, you can build a register description specialization that standardizes the content with appropriate XML element names, assisting the writers and providing additional metadata to facilitate searches. Many similar opportunities for specialization may present themselves in your content. But be careful to exhaust the possibilities of the standard information types before pursuing the differences.
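A specialized element in DITA records its ancestry in a class attribute, so that processing written for the standard type can still handle it. The sketch below reads such a value; the regdesc module and element names are hypothetical stand-ins for a register description specialization, and the two helper functions are illustrative only.

```python
# Hypothetical class value for a register description specialized
# from the standard reference type ("regdesc" is an invented name).
REGDESC_CLASS = "- topic/topic reference/reference regdesc/regdesc "


def ancestry(class_value):
    """Split a class value into (module, element) pairs, most general first."""
    tokens = class_value.split()
    # The first token is the "-" or "+" prefix, not a module/element pair.
    return [tuple(token.split("/")) for token in tokens[1:]]


def is_kind_of(class_value, element_name):
    """True if the element is, or is specialized from, element_name."""
    return any(name == element_name for _, name in ancestry(class_value))


print(ancestry(REGDESC_CLASS))
print(is_kind_of(REGDESC_CLASS, "reference"))  # True
print(is_kind_of(REGDESC_CLASS, "task"))       # False
```

Because the register description still identifies itself as a reference, it can fall back to reference processing wherever no register-specific handling exists.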

The more differences you present to writers and readers, the more opportunities there are for confusion. With too many choices of information types, an information developer is more likely to choose incorrectly. With too many subtle differences in the presentation of information, your users are more likely to become confused when they are unable to find the standard set of content that they have come to expect.

This discussion of topic-based authoring is excerpted from Introduction to DITA: A Basic User Guide to the Darwin Information Typing Architecture, by Comtech Services, Inc.


Related articles (topic-based authoring)

"Structuring your Documents for Maximum Reuse," Janice (Ginny) Redish, Best Practices, June 2000. [Best Practices is the bimonthly newsletter of the Center for Information-Development Management (CIDM)]

Ginny Redish outlines a step-by-step procedure for creating structured documents. Even if you aren't yet considering single sourcing, you'll find that structuring documents is an extremely useful, time-saving technique. It works in traditional publishing and is useful for individual writers in any situation where they have to create the same type of document many times. It is essential for teams of writers who are contributing parts to a large document or to a set of documents.

Related books (topic-based authoring)

The books listed here contain information relevant to topic-based authoring:

Sissi Closs: Single Source Publishing. Topicorientierte Strukturierung und DITA, Entwickler-Press, 2006

This book describes the history of single-source publishing and explains the relevant concepts, with a focus on topic-oriented structures. Sissi Closs has developed the class concept method, with which suitable topic and link types can be developed systematically for any kind of content. The book describes the class concept method in detail and also contains a short DITA reference.

Jonathan and Lisa Price, Hot Text, New Riders Press, 2002

Hot Text focuses on good writing practices, including topic-based authoring, and applies these to web-based deliverables. It includes XML authoring that is directly applicable to implementing the DITA model.

Robert E. Horn, Mapping Hypertext: The Analysis, Organization, and Display of Knowledge for the Next Generation of On-Line Text and Graphics, Lexington Institute, 1990

Robert Horn is the developer of Information Mapping(tm). Although this book focuses on online information, it is one of the few publicly available discussions of the topic-based principles of information mapping. In the book, Horn explains how to chunk, organize, and sequence content.

JoAnn Hackos and Dawn Stevens, Standards for Online Communication, Wiley, 1997

Hackos and Stevens focus on topic-based authoring in the context of online information systems. They include both help and web design in the examples. However, the topic-based authoring principles are central to the writing methods detailed in the book. The authors demonstrate how topic-based authoring differs significantly from book-based authoring.

Kurt Ament, Single Sourcing: Building Modular Documentation, Noyes Publications, 2002

Ament explains in plain language and by example how to develop single source documents. He shows technical writers how to develop standalone information modules, then map these modules to a variety of audiences and formats using proven information mapping techniques.

Gretchen Hargis et al., Developing Quality Technical Information: A Handbook for Writers and Editors, IBM Press, 2nd edition, 2004

Many books about technical writing tell you how to develop different parts of technical information, such as headings, lists, tables, and indexes. Instead, we organized this book to tell you how to apply quality characteristics that, in our experience, make technical information easy to use, easy to understand, and easy to find.

Developing Quality Technical Information: A Handbook for Writers and Editors

IBM Press Series--Information Management (not DITA but recommended as a good book on "topic based authoring") (Available from