Revision of History of DITA from Sat, 2008-04-12 15:37

 

The history of DITA is the history of its many powerful characteristics - modularity, structured writing, information typing, separation of content from presentation, single-sourcing, minimalism, topic-based authoring, task orientation, content reuse, conditional processing, localization-friendliness, multi-channel publishing, component publishing, usability, consistency, object orientation, inheritance, specialization, simplified XML.

If you don't understand all these DITA characteristics, you may not have analyzed the DITA Business Case properly - for your organization, or for yourself if you are a professional writer.

You don't have to know how to do all these things to use DITA, but if there is no one in your organization who knows why you should use them, you may have a problem. If you have already been doing some of these things, you will want to know how DITA incorporates them.

 

Before 1960

The historian of technical communications, R. John Brockmann, researched efforts to document products going back centuries. He found that some of today's hottest new documentation ideas were present in the work of those creating, documenting, and selling the technology of manufacturing just after the Revolutionary War.
(From Millwrights to Shipwrights to the Twenty-First Century: Explorations in a History of Technical Communication in the United States)

Today's computers, with their spectacular graphical interfaces, allow us to present animated visual images, even 3-D models to illustrate complex machinery. But this is not the work of the everyday tech writer. Flash animations and computer-aided design (CAD) demand skills more like those found in a game design team than a lone tech writer and wordsmith.

Brockmann found that two-dimensional images were a key part of 18th-century technical documents. And modern ideas like modularity were there in the form of documents, which were as often a set of cards as a book. He also found that early work was very user-centered and task-oriented, and that it took advantage of knowledge already available to the user.

It seems that much of the change in today's technical documentation is the direct influence of the computer, and for some obvious reasons:

  • Theorists of computer documentation thought they were in a new field and simply reinvented practices long in use for explaining ordinary machines.
  • Software documentation, including online help, has become the paradigm for today's tech writers.
  • Since desktop publishing appeared, we rely heavily on the computer to assist us in the preparation of our documentation.

 

The 1960's and '70's - Programmed Learning, STOP, QRC, Advance Organizers, Information Mapping

At Harvard in the 1960's, computers were enlisted to become "teaching machines" by the behaviorist B. F. Skinner. His ideas of "programmed learning" still influence today's eLearning models. His work required knowledge to be broken down into chunks.

Hughes STOP (Sequential Thematic Organization of Publications) advocated a storyboard approach with two-page spreads. A large graphic on one page, with clear labels, faces the main explanatory text on the opposite page.

The U.S. Navy published the Quick Reader Comprehension (QRC) method in 1961. It explicitly called for modular documentation that could be reassembled and reused for different purposes, perhaps the first mention of Reuse.

David Ausubel first proposed Advance Organizers in 1960. They are formal versions of the first step in the classic three-step teaching method: the teacher tells the students what will be said, then says it, then tells them what was said (a summary). Ausubel advocated images and clear titles and subtitles that revealed the structure in a document.

In the mid-'60's Robert Horn (winner of an ACM SIGDOC Lifetime Achievement Award for Documentation) developed Information Mapping techniques and founded the company by that name. Common "Information Types" were identified in dozens of standard document types like user manuals, policy and procedure manuals, annual reports, etc. Identifying standard information types is at the heart of DITA (Darwin Information Typing Architecture).

In the late '60's, Charles Goldfarb, Edward Mosher and Raymond Lorie (whose surname initials were used by Goldfarb to make up the acronym GML) created IBM's Generalized Markup Language for documents. In 1974, GML became SGML, with the help of Yuri Rubinsky and others. For many years SGML was the standard for structured documents in the military, aerospace, and large computer companies. It became the basis of DocBook.

The 1980's - IBM Task Orientation, Desktop Publishing, Macintosh Documentation Guidelines, and Help

In 1981 a team at IBM led by Fred Bethke called for a new "task orientation" in computer software documentation. Their report, IBM Improving usability of publications (1981), contrasted this approach with documents that reflected the software system's architecture. With such documents, a user had to already understand the software to find the help they needed, and inexperienced users got lost. Another approach was role-based documentation. The new idea was task orientation, which deals with the tasks people commonly perform with computer programs, regardless of their job titles, and focuses on the information needed to perform the tasks.

In 1981 Interleaf introduced technical publishing software for document authoring and composition. It included word processing, graphics, charts, tables, equations, image editing, and automatic page layout. Interleaf automatically generated indexes and tables of contents for books, and featured conditional processing of content.

In 1984, the new Apple Macintosh was a revolution in computer user interfaces and a similar revolution in computer documentation. The user interface for documents was WYSIWYG (what you see is what you get - when you print the document). Affordable desktop publishing (DTP) was born. The first DTP program, the $99 MacPublisher, was created by Bob Doyle, in the year of the Mac. Aldus (later Adobe) PageMaker followed in 1985. These tools led technical writers to style their documents and even arrange the content layout on the page. To this day DTP thinking is the most important inhibitor of content reuse, because it mixes presentation with content.

The new Macintosh Documentation Guidelines called for three sections: a Learning overview with tutorials that introduce new concepts and functions, an extensive Using section that spells out how to accomplish tasks, and a program Reference section. To this day, well-written books on computers (for example those from O'Reilly) have Learning (e.g., Learning PHP), Using (e.g., Programming PHP), and Definitive Reference volumes.

 

Note how Learning, Using, and Reference map perfectly onto the three DITA information types specialized from the basic DITA Topic structure - Concept, Task, and Reference. And note that the Macintosh "Using" section was task-oriented, just as IBM was recommending.

 

In 1986 FrameMaker was introduced on the Sun OS. This DTP program was designed for long-form documents like books. It became very popular among professional tech writers and at $2500 was a major competitor for the much more expensive Interleaf system.

In 1986 R. John Brockmann published Writing Better Computer User Documentation. Brockmann described the changes needed to move from paper docs to online. He reported on the new task-based approach, which limits information to that needed to perform a single task, assuming that the user can find general information elsewhere, or very likely already knows it.

In 1989 Robert Horn published Mapping Hypertext, an extraordinary book with fantastic illustrations - all drawn by Horn himself - exhibiting the kind of structured writing that Information Mapping was proposing for all documentation. This is still one of the three most important books in the history of documentation in general (it's not about computer docs). The book described the seven information types of a structured document - classification, concept, principle, procedure, process, structure, and fact. Horn was inspired by Harvard Professor George Miller's famous work on the Magical Number Seven (plus or minus two) as the number of things easily learned at one time.

Learning theorist Dr. Ruth Clark would trim these down to five - concept, principle, procedure, process, and fact - her information types for Training and eLearning - in her workshops and book Developing Technical Training: A Structured Approach for Developing Classroom and Computer-based Instructional Materials.

The 1990's - IBM Minimalism and User-Centered Design

In 1990 MIT Press published the research results of another IBM team led by John M. Carroll. Carroll's book, The Nurnberg Funnel, introduced the idea of minimalism in technical writing. It was task orientation carried to an extreme. Minimalism meant small non-linear chunks readable in any order. It emphasized reading To Do, not reading To Know or To Learn, a phrase first introduced by Ginny Redish. It attacked the standard systems approach to learning of Gagne and Briggs, with its hierarchical decomposition of learning objectives, which remains a standard in learning systems to this day. And it emphasized handling errors when the user could not accomplish a task.

In 1991 Sun Microsystems introduced FrameBuilder, a version of FrameMaker with added support for SGML.

In 1994 JoAnn Hackos published her landmark Managing Your Documentation Projects, revised and republished as Information Development: Managing Your Documentation Projects, Portfolio, and People by Wiley in 2006. Fully in tune with task orientation, Hackos's book described only three information types - concept, procedure, and reference. This seems to be a combination of Information Mapping's seven types, Ruth Clark's simplification to five types, and the three components of the Apple Macintosh Documentation Guidelines.

In the mid '90's, Yuri Rubinsky's team at SoftQuad in Toronto (creators of one of the first and most popular HTML editors, HoTMetaL) became involved in the development of a compromise markup language somewhere between the extraordinarily complex SGML and the popular new HTML (Hypertext Markup Language) for web pages. (HoTMetaL was the precursor to today's XMetaL from JustSystems.) HTML was a disaster from the point of view of structured reusable component documentation, not least because it combined presentation markup with structural markup. The new markup language was XML (eXtensible Markup Language).

In November 1995 John Carroll convened a workshop, sponsored by the Society for Technical Communication (STC), to evaluate Minimalism in the years since The Nurnberg Funnel. Carroll invited his major colleagues - R. John Brockmann, David Farkas, JoAnn Hackos, Hans van der Meij, Janice C. (Ginny) Redish, and others.

In another part of Toronto in 1995, an IBM documentation team was developing a Help system for IBM's new line of VisualAge software. Jamie Roberts returned from graduate study at Waterloo and attended a brainstorming session to define some basic information topic types for the new Help. He scribbled "concept, task, and reference" on a napkin and a new help document architecture was born. There is nothing very unusual about a Help system that is task-based and assembled from topics. What was new was that this system was to lead to the simplified form of XML known as DITA.

In 1995 Adobe acquired FrameMaker and FrameBuilder, which was to become FrameMaker + SGML, and eventually the more affordable Structured FrameMaker, now included with every copy of FrameMaker, though used by a small percentage of tech writers. Most writers continue to prefer unstructured documents.

In 1998, JoAnn Hackos and Ginny Redish published the definitive reference on task analysis, User and Task Analysis for Interface Design, and John Carroll published the edited proceedings of his 1995 workshop, Minimalism Beyond the Nurnberg Funnel, with major contributions by Hackos and Redish.

In 1999, IBM published an important guide, Publishing Quality Technical Information (called PQTI and now unavailable). The team of writers included Fred Bethke, whose earlier IBM Publishing Guidelines had established the critical importance of task orientation in documentation.

The 2000's - IBM DITA

In March 2001, IBM introduced DITA as a series of developerWorks articles about a new simplified version of XML for documentation. It was intended to replace IBM's IBMIDDoc, an internal version of SGML for IBM's technical software support. While XML was enjoying great use as a data exchange method (RSS and SOAP protocols), DITA was an attempt to make a simplified XML starter set as a documentation markup language, one designed from the outset to encourage reuse of small content components. The key ideas were to be simpler than the complex SGML and also be usable online.

The goal of DITA was to formalize information typing practices, both print and online, and also enable an extensible typing architecture through specialization of base topics. DITA maps were a way to standardize collection publishing and information architecture/outlining models. DITA was initially known as MITA, for Mendel Information Typing Architecture, to emphasize the object orientation of the new architecture, with its "inheritance" and evolution of topic structures via specialization. Since MITA was already a somewhat proprietary acronym, IBM switched to Darwin and DITA.

In May 2002, IBM added domain specialization to topic specialization, and demonstrated these in the Open Toolkit, a reference implementation of DITA publishing, with a starter set of XSLT stylesheets. IBM encouraged authoring tool vendors to integrate the Open Toolkit as a means of publishing DITA, and most have done so.

In 2003, two important books appeared on single sourcing and content reuse: Single Sourcing: Building Modular Documentation, by Kurt Ament, and Managing Enterprise Content, by Ann Rockley.

In 2003, IBM revised and published PQTI as Developing Quality Technical Information: A Handbook for Writers and Editors (2nd Edition), by Gretchen Hargis and others. This book is all about DITA without mentioning the name, because IBM was using DITA internally but not yet sharing it with the world when the book was drafted, and because the DITA architects were implementing the earlier edition's recommended practices.

OASIS DITA

In April 2004, the Organization for the Advancement of Structured Information Standards (OASIS) formed a Technical Committee to explore a DITA Standard. The TC included XML tools vendors, consultants on Information Architecture and Content Management Systems (CMS), and end users of the DITA Document Type Definitions (DTD) and Schemas needed for the new DITA Standard.

In February 2005, IBM donated the Open Toolkit, a limited version of their internal Information Developers Workbench, to SourceForge. IBM continues to develop the OT, which is not a part of the OASIS DITA Standard efforts.

DITA 1.0

DITA 1.0 was approved as an OASIS Standard in June 2005.

DITA 1.1

DITA 1.1 was approved in August 2007, adding a new Bookmap specialization.

DITA 1.2

DITA 1.2 is expected sometime in 2008. It will add structured learning, creation of Learning Objects with DITA, which will be compatible with eLearning standards such as SCORM.

References

A History of Technical Communications in the U.S., by R. John Brockmann.

History of Outlining (and STOP).

Quick Reader Comprehension (1961).

Hughes STOP - Sequential Thematic Organization of Publications (1965).

IBM Improving usability of publications (1981). Task-orientation HTML version

Writing Better Computer User Documentation (1986)

Mapping Hypertext, Robert Horn, Lexington Institute (1989).

Developing Technical Training: A Structured Approach for Developing Classroom and Computer-based Instructional Materials, Dr. Ruth Clark (1989, 2nd edition, 1999).

The Nurnberg Funnel, John M. Carroll, MIT Press (1990).

Managing Your Documentation Projects, by JoAnn Hackos (Wiley, 1994).

Standards for Online Communication, by JoAnn Hackos (Wiley, 1997).

Robert Horn, Visual Language (1998).

User and Task Analysis for Interface Design, by JoAnn Hackos and Janice C. (Ginny) Redish (1998).

Minimalism Beyond the Nurnberg Funnel, John Carroll, MIT Press (1998).

Two approaches to modularity (1999). Robert Horn compares structured writing to Hughes STOP.

Review of the Nurnberg Funnel (1999). Robert Horn compares structured writing to Minimalism.

The Impact of Single Sourcing and Technology, Ann Rockley, 2001.

Cisco/Clark Reusable Learning Objects.

Managing Enterprise Content, by Ann Rockley, New Riders, 2003.

Single Sourcing: Building Modular Documentation, by Kurt Ament, Andrew Publishing, 2003.

Robert Horn PowerPoint on Visual Language (2003).

Developing Quality Technical Information: A Handbook for Writers and Editors (2nd Edition), by Gretchen Hargis, Michelle Carey, Ann Kilty Hernandez, Polly Hughes, Deirdre Longo, Shannon Rouiller, Elizabeth Wilde (IBM Press, Information Management Series, 2004).

Information Development: Managing Your Documentation Projects, Portfolio, and People, by JoAnn Hackos (Wiley, 2006).
