Revision of Level Four: Automation and integration from Mon, 2008-05-19 22:42

Once content is specialized, you can leverage your investment in semantics with automation of key processes, and begin tying content together, even across different specializations or authoring disciplines. For example, you can share common content across marketing and training, or share common processes and infrastructure throughout your content life cycle.

Scenario

The software division of a large technology company stores its content in a CMS, which allows all the teams in the division to reuse the content. At this level, the division has moved beyond single-sourcing of content and achieved multiway reuse. Product descriptions created by the marketing team can be reused by the technical publications group to create product overviews, and by the training group to create product tours. At the same time, product architectural specifications created by technical publications can be reused by training, technical support groups, and the marketing team.

The following figure illustrates how content created by different teams can be reused in multiple deliverables by multiple teams across the division.

Figure 6: Content reuse across teams

By reusing its content across the teams in the division, the company can save a significant amount of money: it translates the content source once rather than translating each deliverable that instantiates the content.

Investment

Organizations need a CMS to effectively control and automate the content development life cycle. In addition to storing content and providing version control, the CMS provides workflow automation support that assists authors in creating, reusing, and publishing content. However, the investment in implementing a CMS is non-trivial in terms of preparation and cost.

In preparation for a CMS implementation, you must understand the structure of the content and where it is appropriate for reuse. This requires a significant amount of research, planning, and coordination to identify the reuse possibilities, requirements, and standards across disciplines. In addition, you need to define a robust metadata model to support the content model and apply it to all topics. Lastly, you must have agreed-upon content development processes in order to automate them with workflow control. This requires consensus and support from all stakeholders in the content life cycle.

The cost for implementing the CMS includes the following items:
• Price of the CMS software
• Hardware to run it and store the content
• Resource time to prepare and plan for implementation
• Resources to customize and maintain the CMS
• Resource time for training stakeholders to use it

Although such an undertaking may seem daunting, the initial implementation is a one-time cost, and the resulting improvements in speed and efficiency can allow you to recoup the investment quickly.

A translation management system is another key investment at this level, used to manage and automate content localization. If you are translating content into more than one language, you must have processes in place to handle this additional work. A translation management system provides automated process management for translating content and integrates with the CMS workflow support.

To implement a translation management system, you must have a defined translation process that can scale to meet your localization needs as they increase, and you must understand the requirements for a scalable system. In addition, you must build your translation memory, which is the library of localized content.

Return

The return on investment in a CMS is the ability to reuse content across disciplines and automate the content development workflow. If content is not stored in a repository that provides easy retrieval through metadata, reusing content across teams is effectively impossible. In addition to obvious features such as automated status change notification and reporting, workflow support enables you to see quickly what information is reused in which topics. This crucial feature of the fourth level of adoption enables true reuse and mitigates the risk of inadvertently propagating change throughout the content set.
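As an illustration of the "what is reused where" report, the following minimal sketch builds a where-used index from conref attributes. It assumes reuse is expressed with the DITA conref attribute and that topics are available as XML strings; the function name where_used and the sample file names are hypothetical, and a real CMS would provide this report through its own interface.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def where_used(topics):
    """Map each conref target to the names of the topics that pull it in."""
    index = defaultdict(list)
    for name, xml_text in topics.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            target = elem.get("conref")
            if target and name not in index[target]:
                index[target].append(name)
    return dict(index)

# Hypothetical topics: two teams reuse one marketing product description.
SAMPLE_TOPICS = {
    "overview.dita": (
        '<concept id="overview"><title>Product overview</title><conbody>'
        '<p conref="marketing.dita#marketing/description"/>'
        '</conbody></concept>'
    ),
    "tour.dita": (
        '<concept id="tour"><title>Product tour</title><conbody>'
        '<p conref="marketing.dita#marketing/description"/>'
        '</conbody></concept>'
    ),
}

if __name__ == "__main__":
    for target, users in where_used(SAMPLE_TOPICS).items():
        print(target, "is reused by", ", ".join(users))
```

Before changing the marketing description, an author could consult such an index to see that both the overview and the tour would be affected.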

The following figure shows how users can share content stored in multiple repositories.

Figure 7: Multiple users sharing content from multiple repositories

Traditional publishing and translation processes involve sending each deliverable out for translation. Although you can leverage the translation memory for the content in each deliverable, the translation vendor must compare each deliverable to the translation memory to determine what content is new and what needs to be translated. If you have multiple deliverables with the same content, you pay for each analysis pass. If you have multiple deliverables with similar but non-identical information, you pay for the analysis pass, as well as the cost to translate each “version” of the information. Organizations that produce multi-language documentation can incur large, unnecessary costs if they have to multiply the number of languages by the number of versions of the content for each release.
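The multiplication described above can be made concrete with a small calculation. This is only a sketch: the deliverable names, word counts, and number of languages are invented for illustration, and real vendor pricing depends on much more than the volume of words analyzed.

```python
def deliverable_analysis_words(deliverables, topic_words, languages):
    """Words the vendor analyzes when every deliverable goes out for every language."""
    per_language = sum(
        topic_words[topic]
        for topics in deliverables.values()
        for topic in topics
    )
    return per_language * languages

# Hypothetical figures, for illustration only.
TOPIC_WORDS = {"overview": 400, "install": 600, "specs": 1000}
DELIVERABLES = {
    "user_guide": ["overview", "install", "specs"],
    "training": ["overview", "install"],
    "brochure": ["overview"],
}

if __name__ == "__main__":
    # The shared "overview" content is analyzed three times per language.
    print(deliverable_analysis_words(DELIVERABLES, TOPIC_WORDS, languages=5))  # 17000
```

With these assumed numbers, only 2,000 distinct words exist, yet 3,400 words are analyzed per language because shared content is counted once per deliverable.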

In contrast, because DITA is an XML topic-based architecture, you send only the source topics that contain changed content to the translation vendor. This means that you can control the content in smaller units, and thus the amount of content the vendor analyzes for each language is significantly reduced. In addition, if you are reusing content rather than rewriting multiple versions of it, you simply pay to translate the original source instead of multiple versions of the same information. Translating content at the source, rather than at the level of each deliverable, radically changes the translation cost structure. The ability to translate content at the source, combined with the ability to identify changed content and to reduce the actual amount of content through reuse, gives you greater control over the translation process and your overall localization costs.
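One way to identify the source topics that contain changed content is to compare a digest of each topic with the digest recorded at the last translation handoff. The sketch below assumes that approach; the function names and sample topics are hypothetical, and a CMS or translation management system would normally track this through its own versioning.

```python
import hashlib

def digest(text):
    """Return a stable fingerprint of a topic's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def topics_to_translate(current, last_sent):
    """Return the sorted names of topics that are new or changed since the last handoff."""
    return sorted(
        name for name, text in current.items()
        if last_sent.get(name) != digest(text)
    )

# Hypothetical state: digests recorded when topics were last sent for translation.
PREVIOUS = {
    "overview.dita": "<concept id='overview'><title>Overview</title></concept>",
    "install.dita": "<task id='install'><title>Install</title></task>",
}
LAST_SENT = {name: digest(text) for name, text in PREVIOUS.items()}

# Current state: one topic edited, one topic added, one unchanged.
CURRENT = {
    "overview.dita": PREVIOUS["overview.dita"],
    "install.dita": "<task id='install'><title>Install the product</title></task>",
    "specs.dita": "<reference id='specs'><title>Specifications</title></reference>",
}

if __name__ == "__main__":
    print(topics_to_translate(CURRENT, LAST_SENT))  # ['install.dita', 'specs.dita']
```

Only the two listed topics would go to the vendor; the unchanged overview is left out regardless of how many deliverables reuse it.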

By automating workflow support with a CMS and integrating the translation process, you can reuse content with confidence across teams and realize significant savings when localizing to multiple languages.

DITA features used

This adoption level uses the following DITA features:

Metadata

DITA provides some basic metadata for all topics, including author, audience, resource ID, keywords, and index markers. Maps also have default metadata, including copyright information and critical dates. In addition, specializations provide further, deliverable-specific metadata. For example, the bookmap specialization includes book-specific metadata such as book identification numbers and publication data.
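To show how such metadata can support retrieval, here is a minimal sketch that reads the author, audience, and keywords from a topic prolog. It assumes the metadata is stored in the topic's prolog element and that the topic is available as an XML string; the function name topic_metadata and the sample topic are hypothetical.

```python
import xml.etree.ElementTree as ET

def topic_metadata(xml_text):
    """Collect author, audience, and keyword metadata from a topic prolog."""
    root = ET.fromstring(xml_text)
    prolog = root.find("prolog")
    if prolog is None:
        return {"authors": [], "audiences": [], "keywords": []}
    return {
        "authors": [a.text for a in prolog.findall("author")],
        "audiences": [a.get("type") for a in prolog.findall("metadata/audience")],
        "keywords": [k.text for k in prolog.findall("metadata/keywords/keyword")],
    }

# Hypothetical topic used only for illustration.
SAMPLE_TOPIC = """
<concept id="widget">
  <title>Widget overview</title>
  <prolog>
    <author>Marketing team</author>
    <metadata>
      <audience type="administrator"/>
      <keywords><keyword>widget</keyword><keyword>overview</keyword></keywords>
    </metadata>
  </prolog>
  <conbody><p>The widget does useful things.</p></conbody>
</concept>
"""

if __name__ == "__main__":
    print(topic_metadata(SAMPLE_TOPIC))
```

A repository search for topics aimed at administrators could filter on the audiences list that this function returns.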

Translation and language attributes

DITA provides the translate and xml:lang attributes to support localization. The translate attribute “indicates whether the content of the element should be translated or not.” The xml:lang attribute identifies the language in which the content of the element is written. You can specify these attributes at the element, topic, or map level.
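The following sketch shows one way a tool might use these attributes to pick out the text that goes to translation. It assumes that translate and xml:lang are inherited from the nearest ancestor that sets them and that everything is translatable unless marked otherwise; it does not model the per-element defaults that DITA processors may apply. The function name and sample topic are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def translatable_text(elem, translate=True, lang=None):
    """Yield (language, text) pairs for text that should be sent to translation."""
    flag = elem.get("translate")
    if flag is not None:
        translate = (flag == "yes")
    lang = elem.get(XML_LANG, lang)
    if translate and elem.text and elem.text.strip():
        yield lang, elem.text.strip()
    for child in elem:
        yield from translatable_text(child, translate, lang)
        # Tail text belongs to the parent, so the parent's settings apply.
        if translate and child.tail and child.tail.strip():
            yield lang, child.tail.strip()

# Hypothetical topic: the command name must not be translated.
SAMPLE_TOPIC = """
<concept id="widget" xml:lang="en-us">
  <title>Widget overview</title>
  <conbody>
    <p>Run <cmdname translate="no">widgetctl</cmdname> to start.</p>
  </conbody>
</concept>
"""

if __name__ == "__main__":
    for language, text in translatable_text(ET.fromstring(SAMPLE_TOPIC)):
        print(language, "|", text)
```

In this sample, the command name widgetctl is skipped, while the title and the surrounding sentence fragments are reported with the language en-us.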

Generalization for cross-specialization reuse

When reuse happens across different content types, issues of cross-type validation can quickly arise: some of the semantics in the source may not be valid in the context of reuse. For example, a <step> is allowed in a task topic but not in a concept topic. But since a <step> is just a specialized type of list item (<li>), you can reuse a <step> any place where an <li> is allowed by stripping away the extra semantics that do not apply in the new context. In this way, you can reuse the content of a <step> between tasks and concepts, even if the specialized semantics and structure only apply in the source type.
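Generalization of this kind can be sketched as renaming each element to its ancestor type, as recorded in the DITA class attribute. The example below assumes the class attribute is present on the elements themselves (in practice it is usually supplied as a default by the DTD or schema, which this parser does not read); the function name generalize is hypothetical and this is not a full DITA processor.

```python
import xml.etree.ElementTree as ET

def generalize(elem, module="topic"):
    """Rename elem and its descendants to their ancestor element names in `module`."""
    for node in elem.iter():
        # A class value such as "- topic/li task/step " lists ancestry, base type first.
        for token in node.get("class", "").split():
            if token.startswith(module + "/"):
                node.tag = token.split("/", 1)[1]
                break
    return elem

# Hypothetical task step with explicit class attributes.
SAMPLE_STEP = (
    '<step class="- topic/li task/step ">'
    '<cmd class="- topic/ph task/cmd ">Press the power button.</cmd>'
    '</step>'
)

if __name__ == "__main__":
    generalized = generalize(ET.fromstring(SAMPLE_STEP))
    print(ET.tostring(generalized, encoding="unicode"))
```

The class attributes are left in place, so the original task semantics are still recorded even though the element names are now valid wherever <li> and <ph> are allowed.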

Sharing Your Feedback on Level Four: Automation and Integration

Use the "Comment" feature at the bottom of each page to share constructive criticism, make suggestions for improvement, and provide use case scenarios that can help us enhance the DITA Maturity Model in the future.

Scott Abel
Content Management Strategist
The Content Wrangler
scottabel@mac.com
www.thecontentwrangler.com

 
