[ARTICLE] 2014 Stilo DITA Knowledge Series: DITA reuse and conversion together

When you are considering converting content from Word, unstructured FrameMaker, or other unstructured formats into DITA, settle your reuse strategy before you start converting.

Why reuse and conversion together?

Your reuse strategy can be partially implemented as part of the conversion process, which means that you can automate some of the work. The highly automated nature of conversion is the perfect opportunity to sneak in some reuse automation at the same time. If you know what your reuse goals are, you can save a lot of manual effort by using the conversion process to automatically and programmatically add some reuse mechanisms to your content.

At its core, a DITA conversion is the process of mapping your content and formatting to DITA elements and attributes. If you opt to ignore certain formats or objects, your conversion process essentially “flattens” your content and you lose a great opportunity to automate.

For example, if you neglect to map variables to a specific element in DITA, then those variables are converted as plain text and you’ve lost your opportunity to apply a DITA element to them programmatically (and quickly).
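To make the idea of mapping versus flattening concrete, here is a minimal Python sketch. It assumes the source content has already been extracted as (text, format) pairs; the format names in STYLE_TO_ELEMENT and the function convert_runs are hypothetical and not part of any particular conversion tool. Mapped formats become DITA elements, and unmapped formats are flattened to plain text and reported so you can see what you are losing.

```python
from xml.sax.saxutils import escape

# Hypothetical source format names mapped to real DITA inline elements.
STYLE_TO_ELEMENT = {
    "GUILabel": "uicontrol",
    "Code": "codeph",
    "Variable": "keyword",
}


def convert_runs(runs):
    """Convert (text, style) pairs to DITA markup.

    Returns the markup plus the set of styles that had no mapping
    and were therefore flattened to plain text.
    """
    parts = []
    flattened = set()
    for text, style in runs:
        element = STYLE_TO_ELEMENT.get(style)
        if element:
            parts.append(f"<{element}>{escape(text)}</{element}>")
        else:
            parts.append(escape(text))  # no mapping: the formatting is lost
            if style is not None:
                flattened.add(style)
    return "".join(parts), flattened
```

Running it over a sample of your content and checking the flattened set is one quick way to spot formats you forgot to map before the full conversion.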

In short, with a little planning and setup, combining reuse and conversion can save you lots of time.

Reuse strategy

A reuse strategy defines what kind of content you’ll reuse and how you’ll reuse it, including the DITA mechanisms you’ll need for each kind of reuse.

Generally speaking, you should be looking at reusing two types of content:

  1. Content that stays the same: Wherever it is, it needs to be standardized.
  2. Content that changes: Content has variations because of the
           * context of the person reading it (or someone who is re-branding and publishing it),
           * product version, suite or product combinations, or
           * changing nature of the product over time (like product or component names that may evolve).

Your reuse strategy is your target. Without it, you cannot know definitively what you need to do during conversion or how best to do it. Planning and testing the reuse strategy before conversion is the key to being able to automate its application as part of conversion.

Reuse strategy for conversion

Although you should always develop a content reuse strategy when moving to DITA, when combining reuse and conversion, you need to add an extra layer to your reuse strategy that includes two major areas:

  1. Identify your existing content reuse (text insets, conditions, and variables) and decide how you’ll leverage it during conversion.
           a) What will each one map to?
           b) What is your desired end result?
  2. Plan for new content reuse that can be applied during conversion.
           a) What is your desired end result? What reuse mechanisms do you want to use?
           b) What and how can you automate?
           c) What do you need to change to enable automation during conversion? (You may need to apply formatting, for example, to automate something you really need.)

Content reuse in DITA

Everyone’s requirements are unique, but in general you should consider some common reuse strategies in DITA to get you started.

DITA allows you to use a variety of methods to reuse content and you’ll want to consider them all. When you’re getting started with reuse, you usually consider three main mechanisms:

  • Conref: A conref (content reference) is equivalent to a text inset in FrameMaker: a chunk of content (less than a topic) is pulled in from another location. (A push mechanism is also available, but less frequently used.)
  • Profiling: Profiling is equivalent to FrameMaker conditions, where content can be shown or hidden based on attribute values on elements.
  • Topic reuse: Topic-level reuse is simply pulling your topic into a map wherever it’s needed. Once in DITA, content is modular enough to be in short, reusable chunks. You don’t necessarily need to plan for this reuse during conversion, but it may allow you to NOT convert some content.
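As a rough illustration of how profiling behaves, the following Python sketch filters a small task topic on the audience attribute, which is one of the standard DITA profiling attributes. It is a simplification under stated assumptions: real DITA processors drive filtering from a DITAVAL file, and the function filter_by_audience and the sample topic are made up for this example.

```python
import xml.etree.ElementTree as ET

# Sample topic (invented for illustration). Two steps are profiled.
TOPIC = """<task id="install">
  <title>Install the product</title>
  <taskbody>
    <steps>
      <step audience="admin"><cmd>Log in with administrative credentials.</cmd></step>
      <step audience="user"><cmd>Ask your administrator to log in.</cmd></step>
      <step><cmd>Run the installer.</cmd></step>
    </steps>
  </taskbody>
</task>"""


def filter_by_audience(root, audience):
    """Remove every element whose audience attribute does not list `audience`.

    Elements without an audience attribute are always kept.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get("audience")
            if values is not None and audience not in values.split():
                parent.remove(child)
    return root


admin_view = filter_by_audience(ET.fromstring(TOPIC), "admin")
admin_commands = [cmd.text for cmd in admin_view.iter("cmd")]
```

Here admin_commands keeps the administrator step and the unprofiled step, which is the same show-or-hide behavior you get from FrameMaker conditions.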

Warehouse topics for conrefs

These are topics that hold fragments of content and are never meant to be published as topics. You might create a warehouse topic for each of the following:

  • GUI objects, fields, buttons, icons
  • Frequently used steps, with step results and info
  • All your notes and warnings
  • Prerequisites that are commonly mentioned, like having administrative privileges

You then use these warehouse topics as the source for conref mechanisms. Just like with text insets, warehouse topics let you write content once and use it wherever you need it. That means that when it changes, you update it in one spot. You translate it once too.

Once you know the steps that will go into a warehouse topic, for example, you can apply a distinct FrameMaker or Word format/style to those steps and then script the conversion to

  1. Pull the step into a warehouse topic (if not already there).
  2. Replace the step with a conref from the warehouse topic.

The result is that a good chunk of your reuse is automated during conversion.
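The two scripted actions above could be sketched in Python roughly as follows. This is a sketch under assumptions, not the behavior of any specific conversion tool: the style name ReusableStep, the warehouse file name wh_steps.dita, the topic id wh_steps, and the id scheme in slug are all hypothetical. The conref value follows the usual DITA form of file, topic id, and element id.

```python
import re
import xml.etree.ElementTree as ET

WAREHOUSE_FILE = "wh_steps.dita"  # hypothetical warehouse file name
WAREHOUSE_ID = "wh_steps"         # hypothetical warehouse topic id


def slug(text):
    """Derive an element id from the step text (illustrative scheme only)."""
    return "step_" + re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def convert_steps(steps, warehouse):
    """Convert (text, style) pairs into a DITA <steps> element.

    Steps tagged with the hypothetical ReusableStep style are stored in the
    `warehouse` dict (id -> text) and replaced by a conref to that entry.
    """
    steps_el = ET.Element("steps")
    for text, style in steps:
        step = ET.SubElement(steps_el, "step")
        cmd = ET.SubElement(step, "cmd")  # a step must still contain a cmd
        if style == "ReusableStep":
            step_id = slug(text)
            # 1. Pull the step into the warehouse (if not already there).
            warehouse.setdefault(step_id, text)
            # 2. Replace the step with a conref from the warehouse topic.
            step.set("conref", f"{WAREHOUSE_FILE}#{WAREHOUSE_ID}/{step_id}")
        else:
            cmd.text = text
    return steps_el
```

After the run, the warehouse dict holds each reusable step exactly once, ready to be written out as the warehouse topic, while the converted task only carries references to it.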

DITA keys

Keys are a powerful mechanism in DITA. Although not strictly reuse, they can make reuse faster, simpler, and more dynamic. If you’re planning consistent, ongoing, and growing reuse, then consider keys as well.

Keys are used for indirect referencing of any kind. You can use keys for any piece that may need to be centrally updated or swapped out. For example, keys are often used for

  • Variables: To define terms or product names that can change based on context or over time.
  • URLs: To centrally manage and update them or customize them based on deliverables.
  • Conrefs become conkeyrefs: To pull in a different set of conrefs and quickly customize a document.
  • Related links: To customize them based on deliverable when topics are reused in multiple maps.
  • Including/excluding topics or maps: To create deliverables that have specific content without having to create many different maps.

Note: DITA keys will change for DITA 1.3; they will include a scoping mechanism that will simplify and extend linking. This article is based on DITA 1.2.

DITA keys ensure maximum reuse with minimum long-term efforts for updates.
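To show the indirection that keys provide, here is a small Python sketch that resolves a keyref against two different maps. It assumes the common DITA 1.2 pattern of defining variable text in a keydef with topicmeta, keywords, and keyword; the function names read_keys and resolve_keyrefs and the sample maps are invented for illustration, and a real DITA processor does far more than this.

```python
import xml.etree.ElementTree as ET

# Two sample maps (invented) that bind the same key to different text.
MAP_A = """<map>
  <keydef keys="product">
    <topicmeta><keywords><keyword>Apple</keyword></keywords></topicmeta>
  </keydef>
</map>"""

MAP_B = """<map>
  <keydef keys="product">
    <topicmeta><keywords><keyword>Banana</keyword></keywords></topicmeta>
  </keydef>
</map>"""

PARAGRAPH = '<p>Install <keyword keyref="product"/> before you begin.</p>'


def read_keys(map_xml):
    """Collect key name -> text from a map; the first definition of a key wins."""
    keys = {}
    for keydef in ET.fromstring(map_xml).iter("keydef"):
        text = keydef.findtext("topicmeta/keywords/keyword")
        for name in keydef.get("keys", "").split():
            keys.setdefault(name, text)
    return keys


def resolve_keyrefs(topic_xml, keys):
    """Fill each <keyword keyref="..."> with its text and return the plain text."""
    root = ET.fromstring(topic_xml)
    for keyword in root.iter("keyword"):
        key = keyword.get("keyref")
        if key in keys:
            keyword.text = keys[key]
    return "".join(root.itertext())
```

Calling resolve_keyrefs(PARAGRAPH, read_keys(MAP_A)) and then the same call with MAP_B yields the same sentence with a different product name, without touching the topic itself.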

An example of applying reuse during conversion: variables

If I know that I’ll be using keys to manage content that changes frequently (every release or when there is re-branding), I can ensure that, for example, the variables I’m using in FrameMaker for my product and version are part of my conversion. Instead of converting them as text, I can convert variables as <keyword> elements.

Converting variables as plain text is what we call flattening variables: once flattened, nothing distinguishes them from the rest of the text. If you’re expecting conversion to leverage the DITA key mechanism but you are flattening variables, you will be left adding key references manually after conversion.

Instead, as part of your conversion, you can leverage your variables by wrapping an appropriate element around them and even setting a keyref value on the element.

For example, my conversion plan for variables might specifically map variables to elements and keyrefs.

Variables become keys

Variables named… | Are converted as element… | With keyref value… | For an end result of…
Apple, Banana | <keyword> | product | <keyword keyref="product"/>
ComponentB | <keyword> | componentB | <keyword keyref="componentB"/>

However, when you’re building this plan, it’s essential to know that keys are defined in a map and that each key has only one effective definition per map. So if you need the Apple and Banana products to have separate names in a deliverable, you need a unique keyref value for each one. When they share a keyref value, they resolve to the same name in the output. In my example above, <keyword keyref="product"/> resolves to one product name in a particular DITA map, but can resolve to a different name in other DITA maps. With this mapping, I can no longer have both Apple and Banana in the same DITA map.

The key here is to plan, test, and then test again.
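The variable plan in the table above can be sketched in Python as follows. It assumes the source paragraph is available as (text, variable name) pairs, which is a hypothetical intermediate format; the names VARIABLE_TO_KEY, convert_paragraph, and shared_keys are also invented for this example. The second function is a simple check for the pitfall just described, where several variables share one keyref value.

```python
from xml.sax.saxutils import escape, quoteattr

# The conversion plan: source variable name -> keyref value.
VARIABLE_TO_KEY = {
    "Apple": "product",
    "Banana": "product",
    "ComponentB": "componentB",
}


def convert_paragraph(runs):
    """Convert (text, variable_name) pairs into a DITA <p>.

    Runs that come from a mapped variable become <keyword keyref="..."/>;
    everything else is emitted as escaped text.
    """
    parts = []
    for text, variable in runs:
        key = VARIABLE_TO_KEY.get(variable)
        if key is None:
            parts.append(escape(text))
        else:
            parts.append(f"<keyword keyref={quoteattr(key)}/>")
    return "<p>" + "".join(parts) + "</p>"


def shared_keys(mapping):
    """Return keyref values used by more than one variable.

    Variables that share a key resolve to the same text within one map.
    """
    by_key = {}
    for variable, key in mapping.items():
        by_key.setdefault(key, []).append(variable)
    return {key: names for key, names in by_key.items() if len(names) > 1}
```

With this plan, shared_keys(VARIABLE_TO_KEY) reports that Apple and Banana both map to product, which is a useful prompt to decide whether they ever need to appear together in one deliverable.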

Strategies for combining conversion and reuse

It’s sometimes quite difficult to determine what can and should be done as part of the conversion to build the end result you need. There are two possible approaches:

  1. Manually build your end result in DITA and test it out until you’re sure it’s how you want to work. You can do this on a small set of content if you have limited funds or time, but the larger and more realistic the data set, the more accurately it will reflect your overall needs. Once you have something that actually works the way you want it, with the reuse set up the way you need it, you have a very concrete goal to work towards and can figure out what can be built and automated as part of conversion.
  2. Convert a small set of content, automating the reuse that you know for sure you’ll need, and then iterate on it until you have your desired result.

Either way, slow and steady is the way to go. Diving into conversion without considering reuse can lead to some frustrated hours or days doing something that could have taken seconds.

Best practices to prepare content for reuse

At this point, some of you may be saying that this is just too difficult and too time-consuming to figure out. You want to convert now! Well, that’s OK too, but you should consider doing some basic pre-conversion work that will let you at least search and replace (or find) items that you want to reuse.

For example, you can do the following before or during conversion to save time afterwards:

  1. Re-write content and chunk content: A precursor to any good conversion is making sure your content adheres to the topic-based writing paradigm and that you have clear distinctions between task, reference, and concept. All other work is based on this essential step. This is also where you would remove textual pointers to the location of other content, like the words “before, after, following, preceding, next, first”. In DITA, content can appear in any order, so you should remove any references to location.
  2. Include placeholders for future reuse: If you know you’ll be replacing the step “Select Log in from [graphic] and enter your administrative credentials.” with a conref, then go ahead and replace the content with “Conref login admin.” in your unstructured source. You’re simplifying the structure so your conversion will be faster and easier and afterwards, you can quickly insert a conref or conkeyref right where you need one.
  3. Standardize phrasing: One of the hardest things to do is find content that is almost the same but not quite matching. Although laborious, this pre-conversion cleanup process can set you up for easy reuse down the road. A tool like Acrolinx can help.
  4. Flag candidates with a condition: Use a FrameMaker condition to identify likely or potential reusable content that you want to revisit after conversion. Then convert the conditioned content to <draft-comment> elements. This is a good way to leave yourself a note that is easy to find after conversion.
  5. Convert boilerplate content once, even if it has variations: If you have many versions of legal pages, copyright statements, standard notices, and any other content that is generally standardized, don’t bother converting those with each book. Convert them once, then modify the XML until it meets your requirements for all your books. 
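For the phrasing cleanup in particular, a short script can give you a first list of candidates. The Python sketch below uses the standard library's difflib to flag sentence pairs that are almost, but not exactly, the same. It is only a rough aid under assumptions: the function name near_duplicates and the 0.85 threshold are arbitrary choices for illustration, it compares every pair (so it suits samples rather than whole libraries), and it is no substitute for a dedicated tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(sentences, threshold=0.85):
    """Return (a, b, ratio) for sentence pairs that are similar but not identical.

    Pairs with a ratio of 1.0 are skipped because exact duplicates are easy
    to find with a plain search.
    """
    pairs = []
    for a, b in combinations(sentences, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if threshold <= ratio < 1.0:
            pairs.append((a, b, round(ratio, 2)))
    return pairs
```

Each pair it returns is a candidate for rewriting to a single standard sentence before conversion, so that the converted content can share one conref.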

 
