Revision of Migrating legacy content from Fri, 2008-01-04 16:39
Advice for preparing unstructured documents for conversion:
Label all headings intended for topics with styles that indicate the intended topic type.
For example, in unstructured FrameMaker documents, a single heading2 style used for level 2 headings may be changed in the templates to make several heading2 styles available (heading2concept, heading2task, etc.). This enables the DITA conversion programs to determine which type of topic was intended when the documents are converted.
This is not necessary if you use MIF2Go to convert unstructured FrameMaker content to DITA: MIF2Go can determine the correct infotype automatically, based on the styles used in the "body" section of the topic. For example, if you have a Heading2 followed by steps, MIF2Go will convert this to a task topic.
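For conversion tools that do rely on style names, the suffix convention described above can be sketched roughly as follows. This is only an illustration: the function name, the list of topic types, and the sample headings are hypothetical, and a real converter will have its own mapping mechanism.

```python
# Topic types we assume are encoded as a suffix of the heading style name,
# following the heading2concept / heading2task example in the text.
TOPIC_TYPES = ("concept", "task", "reference")


def topic_type_from_style(style_name, default="topic"):
    """Return the DITA topic type encoded as a suffix of a heading style name.

    Falls back to a generic topic when the style carries no recognised suffix.
    """
    name = style_name.strip().lower()
    for topic_type in TOPIC_TYPES:
        if name.endswith(topic_type):
            return topic_type
    return default


# Hypothetical headings as (style name, heading text) pairs.
headings = [
    ("heading2concept", "About user accounts"),
    ("heading2task", "Creating a user account"),
    ("heading2", "Miscellaneous notes"),
]

for style, title in headings:
    print(f"{title}: {topic_type_from_style(style)}")
```

A heading that still carries the plain heading2 style falls through to the generic type, which is a quick way to spot headings that have not been labeled yet.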
Try to fit all unstructured content to a DITA model. This involves moving all conceptual information out of task topics and into concept topics, moving tables that belong in reference topics out of concept/task topics, ensuring that all task topics have only one main procedure, moving prerequisites into a separate section before the main procedure in task topics, etc.
Clearly understand the difference between concepts and references, and create guidelines that you (and possibly others) will follow when you begin the task of chunking your legacy content. This is crucial to ensure you don't end up with "concepts" that are actually "references", and vice versa.
Consider applying minimalism techniques early. Go through your content and make it minimalist prior to chunking.
Ensure that all books use the same paragraph and character tagging definitions. In FrameMaker, all books should ideally use the same paragraph and character catalogs.
Remove overrides to paragraph and character tag attributes. Replace one-off bold, underline, and italic settings with catalog-based character tags. Doing this helps any automation tools you use do a better job.
Use many of the items in your existing department style guide as a basis to create an Information Model that includes guidelines on how to use the collection of DITA elements. This model would define these elements in a way that would help you enforce the style and branding (look and feel) of all your docs. Having this Model gives authors the needed guidelines to develop new content in DITA. Put to paper the crucial items first (you'll discover what those are as you progress). This Model will develop and mature as time passes.
Ensure that tag names are consistent throughout all books if you're using conditional tagging (such as that in FrameMaker). In the DITA world, these tag names will become "metadata" (values for element attributes such as "audience", "platform", and "product").
These tag names should be defined in a metadata schema, which would be included in your Information Model.
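One way to picture such a metadata schema is as a simple lookup from condition tag names to DITA attribute values. The sketch below is an assumption-laden illustration, not a prescribed format: the tag names, attribute values, book names, and both function names are invented for the example.

```python
# Hypothetical metadata schema: condition tag name -> (DITA attribute, value).
METADATA_SCHEMA = {
    "AdminOnly": ("audience", "administrator"),
    "Linux": ("platform", "linux"),
    "Windows": ("platform", "windows"),
    "ProEdition": ("product", "pro"),
}


def check_condition_tags(tags_by_book, schema=METADATA_SCHEMA):
    """Return {book: sorted tag names that are not defined in the schema}."""
    problems = {}
    for book, tags in tags_by_book.items():
        unknown = sorted(set(tags) - set(schema))
        if unknown:
            problems[book] = unknown
    return problems


def to_dita_attributes(tags, schema=METADATA_SCHEMA):
    """Group the schema values for a set of condition tags by DITA attribute."""
    attrs = {}
    for tag in tags:
        attribute, value = schema[tag]
        attrs.setdefault(attribute, []).append(value)
    # Multiple values for one attribute are written space-separated.
    return {name: " ".join(sorted(values)) for name, values in attrs.items()}


# Hypothetical tag inventories collected from two books.
books = {
    "install_guide": ["Linux", "Windows", "AdminOnly"],
    "user_guide": ["linux", "ProEdition"],  # "linux" differs in case
}

print(check_condition_tags(books))
print(to_dita_attributes(books["install_guide"]))
```

Running the check before conversion surfaces spelling and capitalization variants of the same tag, which would otherwise end up as separate attribute values.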
Determine which content can be reused, both at the topic level (EULA, copyright, preface info) and at the phrase level (company names, product names).
Carefully consider what is worth the trouble to store in a single location and import by conref vs. what is better left typed in as normal text. Going overboard with the conrefs can be a maintenance nightmare, but more reuse means less writing work and lower translation costs. You may find you have several sections that serve the same purpose and are almost the same. Consider writing one generic section that can work in place of the several.
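Finding sections that are "almost the same" can be partly automated. The following is a minimal sketch using Python's standard difflib module to compare sections word by word; the section identifiers, sample text, function name, and the 0.8 threshold are all assumptions chosen for illustration, and the results still need a writer's judgment.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_near_duplicates(sections, threshold=0.8):
    """Return (id_a, id_b, ratio) for section pairs at or above the threshold.

    Similarity is computed on word sequences, so small wording changes
    lower the ratio only slightly.
    """
    matches = []
    for (id_a, text_a), (id_b, text_b) in combinations(sections.items(), 2):
        ratio = SequenceMatcher(None, text_a.split(), text_b.split()).ratio()
        if ratio >= threshold:
            matches.append((id_a, id_b, round(ratio, 2)))
    # Most similar pairs first.
    return sorted(matches, key=lambda match: match[2], reverse=True)


# Hypothetical sections extracted from legacy books.
sections = {
    "install/backup": "Back up your data before you install the product",
    "upgrade/backup": "Back up your data before you upgrade the product",
    "install/support": "Contact support if the installer reports an error",
}

for id_a, id_b, ratio in find_near_duplicates(sections):
    print(f"{id_a} ~ {id_b} (similarity {ratio})")
```

Pairs reported by a check like this are candidates for a single generic section; whether to merge them, or reuse one by conref, remains an editorial decision.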
Contributors to this page:
- Paul Masalsky, EMC
- Yves Barbion, Scripto
- Jerry Pope
- Derek Adams, InfoPros
- Jan Brandego
See also:
- DITA-users mailing list thread