Specification Strategies Thread: <xref> text control

It is nearly impossible to talk about what you want to do without at least giving an example of how you want to do it.  Already, the requirements thread has produced three different potential strategies for how to specify some mechanism to control the text substituted by <xref>.  These three initial suggestions are:

  • Add an attribute to xref to control the output.  (my example)
  • Specialize <xref>, where each specialization produces different output. (Debbie)
  • Use the existing "outputclass" attribute and a newly defined external "reference format description file" along the lines of FrameMaker. (Jeremy)

Please use this thread to flesh these ideas out and contribute any other strategies for specifying the control of <xref>'s text substitution behavior. 

Please also keep in mind that we do all this within the context of interoperability: our result should require the same behavior of all processing systems, be it DITA-OT or FrameMaker or any other solution.  We need to go exactly as far as we need to in order to make that happen, and no further. :) 

Over the weekend I took the time to figure out how ODF handles document structure and sequences.  I do not intend to suggest that we adopt ODF strategies outright; I am merely fleshing out one option so we can see it.  Bearing in mind that ODF text documents are geared for paginated output, interested parties may have a look at:

Regardless of output target, the concept of a multi-level document structure is directly applicable to a DITA map/bookmap. ODF simply has a numbered level structure, where 1 is the outermost (least granular) level, 2 is the next level down, etc. This is directly adaptable to DITA, where the document structure is explicitly constructed via maps, the <topic>/<section> relationship, and topic containment (e.g. topics that are the children of other topics).

If sequences are adapted to DITA (for the numbering of various items), the proper place to declare them is in the map/bookmap.  For the most flexibility w.r.t. reuse, they should be bound to one or more particular target elements.  Implicit sequences should be considered to be present even if they are not explicitly declared in the map, and the appropriate defaults expressed in the standard.  For instance:

  • Implicit table sequence: bound to the <table> element and the <fig type="table"> element.
  • Implicit figure sequence: bound to the <fig type="fig"> element
  • Math domain extension could define an implicit equation sequence: bound to <equation> or <fig type="eq">

If the defaults are acceptable, the author need do nothing.  If they want to change a parameter, they need only explicitly declare the sequence in their map (for example, to relabel every "Figure #" as "Illustration #").  If the author wants to number an element which does not have an implicit sequence defined for it, they should be able to declare a sequence and "attach" it to that element in the map.
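To make the implicit-versus-declared idea concrete, here is a minimal Python sketch of how a processor might merge the defaults with whatever the map declares.  Everything in it is hypothetical: the names Sequence, IMPLICIT, effective_sequences and sequence_for are mine, not anything in DITA or ODF, and the default labels are just the ones from the bullets above.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Sequence:
    name: str        # identifier of the sequence
    label: str       # text shown before the number, e.g. "Figure"
    bindings: tuple  # (element name, required type attribute or None) pairs

# Hypothetical implicit sequences; a standard would have to spell these out.
IMPLICIT = {
    "table": Sequence("table", "Table", (("table", None), ("fig", "table"))),
    "figure": Sequence("figure", "Figure", (("fig", "fig"),)),
}

def effective_sequences(map_declarations):
    """Start from the implicit defaults; explicit map declarations win."""
    merged = dict(IMPLICIT)
    for declaration in map_declarations:
        merged[declaration.name] = declaration
    return merged

def sequence_for(sequences, element, type_attr=None):
    """Return the sequence bound to this element, or None if it is unnumbered."""
    for sequence in sequences.values():
        for bound_element, bound_type in sequence.bindings:
            if bound_element == element and bound_type in (None, type_attr):
                return sequence
    return None

# An author relabels figures and leaves everything else at the defaults.
declared = [replace(IMPLICIT["figure"], label="Illustration")]
sequences = effective_sequences(declared)
print(sequence_for(sequences, "fig", "fig").label)   # relabelled sequence
print(sequence_for(sequences, "table").label)        # untouched default
```

An author who declares nothing gets the defaults unchanged, which is the "need do nothing" case.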

The specification of the level of structure which resets the sequences to their starting value should also occur in the map.  For convenience, an element to set the default reset level for all sequences should be provided.  Individual sequence declarations may override this default.
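The reset rule is easy to get subtly wrong, so here is a rough Python sketch of one way it could work.  It assumes that level 1 is the outermost level of structure and that a sequence resets whenever the processor enters a unit at its reset level or any level outside it.  The class and method names are hypothetical.

```python
class SequenceCounters:
    """Tracks sequence values while walking a map in document order."""

    def __init__(self, default_reset_level=None, overrides=None):
        # default_reset_level applies to every sequence without an override;
        # None means "never reset" (number straight through the document).
        self.default_reset_level = default_reset_level
        self.overrides = dict(overrides or {})
        self.values = {}

    def reset_level(self, name):
        return self.overrides.get(name, self.default_reset_level)

    def enter_level(self, level):
        """Call when the walk enters a structural unit at the given level."""
        for name in list(self.values):
            reset = self.reset_level(name)
            if reset is not None and level <= reset:
                self.values[name] = 0

    def next_value(self, name):
        """Call when an element bound to the named sequence is encountered."""
        self.values[name] = self.values.get(name, 0) + 1
        return self.values[name]

# Figures restart per level-1 unit; tables override that and restart per level-2 unit.
counters = SequenceCounters(default_reset_level=1, overrides={"table": 2})
counters.enter_level(1)
counters.next_value("figure")
counters.next_value("table")
counters.enter_level(2)             # resets tables only
print(counters.next_value("figure"), counters.next_value("table"))
```

Note that next_value is called simply because a bound element was encountered; nothing in the content has to increment the counter by hand.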

<xref> could be extended with a <sequence-ref>, to be used for any item having a sequence attached.  It could also be extended with a <doc-ref>, to be used for cross referencing any major element of document structure (topic, section).  These items must target only elements which have a defined "address" in a document (e.g., the element must be contained by a topic included in a map, and the map must be of type "sequence").
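By "address" I mean something like the following Python sketch, which derives a position for every topic from its place in the map and refuses to resolve a reference to anything outside the map.  The nested-tuple representation of a map and the function names are assumptions made purely for illustration.

```python
def assign_addresses(map_tree, prefix=()):
    """Give every topic in the map an address such as (2, 1).

    map_tree is a list of (topic_id, children) pairs in document order,
    a stand-in for nested topic references in a map.
    """
    addresses = {}
    for position, (topic_id, children) in enumerate(map_tree, start=1):
        address = prefix + (position,)
        addresses[topic_id] = address
        addresses.update(assign_addresses(children, address))
    return addresses

def resolve_doc_ref(addresses, topic_id):
    """Return the dotted address, or fail if the target is not in the map."""
    if topic_id not in addresses:
        raise LookupError(f"{topic_id!r} has no address: it is not included in the map")
    return ".".join(str(part) for part in addresses[topic_id])

example_map = [
    ("intro", []),
    ("install", [("prereqs", []), ("steps", [])]),
]
addresses = assign_addresses(example_map)
print(resolve_doc_ref(addresses, "steps"))
```

A topic that exists on disk but is not pulled into the map simply has no entry, so a <doc-ref> pointing at it would be an error rather than producing a made-up number.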

The ODF solution supports a finite set of display formats for document locations (e.g. "chapter") and for sequence values, both where the item appears and where it is referenced, much like my suggestion above.  In DITA, the caption text would be gleaned from the target's child <title> element.  Control over the output is predictable and simple.  It does not have the flexibility of Jeremy's templating engine, but it covers all of my examples.
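For illustration, a closed set of display formats could be as small as the Python sketch below.  The format names are loosely modelled on the kind of values ODF offers, but treat them, and the Target fields, as hypothetical placeholders rather than a proposal for actual attribute values.

```python
from dataclasses import dataclass

@dataclass
class Target:
    label: str         # sequence label, e.g. "Figure"
    value: int         # sequence value assigned to the target
    title: str         # text of the target's child <title>
    chapter: str = ""  # document location of the target, if known

# A fixed, finite menu of displays: no templating language involved.
FORMATS = {
    "value": lambda t: str(t.value),
    "category-and-value": lambda t: f"{t.label} {t.value}",
    "caption": lambda t: t.title,
    "chapter": lambda t: t.chapter,
}

def render_sequence_ref(target, display="category-and-value"):
    """Produce the substituted text for a reference to a numbered item."""
    if display not in FORMATS:
        raise ValueError(f"unsupported display format: {display!r}")
    return FORMATS[display](target)

pump = Target(label="Figure", value=3, title="Pump assembly", chapter="Chapter 2")
print(render_sequence_ref(pump))
print(render_sequence_ref(pump, "caption"))
```

Because the menu is closed, every processor can be required to produce the same text for the same input, which is the interoperability point.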

I do not see a need for an analog to the <text:sequence> element, which increments the counter.  Binding the sequence to a particular element performs that function whenever the target element is encountered.  Nor do I see a need to include a formula attribute (which, by the way, has no syntax or grammar associated with it).

Thoughts?  I don't intend to present this as a highly polished finished plan.  It's just what naturally occurs to me.

Numbered paragraphs and cross references to list items are a separate issue and are not addressed here. Linking behavior is a separate issue and is not addressed here.  Styling the text with a particular font or typeface is a separate issue and is not addressed here.

Till now, the DITA spec has avoided making any statements about the outputclass attribute, leaving it to users to use or ignore as they need.  I think it would be against the spirit of outputclass to force users to use it to get a particular feature.

That's not to say that a bunch of users couldn't get together and agree to do things a certain way, including using outputclass.  But if the users want to create a de facto standard and submit it to the DITA TC as something that should be blessed and official, it will have more weight as a full-fledged specialization.

Yes, DITA has not pronounced on this yet, but I bet they soon will.  Overall, DITA does good work.
