Requirements thread: <xref>

Please use this thread to discuss how you would like <xref> to behave.  <xref> is, in the opinions of some, underspecified.  Please show here how you would like to use it.  We will discuss implementation issues on another thread.

To start:

  1. I would like the text substituted by <xref> to be deterministic, such that a sentence may be written to include it.  Within contexts where hyperlinking is supported, all of the text substituted by <xref> should be presented as a link.
    • Pre-substitution: shown in <xref href="..." type="fig"/>, the widgets...
    • Post-substitution: shown in Figure 2, the widgets...
  2. I would like the text substituted by <xref> to be controllable, such that some instances may contain just the sequence number while others contain other degrees of detail.  Examples here use an arbitrarily selected textsub attribute to perform this control.
    1. Sequence name and number (default): 
      • Pre-substitution: shown in <xref href="..." type="fig"/>, the widgets...
      • Post-substitution: shown in Figure 2, the widgets...
    2. Above/below; previous/following:
      • Pre-substitution: shown in the <xref href="..." type="fig" textsub="previous"/> figure, the widgets...
      • Post-substitution: shown in the following figure, the widgets...
      • Pre-substitution: shown in the figure <xref href="..." type="fig" textsub="above"/>, the widgets...
      • Post-substitution: shown in the figure below, the widgets...
    3. Sequence number only:
      • Pre-substitution: shown in Illustration <xref href="..." type="fig" textsub="number"/>, the widgets...
      • Post-substitution: shown in Illustration 2, the widgets...
    4. Sequence name, number, and caption text:
      • Pre-substitution: shown in <xref href="..." type="fig" textsub="caption"/>, the widgets...
      • Post-substitution: shown in Figure 2: Definitions of Widgets, the widgets...
  3. I would like the option of multilevel sequence numbering (e.g. Figure 2.3.1), in which the value at the rightmost level is the number within the topic and the values of all other levels are dictated by the placement of the topic within the map.  The default may be a flat numbering system (e.g. Figure 1-N, regardless of the number of topics).
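
A minimal sketch of requirements 1 and 2 in Python, to make the expected substitutions concrete. It assumes the arbitrarily chosen textsub attribute from the examples above. The names Target and render_xref are hypothetical, and so is the rule that the processor picks "previous"/"following" or "above"/"below" from the position of the target relative to the xref. None of this is defined by DITA.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical resolved target of an xref (not a DITA structure)."""
    label: str           # sequence name, e.g. "Figure"
    number: str          # sequence number, e.g. "2"
    caption: str         # caption text
    precedes_xref: bool  # True if the target appears before the xref


def render_xref(target: Target, textsub: str = "default") -> str:
    """Return the text substituted for an xref, selected by textsub."""
    if textsub == "default":
        return f"{target.label} {target.number}"
    if textsub == "number":
        return target.number
    if textsub == "caption":
        return f"{target.label} {target.number}: {target.caption}"
    if textsub == "previous":
        # Assumption: the direction word is chosen from document order.
        return "previous" if target.precedes_xref else "following"
    if textsub == "above":
        return "above" if target.precedes_xref else "below"
    raise ValueError(f"unknown textsub value: {textsub!r}")


fig = Target("Figure", "2", "Definitions of Widgets", precedes_xref=False)
print(f"shown in {render_xref(fig)}, the widgets...")
print(f"shown in the {render_xref(fig, 'previous')} figure, the widgets...")
print(f"shown in Illustration {render_xref(fig, 'number')}, the widgets...")
print(f"shown in {render_xref(fig, 'caption')}, the widgets...")
```

The point of the sketch is that the output depends only on the target and the textsub value, never on the output format, which is what lets an author write the surrounding sentence with confidence.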

Hi Bryce,

I've only got a moment this morning to respond, so I just want to correct a miscommunication I made.  The examples I gave of other uses of <xref> (syntax diagram fragment reference; Java API class inheritance) do use the specialization mechanism.  They aren't raw <xref> elements. I was intending to show them as examples of specialized uses of <xref> which don't interfere at all with the meaning of the base <xref> element.

I won't respond to the remainder immediately; you've got some good meaty comments which I will need time to digest.

Briefly, it occurs to me that the notion you are describing is that of an "abstract" element.  Or at least it would be if this was object-oriented modeling.  A good deal of the common structure has been specified, but the last bit (the behavior) has been left unspecified.  This is very much like an abstract superclass with an abstract method. One creates an instantiable class by specialization. One "implements" the missing functionality by specifying the behavior for the chosen target. (Note that in this context, tool vendors are not implementors, domain authors are.)
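
To make the object-oriented analogy concrete, here is a minimal Python sketch. The class names Xref and FigXref, the link_text method, and the href value are all hypothetical; this only illustrates the "abstract superclass with an abstract method" idea and is not a model of any DITA processor.

```python
from abc import ABC, abstractmethod


class Xref(ABC):
    """Common structure shared by all cross references (illustrative)."""

    def __init__(self, href: str):
        self.href = href

    @abstractmethod
    def link_text(self, target: dict) -> str:
        """The unspecified piece: what text to substitute for a target."""


class FigXref(Xref):
    """A specialization that supplies the missing behavior for figures."""

    def link_text(self, target: dict) -> str:
        return f"Figure {target['number']}"


try:
    Xref("topic.dita#topic/fig2")  # abstract: Python refuses to instantiate
    instantiated = True
except TypeError:
    instantiated = False

text = FigXref("topic.dita#topic/fig2").link_text({"number": 2})
print(instantiated, text)
```

Here the base class carries the shared structure (the href), while only the specialization can actually be created and asked for its text.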

If this is how xref should be treated, then its use should be prohibited.  Concrete elements should be provided for specific types of cross referencing, but xref itself should never be instantiated and processing systems should be instructed to discard it.  Process only the children, never xref itself.  

Implies: the only permissible cross reference is that which is explicitly defined.

Change from current: "Implementation" would be provided by specification (DITA or a domain) of an element, to which all vendors must adhere if they support that domain.

Hi Bryce,

There are a lot of parallels between the DITA hierarchy and OOP.  This idea of abstract elements is among them.

DITA is torn between architects who care about precise encapsulation, who believe that most base elements are too general for everyday use, and users who just want to make their content and aren't worried about information typing. There are members of both camps in the DITA TC, so what's happened is that the spec has had to compromise, and so it caters to both. Sometimes with mixed results. You could probably classify the xref rendering issue somewhere in there.

I'm all for deprecating raw xref in domains where it would be unacceptable (e.g., in the scientific-journal-article document type that started this conversation).  DITA 1.2 provides a mechanism for doing just this: you (the information architect) can allow certain specializations of <xref> but not allow raw <xref> itself. This feature isn't in DITA 1.1 and can't be for compatibility reasons.


I have a hard time getting my head around the data model in XML specs like this.  Certain fundamentals, like "incompletely specified items may never be instantiated" are often violated without a second thought.  Programming languages disallow the instantiation of abstract classes for good reason: a piece is missing; it can't run; linker error! 

It's the same thing here.  xref is missing a piece in order that its children may fill in just that piece (behavior on resolving a target) while inheriting all the common structure.  At least this is your interpretation of the TC's motives.

But the documentation says the opposite.  We are required to use xref for all cross referencing:  xref is the only element which can target an item more specific than a topic.  Legal types defined for xref include "fig", "table", "li", etc.  And yet the desired behavior when it targets the elements it's specified to work with is undefined.

There is a fundamental contradiction here.  On the one hand it makes sense to have an abstract superclass from which specific types of references inherit.  On the other hand, it's not abstract.  There's not even a breath of a suggestion that we not use it.  To the contrary we are assured that it is the only element that even has a shot at performing the desired task.

This kind of ambiguity is exactly what a good specification resolves.  xref is either abstract or it is not.  It is either designed as a generalized extension point which is not useful to place in the document, or it has a specific meaning.  A good standard resolves ambiguity and puts everyone on the same page.  With respect to xref, this standard is capricious.  I don't care which side of the fence they land on: just pick a side and stick with it!!


xref is designed for immediate use. It is also designed for specialization. This is true of a lot of elements in DITA. There are very few purely abstract elements (those which have no default behavior associated with them) - for example, <unknown>. 

Michael Priestley

To be ready for use, the default behavior must be completely specified.  You may allow children to override the defaults, but the default must be defined.  Failure to provide a default means that the element is not ready for direct use.  This is why abstract items may not be instantiated!  Failing to provide a default and declaring that the element is ready for use is sitting on the fence. You may have one or the other but not both.

Now, with the caveat that "examples" are not typically considered normative, the default text has been "specified" for topics:

"Here's an example of a cross-reference to another topic; that topic's title will be used as the link text."

There is not even an example which applies to exhibits.  

My criticism of xref has been evolving over the past week as I talk with more people, but this is where it stands now:

  1. I would recommend that the TC "pick a side":
    1. prohibit the direct use of xref (i.e. require each type of reference to define its own behavior), and declare in the standard body that xref exists only as an extension point; or
    2. specify the default behavior (i.e. something which can always happen regardless of target element type).
  2. I further recommend that the TC provide a type of reference for all of the elements defined by the TC which are expected to be targets (e.g. topics, sections, figs, li, tables...).  These may serve as examples to those who would define their own domains.

Other than that, I'm here to help gather information on how this function (numbered item reference) is typically specified, and to flesh out some options, giving the TC a head start once they pick a side.

You've given me two options: either it's intended for use directly, or it's intended only for specialization.

Speaking to intent, I continue to say it's intended for both. And people have been using it for both for years.

So yes, there could be more information about processing behaviors in the spec. There are also people who argue strenuously against putting anything about processing in the spec, on the premise that the point of XML is to provide processor-independent representations of content. But I think I'm siding with you on this. I just expect an argument when it comes up at the TC.

I appreciate your frustration with the current under-specification. It doesn't change my statement that it is intended for both. In fact, it bolsters your case that it needs more specification. If it were only intended to be abstract, you wouldn't have a case against the spec. 

Michael Priestley

That's not actually what I said.

I said you either specify a default behavior or you do not. Both cases allow for specialization.

Children may override the default behavior if it is provided.  Children may supply a behavior if it is omitted.

The only thing you may not do is fail to provide a behavior and allow the element to be instantiated.

The spec says that, e.g., an xref to a table should be supported by processors - that is, turned into a link or cross-reference on output. It does not say how exactly, but it does say what to do. 

It doesn't say what to do in cases of xref to other things, e.g. xref to a phrase. So the element has default behaviors for one class of targets, but not for another.

Regardless, I'm just trying to make the point that xref is not intended to be an abstract element. It is intended for use as well as for specialization, and if that use is inhibited for some users because of a lack of information in the spec, then I think that's something to fix in the spec, not something to fix by reclassifying xref's intent.

Michael Priestley

A reader sent this to me in email but it's a good thing to keep in mind to avoid tunnel vision.  Used with email permission.

Contributed Email

Hi Bryce,

I saw your post on XML DITA and wanted to bring the Congressional legislative approach to your attention.

It uses two attributes (legal-doc and parsable-cite).  The parsable-cite attribute enables an encoding of the variables for (what I hoped would be) any citation.  For example, the US Code citations are something like 110 USC 2(a) and the parsable-cite would be “usc/110/2”. 

<external-xref legal-doc="usc" parsable-cite="usc/20/1001">20 U.S.C. 1001(a)</external-xref>

We chose not to use href because the XML files will likely be a resource for a much longer period of time than any web address (e.g., organization name change could change URLs) and the parsable-cite approach enables a simple approach for handling at least the legal citations that legislation required.
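
A minimal Python sketch of how a processor might read the element above. The function name parse_cite and the returned keys are hypothetical. Treating the first path segment as the corpus and the remaining segments as the citation components is an assumption drawn from the single example in this email, not from any published schema.

```python
import xml.etree.ElementTree as ET

snippet = (
    '<external-xref legal-doc="usc" parsable-cite="usc/20/1001">'
    "20 U.S.C. 1001(a)</external-xref>"
)


def parse_cite(xml_text: str) -> dict:
    """Split a parsable-cite value into its slash-separated parts."""
    elem = ET.fromstring(xml_text)
    parts = elem.get("parsable-cite").split("/")
    return {
        "legal_doc": elem.get("legal-doc"),  # which body of law is cited
        "corpus": parts[0],                  # assumed: first segment
        "components": parts[1:],             # assumed: remaining segments
        "display_text": elem.text,           # human-readable citation
    }


cite = parse_cite(snippet)
print(cite["corpus"], cite["components"], cite["display_text"])
```

Because the citation is carried as structured data rather than a URL, a processor can decide at output time how, or whether, to turn it into a link.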

I’m not a DITA user so I didn’t want to post this to the site, but I thought it might be helpful info to have. 

Best regards,

Joe Carmel

In an abstract sort of way, I don't care whether we fix this by specialization (if it's a standardized domain), a templating language, or by an attribute.

In this last message, I see you providing examples of other domains directly using the generic xref element without specialization.  They accomplish this by associating a desired result with a specific target (and leaving the specification of behavior w.r.t. other targets unaffected).  To parallel your examples, then, we would not specialize xref, but attempt to define the desired behavior for a suite of specific targets (table, fig, etc.)  Did I misread your presentation?

I think this basically says that "when a new element is defined, the expected behavior when this new element is the target of xref should also be defined".  It just so happens that "exhibits" have a common desired behavior, and "exhibits" are already part of the base specification.  Ergo, the specification of how xref behaves when it targets an exhibit also belongs in the specification.

You do have a point about other domains perhaps not being able to use a platform neutral template language which is targeted at exhibits. Could <xref> be parameterized by "target specific options" (which may be either an optional attribute or an optional child element)?  In the case of exhibits it could be a very simple template string such as Jeremy suggests.  For other targets in other domains, valid options would be defined with the element, and processors which grok that domain would also have to grok the options.

Neat, tidy, scalable, combats element proliferation,  and conforms to current use. As a bonus, xref behavior is actually specified instead of being a free for all!  And last but not least, the specification of behavior as an xref target is alongside the specification of the element itself. Easy to write, maintain, and read.
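
A rough Python sketch of the "target specific options" idea, assuming each target element type registers its own xref behavior alongside its definition. The registry XREF_BEHAVIORS, the decorator xref_target, the "template" option, and the {num}/{title} placeholders are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# Registry: target element type -> function(target, options) -> link text.
XREF_BEHAVIORS: Dict[str, Callable[[dict, dict], str]] = {}


def xref_target(type_name: str):
    """Register the xref behavior defined together with a target element."""
    def register(func):
        XREF_BEHAVIORS[type_name] = func
        return func
    return register


@xref_target("fig")
def fig_text(target: dict, options: dict) -> str:
    # For exhibits the option is a very simple template string.
    template = options.get("template", "Figure {num}")
    return template.format(num=target["num"], title=target["title"])


@xref_target("table")
def table_text(target: dict, options: dict) -> str:
    template = options.get("template", "Table {num}")
    return template.format(num=target["num"], title=target["title"])


def resolve(type_name: str, target: dict, options: dict = None) -> str:
    """Look up the behavior for a target type; fail loudly if undefined."""
    try:
        behavior = XREF_BEHAVIORS[type_name]
    except KeyError:
        raise LookupError(f"no xref behavior defined for {type_name!r}") from None
    return behavior(target, options or {})


fig = {"num": 2, "title": "Definitions of Widgets"}
print(resolve("fig", fig))
print(resolve("fig", fig, {"template": "Figure {num}: {title}"}))
```

A processor that does not know a domain simply has no entry for that domain's targets, so the failure is explicit rather than a silent free-for-all.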

New Requirement

  • Speak for everyone if changing a base element.  Exhibits are not the only xref targets.

I'll try to rephrase the scope thing, because I'm afraid that I've missed the mark.

By making requests for a change in the behaviour of the base element <xref>, you end up implicitly speaking for all users of <xref>, past, present and future.  That's fine if you are aware of every use that people are making of <xref>, and you are confident that you are not hindering their use of it.  In practice, because of specialization, you can't possibly make that claim. At best, you are speaking for the people who are using <xref> the way you use it.

Here are some uses of <xref> that I have seen in my travels:

  • In the Java API reference specialization, points to another class which contains the base class of this class. Processing expectations are to write the class name (and perhaps make it a hyperlink).
  • In the programming domain specialization, points to a syntax diagram fragment which is inserted in the grammar being described. Processing expectations are varied: for the railroad diagram presentation form, draws a box with the name of the referenced fragment inside. May result in a dynamic diagram which expands inline to the referenced fragment on user interaction.

These uses of specializations of <xref> don't care at all about participating in inline sentences.

The defined properties of basic <xref> should contain only those things that are common to all <xref>s: yours, mine, everybody's. If we make an exception for you, then we must make an exception for everyone, and the base definition of <xref> will become cluttered with stuff, most of which is useless for any given use.

This is what I mean by "scope". Either you speak for everyone, or you draw a line in the sand and stay on your side. And in the case of <xref>, the above examples demonstrate that you don't speak for everyone, any more than the creator of that syntaxdiagram specialization does.

How do you draw a line in the sand? Specialize. Make a specialized version of <xref> and give it the behaviour that you want.  Share the specialization with anyone else who wants to use it. If it serves the needs of a lot of users, then it will become widely used. Those for whom it serves no need can carry on without it. This is how DITA expects you to solve this problem.
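
A small Python sketch of how a processor could honor a specialization while still falling back to base <xref> handling. It assumes the DITA convention of a class attribute listing module/element tokens from the base element to the most specific one. The book-d/xref-in-book token, the HANDLERS table, and the choice of the target title as the base behavior are hypothetical and used only for illustration.

```python
def class_chain(class_attr: str) -> list:
    """Return the module/element tokens of a class value, most specific first."""
    tokens = class_attr.split()
    # The first token is "+" or "-" and carries no element name; drop it.
    return list(reversed(tokens[1:]))


# Hypothetical behaviors, keyed by module/element token.
HANDLERS = {
    "topic/xref": lambda target: target["title"],
    "book-d/xref-in-book": lambda target: f"Figure {target['num']}",
}


def link_text(class_attr: str, target: dict, handlers: dict = HANDLERS) -> str:
    """Use the most specific behavior the processor knows about."""
    for token in class_chain(class_attr):
        if token in handlers:
            return handlers[token](target)
    raise LookupError(f"no handler for {class_attr!r}")


target = {"title": "Definitions of Widgets", "num": 2}
specialized = "+ topic/xref book-d/xref-in-book "
print(link_text(specialized, target))
# A processor without the specialization falls back to base behavior.
base_only = {"topic/xref": HANDLERS["topic/xref"]}
print(link_text(specialized, target, base_only))
```

Those who adopt the specialization get its behavior; everyone else carries on with whatever the base element does.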

Basic <xref> is so generic that, really, no author should be using it directly. This is the same argument as the one against people using <topic> directly. It's the same argument that led to DITA 1.2 getting a more generalized task topic type.


Another Thread Created

Ok lots of good ideas and I want to keep this up while at the same time staying on topic.  I took the three major "specification strategies" reflected in this thread and started another thread for us to flesh them out.  Please take the discussion of how to accomplish the control of <xref> output to:


In terms of scope, I very much want to eliminate the unconstrained behavior given in Debbie's example (e.g. for HTML, <xref> would substitute "Figure 2" but for PDF it would substitute "Figure 2 on page 10".)  This very thing is why I'm here trying to constrain <xref>.  Remember: we're putting this in the middle of a sentence and the rest of the sentence does not change when the output target does, so whatever text is substituted should either be the same or something the author can vet.  And above all: all platforms must be required to substitute the same text!

My scope statement is: "Make the text of <xref> deterministic such that I can use it in a sentence!"

As to localization issues, these are intertwined with translation issues.  Those doing the translating are going to have to type a new sentence around the <xref>, at which time they will need to select a "format" appropriate for the new usage.  While I understand your point about needing to substitute different words for different target languages, I do not think it is necessary to require that all of the use cases common to English have exact parallels in other target languages.  The human translator is ultimately responsible for authoring the translated document.

New Requirement Recap: 

  •  <xref> needs to work in any language, not just English.  Therefore the mechanism by which the text substitution is constrained shall not assume English.

Point for discussion:

The crux of Jeremy's suggestion, specifics aside, is that a generic templating language may offer users flexibility and interoperability without requiring every template to be specified ahead of time in the DITA standard. This contrasts with my initial suggestion of a fixed number of templates having specific substitution patterns of relevance to English audiences.  At the same time, it would seem to go a long way toward solving Debbie's internationalization problem.

So, we certainly need more flexibility than my predefined templates to address Debbie's requirement.  Do we need the full power of Jeremy's suggestion or is there some middle ground? 
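
One possible middle ground, sketched in Python: keep a small set of named formats, but hold the wording of each format as per-language data instead of fixing English patterns in the standard. The format names, the language codes used as keys, and the {num}/{title} placeholders are hypothetical.

```python
# Per-language templates for the same named formats (illustrative data).
TEMPLATES = {
    ("en", "name-number"): "Figure {num}",
    ("en", "caption"): "Figure {num}: {title}",
    ("de", "name-number"): "Abbildung {num}",
    ("de", "caption"): "Abbildung {num}: {title}",
}


def substitute(lang: str, fmt: str, **values) -> str:
    """Return the substituted text for a named format in a given language."""
    try:
        template = TEMPLATES[(lang, fmt)]
    except KeyError:
        raise LookupError(f"no {fmt!r} format defined for {lang!r}") from None
    return template.format(**values)


print(substitute("en", "name-number", num=2))
print(substitute("de", "caption", num=2, title="Widget-Definitionen"))
```

The translator still chooses which format fits the new sentence; the mechanism itself makes no assumption that every English format has a counterpart in every language.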

Under Scope, you wrote:

In terms of scope, I very much want to eliminate the unconstrained behavior given in Debbie's example (e.g. for HTML, <xref> would substitute "Figure 2" but for PDF it would substitute "Figure 2 on page 10".)  This very thing is why I'm here trying to constrain <xref>.  Remember: we're putting this in the middle of a sentence and the rest of the sentence does not change when the output target does, so whatever text is substituted should either be the same or something the author can vet.  And above all: all platforms must be required to substitute the same text!

I know of cases where people just write around this, for example by putting xrefs that have generated text at the end of sentences (i.e., if you want it in the middle of a sentence, then specify the text explicitly). If you want the same text though for both HTML and print, how would you reconcile it? You presumably don't want to eliminate page numbers from the PDF/printed version, but page numbers in the HTML don't make sense either. How can you have the same generated text for both outputs? 

Michael Priestley

Rather than add an attribute, or even a DITA domain specialization, I'm inclined to use the existing @outputclass on the <xref> element, then provide a second source file, something like a ditaval file, to provide definitions of the formats named.

The formats, then, would contain text, in-line elements like <b> and <i>, possibly even <ph conref="..."> for variables, and tokens to represent the xref source's text and numbering. If you have used FrameMaker, you will recognize this as similar to the Frame "Cross-reference Formats".

This could provide a very flexible system without requiring any changes to the DITA spec itself.  For example, if you had:

<xref  href="..." type="fig" outputclass="FigName"/>

and a second xrefs.xml file with:

<formdef name="FigName"><i>Figure <num/></i></formdef>

you would get what you suggest in 1, with the figure number in italics.

This method would allow easy switching of definitions for different output formats; you don't want the same ones for PDF and HTML, for example.

An initial set of tokens might include <text/> for the full content of the referenced element (or just the title for topic, fig and table), <num/> for the element's (generated) number, <page/> for the print-output page number, and <above/> and <previous/> with the semantics you suggest above.
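
A minimal Python sketch of how such format definitions might be expanded, assuming the tokens named above. The <formats> wrapper element, the FigPage format, and the function names load_formats and expand are hypothetical; only <formdef>, @outputclass, and the token names come from this post.

```python
import xml.etree.ElementTree as ET

FORMATS_XML = """
<formats>
  <formdef name="FigName"><i>Figure <num/></i></formdef>
  <formdef name="FigPage">Figure <num/> on page <page/></formdef>
</formats>
"""


def load_formats(xml_text: str) -> dict:
    """Map each format name to its <formdef> element."""
    root = ET.fromstring(xml_text)
    return {fd.get("name"): fd for fd in root.iter("formdef")}


def expand(node: ET.Element, values: dict) -> str:
    """Recursively expand a format definition into output text."""
    out = node.text or ""
    for child in node:
        if child.tag in values:
            # A token such as <num/> or <page/>: substitute its value.
            out += str(values[child.tag])
        else:
            # Inline markup such as <i>: keep it and expand its content.
            out += f"<{child.tag}>{expand(child, values)}</{child.tag}>"
        out += child.tail or ""
    return out


formats = load_formats(FORMATS_XML)
xref = ET.fromstring('<xref href="..." type="fig" outputclass="FigName"/>')
print(expand(formats[xref.get("outputclass")], {"num": 2}))
print(expand(formats["FigPage"], {"num": 2, "page": 38}))
```

Swapping in a different definitions file for each output format would change the text without touching the topic source.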

The numbering is another area in need of definition; that could be handled in a very similar way.  Again, I'd base it on Frame's methods, used by tens of thousands of tech writers worldwide for many years.  Again, @outputclass could provide the format linkage, and still be fully available for its present purposes.

Jeremy H. Griffith
Omni Systems


It's important to be clear about the scope of the requirements here.  Requirement 1 may warrant different text for HTML outputs ("Figure 2") compared with PDF outputs ("Figure 2 on page 38"). How does it look if you want to reference a figure in a different HTML page ("Figure 2 in topic Flossing Cats")?  Either all of these cases need to be specified, or the requirement's scope needs to be reduced to only certain cases (e.g., figures in the current topic in print output). Requirement 2, for instance, assumes a continuous document flow (such as a book, or a single-page HTML document).  Do "following" and "above" really work when you are viewing a multi-page HTML document set?

All of these requirements need to be considered in the light of localization.  Not everyone translates their topics, but to make these requirements useful to those who do, the thorny issue of constructing sentences by concatenating strings needs to be considered. Requirement 2 is particularly ringing my I18N alarm bells, but even Requirement 1 makes assumptions about grammatical number, gender and mood.

Since these requirements won't be useful to everyone, I hope that we can agree that it would be best to frame their implementation as a DITA domain specialization of xref. All the examples are saying <xref> but they will probably need to say <xref-in-book> or whatever element name you choose. Specializing elements rather than adding attributes is the DITA extension mechanism of choice.

Requirement 3 (multi-level numbering) is something that could be done entirely at processing time.  (Or are you envisaging that the author uses different markup?) It would make sense for this to apply to all kinds of things, not just figures and tables. The legal profession would probably want it to apply to paragraphs, for instance. Things like Requirement 3 are probably the basis of a much bigger requirement, namely supporting and enforcing style guides in processing. I wonder if Requirement 3 should be spun off into its own area, since it can be implemented independently of the other ones.

Big shiny note! I am not saying that any of this is a bad idea.  I think it's a good idea.  But you need to be clear about the scope of your requirements, and to ensure that you don't have unforeseen effects on authors working outside of your scope.

You're my "straight man", Debbie, one step ahead of me.  Requirement three is really the articulation of the sequence abstraction and how it makes an appearance in the DITA source document.  It is indeed separable and should be developed independently.  Once the sequence abstraction is defined, it's just a matter of defining how the sequence instances interact with the DITA elements.

As to your question, I do envision giving the author a rather simple control over sequence numbering (i.e., permitting them to specify "multi-level" or "flat") which would be binding regardless of the output target.

Any further discussion on this really should be ported to a different thread.

Ok, the spam filter ate my last message.  Suffice it to say that I do envision a simple author control to indicate "multi-level" or "flat" numbering which is binding on all output targets and I DO see this as a separate topic which can be developed independently and applied to any DITA element which may need to participate in a numbering system.
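
A short Python sketch of the "multi-level" versus "flat" control, assuming the map is available as a nested list of topics. The dictionary layout, the function name number_figures, and the figure ids are hypothetical. In multi-level mode the rightmost value is the figure's number within its topic and the other values come from the topic's placement in the map.

```python
from itertools import count


def number_figures(topics: list, mode: str = "multi-level") -> dict:
    """Map each figure id to its number string, in document order."""
    if mode not in ("multi-level", "flat"):
        raise ValueError(f"unknown numbering mode: {mode!r}")
    numbers = {}
    flat = count(1)  # single running counter for flat numbering

    def walk(nodes, path):
        for position, node in enumerate(nodes, start=1):
            here = path + (position,)  # placement of this topic in the map
            for index, fig_id in enumerate(node.get("figs", []), start=1):
                if mode == "flat":
                    numbers[fig_id] = str(next(flat))
                else:
                    numbers[fig_id] = ".".join(map(str, here + (index,)))
            walk(node.get("children", []), here)

    walk(topics, ())
    return numbers


TOPIC_MAP = [
    {"figs": ["overview"]},
    {"figs": [], "children": [{"figs": ["widget-a", "widget-b"]}]},
]
print(number_figures(TOPIC_MAP))
print(number_figures(TOPIC_MAP, mode="flat"))
```

Since the numbers are computed from the map alone, the same choice can be binding on every output target.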

As a workaround, I found it necessary to implement a divisional numbering arrangement, in which a prefix for the reference number is derived from part of the title of the referenced topic.

Feature numbering within each such division also restarts from one, and that intra-divisional number is appended to the prefix (with a period separator) to form the reference number. Such features include <table> and <fig>.


I'm thinking that the articulation of a "sequence" abstraction may be necessary before this is over and was planning on starting a new thread just to hash that out.  If you're interested in this topic from a "what do I need to type into my dita document" standpoint, please feel free to start that topic yourself and provide a link to it here.