Marking up code samples

When documenting programming languages we commonly include code snippets within the documentation to illustrate how to go about doing something. In HTML this would usually include <PRE> or <CODE> tags to inform the browser that we want to see this snippet in a non-proportional font with the white space preserved.

However, modern programming tools go one better and now mark up this text with colour to assist with readability. So, rather than just displaying a block of black text, they display something like this:

Is there an existing / standard way to do this with DITA?


It is a common myth in some developer circles that good code is self-documenting. Heck, I have seen thousands of lines of shockingly bad code and HTML without a single comment. When asked, the original developer will often tell you that it is self-documenting code.

I used the prettify.js script (it is used on the Google Code site; you can search for it to get the script and docs).

Four things need to happen in the build to get this highlighter into your HTML output and highlight code examples. (This is the nicest highlighter I've seen; it guesses at the language, so there is no need to specify it. I couldn't get the plug-in for the other highlighter to work anyway.)

In the <head>:

1 <link rel="stylesheet" type="text/css" href="prettify.css"/>

2 <script type="text/javascript" src="prettify.js"></script>

On the <body> tag:

3 <body onload="prettyPrint()">

On each code block:

4 <pre class="prettyprint">


For 1 and 2, use args.hdr and args.hdf. For 4, use outputclass on the codeblock. The latest milestone build contains a way to modify the body tag to do 3. I haven't had a chance to try it yet; I'm also waiting for a bug fix, so I'll probably wait and get that build.
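
To make the four pieces easier to picture, here is a minimal sketch of what a finished HTML page might look like once the build has added them. It assumes prettify.css and prettify.js sit in the same folder as the page; the title and the one line of sample code are made up for illustration.

```html
<html>
  <head>
    <title>Code sample</title>
    <!-- 1: the prettify stylesheet (path is an assumption) -->
    <link rel="stylesheet" type="text/css" href="prettify.css"/>
    <!-- 2: the prettify script (path is an assumption) -->
    <script type="text/javascript" src="prettify.js"></script>
  </head>
  <!-- 3: run the highlighter once the page has loaded -->
  <body onload="prettyPrint()">
    <!-- 4: mark each block that should be highlighted -->
    <pre class="prettyprint">
int total = 1 + 2;
    </pre>
  </body>
</html>
```

Only blocks carrying the prettyprint class are coloured, so any other <pre> content on the page is left alone.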

Yes, interesting ... I'm trying to document a risk-assessment template that cheerfully highlights "unacceptable" levels as red and "acceptable" levels as green. I suppose the idea of specialisation and elements or attributes for the two levels is not per se more frightening than defining character-formats, so I'd better read up on it!

I guess the best approach would be to create a specialization with tags like <codeblock language="java"/> or <java/> (replace java with your preferred language) and then create a DITA-OT plugin that would apply colour coding depending on the language and the output.

Thanks Claude. Creating a plugin sounds technical :)

If I copy sample code to the clipboard, I can easily manipulate it from RTF into tagged XML and paste it into my DITA text.

Unfortunately, while I can recognise colour on the text, I don't know the cause of it. For example, there may be more than one cause of the text being blue and this information isn't available inside the RTF.

I suppose just tagging a block with a tag that just means "this is blue" or "this is green" gets you in trouble around here.


The thing is that, unlike HTML (or RTF), DITA is not really about formatting; it's more about semantics.

In your case you would say that "this is a block of VB script" and directly put your code as is, without decorations.

The processing would then take this information to create appropriate formatting for your target output (e.g. CSS classes for HTML).
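
As a rough sketch of what that looks like without writing a specialization: the standard outputclass attribute can carry the hint, as suggested earlier in this thread. The prettyprint value is the class prettify looks for; the three lines of VB script are invented for the example.

```xml
<!-- DITA source: the code is stored as plain text, with no colour markup. -->
<!-- outputclass is a standard DITA attribute; "prettyprint" is the class
     name used by prettify, as mentioned earlier in the thread. -->
<codeblock outputclass="prettyprint">Dim total
total = 1 + 2
MsgBox total</codeblock>
```

The HTML transform should then copy that value into the class attribute of the generated <pre>, where a stylesheet or a script such as prettify can pick it up. A specialized element or a language attribute would let the processing choose language-specific colouring, but that takes the plugin work described above.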

I recommend you read this article about specialization

Hope this helps,

Claude