Revision of Getting started as cheaply as possible from Fri, 2008-03-21 14:32
Eliot Kimber suggested some low-cost (if not free) setups for DITA and others commented on the dita-users mailing list.
Eliot wrote:
I've been thinking a lot lately about one of the singular aspects of DITA compared to other XML approaches for document creation, which is that it significantly lowers both the cost of entry and the cost of ownership of very sophisticated systems, systems that even two years ago were prohibitively expensive for all but the wealthiest enterprises (that is, enterprises that had sufficient cash or credit to make the investment necessary to realize the ROI that the use of XML represents). [By "sophisticated" I mean the information representation and processing features (e.g., linking, use-by-reference, conditional processing, etc.), not the sophistication of the supporting tools necessarily--one of the aspects of XML is that content sophistication is usually much more important than tool sophistication, as DITA demonstrates.]
Even if you were an early DITA adopter, trying to use DITA 1.0, things were too expensive and DITA was still not quite sufficiently cooked. But now, at the beginning of 2008, we have a number of things that, taken together, make the use of DITA as inexpensive as it could possibly be, bordering on free (for a certain value of "free"). At a minimum, it lowers the up-front cash outlay required to get started, although you still have to do some implementation work to get a useful system. But the cost of the implementation is a function of the skills and resources you have available--if you've got somebody on staff who can do the implementation, then it is truly free. If you don't, you're not spending any money you wouldn't have had to spend with any other XML approach you could have chosen, and will likely be able to spend much less than you would otherwise have had to spend.
The cost-lowering artifacts available in 2008 that weren't there in 2006 include:
- DITA 1.1 fills in most of the critical holes in DITA 1.0, providing a sufficiently complete solution able to meet most documentation requirements, certainly in the domain of technical documentation (but beyond that as well).
- Version 1.4.1 of the DITA Open Toolkit adds some important functionality (better handling of output organization, support for chunk=, etc.) as well as providing improved documentation for how to use it.
- New low-cost DITA-aware XML editors, including Syntext Serna and OxygenXML, provide excellent value for graphical authoring of DITA content.
- A deeper body of community knowledge and published knowledge makes it easier to learn and apply DITA generally.
- More third-party support providing various helpful bits any system would need.
So, given that, it raises the question of what a low-cost, production-capable DITA environment might look like. Obviously there are a number of choices and those choices change day to day as new products and tools are introduced and existing products are improved.
So here's my question to those who care to offer an opinion: what would you recommend as a low-cost or lowest-cost system? Let's assume a 10-person or smaller writing team, meaning that their operating budget is "as little as you can spend and still get your work done".
Here is my recommendation as of 20 Feb 2008, based on my practical experience with the tools involved and my knowledge of what's available generally:
Authoring:
Syntext Serna 3.5. Version 3.5 of Serna offers almost as much functionality as XMetaL and Arbortext Editor but at a significantly lower per-seat cost. It's relatively easy to configure for the use of local shells and specializations, easier than XMetaL or Arbortext. It still has a few fit and finish issues but it's reliable enough.
OxygenXML is a close second but its graphical editing features, especially for maps, are not as good as Serna's and it's not enough cheaper to make it the better value.
Content Management:
Subversion. Subversion is an open-source code control system that functionally replaces CVS but offers several important new features, including full support for versioning of binary objects (including UTF-8 and UTF-16-encoded XML), versioning of directories (very important for DITA where you need flexibility to change how your topics are organized as you refine your practices), HTTP-based access (avoids issues with corporate firewalls), easy scripting, and arbitrary per-file metadata (enables potentially quite sophisticated management features). There are a number of good open-source and commercial Subversion clients, including TortoiseSVN, Oxygen's Subversion client, and the Subclipse plug-in for Eclipse.
Coupled with good file organization and naming discipline, Subversion can get you a long way.
Production of Published Output:
DITA Open Toolkit. This is a no-brainer of course. The biggest question here is whether or not to step up to a commercial XSL-FO implementation if you're producing PDF. Both XEP and XSL Formatter provide better results than FOP but either would represent a significant cost relative to the total cost of the authoring tools, essentially doubling or tripling the total system dollar cost. But this is where hard requirements for print quality or features carry sufficient weight to justify the expense.
Given the above, what would you need to do in order to have something that could produce production-quality output, assuming you are not using any non-standard specializations, only local shells?
1. Configure the editor to use your local shells. This requires just setting up your entity resolution catalogs and creating Serna-specific templates, which is mostly an exercise in copying. Should take 1/2 day at most if you know what to do.
2. Set up branding for the HTML output. This involves creating appropriate CSS style sheets and headers and footers, as well as creating the appropriate scripts or Ant tasks to use them with the base Toolkit transform. Time dependent on the complexity of the styling you want. Say two days max to implement and deploy, 1/2 day minimum for simple style changes.
3. Set up branding for the PDF output. Assuming you're using the PDF2 plug-in, as for HTML, it depends on the complexity of your style changes, but 1/2 day to 2 days would be typical. The main challenge here is the lack of documentation on how to do this--it's not hard if you're already familiar with XSLT and XSL-FO. Would be next to impossible if you're not.
4. Set up convenience scripts or GUIs by which authors can produce output. This can be as easy as some simple command line utilities that just take the directory for a given publication as input or could be more involved, like a server-based system accessed through a Web-based front-end. 1/2 day for scripts, more for more sophisticated stuff.
The above is essentially what we are using at Really Strategies for the product documentation for our RSuite CMS product and it's working fine so far with a team distributed between the U.S. and China. It's not ideal but it was cheap and easy to set up.
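As a rough illustration of the "simple command line utilities" mentioned in step 4, here is a minimal Python sketch of a wrapper that takes a publication directory and assembles an Ant command line for the Toolkit. It is not the script used by anyone in this thread. The function names, the `DITA-OT` install directory, and the convention that each publication directory holds one root map named after the directory are all hypothetical; the `args.input`, `output.dir` and `transtype` properties should be checked against the documentation for the Toolkit version you have installed.

```python
import subprocess
from pathlib import Path


def build_ant_command(publication_dir, transtype="xhtml", toolkit_dir="DITA-OT"):
    """Return the Ant command line (as a list) for building one publication."""
    pub = Path(publication_dir)
    # Hypothetical convention: the root map is named after its directory,
    # e.g. docs/userguide/userguide.ditamap.
    root_map = pub / (pub.name + ".ditamap")
    # Hypothetical convention: output goes to <publication>/out/<transtype>.
    out_dir = pub / "out" / transtype
    return [
        "ant",
        "-f", (Path(toolkit_dir) / "build.xml").as_posix(),
        "-Dargs.input=" + root_map.as_posix(),
        "-Doutput.dir=" + out_dir.as_posix(),
        "-Dtranstype=" + transtype,
    ]


def main(argv):
    """Build the publication named in argv; returns Ant's exit code."""
    if not argv:
        print("usage: build_pub.py PUBLICATION_DIR [TRANSTYPE]")
        return 2
    cmd = build_ant_command(*argv[:2])
    print(" ".join(cmd))  # show authors exactly what is being run
    return subprocess.call(cmd)
```

An author would call `main(sys.argv[1:])` from the script's entry point, for example with `docs/userguide pdf2` as arguments. Keeping the command assembly in its own function makes the conventions easy to change without touching the part that runs Ant.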
What would others suggest?
Jim Cain replied:
Authoring tool aside, this is exactly what we are doing to produce project documentation. We were already using Subversion for our source code, so it was an easy decision to also store our topics and maps for the project documentation in Subversion and treat it as any other development project.
As for authoring, the system we are currently building is deploying XMetaL, so we decided to use XMetaL in order to allow us to share a similar authoring experience as our client. In this case, we wanted to be able to gain more insight into using the tool that we are asking our client to use in their system. Beyond this project, we may consider a cheaper authoring tool, but have not evaluated any others at this point.
Wrightsell Hughes commented:
We are using CVS Tortoise as a DITA repository and XMLSPY as our authoring tool. The reason we chose these tools is that they were already being used in-house and didn't cost us anything. So far, we are pretty happy with our setup.
Steve Andersen said:
I agree with everything you said, with one caveat. If you have access to an SCMS already (say as part of the development team you work with), you should use it instead of installing Subversion. Although you don't have to pay to use Subversion, even if you are using a hosted system, there are costs to set up and manage it. Not as high as with a CMS, but it's not free. If you don't have an SCMS set up, unless you are familiar with Subversion, I think one of the hosting solutions should be investigated.
Which version of Serna do you think is the minimum required for authoring in DITA? I think it's the Professional version, but that's more than double the cost of Personal edition.
RenderX XEP can be purchased for $300, so, I think it's a no-brainer for PDF generation if that is required. XSL Formatter was, last time I checked, $1k more, and, although it's made big strides, I'm not sure if FOP, with the current OT, can produce high enough quality output.
In addition to everything you listed, I think you need an XSLT development tool. I find it very unlikely that anyone is going to be satisfied for long using the default stylesheets in the OT. They are very good, but they are a bit generic for most uses. I prefer oxygen in that role, and I think that's your preferred tool, also, but Eclipse does have some nice plugins (like OrangeVolt) for XSLT development that may be good enough.
So, here's the total cost I see:
Serna      : $200
DITA OT    : $0
XEP        : $300
oxygen     : $300
Subversion : $0
Total      : $800
And what do you have? A WYSIWYG editor, professional quality HTML and PDF output, version management, and the ability to customize both your outputs and your inputs.
That's not bad. The nicest part is that, as your team grows, the only cost increase is the authoring tool.
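The arithmetic behind that claim can be sketched in a few lines of Python. The prices are the ones quoted in this message; treating the $200 Serna figure as a per-author price and everything else as a one-time purchase is an assumption that follows the claim above, not a statement of any vendor's licensing terms. The names `ONE_TIME_COSTS`, `SERNA_PER_SEAT` and `setup_cost` are made up for the example.

```python
# One-time purchases in US dollars, as quoted in the message above.
ONE_TIME_COSTS = {"DITA OT": 0, "XEP": 300, "oxygen": 300, "Subversion": 0}

# Assumption: the $200 Serna price applies once per author.
SERNA_PER_SEAT = 200


def setup_cost(authors):
    """Total tool cost for a team with the given number of authors."""
    return sum(ONE_TIME_COSTS.values()) + SERNA_PER_SEAT * authors


for team_size in (1, 5, 10):
    print(team_size, "author(s):", "$%d" % setup_cost(team_size))
```

With one author the total matches the $800 listed above; each additional author adds only the editor seat.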
You want cheap, though? Eclipse with XMLBuddy and OrangeVolt gives you editing and development tools. Add in FOP and Subversion for PDFs and version control, and you have a completely free solution.
Hedley Finger suggested other tools:
I have been playing with Serna, oXygen, XMLmind XML Editor (XXE) and FrameMaker 8 with Scott Prentice's DITA-FMx plugin (replaces the Adobe DITA plugin that comes with it). oXygen is great for all the other stuff around DITA -- XSL, XSLT, XSL-FO conversion, etc. But out of all the editors, FM8 is the best (but oXygen is always open to do those source-code jobs).
The tools are cheap. I mean, just multiply your hourly labour rate by the number of writers, then by a week, month or year, and the capital costs of the tools are minuscule by comparison.
It's the running costs that are huge. If you have a smart staff member who can do the XSLT stuff and other tweaks (think a clone of Deborah, Don or yourself), they are not free. And while they are trying to implement your organisation's branding and document standards, they are not doing something else productive.
For my money, FrameMaker is both your editor and PDF formatter for print and, if you already have FM and years of skills with it, then getting from DITA to your standard document look and feel is a doddle that makes the XSL-FO route not worth considering, especially when you are a one-man band like me and just don't have the time to get up to speed on Ant scripts, XSLT conversion steps, Subversion, and all the rest of the technology. You can use the standard DITA-OT toolkit for all your other output.
So the cheapest startup for those currently using Word, FrameMaker, Robohelp, etc. might be to just fork out for an integrated package from one of the vendors because you can outsource your formatting and scripting to them and, if done well, you will have tools in place that need not be changed for years.
Troy Klukewich challenged the savings:
Cheap and easy is not always the same thing. When using open source solutions, it is helpful to have the necessary technical talent on staff. Assuming you have people that are willing to put some time under the hood, inexpensive or free solutions are more readily available. On a previous project, my team used as many open source tools as we could for a structured XML solution similar to DITA. The idea was to own our own sources with complete independence from proprietary tools and vendor lock-in. Even if we did resort to a commercial tool at points, we wanted to be able to freely swap them out (which we did in one case with an XML editor and a DB for tracking statuses).
Like others, we found Subversion with Tortoise to be a great solution both for storing XML content and for setting custom statuses on files. We were able to jettison a cumbersome, commercial database and report off Subversion itself to track file milestones. I ultimately liked Subversion for the simple reason that the writers found the Tortoise integration more intuitive than traditional source control interfaces. Training was easy.
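One way to picture "custom statuses on files" is with Subversion's per-file properties, set with `svn propset` and read back with `svn propget`. The Python sketch below is only an illustration of that idea and not a description of the team's actual setup: the property name `doc:status`, the status values, and the function names are hypothetical, and the parser assumes the plain `path - value` lines that a recursive `svn propget` prints for single-line values.

```python
def propset_command(path, status, prop="doc:status"):
    """Command that records a status on one file (property name is hypothetical)."""
    return ["svn", "propset", prop, status, path]


def propget_command(root, prop="doc:status"):
    """Command that lists the status property for everything under root."""
    return ["svn", "propget", prop, "--recursive", root]


def parse_propget(output):
    """Parse 'path - value' lines into a {path: value} dictionary.

    Assumes single-line property values; lines without ' - ' are skipped.
    """
    statuses = {}
    for line in output.splitlines():
        path, sep, value = line.rpartition(" - ")
        if sep:
            statuses[path] = value
    return statuses


def report(statuses):
    """Group file paths by status, e.g. to see what is still in draft."""
    by_status = {}
    for path, value in sorted(statuses.items()):
        by_status.setdefault(value, []).append(path)
    return by_status
```

Running the `propget_command` list through `subprocess` and feeding its output to `parse_propget` and `report` would give a simple milestone report straight from the repository.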
We did buy a commercial WYSIWYG tool for editing, Arbortext Epic. Binaries and intermediary formats (like MIF) were absolutely out, so Structured FrameMaker was not an option. One writer insisted on using free Emacs. As long as the content validated against the schemas, we didn't really care what people used to edit the files. For training purposes, though, it is best to standardize on one editor and include pre-built templates for each content type. At various times, we used a number of different tools on the same source, including XMetaL, XML Spy, and oXygen.
We could not get away from a commercial solution for robust PDF production. We tried everything that was freely available and found serious problems with scale (into the thousands of pages). We used the Antenna House XSL-FO processor to generate PDFs directly from XML. It was robust and perfectly reliable. The license was cheap considering all the time we saved debugging problems in free tools.
Once we automated the PDF production, we were extremely happy. We were able to jettison ancient FrameMaker sources, intermediary files, and manual futzes and never looked back. Though there was an upfront cost to the XSL-FO expertise we developed, we easily recouped costs many times over with extremely fast PDF production and full compliance with requirements for simultaneous localizations. We pumped the localized XML through the same XSL-FO process. The localized PDFs were essentially free. We no longer required manual adjustments for numerous localized PDFs, which were extremely expensive and slowed time to market.
We used Saxon for the XSLT to HTML transforms and the many other help formats based on HTML. We used Python as a kind of glue script to run everything. I would use Ant now. For a recent DITA project, I am using some already licensed tools within our department, plus we can use our own Oracle products. I still hold to the ideal of freely open XML source with swappable components. I do not want lock-in with any vendor tool.
- Epic for editing (the DITA integration is worth the price of admission)
- Saxon for XML processing for Help
- Oracle XML Publisher for PDF (I've heard it works great and scales)
- Ant for driving builds
- DITA OT
- Perforce for source control
- (Maybe) Oracle UCM for content management down the road, but I would prefer a DITA-aware CMS
- Perl (free) and PowerGREP (commercial) for miscellaneous regular expression exercises
In most cases, a full XML shop will probably use a mix of commercial and open source solutions, weighing what is already available in-house, plus what is easier to buy versus configure oneself.
Hedley Finger defended FrameMaker:
FrameMaker has been able to directly open from and save to XML files from version 7.2. It can also use XSLT and read/write rules to transform to/from FM's internal format. Leximation DITA-FMx has more features than the Adobe DITA plug-in that comes with FM8, and DITA-FMx works with both 7.2 and 8.0. FrameMaker has the best structure editor/view, bar none, which is much easier to use. And FM8 now supports Unicode.
If you have existing in-house FrameMaker expertise and licences, it might be cheaper to upgrade to FM8 and purchase DITA-FMx licences. You can easily disable the Adobe DITA plug-in. This is likely to be cheaper than replacing your investment with another proprietary editor such as XMetaL or Arbortext Author with Antenna House XSL-FO or RenderX processor. And, instead of having to get your head around FO, you can use your existing FM skills to format PDF for print or on-line presentation. In particular, you can continue to use the much smarter FM cross-reference formats which will round-trip to DITA <xref>s, and even FM variables instead of @conrefs acting as variables, although this is deprecated. The other outputs can use the DITA OT.
FrameMaker also has functionality to assist in converting legacy unstructured FM content into DITA XML, but it will still require hand tweaking, as it would with any other converter.
Subversion is a good cheap option and it would be even better when someone with greater knowledge than me develops an XML-aware diff/merge tool.
I am not decrying other solutions but only suggesting that if you already have FM it might be the cheaper way to go, both with upfront costs and on-going costs.
And Troy replied to Hedley:
Great response and I found the information you provided about Frame's more recent capabilities useful.
It is also worth emphasizing, as you point out, that the cost of tools is minimal compared to other costs. Though, it can be fun on a shoestring to see how far one can get with time versus a cash outlay.
I find that free is usually not all that free when the cost of time is figured in. Even just setting up the DITA OT for the first time on a fresh Windows machine can take some time. Of course, once it is set up, I find the OT is a reliable, powerful processing factory.
When looking at free tools, I consider if the cost of time is worth the investment versus a commercial option. It is also worth quantifying the commercial value-add versus a free option so we know why we are going commercial.
On a recent DITA conversion project that needed some serious regex processing, I ended up paying for a commercial tool, PowerGREP, for the main reason that it provided a dynamic preview mode with a drill down for mass changes. It is a killer feature and worth the time it saves. Otherwise I would use Perl.
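For readers without such a tool, the basic idea of previewing a mass change before applying it can be approximated with a short script. The Python sketch below is a plain stand-in, not a reproduction of PowerGREP's preview feature; the function name `preview_changes` and the sample `<bold>` markup are made up for the example, and it works line by line, so patterns that span lines are out of scope.

```python
import re


def preview_changes(text, pattern, replacement):
    """Return (line_number, before, after) for each line the substitution would change.

    Nothing is modified; the caller reviews the list and decides whether
    to apply the same pattern for real.
    """
    regex = re.compile(pattern)
    changes = []
    for number, line in enumerate(text.splitlines(), start=1):
        new_line = regex.sub(replacement, line)
        if new_line != line:
            changes.append((number, line, new_line))
    return changes


# Hypothetical legacy markup: rename <bold> elements to <b>.
sample = "<p>Click <bold>Save</bold>.</p>\n<p>No change here.</p>"
for number, before, after in preview_changes(sample, r"<(/?)bold>", r"<\g<1>b>"):
    print("line %d:\n  - %s\n  + %s" % (number, before, after))
```

Reviewing the before/after pairs first is what makes a mass change safe to run across a whole conversion set.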
I'm also a fan of Epic's Resource Manager and its integration with DITA. I'm happy looking at raw XML, but day-to-day I'd rather use a unified dialog to build topic paths, conrefs, and links to graphics.
With an open architecture, we can use the right mix of free and commercial tools on the same sources, the best tools for the best purposes.