Hi,
currently we use DocBook V4.2 for the PostgreSQL manuals. I suggest
an upgrade to DocBook 5.x. This sounds simple, but it will be a
long process with many sub-tasks.
Rationale:
- Sooner or later we MUST migrate, as the 4.x series is outdated:
  V4.2 dates back to 2002, and the 4.x series has not been actively
  developed since 2006. See
  http://www.docbook.org/tdg5/en/html/ch01.html: "In October 2006,
  the DocBook Technical Committee released DocBook V4.5, the last
  release planned in the 4.x series."
- V5.0 has been available since 2009. See
  http://www.docbook.org/tdg5/en/html/ch01.html: "DocBook V5.0
  became an official Committee Specification in June 2009 and
  became an official OASIS Standard in October 2009."
- Currently the technical committee is at the third Candidate
  Release for V5.1.
PROs:
- The formal part of the migration is supported by existing tools:
  http://docbook.org/docs/howto/#convert4to5 (nevertheless, some
  scripts written by ourselves will be necessary).
- The normative schema for DocBook 5.x is written in RELAX NG.
  Additionally, the technical committee converts this normative
  schema to an XSD schema and to a DTD. These are not normative, but
  they are very close to the RELAX NG schema and will suffice for
  most applications. Hence, we have the choice between three schema
  languages and everybody can use their preferred one.
- Our source file format will switch from SGML to XML. This implies
  that we have access to the whole XML toolset, such as XLink, XPath,
  XSLT, XSL-FO, SVG, MathML, namespaces, ...
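To illustrate the last point, here is a minimal sketch of what a DocBook 5 source file looks like and how a standard XML tool can query it. It uses Python's built-in ElementTree with the DocBook 5 namespace and XLink. The sample chapter, its ids and the helper names are made up for illustration and are not taken from the converted files.

```python
import xml.etree.ElementTree as ET

# Namespaces used by DocBook 5 documents.
NS = {
    "db": "http://docbook.org/ns/docbook",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical, hand-written sample; real converted files will differ.
SAMPLE = """<chapter xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0" xml:id="tutorial-advanced">
  <title>Advanced Features</title>
  <sect1 xml:id="tutorial-views">
    <title>Views</title>
    <para>See <link xlink:href="http://www.postgresql.org/">the
    project site</link>.</para>
  </sect1>
  <sect1 xml:id="tutorial-fk">
    <title>Foreign Keys</title>
    <para>Some text.</para>
  </sect1>
</chapter>"""


def section_titles(xml_text):
    """Return the titles of all first-level sections of a chapter."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall("./db:sect1/db:title", NS)]


def external_links(xml_text):
    """Return the xlink:href targets of all <link> elements."""
    root = ET.fromstring(xml_text)
    href = "{%s}href" % NS["xlink"]
    return [el.get(href) for el in root.iterfind(".//db:link", NS)
            if el.get(href) is not None]
```

Queries of this kind are not possible with generic XML tools as long as the sources are SGML.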
CONs:
- The migration from 4.x to 5.x implies major changes at 3 different
  levels.
  - DocBook structure: Previously it was defined in SGML syntax
    (DTD). Now it is defined in the RELAX NG schema language plus
    Schematron rules.
  - DocBook files: Previously we used SGML syntax for our files. We
    must convert them to valid XML syntax, e.g. by eliminating tag
    omission.
  - Tools and style sheets: All tools which operate at the native
    SGML level (editors, conversions, ...) must be replaced by
    XML-conforming tools. As valid XML implicitly conforms to valid
    SGML syntax, this step may be accomplished by reconfiguring some
    of the tools, e.g. .emacs.
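As an illustration of the file-level conversion, the following sketch handles two typical SGML shortcuts: the empty end tag `</>` and elements that the DTD declares EMPTY and that therefore have no end tag. It is not the Perl script mentioned in this mail, only a simplified Python approximation. It assumes that the sources use these two shortcuts, the list of EMPTY elements is an illustrative subset, and comments or text containing a bare `<` or `>` are not handled.

```python
import re

# Illustrative subset of elements declared EMPTY in the DocBook 4 DTD.
EMPTY_ELEMENTS = {"xref", "co", "colspec", "spanspec", "void", "sbr"}

# Start tag, end tag, or SGML empty end tag "</>".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)?([^<>]*)>")


def expand_short_tags(sgml):
    """Expand "</>" to a named end tag and self-close EMPTY elements."""
    open_elements = []  # stack of currently open element names

    def fix(match):
        closing, name, rest = match.groups()
        if closing:
            if not open_elements:
                raise ValueError("end tag without an open element")
            expected = open_elements.pop()
            # name is None for "</>"; a named end tag must match the stack.
            if name is not None and name.lower() != expected:
                raise ValueError("expected </%s>, found </%s>"
                                 % (expected, name))
            return "</%s>" % expected
        if name is None:
            # Comment, declaration or marked section: leave untouched.
            return match.group(0)
        name = name.lower()  # XML is case-sensitive, so normalise names
        if name in EMPTY_ELEMENTS:
            return "<%s%s/>" % (name, rest)
        open_elements.append(name)
        return "<%s%s>" % (name, rest)

    return TAG.sub(fix, sgml)
```

For example, `<literal>SELECT</>` becomes `<literal>SELECT</literal>` and `<xref linkend="sql">` becomes `<xref linkend="sql"/>`. A real conversion also has to deal with entities, marked sections and attribute minimisation.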
What I have done so far is:
- Conversion of the SGML files to valid XML syntax with a Perl
  script. I failed to use 'osx' or 'spam'.
- Conversion of these XML files to DocBook 5.x format using xsltproc
  and DocBook's XSLT migration scripts.
- Creation of HTML files using xsltproc and DocBook's XSLT scripts.
- Creation of FO files using xsltproc and DocBook's XSLT scripts.
- Creation of PDF files using fop.
- The conversions need less than 10 minutes on an Intel i5
  processor.
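The steps after the SGML-to-XML conversion can be sketched as a small driver that chains xsltproc and fop for one file. This is only an outline under assumptions: the three stylesheet paths below are placeholders for wherever the DocBook 5 upgrade stylesheet and the DocBook XSL stylesheets are installed locally, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed stylesheet locations; adjust to the local installation.
MIGRATION_XSL = "docbook5/tools/db4-upgrade.xsl"  # DocBook 4 -> 5 upgrade
HTML_XSL = "docbook-xsl-ns/html/docbook.xsl"      # DocBook 5 -> HTML
FO_XSL = "docbook-xsl-ns/fo/docbook.xsl"          # DocBook 5 -> XSL-FO


def build_pipeline(xml4_file, out_dir="out"):
    """Return the command lines for one DocBook 4 XML input file."""
    src = Path(xml4_file)
    out = Path(out_dir)
    db5 = out / (src.stem + ".xml")
    html = out / (src.stem + ".html")
    fo = out / (src.stem + ".fo")
    pdf = out / (src.stem + ".pdf")
    return [
        ["xsltproc", "-o", str(db5), MIGRATION_XSL, str(src)],
        ["xsltproc", "-o", str(html), HTML_XSL, str(db5)],
        ["xsltproc", "-o", str(fo), FO_XSL, str(db5)],
        ["fop", "-fo", str(fo), "-pdf", str(pdf)],
    ]


def run_pipeline(xml4_file, out_dir="out"):
    """Run all steps in order and stop at the first failing command."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for argv in build_pipeline(xml4_file, out_dir):
        subprocess.run(argv, check=True)
```

Calling `run_pipeline("advanced.xml")` would then produce one DocBook 5, HTML, FO and PDF file for that chapter, provided xsltproc and fop are on the PATH.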
This is a very first raw round trip with one output file per SGML
file and output type. Not yet supported: entities (__gt__ as a
surrogate), <![CDATA[ and similar SGML constructs, PostgreSQL-specific
style sheets, and the Makefile; additional errors also occur. I attach
one file of every new format for the chapter "Advanced Features": xml
(the new source), html, fo, pdf.
Any ideas or suggestions? Should we continue in this direction? Does
anybody have more experience with SGML-->XML conversions or DocBook
4.x --> 5.x conversions?
Kind regards
Jürgen Purtz