Subject: Re: Docbook 5.x
From: Jürgen Purtz
To: pgsql-docs@postgresql.org
Cc: Simon Riggs, Alexander Law
Date: Thu, 21 Apr 2016 15:34:31 +0200
Message-ID: <5718D6E7.7000805@purtz.de>
References: <57179283.6080704@purtz.de>

On 20.04.2016 20:41, Simon Riggs wrote:
On 20 April 2016 at 15:30, Jürgen Purtz <juergen@purtz.de> wrote:
 
What I have done so far is:
  • Conversion of the SGML files to valid XML syntax with a Perl script. I failed to use 'osx' or 'spam'.
  • Conversion of these XML files to Docbook 5.x format using xsltproc and Docbook's XSLT migration scripts.
  • Creation of HTML files using xsltproc and Docbook's XSLT scripts.
  • Creation of FO files using xsltproc and Docbook's XSLT scripts.
  • Creation of PDF files using fop.
  • The whole conversion takes less than 10 minutes on an Intel i5 processor.
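
As a rough illustration of how the steps in this list chain together, the following sketch only builds and prints the command lines; it does not run anything. The script name sgml2xml.pl, its arguments, the intermediate file names, and all stylesheet paths are assumptions made for the example and are not taken from the actual setup.

```python
import shlex
from pathlib import Path
from typing import List

# Hypothetical stylesheet locations; adjust to the local Docbook installation.
DB4_TO_DB5_XSL = "db4-upgrade.xsl"          # assumed name of the 4-to-5 migration stylesheet
HTML_XSL = "docbook-xsl/html/docbook.xsl"   # assumed path
FO_XSL = "docbook-xsl/fo/docbook.xsl"       # assumed path


def build_pipeline(source: str) -> List[List[str]]:
    """Return the command lines for one SGML source file, in execution order."""
    stem = Path(source).stem
    db4 = f"{stem}.db4.xml"   # well-formed XML, still Docbook 4
    db5 = f"{stem}.db5.xml"   # Docbook 5 after migration
    html = f"{stem}.html"
    fo = f"{stem}.fo"
    pdf = f"{stem}.pdf"
    return [
        # Step 1: SGML to XML (hypothetical script name and arguments)
        ["perl", "sgml2xml.pl", source, db4],
        # Step 2: Docbook 4 to Docbook 5
        ["xsltproc", "--output", db5, DB4_TO_DB5_XSL, db4],
        # Step 3: HTML output
        ["xsltproc", "--output", html, HTML_XSL, db5],
        # Step 4: XSL-FO output
        ["xsltproc", "--output", fo, FO_XSL, db5],
        # Step 5: PDF from XSL-FO
        ["fop", "-fo", fo, "-pdf", pdf],
    ]


for command in build_pipeline("postgres.sgml"):
    print(shlex.join(command))
```

Keeping the commands as plain lists makes it easy to review the pipeline before handing each entry to a process runner.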
So you believe you have/can convert between the two formats accurately, so we can change things in a single commit?

What verification is offered? Possible?

And that is ready to go now? Will you post your perl script, or the patch? Other projects use the same file formats, e.g. Slony, XL etc

If an automatic migration is possible do we need to change at all?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Hi,

so far I have done only a first rough round trip to check that there is no showstopper for my plans. If we reach a consensus in the community that this work is valuable for the Postgres documentation, I will continue to work on it in the near future. To answer your questions:
  • "do we need to change at all?". This question has to be discussed in the community. I tried to use the recommended tools such as 'osx' and 'spam' and failed (not completely, but in details such as newline processing). This may be my fault, or it may result from the fact that we still use SGML instead of XML. But over time this task will get harder and harder: SGML knowledge gets lost, SGML tools are no longer actively developed, XML moves forward, ...
  • Currently I don't see any showstopper. Therefore I believe that the conversion from Docbook 4 to 5 is manageable. The plan is to have one XML file in db5 format for every SGML file in db4 format.
  • To keep the repository history continuous, we should do something like 'git mv file.sgml file.xml', put the new content into 'file.xml', and 'git commit'. Additionally, the newlines must be preserved during all conversion steps.
  • Some individual (manual) steps may be necessary, but it should be possible to script these as well. The conversion should therefore run fast, and a single commit should cover the complete documentation.
  • There are no special "Postgres" tasks in the Perl script or anywhere else. It depends on Docbook only. Therefore other projects can use it in the same way. Of course I will publish all sources.
  • Currently I am trying to generate well-formed XML. Validation against the Docbook 5 schema will follow.
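
A minimal sketch of what a well-formedness check on a converted file could look like, using only the Python standard library. This is an illustration and not part of the conversion scripts. It checks well-formedness only, not validity against the Docbook 5 schema, and the function name and sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def check_well_formed(xml_text: str) -> Tuple[bool, str]:
    """Return (True, "") if xml_text parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# A document with exactly one root element parses.
single_root = "<chapter><title>Tutorial</title><para>Text</para></chapter>"

# Two top-level elements form a fragment, which is not a well-formed document.
two_roots = "<sect1><title>A</title></sect1><sect1><title>B</title></sect1>"

for label, text in (("single root", single_root), ("two roots", two_roots)):
    ok, message = check_well_formed(text)
    print(label, "->", "well-formed" if ok else "not well-formed: " + message)
```

The second sample shows why a file with more than one top-level element cannot be checked on its own: the parser rejects everything after the first root element.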


Alexander Law posted additional suggestions and questions:

Hello Jürgen,

Please look at the discussion that we had some time ago:
http://www.postgresql.org/message-id/56337365.2080104@postgrespro.ru

And we (postgrespro) still have plans to migrate to XML as soon as we get the documentation translated.
We had no issues with the SGML->XML conversion; "make postgres.xml" creates XML (with entities and the like), which we use.

When you are talking about "conversion of html, fo, pdf, ...", do you mean using docs/sgml/Makefile or some other scripts?

As for the SGML to XML conversion, we need to decide whether to generate a single XML file or a set of XML files (corresponding to the current SGML files).
In the latter case, how do we include the XML fragments in the main document (as entities or with xi:include)?

Could you please explain what the "Docbooks xslt-migration scripts" are?
Is Docbook 4.x incompatible with Docbook 5.x, so that we need an additional conversion?


Best regards,
Alexander

-----
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


My answers:

  • Docbook 4 and 5 are not compatible. There are new elements; others are gone and have been replaced by more generic ones. But the Docbook project offers XSLT stylesheets to convert Docbook 4 XML files to Docbook 5 XML files.
  • There are pros and cons to using postgres.xml as a starting point. PRO: well-formed (and valid?) XML format. Entities are preserved. No more "<![CDATA[", "<![%include" and similar SGML constructs. CON: Only one file. Ugly line break algorithm.
  • Currently I don't use the existing Makefile. I start Perl, xsltproc and fop with a different script. If I continue this work, I will have to change the Makefile.
  • "how to include XML-fragments into the main document (as entities or with xi:include) ?". As described above, I prefer one file per existing SGML file. But some of those SGML files have more than one root element. In such situations (and without further processing) the resulting XML files will contain fragments. In general it will be more "Docbook 5 compliant" to use xi:include instead of entities.
  • "Docbooks xslt-migration scripts": see: http://docbook.org/docs/howto/#convert4to5
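
To illustrate the xi:include approach mentioned above, here is a small sketch that assembles a main document from per-file fragments with Python's standard xml.etree.ElementInclude module. The file names, the in-memory fragment table, and the book content are invented for the example; a real build would read the fragments from disk and would more likely rely on the XInclude support of the XSLT processor.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DB = "{http://docbook.org/ns/docbook}"

# Hypothetical per-chapter files, held in memory to keep the example self-contained.
FRAGMENTS = {
    "intro.xml": '<chapter xmlns="http://docbook.org/ns/docbook">'
                 "<title>Introduction</title></chapter>",
    "sql.xml": '<chapter xmlns="http://docbook.org/ns/docbook">'
               "<title>The SQL Language</title></chapter>",
}

# Main document that pulls in the chapters with xi:include instead of entities.
MAIN = """<book xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0">
  <title>Example Book</title>
  <xi:include href="intro.xml"/>
  <xi:include href="sql.xml"/>
</book>"""


def load_fragment(href, parse, encoding=None):
    """Loader used by ElementInclude: return the parsed fragment for href."""
    if parse != "xml":
        raise ValueError("only parse='xml' is handled in this sketch")
    return ET.fromstring(FRAGMENTS[href])


def assemble(main_xml):
    """Parse the main document and resolve all xi:include elements in place."""
    root = ET.fromstring(main_xml)
    ElementInclude.include(root, loader=load_fragment)
    return root


book = assemble(MAIN)
titles = [chapter.findtext(DB + "title") for chapter in book.findall(DB + "chapter")]
print(titles)
```

Each included file must have exactly one root element, which is why SGML files with several top-level elements would need further processing before they can be used this way.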

Kind regards
Jürgen Purtz

