To: "pgsql-docs@postgresql.org"
From: Alexander Lakhin
Subject: Re: Moving documentation to XML
Message-ID: <56337365.2080104@postgrespro.ru>
Date: Fri, 30 Oct 2015 16:40:53 +0300

Hello, Guillaume.

We have plans to use this for the Russian translation, too.
We translate the docs by converting the single XML file to postgres-ru.po with xml2po; after translating it, we convert it back to XML (which gives us postgres-ru.xml). (Until now we had to perform one more conversion: postgres-ru.xml -> a set of SGML files.)

So now we can get the Russian html/* with:

    python xml2po.py -l ru -k -p postgres-ru.po postgres.xml >postgres-ru.xml
    xsltproc --stringparam pg.version '9.4.1' stylesheet.xsl postgres-ru.xml

But I had some doubts about the differences between DSSSL and XSL output. As I noted previously, there was at least one visible difference. So I decided to customize the XSL templates to make sure that the HTML files are generated without loss or corruption.

I thought that comparing the two HTML sources directly would not work, as they are too different, but maybe we can compare the text generated from the HTML by lynx, for example. So I use the following procedure to look for differences:

0. Get the DSSSL-generated HTML files:

    make html

1. Extract the text content from the HTML files:

    for f in html/*.html; do fn=`basename $f`; echo $fn; cat $f | perl -0pi -pe 's/Note:\s*<\/B\s*>/\

    Note<\/h3>/g' | perl -0pi -pe 's/>
    /tmp/$fn; lynx /tmp/$fn --dump >html-text/$fn;

   * Some differences are not significant, so it's not reasonable to modify the XSL templates to eliminate them. The difference in "Note" placement and spelling is one of them, so I just filter it out.

2. Rename html to html-o and html-text to html-o-text.

3. Generate the HTML files with XSL (using the modified templates):

    rm -r html; xsltproc --stringparam pg.version '9.4.1' stylesheet.xsl postgres.xml

4. Extract the text content from the HTML files as above.

5. Make sure that the two text versions are identical:

    diff -s -u -b -I '^\s*_\+\s*$' html-o-text/xtypes.html html-text/xtypes.html

   * Differences in whitespace and in the length of "____" lines are not significant either.

For now, I've managed to get an identical xtypes.html (I tested my XSL customizations with it), but I think we can eliminate the other most prominent (or maybe all) differences in the same way. I can describe the XSL customizations in more detail, if needed.

Best regards,
Alexander

P.S. I couldn't post the message as a reply due to an error on the postgresql.org side.
(: host makus.postgresql.org[174.143.35.229] said: 550 Message headers fail syntax check (in reply to end of DATA command))

28.10.2015 14:46, Guillaume Lelarge wrote:
> On 26 Oct 2015 6:40 PM, "Alexander Lakhin" wrote:
> > ...
> > To make sure that result of the transformation is the same, I've compared original .html's with .html's generated with modified templates.
> > Unfortunately xslt generates random id's, so it's needed to exclude them before comparing. I do that with:
> > for f in */*.html; do sed -e 's/id=\"\(ftn\.\)\?id[a-z][0-9]\+\"/id=\"id\"/g' -i $f ; sed -e 's/href=\"[^#]*#\(ftn\.\)\?id[a-z][0-9]\+\"/href=\"#\"/g' -i $f; done
> >
> > So if it's acceptable way to speed up generation of HTML (and maybe some other formats), what other steps should we take to move away from SGML?
> > If the performance is still not satisfying, please let me know, I'll continue to optimize xslt.
> > Beside performance issues, I can see some difference in results of 'make html' and 'make xslthtml'. For example, see doc/src/sgml/html/spi.html (xslt-generated version doesn't contain the lists of functions).
>
> What you've done is awesome. I can't wait to test it on the french translation.
>
> Nice work!

Attachment: xslt-customize.patch

diff --git a/doc/src/sgml/stylesheet-xhtml-dsssl-like.xsl b/doc/src/sgml/stylesheet-xhtml-dsssl-like.xsl
new file mode 100644
index 0000000..95ca042
--- /dev/null
+++ b/doc/src/sgml/stylesheet-xhtml-dsssl-like.xsl
@@ -0,0 +1,266 @@
[266 added lines; XSL markup not preserved in the archived copy]
diff --git a/doc/src/sgml/stylesheet-xhtml-speedup.xsl b/doc/src/sgml/stylesheet-xhtml-speedup.xsl
new file mode 100644
index 0000000..d52b48e
--- /dev/null
+++ b/doc/src/sgml/stylesheet-xhtml-speedup.xsl
@@ -0,0 +1,327 @@
[327 added lines; XSL markup not preserved in the archived copy]
diff --git a/doc/src/sgml/stylesheet.xsl b/doc/src/sgml/stylesheet.xsl
index 7967b36..07be4a7 100644
--- a/doc/src/sgml/stylesheet.xsl
+++ b/doc/src/sgml/stylesheet.xsl
@@ -6,6 +6,8 @@
@@ -13,7 +15,7 @@
@@ -21,6 +23,9 @@
[three hunks; XSL markup not preserved in the archived copy]
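
Step 5 of the procedure in the message compares a single file (xtypes.html). A minimal sketch of running the same check over every text dump might look like the following. It assumes both directories already hold the lynx dumps from steps 1 and 4, and that a diff supporting -b and -I (such as GNU diff) is available. The function name compare_text_dumps is hypothetical and not part of the patch, and the underscore pattern is written in portable POSIX form rather than copied from the original command.

```shell
#!/bin/sh
# compare_text_dumps OLD_DIR NEW_DIR   (hypothetical helper, not from the patch)
# Prints one line for every *.html text dump that is missing from NEW_DIR or
# differs from its counterpart in OLD_DIR. Changes in the amount of whitespace
# are ignored (-b), and so are changes where every affected line consists
# only of underscores (-I), as in step 5.
# Returns 0 if nothing was reported, 1 otherwise.
compare_text_dumps() {
    old_dir=$1
    new_dir=$2
    rc=0
    for old in "$old_dir"/*.html; do
        [ -e "$old" ] || continue              # glob matched nothing
        name=$(basename "$old")
        if [ ! -e "$new_dir/$name" ]; then
            echo "missing: $name"
            rc=1
        elif ! diff -b -I '^[[:space:]]*__*[[:space:]]*$' \
                "$old" "$new_dir/$name" >/dev/null; then
            echo "differs: $name"
            rc=1
        fi
    done
    return $rc
}
```

With the directory names used in the message, this would be called as `compare_text_dumps html-o-text html-text`; each reported file can then be inspected with the diff -u command from step 5.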