From: Marsupilami79
To: "Wal, Jan Tjalling van der", Jon Raiford, "pgsql-odbc@lists.postgresql.org"
Subject: Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
Date: Mon, 28 Nov 2022 12:16:59 +0100

Hello Jan Tjalling, hello Jon,

Jon is right - SQLGetConnectAttrW is a WideChar function (2 Bytes per Character). So the encoding in the PG database should not matter. It should return the length for the SQL_ATTR_CURRENT_CATALOG attribute in bytes. So for a database named "topsales" it should return 16 because each character uses two bytes.
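
As a rough illustration of that arithmetic, here is a small Python sketch of the byte count one would expect a wide-character (UTF-16) API to report for a name. The helper `wide_byte_length` is a hypothetical name introduced only for this example and is not part of any ODBC driver; the sketch assumes UTF-16 without a byte order mark and without a terminating null.

```python
def wide_byte_length(name: str) -> int:
    """Bytes needed to hold `name` as UTF-16 (no BOM, no terminating null)."""
    # "utf-16-le" encodes without a byte order mark. Every character in the
    # Basic Multilingual Plane takes one 16-bit code unit, i.e. two bytes.
    return len(name.encode("utf-16-le"))


if __name__ == "__main__":
    # Database names taken from this thread.
    for name in ("topsales", "Stork"):
        print(name, len(name), wide_byte_length(name))
```

Under these assumptions the byte count is simply twice the character count for names such as "topsales" and "Stork".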

I assume this is a bug in the PostgreSQL ODBC driver. The question is where to file a bug report and how to get this fixed. Is there a chance of getting it fixed?

With best regards,

Jan Baumgarten

On 25.11.2022 at 20:22, Wal, Jan Tjalling van der wrote:

Okay, getting out of my comfort zone here.

I have found encoding options for PostgreSQL (v15) here:

PostgreSQL: Documentation: 15: 24.3. Character Set Support; they do not include UTF16, just UTF8.

There is no mention of UTF16 anywhere on that page.

I also found this: PostgreSQL: Re: DataDirect Driver, ExecDirect and UTF-8, which does mention WCHAR being different from CHAR and how that could work.

I hope there is something on these pages that helps you further.

Kind regards, Jan Tjalling

From: Jon Raiford <raiford@labware.com>
Sent: 25 November 2022 03:58
To: Wal, Jan Tjalling van der <jan_tjalling.vanderwal@wur.nl>; Marsupilami79 <marsupilami79@gmx.de>; pgsql-odbc@lists.postgresql.org
Subject: Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?

I believe the point is that the function is a "W" (wide char 16-bit) function so the strings should be UTF-16.

Jon


From: Wal, Jan Tjalling van der <jan_tjalling.vanderwal@wur.nl>
Sent: Tuesday, November 22, 2022 11:21:56 AM
To: Marsupilami79 <marsupilami79@gmx.de>; pgsql-odbc@lists.postgresql.org <pgsql-odbc@lists.postgresql.org>
Subject: RE: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?

Hello Marsupilami79, Jan,

It could be that the answers you receive are in fact correct for postgres.
In a database set to charset=UTF-8 I get the following answers.

select 'topsales' as string, char_length('topsales'), length('topsales'), octet_length('topsales');
"string"      "char_length"   "length"   "octet_length"
"topsales"    8               8          8

However, for a variation that requires more bytes to store, the answers start to differ.
select 'töpsålés' as string, char_length('töpsålés'), length('töpsålés'), octet_length('töpsålés');
"string"      "char_length"   "length"   "octet_length"
"töpsålés"    8               8          11

In the above, octet_length is a PostgreSQL function that yields results in bytes.
Each of the three characters with a diacritic requires 2 bytes in UTF-8, yielding a length of 11 instead of 8.
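
The same counts can be reproduced outside the database. The following Python sketch is only an approximation of what char_length and octet_length report, assuming a UTF-8 database encoding and precomposed characters; the function name `char_and_octet_length` is made up for this example.

```python
from typing import Tuple


def char_and_octet_length(text: str) -> Tuple[int, int]:
    """Return (character count, UTF-8 byte count) for `text`."""
    # len() counts code points, roughly what char_length reports;
    # the encoded length is the UTF-8 byte count, like octet_length.
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    plain = "topsales"
    # Escapes keep the accented letters precomposed (one code point each).
    accented = "t\u00f6ps\u00e5l\u00e9s"
    for value in (plain, accented):
        print(value, *char_and_octet_length(value))
```

Neither figure is a UTF-16 byte count, so neither corresponds to what a wide-character function would be expected to return.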

Kind regards, Jan Tjalling van der Wal


-----Original Message-----
From: Marsupilami79 <marsupilami79@gmx.de>
Sent: 22 November 2022 16:57
To: pgsql-odbc@lists.postgresql.org
Subject: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?

Hello,

I am a co-author of a data access library and we recently added an ODBC bridge. This bridge has the capability to determine the current catalog / database. This is done by calling SQLGetConnectAttrW.

We try to determine the size of the buffer that is needed for the catalog name in the following manner:
SQLGetConnectAttrW(fHDBC, SQL_ATTR_CURRENT_CATALOG, null, 0, &aLen)

The ODBC driver for Microsoft SQL Server correctly returns the number of bytes required (10 bytes for the database name "Stork") in the aLen parameter. The ODBC driver for PostgreSQL returns the number of characters (8 for a database named "topsales"), where it should return 16, the number of bytes required.

I tested this with the psqlodbc_13_02_0000-x86 download for Windows 10 and installed the Unicode ODBC driver.

I assume this is a bug and needs to be fixed. I just don't know whether this is the right place to report it.

With best regards,

Jan

