SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
7+ messages / 4 participants
* SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Marsupilami79 @ 2022-11-22 15:56 UTC
To: [email protected]

Hello,

I am a co-author of a data access library, and we recently added an ODBC bridge. The bridge can determine the current catalog / database by calling SQLGetConnectAttrW.

We try to determine the size of the buffer needed for the catalog name as follows:

    SQLGetConnectAttrW(fHDBC, SQL_ATTR_CURRENT_CATALOG, null, 0, &aLen)

The ODBC driver for Microsoft SQL Server correctly returns the number of bytes required in the aLen parameter (10 bytes for the database name "Stork"). The ODBC driver for PostgreSQL returns the number of characters (8 for a database named "topsales"), where it should return 16, the number of bytes required.

I tested this with the psqlodbc_13_02_0000-x86 download on Windows 10, with the Unicode ODBC driver installed.

I assume this is a bug that needs to be fixed. I just don't know whether this is the right place to report it.

With best regards,

Jan
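Editor's sketch, not part of the original message: the byte counts quoted above can be reproduced without any ODBC driver by encoding the names as UTF-16. The helper name `wide_byte_length` is hypothetical, and the sketch assumes the length is reported without a terminating NUL, as the numbers in the report suggest.

```python
def wide_byte_length(name: str) -> int:
    """Bytes needed to hold `name` as UTF-16, excluding any terminator.

    "utf-16-le" is used so that no byte-order mark is added to the count.
    """
    return len(name.encode("utf-16-le"))


print(wide_byte_length("Stork"))     # 10
print(wide_byte_length("topsales"))  # 16
```

A caller allocating the real buffer would presumably add two more bytes for the terminating NUL.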
* RE: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Wal, Jan Tjalling van der @ 2022-11-22 16:21 UTC
parent: Marsupilami79
To: Marsupilami79; [email protected]

Hello Marsupilami79, Jan,

It could be that the answers you receive are in fact correct for Postgres. In a database set to charset=UTF-8 I get the following results.

    select 'topsales' as string, char_length('topsales'), length('topsales'), octet_length('topsales');

    "string"    "char_length"  "length"  "octet_length"
    "topsales"  8              8         8

However, for a variation that requires more bytes to store, the answers start to differ.

    select 'töpsålés' as string, char_length('töpsålés'), length('töpsålés'), octet_length('töpsålés');

    "string"    "char_length"  "length"  "octet_length"
    "töpsålés"  8              8         11

In the above, octet_length is a Postgres function that returns the length in bytes. The three characters with a diacritic each require 2 bytes, so the resulting length is 11 instead of 8.

Kind regards, Jan Tjalling van der Wal
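Editor's sketch, not part of the original message: the same three measurements can be mirrored in plain Python, with a UTF-16 column added for comparison. The function name `lengths` and its dictionary keys are hypothetical. The accented name is written with escapes so that the code points (precomposed ö, å, é) are unambiguous.

```python
def lengths(s: str) -> dict:
    """Character count, UTF-8 byte count, and UTF-16 byte count of `s`."""
    return {
        "char_length": len(s),                          # code points
        "octet_length_utf8": len(s.encode("utf-8")),    # bytes in a UTF-8 database
        "byte_length_utf16": len(s.encode("utf-16-le")),  # bytes as wide chars
    }


accented = "t\u00f6ps\u00e5l\u00e9s"  # "töpsålés"

print(lengths("topsales"))
# {'char_length': 8, 'octet_length_utf8': 8, 'byte_length_utf16': 16}
print(lengths(accented))
# {'char_length': 8, 'octet_length_utf8': 11, 'byte_length_utf16': 16}
```

For the plain ASCII name the character count and the UTF-8 byte count coincide, so a returned 8 cannot by itself distinguish the two.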
* Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Jon Raiford @ 2022-11-25 02:58 UTC
parent: Wal, Jan Tjalling van der
To: Wal, Jan Tjalling van der; Marsupilami79; [email protected]

I believe the point is that the function is a "W" (wide char, 16-bit) function, so the strings should be UTF-16.

Jon
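Editor's sketch, not part of the original message: what "16-bit wide char" means for sizes can be illustrated by counting UTF-16 code units. The helper `utf16_code_units` is hypothetical. Characters outside the Basic Multilingual Plane take two code units (a surrogate pair), so "2 bytes per character" holds per code unit rather than per code point, but the byte total is even either way.

```python
def utf16_code_units(s: str) -> int:
    """Number of 16-bit code units needed to represent `s` in UTF-16."""
    return len(s.encode("utf-16-le")) // 2


print(utf16_code_units("topsales"))    # 8  (16 bytes)
print(utf16_code_units("\U0001F600"))  # 2  (one code point, 4 bytes)
```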
* RE: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Wal, Jan Tjalling van der @ 2022-11-25 19:22 UTC
parent: Jon Raiford
To: Jon Raiford; Marsupilami79; [email protected]

Okay, getting out of my comfort zone here.

I have found the encoding options for PostgreSQL (v15) here: PostgreSQL: Documentation: 15: 24.3. Character Set Support <https://www.postgresql.org/docs/current/multibyte.html>; they do not include UTF16, just UTF8. There is no mention of UTF16 anywhere on that page.

I also found this: PostgreSQL: Re: DataDirect Driver, ExecDirect and UTF-8 <https://www.postgresql.org/message-id/C631E68E.133%[email protected]>, which does mention WCHAR being different from CHAR and how that could work.

I hope there is something on these pages that helps you further.

Kind regards, Jan Tjalling
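Editor's sketch, not part of the original message: a server that stores text as UTF-8 and a client API that expects UTF-16 are reconciled by a conversion step somewhere between the two. The function `to_wide` below is hypothetical and is not psqlodbc's actual code; it only illustrates that the wide-char size depends on the text, not on the encoding the bytes arrived in.

```python
def to_wide(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Re-encode bytes from `source_encoding` as UTF-16 (little endian, no BOM)."""
    return raw.decode(source_encoding).encode("utf-16-le")


name = "t\u00f6ps\u00e5l\u00e9s"  # "töpsålés"

from_utf8 = to_wide(name.encode("utf-8"))               # 11 bytes in
from_latin1 = to_wide(name.encode("latin-1"), "latin-1")  # 8 bytes in

print(len(from_utf8), len(from_latin1))  # 16 16
```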
* Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Marsupilami79 @ 2022-11-28 11:16 UTC
parent: Wal, Jan Tjalling van der
To: Wal, Jan Tjalling van der; Jon Raiford; [email protected]

Hello Jan Tjalling, hello Jon,

Jon is right - SQLGetConnectAttrW is a WideChar function (2 bytes per character), so the encoding of the PG database should not matter. It should return the length of the SQL_ATTR_CURRENT_CATALOG attribute in bytes. For a database named "topsales" it should therefore return 16, because each character uses two bytes.

I assume this is a bug in the PostgreSQL ODBC driver. The question is where to file a bug report and how to get this fixed. Is there a chance of getting it fixed?

With best regards,

Jan Baumgarten
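Editor's sketch, not part of the original message: until the driver behaviour is settled, a caller could size its buffer so that it is large enough whether the reported length is a byte count or a count of 16-bit code units. The helper `defensive_buffer_bytes` and the constant `WCHAR_SIZE` are hypothetical names, and the approach is an assumption about what a cautious client might do, not a documented ODBC recommendation.

```python
WCHAR_SIZE = 2  # bytes per UTF-16 code unit


def defensive_buffer_bytes(reported: int) -> int:
    """Buffer size in bytes that suffices for either reading of `reported`.

    If `reported` is already bytes, reported + 2 would be enough.
    If it is a count of code units, reported * 2 + 2 is needed.
    The larger value is used; it is always even and includes the terminator.
    """
    return reported * WCHAR_SIZE + WCHAR_SIZE


print(defensive_buffer_bytes(8))   # 18, exact fit if 8 meant characters
print(defensive_buffer_bytes(16))  # 34, over-allocated if 16 meant bytes
```

The cost is at most a doubling of a small buffer, which seems an acceptable trade for not truncating the catalog name.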
* Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Jon Raiford @ 2022-11-28 14:01 UTC
parent: Marsupilami79
To: Marsupilami79; Wal, Jan Tjalling van der; [email protected]

I don't know if there is an official place to file a bug report for this issue. I have seen several issues reported on this email list that were later fixed, so presumably this is an appropriate place to document it.

I did some research on this and found something interesting that may help. The Microsoft documentation for SQLGetConnectAttr() specifically mentions the wide char format:

https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgetconnectattr-function

Under the description of the BufferLength argument (the length of the buffer that receives the attribute value):

“If the value in *ValuePtr is a Unicode string (when calling SQLGetConnectAttrW), the BufferLength argument must be an even number.”

The value is expected to be even because Unicode strings in the Win32 wide char functions are made up of 16-bit units.

I hope this helps,
Jon
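Editor's sketch, not part of the original message: the even-number rule quoted from the Microsoft documentation can be expressed as a small check. The function `wide_buffer_capacity` is hypothetical and covers only the string-attribute case discussed in this thread, where BufferLength is a plain byte count.

```python
def wide_buffer_capacity(buffer_length: int) -> int:
    """Capacity of a wide-char buffer in 16-bit code units.

    Raises ValueError when `buffer_length` is negative or odd, since an odd
    number of bytes cannot hold a whole number of 16-bit units.
    """
    if buffer_length < 0 or buffer_length % 2 != 0:
        raise ValueError(f"BufferLength must be a non-negative even number, got {buffer_length}")
    return buffer_length // 2


print(wide_buffer_capacity(18))  # 9: "topsales" plus a terminating NUL
```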
* Re: SQLGetConnectAttrW + SQL_ATTR_CURRENT_CATALOG => wrong byte count?
From: Inoue,Hiroshi @ 2022-11-30 11:19 UTC
parent: Marsupilami79
To: Marsupilami79; +Cc: [email protected]

Hi Jan,

The psqlodbc Unicode driver correctly returns the number of bytes here (10 bytes for a database named "visco").

regards,
Hiroshi Inoue
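Editor's sketch, not part of the original message: the two conflicting observations in this thread can be compared by checking each reported length against the UTF-16 byte count and the character count of the name. The function `classify_reported_length` is hypothetical; it says nothing about why the two setups differ.

```python
def classify_reported_length(name: str, reported: int) -> str:
    """Guess what unit a driver-reported length is in for a known name."""
    if reported == len(name.encode("utf-16-le")):
        return "bytes"
    if reported == len(name):
        return "characters"
    return "unknown"


print(classify_reported_length("visco", 10))    # bytes
print(classify_reported_length("Stork", 10))    # bytes
print(classify_reported_length("topsales", 8))  # characters
```

Running the same check with the same database name on both machines would be one way to narrow down whether driver version, driver flavor, or driver manager accounts for the difference.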