public inbox for [email protected]
help / color / mirror / Atom feedFrom: Neel Patel <[email protected]>
To: Sergey Burladyan <[email protected]>
Cc: pgadmin-hackers <[email protected]>
Cc: Ashesh Vashi <[email protected]>
Cc: Dave Page <[email protected]>
Subject: Re: pgagent unicode support
Date: Thu, 18 Feb 2021 17:12:38 +0530
Message-ID: <CACCA4P3JvfMmG_LCm8Oeqnp4r=EuJBLuaS2mUieuw+--2ybgFQ@mail.gmail.com> (raw)
In-Reply-To: <CACCA4P0PG8y_u9vDwTTL4AHbXOfP1k7joLY4tjLQ40DNpWgyoA@mail.gmail.com>
References: <[email protected]>
<CA+OCxoxsk-VJOJvw95V+pDN0G79T5-F1jAHB-THsPZd10f0KRg@mail.gmail.com>
<CACCA4P0PG8y_u9vDwTTL4AHbXOfP1k7joLY4tjLQ40DNpWgyoA@mail.gmail.com>
Hi Sergey,
Thank you for the patch. It looks good to me except below.
We have modified the patch as we fixed the memory leak ( review comment
given by Ashesh ) and also fixed the compilation warnings.
Can you please review and let us know ?
Thanks,
Neel Patel
On Mon, Feb 15, 2021 at 6:15 PM Neel Patel <[email protected]>
wrote:
> Thanks Sergey for the patch.
>
> Sure Dave.
> There is some compilation warning in linux, I will fix those and test
> pgAgent in windows and update the thread.
>
> On Mon, Feb 8, 2021 at 2:55 PM Dave Page <[email protected]> wrote:
>
>> Hi
>>
>> On Sat, Feb 6, 2021 at 5:00 AM Sergey Burladyan <[email protected]>
>> wrote:
>>
>>> Currently pgagent doesn't handle unicode correctly.
>>>
>>> CharToWString function corrupt multibyte characters because it processes
>>> string one byte at a time:
>>> 148 std::string s = std::string(cstr);
>>> 149 std::wstring wsTmp(s.begin(), s.end());
>>>
>>> WStringToChar function does not take into account that there can be
>>> _multi_byte character on wcstombs output and create buffer with
>>> size = wcslen:
>>> 157 int wstr_length = wcslen(wchar_str);
>>> 158 char *dst = new char[wstr_length + 10];
>>>
>>> Also pgagent do not setup locale with setlocale(), without it all
>>> wcs/mbs functions cannot handle multibyte strings.
>>>
>>> For example:
>>>
>>> === step code ===
>>> select 'это проверка кириллицы в теле запроса pgagent'
>>> =================
>>>
>>> === postgres log ===
>>> 2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR:
>>> unterminated quoted string at or near "'" at character 8
>>> 2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT: select '
>>> ====================
>>>
>>> Please see attached patch.
>>> I only test it on GNU/Linux and can't test it on Windows, sorry.
>>>
>>
>> Thanks for the patch! Neel/Ashesh; can you take a look please? It looks
>> OK to me, but then I'm not overly familiar with multibyte string handling.
>> What, if anything, needs to be done on Windows?
>>
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EDB: http://www.enterprisedb.com
>>
>>
Attachments:
[application/octet-stream] pgagent_unicode.patch (2.2K, 3-pgagent_unicode.patch)
download | inline diff:
diff --git a/misc.cpp b/misc.cpp
index 35ac83d..b046f47 100644
--- a/misc.cpp
+++ b/misc.cpp
@@ -143,22 +143,63 @@ std::wstring NumToStr(const long l)
}
// This function is used to convert char* to std::wstring.
-std::wstring CharToWString(const char* cstr)
+std::wstring CharToWString(const char *cstr)
{
- std::string s = std::string(cstr);
- std::wstring wsTmp(s.begin(), s.end());
- return wsTmp;
+ if (cstr != NULL)
+ {
+ size_t wc_cnt = mbstowcs(NULL, cstr, 0);
+
+ if (wc_cnt == (size_t) -1) {
+ return std::wstring();
+ }
+
+ wchar_t *wcs = new wchar_t[wc_cnt + 1];
+ if (wcs == NULL) {
+ return std::wstring();
+ }
+
+ if (mbstowcs(wcs, cstr, wc_cnt + 1) == (size_t) -1) {
+ delete [] wcs;
+ return std::wstring();
+ }
+
+ std::wstring tmp(&wcs[0], &wcs[wc_cnt]);
+ delete [] wcs;
+
+ return tmp;
+ }
+ return std::wstring();
}
// This function is used to convert std::wstring to char *.
-char * WStringToChar(const std::wstring &wstr)
+char *WStringToChar(const std::wstring &wstr)
{
- const wchar_t *wchar_str = wstr.c_str();
- int wstr_length = wcslen(wchar_str);
- char *dst = new char[wstr_length + 10];
- memset(dst, 0x00, (wstr_length + 10));
- wcstombs(dst, wchar_str, wstr_length);
- return dst;
+ const wchar_t *wchar_str = wstr.c_str();
+
+ int mb_len = wcstombs(NULL, wchar_str, 0);
+
+ if ((size_t)mb_len == (size_t) -1) {
+ return NULL;
+ }
+
+ char *mbs = new char[mb_len + 1];
+ if (mbs == NULL) {
+ return NULL;
+ }
+
+ memset(mbs, 0, mb_len + 1);
+
+#ifdef __WIN32__
+ size_t charsConverted = 0;
+ wcstombs_s(&charsConverted, mbs, mb_len + 10, wchar_str, mb_len + 1);
+#else
+ if (wcstombs(mbs, wchar_str, mb_len + 1) == (size_t) -1) {
+ delete [] mbs;
+ return NULL;
+ }
+
+#endif
+ return mbs;
}
// Below function will generate random string of given character.
view thread (13+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: pgagent unicode support
In-Reply-To: <CACCA4P3JvfMmG_LCm8Oeqnp4r=EuJBLuaS2mUieuw+--2ybgFQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox