From: Neel Patel
Date: Thu, 18 Feb 2021 17:12:38 +0530
Subject: Re: pgagent unicode support
To: Sergey Burladyan
Cc: pgadmin-hackers, Ashesh Vashi, Dave Page
References: <87a6shyenl.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="000000000000cef6a505bb9ad532"

--000000000000cef6a505bb9ad532
Content-Type: multipart/alternative; boundary="000000000000cef6a205bb9ad530"

--000000000000cef6a205bb9ad530
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Hi Sergey,

Thank you for the patch. It looks good to me, except for the following.

We have modified the patch to fix the memory leak (a review comment from Ashesh) and the compilation warnings.
Can you please review it and let us know?

Thanks,
Neel Patel

On Mon, Feb 15, 2021 at 6:15 PM Neel Patel <neel.patel@enterprisedb.com> wrote:

> Thanks, Sergey, for the patch.
>
> Sure, Dave.
> There are some compilation warnings on Linux. I will fix those, test
> pgAgent on Windows, and update the thread.
>
> On Mon, Feb 8, 2021 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
>
>> Hi
>>
>> On Sat, Feb 6, 2021 at 5:00 AM Sergey Burladyan <eshkinkot@gmail.com>
>> wrote:
>>
>>> Currently pgagent doesn't handle Unicode correctly.
>>>
>>> The CharToWString function corrupts multibyte characters because it
>>> processes the string one byte at a time:
>>>     std::string s = std::string(cstr);
>>>     std::wstring wsTmp(s.begin(), s.end());
>>>
>>> The WStringToChar function does not take into account that wcstombs
>>> can output multibyte characters, and it creates a buffer with
>>> size = wcslen:
>>>     int wstr_length = wcslen(wchar_str);
>>>     char *dst = new char[wstr_length + 10];
>>>
>>> Also, pgagent does not set up the locale with setlocale(); without it,
>>> the wcs/mbs functions cannot handle multibyte strings.
>>>
>>> For example:
>>>
>>> === step code ===
>>> select 'это проверка кириллицы в теле запроса pgagent'
>>> =================
>>>
>>> === postgres log ===
>>> 2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR:
>>> unterminated quoted string at or near "'" at character 8
>>> 2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT:  select '
>>> ====================
>>>
>>> Please see attached patch.
>>> I only tested it on GNU/Linux and can't test it on Windows, sorry.
>>>
>>
>> Thanks for the patch! Neel/Ashesh, can you take a look please? It looks
>> OK to me, but then I'm not overly familiar with multibyte string handling.
>> What, if anything, needs to be done on Windows?
>>
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EDB: http://www.enterprisedb.com
>>
>>

--000000000000cef6a205bb9ad530
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Sergey,

Thank you for the patch. It looks good to me, except for the following.

We have modified the patch to fix the memory leak (a review comment from Ashesh) and the compilation warnings.
Can you please review it and let us know?

Thanks,
Neel Patel

On Mon, Feb 15, 2021 at 6:15 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
Thanks, Sergey, for the patch.

Sure, Dave.
There are some compilation warnings on Linux. I will fix those, test pgAgent on Windows, and update the thread.

On Mon, Feb 8, 2021 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
Hi

On Sat, Feb 6, 2021 at 5:00 AM Sergey Burladyan <eshkinkot@gmail.com> wrote:
Currently pgagent doesn't handle Unicode correctly.

The CharToWString function corrupts multibyte characters because it processes the string one byte at a time:
    std::string s = std::string(cstr);
    std::wstring wsTmp(s.begin(), s.end());
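
To make the corruption concrete, here is a minimal sketch (not pgAgent code; `widen_bytewise` and `bytewise_length` are names made up for this illustration) that reproduces the quoted two-line conversion. It assumes the input is UTF-8, where the Cyrillic letter э is stored as the two bytes 0xD1 0x8D.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Widens a narrow string the way the two quoted lines do: every byte
// becomes its own wchar_t, and no multibyte decoding takes place.
std::wstring widen_bytewise(const char *cstr)
{
    std::string s(cstr);
    return std::wstring(s.begin(), s.end());
}

// Hypothetical helper: how many wide characters the byte-wise copy produces.
std::size_t bytewise_length(const char *cstr)
{
    return widen_bytewise(cstr).size();
}
```

Under these assumptions, `bytewise_length("\xD1\x8D")` is 2 although the input holds a single character, and neither of the two resulting wide characters is the code point U+044D.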

The WStringToChar function does not take into account that wcstombs can output multibyte characters, and it creates a buffer with size = wcslen:
    int wstr_length = wcslen(wchar_str);
    char *dst = new char[wstr_length + 10];
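
One possible way to size the buffer in bytes rather than in wide characters is a two-pass call to wcstombs. The sketch below is an illustration only, not necessarily what the attached patch does; `narrow` is a hypothetical helper name, and the null-destination length query is the behaviour specified by POSIX.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not pgAgent's API): convert a wide string to a
// multibyte string in the current C locale, sizing the buffer in bytes.
// Returns false if a character cannot be represented in that locale.
bool narrow(const std::wstring &wstr, std::string &out)
{
    // First pass: a null destination only asks for the required byte count.
    std::size_t bytes = std::wcstombs(nullptr, wstr.c_str(), 0);
    if (bytes == static_cast<std::size_t>(-1))
        return false;

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buf(bytes + 1, '\0');
    if (std::wcstombs(buf.data(), wstr.c_str(), buf.size()) ==
        static_cast<std::size_t>(-1))
        return false;

    out.assign(buf.data(), bytes);
    return true;
}
```

Using `std::vector` for the scratch buffer avoids a manual `delete[]` on the error paths; a caller that needs a raw `char *` would have to copy the result out.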

Also, pgagent does not set up the locale with setlocale(); without it,
the wcs/mbs functions cannot handle multibyte strings.
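
A C or C++ program starts in the "C" locale until it calls setlocale, so the conversion functions only see the environment's encoding after a call like the one sketched here. Both function names are hypothetical and this is not pgAgent's startup code; it assumes the environment (LANG, LC_ALL) names an installed locale such as a UTF-8 one.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical startup hook: adopt the locale named by the environment.
// Returns false if that locale is unavailable, in which case the process
// stays in the default "C" locale.
bool init_locale_from_environment()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: multibyte -> wide conversion in the current locale.
// Returns an empty string if the input is not valid in that locale.
std::wstring widen(const char *cstr)
{
    std::size_t count = std::mbstowcs(nullptr, cstr, 0);
    if (count == static_cast<std::size_t>(-1))
        return std::wstring();

    std::vector<wchar_t> buf(count + 1, L'\0');
    std::mbstowcs(buf.data(), cstr, buf.size());
    return std::wstring(buf.data(), count);
}
```

The intended order is to call `init_locale_from_environment()` once, early in `main`, before any call to `widen`; how non-ASCII input is treated when the call is skipped depends on the C library.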

For example:

=== step code ===
select 'это проверка кириллицы в теле запроса pgagent'
=================

=== postgres log ===
2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR:  unterminated quoted string at or near "'" at character 8
2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT:  select '
====================

Please see attached patch.
I only tested it on GNU/Linux and can't test it on Windows, sorry.

Thanks for the patch! Neel/Ashesh, can you take a look please? It looks OK to me, but then I'm not overly familiar with multibyte string handling. What, if anything, needs to be done on Windows?

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com

--000000000000cef6a205bb9ad530-- --000000000000cef6a505bb9ad532 Content-Type: application/octet-stream; name="pgagent_unicode.patch" Content-Disposition: attachment; filename="pgagent_unicode.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_klasmzc10 ZGlmZiAtLWdpdCBhL21pc2MuY3BwIGIvbWlzYy5jcHAKaW5kZXggMzVhYzgzZC4uYjA0NmY0NyAx MDA2NDQKLS0tIGEvbWlzYy5jcHAKKysrIGIvbWlzYy5jcHAKQEAgLTE0MywyMiArMTQzLDYzIEBA IHN0ZDo6d3N0cmluZyBOdW1Ub1N0cihjb25zdCBsb25nIGwpCiB9CiAKIC8vIFRoaXMgZnVuY3Rp b24gaXMgdXNlZCB0byBjb252ZXJ0IGNoYXIqIHRvIHN0ZDo6d3N0cmluZy4KLXN0ZDo6d3N0cmlu ZyBDaGFyVG9XU3RyaW5nKGNvbnN0IGNoYXIqIGNzdHIpCitzdGQ6OndzdHJpbmcgQ2hhclRvV1N0 cmluZyhjb25zdCBjaGFyICpjc3RyKQogewotCXN0ZDo6c3RyaW5nIHMgPSBzdGQ6OnN0cmluZyhj c3RyKTsKLQlzdGQ6OndzdHJpbmcgd3NUbXAocy5iZWdpbigpLCBzLmVuZCgpKTsKLQlyZXR1cm4g d3NUbXA7CisgICAgICAgIGlmIChjc3RyICE9IE5VTEwpCisgICAgICAgIHsKKyAgICAgICAgICAg ICAgICBzaXplX3Qgd2NfY250ID0gbWJzdG93Y3MoTlVMTCwgY3N0ciwgMCk7CisKKyAgICAgICAg ICAgICAgICBpZiAod2NfY250ID09IChzaXplX3QpIC0xKSB7CisgICAgICAgICAgICAgICAgICAg ICAgICByZXR1cm4gc3RkOjp3c3RyaW5nKCk7CisgICAgICAgICAgICAgICAgfQorCisgICAgICAg ICAgICAgICAgd2NoYXJfdCAqd2NzID0gbmV3IHdjaGFyX3Rbd2NfY250ICsgMV07CisgICAgICAg ICAgICAgICAgaWYgKHdjcyA9PSBOVUxMKSB7CisgICAgICAgICAgICAgICAgICAgICAgICByZXR1 cm4gc3RkOjp3c3RyaW5nKCk7CisgICAgICAgICAgICAgICAgfQorCisgICAgICAgICAgICAgICAg aWYgKG1ic3Rvd2NzKHdjcywgY3N0ciwgd2NfY250ICsgMSkgPT0gKHNpemVfdCkgLTEpIHsKKyAg ICAgICAgICAgICAgICAgICAgICAgIGRlbGV0ZSBbXSB3Y3M7CisgICAgICAgICAgICAgICAgICAg ICAgICByZXR1cm4gc3RkOjp3c3RyaW5nKCk7CisgICAgICAgICAgICAgICAgfQorCisgICAgICAg ICAgICAgICAgc3RkOjp3c3RyaW5nIHRtcCgmd2NzWzBdLCAmd2NzW3djX2NudF0pOworICAgICAg ICAgICAgICAgIGRlbGV0ZSBbXSB3Y3M7CisKKyAgICAgICAgICAgICAgICByZXR1cm4gdG1wOwor ICAgICAgICB9CisgICAgICAgIHJldHVybiBzdGQ6OndzdHJpbmcoKTsKIH0KIAogLy8gVGhpcyBm dW5jdGlvbiBpcyB1c2VkIHRvIGNvbnZlcnQgc3RkOjp3c3RyaW5nIHRvIGNoYXIgKi4KLWNoYXIg KiBXU3RyaW5nVG9DaGFyKGNvbnN0IHN0ZDo6d3N0cmluZyAmd3N0cikKK2NoYXIgKldTdHJpbmdU 
b0NoYXIoY29uc3Qgc3RkOjp3c3RyaW5nICZ3c3RyKQogewotCWNvbnN0IHdjaGFyX3QgKndjaGFy X3N0ciA9IHdzdHIuY19zdHIoKTsKLQlpbnQgd3N0cl9sZW5ndGggPSB3Y3NsZW4od2NoYXJfc3Ry KTsKLQljaGFyICpkc3QgPSBuZXcgY2hhclt3c3RyX2xlbmd0aCArIDEwXTsKLQltZW1zZXQoZHN0 LCAweDAwLCAod3N0cl9sZW5ndGggKyAxMCkpOwotCXdjc3RvbWJzKGRzdCwgd2NoYXJfc3RyLCB3 c3RyX2xlbmd0aCk7Ci0JcmV0dXJuIGRzdDsKKyAgICAgICAgY29uc3Qgd2NoYXJfdCAqd2NoYXJf c3RyID0gd3N0ci5jX3N0cigpOworCisgICAgICAgIGludCBtYl9sZW4gPSB3Y3N0b21icyhOVUxM LCB3Y2hhcl9zdHIsIDApOworCisgICAgICAgIGlmICgoc2l6ZV90KW1iX2xlbiA9PSAoc2l6ZV90 KSAtMSkgeworICAgICAgICAgICAgICAgIHJldHVybiBOVUxMOworICAgICAgICB9CisKKyAgICAg ICAgY2hhciAqbWJzID0gbmV3IGNoYXJbbWJfbGVuICsgMV07CisgICAgICAgIGlmIChtYnMgPT0g TlVMTCkgeworICAgICAgICAgICAgICAgIHJldHVybiBOVUxMOworICAgICAgICB9CisKKyAgICAg ICAgbWVtc2V0KG1icywgMCwgbWJfbGVuICsgMSk7CisKKyNpZmRlZiBfX1dJTjMyX18KKyAgICAg ICAgc2l6ZV90IGNoYXJzQ29udmVydGVkID0gMDsKKyAgICAgICAgd2NzdG9tYnNfcygmY2hhcnND b252ZXJ0ZWQsIG1icywgbWJfbGVuICsgMTAsIHdjaGFyX3N0ciwgbWJfbGVuICsgMSk7CisjZWxz ZQorICAgICAgICBpZiAod2NzdG9tYnMobWJzLCB3Y2hhcl9zdHIsIG1iX2xlbiArIDEpID09IChz aXplX3QpIC0xKSB7CisgICAgICAgICAgICAgICAgZGVsZXRlIFtdIG1iczsKKyAgICAgICAgICAg ICAgICByZXR1cm4gTlVMTDsKKyAgICAgICAgfQorCisjZW5kaWYKKyAgICAgICAgcmV0dXJuIG1i czsKIH0KIAogLy8gQmVsb3cgZnVuY3Rpb24gd2lsbCBnZW5lcmF0ZSByYW5kb20gc3RyaW5nIG9m IGdpdmVuIGNoYXJhY3Rlci4K --000000000000cef6a505bb9ad532--