From: Neel Patel
Date: Tue, 23 Feb 2021 18:23:57 +0530
Subject: Re: pgagent unicode support
To: Sergey Burladyan
Cc: pgadmin-hackers, Ashesh Vashi, Dave Page

Hi Sergey,

Do you have any comments on this updated patch? Let us know ASAP so that we can commit it.

Thanks,
Neel Patel

On Thu, Feb 18, 2021 at 5:12 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
Hi Sergey,

Thank you for the patch. It looks good to me, except for the points below.

We have modified the patch to fix the memory leak (a review comment from Ashesh) and the compilation warnings.
Can you please review it and let us know?

Thanks,
Neel Patel

On Mon, Feb 15, 2021 at 6:15 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
Thanks, Sergey, for the patch.

Sure, Dave.
There are some compilation warnings on Linux. I will fix those, test pgAgent on Windows, and update the thread.

On Mon, Feb 8, 2021 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
Hi

On Sat, Feb 6, 2021 at 5:00 AM Sergey Burladyan <eshkinkot@gmail.com> wrote:
Currently pgagent doesn't handle Unicode correctly.

The CharToWString function corrupts multibyte characters because it
processes the string one byte at a time:
    std::string s = std::string(cstr);
    std::wstring wsTmp(s.begin(), s.end());
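
To illustrate the byte-at-a-time problem, here is a minimal sketch. The helper names are hypothetical and this is not pgAgent code; it only assumes the input is UTF-8. It shows that the iterator-range constructor produces one wide character per byte, so a three-letter Cyrillic word turns into six unrelated wide characters:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not pgAgent code): reproduces the quoted conversion,
// where each byte of the input becomes its own wchar_t.
std::wstring widen_bytewise(const std::string &s)
{
    return std::wstring(s.begin(), s.end());
}

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8_code_points(const std::string &s)
{
    std::size_t count = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80)
            ++count;
    return count;
}

// Self-check: the first Cyrillic word of the test query is 3 characters but
// 6 bytes in UTF-8, so byte-wise widening yields 6 wide characters.
void check_bytewise_widening()
{
    const std::string word = "\xD1\x8D\xD1\x82\xD0\xBE";
    assert(utf8_code_points(word) == 3);
    assert(widen_bytewise(word).size() == 6);
}
```

For pure ASCII input the two counts agree, which is probably why the problem only shows up with non-ASCII text.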

The WStringToChar function does not take into account that the wcstombs
output can contain _multi_byte characters, and it creates a buffer with
size = wcslen:
    int wstr_length = wcslen(wchar_str);
    char *dst = new char[wstr_length + 10];
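
One possible way to size the buffer is to ask the C library for the required byte count first. The sketch below is not the code from the attached patch, and the function name is made up for illustration. It relies on the standard behaviour of wcsrtombs, which only counts bytes when the destination pointer is null:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper (not the patch's code): converts a wide string to the
// multibyte encoding of the current LC_CTYPE locale. Returns false when a
// character cannot be represented in that locale.
bool wide_to_multibyte(const wchar_t *wstr, std::string &out)
{
    assert(wstr != nullptr);

    // With a null destination, wcsrtombs only reports how many bytes the
    // conversion needs (excluding the terminating null byte).
    std::mbstate_t state = std::mbstate_t();
    const wchar_t *src = wstr;
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return false;

    out.assign(needed, '\0');
    if (needed == 0)
        return true;

    // Second pass: convert into a buffer of exactly the reported size.
    state = std::mbstate_t();
    src = wstr;
    const std::size_t written = std::wcsrtombs(&out[0], &src, needed, &state);
    return written == needed;
}
```

A simpler alternative would be to allocate wcslen(wchar_str) * MB_CUR_MAX + 1 bytes, which over-allocates but avoids the counting pass.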

Also, pgagent does not set up the locale with setlocale(); without it, the
wcs/mbs functions cannot handle multibyte strings.
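
As background for the setlocale() point: a C or C++ program starts in the "C" locale, so the conversion functions do not know the environment's encoding until setlocale() is called. A rough sketch of a startup hook plus a locale-aware decode follows. Both function names are hypothetical, and this is not taken from the attached patch:

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical startup hook: adopt the character encoding named by the
// environment (LC_ALL / LC_CTYPE / LANG). Returns false if that locale is
// not available, in which case the previous locale stays active.
bool init_ctype_locale()
{
    return std::setlocale(LC_CTYPE, "") != nullptr;
}

// Hypothetical helper: decodes a multibyte string according to the current
// LC_CTYPE locale. Returns false on a byte sequence that is invalid there.
bool multibyte_to_wide(const char *str, std::wstring &out)
{
    assert(str != nullptr);

    // Null destination: only count the wide characters needed.
    std::mbstate_t state = std::mbstate_t();
    const char *src = str;
    const std::size_t needed = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return false;

    out.assign(needed, L'\0');
    if (needed == 0)
        return true;

    // Second pass: decode into a buffer of exactly the reported size.
    state = std::mbstate_t();
    src = str;
    const std::size_t written = std::mbsrtowcs(&out[0], &src, needed, &state);
    return written == needed;
}
```

The idea would be to call init_ctype_locale() once near the top of main(), before any conversion. Whether a UTF-8 string then decodes correctly still depends on the environment actually naming a UTF-8 locale.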

For example, a step with Cyrillic text in the query fails:

=== step code ===
select 'это проверка кириллицы в теле запроса pgagent'
=================

=== postgres log ===
2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR:  unterminated quoted string at or near "'" at character 8
2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT:  select '
====================

Please see attached patch.
I only tested it on GNU/Linux and can't test it on Windows, sorry.

Thanks for the patch! Neel/Ashesh, can you take a look please? It looks OK to me, but then I'm not overly familiar with multibyte string handling. What, if anything, needs to be done on Windows?

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com
