Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tQDJ5-000saV-11 for pgsql-general@arkaria.postgresql.org; Tue, 24 Dec 2024 22:23:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tQDJ1-008v1a-TQ for pgsql-general@arkaria.postgresql.org; Tue, 24 Dec 2024 22:23:11 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tQDJ1-008v1S-EP for pgsql-general@lists.postgresql.org; Tue, 24 Dec 2024 22:23:11 +0000 Received: from zproxy5.mail3d.it ([212.78.3.138]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tQDIy-000P3L-0p for pgsql-general@lists.postgresql.org; Tue, 24 Dec 2024 22:23:10 +0000 Received: from zproxy5.mail3d.it (localhost.localdomain [127.0.0.1]) by zproxy5.mail3d.it (Postfix) with ESMTPS id 525B460873; Tue, 24 Dec 2024 23:23:05 +0100 (CET) Received: from localhost (localhost.localdomain [127.0.0.1]) by zproxy5.mail3d.it (Postfix) with ESMTP id 36EAA60874; Tue, 24 Dec 2024 23:23:05 +0100 (CET) DKIM-Filter: OpenDKIM Filter v2.10.3 zproxy5.mail3d.it 36EAA60874 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cleistech.it; s=1D99BE42-E8ED-11ED-9278-A8004466375A; t=1735078985; bh=CGysD5bpiFH9Q9JSjiHVzYm772+uPvHMGEX5ErfoE/U=; h=Message-ID:Date:MIME-Version:To:From; b=OoeDBI1jfAWVQTj/s9YP8IWn1pkDgku42YcTq1nN5I2dihp9xvZ7IiTK/HU4f5bmT WSExRWIOJxNUZyQkIHbWwIV32i3md4IryKEKX3iNyi5QbyuLmgiCQKGgZ9fleIOI00 X13uvbl+piH0/cGc7ZQtAbLdpLfFB35rnlUlNX9v+ye2QhmMd6Gg+vG9QqJbskIxDY uiTKyaORvP4V27q+e9XkuTgXRUi6UtL7vSpFDBlvqK+oXEseQSHzt+ROcUwTknidxF Q39X9/wnVqHZ/qUmxPHm/GA1Zxij8u5IbPtkPKFXXfVxLGgC/fNRSg3QepHQ4IoSig kRD1e1cv4BlnQ== Received: from zproxy5.mail3d.it ([127.0.0.1]) by localhost (zproxy5.mail3d.it [127.0.0.1]) (amavis, port 10026) with ESMTP id vDY5nSGVfqxi; Tue, 24 Dec 2024 23:23:05 +0100 (CET) Received: from [192.168.0.50] (host-87-9-123-144.retail.telecomitalia.it [87.9.123.144]) by zproxy5.mail3d.it (Postfix) with ESMTPSA id E100960873; Tue, 24 Dec 2024 23:23:04 +0100 (CET) Content-Type: multipart/alternative; boundary="------------U6e0jfNrlHMp8vCXohLTRIax" Message-ID: Date: Tue, 24 Dec 2024 23:23:04 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Intermittent errors when fetching cursor rows on PostgreSQL 16 To: Adrian Klaver , pgsql-general@lists.postgresql.org Cc: Massimo Catti , Livio Pizzolo References: <446423eb-4a4e-4135-bbb8-4d0e5c7aac3b@cleistech.it> <25d5fb16-9bb2-4ad1-969c-eaca03ccbaaf@aklaver.com> <6ce80aaa-970b-4432-938a-39a07f811599@cleistech.it> <9f60eb26-7d34-4228-bd78-74c6deb90e54@aklaver.com> <282c2a48-bb12-4486-b03d-563523cac81b@cleistech.it> <2645a89e-d661-4f2b-92b3-01154a78d535@aklaver.com> <54689a6a-839c-44c4-90b5-b9692e8e7cb0@aklaver.com> <4efe42a2-789c-4957-a564-25199869f6ec@cleistech.it> Content-Language: it From: Enrico Schenone In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------U6e0jfNrlHMp8vCXohLTRIax Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Hi, Adrian. I'm arranging a test program with two nested cursors in two versions: 1. 4Js Genero BDL language 2. pure C with libpq language I'll put both programs in stress execution into the production environment looking for some hours how they behaves. Possible combinations are: 1. no-one throws an error 2. only the 4Js Genero version throws an error 3. only the pure C version throws an error 4. both versions throws the error This stress test should address further investigations. I'll keep you informed. Regards. Enrico Schenone Il 20/12/24 17:43, Adrian Klaver ha scritto: > On 12/20/24 07:02, Enrico Schenone wrote: >> Hi, Adrian. >> Today I have collected a tcpdump at client side with communications >> between application server and db server while the issue was >> occurring one time per second on another program. >> I send you two files. >> The first one is a zipped tarball (.tgz) containing a text >> representation of the tcpdump starting at point where it reports the >> declaration of the failing cursor ("cu4" as you can see in the first >> line of the file) and subsequent fetch. Consider that the client >> application log detected the XX001 error on the first FETCH of the >> cursor at 2024-12-20 12:17:35.175 >> The second file (zipped tarball .tgz) is too big to be sent as >> attachment, so I provide a link where it can be downloaded. It is the >> fraction of tcpdump recorded during the program failure (occurred >> several times). It is in .pcap format so it is possible to open it >> with Wireshark or tcpdump -A -r >> Anyone interested can download it at >> https://cleislabs.cleistech.it/downloads/tcpdump_out009.pcap.tgz >> >> Consider that during the dump several different cursor was declared >> with the name "cu4", but the one failing is the one of the first line. >> Maybe an expert (I'm not so expert) can see if the disconnection is >> really made by the client and/or if the data returned by the server >> are really corrupted as per XX001 SQLSTATE. > > This is beyond me, someone else will need to chime in. > >> >> Best regards. >> Enrico >> >> Il 19/12/24 22:47, Adrian Klaver ha scritto: > > --------------U6e0jfNrlHMp8vCXohLTRIax Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit Hi, Adrian.
I'm arranging a test program with two nested cursors in two versions:
  1. 4Js Genero BDL language
  2. pure C with libpq language
I'll put both programs in stress execution into the production environment looking for some hours how they behaves.
Possible combinations are:
  1. no-one throws an error
  2. only the 4Js Genero version throws an error
  3. only the pure C version throws an error
  4. both versions throws the error
This stress test should address further investigations.
I'll keep you informed.

Regards.
Enrico Schenone

Il 20/12/24 17:43, Adrian Klaver ha scritto:
On 12/20/24 07:02, Enrico Schenone wrote:
Hi, Adrian.
Today I have collected a tcpdump at client side with communications between application server and db server while the issue was occurring one time per second on another program.
I send you two files.
The first one is a zipped tarball (.tgz) containing a text representation of the tcpdump starting at point where it reports the declaration of the failing cursor ("cu4" as you can see in the first line of the file) and subsequent fetch. Consider that the client application log detected the XX001 error on the first FETCH of the cursor at 2024-12-20 12:17:35.175
The second file (zipped tarball .tgz) is too big to be sent as attachment, so I provide a link where it can be downloaded. It is the fraction of tcpdump recorded during the program failure (occurred several times). It is in .pcap format so it is possible to open it with Wireshark or tcpdump -A -r
Anyone interested can download it at https://cleislabs.cleistech.it/downloads/tcpdump_out009.pcap.tgz

Consider that during the dump several different cursor was declared with the name "cu4", but the one failing is the one of the first line.
Maybe an expert (I'm not so expert) can see if the disconnection is really made by the client and/or if the data returned by the server are really corrupted as per XX001 SQLSTATE.

This is beyond me, someone else will need to chime in.


Best regards.
Enrico

Il 19/12/24 22:47, Adrian Klaver ha scritto:



--------------U6e0jfNrlHMp8vCXohLTRIax--