public inbox for [email protected]  
help / color / mirror / Atom feed
From: Enrico Schenone <[email protected]>
To: Adrian Klaver <[email protected]>
To: [email protected]
Cc: Massimo Catti <[email protected]>
Cc: Livio Pizzolo <[email protected]>
Subject: Re: Intermittent errors when fetching cursor rows on PostgreSQL 16
Date: Mon, 13 Jan 2025 09:45:53 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

Hello, Adrian.
As I said days ago, I have arranged a kind of stress test in production 
environment.
I wrote a program that loads a temporary table, loads 2049 rows into 
them from a baseline_table and finally declare two nested cursors.
The first cursor is on the temp table as parent while the second is on a 
lookup table as child.

The program logic is the transposition of one fragment of several 
production programs that was failing on cursors, and has to be intended 
as a POC only.

The program has been wrote in both pure C with libpq (see attached 
source program) and in 4Js Genero language.

Each program was executed by a shell script loop who ran 10 times the 
program each minute with 1 second sleep between each run (see attachment).

An automatic scheduler has continuously submitted 4 parallel tasks (two 
for C version and two for 4Js version programs).

The test was started the Dec, 29 2024 and it was kept in execution for 
many days directly in production environment.
In total, nearly a billion of child test cursors were executed while all 
other production tasks was running (normally 20 to 30 concurrent batch 
services on a pool of 100).

And Well, I'm quite confused: no error at all has been detected, not 
only on the test programs but in the whole production system. The error 
was completely disappeared.

Then I have stopped the four tasks of the stress test leaving all other 
services running for a week, and again no error at all.

No setup was changed nor servers was rebooted, nor infrastructure has 
been upgraded during the test period.

As a result, at the moment I'm not understood not only Why & Where the 
error was occurring, but also Why it is disappeared.

Anyone may feel free to give me his opinion.
For the moment I'll make no other test unless the error is knocking back 
to my door.

*Enrico Schenone*
Software Architect

*Cleis Tech s.r.l.* - www.gruppocleis.it
Sede di Genova, Via Paolo Emilio Bensa, 2 - 16124 Genova, ITALY
Tel: +39-0104071400   Fax: +39-0104073276
Mobile: +39-320 
7709352file:///home/enrico/Documenti/Work/Clienti/hh24/Incident/err-6372/C-test/C-testCursors.c
E-mail: [email protected]

<https://gruppocleis.it><https://ibm.biz/BdqAJh;

<https://ibm.biz/BdqAJh;
<https://ibm.biz/BdqAJh;
Il 26/12/24 00:20, Adrian Klaver ha scritto:
> On 12/24/24 14:23, Enrico Schenone wrote:
>> Hi, Adrian.
>> I'm arranging a test program with two nested cursors in two versions:
>>
>>  1. 4Js Genero BDL language
>>  2. pure C with libpq language
>>
>> I'll put both programs in stress execution into the production 
>> environment looking for some hours how they behaves.
>> Possible combinations are:
>>
>>  1. no-one throws an error
>>  2. only the 4Js Genero version throws an error
>>  3. only the pure C version throws an error
>>  4. both versions throws the error
>>
>> This stress test should address further investigations.
>> I'll keep you informed.
>
> Yes, would like to see how this turns out.
>
>>
>> Regards.
>> Enrico Schenone
>>
>
>


Attachments:

  [image/jpeg] LogoCleis (4.0K, 3-LogoCleis)
  download | view image

  [image/png] HTTPS://IBM.BIZ/BDQAJH (1018B, 4-HTTPS:%2F%2FIBM.BIZ%2FBDQAJH)
  download | view image

  [text/x-csrc] C-testCursors.c (5.2K, 5-C-testCursors.c)
  download | inline:
#include <stdio.h>
#include <stdlib.h>
#include </usr/include/postgresql/libpq-fe.h>
#include <sys/time.h>
#include <time.h>

// This function print the timestamp
void print_timestamp(FILE *stream) {
    struct timeval tv;
    gettimeofday(&tv, NULL);

    time_t now = tv.tv_sec;
    struct tm *timeinfo = localtime(&now);

    char buffer[64];
    strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", timeinfo);

    fprintf(stream, "[%s.%03ld] ", buffer, tv.tv_usec / 1000);
}

// Error handler
void handle_error(PGconn *conn, PGresult *res, const char *context) {
    print_timestamp(stderr);
    const char *sqlstate = res ? PQresultErrorField(res, PG_DIAG_SQLSTATE) : NULL;
    fprintf(stderr, "ERROR in %s: %s", context, PQerrorMessage(conn));
    if (sqlstate) {
        fprintf(stderr, " (SQLSTATE: %s)", sqlstate);
    }
    fprintf(stderr, "\n");

    if (res) PQclear(res);
    PQfinish(conn);
    exit(EXIT_FAILURE);
}

int main() {
    print_timestamp(stdout);
    printf("Start program C-testCursors\n");
    const char *conninfo = "host=192.168.20.94 dbname=hh24odds_prod user=my_user password=my_password";
    PGconn *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK) {
        print_timestamp(stderr);
        fprintf(stderr, "Connection to database failed: %s\n", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    print_timestamp(stdout);
    printf("Connected to database.\n");

    // Start the transaction
    PGresult *res = PQexec(conn, "BEGIN");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "BEGIN");
    }
    PQclear(res);

    // Creating temporary table
    res = PQexec(conn, "CREATE TEMP TABLE IF NOT EXISTS tt1 (like template_table)");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "CREATE TEMP TABLE");
    }
    PQclear(res);
    print_timestamp(stdout);
    printf("Temp table created.\n");

    // Caricamento della temp table
    res = PQexec(conn, "INSERT INTO tt1 SELECT * FROM test_baseline_table order by nelab, bkmkc_id, sport_id, pal_c_id, bt_id, oc_id, mp_id, spread");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "INSERT INTO TEMP TABLE");
    }
    PQclear(res);
    print_timestamp(stdout);
    printf("Temp table loaded.\n");

    // Declaring parent cursor on temporary table
    res = PQexec(conn, "DECLARE parent_cursor CURSOR FOR SELECT bt_id, evento_id FROM tt1");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "DECLARE parent_cursor");
    }
    PQclear(res);

    // Declaring child cursor out-of-parent loop
    // Just as POC
    res = PQexec(conn, "DECLARE child_cursor CURSOR FOR SELECT descri FROM lookup_table");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "DECLARE child_cursor");
    }
    PQclear(res);

    // Parent cursor loop
    while (1) {
        res = PQexec(conn, "FETCH NEXT FROM parent_cursor");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            handle_error(conn, res, "FETCH parent_cursor");
        }

        int nrows = PQntuples(res);
        if (nrows == 0) {
            PQclear(res);
            break;
        }

        for (int i = 0; i < nrows; i++) {
            char *parent_id = PQgetvalue(res, i, 0);
            char *parent_name = PQgetvalue(res, i, 1);

            // Child cursor whith parametric query
            char query[256];
            snprintf(query, sizeof(query), "CLOSE child_cursor; DECLARE child_cursor CURSOR FOR SELECT descri FROM lookup_table WHERE id = %s", parent_id);

            PGresult *child_res = PQexec(conn, query);
            if (PQresultStatus(child_res) != PGRES_COMMAND_OK) {
                handle_error(conn, child_res, "OPEN child_cursor");
            }
            PQclear(child_res);

            // Child cursor loop
            while (1) {
                child_res = PQexec(conn, "FETCH NEXT FROM child_cursor");
                if (PQresultStatus(child_res) != PGRES_TUPLES_OK) {
                    handle_error(conn, child_res, "FETCH child_cursor");
                }

                int child_nrows = PQntuples(child_res);
                if (child_nrows == 0) {
                    PQclear(child_res);
                    break;
                }

                for (int j = 0; j < child_nrows; j++) {
                    char *detail = PQgetvalue(child_res, j, 0);
                    print_timestamp(stdout);
                    printf("Child row: %s\n", detail);
                }

                PQclear(child_res);
            }
        }

        PQclear(res);
    }

    print_timestamp(stdout);
    printf("End of cursors process.\n");
    // Chiudi il cursore padre
    res = PQexec(conn, "CLOSE parent_cursor");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "CLOSE parent_cursor");
    }
    PQclear(res);

    // Note: the last child cursor was intentionally leaved open

    // Closing transaction
    res = PQexec(conn, "END");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        handle_error(conn, res, "END");
    }
    PQclear(res);

    PQfinish(conn);
    print_timestamp(stdout);
    printf("End program C-testCursors2\n");
    return EXIT_SUCCESS;
}


  [application/x-shellscript] C-testCursors.sh (279B, 6-C-testCursors.sh)
  download

view thread (15+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Intermittent errors when fetching cursor rows on PostgreSQL 16
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox