public inbox for [email protected]  
help / color / mirror / Atom feed
Why plpython functions increase transaction counter much more then plpgsql functions?
4+ messages / 3 participants
[nested] [flat]

* Why plpython functions increase transaction counter much more then plpgsql functions?
@ 2024-11-08 07:58 Michał Albrycht <[email protected]>
  2024-11-08 14:03 ` Re: Why plpython functions increase transaction counter much more then plpgsql functions? Ron Johnson <[email protected]>
  2024-11-08 14:39 ` Re: Why plpython functions increase transaction counter much more then plpgsql functions? Tom Lane <[email protected]>
  0 siblings, 2 replies; 4+ messages in thread

From: Michał Albrycht @ 2024-11-08 07:58 UTC (permalink / raw)
  To: [email protected]

I'm trying to understand why plpython function has much bigger impact on
transaction counter in Postgres than plpgSQL function. Below is example
which uses 2 functions:

Version with plpgSQL (each part done in separate transactions one after
another)
 - check txid_current
 - SQL query which calls the `f1_plpgsql` function  which calls the
`insert_row_to_db` function 100 times
 - check txid_current

 Then we compare txid_currnent values and difference is 2 which means that
whole sql with 100 calls to `f1_plpgsql` and `insert_row_to_db` increased
transaction counter only by 1.

Here is the code:
```
CREATE TABLE insert_rows_table(
    i BIGINT
);

CREATE OR REPLACE FUNCTION insert_row_to_db(i BIGINT)
RETURNS VOID
AS $$
BEGIN
    INSERT INTO insert_rows_table SELECT i;
END
$$ LANGUAGE plpgsql SECURITY DEFINER VOLATILE PARALLEL UNSAFE;


CREATE OR REPLACE FUNCTION f1_plpgsql(i BIGINT)
  RETURNS bigint
AS $$
BEGIN
    PERFORM insert_row_to_db(i);
    RETURN i;
END
$$ LANGUAGE plpgsql SECURITY DEFINER VOLATILE PARALLEL UNSAFE;


SELECT txid_current();
SELECT f1_plpgsql(i::BIGINT) FROM generate_series(1,100) as i;
SELECT txid_current();
```

Example output:

txid_current
500

f1_plpgsql
1
2
...
99
100

txid_current
502


Here is a code reproduction on db-fiddle:
https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/15135

Now let's replace `f1_plpgsql` with function written in plpython:

```
CREATE OR REPLACE FUNCTION f1_plpython(i BIGINT)
  RETURNS bigint
AS $$
    rows = plpy.execute("SELECT insert_row_to_db(" + str(i) + ")")
    return i
$$ LANGUAGE plpython3u SECURITY DEFINER VOLATILE PARALLEL UNSAFE;
```

I get:

txid_current
500

f1_plpgsql
1
2
...
99
100

txid_current
602


This proves that the plpython function affects the transaction counter much
more. Does anyone know why? Is there anything I can do about it?

What's interesting it happens only if the function called by plpyhon makes
changes to DB. When I replace `INSERT INTO insert_rows_table SELECT i;`
with `SELECT i` both plpython and plpgsql functions behave the same.
 Regards,

Michał Albrycht


^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Why plpython functions increase transaction counter much more then plpgsql functions?
  2024-11-08 07:58 Why plpython functions increase transaction counter much more then plpgsql functions? Michał Albrycht <[email protected]>
@ 2024-11-08 14:03 ` Ron Johnson <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Ron Johnson @ 2024-11-08 14:03 UTC (permalink / raw)
  To: pgsql-generallists.postgresql.org <[email protected]>

Because the plpython function is executing dynamic SQL?

On Fri, Nov 8, 2024 at 2:59 AM Michał Albrycht <[email protected]>
wrote:

> I'm trying to understand why plpython function has much bigger impact on
> transaction counter in Postgres than plpgSQL function. Below is example
> which uses 2 functions:
>
> Version with plpgSQL (each part done in separate transactions one after
> another)
>  - check txid_current
>  - SQL query which calls the `f1_plpgsql` function  which calls the
> `insert_row_to_db` function 100 times
>  - check txid_current
>
>  Then we compare txid_currnent values and difference is 2 which means that
> whole sql with 100 calls to `f1_plpgsql` and `insert_row_to_db` increased
> transaction counter only by 1.
>
> Here is the code:
> ```
> CREATE TABLE insert_rows_table(
>     i BIGINT
> );
>
> CREATE OR REPLACE FUNCTION insert_row_to_db(i BIGINT)
> RETURNS VOID
> AS $$
> BEGIN
>     INSERT INTO insert_rows_table SELECT i;
> END
> $$ LANGUAGE plpgsql SECURITY DEFINER VOLATILE PARALLEL UNSAFE;
>
>
> CREATE OR REPLACE FUNCTION f1_plpgsql(i BIGINT)
>   RETURNS bigint
> AS $$
> BEGIN
>     PERFORM insert_row_to_db(i);
>     RETURN i;
> END
> $$ LANGUAGE plpgsql SECURITY DEFINER VOLATILE PARALLEL UNSAFE;
>
>
> SELECT txid_current();
> SELECT f1_plpgsql(i::BIGINT) FROM generate_series(1,100) as i;
> SELECT txid_current();
> ```
>
> Example output:
>
> txid_current
> 500
>
> f1_plpgsql
> 1
> 2
> ...
> 99
> 100
>
> txid_current
> 502
>
>
> Here is a code reproduction on db-fiddle:
> https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/15135
>
> Now let's replace `f1_plpgsql` with function written in plpython:
>
> ```
> CREATE OR REPLACE FUNCTION f1_plpython(i BIGINT)
>   RETURNS bigint
> AS $$
>     rows = plpy.execute("SELECT insert_row_to_db(" + str(i) + ")")
>     return i
> $$ LANGUAGE plpython3u SECURITY DEFINER VOLATILE PARALLEL UNSAFE;
> ```
>
> I get:
>
> txid_current
> 500
>
> f1_plpgsql
> 1
> 2
> ...
> 99
> 100
>
> txid_current
> 602
>
>
> This proves that the plpython function affects the transaction counter
> much more. Does anyone know why? Is there anything I can do about it?
>
> What's interesting it happens only if the function called by plpyhon makes
> changes to DB. When I replace `INSERT INTO insert_rows_table SELECT i;`
> with `SELECT i` both plpython and plpgsql functions behave the same.
>  Regards,
>
> Michał Albrycht
>
>


-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!


^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Why plpython functions increase transaction counter much more then plpgsql functions?
  2024-11-08 07:58 Why plpython functions increase transaction counter much more then plpgsql functions? Michał Albrycht <[email protected]>
@ 2024-11-08 14:39 ` Tom Lane <[email protected]>
  2024-11-08 17:44   ` Re: Why plpython functions increase transaction counter much more then plpgsql functions? Michał Albrycht <[email protected]>
  1 sibling, 1 reply; 4+ messages in thread

From: Tom Lane @ 2024-11-08 14:39 UTC (permalink / raw)
  To: Michał Albrycht <[email protected]>; +Cc: [email protected]

=?UTF-8?Q?Micha=C5=82_Albrycht?= <[email protected]> writes:
> This proves that the plpython function affects the transaction counter much
> more. Does anyone know why?

A quick look at PLy_spi_execute_query shows that it runs each command
in a subtransaction.  It pretty much has to, because the coding rules
for a Python method don't permit it to just longjmp out of the Python
interpreter, so it has to set up a subtransaction so it can catch any
error.  In this example, each subtransaction will consume an XID.

The plpgsql example is different because it doesn't trap errors,
hence no subtransaction needed, and all the rows will get inserted
under the XID of the outer command's main transaction.  If you were
to wrap the insert_row_to_db call in BEGIN ... EXCEPTION then it'd
consume the same number of XIDs as plpython, for the same reason.

> Is there anything I can do about it?

Batch the DB updates, perhaps?

> What's interesting it happens only if the function called by plpyhon makes
> changes to DB.

Totally unsurprising.  XIDs are acquired only when the current
transaction or subtransaction first needs to change the DB.

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Why plpython functions increase transaction counter much more then plpgsql functions?
  2024-11-08 07:58 Why plpython functions increase transaction counter much more then plpgsql functions? Michał Albrycht <[email protected]>
  2024-11-08 14:39 ` Re: Why plpython functions increase transaction counter much more then plpgsql functions? Tom Lane <[email protected]>
@ 2024-11-08 17:44   ` Michał Albrycht <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Michał Albrycht @ 2024-11-08 17:44 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: [email protected]

>
> A quick look at PLy_spi_execute_query shows that it runs each command
> in a subtransaction.  It pretty much has to, because the coding rules
> for a Python method don't permit it to just longjmp out of the Python
> interpreter, so it has to set up a subtransaction so it can catch any
> error.  In this example, each subtransaction will consume an XID.
>
> That makes sense. Thank you very much.
Regards,

Michał Albrycht


^ permalink  raw  reply  [nested|flat] 4+ messages in thread


end of thread, other threads:[~2024-11-08 17:44 UTC | newest]

Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-11-08 07:58 Why plpython functions increase transaction counter much more then plpgsql functions? Michał Albrycht <[email protected]>
2024-11-08 14:03 ` Ron Johnson <[email protected]>
2024-11-08 14:39 ` Tom Lane <[email protected]>
2024-11-08 17:44   ` Michał Albrycht <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox