Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s1lOO-00Egb6-RA for pgsql-general@arkaria.postgresql.org; Tue, 30 Apr 2024 11:11:24 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s1lOL-005umU-6z for pgsql-general@arkaria.postgresql.org; Tue, 30 Apr 2024 11:11:22 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s1lOK-005umL-H8 for pgsql-general@lists.postgresql.org; Tue, 30 Apr 2024 11:11:21 +0000 Received: from mail-lj1-x22c.google.com ([2a00:1450:4864:20::22c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1s1lOH-000lRC-Vn for pgsql-general@lists.postgresql.org; Tue, 30 Apr 2024 11:11:19 +0000 Received: by mail-lj1-x22c.google.com with SMTP id 38308e7fff4ca-2dd615b6c44so59603361fa.0 for ; Tue, 30 Apr 2024 04:11:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714475476; x=1715080276; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=VuTEBEgJuB6Lg4YIcsQx+d3DWXG45Uuopeh4OMWG34E=; b=ifNGlrAmnmleniVFp3dEjV14FieIuLTq9GI3Bm8EgplwD8P9+rpoDANWYgP4biZc8p NC2rPE5dTHnlUcqso78jfE6eIir7qPPN5vHTXPCGQt/86MAhM/fwpFsL7uy9G1lQHK/F kdJCOI2Z2hIuLgUQoAvr6F/yRi1MYnFxpJlRNi0AVKQNKnwazHjCyVt/r3MGdg0XxBSo Y9Uk2RsizneM+IxWE83w7X5kn7BvACkGUAUMLafZlGh9FSiOpImeed3YkAbuLX0uQzPM djM6cRdcP7yCmWhgQRdZK1EWHxo6YsBswSgw5t2FtXgazxZ+frNe/sK9x9H6Uxn6hZqJ unDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714475476; x=1715080276; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=VuTEBEgJuB6Lg4YIcsQx+d3DWXG45Uuopeh4OMWG34E=; b=HlWH141fZOwymrHK9xir4ECr2eOJp0kSbvimp3EwCOsBub8/SiBFn1waU60D0dKNy1 P16yHIOEvYjCrkjiO027vld+LA+iffKbdPL2xLsJkeqQb4oLIWUztboNzIH71y0XmdnU /WSwzlaWD/PNu+xBrJm/Oe4B/0Ai0AmMlBiK9G3777vOMzq7hQ+l7F2bQRf6gRW/fiBH 8T9h7kKXCEUAaMXgQvLSOSTXuuEIk8xKdMLx8du9pyj8qQQoZCgP4BjU2ggOGZwctGHm xWlcHDTXshVte0OLHSxAhod10r0iQPpsxnKzL148RMCILQS+jXRwwveLL3mU2UiD26HY ILdQ== X-Gm-Message-State: AOJu0Yw0zWHz+ALT8mGkWcddPFwBthJfwvNXl1QRU5TV+M+E1ey7hqw9 VLyt8aU9DjA1IKiQeT8UXLBwls9Ll+G6ZNZanw2Zg0vB1jBhAQ6hQPabcSe3z2d8wGlK//bekr+ Cm+y0dOM6bJduxAG8au1f6AIj8qw= X-Google-Smtp-Source: AGHT+IGlrBYJZjdGg99N9a9EC944kIqGhh8eDBy4JSBccwUywRFcG9UppmEqc6V8RxBvXFvrpg6fINi5P3nUWcuWnXs= X-Received: by 2002:a19:915a:0:b0:51b:64c3:8cfa with SMTP id y26-20020a19915a000000b0051b64c38cfamr7927817lfj.13.1714475475686; Tue, 30 Apr 2024 04:11:15 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Dimitris Zenios Date: Tue, 30 Apr 2024 14:10:39 +0300 Message-ID: Subject: Re: pg_notify contention To: =?UTF-8?Q?Torsten_F=C3=B6rtsch?= Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000b45d4106174e6d61" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000b45d4106174e6d61 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Torsten and thanks for your quick response. My problem with WAL stream and pg_logical_emit_message is that this will pollute the wal files resulting in bigger files in which btw are also archived using archive-command for disaster recovery cases. I am thinking of an approach that instead of notifying, I save into an unlogged table and then have a queue that processes the messages sequentially.This will allow the original transaction to be as fast as possible. I still haven't figured out the correct way to do the fifo approach on the unlogged tables. What is your opinion on my proposed approach? Thanks On Tue, Apr 30, 2024 at 1:57=E2=80=AFPM Torsten F=C3=B6rtsch wrote: > As part of the commit operation, Postgres inserts the notification into a > queue. Naturally, that insert is guarded by a lock and that lock is > released only at the very end of the commit operation. This effect gets > much worse if you also configure synchronous replication because commit > finishes only when the desired sync level has been achieved. > > An alternative solution could be to receive notification through the WAL > stream. Use pg_revclogical to connect to a replication slot configured wi= th > wal2json. Then you can send notifications with pg_logical_emit_message() > for instance which even allows you to send notifications from aborted > transactions. > > Setup: > SELECT * FROM pg_create_logical_replication_slot('w2j', 'wal2json'); > > Here is a sender: > CREATE SCHEMA notify; > CREATE OR REPLACE FUNCTION notify.send( > transactional BOOLEAN, > channel TEXT, > message TEXT > ) RETURNS PG_LSN AS $$ > SELECT pg_logical_emit_message( > transactional, > 'NOTIFY_'||channel, > jsonb_build_object( > 'txn_start', now()::TEXT, > 'pid', pg_backend_pid(), > 'message', message > )::TEXT > ); > $$ LANGUAGE sql VOLATILE; > > And something like this as a receiver / distributor: > sudo -iu postgres /usr/lib/postgresql/$VERSION/bin/pg_recvlogical \ > -d 'port=3D5432 dbname=3Dyour_database' \ > -S w2j \ > -f - \ > -F 0 \ > --start \ > -o include-xids=3D1 \ > -o include-timestamp=3D1 \ > -o add-tables=3Dnotify.\* \ > -o format-version=3D2 | > jq 'select(.action=3D=3D"M" and (.prefix | test("^NOTIFY_"))) | > . as $x | > .content | > fromjson + { > "channel": ($x.prefix | .[7:]), > "xid": $x.xid, > "txn_commit": $x.timestamp, > "transactional": $x.transactional > }' > > Note, this is not a 100% replacement. The semantics are slightly > different. You also need a replication slot which comes with its own > maintenance cost. But maybe it's worth a try. > > Have fun, > Torsten > > On Tue, Apr 30, 2024 at 12:05=E2=80=AFPM Dimitris Zenios < > dimitris.zenios@gmail.com> wrote: > >> Hi, >> >> I am measuring a very simple case of pg_notify in an after update >> trigger. The trigger is the following: >> >> CREATE >> OR REPLACE FUNCTION audit_event() RETURNS TRIGGER AS >> $$ >> BEGIN >> PERFORM pg_notify('user', 'hello world'); >> RETURN NULL; >> END; >> $$ >> LANGUAGE plpgsql; >> >> and configured on the table with the following sql >> >> CREATE CONSTRAINT TRIGGER trigger_test AFTER UPDATE ON pgbench_accounts >> FOR EACH ROW EXECUTE PROCEDURE audit_event() >> >> I am running two benchmarks. One with the pg_notify on the trigger >> enabled and one with the notify commented out. >> >> The command is the following: >> pgbench -f /tmp/update_bench.sql -c 10 -j 10 -t 100 benchdb >> >> And the content of the update_bench.sql are the following >> >> \set aid random(1, 100000 * :scale) >> \set delta random(-5000, 5000) >> BEGIN; >> UPDATE pgbench_accounts SET abalance =3D abalance + :delta WHERE aid =3D= :aid; >> END; >> >> Results are >> >> - With pg_notify disabled >> >> pgbench (16.1) >> starting vacuum...end. >> transaction type: /tmp/update_bench.sql >> scaling factor: 1 >> query mode: simple >> number of clients: 10 >> number of threads: 10 >> maximum number of tries: 1 >> number of transactions per client: 100 >> number of transactions actually processed: 1000/1000 >> number of failed transactions: 0 (0.000%) >> latency average =3D 11.744 ms >> initial connection time =3D 15.616 ms >> tps =3D 851.531991 (without initial connection time) >> >> - With pg_notify enabled >> >> pgbench (16.1) >> starting vacuum...end. >> transaction type: /tmp/update_bench.sql >> scaling factor: 1 >> query mode: simple >> number of clients: 10 >> number of threads: 10 >> maximum number of tries: 1 >> number of transactions per client: 100 >> number of transactions actually processed: 1000/1000 >> number of failed transactions: 0 (0.000%) >> latency average =3D 56.927 ms >> initial connection time =3D 11.182 ms >> tps =3D 175.664989 (without initial connection time) >> >> There is a huge drop in TPS from 851 to 175. >> >> I also noticed that if I run the test with a single connection -c 1 then >> the results are nearly identical which makes me assume that this is a >> contention that occurs between multiple connections. >> >> Thanks >> >> --000000000000b45d4106174e6d61 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Torsten and thanks for your=C2=A0quick response.
My problem with WAL stream and pg_logical_emit_message is that= this will pollute the wal files resulting in bigger files in which btw are= also archived using archive-command for disaster recovery cases.

I am thinking of an approach that instead of notifying, I save into an= unlogged table and then have a queue that processes the messages sequentia= lly.This will allow the original transaction to be as fast as possible.
= I still haven't figured out the correct way to do the fifo approach on = the unlogged tables.=C2=A0

What is your opinion on= my proposed approach?

Thanks

On Tue, Apr 30,= 2024 at 1:57=E2=80=AFPM Torsten F=C3=B6rtsch <tfoertsch123@gmail.com> wrote:
As part of the comm= it operation, Postgres inserts the notification into a queue. Naturally, th= at insert is guarded by a lock and that lock is released only at the very e= nd of the commit operation. This effect gets much worse if you also configu= re synchronous replication because commit finishes only when the desired sy= nc level has been achieved.

An alternative solution coul= d be to receive notification through the WAL stream. Use pg_revclogical to = connect to a replication slot configured with wal2json. Then you can send n= otifications with=C2=A0pg_logical_emit_message() for instance which even al= lows you to send notifications from aborted transactions.

Setup:
SELECT * FROM pg_create_logical_replication_slot(&= #39;w2j', 'wal2json');

Here is a s= ender:
CREATE SCHEMA notify;
CREATE OR REPLACE FUNCTION notify= .send(
=C2=A0 =C2=A0 transactional BOOLEAN,
=C2=A0 =C2=A0 channel TEX= T,
=C2=A0 =C2=A0 message TEXT
) RETURNS PG_LSN AS $$
=C2=A0 =C2=A0= SELECT pg_logical_emit_message(
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0transactional,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0'NOTIFY_'||channel,
=C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0jsonb_build_object(
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0'txn_start', now()::TE= XT,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0'pid', pg_backend_pid(),
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0'message', message
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0)::TEXT
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0);
$$ LANGUAGE sql VOLATILE;

<= /div>
And something like this as a receiver / distributor:
su= do -iu postgres /usr/lib/postgresql/$VERSION/bin/pg_recvlogical \
=C2=A0= =C2=A0 =C2=A0-d 'port=3D5432 dbname=3Dyour_database' \
=C2=A0 = =C2=A0 =C2=A0-S w2j \
=C2=A0 =C2=A0 =C2=A0-f - \
=C2=A0 =C2=A0 =C2=A0= -F 0 \
=C2=A0 =C2=A0 =C2=A0--start \
=C2=A0 =C2=A0 =C2=A0-o include-x= ids=3D1 \
=C2=A0 =C2=A0 =C2=A0-o include-timestamp=3D1 \
=C2=A0 =C2= =A0 =C2=A0-o add-tables=3Dnotify.\* \
=C2=A0 =C2=A0 =C2=A0-o format-vers= ion=3D2 |
jq 'select(.action=3D=3D"M" and (.prefix | test(= "^NOTIFY_"))) |
=C2=A0 =C2=A0 . as $x |
=C2=A0 =C2=A0 .cont= ent |
=C2=A0 =C2=A0 fromjson + {
=C2=A0 =C2=A0 =C2=A0 =C2=A0 "ch= annel": ($x.prefix | .[7:]),
=C2=A0 =C2=A0 =C2=A0 =C2=A0 "xid&= quot;: $x.xid,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 "txn_commit": $x.ti= mestamp,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 "transactional": $x.trans= actional
=C2=A0 =C2=A0 }'

Note, this is= not a 100% replacement. The semantics are slightly different. You also nee= d a replication slot which comes with its own maintenance cost. But maybe i= t's worth a try.

Have fun,
Torsten

On Tue, Apr 30, 2024 at 12:05=E2=80=AFPM Dimitris Zenios <dimitris.zenios@gmail.= com> wrote:
Hi,=C2=A0

I am measuring a very simp= le case of pg_notify in an after update trigger. The trigger is the followi= ng:

CREATE
=C2=A0 =C2=A0 OR REPLACE FUNCTION audit_event() RETURN= S TRIGGER AS
$$
BEGIN
=C2=A0 =C2=A0 PERFORM pg_notify('user= 9;, 'hello world');
=C2=A0 =C2=A0 RETURN NULL;
END;
$$
= =C2=A0 =C2=A0 LANGUAGE plpgsql;

and configured on = the table with the following sql

CREATE CONSTRAINT= TRIGGER trigger_test AFTER UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE= PROCEDURE audit_event()

I am running two benchmarks. One with the p= g_notify on the trigger enabled and one with the notify commented out.
<= br>The command is the following:
pgbench -f /tmp/update_bench.sql =C2= =A0-c 10 -j 10 -t 100 benchdb

And the content of the update_bench.sq= l are the following

\set aid random(1, 100000 * :s= cale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_account= s SET abalance =3D abalance + :delta WHERE aid =3D :aid;
END;
Results are

- With pg_notify disabled=

pgbench (16.1)
starting vacuum...end.
trans= action type: /tmp/update_bench.sql
scaling factor: 1
query mode: simp= le
number of clients: 10
number of threads: 10
maximum number of t= ries: 1
number of transactions per client: 100
number of transactions= actually processed: 1000/1000
number of failed transactions: 0 (0.000%)=
latency average =3D 11.744 ms
initial connection time =3D 15.616 ms<= br>tps =3D 851.531991 (without initial connection time)

<= /div>
- With pg_notify enabled

pgbench (16.1)<= br>starting vacuum...end.
transaction type: /tmp/update_bench.sql
sca= ling factor: 1
query mode: simple
number of clients: 10
number of = threads: 10
maximum number of tries: 1
number of transactions per cli= ent: 100
number of transactions actually processed: 1000/1000
number = of failed transactions: 0 (0.000%)
latency average =3D 56.927 ms
init= ial connection time =3D 11.182 ms
tps =3D 175.664989 (without initial co= nnection time)

There is a huge drop in TPS fro= m 851 to 175.

I also noticed that if I run the tes= t with a single connection -c 1 then the results are nearly identical which= makes me assume that this is a contention that occurs between multiple con= nections.

Thanks

<= /div>
--000000000000b45d4106174e6d61--