Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v6mYz-009eA2-6m for pgsql-general@arkaria.postgresql.org; Thu, 09 Oct 2025 09:03:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v6mYw-002Xcu-VW for pgsql-general@arkaria.postgresql.org; Thu, 09 Oct 2025 09:03:51 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v6mYw-002Xcc-Eh for pgsql-general@lists.postgresql.org; Thu, 09 Oct 2025 09:03:51 +0000 Received: from p-east2-cluster2-host8-snip4-10.eps.apple.com ([57.103.78.63] helo=outbound.st.icloud.com) by makus.postgresql.org with smtp (Exim 4.96) (envelope-from ) id 1v6mYt-000pE5-2K for pgsql-general@lists.postgresql.org; Thu, 09 Oct 2025 09:03:50 +0000 Received: from outbound.st.icloud.com (unknown [127.0.0.2]) by p00-icloudmta-asmtp-us-east-1a-10-percent-0 (Postfix) with ESMTPS id 6F4CF1800576 for ; Thu, 9 Oct 2025 09:03:46 +0000 (UTC) Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nunnari.ch; s=sig1; bh=qpZSlK6wFd1gboxycsQ0u7BOSVE4LygdBHvqKlSjUlQ=; h=To:From:Subject:Date:Message-id:Content-Type:MIME-Version:x-icloud-hme; b=RKZ+0XL/LR+jBcXEUNIiQhEwWjfyTOzS7DRDWQrCENgo2vp4IKl/O5Ph60QEwcIOtGWsHnEzqm+sB0wZXBfl9oKQhUZ/mK7kZsh/rFcMgX1UtPRYn4ZfVhoBBVzxfjYlHdthfMWMEPRPzz8oNsSGTyfnwvPWVnCvWSuXHpyKhVcWfVo+pClqMWAoR6Bkxyjla9X6nTGuLHd0hb9vhGkFCwSw9yjc+FbCqwjrKklLw6QaaUANZINcn2wFAzzi5xyMKcKHFd/xofmaNmpqBIKpLcneHjqP9A8ULJBFrfJ42PGyhucRL+frGsESuFORimtaXKCqnilns8k34iZ40JYYGw== mail-alias-created-date: 1516299501000 Received: from p00-mailws2-6d7c9cbdd6-5n64k (unknown [10.76.196.12]) by p00-icloudmta-asmtp-us-east-1a-10-percent-0 (Postfix) with ESMTPSA id 594341800668 for ; Thu, 9 Oct 2025 09:03:31 +0000 (UTC) To: pgsql-general@lists.postgresql.org From: Roberto Nunnari Subject: High latency and profiling Date: Thu, 9 Oct 2025 09:03:31 +0000 (GMT) X-Mailer: iCloud MailClient2536Hotfix37 MailServer2502B13.7c5914c7dad3 Message-id: Content-Type: multipart/alternative; boundary=Apple-Webmail-42--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: v9ESSt3PLzK2iuoY34LgsXU1uo9Qpuyn X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUxMDA5MDA1MSBTYWx0ZWRfX53IAS6gMSJzV P0xNzDKLP8jsciUuu/RwjZU/6+NO5oZhiAAzp877H8x2Xjr2Z7KNPzu1l1I8V0+GsmEruGo3KuT 20EIDLbPS8kaVL6YdTSF3pAl5rL3ZHzmsfUPxq0LrZiv1OmVEhxvvXdlh0h7wxmBHfePM4IIPE1 2Va1n/YC8izW42Exh/2JHB13M/XNoCz5hQLPN+brYHjG+wri6cksN/5zY4qGpqehw2mwTWA69Ym U5LxYbt/0xO1QB6v4m0qtCSkgBvF06DWS9olV9gToDnGL64r4Cc59oxO3lCS84GAfxDvAGuek= X-Proofpoint-GUID: v9ESSt3PLzK2iuoY34LgsXU1uo9Qpuyn X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1117,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-10-09_03,2025-10-06_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 spamscore=0 suspectscore=0 mlxscore=0 clxscore=1030 phishscore=0 mlxlogscore=502 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.22.0-2506270000 definitions=main-2510090051 X-JNJ: AAAAAAABQWsSSsYYAkzRbB+KsEhm/GNxCHdID94pklTXOxITKV9YeZwa/53rED3ZRy3kKeBMo+ZMMWXkxkXa86tTT20h+CZNvqQImx/C7LjrjZiF9LcbytqnvfAsgXeUZCwI90NJJjoQUy0JchnlFA3katAoCcTxr2okT2tqA+iE32roaLAgLClsl2Ca4cg8IrFTVZAO7xWW3zApRnutMdQY7l2tzzbJxt7k5yiHjkl/MR3xjgqKAEVd6HFpE8gAc0ByRt/SA0KXGWg1QduS/GySMyLDUK32er3XdsLx0a/baZZd9gOVFmWl2LwKYMfuASVFHQq32GDiHyDmLRLVGei59pWrCL28H2dexwYuRHxclnBc/TLdor5qwULy8ad/B4JGKb5kMsk3o2DIRRS85kZziRMTurah2h8MeD6I6nIRrDGMu/HaU0765T3PeKu6BSQQ0JhqSL9csX4iCoV3rUkWQzJlG8hz4uRLB5rejlrBpmreO9sFhbuuThW75EER5aZnKf+ksf4MaT1GRjVdalGvx/ljAQIPSt3YE6pa20EOT8g5K10XRxR/kmqnHWKg4xjFWuR+FEg45TolbA== List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Webmail-42--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8; format=flowed Hello. I was notified about high latency on one of our PostgreSQL clusters= . This latency is observed from the application and has been increasing. T= here is no indication about the exact database as the application uses sev= eral DBs, but the largest and most used is 'mydb'. So, a couple of months = ago I installed pg_profile and configured it as follows: shared_preload_li= braries =3D 'pg_stat_statements,dblink,pg_stat_kcache,pg_wait_sampling' pg= _profile.max_sample_age =3D 90 \c postgres CREATE SCHEMA extensions; CREAT= E EXTENSION dblink SCHEMA extensions; CREATE EXTENSION pg_profile SCHEMA e= xtensions; \c mydb CREATE SCHEMA extensions; CREATE EXTENSION dblink SCHEM= A extensions; CREATE EXTENSION pg_stat_statements SCHEMA extensions; CREAT= E EXTENSION pg_stat_kcache schema extensions; CREATE EXTENSION pg_wait_sam= pling schema extensions; CREATE EXTENSION pg_profile SCHEMA extensions; se= lect extensions.set_server_size_sampling('local','23:00+01',interval '2 ho= ur',interval '8 hour'); CREATE USER collector_user with password ''; GRANT= pg_read_all_stats TO collector_user; grant connect on database mydb to co= llector_user; ALTER ROLE collector_user SET search_path =3D extensions,pub= lic; The postgres user had the following crontab: 13,43 * * * * psql -d my= db -c 'SELECT extensions.take_sample()' > /dev/null 2>&1 I made sure autov= acuum works correctly and after that, just to make sure, I run vacuum anal= yze on all DBs. This has not lead to any notable improvement in the latenc= y as seen by the application. I access pg_profile data from grafana, but I= cannot see anything that looks wrong. Overall Hit Ratio is always between= 99.3% and 100%. The only strange thing I see in grafana, are the followin= g: Overall statement times sais "No data" and 'db query error: pq: column = "blk_read_time" does not exist'. Also the graph "Statement time" says "No = data" and "db query error: pq: column sst.blk_read_time does not exist". W= ait event types (statements) graph: N/A was quite stable at about 0.200 s/= s until 2nd september and then grow regularly and sharply up 4 s/s until 2= 2nd september when it dropped back to 0.7 s/s and is stable at that value = till now. Developers report that their software is already optimized and d= atabase size is not a problem. They suggest the issue might be inside Post= greSQL itself. Could you help me understand what is causing this? In parti= cular, what does the N/A data in the "Wait event types (statements)" graph= represent? Thank you and best regards. Robi --Apple-Webmail-42--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e Content-Type: multipart/related; type="text/html"; boundary=Apple-Webmail-86--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e --Apple-Webmail-86--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8;
Hello.

I was notified about high latenc= y on one of our PostgreSQL clusters.
This latency is observe= d from the application and has been increasing. There is no indication abo= ut the exact database as the application uses several DBs, but the largest= and most used is 'mydb'.
So, a couple of months ago I insta= lled pg_profile and configured it as follows:

shared_preload_libraries =3D 'pg_stat_statements,dblink,pg_stat_kcache,pg= _wait_sampling'
pg_profile.max_sample_age =3D 90

\c postgres
CREATE SCHEMA extensions;
<= /div>
CREATE EXTENSION dblink SCHEMA extensions;
CREATE = EXTENSION pg_profile SCHEMA extensions;

\c my= db
CREATE SCHEMA extensions;
CREATE EXTENSION = dblink SCHEMA extensions;
CREATE EXTENSION pg_stat_statement= s SCHEMA extensions;
CREATE EXTENSION pg_stat_kcache schema = extensions;
CREATE EXTENSION pg_wait_sampling schema extensi= ons;
CREATE EXTENSION pg_profile SCHEMA extensions;
select extensions.set_server_size_sampling('local','23:00+01',interv= al '2 hour',interval '8 hour');

CREATE USER c= ollector_user with password '';
GRANT pg_read_all_stats TO c= ollector_user;
grant connect on database mydb to collector_u= ser;
ALTER ROLE collector_user SET search_path =3D extension= s,public;

The postgres user had the following= crontab:
13,43 * * * * psql -d mydb -c 'SELECT extensions.t= ake_sample()' > /dev/null 2>&1

I ma= de sure autovacuum works correctly and after that, just to make sure, I ru= n vacuum analyze on all DBs. This has not lead to any notable improvement = in the latency as seen by the application.

I = access pg_profile data from grafana, but I cannot see anything that looks = wrong.
Overall Hit Ratio is always between 99.3% and 100%.
The only strange thing I see in grafana, are the following:

Overall statement times sais "No data" and 'db= query error: pq: column "blk_read_time" does not exist'. Also the graph "= Statement time" says "No data" and "db query error: pq: column sst.blk_rea= d_time does not exist".
Wait event types (statements) graph:= N/A was quite stable at about 0.200 s/s until 2nd september and then grow= regularly and sharply up 4 s/s until 22nd september when it dropped back = to 0.7 s/s and is stable at that value till now.
Developers = report that their software is already optimized and database size is not a= problem. They suggest the issue might be inside PostgreSQL itself.

Could you help me understand what is causing this? = In particular, what does the N/A data in the "Wait event types (statements= )" graph represent?

Thank you and best regard= s.
Robi
--Apple-Webmail-86--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e-- --Apple-Webmail-42--e8f9f4f7-70a0-4b80-a6c3-7b48d7d9112e--