Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tgNyB-00AEG4-RU for pgsql-general@arkaria.postgresql.org; Fri, 07 Feb 2025 13:00:32 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tgNy8-001FaO-I2 for pgsql-general@arkaria.postgresql.org; Fri, 07 Feb 2025 13:00:28 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tgNy8-001FaF-7J for pgsql-general@lists.postgresql.org; Fri, 07 Feb 2025 13:00:28 +0000 Received: from mail-lj1-x234.google.com ([2a00:1450:4864:20::234]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tgNy5-003leM-2S for pgsql-general@lists.postgresql.org; Fri, 07 Feb 2025 13:00:27 +0000 Received: by mail-lj1-x234.google.com with SMTP id 38308e7fff4ca-30615661f98so21882371fa.2 for ; Fri, 07 Feb 2025 05:00:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738933223; x=1739538023; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=HvwcnABIXlTrlp/iLbU7Dfvo3TH7crn771fGgUpSHYU=; b=LZ/g67icWa9YdXHJhlSqddDVNLdkU53aCRK1HLreoluuvC1OoBfYhvZ8ss2Rggrrxs B5L2HEYcAHgq3/Iq7AqAv2IY4NMz16Jo9SL5HauBW83lY98m7+HNquMYgRkfenIAme+r 6RR1L8jJgagGQHujE1HLkPygmnkwzhCopVGSFNTrB4YyyN/tXQhs3ZUtVfoVfKzxoTUe 1m19kMSTDI5DZveokrFLRtrowfPxIrWr4D5Q8NLscyV+UGVeMu0IGof5GJnu7Aqm7LbS UJf0l2uZSEF5QLgN/KqCnZ9OYrMfDMkPnFzdg3ltB683eRXb518HBK09WB9e7uyjtKc7 UXBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738933223; x=1739538023; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HvwcnABIXlTrlp/iLbU7Dfvo3TH7crn771fGgUpSHYU=; b=HYAzOVUBMOGphpVafh5TjEtCiwUWruKp7Q1WJbbOK+jKbpuzsDGmeSzrnC/qaJqttw LmX/ExBHQ+jQ47b+6iDARbbaoVzQhZVrwKuif4abrYCg+D82tgRZeH2/ciSe/6OwgYb8 wnG/kD/PsH6wgpmgM07TxAjtFcrmXn+oWXaCnXuUa/tVIrlU2xr9+VaXxOF1p6kT+PSA uLkfczKnINqKRLRrzAbn73X7Lp40KLBDyI03WOfnGtlW39s0nv0ovathUwn9KGtF5gZc kq8xfN7n1zEpjJLILDskBwM8F1/RHlJui3tWrPt+dtQPj6a8f3sP6xfJWr3Hhbl1+eQf srNQ== X-Forwarded-Encrypted: i=1; AJvYcCVeGnGo7qcVONCCOmQYvsqRDmdhbB4ZhsyyaNFdbM5nkDms7kZhKFSkg4h9YCbmEdCPPWjm9KUYeElOZy64@lists.postgresql.org X-Gm-Message-State: AOJu0Yz0z5d7ae9qXoPKI+Cx9cEAZwfF3yaa4G1X8YoCyQlGhcHMjkQ5 B0Nch0ReSUrQAbt4w6+10vjXHaaCnG41WdlZUcyALH3pVtvD1uuQxxOeDEvFPHJI5WWka+FNNQL rC+2OcGdEr5rR3wQIhtzINJEtbY8= X-Gm-Gg: ASbGncuKch8MVDFsi1gQ2rh0TWF1dgt+ylm5ljDxy9unvblZRa/cbjPLf/RxhqEkOiN QyyDbRXMkAgIlXpHxJMUBJ8Jcvy2siNwz4zvzOpvAnob1nhK76RksZGvyDack9SdLKD9NLJk= X-Google-Smtp-Source: AGHT+IEBGWBsILPtFMX6BeCrZBV5fVC79KQ93b0BBgX20rP3llv/A0s9WGk0fEknInYylrKX090ThGj2GHdcOjaYaWg= X-Received: by 2002:a2e:9a0d:0:b0:302:3534:1fba with SMTP id 38308e7fff4ca-307e57fe4admr10555171fa.17.1738933223136; Fri, 07 Feb 2025 05:00:23 -0800 (PST) MIME-Version: 1.0 References: <202502051537.rpohgrxu4vrz@alvherre.pgsql> In-Reply-To: <202502051537.rpohgrxu4vrz@alvherre.pgsql> From: ravi k Date: Fri, 7 Feb 2025 18:30:10 +0530 X-Gm-Features: AWEUYZnSVc-SMd8VR61_iqSLHezJs94_zw96wjcGGjzg6WENwoU3VZT2xhR9D9k Message-ID: Subject: Re: Commit Latency To: =?UTF-8?Q?=C3=81lvaro_Herrera?= Cc: Ramakrishna m , pgsql-general Content-Type: multipart/alternative; boundary="0000000000000da5e5062d8cf129" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000000da5e5062d8cf129 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks for the suggestions! It looks the issue is happening at the time of wal creation, does wal_init_zero off is good option? Best On Wed, 5 Feb, 2025, 9:07=E2=80=AFpm =C3=81lvaro Herrera, wrote: > Hello > > On 2025-Feb-05, Ramakrishna m wrote: > > > I have a system handling *300 TPS*, with resource usage *below 10%*. > > However, I=E2=80=99m noticing *commit latency of around 200ms* for *1% = of > > transactions*, occasionally spiking to *1 second*. Since there is no > > significant *I/O pressure*, I=E2=80=99m trying to identify what else mi= ght be > > causing this in *PostgreSQL 16*. > > max_connections=3D8000 doesn't sound great -- how many of those are > active, typically, and how many are idle-in-transaction? And you have > autovacuum_naptime=3D5s ... which sounds rather dubious. Either somebody > with great expertise configured this very carefully, or the settings are > somewhat randomly chosen with little or no expert oversight. Do you > have monitoring on the amount of bloat on these database? Maybe you > should consider connection pooling and limit the number that are active, > for starters. > > Maybe have a look at whether pg_wait_sampling can give you more clues. > Some basic bloat monitoring is a prerequisite to any further performance > tweaking anyhow. > > -- > =C3=81lvaro Herrera Breisgau, Deutschland =E2=80=94 > https://www.EnterpriseDB.com/ > Maybe there's lots of data loss but the records of data loss are also los= t. > (Lincoln Yeoh) > --0000000000000da5e5062d8cf129 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for the suggestions!

It looks the issue is happening at the time of wal creation= , does wal_init_zero off is good option?

<= div dir=3D"auto">
Best

On Wed, 5 Feb, 2025, 9:07=E2=80=AFpm =C3=81lvaro Herrera, <alvherre@alvh.no-ip.org> wrote:
<= /div>
Hello

On 2025-Feb-05, Ramakrishna m wrote:

> I have a system handling *300 TPS*, with resource usage *below 10%*. > However, I=E2=80=99m noticing *commit latency of around 200ms* for *1%= of
> transactions*, occasionally spiking to *1 second*. Since there is no > significant *I/O pressure*, I=E2=80=99m trying to identify what else m= ight be
> causing this in *PostgreSQL 16*.

max_connections=3D8000 doesn't sound great -- how many of those are
active, typically, and how many are idle-in-transaction?=C2=A0 And you have=
autovacuum_naptime=3D5s ... which sounds rather dubious.=C2=A0 Either someb= ody
with great expertise configured this very carefully, or the settings are somewhat randomly chosen with little or no expert oversight.=C2=A0 Do you have monitoring on the amount of bloat on these database?=C2=A0 Maybe you should consider connection pooling and limit the number that are active, for starters.

Maybe have a look at whether pg_wait_sampling can give you more clues.
Some basic bloat monitoring is a prerequisite to any further performance tweaking anyhow.

--
=C3=81lvaro Herrera=C2=A0 =C2=A0 =C2=A0 =C2=A0 Breisgau, Deutschland=C2=A0 = =E2=80=94=C2=A0 https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also l= ost.
(Lincoln Yeoh)
--0000000000000da5e5062d8cf129--