Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5u0j-003jwt-1k for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 23:21:09 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w5u0h-006N3a-2S for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 23:21:08 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5u0h-006N3S-0z for pgsql-hackers@lists.postgresql.org; Thu, 26 Mar 2026 23:21:07 +0000 Received: from mail-ot1-x330.google.com ([2607:f8b0:4864:20::330]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w5u0a-00000001Do0-4Aw6 for pgsql-hackers@lists.postgresql.org; Thu, 26 Mar 2026 23:21:05 +0000 Received: by mail-ot1-x330.google.com with SMTP id 46e09a7af769-7d7e5e8c907so982878a34.0 for ; Thu, 26 Mar 2026 16:21:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774567260; cv=none; d=google.com; s=arc-20240605; b=lF4goEm1MGdWkOsSpLJdTut/7SKhSki94E5Kqk94NEkX7hUYZ36z3tlvpjP9Y2XKxA dAc6L9PpB7wdCbBQ7ehTRfQQyDzeisEOhiVRyu2gJZQz9MpygNMDIW1OjYoTY6kxMXtP h1Eod4dPKwDvPGwjeADzOvWc8HHfQDXWoCMLyn6SKxftV6f3HAtRNdxXWtWOVref7XEx JEWNEC6PWEqrBn9Fgl8X4MD8Mum+Bxpp7RLXo8hSmEUMjHmOtz1CueBnO2kVlfU8QQvK bOHJlCqL8fP/BxdALCXOM6yAnH3ufpchyMdefSKdi3435OVIon/VJejXNZJkFxy5/5tB d9aQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=KfSr+KkodhqE8FtITts4M0DFqE0hr649Q2lP9iISJcw=; fh=ohUyyfrLRh8cLXcdiFYUgGDBG5CQoMXqAjTZCYMuP4s=; b=SE2OEObIZ2NRZ8udRQWB2ig4UnEgVWoSJOayKjj92Mm4QED+Qi5351G0YgChsnH6cf PC/ppRrPNn3R4+U+2SYGNIKACOgAIeB9bIFSy3jWiDnC+R2vcSvTEV5ILNOztoKrkqmc P4+gcPqhSu+0Xo5H3HX7az8oz2I6+fkHXUkmfKwvepAPNp3vnZVb8GJTAeBIQwEyoVL+ XqAXglM/6SDYtalo+28Ns+Dxx6uPRLb/oBtxcQhj3Uck3wGZz3gfovNU5gPMTj5DTPDN 7q332OwZPvF/FksrKRZjE0YaLsBHjlq51xfBSBYB2bHdztfdy4WppTUcyu2MJYCMXGGU +JUA==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; t=1774567260; x=1775172060; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KfSr+KkodhqE8FtITts4M0DFqE0hr649Q2lP9iISJcw=; b=dbnqWhX9qmP/DmfMkP9Af1pWiQqOtCj04zbAujUR9XAYX+mAyiZFS7BNIH392MpKJJ JlSR83K80Xqgavi5oM1cnC7CRVneNK+eaBIfSmz23pOzivVeJjKH2nn3CUwsyyw8AZwI u+uOBwys/0Kxi/FgRMwUsWMb96mDIri09soDTSEIitECfwtWAXV0UUMlGDYpG85fO+w4 trgLL1Z4AAIAlN987Rr4dC7zW/oLNcqaB82Db5hcTa6fywsqVH8geu6bpPxF6MvuGIiQ AudUOtk8VBx4Xks1DLNKAbuuS+NXHuor8L3wco5QXVP5CRWdirmCCGrC2F1HfElz6Ic1 3+6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774567260; x=1775172060; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=KfSr+KkodhqE8FtITts4M0DFqE0hr649Q2lP9iISJcw=; b=dBlo4+or0E9gCZ6tItwasKuVTkLTdET4ixCz0te94FD8TEgWRsC2qBHHA++VmOwMfy ZSeEuAWzNRlxqyv/mDtwhRjLMz6ILpIyjLfj2jmW4jXzppVVIgaiUAIycf/WgPBlQMMy kb3f7Yhvh0PSy8HWY4TQJq9LtymUCUgiLnWXRbqCkAB6+sU66ZZvf0d4vp8Ltpa+7SX8 dXggkhKJFr0+8oB1GAt2bQLOM7W0rcFSHd+NYsYEiNfFkUS0+fofrf/Bsx7f7x2r7bzz a4S8KIvGJzN6O+zeO8NGnSMVielrup/5dn4lfarvR1ojtWkr1ZNxVd9kD9lzK0pneXde le5w== X-Forwarded-Encrypted: i=1; AJvYcCV9Y3JcZ9CmSgK+nkh5t+9blCtXj/+BfWjmItUszX9Y7qZBdifDCWeOzQeU3Ofo1s3qnrGOX/+e0Kzz8Fno@lists.postgresql.org X-Gm-Message-State: AOJu0Yw8H83sCK67kfpP9J0eEVSJW1UEoO9Rwmm3jIZ67fn5YUqTMn7Q Bxk9hEIT51fkqhU+u+1ulCmvrDx+mWnr/sVFpmurgtx7YDXwFMn9vPLmDwM8BRnZGmWEa/2pDTJ g6y4yb1HeRL0DuucWF9obKnhY9SCg8KAJjdnDLGHl9OU5JOhL2VKutA== X-Gm-Gg: ATEYQzzS3b9MLxPLUL/N/UtjaofE44reHQHGmCDtDFI6ejmqDAG80dcYNcev4bzbmG9 PGfClbNJZ33i6oxA5Ok/lcYn+Fu65XFrf3h7VGbqTRNXYpSDgiN7Tg7z/GasdNw9HY2BXMpgG+p oInGAJbgFtXFu8kNMTfQPMipm40jVBwQNPhUYLHn6C0PIVsTV9zfV8CvRoeetXArnhzHZI1zp93 XcFmLBX6h3TWhFxoj9k3RmMQNht2GtWRlGAvIZXk1P7f36EEwrVrmRTxxrars04ahWf94t/QOjP NO8/NS/piHulpbyk52+D X-Received: by 2002:a05:6830:67ed:b0:7d9:b023:cb80 with SMTP id 46e09a7af769-7d9faf8c304mr89340a34.30.1774567260240; Thu, 26 Mar 2026 16:21:00 -0700 (PDT) MIME-Version: 1.0 References: <1136161.1769654478@sss.pgh.pa.us> <1299934.1773938807@sss.pgh.pa.us> In-Reply-To: From: Mark Dilger Date: Thu, 26 Mar 2026 16:20:49 -0700 X-Gm-Features: AQROBzCkbJZP88hrFr75rcLMoEk1MPN4OFya_aQ9A2ezHaT-Yo2jPX6PgbZ539M Message-ID: Subject: Re: pg_plan_advice To: Robert Haas Cc: Lukas Fittl , Tom Lane , PostgreSQL Hackers Content-Type: multipart/alternative; boundary="0000000000002d56d3064df5a369" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000002d56d3064df5a369 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Robert, Thanks for the pg_plan_advice patches! I've tried hard to attack them, to get them to fail in some catastrophic way. You seem to have hardened the code well. I found only one concern for you to consider, a kind of memory leak ratchet: Once the system reaches memory pressure where: - The 8192-byte DSA size class is exhausted (needs a new DSM segment, OS refuses) - Smaller size classes still have free space in existing superblocks Then every single query that triggers advice collection will: 1. Successfully allocate an advice entry from existing free space 2. Enter store_shared_advice, hit the same chunk boundary 3. Fail to allocate the chunk 4. Leak the advice entry 5. Reduce remaining free space in the small size classes This continues until the small size classes are also exhausted, at which point make_collected_advice itself fails (the DSA area has been consumed by leake= d entries). The ratchet is self-reinforcing: each failure guarantees the next failure while consuming more resources, assuming nobody else is freeing memory simultaneously. I looked for situations where something inside postgres would keep retrying after the OOM, but the most likely I think is just an application that treats OOM as a transient error and keeps retrying. See the make_collected_advice() call in pg_collect_advice_save(); the point where DSA memory is allocated but not yet linked into any data structure. Everything downstream from here (the four dsa_allocate0 calls inside store_shared_advice) can fail and leak it. On Thu, Mar 26, 2026 at 10:20=E2=80=AFAM Robert Haas wrote: > On Thu, Mar 26, 2026 at 9:55=E2=80=AFAM Robert Haas wrote: > > Whoops. Obviously got the wrong thing stuck in my cut and paste buffer > > when I was writing that. Thanks for checking it. I'm going to go ahead > > and commit this, because I'm pretty confident that it's correct, and > > the rest of these patches are not going to fix the buildfarm > > instability without it, and I'm pretty sure multiple committers are > > pretty tired of seeing these test_plan_advice failures already. > > Done. > > > Right, the comment isn't quite correct. I don't think your rewording > > is quite right either, though, because there's really no reason to > > mention plan_name here at all. I'll adjust it. > > Done and committed, after also adjusting the memory context handling > to avoid re-breaking GEQO. > > > The dangling pointers are a good point; I agree that's bad. However, > > I'd be more inclined to fix it by nulling out the alternative_root > > pointers at the end of set_plan_references. I think that would just be > > the case where root->isAltSubplan[ndx] && root->isUsedSubplan[ndx]. > > The reason I'm reluctant to just store the name is that there's not an > > easy way to find a PlannerInfo by name. I originally proposed an > > "allroots" list in PlannerGlobal, but we went with subplanNames on > > Tom's suggestion. I subsequently realized that this kind of stinks for > > code that is trying to use this infrastructure for anything, for > > exactly this reason, but Tom never responded and I never pressed the > > issue. But I think we're boxing ourselves into a corner if we just > > keep storing names that can't be looked up everywhere. It doesn't > > matter for the issue before us, so maybe doing as you say here is the > > right idea just so we can move forward, but I think we're probably > > kidding ourselves a little bit. > > Here's a new version, where I've replaced alternative_root by > alternative_plan_name, serving the same function. > > -- > Robert Haas > EDB: http://www.enterprisedb.com > --=20 *Mark Dilger* --0000000000002d56d3064df5a369 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Robert,

Thanks for the pg_plan_advice patches!= =C2=A0 I've tried hard to attack them, to get
them to fail in some c= atastrophic way.=C2=A0 You seem to have hardened the code
well.=C2=A0 I = found only one concern for you to consider, a kind of memory leak
ratche= t:

Once the system reaches memory pressure where:
=C2=A0 - The 81= 92-byte DSA size class is exhausted (needs a new DSM segment, OS refuses)=C2=A0 - Smaller size classes still have free space in existing superbloc= ks

Then every single query that triggers advice collection will:
= =C2=A0 1. Successfully allocate an advice entry from existing free space=C2=A0 2. Enter store_shared_advice, hit the same chunk boundary
=C2= =A0 3. Fail to allocate the chunk
=C2=A0 4. Leak the advice entry
= =C2=A0 5. Reduce remaining free space in the small size classes

This= continues until the small size classes are also exhausted, at which point<= br>make_collected_advice itself fails (the DSA area has been consumed by le= aked
entries). The ratchet is self-reinforcing: each failure guarantees = the next
failure while consuming more resources, assuming nobody else is= freeing
memory simultaneously.

I looked for situations where so= mething inside postgres would keep retrying
after the OOM, but the most = likely I think is just an application that treats
OOM as a transient err= or and keeps retrying.

See the make_collected_= advice() call in pg_collect_advice_save(); the point
where DSA memory is= allocated but not yet linked into any data structure.
Everything downst= ream from here (the four dsa_allocate0 calls inside
store_shared_advice)= can fail and leak it.

On Thu, Mar 26, 2026 = at 10:20=E2=80=AFAM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Ma= r 26, 2026 at 9:55=E2=80=AFAM Robert Haas <robertmhaas@gmail.com> wrote:
> Whoops. Obviously got the wrong thing stuck in my cut and paste buffer=
> when I was writing that. Thanks for checking it. I'm going to go a= head
> and commit this, because I'm pretty confident that it's correc= t, and
> the rest of these patches are not going to fix the buildfarm
> instability without it, and I'm pretty sure multiple committers ar= e
> pretty tired of seeing these test_plan_advice failures already.

Done.

> Right, the comment isn't quite correct. I don't think your rew= ording
> is quite right either, though, because there's really no reason to=
> mention plan_name here at all. I'll adjust it.

Done and committed, after also adjusting the memory context handling
to avoid re-breaking GEQO.

> The dangling pointers are a good point; I agree that's bad. Howeve= r,
> I'd be more inclined to fix it by nulling out the alternative_root=
> pointers at the end of set_plan_references. I think that would just be=
> the case where root->isAltSubplan[ndx] && root->isUsedSu= bplan[ndx].
> The reason I'm reluctant to just store the name is that there'= s not an
> easy way to find a PlannerInfo by name. I originally proposed an
> "allroots" list in PlannerGlobal, but we went with subplanNa= mes on
> Tom's suggestion. I subsequently realized that this kind of stinks= for
> code that is trying to use this infrastructure for anything, for
> exactly this reason, but Tom never responded and I never pressed the > issue. But I think we're boxing ourselves into a corner if we just=
> keep storing names that can't be looked up everywhere. It doesn= 9;t
> matter for the issue before us, so maybe doing as you say here is the<= br> > right idea just so we can move forward, but I think we're probably=
> kidding ourselves a little bit.

Here's a new version, where I've replaced alternative_root by
alternative_plan_name, serving the same function.

--
Robert Haas
EDB: http://www.enterprisedb.com


--

Mark Dilger
--0000000000002d56d3064df5a369--