Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3K8f-00176O-2z for pgsql-hackers@arkaria.postgresql.org; Thu, 19 Mar 2026 20:38:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w3K8e-002mDA-0z for pgsql-hackers@arkaria.postgresql.org; Thu, 19 Mar 2026 20:38:40 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3K8d-002mD0-2v for pgsql-hackers@lists.postgresql.org; Thu, 19 Mar 2026 20:38:40 +0000 Received: from mail-ej1-x62e.google.com ([2a00:1450:4864:20::62e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w3K8c-000000003kY-0nIf for pgsql-hackers@lists.postgresql.org; Thu, 19 Mar 2026 20:38:39 +0000 Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-b982a80317fso78627866b.1 for ; Thu, 19 Mar 2026 13:38:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773952717; cv=none; d=google.com; s=arc-20240605; b=eeN3115gikSXCwvOwYutqOQ6WtoNnKTN/xiPq0CG/texVpYh3hl97B0cibCsLTCSyV C7DBRSy1o8VansqPE30h2pBUsT3KF/VMhtL7BbDYs5PPM17cS8b4gqQrea9/Dsz/ilXG 4uGZ97uRt4VEaQuFoRz7PoPuHTdR1pzu7cOKbaEK11hkbc8prRYsbt0YFfHZ8ubzHAtM 5ad7HPlmbZwYWu47wsTLvT2/g4d3+Y5vTqFnLdLRcGD1wl1TFpzwDba7RuE5GnYhzUKO d3O1aj7IJk/+XRDwGADdp7rRGPMPt6cE/TaRJ6JEWU7tliE0SZ/yCwJj8MnEFc6PAIHP UWjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=0ohJa9uj6tN5AhPAOqIPIiSvDdNxFfT/CnbVKWEGwMg=; fh=YhmxPC0NYagsxVMppzYMgO0XMWsRSWzMegLjmIklu8U=; b=OFRMl3i1yxfL/qebLA3ubLBm8ETC4ff8Wa2VCu5+npMkCQoVvEMUTxunZPC3WKGeTy Nejnxjq3nwobeM6MgULNmjWmfLtWo4tTdWrIvBZr1NzIc+GZVUW/Mkd7IU13yLw+r9FI +7T+7wuBTtCrpC72W7c7hdEeV5+EA7SAlVzyzqvXgJ/eW+N6AI/BAF4htPMMtye07Osw JLw/1EcSTnYnWAxb0KF659YVr/hdXPx0bRomHR6MI7FaGznUGyfRR8Nyj6147YeX1PS2 JaSL7Rxtx4ihgMUaZY1vDPiOphV7uDaiI+oJ/vk82/+AR2ToD807WW5DT2bMgrfb4fE4 qQSw==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773952717; x=1774557517; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=0ohJa9uj6tN5AhPAOqIPIiSvDdNxFfT/CnbVKWEGwMg=; b=LIvs1W9/WM5dpHDx+/Nnf1HwwRbj6RZmMrS0tV+R99DmkASBa5jQjoddUDhPTiIPKO S1Cri7QusGLYnuoQmKilSJDLonbK1D9DFCR+uM1HniSgIM5UQVANxBM3cQwFJjberql3 ZZ5wwHngv0kTlFv8srF4iR7MbQ1iYsh79TIHH4G03JpsY5fT80oKDP4mpl1jooIj4fHV G0Nv50DlCsCnDDHKa4h7zHnw7wHj+8sT3fc6uVLLeT+KSX6mI9tOy+UGeZuDqsuT2sAo zBhrBXS/PXaGKPqqlCcORRbbicdvZfmLGdI/pfLuEm0pkg9XS8S57g5Dm/kBd8Ack3Ef 1mVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773952717; x=1774557517; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=0ohJa9uj6tN5AhPAOqIPIiSvDdNxFfT/CnbVKWEGwMg=; b=bERMwfY2vk3eKsg/7s24UxdD1vw8UQTcu+clepV1BJmzRAC0m+U3amG9YQ7ms85gra H+c/kNsl/G47GDJ0ac4OYXOayhm6THq7LfWxeznpNc6QW2vuiwXBtPlFJnBiX6Al+/u3 e4/I2dRX2Iy7pNfVkREdH+zmUkdwG8F/t1IlmqKuZJfcdxdAaFQc6bzU2tycELrBH4br 2SbNpV8yZ0l9evEHzP4uJ7+EE+E3Tu8jjx5OS4Zix0BnD11jExiiM114RPgSS6jk4UfO S4phvKW1SjkR4e6AVQDJp28e6YcL0xQt29HedP5W7c1QDfp2arjliH+JRsS/pn2yfFJD 8PuA== X-Gm-Message-State: AOJu0YwFywYq4uErblCB+2ZN0GAZu3CT2u5TaDhEFZUv5Q9IP6LSgtQB cojfCmobKTDsWm67E6TCSSjnlyGe8ELEeq+CfFTTkwGa+QeHp/M32ZHI7gJl1pJzY6rKIQF61JG I4fRhyQ7OhZ159GPKsgaYVVQDZZA4Q+0= X-Gm-Gg: ATEYQzz0accOJEVgALaWGurWmFOx1++taNIkJadd0IC5OJ03QVs/15BnQcVVK/ueg4I bxx0P944k+GmgnKszCD+WqmRhdR8KI4MU5BXsAa0SLY4dxnVATX1XfdLlZaTMvNG3XctWSGPGxR gjI74CS+gr1GvmSxy45KEdWJCYvtOjk8DGv+mcyjEuYxjNHAP11hZRRIfxp3F4YMgpI23DXKhfc ZiennsCMjRCzR0w6aHJ6Dru/PTDKtmDShgiYWhuBTFfTLGcsfpA/jDmZnRRQbVaL0mQpu4cSL0O +sviuyOtXhBVUVj50xpG02T0BXOav9dXIq/cF20= X-Received: by 2002:a17:906:b34d:b0:b97:9636:5aac with SMTP id a640c23a62f3a-b982f1ecb48mr35055966b.13.1773952716248; Thu, 19 Mar 2026 13:38:36 -0700 (PDT) MIME-Version: 1.0 References: <1136161.1769654478@sss.pgh.pa.us> <1299934.1773938807@sss.pgh.pa.us> In-Reply-To: From: Robert Haas Date: Thu, 19 Mar 2026 16:38:23 -0400 X-Gm-Features: AaiRm53dd-HY3qwhODTpsTHuGAnpjnbdZxnjtHEPLztPHGqN03zj3msUiEAtJDM Message-ID: Subject: Re: pg_plan_advice To: Tom Lane Cc: PostgreSQL Hackers Content-Type: multipart/mixed; boundary="000000000000805cd0064d668d81" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000805cd0064d668d81 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Mar 19, 2026 at 1:17=E2=80=AFPM Robert Haas = wrote: > > I would dig into why grison and schnauzer are failing this test, > > except that I don't agree that we should be running it in the > > first place. > > I'll go have a look. grison failed like this: CREATE INDEX ON vaccluster(wrap_do_analyze(i)); +ERROR: indexes on virtual generated columns are not supported This is a surprising diff, because either that command is trying to create an index on a virtual generated column, or it's not, and it either is or isn't for every machine in the buildfarm and under test_plan_advice or not. The problem might be that the first TupleDescAttr(...) =3D=3D ATTRIBUTE_GENERATED_VIRTUAL test in DefineIndex can be reached with attno =3D=3D 0, which seems like it will then access memory that is not part of the TupleDesc. If that memory happens to contain a 'v' in the right spot, this will happen. (Is it not viable to have TupleDescAttr() assert that the second argument is >=3D 0?) skink has a failure that looks like this: +WARNING: supplied plan advice was not enforced +DETAIL: advice NESTED_LOOP_MEMOIZE(nt) feedback is "matched, failed" I think this is caused by a minor bug in the pgs_mask infrastructure. get_memoize_path() exits quickly when outer_path->parent->rows < 2, on the theory that the resulting path must lose on cost. But that presumes that we could do a plain nested loop instead, i.e. that PGS_NESTLOOP_PLAIN is set. And it might not be. Before the pgs_mask stuff, that case couldn't happen: enable_nestloop=3Doff disabled all nested loops, and enable_memoize=3Doff disabled only the memoized version, but there wasn't any way to disable only the non-memoized version (which, of course, was part of the point of the whole thing). I think the fix is as attached. Unfortunately, the other failures look like they are pointing to a rather more serious problem. schnauzer's got a failure that looks like this: +WARNING: supplied plan advice was not enforced +DETAIL: advice SEQ_SCAN(pg_trigger@exists_1) feedback is "matched, failed= " +advice NO_GATHER(pg_trigger@exists_1) feedback is "matched, failed" And an older run on skink has a failure that looks like this: +WARNING: supplied plan advice was not enforced +DETAIL: advice SEQ_SCAN(pg_trigger@exists_6) feedback is "matched, failed= " +advice NO_GATHER(pg_trigger@exists_6) feedback is "matched, failed" What these failures have in common is that both of them involve selecting from information_schema.views. What "matched, failed" means is that we saw the advice target during planning and tried to influence the plan, but then observed that the final plan doesn't respect the supplied advice. The query for information_schema.views involves three subplans, so the fact that exists_6 is mentioned here is a strong hint that the AlternativeSubPlan machinery is in play here. The query is planned once, and one of the two alternative subplans is chosen, generating advice for that plan. Upon replanning, the other plan is chosen, so the subplan for which we have plan advice doesn't appear in the query at all, leading to this. Now, you might wonder how that's possible, considering that we're planning the same query twice in a row with advice that isn't supposed to change anything. My guess is that it's possible because these machines are slow and other tests are running concurrently. If those other sessions execute DDL, they can send sinval messages, which can cause the second planning cycle to see different statistics than the first one. That then means the plan can change in any way except for what the advice system already knows how to control, and choice of AlternativeSubPlan is not in that set of things. I think I actually saw a failure similar to this once or twice locally during development, but that was back when the code had a lot of bugs, and I assumed that the failure was caused by some transient bug in whatever changes I was hacking on at the time, or some other bug that I fixed later, rather than being a real issue. I think the reason it doesn't happen very often is because the statistics have to change enough at just the right moment and even on slower buildfarm machines, most of the time, they don't. It's not really clear to me exactly where to place the blame for this category of failure. One view is that tests are being run in parallel and I didn't think about that, and therefore this is a defect in the test methodology that needs to be rectified somehow (hopefully not by throwing it out). We might be able to fix that by running the test suite serially rather than in parallel, although I expect that since you (Tom) are already unhappy with the time this takes, that will probably go over like a lead balloon. Another angle is to blame this on the decision to assign different plan names to the different subroots that we use for the alternative subplans. If we used exists_1, exists_2, and exists_3 twice each instead of exists_1..exists_6, this wouldn't happen, though that idea seems questionable on theoretical grounds. A third possible take is that not including choice-of-alternative-subplan in the initial scope was an error. As of this moment, I'm not really sure which way to jump. I need to think further about what to do about this one. We can continue the discussion about reducing the cost at the same time; again, I am definitely not saying that it isn't legitimate to be concerned about the CPU cycles expended running these tests, but those CPU cycles have found three separate problems in two days, which is not nothing. Separately, I am now 100% convinced that I need to go revise the pg_collect_advice patch, because that adds yet another run of the core regression tests, but for much less possibility of any real gain. I'll go get rid of that and figure out what, if anything, to replace it with. --=20 Robert Haas EDB: http://www.enterprisedb.com --000000000000805cd0064d668d81 Content-Type: application/octet-stream; name="v1-0001-get_memoize_path-Don-t-exit-quickly-when-PGS_NEST.nocfbot" Content-Disposition: attachment; filename="v1-0001-get_memoize_path-Don-t-exit-quickly-when-PGS_NEST.nocfbot" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_mmxvmrhr0 RnJvbSAwY2ZjNTViZWNiNDcxM2VhNmQ4YTUyZDczNjhjOTMwMjdiZmNkYTYzIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBSb2JlcnQgSGFhcyA8cmhhYXNAcG9zdGdyZXNxbC5vcmc+CkRh dGU6IFRodSwgMTkgTWFyIDIwMjYgMTU6MzU6NDkgLTA0MDAKU3ViamVjdDogW1BBVENIIHYxXSBn ZXRfbWVtb2l6ZV9wYXRoOiBEb24ndCBleGl0IHF1aWNrbHkgd2hlbgogUEdTX05FU1RMT09QX1BM QUlOIGlzIHVuc2V0LgoKVGhpcyBmdW5jdGlvbiBleGl0cyBlYXJseSBpbiB0aGUgY2FzZSB3aGVy ZSB0aGUgbnVtYmVyIG9mIGlubmVyIHJvd3MKaXMgZXN0aW1hdGVkIHRvIGJlIGxlc3MgdGhhbiAy LCBvbiB0aGUgdGhlb3J5IHRoYXQgaW4gdGhhdCBjYXNlIGEKTmVzdGVkIExvb3Agd2l0aCBpbm5l ciBNZW1vaXplIG11c3QgbG9zZSB0byBhIHBsYWluIE5lc3RlZCBMb29wLgpCdXQgc2luY2UgY29t bWl0IDQwMjBiMzcwZjIxNDMxNWI4YzEwNDMwMzAxODk4YWMyMTY1ODE0M2YgaXQncwpwb3NzaWJs ZSBmb3IgYSBwbGFpbiBOZXN0ZWQgTG9vcCB0byBiZSBkaXNhYmxlZCwgd2hpbGUgYSBOZXN0ZWQg TG9vcAp3aXRoIGlubmVyIE1lbW9pemUgaXMgc3RpbGwgZW5hYmxlZC4gSW4gdGhhdCBjYXNlLCB0 aGlzIHJlYXNvbmluZwppcyBub3QgdmFsaWQsIHNvIGFkanVzdCB0aGUgY29kZSBub3QgdG8gZXhp dCBlYXJseSBpbiB0aGF0IGNhc2UuCgpUaGlzIGlzc3VlIHdhcyByZXZlYWxlZCBieSBhIHRlc3Rf cGxhbl9hZHZpY2UgZmFpbHVyZSBvbiBidWlsZGZhcm0KbWVtYmVyIHNraW5rLCB3aGVyZSBORVNU RURfTE9PUF9NRU1PSVpFKCkgY291bGRuJ3QgYmUgZW5mb3JjZWQgb24KcmVwbGFubmluZyBkdWUg dG8gdGhpcyBlYXJseSBleGl0LgotLS0KIHNyYy9iYWNrZW5kL29wdGltaXplci9wYXRoL2pvaW5w YXRoLmMgfCA3ICsrKysrKy0KIDEgZmlsZSBjaGFuZ2VkLCA2IGluc2VydGlvbnMoKyksIDEgZGVs ZXRpb24oLSkKCmRpZmYgLS1naXQgYS9zcmMvYmFja2VuZC9vcHRpbWl6ZXIvcGF0aC9qb2lucGF0 aC5jIGIvc3JjL2JhY2tlbmQvb3B0aW1pemVyL3BhdGgvam9pbnBhdGguYwppbmRleCAwNDQ1NjBk YTdiZi4uZTg5NDBhMjNjNjQgMTAwNjQ0Ci0tLSBhL3NyYy9iYWNrZW5kL29wdGltaXplci9wYXRo L2pvaW5wYXRoLmMKKysrIGIvc3JjL2JhY2tlbmQvb3B0aW1pemVyL3BhdGgvam9pbnBhdGguYwpA QCAtNzI5LDggKzcyOSwxMyBAQCBnZXRfbWVtb2l6ZV9wYXRoKFBsYW5uZXJJbmZvICpyb290LCBS ZWxPcHRJbmZvICppbm5lcnJlbCwKIAkgKiB0aGFuIG9uZSBpbm5lciBzY2FuLiAgVGhlIGZpcnN0 IHNjYW4gaXMgYWx3YXlzIGdvaW5nIHRvIGJlIGEgY2FjaGUKIAkgKiBtaXNzLiAgVGhpcyB3b3Vs ZCBsaWtlbHkgZmFpbCBsYXRlciBhbnl3YXkgYmFzZWQgb24gY29zdHMsIHNvIHRoaXMgaXMKIAkg KiByZWFsbHkganVzdCB0byBzYXZlIHNvbWUgd2FzdGVkIGVmZm9ydC4KKwkgKgorCSAqIEhvd2V2 ZXIsIGlmIHRoZSAicGxhaW4gbmVzdGVkIGxvb3AiIHN0cmF0ZWd5IGlzIGRpc2FibGVkLCB0aGVu IGl0IGlzIG5vCisJICogbG9uZ2VyIGNlcnRhaW4gdGhhdCBhbnkgcGF0aCB3ZSdkIGNvbnN0cnVj dCBoZXJlIHdvdWxkIGxvc2Ugb24gY29zdC4KKwkgKiBTbywgaW4gdGhhdCBjYXNlLCBjb250aW51 ZSBhbmQgbGV0IGNvc3QgY29tcGFyaXNvbiBzb3J0IHRoaW5ncyBvdXQuCiAJICovCi0JaWYgKG91 dGVyX3BhdGgtPnBhcmVudC0+cm93cyA8IDIpCisJaWYgKG91dGVyX3BhdGgtPnBhcmVudC0+cm93 cyA8IDIgJiYKKwkJKGV4dHJhLT5wZ3NfbWFzayAmIFBHU19ORVNUTE9PUF9QTEFJTikgPT0gMCkK IAkJcmV0dXJuIE5VTEw7CiAKIAkvKgotLSAKMi41MS4wCgo= --000000000000805cd0064d668d81--