Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7Dzx-0055b2-1j for pgsql-hackers@arkaria.postgresql.org; Mon, 30 Mar 2026 14:53:49 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w7Dzv-0040en-1S for pgsql-hackers@arkaria.postgresql.org; Mon, 30 Mar 2026 14:53:47 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7Dzv-0040ee-0S for pgsql-hackers@lists.postgresql.org; Mon, 30 Mar 2026 14:53:47 +0000 Received: from mail-ej1-x62e.google.com ([2a00:1450:4864:20::62e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w7Dzs-000000020if-1VR5 for pgsql-hackers@lists.postgresql.org; Mon, 30 Mar 2026 14:53:47 +0000 Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-b97bca3797dso698155366b.0 for ; Mon, 30 Mar 2026 07:53:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774882423; cv=none; d=google.com; s=arc-20240605; b=FbWarUMDzMkzW0p+cB2jhvDnO1TG4qOcp3MLnMMOWpPh3djO42qS2j+3cAT4iFN2w6 J6lZI1JumDhN/wd1rgL5zIbB2byEiq6Hcm064z9rnXLV0awRNXKomglJxZLxQb3jA+0f 4+YjlDky1Az0Whs2UlI3IH4SbXbdvnG9vSGYZH5Wwr0iOtybZLULpTAtAkA312VpVWaK wr/40Al9tuBiBhkFJtf6Zw7jGNXTK07AW9TyAMaicWqGK5sXAq+PsdoUR9mMZkJ05fiU r0fk6qq4rCcr0neFMzCFni4o3yz1RVYSoIZQ5Pk1bmZTGmIDwg/uo4vPxQvrLqSbaenK kolQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=dWkcCAm10fIhiyd+gIg9J3LLMM1pWVMTFSujX7ta374=; fh=ONPDOnZXH8Ebcapeg+OW2dE6gFoTROJqvELfxJM3NdE=; b=gs0tPTuQiYexlKSRUCjGOf5vi4Zg4W1SZuTwiT5Rcby3eV+LSUR8+PwPssV+VbidZ9 nYYAvVRP6daxqVLDPaIw8KHLMIS9mdvus1MevPPverZp8oCuY+pVmjgYPGmGhJibHXmT 2UOCjit73P6AUBwTlV9pzSf9IYIGi2DlTuh0LB7BS8bxp3GVsDGw0K6XPZSqQEt+7Hn6 tSTXsdwv1zD1s4vVifAWo/eSpot7iPgduBda0oMzPuj1L8mHVBuFaOBMFc+tMAeE6p2P ejcpLPIob+BEDar51abrRM8noqg2t2ZMiYmgduVuxwTnQtCylmyFFjVUqXj/IClWpr46 PWpQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774882423; x=1775487223; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=dWkcCAm10fIhiyd+gIg9J3LLMM1pWVMTFSujX7ta374=; b=qD52f1fwQQUD5W1iQSrdwP+THB3Q9W/3upYw6gSauwRovEEMdZAL810Iw7VNhlMXLE eIEMAFpurMMnl/Zp0u11DiOib8Ub5hP059HHVEK2cd7cBDGgxzOBS0W7KguFjae3ADMY +9x5x7YjadoE7b6VgKakPKnNpOtxr5WH70YRF4gMLChIubXffVaUxp6RUYSdYb6XDsD+ mZrSIxO7mf8DQ+D6IfV8WPnclwBhL2uuY+6sghFIfaxnHPGrbHFwvrfkcehsV1TnaPjh QMd14x8sn9weWgN6FKX5dj4Loz+kvFQWU9WnfPMwP9CLCtrx9/ispae/bScKEYDllyxN UJjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774882423; x=1775487223; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=dWkcCAm10fIhiyd+gIg9J3LLMM1pWVMTFSujX7ta374=; b=pFieNxvmNSpfpQ3hQKNWrEBhLjsQCpXggVue0V3G0gq3aaUof2nQr/K5iQi2tx6F8s C31RbhMlHLqWIJhZzVL1EHjlm+Bhs6Vrtj3ncXlQg8zZZWHfIBRI2tyPEsIwUbckGuE0 4YfhpvSjwXtv1RVj7Gx5oXHTw0iA31uFrSM0h35vm584zhIpd9/T63Y6HVZhIQ9c2Zex f6n+zlBlIkWo6fegzWoLh1L76XYvPy7lbN7PDTeS7osnT150CIc07f68qeyMNxadYbGN HslYnx9nc3KgRHei/W8yNY0NGUtq/rUao2U2f++6QjG5qJejDy107CKbnmtaqvPbvnIl gXdQ== X-Forwarded-Encrypted: i=1; AJvYcCX55WruqcVZ26a9a+4oX5KZhVeYtT9mLa0JsdnLDQZUfIrlvygbQajs/ZsJK52TIQhJ8NcJqCRVwkLLOLd3@lists.postgresql.org X-Gm-Message-State: AOJu0Yz60uVOWVplz1hZMtZhjLy5RUry+nhoAUo94ZGshm4J28xr5dCk IaK2AAAVcEP7vsb3Ov/XnjdUEUos1vy+zPx30MDMC0y44qIZUMb9merSKTEIqnLsOshS3K2o90w RGH6iIXkpg2UGfqv7N70xYKsP1k9jWp4= X-Gm-Gg: ATEYQzwm6hfQDipMuYmX396K8Dz7JfX08LiT8Bd5Jb5VI+LrkNmuMJKrtWspXQL4Pt2 L6RmygCupgoMolrLAY/JgXGenE5fzvMukZxQg701BnSiS4k5ZKXepU4cB/5tcpq3sJ+uyh+VcPE /4RDqsD+CS364SdRt7ZQ+rxZITiDEadyHtQHcaVmMHsDc+96LD0O5oylf7wagx8j4mvHYfueQxc KW8Iu1hSCPdqjruWPDeROe4x5bBI4Sbb5pVu+Wm18tn5TAhC19I7FLGO8A+37wvimZtYbYram8P 8egsZrImJlYNg6Yc1f/cH/60McSEg8QZjU2VsAM= X-Received: by 2002:a17:907:e158:b0:b98:c69:f07d with SMTP id a640c23a62f3a-b9b5090a9b8mr637934766b.34.1774882422943; Mon, 30 Mar 2026 07:53:42 -0700 (PDT) MIME-Version: 1.0 References: <1136161.1769654478@sss.pgh.pa.us> <1299934.1773938807@sss.pgh.pa.us> In-Reply-To: From: Robert Haas Date: Mon, 30 Mar 2026 10:53:30 -0400 X-Gm-Features: AQROBzBrCpZbaSLUyv_oqo-HLVkoNTm2OuAFwHrqdNFSrrOyAEDblBgEthoxSlY Message-ID: Subject: Re: pg_plan_advice To: Lukas Fittl Cc: Jakub Wartak , Tom Lane , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, Thanks for taking the time to respond. My reading of your comments is that we are in overall agreement on the design, with the possible exception of persisting data cross restarts. I will write more about that topic below; but I think if that's the only design disagreement we have, it makes sense to go forward with committing the patch that I have, and we can continue to discuss whether we want to add something related to persistence. The only reason not to do that would be if there were a consensus that the lack of a persistence framework was such a critical defect that we shouldn't ship this at all without that, but I don't agree with that idea and I think it would be a pretty strong position for someone to take. On Sun, Mar 29, 2026 at 2:59=E2=80=AFPM Lukas Fittl wrote= : > I think a simple disk file is the way to go, similar to how > autoprewarm works with its "autoprewarm.blocks" file. Its a bit > awkward that that just sits in the main data directory, but since > pg_prewarm already does it today, I think its okay to have another > contrib module do the same. As noted I'm mainly worried about restarts > that the user didn't control, causing advice that was set to be lost. > > I've attached a patch of how that could look like on top of your v23, > that copies the modified stash information to a > "pg_stash_advice.entries" file, and loads it after restarts. I'll be honest: I don't like this design much at all, but I do see the practical advantages of it, and we have done similar things elsewhere, in particular in autoprewarm. Before I get to the specifics of your patch, let me complain about some things that I don't like at the design level. We lose a lot by directing data through a bespoke mechanism rather than handling it as table data. There are no checksums, so we have less protection against corruption. There is no write-ahead logging, so data does not make it to standbys, which is more of a potential issue for pg_stash_advice than it is for autoprewarm. All the code to read and write the file is specific to this contrib module, so it can have its own bugs separate from every similar module's bugs. The data can't easily be examined and manipulated from SQL as table data can. It's just a messy one-off that solves a practical problem but is otherwise not very nice. Of course, sometimes such messy one-offs are the right answer. In terms of the patch itself, the concurrency situation here seems noticeably worse than with autoprewarm. In that case, there's only one authorized writer at a time, tracked via pid_using_dumpfile. But in this case, it seems like multiple backends could be writing to the temporary dump file at the same time, which could result in a corrupted file that doesn't reload properly. Your code also has a race condition when reloading the data: the first arriving backend tries to reload the flat file, but any other backends that arrive while that's in progress see no stashed advice, and if the load fails for some reason, it's never retried, and the first modification to the in-memory state will clobber the file. autoprewarm has this issue to some extent as well, but that's more OK there because recreating the contents of shared buffers is only an approximate good, but people probably don't want their stashed advice to disappear out from under them if it was billed as persistent. That said, I'm not entirely opposed to a design where there's a small window where the advice stash is empty after a restart, because avoiding that means that it has to be safe to do the reload of saved advice from the middle of a query planning cycle, which is probably true with a flat-file design but wouldn't be true with a table. Still, I don't know whether the current behavior is deliberate or accidental. I also feel a bit uncomfortable with the idea of rewriting the entire file on every single change. If the hypothesis that this is only for adjusting the behavior of a small number of critical queries is correct, then it won't matter, but if people start using this for lots of queries, it's potentially painful. Neither autoprewarm nor pg_stat_statements does that. pg_stat_statements reads data only at postmaster startup and writes data only at postmaster shutdown, so it simply accepts loss of incremental changes in case of a crash, but that also means it doesn't read and write the file repeatedly. autoprewarm writes the file periodically from a background worker so that the on-disk state doesn't drift too far out of sync with what's in memory, without promising perfect durability. Both of those placements have the further advantage that the reading and writing of the file is not being done "in medias res," which does seem to have certain advantages from a robustness perspective. For example, without necessarily endorsing this design, suppose you added a background worker and there are GUCs to configure the database that it connects to and the query it executes to restore advice stashes. Or, alternatively, a background worker that still uses a flat file. Either way, that opens up design ideas like: when you see that the in-memory stashes are not yet reloaded, you can decide to wait up to X seconds for that to happen and then proceed anyway if it hasn't happened by then. I'm not saying that is the right idea here necessarily, but it's an option, whereas what you've done doesn't lend itself to that sort of idea. One other note is that fscanf() ending in a newline could eat up whitespace at the start of the following line. Since a stash name can begin with whitespace, that could be an issue. > Because the number of entries here is controlled by the user (i.e. its > not a function of the workload, but a function of how much advice you > as a user have set), I'm much less worried about memory usage, as long > as we document it clearly how to measure the amount of memory used. The module doesn't have a built-in way to do that right now. Are you thinking we would document that pg_get_dsm_registry_allocations() can be used? > In practice for a good amount of our user base these days the question > will be "Does my cloud provider give me access to create stash > entries", so its maybe worth thinking about if we could also allow > pg_maintain to manage entries by default? Wouldn't it make more sense for the cloud provider to grant execute permissions on these functions to pg_maintain if they are so inclined? This is a brand-new facility, so I think we had better be conservative in terms of default permissions. --=20 Robert Haas EDB: http://www.enterprisedb.com