Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vztFs-001Pqz-2B for pgsql-hackers@arkaria.postgresql.org; Tue, 10 Mar 2026 09:19:56 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vztFq-002Kmb-1f for pgsql-hackers@arkaria.postgresql.org; Tue, 10 Mar 2026 09:19:55 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vztFq-002KmS-0c for pgsql-hackers@lists.postgresql.org; Tue, 10 Mar 2026 09:19:54 +0000 Received: from mail-ed1-x536.google.com ([2a00:1450:4864:20::536]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vztFo-00000001wbG-0o4h for pgsql-hackers@postgresql.org; Tue, 10 Mar 2026 09:19:54 +0000 Received: by mail-ed1-x536.google.com with SMTP id 4fb4d7f45d1cf-661d929219bso3540782a12.0 for ; Tue, 10 Mar 2026 02:19:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773134391; cv=none; d=google.com; s=arc-20240605; b=XYT4SYiWeS8SQM3ZW1pQ65V0uqLOSVCPpd+oAckPjH4d0OQC3Y8Wb1vdxmLsrJ/B7f kE3CtTRHy7b1/pA0HQoHp80dP9bzwCFGzfZHYS1DHOoBnINCuRMhOABkrySskpi2MJsM hYyiM9NyiJwxtoxiSj1O9mdgpW7JNtGLzypptZ1QnGuvx63DE7BwlI7Po6Gw/+Y3dwxC ptYUhE0To5TZF00X7EhIgRMzFgzjkaYHLiX9Ax6yAPohGUS/SqtNh1ueH/8FJT9DeN6D sEiPqt7tJfgWx4M7/SUoHOEuJdhUU+iJLqcO/jOXHlNadT0H7Ttcye/uBiIU8VDXQrbg 4gBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=AnOOQlhE2pLOJglVa4l4UjDexadtCdOGk9bbFgiBZik=; fh=HiUX7gbs12RQrs2/eScdfYLRKI6+fFocMgItw/iIClk=; b=ZRnmCT+btU0EPSIWHXDwbtJlcexr0+ey1xFps59v9wlWqfxQJ0nZ2+xVbeBS2sNf8H ZDK4Wb0Kn10tYiCAud0sJ1OqiL4Ii0HmqnTAkg3FvVQm3U4u3U+8C4VZSHJqeAHKbkMY M2xyBKdYKP1vVkwAZQe+biSWnxOQgSg8sMR5m/bEkNHuChAQaMZJoDCXKLl7EKSxvW15 tx4nDMdZ6+dhY8rG4ERrNHutllfbtsA5Zers3x9pAgG4TO07YNJlRZjN2gtSMU3Zvr2g Va0+3UCm9KPdjRIUssidD95anXwO6cHWXkJAZxtIyb3kh/L+lDIPoCVn43dgBaCLU6ZA 8i6w==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773134391; x=1773739191; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=AnOOQlhE2pLOJglVa4l4UjDexadtCdOGk9bbFgiBZik=; b=FvHgqUoBVE2HOT6YQSQHH5Rd6yKQ19JAPQ8hbeUL9iBeveZPInjrAIr5YD7Rb9z8VZ qLgECIVrsJHTiBRECbGNy67s5TB/kKAUlTTu5Dysy0dw1WLAtEo6BO90wpQIeZysmbW/ yk9heS30t1yepo666ysQEBctd+m+lrCve1n39VsLsX73/BEvTR0cRWV4LrN+/l4E5Jwb ONbHydlLCtLn7hYsAeq6tmoX1JoaP94oI9N6TXfsV590TQ1JP1Z/JAYXaWkhuKTLDgcs vRfnHMY8NSJWX2fUCwJSAIbbxKR0U3gV+S3mleBE4cAAJj5q5/Ojm9aiYuSU8+3IGcRQ KLrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1773134391; x=1773739191; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=AnOOQlhE2pLOJglVa4l4UjDexadtCdOGk9bbFgiBZik=; b=uM8szqvdnpGYEXnrdA8+y7IrcRV5/f8l1HB170VUMD9la5HN9Sqd259GhH7T7wwo24 PDUjLMiqClwGywsT/W4QtBmWOZR/mMy99V7pq7PWfW1wvWi1sWci4W+t1pM6Kbgv5Nfo E2oDeKMm8geE5WXgj69SLhkafIp7IBSSIGlIqe67QD07YYCWvqmeaAFEXOJU6WnRe0wN esx6bguEANxQy0wyZepBcLWxNe1g2qb6wS1KLWTEkyGuhq8des3qZCN37G4+S5Vx36Eu uByllN+g+PtX6jXUeq0Q9kX4JdzqyJm40nxZSo4ZAvI0PLgJMpl01nk5+LgMXFzRGU5x 3bbA== X-Forwarded-Encrypted: i=1; AJvYcCUXJeIOew8v6570u1+OFRTIpLZ5dBokxUwwP5XNHUOo9sA3WrNcjAAIfknaRWYj9acX+UEZ2VVGa5ExvzdT@postgresql.org X-Gm-Message-State: AOJu0YzXCdFgtPxK0jnq9l4sxNBJ5V7wqOn+0U99JrL1YLN/uC9tbM4w 7tW+Zr63igwoJOohXp+jmK5mjb8wSz8uWZc1A/lic5sxZpiUKLv2rdHk4doAp53g2yQeTywN5V6 G2QmWS4MWFT7AAoUWD9Rt1Ce7sQ9yADU= X-Gm-Gg: ATEYQzxkNIK/3TOfYeZKAjysYqUw5lrc4m7K4801OiAp49IQ8+IBAHUflcTw9quwH/O rOagbXq188wXfjIbeO9bZWJoYLwNpR32NRoITv1muYw01tKSM2/BE2BAJdbJ23eE+tygpinMDUG trtl8y2ktyuuLEDHXaBGO7RMVO9/6mf3678hyIBXgRJ4YwEeoVCpq1dJFw0xYaDuRFRotJWje3W zvxb4SHBa0i4Esd1GoC2V6GT4Wz8uz/J1GV+sqSBeOQ5iCaYH9gaOW2R8cjjPFodhTELYwmuZBh hz6DTakfvXhozcQrubd2cCVvW143LOCLiVts6tEwgigPksDtiXWn/CKC/6YxTdTSAqyiV64CLGt 9qu3ccNSq3y+jGruE8jE= X-Received: by 2002:a17:906:2091:b0:b97:1d24:bfe1 with SMTP id a640c23a62f3a-b971d24f8c2mr45592466b.22.1773134390625; Tue, 10 Mar 2026 02:19:50 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Xuneng Zhou Date: Tue, 10 Mar 2026 17:19:39 +0800 X-Gm-Features: AaiRm53r8oNriM2QB6XCPlQ6nlq_kR-e00ilGuV1pG9sPsXc2XnKMv__yTIACBo Message-ID: Subject: Re: Odd code around ginScanToDelete To: Pavel Borisov , Alexander Korotkov Cc: Andres Freund , pgsql-hackers@postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Pavel, Alexander, On Fri, Mar 6, 2026 at 10:45=E2=80=AFPM Pavel Borisov wrote: > > Hi, Andres and Alexander! > > On Sat, 21 Feb 2026 at 13:55, Alexander Korotkov w= rote: > > > > On Wed, Feb 4, 2026 at 12:32=E2=80=AFAM Alexander Korotkov wrote: > > > > > > On Tue, Feb 3, 2026 at 6:26=E2=80=AFPM Andres Freund wrote: > > > > > > > > While looking at converting more places to UnlockReleaseBuffer(), i= n the > > > > course of making UnlockReleaseBuffer() faster than the two separate > > > > operations, I found this code: > > > > > > > > static bool > > > > ginScanToDelete(GinVacuumState *gvs, BlockNumber blkno, bool isRoot= , > > > > DataPageDeleteStack *parent, Offset= Number myoff) > > > > ... > > > > > > > > if (!meDelete) > > > > { > > > > if (BufferIsValid(me->leftBuffer)) > > > > UnlockReleaseBuffer(me->leftBuffer); > > > > me->leftBuffer =3D buffer; > > > > } > > > > else > > > > { > > > > if (!isRoot) > > > > LockBuffer(buffer, GIN_UNLOCK); > > > > > > > > ReleaseBuffer(buffer); > > > > } > > > > > > > > if (isRoot) > > > > ReleaseBuffer(buffer); > > > > > > > > > > > > Which sure looks like it'd release buffer twice if isRoot is set? = I guess > > > > that's not reachable, because presumably the root page will always = go down the > > > > !meDelete path. But it sure made me wonder if there's a hard to rea= ch bug. > > > > > > Yes, it's not possible to have meDelete set for root, because > > > me->leftBuffer is always InvalidBuffer for the root. So the branch > > > handling meDelete case should better do Assert(!isRoot). > > > > This code was introduced in > > > > commit e14641197a5 > > > > Author: Alexander Korotkov > > > > Date: 2019-11-19 23:07:36 +0300 > > > > > > > > Fix deadlock between ginDeletePage() and ginStepRight() > > > > > > > > I didn't trace it further to see if it existed before that in some = fashion. > > > > > > Yes. I think generally this area needs to be reworked to become more > > > clear, and have vast more comments. It was wrong from my side trying > > > to fix bugs there without reworking it into something more > > > appropriate. I'm planning to put work on this during this week. > > > > > > > There's another oddity here: ginScanToDelete() requires that the ro= ot page has > > > > been locked by the caller already, but will afaict re-read the root= page? But > > > > then have code to avoid locking it again, because that would not ha= ve worked? > > > > Seems odd. > > > > > > > > > It seems a bit odd for me that caller already have locked buffer, but > > > passes BlockNumber making us re-read the buffer. But I'm not sure > > > that's the same as your point. Could you, please, elaborate more on > > > this? > > > > Here is the refactoring patch. Sorry for the delay. > > Hi, Andres and Alexander! > > I've looked into the patch v1. > Overall, it looks good to me. The refactor LGTM in general. The buffer-ownership rewrite looks cleaner and safer overall. > Some thoughts: > > Is it worth/possible in recursive calls of ginScanToDelete() to free > allocated myStackItem->child after processing all children of the > current level, when they are not needed anymore? > Previously to this patch, palloc-ed "me" variable also was't freed at > recursion levels. I think freeing myStackItem->child inside recursive calls might not be worthwhile here. That node is intentionally reused for subsequent siblings at the same depth, and it carries state (leftBuffer) that can still be needed until the level is fully processed. Freeing/reallocating it per subtree would add churn and make the lifetime rules harder to reason about without meaningful memory savings (the number of nodes is bounded by tree depth, not number of pages). We currently free the chain once after ginScanToDelete() returns in ginVacuumPostingTree(), which matches the natural lifetime boundary > Could limiting the maximum recursion level be useful? Posting-tree depth is naturally small; a hard cap seems to add failure risk with little practical benefit. > In the comment to myStackItem before ginScanToDelete(), it might be > worth adding that after processing all pages on the current level, > myStackItem is not needed anymore. > > > > Yes, it's not possible to have meDelete set for root, because > > > me->leftBuffer is always InvalidBuffer for the root. So the branch > > > handling meDelete case should better do Assert(!isRoot). > Looks like this additional Assert is not in patch v1. > > In the root call of ginScanToDelete(gvs, &root); we can add Assert > checking that its return result is false: > - ginScanToDelete(gvs, &root); > + deleted =3D ginScanToDelete(gvs, &root); > +. Assert(!deleted); /* Root page is never deleted */ + 1, this is a good invariant check and improves readability One minor nit for the comment: The DataPageDeleteStack.buffer field comment says "valid only while recursing into children" this is true for internal pages, but for leaf pages the buffer is valid until the pageWasDeleted / leftBuffer decision. The validity window is actually "from when the caller sets it until the post-recursion cleanup." -- Best, Xuneng