Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rhb5W-00CLo9-Hp for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Mar 2024 20:08:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rhb5T-002TXo-Gb for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Mar 2024 20:08:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rhb5T-002TXf-6u for pgsql-hackers@lists.postgresql.org; Tue, 05 Mar 2024 20:08:31 +0000 Received: from mail-lj1-x22d.google.com ([2a00:1450:4864:20::22d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rhb5L-0031Sf-KH for pgsql-hackers@postgresql.org; Tue, 05 Mar 2024 20:08:30 +0000 Received: by mail-lj1-x22d.google.com with SMTP id 38308e7fff4ca-2d09cf00214so77593821fa.0 for ; Tue, 05 Mar 2024 12:08:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709669303; x=1710274103; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=yfwiTaSUkt7T5whV7EovI0ph+/gTYTxcaftHp9sw5p0=; b=LpQxjyuZl1LWM7WEmY0FMy1HzC87Bal0jDy3v3xgIJbTIXzLLiiAQEDK/8eZTbkYkc VFub+G/wsw26ckGq1S/H/y1EMr5LewzoRefqcGZ89VMFfoxiyjJFiwZiHxn8oHcK6AKn LMNP9nwYwnv1w0oZL1IYNgNQ1U34p9ibY9WBeIdXAoigV9lVRQJ6iMEQC+PzJSIYhB9p +3iD2PmzuwD+oi5RE2WUwUv3DV84FrQy6WBj7o0z3Zg0UShUQUCAx2rS+9/uMWlgIZnd /Rg/2oZ9AztpM5/OLg+JzWt6CaLrQPdArc7E/S13cPrQq//Js1BcBCkoqZNEsjjQoPU3 EVbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709669303; x=1710274103; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=yfwiTaSUkt7T5whV7EovI0ph+/gTYTxcaftHp9sw5p0=; b=ECMywrmxrE77xnJcs8FyBLyivFnWvi0O/24cjalx4mksb3GKShk3UpSu7f7NvrW9D1 aL9yGe1MGImpd1s2ZhJhqRodVbiXhKySkTwodNrqM5jgOK2XBEk1H88dKAMk1aFLSqFN VUnkMKmTWxFWTSDVmOMvrJOFP30G47W9WO1glIJeeKBR64mgsy/wM0zHZwAvPU+ZKi3X q1PSvVBFDikzMjGb+o/Yvjks4SNTYw1avggUaLFNTDTZiAt3P/DxUpoa+wbz5+7hLuH7 lsd6ESgisZH3uCeJmXEWx3bBjKuF3uRVwVfxDx/mpEo835xpdrQ4P+nPRVG/yWk79m+g 5pyQ== X-Forwarded-Encrypted: i=1; AJvYcCXf/sh/bvmUFr58Tfb/h2K2zAItp+og+0D0Ypg5TU+Z6HyDtVem69VwWRSzyl+FeuOnWBnZJALd5y6SYx++1WvmsC1W5URAMx7bwUYf X-Gm-Message-State: AOJu0YyVyn642MYGFsnii+/Li3/k+qJNStkte5jZN/9E1GZ3Tb2ol5Es xZFJWzLJ6/8vckZpIL3JB/QgpCndTxv79UQdSlLNhhqQTetrKYhM+oupLIDWvyQG6PzKMExCwtt 7P3rGqO4wxaHS05oluvdvgJCyDsg= X-Google-Smtp-Source: AGHT+IGL79ZB9Qf3TPcS7cm7UyhprgHwX6Z7axQbaz8B6UrIysniMDgSIuITO3kZvBln6rsTnBtuX4DNerkEoU+qIh4= X-Received: by 2002:a2e:8716:0:b0:2d2:937d:f5fa with SMTP id m22-20020a2e8716000000b002d2937df5famr1830216lji.12.1709669302535; Tue, 05 Mar 2024 12:08:22 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Matthias van de Meent Date: Tue, 5 Mar 2024 21:08:08 +0100 Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Michail Nikolaev Cc: Melanie Plageman , PostgreSQL Hackers , Alvaro Herrera Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, 21 Feb 2024 at 12:37, Michail Nikolaev wrote: > > Hi! > > > How do you suppose this would work differently from a long-lived > > normal snapshot, which is how it works right now? > > Difference in the ability to take new visibility snapshot periodically > during the second phase with rechecking visibility of tuple according > to the "reference" snapshot (which is taken only once like now). > It is the approach from (1) but with a workaround for the issues > caused by heap_page_prune_opt. > > > Would it be exclusively for that relation? > Yes, only for that affected relation. Other relations are unaffected. I suppose this could work. We'd also need to be very sure that the toast relation isn't cleaned up either: Even though that's currently DELETE+INSERT only and can't apply HOT, it would be an issue if we couldn't find the TOAST data of a deleted for everyone (but visible to us) tuple. Note that disabling cleanup for a relation will also disable cleanup of tuple versions in that table that are not used for the R/CIC snapshots, and that'd be an issue, too. > > How would this be integrated with e.g. heap_page_prune_opt? > Probably by some flag in RelationData, but not sure here yet. > > If the idea looks sane, I could try to extend my POC - it should be > not too hard, likely (I already have tests to make sure it is > correct). I'm not a fan of this approach. Changing visibility and cleanup semantics to only benefit R/CIC sounds like a pain to work with in essentially all visibility-related code. I'd much rather have to deal with another index AM, even if it takes more time: the changes in semantics will be limited to a new plug in the index AM system and a behaviour change in R/CIC, rather than behaviour that changes in all visibility-checking code. But regardless of second scan snapshots, I think we can worry about that part at a later moment: The first scan phase is usually the most expensive and takes the most time of all phases that hold snapshots, and in the above discussion we agreed that we can already reduce the time that a snapshot is held during that phase significantly. Sure, it isn't great that we have to scan the table again with only a single snapshot, but generally phase 2 doesn't have that much to do (except when BRIN indexes are involved) so this is likely less of an issue. And even if it is, we would still have reduced the number of long-lived snapshots by half. -Matthias