Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sbTnn-00Grp9-7c for pgsql-hackers@arkaria.postgresql.org; Tue, 06 Aug 2024 23:41:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sbTnl-003aYl-Jb for pgsql-hackers@arkaria.postgresql.org; Tue, 06 Aug 2024 23:41:13 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sbTnl-003aYc-90 for pgsql-hackers@lists.postgresql.org; Tue, 06 Aug 2024 23:41:13 +0000 Received: from mail-lj1-x236.google.com ([2a00:1450:4864:20::236]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sbTnf-003SID-Rd for pgsql-hackers@postgresql.org; Tue, 06 Aug 2024 23:41:12 +0000 Received: by mail-lj1-x236.google.com with SMTP id 38308e7fff4ca-2f035ae0ff1so3259581fa.0 for ; Tue, 06 Aug 2024 16:41:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722987666; x=1723592466; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=f5CHuh6RaK6VOF0JnbxyrboQbD+Zh73qYvRdSfATLoE=; b=TSKTwHVBR3zegRWwMzfXvVa+jyS2uA5brIlHwKPevTVnQdBaEKlwzY6ae+iPhgIbaQ M4PYOIay0ylDgZxFumRjAaFSAmwVsng8Jd3OorvKdR5pjV5gxGQPm0X4W9ZiYBIeW+HE 0DUqbw/MkkbdYA7/spT3g+MdY0HMh9xr5R8bfTgiumwyNAN42HzGLTa8c4fpt+tHNTzx C260UK9kGav6v+8L+vyZm7lTcDV9bmgGccLno7tp9gsVivbjelGP1PWWRqYYRLG132gJ wjyQleKQvJeNEVjaNBfZNdg14DtnS2O1vQWeK//LsZN3fTp79oTNff0IcqcUjbcvg3Hd wIMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722987666; x=1723592466; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=f5CHuh6RaK6VOF0JnbxyrboQbD+Zh73qYvRdSfATLoE=; b=W5I9D5CvANUCHTxwk2XHni7x845mlrgTiFVUzu/yBjb18dpSo2HDuetGb2msdq31Bk 25X/EUvKm0lVkQTA558Z7AB3s7CsiaYdnKu+XLdBI9SEhv5X838mjeZUcSJP++VFFslU +DhKrcQ3WaWXIYB9hHIUF0VYZQTkG6bEvigU43vNNMugBrIEcLTV+0u+X4EM13X38UW1 uHZIpCYDbpskbvOToWFrTa0RuJWAvksIgmzms0TdP/5aWQWni+9ssybbukFQdzsrnwgW SdQB9kROo9a/OqF/S3JRxbTEhHJ4W18bJCWOnfc8MDPzYF7BbRDfHoXAKbetub+lJj/f S3Dg== X-Gm-Message-State: AOJu0YwlreHid/n5DAA6cAmOXOqFL4GmzEXm8/bBp9cCvEFqizvooaOn pZscy/gY4j6xOn/672ibgpkp+BXSnsv5cjSXvIjCifz1hKKKE7w2+M0WqJvbO2iD4nQdrQtWDOq PtFfLgwtMJlqSsVy9G8WcTTKyTuTtOZ+h X-Google-Smtp-Source: AGHT+IGbHmaMlL9dxlN4O282Z4wxbiY+KisJN+sZTv28QjRoUDwdBgfbfOb+UCeFaq84HV01xbGag42MY22ckjhf2cU= X-Received: by 2002:a2e:b8c4:0:b0:2f0:1c57:625a with SMTP id 38308e7fff4ca-2f19536c7a7mr1012481fa.9.1722987665012; Tue, 06 Aug 2024 16:41:05 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Matthias van de Meent Date: Wed, 7 Aug 2024 01:40:51 +0200 Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Michail Nikolaev Cc: PostgreSQL Hackers , Andrey Borodin , Melanie Plageman Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, 11 Jun 2024 at 10:58, Michail Nikolaev wrote: > > Hello. > > I did the POC (1) of the method described in the previous email, and it looks promising. > > It doesn't block the VACUUM, indexes are built about 30% faster (22 mins vs 15 mins). That's a nice improvement. > Additional index is lightweight and does not produce any WAL. That doesn't seem to be what I see in the current patchset: https://github.com/postgres/postgres/compare/master...michail-nikolaev:postgres:new_index_concurrently_approach#diff-cc3cb8968cf833c4b8498ad2c561c786099c910515c4bf397ba853ae60aa2bf7R311 > I'll continue the more stress testing for a while. Also, I need to restructure the commits (my path was no direct) into some meaningful and reviewable patches. While waiting for this, here are some initial comments on the github diffs: - I notice you've added a new argument to heapam_index_build_range_scan. I think this could just as well be implemented by reading the indexInfo->ii_Concurrent field, as the values should be equivalent, right? - In heapam_index_build_range_scan, it seems like you're popping the snapshot and registering a new one while holding a tuple from heap_getnext(), thus while holding a page lock. I'm not so sure that's OK, expecially when catalogs are also involved (specifically for expression indexes, where functions could potentially be updated or dropped if we re-create the visibility snapshot) - In heapam_index_build_range_scan, you pop the snapshot before the returned heaptuple is processed and passed to the index-provided callback. I think that's incorrect, as it'll change the visibility of the returned tuple before it's passed to the index's callback. I think the snapshot manipulation is best added at the end of the loop, if we add it at all in that function. - The snapshot reset interval is quite high, at 500ms. Why did you configure it that low, and didn't you make this configurable? - You seem to be using WAL in the STIR index, while it doesn't seem that relevant for the use case of auxiliary indexes that won't return any data and are only used on the primary. It would imply that the data is being sent to replicas and more data being written than strictly necessary, which to me seems wasteful. - The locking in stirinsert can probably be improved significantly if we use things like atomic operations on STIR pages. We'd need an exclusive lock only for page initialization, while share locks are enough if the page's data is modified without WAL. That should improve concurrent insert performance significantly, as it would further reduce the length of the exclusively locked hot path. Kind regards, Matthias van de Meent Neon (https://neon.tech)