Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rcZcQ-000qDe-Re for pgsql-hackers@arkaria.postgresql.org; Tue, 20 Feb 2024 23:33:47 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rcZcP-001qJA-44 for pgsql-hackers@arkaria.postgresql.org; Tue, 20 Feb 2024 23:33:45 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rcZcO-001qJ2-RQ for pgsql-hackers@lists.postgresql.org; Tue, 20 Feb 2024 23:33:45 +0000 Received: from mail-ed1-x52a.google.com ([2a00:1450:4864:20::52a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rcZcJ-0004vE-VJ for pgsql-hackers@postgresql.org; Tue, 20 Feb 2024 23:33:44 +0000 Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-564fc8f56c5so16759a12.1 for ; Tue, 20 Feb 2024 15:33:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1708472019; x=1709076819; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=bkjHVMY7/MqsbWaDGIw7hz+ThUW0EORInkUVkZAES4c=; b=ZGkgD+n087ddRUJJHA0bRErxFiR3lCWI8mRqPR/nRJBZ6rtY7xgfI9sk/jgrIfE9ux 93BDljJXJ+Rtdpv9g/FuUB4redYQ3Q3Ur7LlyPMMDTWZZCzlJqmBMQdDtanyUxLbHcZ0 yN5DpSSOqPlv+fpGsatjzrH24hD3Oyj317Mas2JxWafdN9PW9ZjBSxGqVOIX6wrTjWEN AmV73XxK4hMax7saKVYL+fe5XOCSb6ViVzgcJ+HHMyBKDr/yql1VE3MV/HDAzfcg/TGG TThiJ1D9fsBH/Kr/R3LhGwL5Qpfwxs99v0A3k0ShVl6mcJs2p/PydIvaiYEf6YSxeXYq 9JyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708472019; x=1709076819; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=bkjHVMY7/MqsbWaDGIw7hz+ThUW0EORInkUVkZAES4c=; b=kwn9QQ9cN+2CXoyLiOmlBYCkm4cJ1DQ2Ct/9mrnawojWWcIexHUd9+zUcfDy2X0j7F nyA2WuzQkMipty5qTIS5DW1VbbFLdMu/1cwinqHeK7pNi0HUQ1EPjg7uUzyo5aaPHcI9 Xdo+LCw6FHc4PusCXF67Hzwl3v9nUjMCFNEdQxYrK8MCRj7ZTh3aomTbAeIWLC0sy4Hw 6qC51BzwpIXc+8FHHSrbK8OgkW/e7cobqdbcl9Fs0hIkxEuir4JkiQlGyfOgVomtP2xL SuLVfC6BMjIuWOnc8vEnpDSvuz1/3KqgTwqdEKxAKmU6dsWXGcQESkNKnyPbPBgPeYYr ZJXQ== X-Forwarded-Encrypted: i=1; AJvYcCU+X8j4PRzTWv9G5BZpoXIsVzrD2qvld18kbxcMU43ryW2OYWVdOxp26zloDmODTNuO8NQNseZ98MkzXc7cwzWvU3A6cJzuoL7Kz916 X-Gm-Message-State: AOJu0YwcR7LXW508hX3njpP/dDb9e/PlJyyLXT1BM1+GgGo9ikjspj28 d20d7rgEBWyTyqsQeb+KVCsi5he+9hXJ/ffMcGa6vnS4fEqiH5tBpPXbyrm9yd2qj45hPM1Dim9 PMnAeRF7K8Oi/kJ5/bUcB88SGluA= X-Google-Smtp-Source: AGHT+IGhueRxOeIwSbz0dqGVQdufIHgLlw1QuceRJ1TaqoLBFZcptBUfauS6Jm2w6HArGYugqy4n7mXR1RPJqt6LYxE= X-Received: by 2002:a17:906:2350:b0:a3e:92b8:89c2 with SMTP id m16-20020a170906235000b00a3e92b889c2mr5403616eja.42.1708472018605; Tue, 20 Feb 2024 15:33:38 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Michail Nikolaev Date: Wed, 21 Feb 2024 00:33:26 +0100 Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Matthias van de Meent Cc: Melanie Plageman , PostgreSQL Hackers , Alvaro Herrera Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hello! > I think the best way for this to work would be an index method that > exclusively stores TIDs, and of which we can quickly determine new > tuples, too. I was thinking about something like GIN's format, but > using (generation number, tid) instead of ([colno, colvalue], tid) as > key data for the internal trees, and would be unlogged (because the > data wouldn't have to survive a crash) Yeah, this seems to be a reasonable approach, but there are some doubts related to it - it needs new index type as well as unlogged indexes to be introduced - this may make the patch too invasive to be merged. Also, some way to remove the index from the catalog in case of a crash may be required. A few more thoughts: * it is possible to go without generation number - we may provide a way to do some kind of fast index lookup (by TID) directly during the second table scan phase. * one more option is to maintain a Tuplesorts (instead of an index) with TIDs as changelog and merge with index snapshot after taking a new visibility snapshot. But it is not clear how to share the same Tuplesort with multiple inserting backends. * crazy idea - what is about to do the scan in the index we are building? We have tuple, so, we have all the data indexed in the index. We may try to do an index scan using that data to get all tuples and find the one with our TID :) Yes, in some cases it may be too bad because of the huge amount of TIDs we need to scan + also btree copies whole page despite we need single item. But some additional index method may help - feels like something related to uniqueness (but it is only in btree anyway). Thanks, Mikhail.