Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rFtJr-00CeX1-VC for pgsql-hackers@arkaria.postgresql.org; Wed, 20 Dec 2023 09:56:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rFtJq-00Akte-MZ for pgsql-hackers@arkaria.postgresql.org; Wed, 20 Dec 2023 09:56:50 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rFtJq-00AktW-D0 for pgsql-hackers@lists.postgresql.org; Wed, 20 Dec 2023 09:56:50 +0000 Received: from mail-ej1-x62f.google.com ([2a00:1450:4864:20::62f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rFtJo-00BJNW-5D for pgsql-hackers@postgresql.org; Wed, 20 Dec 2023 09:56:49 +0000 Received: by mail-ej1-x62f.google.com with SMTP id a640c23a62f3a-a236d77dceeso307639566b.2 for ; Wed, 20 Dec 2023 01:56:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1703066206; x=1703671006; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=lE3dYjSPcjilh3pzQzY2Bs5DajLptIRuj+n1bINtwFk=; b=NPv/q7yUv610/PCdfQSEmJhGa0Ft65UkWycpw9HHwOsrkxo9NYkfQ8JIIlJsruslq7 0orFrILGR7UBSQsFAdf5FGXizrYXqki+xj14X6tS+sOyF792DY+vY6bbkxxmnSPOqiSj aKpYXwc7N1oDKVA6GCdjzXfdM26wAPoAbuYaprghDF6f/SNq3HLoV/ERwEFbq3RxZRn+ akDwLmPZzloG/ysZO6nRglRPAusTfimajPUmr0NK7H+alyCvJeDkQmHdQvd/Yq1EVS7e j4yRYs+9uNjdpWszu2wkVBVTB3qeAN/UUlRHMFCRJNAeKzZuwdZhpz7Bri2+cvDjEwBi HhXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703066206; x=1703671006; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lE3dYjSPcjilh3pzQzY2Bs5DajLptIRuj+n1bINtwFk=; b=Fu2woCA6i4fNjP8mmoc9KuWsEbZahBTYddcyGcJdQEzWOQcC1h8Bl6AV+pnzjG807H e/6wcWdBmlhPqaMtKUlEGCJ17OCTpjptSNuU/kQgsCTT7sU3C3cNiYy/SvWKOlC0GOY7 SyBdx1ooEES3b3jxEm0wFXMxvaCJV66O9W6oLDF3tauDsyZymU+yVzEJ+uWKrA7LuBsu lG61TbI8ciM/YqtxUHWs1iM24zz1RQ568m0UvET/dc6zJ7+xPwwNcyQv5yW9gL1Ho+KZ Dj3PXa76TK8uYZ/PtqoMZBUyleiNuETc4ryzF4rjzCHgJisX4/EiFn8BDkcwXLUUbNp0 +fNw== X-Gm-Message-State: AOJu0Yxxly9YnjlTygDio26gh8GY4eRTw4pAn4lT+FRpmgWkM3TdEAid ubqxYGues13M/OevfYMLSovkG8nmWsZKZ7WpuF8= X-Google-Smtp-Source: AGHT+IEVoQTNyLSCb6b9hi8RgHLswh3ttmvQYCfRo0MudelimwIHa7XDpLUwYhv5KnajybiWXbtNCRYFV+HIzVNTf6Q= X-Received: by 2002:a17:906:2242:b0:a1f:ad99:a8d8 with SMTP id 2-20020a170906224200b00a1fad99a8d8mr5504724ejr.113.1703066205854; Wed, 20 Dec 2023 01:56:45 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Michail Nikolaev Date: Wed, 20 Dec 2023 10:56:33 +0100 Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Matthias van de Meent Cc: PostgreSQL Hackers , Alvaro Herrera Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hello! > Also, feels like we may apply this to both phases (first and the second scans). > The original patch (1) was helping only to the second one (after call > to set_indexsafe_procflags). Oops, I was wrong here. The original version of the patch was also applied to both phases. > Note that the use of such expressions would be a violation of the > function's definition; it would depend on data from other tables which > makes the function behave like a STABLE function, as opposed to the > IMMUTABLE that is required for index expressions. So, I don't think we > should specially care about being correct for incorrectly marked > function definitions. Yes, but such cases could probably cause crashes also... So, I think it is better to check them for custom functions. But I still not sure - if such limitations still required for proposed optimization or not. > I just realised there is one issue with this design: We can't cheaply > reset the snapshot during the second table scan: > It is critically important that the second scan of R/CIC uses an index > contents summary (made with index_bulk_delete) that was created while > the current snapshot was already registered. > So, the "reset the snapshot every so often" trick cannot be applied in > phase 3 (the rescan), or we'd have to do an index_bulk_delete call > every time we reset the snapshot. Rescanning might be worth the cost > (e.g. when using BRIN), but that is very unlikely. Hm, I think it is still possible. We could just manually recheck the tuples we see to the snapshot currently used for the scan. If an "old" snapshot can see the tuple also (HeapTupleSatisfiesHistoricMVCC) then search for it in the index summary. > What do you mean by "new locking pattern"? We already keep an > ShareUpdateExclusiveLock on every heap table we're accessing during > R/CIC, and that should already prevent any concurrent VACUUM > operations, right? I was thinking not about "classical" locking, but about waiting for other backends by WaitForLockers(heaplocktag, ShareLock, true). But I think everything should be fine. Best regards, Michail.