Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tpxfh-00EqBB-Lw for pgsql-hackers@arkaria.postgresql.org; Wed, 05 Mar 2025 22:57:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tpxff-008CK5-PT for pgsql-hackers@arkaria.postgresql.org; Wed, 05 Mar 2025 22:56:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tpxff-008CJu-8K for pgsql-hackers@lists.postgresql.org; Wed, 05 Mar 2025 22:56:59 +0000 Received: from mail-lf1-x129.google.com ([2a00:1450:4864:20::129]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tpxfd-001BkE-0P for pgsql-hackers@postgresql.org; Wed, 05 Mar 2025 22:56:58 +0000 Received: by mail-lf1-x129.google.com with SMTP id 2adb3069b0e04-5495078cd59so50141e87.1 for ; Wed, 05 Mar 2025 14:56:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741215415; x=1741820215; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Im5evkRapmXPvXn2KpP+sc71RQMpPeLkgotgk+/Kfns=; b=nr/EcBRnUuf1aN5yvrQAulRl7gqewZHt0ZRZH50P6w362RxatNoUlg4/IzCXlqdpj4 1yWLeAmJ7Dj1+IgvcPctxid8H+a05Vy2VUVLNn4V7+aDR0So+CPTcjbVeG/tV7O2j3IN LXfIsvN50Mnq5lApzIjWnt+qYAqyn6gWAbCObcgd41jwABg1+2ibfh+eIMAJpNxh5id6 LRTq8q3cp3CAC8PH8J/v7rmB5ESFjL37j078PUF5UxvjWObm2wgkf7HVtDId6m8RfSkD L/JT4VO9oKbdkDHasmjJgyzXeBBVY/+/ZhOtQNk7m9KfS5mviejwsfUxnWua5AD81h96 6ZcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741215415; x=1741820215; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Im5evkRapmXPvXn2KpP+sc71RQMpPeLkgotgk+/Kfns=; b=l2cLh7OJeT+6mzzsdgE2NyFsCcScGtJHYAw/rgCrVuKVOg+Jv0KzGcupOjkL6BzL0n Du9auRRcOOOhpx4/zC4uroMqIpE5szywrrzfGa0A4stRI8iZx5DYDbrxdGPSULa8eXEJ anoUfpGielfrvtb5nP+loL0v21wDkMnkQ1b6BRY9qx5XPSpNywV0xHN+MmIVTaLZMVvP 8PJ1G/F0/049X/WwSi3PeSd1vpJ4OY4FIbXn8Ru7XYrrx9rplE08DR6KWPPUtLnJXa4y 9NRT3w5pRlT//lF9g6cxLMjpOxNPrszAtSG8uzHcnaeloPzs1D18c2QvvGVUCvqP9Nw/ D8wQ== X-Gm-Message-State: AOJu0Yyffr9bqfypYa4Num5eT27oXH4PEh5EO0capc6wKayWeRjsZpir 4zNdSVFt1y9n7T3VYXTCXRduul5qlS+queO3TKPl+gsLbGmBBQKd+/dUpaPjSX/a3+RDMStaH6Y CAmLxgJbiJnUkKpR+AKXExZThVYu7ADAD X-Gm-Gg: ASbGncuiu4I/QKyehv+5IU+5vD0YW50UxX7iqkHEA3CjBlMYt8y8fBNzA+6IbDYrkLu a8SUH2Zi3zr2610f7w2d3pX+/N0XXO482k2TvtpV1XXx+8vH+KiJMoVK0W00qOQG1UaG+pRT92s tSNP2XFXyYODnSSmxCFt9HXrDtcA== X-Google-Smtp-Source: AGHT+IF01ajw79j89N4Oc4CvCVhd3XHDnw7I4lU4ACjGmh20BUROl1YoQpoM19QyNF8et1sud4CgmtOIkOemhZGiFW8= X-Received: by 2002:a05:6512:1113:b0:545:60b:f382 with SMTP id 2adb3069b0e04-5497d336412mr2036143e87.17.1741215415093; Wed, 05 Mar 2025 14:56:55 -0800 (PST) MIME-Version: 1.0 References: <78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com> <238EEE41-B206-4590-8C20-DA52C25A2291@amazon.com> In-Reply-To: <238EEE41-B206-4590-8C20-DA52C25A2291@amazon.com> From: Matthias van de Meent Date: Wed, 5 Mar 2025 23:56:42 +0100 X-Gm-Features: AQ5f1JpMY7JyR1QMlKh79XnVEXpEHb38dRd7pDI6Hj43xtkJmZTgPwY4U0p6RsA Message-ID: Subject: Re: Expanding HOT updates for expression and partial indexes To: "Burd, Greg" Cc: "pgsql-hackers@postgresql.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, Sorry for the delay. This is a reply for the mail thread up to 17 Feb, so it might be very out-of-date by now, in which case sorry for the noise. On Mon, 17 Feb 2025 at 20:54, Burd, Greg wrote: > On Feb 15, 2025, at 5:49=E2=80=AFAM, Matthias van de Meent wrote: > > > > In HEAD, we have a clear indication of which classes of indexes to > > update, with TU_UpdateIndexes. With this patch, we have to derive that > > from the (lack of) bits in the bitmap that might be output by the > > table_update procedure. > > Yes, but... that "clear indication" is lacking the ability to convey more= detailed information. It doesn't tell you which summarizing indexes reall= y need updating just that as a result of being on the HOT path all summariz= ing indexes require updates. Agreed that it's not great if you want to know about which indexes were meaningfully updated. I think that barring significant advances in performance of update checks, we can devise a way of transfering this info to the table_tuple_update caller once we get a need for more detailed information (e.g. this could be transfered through the IndexInfo* that's currently also used by index_unchanged_by_update). > > I think we can do with an additional parameter for which indexes would > > be updated (or store that info in the parameter which also will hold > > EState et al). I think it's cheaper that way, too - only when > > update_indexes could be TU_SUMMARIZING we might need the exact > > information for which indexes to insert new tuples into, and it only > > really needs to be sized to the number of summarizing indexes (usually > > small/nonexistent, but potentially huge). > > Okay, yes with this patch we need only concern ourselves with all, none, = or some subset of summarizing as before. I'll work on the opaque parameter= next iteration. Thanks! > > ----- > > > > I notice that ExecIndexesRequiringUpdates() does work on all indexes, > > rather than just indexes relevant to this exact phase of checking. I > > think that is a waste of time, so if we sort the indexes in order of > > [hotblocking without expressions, hotblocking with expressions, > > summarizing], then (with stored start/end indexes) we can save time in > > cases where there are comparatively few of the types we're not going > > to look at. > > If I plan on just having ExecIndexesRequiringUpdates() return a bool rath= er than a bitmap then sorting, or even just filtering the list of IndexInfo= to only include indexes with expressions, makes sense. That way the only = indexes in question in that function's loop will be those that may spoil th= e HOT path. When that list is length 0, we can skip the tests entirely. Yes, exactly. Though I'm not sure it should hit that path if the length is 0, as would mean we had expression indexes that matched the updated columns, but somehow none are in the list? > > You're extracting type info from the opclass, to use in > > datum_image_eq(). Couldn't you instead use the index relation's > > TupleDesc and its stored attribute information instead? That saves us > > from having to do further catalog lookups during execution. I'm also > > fairly sure that that information is supposed to be a more accurate > > representation of attributes' expression output types than the > > opclass' type information (though, they probably should match). > > I hadn't thought of that, I think it's a valid idea and I'll update accor= dingly. I think I understand what you are suggesting. Thanks, that change was exactly what I meant. > > > > ----- > > > > The operations applied in ExecIndexesRequiringUpdates partially > > duplicate those done in index_unchanged_by_update. Can we (partially) > > unify this, and pass which indexes were updated through the IndexInfo, > > rather than the current bitmap? > > I think I do that now, feel free to say otherwise. When the expression i= s checked in ExecIndexesExpressionsWereNotUpdated() I set: > > /* Shortcut index_unchanged_by_update(), we know the answer. */ = = indexInfo->ii_CheckedUnchanged =3D t= rue; = = indexInfo->ii_IndexUnchanged =3D !changed; > > That prevents duplicate effort in index_unchanged_by_update(). Exactly, yes. > > ----- > > > > I don't see a good reason to add IndexInfo to Relation, by way of > > rd_indexInfoList. It seems like an ad-hoc way of passing data around, > > and I don't think that's the right way. > > At one point I'd created a way to get this set via relcache, I will resur= rect that approach but I'm not sure it is what you were hinting at. AFAIK, we don't have IndexInfo in the relcaches currently. I'm very hesitant to add an executor node (!) subtype to catalog caches, as IndexInfos are also used to store temporary information about e.g. index tuple insertion state, which (if IndexInfo is stored in relcaches) would imply modifying relcache entries without any further locks, and I'm not sure that's at all an OK thing to do. > The current method avoids pulling a the lock on the index to build the l= ist, but doing that once in relcache isn't horrible. Maybe you were sugges= ting using that opaque struct to pass around the list of IndexInfo? Let me= know on this one if you had a specific idea. The swap I've made in v6 rea= lly just moves the IndexInfo list to a filtered list with a new name create= d in relcache. My main concern is the storage of executor nodes(!) directly in the relcache. I don't think we need that: We have relatively direct access to the right IndexInfo** in ResultRelInfo->ri_IndexRelationInfo, which I think should be sufficient for this purpose. (The relevant RRI is available in table_tuple_update caller ExecUpdateAct; and could be passed down by ExecSimpleRelationUpdate to simple_table_tuple_update, covering both (current, core) callers of table_tuple_update). That would then be passed down to the TableAM using an opaque pointer type; for example (names, file locations, exact layout all bikesheddable): /* tableam.h */ /* exact definition somewhere else, in e.g. an executor_internal.h */ typedef struct TU_UpdateIndexData TU_UpdateIndexData; table_tuple_update(..., TU_UpdateIndexData *idxupdate, ...) /* executor.h */ TU_UpdateIndexes UpdateDetermineChangedIndexes(TU_UpdateIndexData *idxupdate, TableTupleSlot *old, TableTupleSlot *new, bitmap *changed_atts, ...); /* executor_internal.h */ struct TU_UpdateIndexData { EState estate; IndexInfo **idxinfos; ... } ----- Looking at your later comments about RRI in patch v8, I think that would solve and clean up the way that you currently get access to the RRI and thus index set. Kind regards, Matthias van de Meent Neon (https://neon.tech)