Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vnX4s-008Sbn-0K for pgsql-hackers@arkaria.postgresql.org; Wed, 04 Feb 2026 07:13:30 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vnX4q-009dzA-3B for pgsql-hackers@arkaria.postgresql.org; Wed, 04 Feb 2026 07:13:28 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vnX4q-009dz1-1z for pgsql-hackers@lists.postgresql.org; Wed, 04 Feb 2026 07:13:28 +0000 Received: from mail-ej1-x641.google.com ([2a00:1450:4864:20::641]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vnX4n-00000000zG3-1RVO for pgsql-hackers@postgresql.org; Wed, 04 Feb 2026 07:13:27 +0000 Received: by mail-ej1-x641.google.com with SMTP id a640c23a62f3a-b7cf4a975d2so937304266b.2 for ; Tue, 03 Feb 2026 23:13:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kleczek.org; s=google; t=1770189203; x=1770794003; darn=postgresql.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=zRYzoEqB2frEJVl9jlTwci0WmPz0r8yOxg2jiByHJcY=; b=IOg+HsXEGXfsOkPMf5Wt51KKE9GTQtaw/sNQxiFrjl2Jlm9+qSl5CHxlbiaHmh+FAu QncHOKG/oOlWZHGzCY1wVau4tKIq36nwRNgIGT/vniV85FKa6/LrDG3OuiNOpEaP1kZX qEbHiM0WeCHG+cGtcGVeGKLHkOSMGV0CaM46Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1770189203; x=1770794003; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=zRYzoEqB2frEJVl9jlTwci0WmPz0r8yOxg2jiByHJcY=; b=jC8EeLNMR0zAZk8fUvpH3IspiWLVQMdPzZWdSj3dvkZMg5YJSjLkH/4H+VySPy4Kfs ntOagkKMaDR/tQ9bp09JYZSMgaTIcpeEp4TB0rq3auKyT/7hoipPlhQIVe+5QJ5fr48F /bRrPipeSXPK5iyD1eLFuMRMIHkwU8LmwAAOf4KrqHgiWuKAMBMNSHcTBKXOwIFAzjX3 XXfKLNGx4wjkXtYZHM3z458Ljeh16vmTM8Yk5i+ggkpZTAETW4M6nSmgcKWU6Ck0CjPF dnBnL9Du4gHVf8fFjok/Wi9l3B8ucOq7x+qBZUpBaK1jM962DP/+rL0sIH/xWpt99VOp z3Rw== X-Forwarded-Encrypted: i=1; AJvYcCWsN1tl4oujmpT/9VobZqZzHUY6jKSKX1bH3LAtRY9zrqQmVXHlcHCP7AUFIqd1KQ6tFD9GdAEyp3GzGtnD@postgresql.org X-Gm-Message-State: AOJu0YwCVvWUUGu60WEGbJTUJzC2A6PPaBtyRDAJ54vtVgHK+aAz9JyS vwIqxPRPf+XBWrYkEujb2HyPiSNjSPxVr7zBUM8hy8OR8Mv7n8HCzsjDN7YPWM4MMt/p8om8yJa Ux+014iaj4PiC X-Gm-Gg: AZuq6aLfye3h9aTqAtksk90MieGKWJkvu0sKG73ezmvN/a6cpHd7QFKegwTRTQyY+HL BlFMpY2+qFP3tI7wdhTfF8W/ec3tDMVlWdF0gBZ7lkQsg4DZmQ+BOwrjZoMU/nXYkOefCJNLzzH wM3IbtBpAm7kA4KRb95Ios5og1yW5ARGhFwqqkWn5qqAeVC2vE06X/pD/PqBYRnwAQZuakIKPu2 HoAviS0VKNL2FwiJlkfIOApYW/BpFGt7T18CR0YxrnJFBEVk2MXD8CYCPOLZ/+g6o6c+BsN5r7K 9hwe4icgLko+aHljV6AoJTbQR/JkyAkSseVChDQt+ZHJ9PGMmp9ofa3zhdMlq6nIa49i/pem0tj 6RuxeRXehh0GUGTgDfcrruKaRmRESiSoBwV1TzeCObQG46yExRMt//wElr2cN7DGxrBtAEXyuGR T6OKf0p40DwFuka/SvJC1ykohfPOy8EMSZMIUvvtJv5xg= X-Received: by 2002:a17:907:d19:b0:b87:39d:2bb4 with SMTP id a640c23a62f3a-b8e9f6a0f53mr147749066b.59.1770189202967; Tue, 03 Feb 2026 23:13:22 -0800 (PST) Received: from smtpclient.apple (b15.kleczek.org. [185.20.172.182]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8e9fef0006sm96368066b.41.2026.02.03.23.13.22 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 Feb 2026 23:13:22 -0800 (PST) From: =?utf-8?Q?Micha=C5=82_K=C5=82eczek?= Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_949C9C92-0BD5-4F30-B1A1-79B3FAE58331" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.300.41.1.7\)) Subject: Re: New access method for b-tree. Date: Wed, 4 Feb 2026 08:13:11 +0100 In-Reply-To: Cc: Tomas Vondra , Alexandre Felipe , "pgsql-hackers@postgresql.org" To: Ants Aasma References: X-Mailer: Apple Mail (2.3864.300.41.1.7) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_949C9C92-0BD5-4F30-B1A1-79B3FAE58331 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On 3 Feb 2026, at 22:42, Ants Aasma wrote: >=20 > On Mon, 2 Feb 2026 at 01:54, Tomas Vondra wrote: >> I'm also wondering how common is the targeted query pattern? How = common >> it is to have an IN condition on the leading column in an index, and >> ORDER BY on the second one? >=20 > I have seen this pattern multiple times. My nickname for it is the > timeline view. Think of the social media timeline, showing posts from > all followed accounts in timestamp order, returned in reasonably sized > batches. The naive SQL query will have to scan all posts from all > followed accounts and pass them through a top-N sort. When the total > number of posts is much larger than the batch size this is much slower > than what is proposed here (assuming I understand it correctly) - > effectively equivalent to running N index scans through Merge Append. >=20 > My workarounds I have proposed users have been either to rewrite the > query as a UNION ALL of a set of single value prefix queries wrapped > in an order by limit. This gives the exact needed merge append plan > shape. But repeating the query N times can get unwieldy when the > number of values grows, so the fallback is: >=20 > SELECT * FROM unnest(:friends) id, LATERAL ( > SELECT * FROM posts > WHERE user_id =3D id > ORDER BY tstamp DESC LIMIT 100) > ORDER BY tstamp DESC LIMIT 100; >=20 > The downside of this formulation is that we still have to fetch a > batch worth of items from scans where we otherwise would have only had > to look at one index tuple. GIST can be used to handle this kind of queries as it supports multiple = sort orders. The only problem is that GIST does not support ORDER BY column. One possible workaround is [1] but as described there it does not play = well with partitioning. I=E2=80=99ve started drafting support for ORDER BY column in GIST - see = [2]. I think it would be easier to implement and maintain than a new IAM (but = I don=E2=80=99t have enough knowledge and experience to implement it = myself) [1] = https://www.postgresql.org/message-id/3FA1E0A9-8393-41F6-88BD-62EEEA1EC21F= %40kleczek.org [2] = https://www.postgresql.org/message-id/B2AC13F9-6655-4E27-BFD3-068844E5DC91= %40kleczek.org =E2=80=94 Kind regards, Michal= --Apple-Mail=_949C9C92-0BD5-4F30-B1A1-79B3FAE58331 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On 3 Feb 2026, at 22:42, Ants Aasma = <ants.aasma@cybertec.at> wrote:

On Mon, 2 Feb 2026 at = 01:54, Tomas Vondra <tomas@vondra.me> wrote:
I'm also wondering how common is the targeted query = pattern? How common
it is to have an IN condition on the leading = column in an index, and
ORDER BY on the second = one?

I have seen this pattern multiple times. My = nickname for it is the
timeline view. Think of the social media = timeline, showing posts from
all followed accounts in timestamp = order, returned in reasonably sized
batches. The naive SQL query will = have to scan all posts from all
followed accounts and pass them = through a top-N sort. When the total
number of posts is much larger = than the batch size this is much slower
than what is proposed here = (assuming I understand it correctly) -
effectively equivalent to = running N index scans through Merge = Append.

My= workarounds I have proposed users have been either to rewrite = the
query as a UNION ALL of a set of single value prefix queries = wrapped
in an order by limit. This gives the exact needed merge = append plan
shape. But repeating the query N times can get unwieldy = when the
number of values grows, so the fallback is:

SELECT * = FROM unnest(:friends) id, LATERAL (
   SELECT * FROM = posts
   WHERE user_id =3D id
=    ORDER BY tstamp DESC LIMIT 100)
ORDER BY tstamp = DESC LIMIT 100;

The downside of this formulation is that we still = have to fetch a
batch worth of items from scans where we otherwise = would have only had
to look at one index = tuple.

GIST can be used = to handle this kind of queries as it supports multiple sort = orders.
The only problem is that GIST does not support ORDER = BY column.
One possible workaround is [1] but as described = there it does not play well with partitioning.
I=E2=80=99ve = started drafting support for ORDER BY column in GIST - see = [2].
I think it would be easier to implement and maintain than = a new IAM (but I don=E2=80=99t have enough knowledge and experience to = implement it myself)


=E2=80=94
Ki= nd regards,
Michal
= --Apple-Mail=_949C9C92-0BD5-4F30-B1A1-79B3FAE58331--