public inbox for [email protected]  
help / color / mirror / Atom feed
From: Ayush Tiwari <[email protected]>
To: [email protected]
Subject: Re: tid_blockno() and tid_offset() accessor functions
Date: Sun, 8 Mar 2026 01:13:13 +0530
Message-ID: <CAJTYsWVK6rw++WU94=gFk5ZR5ZXhT3BYOFxhm6tT9xOGG-drnA@mail.gmail.com> (raw)
In-Reply-To: <CAJTYsWUzok2+mvSYkbVUwq_SWWg-GdHqCuYumN82AU97SjwjCA@mail.gmail.com>
References: <CAJTYsWUzok2+mvSYkbVUwq_SWWg-GdHqCuYumN82AU97SjwjCA@mail.gmail.com>

Hello,

Attaching a V2-patch post rebasing due to oid conflict with the latest main
branch. In addition to that changing the sql function name for tid block
number to tid_block and adding document related changes.

Please review and let me know your thoughts.

Regards,
Ayush

On Sat, 28 Feb 2026 at 00:29, Ayush Tiwari <[email protected]>
wrote:

> Hi hackers,
>
> As of now we don't have any built-in way to extract the block and offset
> components from a TID. When people need to group by page (like for bloat
> analysis) or filter by specific blocks, they usually end up using the
> `ctid::text::point` hack:
>
>     SELECT (ctid::text::point)[0]::bigint AS blockno,
>            (ctid::text::point)[1]::int    AS offset
>     FROM my_table;
>
> This works, but it's pretty clunky, relies on the text representation, and
> isn't great if you're trying to parse TIDs outside of SQL.
>
> The attached patch adds two simple accessor functions:
> - `tid_blockno(tid) -> bigint`
> - `tid_offset(tid) -> integer`
>
> A couple of quick notes on the implementation I went for:
> - `tid_blockno` returns `int8` since `BlockNumber` is `uint32` and could
> overflow `int4`.
> - `tid_offset` returns `int4` since `OffsetNumber` is `uint16`.
> - Both are marked leakproof and strict.
> - I used the `NoCheck` macros from `itemptr.h` so they safely handle
> user-supplied literals like `(0,0)`.
>
> Please let me know what you think!
>
> Regards,
> Ayush
>


Attachments:

  [application/octet-stream] v2-0001-Add-tid_block-and-tid_offset-accessor-functions.patch (11.1K, 3-v2-0001-Add-tid_block-and-tid_offset-accessor-functions.patch)
  download | inline diff:
From 92e3657d85b13355563ba4c447ddf89fcb4c4b3e Mon Sep 17 00:00:00 2001
From: Ayush Tiwari <[email protected]>
Date: Sat, 7 Mar 2026 18:27:36 +0000
Subject: [PATCH v2] Add tid_block() and tid_offset() accessor functions

Add two new built-in SQL functions to extract the components of a tid
(tuple identifier) value:

  tid_block(tid) -> bigint   -- extract block number
  tid_offset(tid)  -> integer  -- extract offset number

These provide a clean, efficient alternative to the common workaround
of ctid::text::point for decomposing TID values. The text-based hack
is fragile, inefficient, and unavailable outside of SQL contexts.

tid_block() returns int8 (bigint) because BlockNumber is uint32,
which exceeds the range of int4. tid_offset() returns int4 (integer)
because OffsetNumber is uint16, which fits safely in int4.

Both functions use the NoCheck accessor variants from itemptr.h,
are marked leakproof, and include regression tests covering typical
values, boundary conditions, NULL handling, and round-trip identity.
---
 doc/src/sgml/func/allfiles.sgml          |  1 +
 doc/src/sgml/func/func-tid.sgml          | 73 ++++++++++++++++++++++++
 doc/src/sgml/func/func.sgml              |  1 +
 src/backend/utils/adt/tid.c              | 31 ++++++++++
 src/include/catalog/pg_proc.dat          |  6 ++
 src/test/regress/expected/opr_sanity.out |  2 +
 src/test/regress/expected/tid.out        | 66 +++++++++++++++++++++
 src/test/regress/sql/tid.sql             | 21 +++++++
 8 files changed, 201 insertions(+)
 create mode 100644 doc/src/sgml/func/func-tid.sgml

diff --git a/doc/src/sgml/func/allfiles.sgml b/doc/src/sgml/func/allfiles.sgml
index ce11ef1d5d8..f5e3f008537 100644
--- a/doc/src/sgml/func/allfiles.sgml
+++ b/doc/src/sgml/func/allfiles.sgml
@@ -17,6 +17,7 @@ Complete list of usable sgml source files in this directory.
 <!ENTITY func-formatting            SYSTEM "func-formatting.sgml">
 <!ENTITY func-datetime              SYSTEM "func-datetime.sgml">
 <!ENTITY func-enum                  SYSTEM "func-enum.sgml">
+<!ENTITY func-tid                   SYSTEM "func-tid.sgml">
 <!ENTITY func-geometry              SYSTEM "func-geometry.sgml">
 <!ENTITY func-net                   SYSTEM "func-net.sgml">
 <!ENTITY func-textsearch            SYSTEM "func-textsearch.sgml">
diff --git a/doc/src/sgml/func/func-tid.sgml b/doc/src/sgml/func/func-tid.sgml
new file mode 100644
index 00000000000..6b3de14d093
--- /dev/null
+++ b/doc/src/sgml/func/func-tid.sgml
@@ -0,0 +1,73 @@
+ <sect1 id="functions-tid">
+  <title>TID Functions</title>
+
+  <indexterm zone="functions-tid">
+   <primary>TID</primary>
+   <secondary>functions</secondary>
+  </indexterm>
+
+  <indexterm>
+   <primary>tid_block</primary>
+  </indexterm>
+
+  <indexterm>
+   <primary>tid_offset</primary>
+  </indexterm>
+
+  <para>
+   For the <type>tid</type> data type (described in <xref
+   linkend="datatype-oid"/>), <xref linkend="functions-tid-table"/>
+   shows the functions available for extracting the block number and
+   tuple offset.  These functions are commonly used with the
+   <structfield>ctid</structfield> system column.
+  </para>
+
+  <table id="functions-tid-table">
+   <title><type>TID</type> Functions</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       Function
+      </para>
+      <para>
+       Description
+      </para>
+      <para>
+       Example(s)
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       <function>tid_block</function> ( <type>tid</type> )
+       <returnvalue>bigint</returnvalue>
+      </para>
+      <para>
+       Extracts the block number from a tuple identifier.
+      </para>
+      <para>
+       <literal>tid_block('(42,7)'::tid)</literal>
+       <returnvalue>42</returnvalue>
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       <function>tid_offset</function> ( <type>tid</type> )
+       <returnvalue>integer</returnvalue>
+      </para>
+      <para>
+       Extracts the tuple offset within the block from a tuple identifier.
+      </para>
+      <para>
+       <literal>tid_offset('(42,7)'::tid)</literal>
+       <returnvalue>7</returnvalue>
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect1>
\ No newline at end of file
diff --git a/doc/src/sgml/func/func.sgml b/doc/src/sgml/func/func.sgml
index f351ef53f63..3fd21568b2e 100644
--- a/doc/src/sgml/func/func.sgml
+++ b/doc/src/sgml/func/func.sgml
@@ -59,6 +59,7 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
 &func-formatting;
 &func-datetime;
 &func-enum;
+&func-tid;
 &func-geometry;
 &func-net;
 &func-textsearch;
diff --git a/src/backend/utils/adt/tid.c b/src/backend/utils/adt/tid.c
index 07248b69e57..d13c7cbe418 100644
--- a/src/backend/utils/adt/tid.c
+++ b/src/backend/utils/adt/tid.c
@@ -280,6 +280,37 @@ hashtidextended(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ *	TID accessor functions
+ */
+
+/*
+ * tid_block - extract the block number from a TID
+ *
+ * Returns int8 because BlockNumber is uint32, which exceeds the range of int4.
+ */
+Datum
+tid_block(PG_FUNCTION_ARGS)
+{
+	ItemPointer tid = PG_GETARG_ITEMPOINTER(0);
+
+	PG_RETURN_INT64((int64) ItemPointerGetBlockNumberNoCheck(tid));
+}
+
+/*
+ * tid_offset - extract the offset number from a TID
+ *
+ * Returns int4 because OffsetNumber is uint16, which fits in int4.
+ */
+Datum
+tid_offset(PG_FUNCTION_ARGS)
+{
+	ItemPointer tid = PG_GETARG_ITEMPOINTER(0);
+
+	PG_RETURN_INT32((int32) ItemPointerGetOffsetNumberNoCheck(tid));
+}
+
+
 /*
  *	Functions to get latest tid of a specified tuple.
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 361e2cfffeb..08267dd4f81 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -2750,6 +2750,12 @@
 { oid => '2796', descr => 'smaller of two',
   proname => 'tidsmaller', prorettype => 'tid', proargtypes => 'tid tid',
   prosrc => 'tidsmaller' },
+{ oid => '9951', descr => 'extract block number from tid',
+  proname => 'tid_block', proleakproof => 't', prorettype => 'int8',
+  proargtypes => 'tid', prosrc => 'tid_block' },
+{ oid => '9952', descr => 'extract offset number from tid',
+  proname => 'tid_offset', proleakproof => 't', prorettype => 'int4',
+  proargtypes => 'tid', prosrc => 'tid_offset' },
 { oid => '2233', descr => 'hash',
   proname => 'hashtid', prorettype => 'int4', proargtypes => 'tid',
   prosrc => 'hashtid' },
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 6ff4d7ee901..cfdc6b1a17a 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -887,6 +887,8 @@ oid8le(oid8,oid8)
 oid8gt(oid8,oid8)
 oid8ge(oid8,oid8)
 btoid8cmp(oid8,oid8)
+tid_block(tid)
+tid_offset(tid)
 -- Check that functions without argument are not marked as leakproof.
 SELECT p1.oid::regprocedure
 FROM pg_proc p1 JOIN pg_namespace pn
diff --git a/src/test/regress/expected/tid.out b/src/test/regress/expected/tid.out
index 083c83a1e1b..3497a77688b 100644
--- a/src/test/regress/expected/tid.out
+++ b/src/test/regress/expected/tid.out
@@ -42,6 +42,57 @@ SELECT * FROM pg_input_error_info('(0,-1)', 'tid');
  invalid input syntax for type tid: "(0,-1)" |        |      | 22P02
 (1 row)
 
+-- tests for tid_block() and tid_offset()
+SELECT tid_block('(0,0)'::tid), tid_offset('(0,0)'::tid);
+ tid_block | tid_offset 
+-----------+------------
+         0 |          0
+(1 row)
+
+SELECT tid_block('(0,1)'::tid), tid_offset('(0,1)'::tid);
+ tid_block | tid_offset 
+-----------+------------
+         0 |          1
+(1 row)
+
+SELECT tid_block('(42,7)'::tid), tid_offset('(42,7)'::tid);
+ tid_block | tid_offset 
+-----------+------------
+        42 |          7
+(1 row)
+
+-- max values: blockno uint32 max, offset uint16 max
+SELECT tid_block('(4294967295,65535)'::tid), tid_offset('(4294967295,65535)'::tid);
+ tid_block  | tid_offset 
+------------+------------
+ 4294967295 |      65535
+(1 row)
+
+-- (-1,0) wraps to blockno 4294967295
+SELECT tid_block('(-1,0)'::tid);
+ tid_block  
+------------
+ 4294967295
+(1 row)
+
+-- NULL handling (strict functions)
+SELECT tid_block(NULL::tid), tid_offset(NULL::tid);
+ tid_block | tid_offset 
+-----------+------------
+           |           
+(1 row)
+
+-- round-trip: blockno + offset reconstruct the original TID
+SELECT t, tid_block(t), tid_offset(t),
+       format('(%s,%s)', tid_block(t), tid_offset(t))::tid = t AS roundtrip_ok
+FROM (VALUES ('(0,0)'::tid), ('(1,42)'::tid), ('(4294967295,65535)'::tid)) AS v(t);
+         t          | tid_block  | tid_offset | roundtrip_ok 
+--------------------+------------+------------+--------------
+ (0,0)              |          0 |          0 | t
+ (1,42)             |          1 |         42 | t
+ (4294967295,65535) | 4294967295 |      65535 | t
+(3 rows)
+
 -- tests for functions related to TID handling
 CREATE TABLE tid_tab (a int);
 -- min() and max() for TIDs
@@ -58,6 +109,21 @@ SELECT max(ctid) FROM tid_tab;
  (0,2)
 (1 row)
 
+-- tid_block() and tid_offset() with real table ctid
+SELECT ctid, tid_block(ctid), tid_offset(ctid) FROM tid_tab;
+ ctid  | tid_block | tid_offset 
+-------+-----------+------------
+ (0,1) |         0 |          1
+ (0,2) |         0 |          2
+(2 rows)
+
+-- use in WHERE clause
+SELECT ctid FROM tid_tab WHERE tid_block(ctid) = 0 AND tid_offset(ctid) = 1;
+ ctid  
+-------
+ (0,1)
+(1 row)
+
 TRUNCATE tid_tab;
 -- Tests for currtid2() with various relation kinds
 -- Materialized view
diff --git a/src/test/regress/sql/tid.sql b/src/test/regress/sql/tid.sql
index 2602e20eb5a..c0a70be5cbd 100644
--- a/src/test/regress/sql/tid.sql
+++ b/src/test/regress/sql/tid.sql
@@ -16,6 +16,22 @@ SELECT pg_input_is_valid('(0,-1)', 'tid');
 SELECT * FROM pg_input_error_info('(0,-1)', 'tid');
 
 
+-- tests for tid_block() and tid_offset()
+SELECT tid_block('(0,0)'::tid), tid_offset('(0,0)'::tid);
+SELECT tid_block('(0,1)'::tid), tid_offset('(0,1)'::tid);
+SELECT tid_block('(42,7)'::tid), tid_offset('(42,7)'::tid);
+-- max values: blockno uint32 max, offset uint16 max
+SELECT tid_block('(4294967295,65535)'::tid), tid_offset('(4294967295,65535)'::tid);
+-- (-1,0) wraps to blockno 4294967295
+SELECT tid_block('(-1,0)'::tid);
+-- NULL handling (strict functions)
+SELECT tid_block(NULL::tid), tid_offset(NULL::tid);
+-- round-trip: blockno + offset reconstruct the original TID
+SELECT t, tid_block(t), tid_offset(t),
+       format('(%s,%s)', tid_block(t), tid_offset(t))::tid = t AS roundtrip_ok
+FROM (VALUES ('(0,0)'::tid), ('(1,42)'::tid), ('(4294967295,65535)'::tid)) AS v(t);
+
+
 -- tests for functions related to TID handling
 
 CREATE TABLE tid_tab (a int);
@@ -24,6 +40,11 @@ CREATE TABLE tid_tab (a int);
 INSERT INTO tid_tab VALUES (1), (2);
 SELECT min(ctid) FROM tid_tab;
 SELECT max(ctid) FROM tid_tab;
+
+-- tid_block() and tid_offset() with real table ctid
+SELECT ctid, tid_block(ctid), tid_offset(ctid) FROM tid_tab;
+-- use in WHERE clause
+SELECT ctid FROM tid_tab WHERE tid_block(ctid) = 0 AND tid_offset(ctid) = 1;
 TRUNCATE tid_tab;
 
 -- Tests for currtid2() with various relation kinds
-- 
2.43.0



view thread (18+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected]
  Subject: Re: tid_blockno() and tid_offset() accessor functions
  In-Reply-To: <CAJTYsWVK6rw++WU94=gFk5ZR5ZXhT3BYOFxhm6tT9xOGG-drnA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox