public inbox for [email protected]  
help / color / mirror / Atom feed
Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
6+ messages / 3 participants
[nested] [flat]

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-03 18:17  Andrey Borodin <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Andrey Borodin @ 2026-05-03 18:17 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>



> On 2 May 2026, at 00:41, Andres Freund <[email protected]> wrote:
> 
> This is checking (as you noted) !VARATT_IS_EXTENDED, whereas the
> bt_normalize_tuple() code is checking !VARATT_IS_COMPRESSED.
> 
> VARATT_IS_EXTENDED() will return true for short varlenas (because it's not a
> standard 4 byte uncompressed varlena), whereas VARATT_IS_COMPRESSED() will
> return false for a short varlena (since it's not compressed).
> 
> I didn't find other instanes of similar code that uses !VARATT_IS_COMPRESSED.

As far as I understand 

!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && !VARATT_IS_SHORT(DatumGetPointer(normalized[i]))

is exactly

!VARATT_IS_EXTENDED(DatumGetPointer(untoasted_values[i]))

Which is what was proposed in v2 patch. But later was changed to !VARATT_IS_COMPRESSED().
As I understood it was done to further strengthen normalization.

So the intent might be that short varatts need normalization in some cases. But we have no tests that show such a case.
I tried to build a problematic storage alternation like [0], but everything works nicely.

So I propose something in a line with attached patch.


Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/blob/master/contrib/amcheck/sql/check_btree.sql#L170-L172


Attachments:

  [application/octet-stream] 0001-amcheck-avoid-using-VARSIZE-for-VARATT_IS_SHORT.patch (1.4K, 2-0001-amcheck-avoid-using-VARSIZE-for-VARATT_IS_SHORT.patch)
  download | inline diff:
From 276482c76491b1aa18a729dd637342b374dcd7fa Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Sun, 3 May 2026 23:13:59 +0500
Subject: [PATCH] amcheck: avoid using VARSIZE() for VARATT_IS_SHORT()

---
 contrib/amcheck/verify_nbtree.c | 2 +-
 src/include/varatt.h            | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index b74ab5f7a05..a3394d88a4e 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -2888,7 +2888,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup)
 							ItemPointerGetBlockNumber(&(itup->t_tid)),
 							ItemPointerGetOffsetNumber(&(itup->t_tid)),
 							RelationGetRelationName(state->rel))));
-		else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) &&
+		else if (!VARATT_IS_EXTENDED(DatumGetPointer(normalized[i])) &&
 				 VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET &&
 				 (att->attstorage == TYPSTORAGE_EXTENDED ||
 				  att->attstorage == TYPSTORAGE_MAIN))
diff --git a/src/include/varatt.h b/src/include/varatt.h
index 000bdf33b92..321342cab67 100644
--- a/src/include/varatt.h
+++ b/src/include/varatt.h
@@ -297,6 +297,7 @@ typedef struct
 static inline Size
 VARSIZE(const void *PTR)
 {
+	Assert(VARATT_IS_4B(PTR));
 	return VARSIZE_4B(PTR);
 }
 
-- 
2.50.1 (Apple Git-155)



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-03 19:44  Alexander Korotkov <[email protected]>
  parent: Andrey Borodin <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Alexander Korotkov @ 2026-05-03 19:44 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Andres Freund <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>

On Sun, May 3, 2026 at 9:17 PM Andrey Borodin <[email protected]> wrote:
> > On 2 May 2026, at 00:41, Andres Freund <[email protected]> wrote:
> >
> > This is checking (as you noted) !VARATT_IS_EXTENDED, whereas the
> > bt_normalize_tuple() code is checking !VARATT_IS_COMPRESSED.
> >
> > VARATT_IS_EXTENDED() will return true for short varlenas (because it's not a
> > standard 4 byte uncompressed varlena), whereas VARATT_IS_COMPRESSED() will
> > return false for a short varlena (since it's not compressed).
> >
> > I didn't find other instanes of similar code that uses !VARATT_IS_COMPRESSED.
>
> As far as I understand
>
> !VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && !VARATT_IS_SHORT(DatumGetPointer(normalized[i]))
>
> is exactly
>
> !VARATT_IS_EXTENDED(DatumGetPointer(untoasted_values[i]))
>
> Which is what was proposed in v2 patch. But later was changed to !VARATT_IS_COMPRESSED().
> As I understood it was done to further strengthen normalization.
>
> So the intent might be that short varatts need normalization in some cases. But we have no tests that show such a case.
> I tried to build a problematic storage alternation like [0], but everything works nicely.
>
> So I propose something in a line with attached patch.
>
>
> Best regards, Andrey Borodin.
>
> [0] https://github.com/postgres/postgres/blob/master/contrib/amcheck/sql/check_btree.sql#L170-L172

AFAICS, this is correct.  However, I propose an alternative approach:
use VARSIZE_ANY().  This might be a bit slower, but looks more
intuitive to me.

------
Regards,
Alexander Korotkov
Supabase


Attachments:

  [application/octet-stream] verify_nbtree_varsize_fix.patch (686B, 2-verify_nbtree_varsize_fix.patch)
  download | inline diff:
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index b74ab5f7a05..eed26206f10 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -2889,7 +2889,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup)
 							ItemPointerGetOffsetNumber(&(itup->t_tid)),
 							RelationGetRelationName(state->rel))));
 		else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) &&
-				 VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET &&
+				 VARSIZE_ANY(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET &&
 				 (att->attstorage == TYPSTORAGE_EXTENDED ||
 				  att->attstorage == TYPSTORAGE_MAIN))
 		{


^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-04 05:20  Andrey Borodin <[email protected]>
  parent: Alexander Korotkov <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Andrey Borodin @ 2026-05-04 05:20 UTC (permalink / raw)
  To: Alexander Korotkov <[email protected]>; +Cc: Andres Freund <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>



> On 4 May 2026, at 00:44, Alexander Korotkov <[email protected]> wrote:
> 
> AFAICS, this is correct.  However, I propose an alternative approach:
> use VARSIZE_ANY().  This might be a bit slower, but looks more
> intuitive to me.
Works for me.
However, I'd like to note that (VARSIZE_1B() < TOAST_INDEX_TARGET) is
constantly true for 8Kb+ pages.


Best regards, Andrey Borodin.






^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-08 23:07  Michael Paquier <[email protected]>
  parent: Andrey Borodin <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Michael Paquier @ 2026-05-08 23:07 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Andres Freund <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>

On Mon, May 04, 2026 at 10:20:04AM +0500, Andrey Borodin wrote:
> However, I'd like to note that (VARSIZE_1B() < TOAST_INDEX_TARGET) is
> constantly true for 8Kb+ pages.

How much slower?  I cannot imagine that it matters much in this code
path, but you are getting me worried.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-12 09:17  Andrey Borodin <[email protected]>
  parent: Michael Paquier <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Andrey Borodin @ 2026-05-12 09:17 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Andres Freund <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>



> On 9 May 2026, at 04:07, Michael Paquier <[email protected]> wrote:
> 
> On Mon, May 04, 2026 at 10:20:04AM +0500, Andrey Borodin wrote:
>> However, I'd like to note that (VARSIZE_1B() < TOAST_INDEX_TARGET) is
>> constantly true for 8Kb+ pages.
> 
> How much slower?  I cannot imagine that it matters much in this code
> path, but you are getting me worried.


I think there will be no performance difference.

Change proposed by Alexander only prevents use of VARSIZE() against datum
that is VARSIZE_1B. AFAICS no actual behavior would change.

On some occasions we would have to normilize less tuples.


Best regards, Andrey Borodin.





^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum
@ 2026-05-12 10:22  Alexander Korotkov <[email protected]>
  parent: Andrey Borodin <[email protected]>
  0 siblings, 0 replies; 6+ messages in thread

From: Alexander Korotkov @ 2026-05-12 10:22 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Michael Paquier <[email protected]>; Andres Freund <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]>

On Tue, May 12, 2026 at 12:17 PM Andrey Borodin <[email protected]> wrote:
> > On 9 May 2026, at 04:07, Michael Paquier <[email protected]> wrote:
> >
> > On Mon, May 04, 2026 at 10:20:04AM +0500, Andrey Borodin wrote:
> >> However, I'd like to note that (VARSIZE_1B() < TOAST_INDEX_TARGET) is
> >> constantly true for 8Kb+ pages.
> >
> > How much slower?  I cannot imagine that it matters much in this code
> > path, but you are getting me worried.
>
>
> I think there will be no performance difference.
>
> Change proposed by Alexander only prevents use of VARSIZE() against datum
> that is VARSIZE_1B. AFAICS no actual behavior would change.
>
> On some occasions we would have to normilize less tuples.

Any objections if I push and backpatch this?

------
Regards,
Alexander Korotkov
Supabase






^ permalink  raw  reply  [nested|flat] 6+ messages in thread


end of thread, other threads:[~2026-05-12 10:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-05-03 18:17 Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum Andrey Borodin <[email protected]>
2026-05-03 19:44 ` Alexander Korotkov <[email protected]>
2026-05-04 05:20   ` Andrey Borodin <[email protected]>
2026-05-08 23:07     ` Michael Paquier <[email protected]>
2026-05-12 09:17       ` Andrey Borodin <[email protected]>
2026-05-12 10:22         ` Alexander Korotkov <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox