public inbox for [email protected]help / color / mirror / Atom feed
Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum 20+ messages / 7 participants [nested] [flat]
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-07 18:04 Andrey M. Borodin <[email protected]> 0 siblings, 2 replies; 20+ messages in thread From: Andrey M. Borodin @ 2024-01-07 18:04 UTC (permalink / raw) To: Michael Zhilin <[email protected]>; +Cc: pgsql-bugs; y sokolov <[email protected]>; Alexander Lakhin <[email protected]> > On 14 Dec 2023, at 21:18, Michael Zhilin <[email protected]> wrote: I've checked that: * bug is reproduced by the test in the patch * bug is fixed by the patch * fix seems idiomatic, similar to nearby code Patch needed a rebase, so please find attached rebased version. I did not change anything. I see that using a temp file in PG_ABS_SRCDIR is common approach. But still I want to ask, maybe can we develop some clever way to reproduce the bug without external file? Also, maybe nearby code would be slightly more readable, if normalized[i] was a local variable. And one last question about the line: char *data = palloc(len); what if data is somehow corrupted here... are there enough sanity checks that we won't palloc(-1) or something like that? Won't we memcpy() from some other memory when len is bogus? Besides this paranoid questions, I think that this patch is ready for committer. Thanks! Best regards, Andrey Borodin. Attachments: [application/octet-stream] v1rebased-0001-contrib-amcheck-must-support-different-hea.patch (6.2K, 2-v1rebased-0001-contrib-amcheck-must-support-different-hea.patch) download | inline diff: From f05a9ad52aa6552c9cae5dcc712a168e08c67d80 Mon Sep 17 00:00:00 2001 From: Michael Zhilin <[email protected]> Date: Thu, 14 Dec 2023 16:08:15 +0300 Subject: [PATCH v1rebased] contrib/amcheck: must support different header size of short varlena datum --- contrib/amcheck/expected/check_btree.out | 19 +++++++++++++ contrib/amcheck/sql/check_btree.sql | 17 +++++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 67 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f4..1ab1bd260c 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -1,3 +1,5 @@ +-- directory paths are passed to us in environment variables +\getenv abs_srcdir PG_ABS_SRCDIR CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -240,6 +242,22 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- +-- BUG: must support different header size of short varlena datum +-- +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_srcdir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +268,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb9..4daed676c2 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -1,3 +1,6 @@ +-- directory paths are passed to us in environment variables +\getenv abs_srcdir PG_ABS_SRCDIR + CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -148,6 +151,19 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- +-- BUG: must support different header size of short varlena datum +-- + +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_srcdir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +174,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 91caa53dd8..e7f01c2add 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2942,7 +2942,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2961,7 +2961,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * index without further processing, so an external varlena header * should never be encountered here */ + if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i]))) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), @@ -2984,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; } + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; + } + } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2997,6 +3019,10 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * creating normalized version of the tuple from uncompressed input datums * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. + * Also tuple had short varlena datums with 4B header. Actually there is no + * restriction with have heap tuple containing varlena datum with 4B header + * and corresponding index tuple containing varlena datum with 1B header. + * For fingerprinting let's convert heap tuple varlena datum to 1B format. * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. @@ -3006,7 +3032,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.37.1 (Apple Git-137.1) ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-07 18:33 Andrey M. Borodin <[email protected]> parent: Andrey M. Borodin <[email protected]> 1 sibling, 0 replies; 20+ messages in thread From: Andrey M. Borodin @ 2024-01-07 18:33 UTC (permalink / raw) To: Michael Zhilin <[email protected]>; +Cc: pgsql-bugs; y sokolov <[email protected]>; Alexander Lakhin <[email protected]> > On 7 Jan 2024, at 23:04, Andrey M. Borodin <[email protected]> wrote: > > I see that using a temp file in PG_ABS_SRCDIR is common approach. But still I want to ask, maybe can we develop some clever way to reproduce the bug without external file? BTW this stuff is causing problems in CFbot [0,1] COPY varlena_bug TO :'filename'; +ERROR: could not open file "/tmp/cirrus-ci-build/contrib/amcheck/results/varlena_bug.dmp" for writing: No such file or directory +HINT: COPY TO instructs the PostgreSQL server process to write a file. You may want a client-side facility such as psql's \copy. Best regards, Andrey Borodin. [0] https://api.cirrus-ci.com/v1/artifact/task/4880592609738752/testrun/build/testrun/amcheck/regress/re... [1] https://github.com/x4m/postgres_g/runs/20240081462 ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-07 19:00 Alexander Lakhin <[email protected]> parent: Andrey M. Borodin <[email protected]> 1 sibling, 1 reply; 20+ messages in thread From: Alexander Lakhin @ 2024-01-07 19:00 UTC (permalink / raw) To: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; +Cc: pgsql-bugs; y sokolov <[email protected]> Hello Andrey, 07.01.2024 21:04, Andrey M. Borodin wrote: >> On 14 Dec 2023, at 21:18, Michael Zhilin <[email protected]> wrote: > I've checked that: > * bug is reproduced by the test in the patch > * bug is fixed by the patch > * fix seems idiomatic, similar to nearby code > What is your opinion regarding similar failures, which are not addressed by the patch? Besides the case shown above, there is another one: CREATE TABLE tbl(i int4, t text); ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10); INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g; ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; CREATE EXTENSION amcheck; SELECT bt_index_check('tbl_idx', true); ERROR: heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx" HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error. Best regards, Alexander ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-08 14:34 Andrey M. Borodin <[email protected]> parent: Alexander Lakhin <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Andrey M. Borodin @ 2024-01-08 14:34 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Michael Zhilin <[email protected]>; pgsql-bugs; y sokolov <[email protected]> Hi Alexander! I think both cases are very interesting and deserve two separate bug items. > On 14 Dec 2023, at 22:17, Alexander Lakhin <[email protected]> wrote: > > > By changing the storage mode for a column, you can also get another error: > CREATE TABLE t(f1 text); > CREATE INDEX t_idx ON t(f1); > INSERT INTO t VALUES(repeat('1234567890', 1000)); > ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; > > CREATE EXTENSION amcheck; > SELECT bt_index_check('t_idx', true); > > ERROR: index row requires 10016 bytes, maximum size is 8191 > I think In this case we should warn user that index contains tuples with datums that are not insertable anymore. And abort heapallindexed. > On 8 Jan 2024, at 00:00, Alexander Lakhin <[email protected]> wrote: > > What is your opinion regarding similar failures, which are not addressed > by the patch? Besides the case shown above, there is another one: > CREATE TABLE tbl(i int4, t text); > ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; > CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10); > INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g; > ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; > > CREATE EXTENSION amcheck; > SELECT bt_index_check('tbl_idx', true); > > ERROR: heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx" > HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error. IMO In this case we should handle VARATT_IS_EXTENDED in bt_normalize_tuple(). BTW CI fails of the original patch ITT are related to the fact that COPY in\out file is created in PG_ABS_SRCDIR instead of PG_ABS_BUILDDIR. In an off-list conversation I recommended Michael to mimic tests regress/largeobject.sql. Though, there might be better ideas. Thanks! Best regards, Andrey Borodin. ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-09 17:59 Michael Zhilin <[email protected]> parent: Andrey M. Borodin <[email protected]> 0 siblings, 2 replies; 20+ messages in thread From: Michael Zhilin @ 2024-01-09 17:59 UTC (permalink / raw) To: Andrey M. Borodin <[email protected]>; Alexander Lakhin <[email protected]>; pgsql-bugs; +Cc: y sokolov <[email protected]> Hi, Thank you, Andrey, for review and advice! Here is rebased version (v2) of patch supposed to make CF bot happy. Best regards, Michael On 1/8/24 17:34, Andrey M. Borodin wrote: > Hi Alexander! > > I think both cases are very interesting and deserve two separate bug items. > >> On 14 Dec 2023, at 22:17, Alexander Lakhin<[email protected]> wrote: >> >> >> By changing the storage mode for a column, you can also get another error: >> CREATE TABLE t(f1 text); >> CREATE INDEX t_idx ON t(f1); >> INSERT INTO t VALUES(repeat('1234567890', 1000)); >> ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; >> >> CREATE EXTENSION amcheck; >> SELECT bt_index_check('t_idx', true); >> >> ERROR: index row requires 10016 bytes, maximum size is 8191 >> > I think In this case we should warn user that index contains tuples with datums that are not insertable anymore. And abort heapallindexed. > > >> On 8 Jan 2024, at 00:00, Alexander Lakhin<[email protected]> wrote: >> >> What is your opinion regarding similar failures, which are not addressed >> by the patch? Besides the case shown above, there is another one: >> CREATE TABLE tbl(i int4, t text); >> ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; >> CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10); >> INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g; >> ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; >> >> CREATE EXTENSION amcheck; >> SELECT bt_index_check('tbl_idx', true); >> >> ERROR: heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx" >> HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error. > IMO In this case we should handle VARATT_IS_EXTENDED in bt_normalize_tuple(). > > BTW CI fails of the original patch ITT are related to the fact that COPY in\out file is created in PG_ABS_SRCDIR instead of PG_ABS_BUILDDIR. In an off-list conversation I recommended Michael to mimic tests regress/largeobject.sql. Though, there might be better ideas. > > Thanks! > > > Best regards, Andrey Borodin. > -- Michael Zhilin Postgres Professional +7(925)3366270 https://www.postgrespro.ru Attachments: [text/x-patch] v2-0001-contrib-amcheck-must-support-different-header-siz.patch (6.2K, 3-v2-0001-contrib-amcheck-must-support-different-header-siz.patch) download | inline diff: From ae3850e64e75c1137acce1a20f276fa9a4cd0dbe Mon Sep 17 00:00:00 2001 From: Michael Zhilin <[email protected]> Date: Thu, 14 Dec 2023 16:08:15 +0300 Subject: [PATCH v2] contrib/amcheck: must support different header size of short varlena datum --- contrib/amcheck/expected/check_btree.out | 19 +++++++++++++ contrib/amcheck/sql/check_btree.sql | 17 +++++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 67 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f41..e37901c4028 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -1,3 +1,5 @@ +-- directory paths are passed to us in environment variables +\getenv abs_builddir PG_ABS_BUILDDIR CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -240,6 +242,22 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- +-- BUG: must support different header size of short varlena datum +-- +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +268,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb97..e67eb6352f8 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -1,3 +1,6 @@ +-- directory paths are passed to us in environment variables +\getenv abs_builddir PG_ABS_BUILDDIR + CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -148,6 +151,19 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- +-- BUG: must support different header size of short varlena datum +-- + +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +174,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 91caa53dd8b..e7f01c2addb 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2942,7 +2942,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2961,7 +2961,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * index without further processing, so an external varlena header * should never be encountered here */ + if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i]))) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), @@ -2984,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; } + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; + } + } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2997,6 +3019,10 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * creating normalized version of the tuple from uncompressed input datums * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. + * Also tuple had short varlena datums with 4B header. Actually there is no + * restriction with have heap tuple containing varlena datum with 4B header + * and corresponding index tuple containing varlena datum with 1B header. + * For fingerprinting let's convert heap tuple varlena datum to 1B format. * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. @@ -3006,7 +3032,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.43.0 ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-19 18:16 Andrey Borodin <[email protected]> parent: Michael Zhilin <[email protected]> 1 sibling, 1 reply; 20+ messages in thread From: Andrey Borodin @ 2024-01-19 18:16 UTC (permalink / raw) To: Michael Zhilin <[email protected]>; +Cc: Alexander Lakhin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> > On 9 Jan 2024, at 22:59, Michael Zhilin <[email protected]> wrote: > > Here is rebased version (v2) of patch supposed to make CF bot happy. I've marked CF item as ready for committer. > On 1/8/24 17:34, Andrey M. Borodin wrote: >> Hi Alexander! >> >> I think both cases are very interesting and deserve two separate bug items. Alexander, do you plan to provide fixes for bugs you discovered? Best regards, Andrey Borodin. ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-20 04:00 Alexander Lakhin <[email protected]> parent: Andrey Borodin <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Alexander Lakhin @ 2024-01-20 04:00 UTC (permalink / raw) To: Andrey Borodin <[email protected]>; Michael Zhilin <[email protected]>; +Cc: pgsql-bugs; Yura Sokolov <[email protected]> Hi Andrey, 19.01.2024 21:16, Andrey Borodin wrote: > I've marked CF item as ready for committer. >> On 1/8/24 17:34, Andrey M. Borodin wrote: >>> Hi Alexander! >>> >>> I think both cases are very interesting and deserve two separate bug items. > Alexander, do you plan to provide fixes for bugs you discovered? No, I don't have a concrete proposal how to fix those bugs. I'd thought that fixing the whole class of such anomalies, not only one case, is a good thing to do, but if it's too complicated, maybe other similar bugs could be put aside. Best regards, Alexander ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-23 00:00 jian he <[email protected]> parent: Michael Zhilin <[email protected]> 1 sibling, 0 replies; 20+ messages in thread From: jian he @ 2024-01-23 00:00 UTC (permalink / raw) To: Michael Zhilin <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Alexander Lakhin <[email protected]>; pgsql-bugs; y sokolov <[email protected]> On Wed, Jan 10, 2024 at 1:59 AM Michael Zhilin <[email protected]> wrote: > > Hi, > > Thank you, Andrey, for review and advice! > > Here is rebased version (v2) of patch supposed to make CF bot happy. Hi +-- +-- BUG: must support different header size of short varlena datum +-- + +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); you can simply replace +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; with COPY varlena_bug from stdin; x \. In the comments, adding the postgres link (https://postgr.es/m/[email protected]) would be great. ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-23 18:09 Andrey M. Borodin <[email protected]> parent: Alexander Lakhin <[email protected]> 0 siblings, 2 replies; 20+ messages in thread From: Andrey M. Borodin @ 2024-01-23 18:09 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> > On 20 Jan 2024, at 09:00, Alexander Lakhin <[email protected]> wrote: > >> Alexander, do you plan to provide fixes for bugs you discovered? > > No, I don't have a concrete proposal how to fix those bugs. I'd thought > that fixing the whole class of such anomalies, not only one case, is a good > thing to do, but if it's too complicated, maybe other similar bugs could be > put aside. PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think? I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian He worth incorporating into the fix. Best regards, Andrey Borodin. Attachments: [application/octet-stream] v3-0001-contrib-amcheck-must-support-different-header-siz.patch (6.2K, 2-v3-0001-contrib-amcheck-must-support-different-header-siz.patch) download | inline diff: From b514df4afa0c471cea90c84c3977cbc18c13fae4 Mon Sep 17 00:00:00 2001 From: Michael Zhilin <[email protected]> Date: Thu, 14 Dec 2023 16:08:15 +0300 Subject: [PATCH v3 1/3] contrib/amcheck: must support different header size of short varlena datum --- contrib/amcheck/expected/check_btree.out | 19 +++++++++++++ contrib/amcheck/sql/check_btree.sql | 17 +++++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 67 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f4..e37901c402 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -1,3 +1,5 @@ +-- directory paths are passed to us in environment variables +\getenv abs_builddir PG_ABS_BUILDDIR CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -240,6 +242,22 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- +-- BUG: must support different header size of short varlena datum +-- +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +268,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb9..e67eb6352f 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -1,3 +1,6 @@ +-- directory paths are passed to us in environment variables +\getenv abs_builddir PG_ABS_BUILDDIR + CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -148,6 +151,19 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- +-- BUG: must support different header size of short varlena datum +-- + +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +\set filename :abs_builddir '/results/varlena_bug.dmp' +COPY varlena_bug TO :'filename'; +COPY varlena_bug FROM :'filename'; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +174,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 91caa53dd8..e7f01c2add 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2942,7 +2942,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2961,7 +2961,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * index without further processing, so an external varlena header * should never be encountered here */ + if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i]))) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), @@ -2984,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; } + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; + } + } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2997,6 +3019,10 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * creating normalized version of the tuple from uncompressed input datums * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. + * Also tuple had short varlena datums with 4B header. Actually there is no + * restriction with have heap tuple containing varlena datum with 4B header + * and corresponding index tuple containing varlena datum with 1B header. + * For fingerprinting let's convert heap tuple varlena datum to 1B format. * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. @@ -3006,7 +3032,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.37.1 (Apple Git-137.1) [application/octet-stream] v3-0003-amcheck-avoid-failing-on-oversized-tuples.patch (6.3K, 3-v3-0003-amcheck-avoid-failing-on-oversized-tuples.patch) download | inline diff: From 01dd6155276ed97adcad3e5c55c064b7b8116bfd Mon Sep 17 00:00:00 2001 From: "Andrey M. Borodin" <[email protected]> Date: Tue, 23 Jan 2024 23:03:28 +0500 Subject: [PATCH v3 3/3] amcheck: avoid failing on oversized tuples Due to changes in toast policies, some heap tuples might become too big index tuples. This commit prevents ERRORs in this case, because this anomaly does not create dangerous conditions to DMS. However, the database cannot be dumped-restored, so we emit a NOTICE. Reported-by: Alexander Lakhin --- contrib/amcheck/expected/check_btree.out | 14 ++++++ contrib/amcheck/sql/check_btree.sql | 11 +++++ contrib/amcheck/verify_nbtree.c | 54 +++++++++++++++++++++--- 3 files changed, 73 insertions(+), 6 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 36b1ebe34c..91394bed2b 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -270,6 +270,20 @@ SELECT bt_index_check('tbl_idx', true); (1 row) +DROP TABLE tbl; +-- Check oversized datums that cannot be inserted into index +CREATE TABLE t(f1 text); +CREATE INDEX t_idx ON t(f1); +INSERT INTO t VALUES(repeat('1234567890', 1000)); +ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; +SELECT bt_index_check('t_idx', true); +NOTICE: Index contain tuples that cannot fit into index page, if toasted with current toast policy + bt_index_check +---------------- + +(1 row) + +DROP TABLE t; -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index 0a827edd1a..cdfbc8a141 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -172,6 +172,17 @@ INSERT INTO tbl VALUES (repeat('Test', 250)); ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; SELECT bt_index_check('tbl_idx', true); +DROP TABLE tbl; + +-- Check oversized datums that cannot be inserted into index + +CREATE TABLE t(f1 text); +CREATE INDEX t_idx ON t(f1); +INSERT INTO t VALUES(repeat('1234567890', 1000)); +ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; + +SELECT bt_index_check('t_idx', true); +DROP TABLE t; -- cleanup DROP TABLE bttest_a; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 6654b5afe7..f70e01191b 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -128,6 +128,11 @@ typedef struct BtreeCheckState bloom_filter *filter; /* Debug counter */ int64 heaptuplespresent; + /* + * During check we might find tuples that due to current TOAST policies + * should not reside in index, but still are there. + */ + bool has_oversized_tuples; } BtreeCheckState; /* @@ -1624,10 +1629,17 @@ bt_target_page_check(BtreeCheckState *state) logtuple = bt_posting_plain_tuple(itup, i); norm = bt_normalize_tuple(state, logtuple); - bloom_add_element(state->filter, (unsigned char *) norm, + if (norm == NULL) + { + if (!state->has_oversized_tuples) + elog(NOTICE, "Index contain tuples that cannot fit into index page, if toasted with current toast policy"); + state->has_oversized_tuples = true; + } + else + bloom_add_element(state->filter, (unsigned char *) norm, IndexTupleSize(norm)); /* Be tidy */ - if (norm != logtuple) + if (norm != logtuple && norm != NULL) pfree(norm); pfree(logtuple); } @@ -1635,10 +1647,17 @@ bt_target_page_check(BtreeCheckState *state) else { norm = bt_normalize_tuple(state, itup); - bloom_add_element(state->filter, (unsigned char *) norm, - IndexTupleSize(norm)); + if (norm == NULL) + { + if (!state->has_oversized_tuples) + elog(NOTICE, "Index contain tuples that cannot fit into index page, if toasted with current toast policy"); + state->has_oversized_tuples = true; + } + else + bloom_add_element(state->filter, (unsigned char *) norm, + IndexTupleSize(norm)); /* Be tidy */ - if (norm != itup) + if (norm != itup && norm != NULL) pfree(norm); } } @@ -2876,6 +2895,7 @@ bt_tuple_present_callback(Relation index, ItemPointer tid, Datum *values, BtreeCheckState *state = (BtreeCheckState *) checkstate; IndexTuple itup, norm; + bool miss_oversized_tuple = false; Assert(state->heapallindexed); @@ -2883,9 +2903,19 @@ bt_tuple_present_callback(Relation index, ItemPointer tid, Datum *values, itup = index_form_tuple(RelationGetDescr(index), values, isnull); itup->t_tid = *tid; norm = bt_normalize_tuple(state, itup); + if (norm == NULL) + { + if (state->has_oversized_tuples) + { + /* exempt this oversized tuple */ + state->heaptuplespresent++; + pfree(itup); + return; + } + } /* Probe Bloom filter -- tuple should be present */ - if (bloom_lacks_element(state->filter, (unsigned char *) norm, + if ((norm == NULL) || bloom_lacks_element(state->filter, (unsigned char *) norm, IndexTupleSize(norm))) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), @@ -2936,6 +2966,9 @@ bt_tuple_present_callback(Relation index, ItemPointer tid, Datum *values, * Caller does normalization for non-pivot tuples that have a posting list, * since dummy CREATE INDEX callback code generates new tuples with the same * normalized representation. + * + * If the tuple is exampt from checking due to has_oversized_tuples this function + * returns NULL. */ static IndexTuple bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) @@ -2947,6 +2980,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) bool formnewtup = false; IndexTuple reformed; int i; + Size data_size; /* Caller should only pass "logical" non-pivot tuples here */ Assert(!BTreeTupleIsPosting(itup) && !BTreeTupleIsPivot(itup)); @@ -3026,6 +3060,14 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) if (!formnewtup) return itup; + data_size = MAXALIGN(heap_compute_data_size(tupleDescriptor, + normalized, isnull) + + MAXALIGN(sizeof(IndexTupleData) + sizeof(IndexAttributeBitMapData))); + if ((data_size & INDEX_SIZE_MASK) != data_size) + { + return NULL; + } + /* * Hard case: Tuple had compressed varlena datums that necessitate * creating normalized version of the tuple from uncompressed input datums -- 2.37.1 (Apple Git-137.1) [application/octet-stream] v3-0002-amcheck-prevent-false-positives-from-extended-dat.patch (2.9K, 4-v3-0002-amcheck-prevent-false-positives-from-extended-dat.patch) download | inline diff: From e2ab4c0d3ad9777f70fccb21dd446e248cd5f52a Mon Sep 17 00:00:00 2001 From: "Andrey M. Borodin" <[email protected]> Date: Tue, 23 Jan 2024 21:57:02 +0500 Subject: [PATCH v3 2/3] amcheck: prevent false positives from extended datums Reported-by: Alexander Lakhin --- contrib/amcheck/expected/check_btree.out | 12 ++++++++++++ contrib/amcheck/sql/check_btree.sql | 9 +++++++++ contrib/amcheck/verify_nbtree.c | 12 ++++++++++++ 3 files changed, 33 insertions(+) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index e37901c402..36b1ebe34c 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -258,6 +258,18 @@ SELECT bt_index_check('varlena_bug_idx', true); (1 row) +-- Check extended varlena +CREATE TABLE tbl(t text); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; +CREATE INDEX tbl_idx ON tbl (t); +INSERT INTO tbl VALUES (repeat('Test', 250)); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; +SELECT bt_index_check('tbl_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index e67eb6352f..0a827edd1a 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -164,6 +164,15 @@ COPY varlena_bug FROM :'filename'; CREATE INDEX varlena_bug_idx on varlena_bug(v); SELECT bt_index_check('varlena_bug_idx', true); +-- Check extended varlena +CREATE TABLE tbl(t text); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; +CREATE INDEX tbl_idx ON tbl (t); +INSERT INTO tbl VALUES (repeat('Test', 250)); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; + +SELECT bt_index_check('tbl_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index e7f01c2add..6654b5afe7 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -23,6 +23,7 @@ */ #include "postgres.h" +#include "access/heaptoast.h" #include "access/htup_details.h" #include "access/nbtree.h" #include "access/table.h" @@ -2981,6 +2982,17 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) ItemPointerGetBlockNumber(&(itup->t_tid)), ItemPointerGetOffsetNumber(&(itup->t_tid)), RelationGetRelationName(state->rel)))); + else if (!VARATT_IS_EXTENDED(DatumGetPointer(normalized[i])) && + VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && + (att->attstorage == TYPSTORAGE_EXTENDED || + att->attstorage == TYPSTORAGE_MAIN)) + { + /* + * this attribute will be compressed by index_form_tuple(), + * this might be already done in heap, so force forming. + */ + formnewtup = true; + } else if (VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i]))) { formnewtup = true; -- 2.37.1 (Apple Git-137.1) ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-23 19:24 Michael Zhilin <[email protected]> parent: Andrey M. Borodin <[email protected]> 1 sibling, 0 replies; 20+ messages in thread From: Michael Zhilin @ 2024-01-23 19:24 UTC (permalink / raw) To: Andrey M. Borodin <[email protected]>; Alexander Lakhin <[email protected]>; [email protected]; +Cc: pgsql-bugs; Yura Sokolov <[email protected]> Hi, Thank you, Jian, for nice comments! PFA version with your recommendations. Andrey, I didn't yet check your patches, but at least compiler complains about added, but unused variable "miss_oversized_tuple". verify_nbtree.c:2898:7: warning: unused variable 'miss_oversized_tuple' [-Wunused-variable] bool miss_oversized_tuple = false; So patch has been updated to fix this warning. Attached v4, rebased version with Jian's comments & removed unused variable. Thanks, Michael. On 1/23/24 21:09, Andrey M. Borodin wrote: > >> On 20 Jan 2024, at 09:00, Alexander Lakhin<[email protected]> wrote: >> >>> Alexander, do you plan to provide fixes for bugs you discovered? >> No, I don't have a concrete proposal how to fix those bugs. I'd thought >> that fixing the whole class of such anomalies, not only one case, is a good >> thing to do, but if it's too complicated, maybe other similar bugs could be >> put aside. > PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think? > > I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian He worth incorporating into the fix. > > > Best regards, Andrey Borodin. > -- Michael Zhilin Postgres Professional +7(925)3366270 https://www.postgrespro.ru Attachments: [text/x-patch] v4-0001-contrib-amcheck-must-support-different-header-siz.patch (6.1K, 3-v4-0001-contrib-amcheck-must-support-different-header-siz.patch) download | inline diff: From 3cf43195e54439433854cb434a27e851b7ac6528 Mon Sep 17 00:00:00 2001 From: Michael Zhilin <[email protected]> Date: Thu, 14 Dec 2023 16:08:15 +0300 Subject: [PATCH v4 1/3] contrib/amcheck: must support different header size of short varlena datum --- contrib/amcheck/expected/check_btree.out | 17 +++++++++++ contrib/amcheck/sql/check_btree.sql | 16 +++++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 64 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f41..6ef7915b319 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -1,3 +1,4 @@ +-- directory paths are passed to us in environment variables CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -240,6 +241,21 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- +-- BUG: must support different header size of short varlena datum +-- https://postgr.es/m/[email protected] +-- +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +266,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb97..e17924b1549 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -1,3 +1,4 @@ +-- directory paths are passed to us in environment variables CREATE TABLE bttest_a(id int8); CREATE TABLE bttest_b(id int8); CREATE TABLE bttest_multi(id int8, data int8); @@ -148,6 +149,20 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- +-- BUG: must support different header size of short varlena datum +-- https://postgr.es/m/[email protected] +-- + +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +x +\. +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +173,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 91caa53dd8b..e7f01c2addb 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2942,7 +2942,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2961,7 +2961,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * index without further processing, so an external varlena header * should never be encountered here */ + if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i]))) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), @@ -2984,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; } + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; + } + } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2997,6 +3019,10 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * creating normalized version of the tuple from uncompressed input datums * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. + * Also tuple had short varlena datums with 4B header. Actually there is no + * restriction with have heap tuple containing varlena datum with 4B header + * and corresponding index tuple containing varlena datum with 1B header. + * For fingerprinting let's convert heap tuple varlena datum to 1B format. * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. @@ -3006,7 +3032,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.43.0 [text/x-patch] v4-0002-amcheck-prevent-false-positives-from-extended-dat.patch (2.9K, 4-v4-0002-amcheck-prevent-false-positives-from-extended-dat.patch) download | inline diff: From f231643a2e990d767d78766a2bbd073d30bc6e4d Mon Sep 17 00:00:00 2001 From: "Andrey M. Borodin" <[email protected]> Date: Tue, 23 Jan 2024 21:57:02 +0500 Subject: [PATCH v4 2/3] amcheck: prevent false positives from extended datums Reported-by: Alexander Lakhin --- contrib/amcheck/expected/check_btree.out | 12 ++++++++++++ contrib/amcheck/sql/check_btree.sql | 9 +++++++++ contrib/amcheck/verify_nbtree.c | 12 ++++++++++++ 3 files changed, 33 insertions(+) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 6ef7915b319..f8638582180 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -256,6 +256,18 @@ SELECT bt_index_check('varlena_bug_idx', true); (1 row) +-- Check extended varlena +CREATE TABLE tbl(t text); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; +CREATE INDEX tbl_idx ON tbl (t); +INSERT INTO tbl VALUES (repeat('Test', 250)); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; +SELECT bt_index_check('tbl_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index e17924b1549..5da49ea94fd 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -163,6 +163,15 @@ x CREATE INDEX varlena_bug_idx on varlena_bug(v); SELECT bt_index_check('varlena_bug_idx', true); +-- Check extended varlena +CREATE TABLE tbl(t text); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain; +CREATE INDEX tbl_idx ON tbl (t); +INSERT INTO tbl VALUES (repeat('Test', 250)); +ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; + +SELECT bt_index_check('tbl_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index e7f01c2addb..6654b5afe73 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -23,6 +23,7 @@ */ #include "postgres.h" +#include "access/heaptoast.h" #include "access/htup_details.h" #include "access/nbtree.h" #include "access/table.h" @@ -2981,6 +2982,17 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) ItemPointerGetBlockNumber(&(itup->t_tid)), ItemPointerGetOffsetNumber(&(itup->t_tid)), RelationGetRelationName(state->rel)))); + else if (!VARATT_IS_EXTENDED(DatumGetPointer(normalized[i])) && + VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && + (att->attstorage == TYPSTORAGE_EXTENDED || + att->attstorage == TYPSTORAGE_MAIN)) + { + /* + * this attribute will be compressed by index_form_tuple(), + * this might be already done in heap, so force forming. + */ + formnewtup = true; + } else if (VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i]))) { formnewtup = true; -- 2.43.0 [text/x-patch] v4-0003-amcheck-avoid-failing-on-oversized-tuples.patch (6.0K, 5-v4-0003-amcheck-avoid-failing-on-oversized-tuples.patch) download | inline diff: From aa611847e8fee0994b35cda881058cc74a7a5ab0 Mon Sep 17 00:00:00 2001 From: "Andrey M. Borodin" <[email protected]> Date: Tue, 23 Jan 2024 23:03:28 +0500 Subject: [PATCH v4 3/3] amcheck: avoid failing on oversized tuples Due to changes in toast policies, some heap tuples might become too big index tuples. This commit prevents ERRORs in this case, because this anomaly does not create dangerous conditions to DMS. However, the database cannot be dumped-restored, so we emit a NOTICE. Reported-by: Alexander Lakhin --- contrib/amcheck/expected/check_btree.out | 14 +++++++ contrib/amcheck/sql/check_btree.sql | 11 +++++ contrib/amcheck/verify_nbtree.c | 53 +++++++++++++++++++++--- 3 files changed, 72 insertions(+), 6 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index f8638582180..0ea2e3e1f38 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -268,6 +268,20 @@ SELECT bt_index_check('tbl_idx', true); (1 row) +DROP TABLE tbl; +-- Check oversized datums that cannot be inserted into index +CREATE TABLE t(f1 text); +CREATE INDEX t_idx ON t(f1); +INSERT INTO t VALUES(repeat('1234567890', 1000)); +ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; +SELECT bt_index_check('t_idx', true); +NOTICE: Index contain tuples that cannot fit into index page, if toasted with current toast policy + bt_index_check +---------------- + +(1 row) + +DROP TABLE t; -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index 5da49ea94fd..ceeb7258754 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -171,6 +171,17 @@ INSERT INTO tbl VALUES (repeat('Test', 250)); ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended; SELECT bt_index_check('tbl_idx', true); +DROP TABLE tbl; + +-- Check oversized datums that cannot be inserted into index + +CREATE TABLE t(f1 text); +CREATE INDEX t_idx ON t(f1); +INSERT INTO t VALUES(repeat('1234567890', 1000)); +ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain; + +SELECT bt_index_check('t_idx', true); +DROP TABLE t; -- cleanup DROP TABLE bttest_a; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 6654b5afe73..e30f162e0fb 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -128,6 +128,11 @@ typedef struct BtreeCheckState bloom_filter *filter; /* Debug counter */ int64 heaptuplespresent; + /* + * During check we might find tuples that due to current TOAST policies + * should not reside in index, but still are there. + */ + bool has_oversized_tuples; } BtreeCheckState; /* @@ -1624,10 +1629,17 @@ bt_target_page_check(BtreeCheckState *state) logtuple = bt_posting_plain_tuple(itup, i); norm = bt_normalize_tuple(state, logtuple); - bloom_add_element(state->filter, (unsigned char *) norm, + if (norm == NULL) + { + if (!state->has_oversized_tuples) + elog(NOTICE, "Index contain tuples that cannot fit into index page, if toasted with current toast policy"); + state->has_oversized_tuples = true; + } + else + bloom_add_element(state->filter, (unsigned char *) norm, IndexTupleSize(norm)); /* Be tidy */ - if (norm != logtuple) + if (norm != logtuple && norm != NULL) pfree(norm); pfree(logtuple); } @@ -1635,10 +1647,17 @@ bt_target_page_check(BtreeCheckState *state) else { norm = bt_normalize_tuple(state, itup); - bloom_add_element(state->filter, (unsigned char *) norm, - IndexTupleSize(norm)); + if (norm == NULL) + { + if (!state->has_oversized_tuples) + elog(NOTICE, "Index contain tuples that cannot fit into index page, if toasted with current toast policy"); + state->has_oversized_tuples = true; + } + else + bloom_add_element(state->filter, (unsigned char *) norm, + IndexTupleSize(norm)); /* Be tidy */ - if (norm != itup) + if (norm != itup && norm != NULL) pfree(norm); } } @@ -2883,9 +2902,19 @@ bt_tuple_present_callback(Relation index, ItemPointer tid, Datum *values, itup = index_form_tuple(RelationGetDescr(index), values, isnull); itup->t_tid = *tid; norm = bt_normalize_tuple(state, itup); + if (norm == NULL) + { + if (state->has_oversized_tuples) + { + /* exempt this oversized tuple */ + state->heaptuplespresent++; + pfree(itup); + return; + } + } /* Probe Bloom filter -- tuple should be present */ - if (bloom_lacks_element(state->filter, (unsigned char *) norm, + if ((norm == NULL) || bloom_lacks_element(state->filter, (unsigned char *) norm, IndexTupleSize(norm))) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), @@ -2936,6 +2965,9 @@ bt_tuple_present_callback(Relation index, ItemPointer tid, Datum *values, * Caller does normalization for non-pivot tuples that have a posting list, * since dummy CREATE INDEX callback code generates new tuples with the same * normalized representation. + * + * If the tuple is exampt from checking due to has_oversized_tuples this function + * returns NULL. */ static IndexTuple bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) @@ -2947,6 +2979,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) bool formnewtup = false; IndexTuple reformed; int i; + Size data_size; /* Caller should only pass "logical" non-pivot tuples here */ Assert(!BTreeTupleIsPosting(itup) && !BTreeTupleIsPivot(itup)); @@ -3026,6 +3059,14 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) if (!formnewtup) return itup; + data_size = MAXALIGN(heap_compute_data_size(tupleDescriptor, + normalized, isnull) + + MAXALIGN(sizeof(IndexTupleData) + sizeof(IndexAttributeBitMapData))); + if ((data_size & INDEX_SIZE_MASK) != data_size) + { + return NULL; + } + /* * Hard case: Tuple had compressed varlena datums that necessitate * creating normalized version of the tuple from uncompressed input datums -- 2.43.0 ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-01-26 07:00 Alexander Lakhin <[email protected]> parent: Andrey M. Borodin <[email protected]> 1 sibling, 1 reply; 20+ messages in thread From: Alexander Lakhin @ 2024-01-26 07:00 UTC (permalink / raw) To: Andrey M. Borodin <[email protected]>; +Cc: Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> Hi Andrey, 23.01.2024 21:09, Andrey M. Borodin wrote: > PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think? > > I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian He worth incorporating into the fix. > I''m confused by a NOTICE added, as it printed now even for cases, which worked before, for example: CREATE TABLE t(f1 text); CREATE INDEX idx ON t(f1); INSERT INTO t VALUES(repeat('1234567890', 1000)); SELECT bt_index_check('idx', true); NOTICE: Index contain tuples that cannot fit into index page, if toasted with current toast policy bt_index_check ---------------- (1 row) Best regards, Alexander ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-20 10:24 Alexander Korotkov <[email protected]> parent: Alexander Lakhin <[email protected]> 0 siblings, 2 replies; 20+ messages in thread From: Alexander Korotkov @ 2024-03-20 10:24 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> Hi! On Fri, Jan 26, 2024 at 9:00 AM Alexander Lakhin <[email protected]> wrote: > > 23.01.2024 21:09, Andrey M. Borodin wrote: > > PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think? > > > > I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian He worth incorporating into the fix. > > > > I''m confused by a NOTICE added, as it printed now even for cases, which > worked before, for example: > CREATE TABLE t(f1 text); > CREATE INDEX idx ON t(f1); > INSERT INTO t VALUES(repeat('1234567890', 1000)); > SELECT bt_index_check('idx', true); > NOTICE: Index contain tuples that cannot fit into index page, if toasted with current toast policy > bt_index_check > ---------------- > > (1 row) Right, the patch number 0003 looks wrong to me. It uses heap_compute_data_size() to compute the size of the future index tuple. But heap_compute_data_size() doesn't apply any compression on the attributes. I suggest instead we just need to have a version of index_form_tuple() that reports an oversized tuple in a return value rather than throwing the error. BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. ------ Regards, Alexander Korotkov ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-20 16:00 Alexander Lakhin <[email protected]> parent: Alexander Korotkov <[email protected]> 1 sibling, 1 reply; 20+ messages in thread From: Alexander Lakhin @ 2024-03-20 16:00 UTC (permalink / raw) To: Alexander Korotkov <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> Hi Alexander, 20.03.2024 13:24, Alexander Korotkov wrote: > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. Maybe these patches should be polished before committing: +-- directory paths are passed to us in environment variables looks like an irrelevant change (perhaps it was relevant in v1/v2, but that's not so now.) I'm also not sure about: +-- BUG: must support different header size of short varlena datum +-- https://postgr.es/m/[email protected] AFAICS, for most similar bug fixes, the bug report referenced in a commit message only (there is no such comment in 0002, either). I also suspect that the comment: * Also tuple had short varlena datums with 4B header. ... might looks incorrect for native English speakers. This patch also adds a couple of empty lines, which may be not needed. @@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * index without further processing, so an external varlena header * should never be encountered here */ > + if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i]))) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), --- } + /* ... + need_free[i] = true; + } > + } Best regards, Alexander ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-20 17:00 Alexander Korotkov <[email protected]> parent: Alexander Lakhin <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Alexander Korotkov @ 2024-03-20 17:00 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <[email protected]> wrote: > 20.03.2024 13:24, Alexander Korotkov wrote: > > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. > > Maybe these patches should be polished before committing: Yes, Alexander. Sorry, I forgot to mention I'm going to polish comments and commit messages before pushing anyway. I'll post it for your review later today. ------ Regards, Alexander Korotkov ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-21 05:46 Andrey M. Borodin <[email protected]> parent: Alexander Korotkov <[email protected]> 1 sibling, 0 replies; 20+ messages in thread From: Andrey M. Borodin @ 2024-03-21 05:46 UTC (permalink / raw) To: Alexander Korotkov <[email protected]>; +Cc: Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> > On 20 Mar 2024, at 15:24, Alexander Korotkov <[email protected]> wrote: > > Right, the patch number 0003 looks wrong to me. Indeed, step 0003 was added to the patchset after entry marked RfC. And you are right that it’s not ready yet, and, honestly, I do not know how to fix it properly. Let’s proceed with other bugs and patches in this thread, and for bug of the step 0003 we will create separate thread later. Thanks for working on this! Best regards, Andrey Borodin. ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-23 00:39 Alexander Korotkov <[email protected]> parent: Alexander Korotkov <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Alexander Korotkov @ 2024-03-23 00:39 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> On Wed, Mar 20, 2024 at 7:00 PM Alexander Korotkov <[email protected]> wrote: > On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <[email protected]> wrote: > > 20.03.2024 13:24, Alexander Korotkov wrote: > > > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. > > > > Maybe these patches should be polished before committing: > > Yes, Alexander. Sorry, I forgot to mention I'm going to polish > comments and commit messages before pushing anyway. I'll post it for > your review later today. There are revised versions of patches. Alexander, please, check them before I push. ------ Regards, Alexander Korotkov Attachments: [application/octet-stream] v5-0001-amcheck-Support-for-different-header-sizes-of-sho.patch (5.6K, 2-v5-0001-amcheck-Support-for-different-header-sizes-of-sho.patch) download | inline diff: From a5418f7c95d5153f298ef2c2fd9e37c830586b51 Mon Sep 17 00:00:00 2001 From: Alexander Korotkov <[email protected]> Date: Sat, 23 Mar 2024 02:36:31 +0200 Subject: [PATCH v5 1/2] amcheck: Support for different header sizes of short varlena datum In the heap, tuples may contain short varlena datum with both 1B header and 4B eaders. But the corresponding index tuple should always have such varlena's with 1B headers. So, for fingerprinting, we need to convert. Backpatch to all supported versions. Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru Author: Michael Zhilin Reviewed-by: Alexander Lakhin, Andrey Borodin, Jian He, Alexander Korotkov Backpatch-through: 12 --- contrib/amcheck/expected/check_btree.out | 13 +++++++++ contrib/amcheck/sql/check_btree.sql | 11 ++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 55 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f41..d87e7178866 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -240,6 +240,18 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- Check support of both 1B and 4B header sizes of short varlena datum +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +262,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb97..b37fff05078 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -148,6 +148,16 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- Check support of both 1B and 4B header sizes of short varlena datum +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +x +\. +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +168,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 1ef4cff88e8..e0dffd9bcca 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2943,7 +2943,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2962,7 +2962,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2985,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; + } + + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; } } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2999,6 +3020,11 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. * + * In the heap, tuples may contain short varlena datums with both 1B + * header and 4B headers. But the corresponding index tuple should always + * have such varlena's with 1B headers. So, if there is a short varlena + * with 4B header, we need to convert it for for fingerprinting. + * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. */ @@ -3007,7 +3033,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.39.3 (Apple Git-145) [application/octet-stream] v5-0002-amcheck-Normalize-index-tuples-containing-uncompr.patch (3.4K, 3-v5-0002-amcheck-Normalize-index-tuples-containing-uncompr.patch) download | inline diff: From f10c9a59e14c2976f6da6bb1e65cb2fff3bc098c Mon Sep 17 00:00:00 2001 From: Alexander Korotkov <[email protected]> Date: Sat, 23 Mar 2024 02:29:24 +0200 Subject: [PATCH v5 2/2] amcheck: Normalize index tuples containing uncompressed varlena It might happen that the varlena value wasn't compressed by index_form_tuple() due to current storage parameters. If compression is currently enabled, we need to compress such values to much index tuple coming from the heap. Backpatch to all supported versions. Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru Author: Andrey Borodin Reviewed-by: Alexander Lakhin, Michael Zhilin, Jian He, Alexander Korotkov Backpatch-through: 12 --- contrib/amcheck/expected/check_btree.out | 10 ++++++++++ contrib/amcheck/sql/check_btree.sql | 6 ++++++ contrib/amcheck/verify_nbtree.c | 13 +++++++++++++ 3 files changed, 29 insertions(+) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index d87e7178866..cf8284fe12e 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -252,6 +252,16 @@ SELECT bt_index_check('varlena_bug_idx', true); (1 row) +-- Also check that we compress varlena values, which were previously stored +-- uncompressed in index. +INSERT INTO varlena_bug VALUES (repeat('Test', 250)); +ALTER TABLE varlena_bug ALTER COLUMN v SET STORAGE extended; +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index b37fff05078..68bd71b064f 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -158,6 +158,12 @@ x CREATE INDEX varlena_bug_idx on varlena_bug(v); SELECT bt_index_check('varlena_bug_idx', true); +-- Also check that we compress varlena values, which were previously stored +-- uncompressed in index. +INSERT INTO varlena_bug VALUES (repeat('Test', 250)); +ALTER TABLE varlena_bug ALTER COLUMN v SET STORAGE extended; +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index e0dffd9bcca..f71f1854e0a 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -23,6 +23,7 @@ */ #include "postgres.h" +#include "access/heaptoast.h" #include "access/htup_details.h" #include "access/nbtree.h" #include "access/table.h" @@ -2981,6 +2982,18 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) ItemPointerGetBlockNumber(&(itup->t_tid)), ItemPointerGetOffsetNumber(&(itup->t_tid)), RelationGetRelationName(state->rel)))); + else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && + VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && + (att->attstorage == TYPSTORAGE_EXTENDED || + att->attstorage == TYPSTORAGE_MAIN)) + { + /* + * This value will be compressed by index_form_tuple() with the + * current storage settings. We may be here because this tuple + * was formed with different storage settings. So, force forming. + */ + formnewtup = true; + } else if (VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i]))) { formnewtup = true; -- 2.39.3 (Apple Git-145) ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2024-03-23 11:37 Alexander Korotkov <[email protected]> parent: Alexander Korotkov <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Alexander Korotkov @ 2024-03-23 11:37 UTC (permalink / raw) To: Alexander Lakhin <[email protected]>; +Cc: Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> On Sat, Mar 23, 2024 at 2:39 AM Alexander Korotkov <[email protected]> wrote: > On Wed, Mar 20, 2024 at 7:00 PM Alexander Korotkov <[email protected]> wrote: > > On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <[email protected]> wrote: > > > 20.03.2024 13:24, Alexander Korotkov wrote: > > > > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. > > > > > > Maybe these patches should be polished before committing: > > > > Yes, Alexander. Sorry, I forgot to mention I'm going to polish > > comments and commit messages before pushing anyway. I'll post it for > > your review later today. > > There are revised versions of patches. Alexander, please, check them > before I push. Fixed typo s/much/match/ in the commit message spotted by Andrey Borodin. ------ Regards, Alexander Korotkov Attachments: [application/octet-stream] v6-0001-amcheck-Support-for-different-header-sizes-of-sho.patch (5.6K, 2-v6-0001-amcheck-Support-for-different-header-sizes-of-sho.patch) download | inline diff: From a5418f7c95d5153f298ef2c2fd9e37c830586b51 Mon Sep 17 00:00:00 2001 From: Alexander Korotkov <[email protected]> Date: Sat, 23 Mar 2024 02:36:31 +0200 Subject: [PATCH v6 1/2] amcheck: Support for different header sizes of short varlena datum In the heap, tuples may contain short varlena datum with both 1B header and 4B eaders. But the corresponding index tuple should always have such varlena's with 1B headers. So, for fingerprinting, we need to convert. Backpatch to all supported versions. Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru Author: Michael Zhilin Reviewed-by: Alexander Lakhin, Andrey Borodin, Jian He, Alexander Korotkov Backpatch-through: 12 --- contrib/amcheck/expected/check_btree.out | 13 +++++++++ contrib/amcheck/sql/check_btree.sql | 11 ++++++++ contrib/amcheck/verify_nbtree.c | 36 ++++++++++++++++++++---- 3 files changed, 55 insertions(+), 5 deletions(-) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index 86b38d93f41..d87e7178866 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -240,6 +240,18 @@ SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, che (1 row) +-- Check support of both 1B and 4B header sizes of short varlena datum +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -250,3 +262,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index aa461f7fb97..b37fff05078 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -148,6 +148,16 @@ SELECT bt_index_check('bttest_unique_nulls_c_key', heapallindexed => true, check CREATE INDEX on bttest_unique_nulls (b,c); SELECT bt_index_check('bttest_unique_nulls_b_c_idx', heapallindexed => true, checkunique => true); +-- Check support of both 1B and 4B header sizes of short varlena datum +CREATE TABLE varlena_bug (v text); +ALTER TABLE varlena_bug ALTER column v SET storage plain; +INSERT INTO varlena_bug VALUES ('x'); +COPY varlena_bug from stdin; +x +\. +CREATE INDEX varlena_bug_idx on varlena_bug(v); +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; @@ -158,3 +168,4 @@ DROP FUNCTION ifun(int8); DROP TABLE bttest_unique_nulls; DROP OWNED BY regress_bttest_role; -- permissions DROP ROLE regress_bttest_role; +DROP TABLE varlena_bug; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 1ef4cff88e8..e0dffd9bcca 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -2943,7 +2943,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) TupleDesc tupleDescriptor = RelationGetDescr(state->rel); Datum normalized[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; - bool toast_free[INDEX_MAX_KEYS]; + bool need_free[INDEX_MAX_KEYS]; bool formnewtup = false; IndexTuple reformed; int i; @@ -2962,7 +2962,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) att = TupleDescAttr(tupleDescriptor, i); /* Assume untoasted/already normalized datum initially */ - toast_free[i] = false; + need_free[i] = false; normalized[i] = index_getattr(itup, att->attnum, tupleDescriptor, &isnull[i]); @@ -2985,11 +2985,32 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) { formnewtup = true; normalized[i] = PointerGetDatum(PG_DETOAST_DATUM(normalized[i])); - toast_free[i] = true; + need_free[i] = true; + } + + /* + * Short tuples may have 1B or 4B header. Convert 4B header of short + * tuples to 1B + */ + else if (VARATT_CAN_MAKE_SHORT(DatumGetPointer(normalized[i]))) + { + /* convert to short varlena */ + Size len = VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(normalized[i])); + char *data = palloc(len); + + SET_VARSIZE_SHORT(data, len); + memcpy(data + 1, VARDATA(DatumGetPointer(normalized[i])), len - 1); + + formnewtup = true; + normalized[i] = PointerGetDatum(data); + need_free[i] = true; } } - /* Easier case: Tuple has varlena datums, none of which are compressed */ + /* + * Easier case: Tuple has varlena datums, none of which are compressed or + * short with 4B header + */ if (!formnewtup) return itup; @@ -2999,6 +3020,11 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) * (normalized input datums). This is rather naive, but shouldn't be * necessary too often. * + * In the heap, tuples may contain short varlena datums with both 1B + * header and 4B headers. But the corresponding index tuple should always + * have such varlena's with 1B headers. So, if there is a short varlena + * with 4B header, we need to convert it for for fingerprinting. + * * Note that we rely on deterministic index_form_tuple() TOAST compression * of normalized input. */ @@ -3007,7 +3033,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) /* Cannot leak memory here */ for (i = 0; i < tupleDescriptor->natts; i++) - if (toast_free[i]) + if (need_free[i]) pfree(DatumGetPointer(normalized[i])); return reformed; -- 2.39.3 (Apple Git-145) [application/octet-stream] v6-0002-amcheck-Normalize-index-tuples-containing-uncompr.patch (3.4K, 3-v6-0002-amcheck-Normalize-index-tuples-containing-uncompr.patch) download | inline diff: From b626608a276bef4b4a2fb5df608c42f2be5595eb Mon Sep 17 00:00:00 2001 From: Alexander Korotkov <[email protected]> Date: Sat, 23 Mar 2024 02:29:24 +0200 Subject: [PATCH v6 2/2] amcheck: Normalize index tuples containing uncompressed varlena It might happen that the varlena value wasn't compressed by index_form_tuple() due to current storage parameters. If compression is currently enabled, we need to compress such values to match index tuple coming from the heap. Backpatch to all supported versions. Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru Author: Andrey Borodin Reviewed-by: Alexander Lakhin, Michael Zhilin, Jian He, Alexander Korotkov Backpatch-through: 12 --- contrib/amcheck/expected/check_btree.out | 10 ++++++++++ contrib/amcheck/sql/check_btree.sql | 6 ++++++ contrib/amcheck/verify_nbtree.c | 13 +++++++++++++ 3 files changed, 29 insertions(+) diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out index d87e7178866..cf8284fe12e 100644 --- a/contrib/amcheck/expected/check_btree.out +++ b/contrib/amcheck/expected/check_btree.out @@ -252,6 +252,16 @@ SELECT bt_index_check('varlena_bug_idx', true); (1 row) +-- Also check that we compress varlena values, which were previously stored +-- uncompressed in index. +INSERT INTO varlena_bug VALUES (repeat('Test', 250)); +ALTER TABLE varlena_bug ALTER COLUMN v SET STORAGE extended; +SELECT bt_index_check('varlena_bug_idx', true); + bt_index_check +---------------- + +(1 row) + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/sql/check_btree.sql b/contrib/amcheck/sql/check_btree.sql index b37fff05078..68bd71b064f 100644 --- a/contrib/amcheck/sql/check_btree.sql +++ b/contrib/amcheck/sql/check_btree.sql @@ -158,6 +158,12 @@ x CREATE INDEX varlena_bug_idx on varlena_bug(v); SELECT bt_index_check('varlena_bug_idx', true); +-- Also check that we compress varlena values, which were previously stored +-- uncompressed in index. +INSERT INTO varlena_bug VALUES (repeat('Test', 250)); +ALTER TABLE varlena_bug ALTER COLUMN v SET STORAGE extended; +SELECT bt_index_check('varlena_bug_idx', true); + -- cleanup DROP TABLE bttest_a; DROP TABLE bttest_b; diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index e0dffd9bcca..f71f1854e0a 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -23,6 +23,7 @@ */ #include "postgres.h" +#include "access/heaptoast.h" #include "access/htup_details.h" #include "access/nbtree.h" #include "access/table.h" @@ -2981,6 +2982,18 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) ItemPointerGetBlockNumber(&(itup->t_tid)), ItemPointerGetOffsetNumber(&(itup->t_tid)), RelationGetRelationName(state->rel)))); + else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && + VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && + (att->attstorage == TYPSTORAGE_EXTENDED || + att->attstorage == TYPSTORAGE_MAIN)) + { + /* + * This value will be compressed by index_form_tuple() with the + * current storage settings. We may be here because this tuple + * was formed with different storage settings. So, force forming. + */ + formnewtup = true; + } else if (VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i]))) { formnewtup = true; -- 2.39.3 (Apple Git-145) ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2026-05-01 17:11 Andres Freund <[email protected]> parent: Alexander Korotkov <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Andres Freund @ 2026-05-01 17:11 UTC (permalink / raw) To: Alexander Korotkov <[email protected]>; +Cc: Alexander Lakhin <[email protected]>; Andrey M. Borodin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> Hi, On 2024-03-23 13:37:04 +0200, Alexander Korotkov wrote: > @@ -2981,6 +2982,18 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) > ItemPointerGetBlockNumber(&(itup->t_tid)), > ItemPointerGetOffsetNumber(&(itup->t_tid)), > RelationGetRelationName(state->rel)))); > + else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && > + VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && > + (att->attstorage == TYPSTORAGE_EXTENDED || > + att->attstorage == TYPSTORAGE_MAIN)) > + { > + /* > + * This value will be compressed by index_form_tuple() with the > + * current storage settings. We may be here because this tuple > + * was formed with different storage settings. So, force forming. > + */ > + formnewtup = true; > + } > else if (VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i]))) > { > formnewtup = true; While hacking on something, I added an assertion to VARSIZE() that the argument is actually a VARATT_4B (which it assumes). Worked everywhere, except for this caller: amcheck/regress fails, because sometimes the varlena is actually a short/1B varlena. Note that VARSIZE_4B on a short datum will give you completely bogus answers. E.g. in the case that failed the assertion, VARSIZE_1B() is 2, but VARSIZE_4B(PTR) is 7681. diff --git i/contrib/amcheck/verify_nbtree.c w/contrib/amcheck/verify_nbtree.c index b74ab5f7a05..0b87109fde3 100644 --- i/contrib/amcheck/verify_nbtree.c +++ w/contrib/amcheck/verify_nbtree.c @@ -2889,6 +2889,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup) ItemPointerGetOffsetNumber(&(itup->t_tid)), RelationGetRelationName(state->rel)))); else if (!VARATT_IS_COMPRESSED(DatumGetPointer(normalized[i])) && + !VARATT_IS_SHORT(DatumGetPointer(normalized[i])) && VARSIZE(DatumGetPointer(normalized[i])) > TOAST_INDEX_TARGET && (att->attstorage == TYPSTORAGE_EXTENDED || att->attstorage == TYPSTORAGE_MAIN)) Fixes the assert failure. I guess I find it aesthetically a bit unpleasing to check the same stuff so many times for one varlena :). Not that it should matter performance-wise here... Greetings, Andres Freund ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2026-05-01 18:06 Andrey Borodin <[email protected]> parent: Andres Freund <[email protected]> 0 siblings, 1 reply; 20+ messages in thread From: Andrey Borodin @ 2026-05-01 18:06 UTC (permalink / raw) To: Andres Freund <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> > On 1 May 2026, at 22:11, Andres Freund <[email protected]> wrote: > > While hacking on something, I added an assertion to VARSIZE() that the > argument is actually a VARATT_4B (which it assumes). Worked everywhere, except > for this caller: amcheck/regress fails, because sometimes the varlena is > actually a short/1B varlena. > > Note that VARSIZE_4B on a short datum will give you completely bogus > answers. E.g. in the case that failed the assertion, VARSIZE_1B() is 2, but > VARSIZE_4B(PTR) is 7681. I remember the original code was taken from somewhere else because there was already some instances like this: /* * If value is above size target, and is of a compressible datatype, * try to compress it in-line. */ if (!VARATT_IS_EXTENDED(DatumGetPointer(untoasted_values[i])) && VARSIZE(DatumGetPointer(untoasted_values[i])) > TOAST_INDEX_TARGET && (att->attstorage == TYPSTORAGE_EXTENDED || att->attstorage == TYPSTORAGE_MAIN)) { I don't have VARATT_IS_EXTENDED vs VARATT_IS_COMPRESSED vs VARATT_IS_SHORT business in my warm cache right away, but I'll try to remember what it means soon. Best regards, Andrey Borodin. ^ permalink raw reply [nested|flat] 20+ messages in thread
* Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum @ 2026-05-01 19:41 Andres Freund <[email protected]> parent: Andrey Borodin <[email protected]> 0 siblings, 0 replies; 20+ messages in thread From: Andres Freund @ 2026-05-01 19:41 UTC (permalink / raw) To: Andrey Borodin <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; Alexander Lakhin <[email protected]>; Michael Zhilin <[email protected]>; pgsql-bugs; Yura Sokolov <[email protected]> Hi, On 2026-05-01 23:06:49 +0500, Andrey Borodin wrote: > > On 1 May 2026, at 22:11, Andres Freund <[email protected]> wrote: > > > > While hacking on something, I added an assertion to VARSIZE() that the > > argument is actually a VARATT_4B (which it assumes). Worked everywhere, except > > for this caller: amcheck/regress fails, because sometimes the varlena is > > actually a short/1B varlena. > > > > Note that VARSIZE_4B on a short datum will give you completely bogus > > answers. E.g. in the case that failed the assertion, VARSIZE_1B() is 2, but > > VARSIZE_4B(PTR) is 7681. > > I remember the original code was taken from somewhere else because there > was already some instances like this: > /* > * If value is above size target, and is of a compressible datatype, > * try to compress it in-line. > */ > if (!VARATT_IS_EXTENDED(DatumGetPointer(untoasted_values[i])) && > VARSIZE(DatumGetPointer(untoasted_values[i])) > TOAST_INDEX_TARGET && > (att->attstorage == TYPSTORAGE_EXTENDED || > att->attstorage == TYPSTORAGE_MAIN)) > { > > I don't have VARATT_IS_EXTENDED vs VARATT_IS_COMPRESSED vs VARATT_IS_SHORT > business in my warm cache right away, but I'll try to remember what it means soon. This is checking (as you noted) !VARATT_IS_EXTENDED, whereas the bt_normalize_tuple() code is checking !VARATT_IS_COMPRESSED. VARATT_IS_EXTENDED() will return true for short varlenas (because it's not a standard 4 byte uncompressed varlena), whereas VARATT_IS_COMPRESSED() will return false for a short varlena (since it's not compressed). I didn't find other instanes of similar code that uses !VARATT_IS_COMPRESSED. Greetings, Andres Freund ^ permalink raw reply [nested|flat] 20+ messages in thread
end of thread, other threads:[~2026-05-01 19:41 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-01-07 18:04 Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum Andrey M. Borodin <[email protected]> 2024-01-07 18:33 ` Andrey M. Borodin <[email protected]> 2024-01-07 19:00 ` Alexander Lakhin <[email protected]> 2024-01-08 14:34 ` Andrey M. Borodin <[email protected]> 2024-01-09 17:59 ` Michael Zhilin <[email protected]> 2024-01-19 18:16 ` Andrey Borodin <[email protected]> 2024-01-20 04:00 ` Alexander Lakhin <[email protected]> 2024-01-23 18:09 ` Andrey M. Borodin <[email protected]> 2024-01-23 19:24 ` Michael Zhilin <[email protected]> 2024-01-26 07:00 ` Alexander Lakhin <[email protected]> 2024-03-20 10:24 ` Alexander Korotkov <[email protected]> 2024-03-20 16:00 ` Alexander Lakhin <[email protected]> 2024-03-20 17:00 ` Alexander Korotkov <[email protected]> 2024-03-23 00:39 ` Alexander Korotkov <[email protected]> 2024-03-23 11:37 ` Alexander Korotkov <[email protected]> 2026-05-01 17:11 ` Andres Freund <[email protected]> 2026-05-01 18:06 ` Andrey Borodin <[email protected]> 2026-05-01 19:41 ` Andres Freund <[email protected]> 2024-03-21 05:46 ` Andrey M. Borodin <[email protected]> 2024-01-23 00:00 ` jian he <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox