public inbox for [email protected]  
help / color / mirror / Atom feed
[PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
10+ messages / 6 participants
[nested] [flat]

* [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-16 06:13  SATYANARAYANA NARLAPURAM <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: SATYANARAYANA NARLAPURAM @ 2026-04-16 06:13 UTC (permalink / raw)
  To: PostgreSQL Hackers <[email protected]>; Álvaro Herrera <[email protected]>; [email protected]

Hi hackers,

restore_tuple() in repack.c uses SET_VARSIZE() to reconstruct the varlena
header when
reading back external attributes from the spill file. In this process,
looks like the flag
SET_VARSIZE_COMPRESSED is silently lost. Because of this, when REPACK
CONCURRENTLY
run  any concurrently updated column whose value was TOAST-compressed ends
up with raw
compressed bytes behind an "uncompressed" header returning garbled data on
subsequent reads.
It appears that existing tests are using random chars which are
uncompressable.

Please find the
attached 0001-Fix-restore_tuple-losing-varlena-compression-flag.patch to
fix this.
Additionally I updated the existing repack_toast test to include the
scenario I was talking about.

Thanks,
Satya


Attachments:

  [application/octet-stream] 0001-Fix-restore_tuple-losing-varlena-compression-flag.patch (654B, 3-0001-Fix-restore_tuple-losing-varlena-compression-flag.patch)
  download | inline diff:
diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index 58e3867..e6b6fea 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -2727,7 +2727,10 @@ restore_tuple(BufFile *file, Relation relation, TupleTableSlot *slot)
 			varlensz = VARSIZE_ANY(&chunk_header);
 
 			value = palloc(varlensz);
-			SET_VARSIZE(value, VARSIZE_ANY(&chunk_header));
+			if (VARATT_IS_4B_C(&chunk_header))
+				SET_VARSIZE_COMPRESSED(value, varlensz);
+			else
+				SET_VARSIZE(value, varlensz);
 			BufFileReadExact(file, (char *) value + VARHDRSZ, varlensz - VARHDRSZ);
 
 			slot->tts_values[i] = PointerGetDatum(value);


  [application/octet-stream] 0002-Add-compressed-TOAST-test-to-repack_toast.patch (4.7K, 4-0002-Add-compressed-TOAST-test-to-repack_toast.patch)
  download | inline diff:
diff --git a/src/test/modules/injection_points/expected/repack_toast.out b/src/test/modules/injection_points/expected/repack_toast.out
index b56dde1..0eefc88 100644
--- a/src/test/modules/injection_points/expected/repack_toast.out
+++ b/src/test/modules/injection_points/expected/repack_toast.out
@@ -63,3 +63,60 @@ injection_points_detach
                        
 (1 row)
 
+
+starting permutation: repack_compressed compressed_update wakeup_before_lock verify_compressed
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step repack_compressed: 
+	REPACK (CONCURRENTLY) repack_ctest;
+ <waiting ...>
+step compressed_update: 
+	UPDATE repack_ctest SET j = gen_compressible(99) WHERE i = 2;
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step repack_compressed: <... completed>
+step verify_compressed: 
+	INSERT INTO post_repack_data
+	SELECT i, j FROM repack_ctest ORDER BY i;
+
+	SELECT e.i,
+	       length(e.j) AS expect_len,
+	       length(p.j) AS actual_len,
+	       (length(e.j) = length(p.j)) AS len_ok,
+	       (md5(e.j) = md5(p.j)) AS hash_ok
+	FROM expected_data e
+	JOIN post_repack_data p USING (i)
+	ORDER BY e.i;
+
+	SELECT count(*) AS mismatches
+	FROM expected_data e
+	JOIN post_repack_data p USING (i)
+	WHERE length(e.j) != length(p.j) OR md5(e.j) != md5(p.j);
+
+i|expect_len|actual_len|len_ok|hash_ok
+-+----------+----------+------+-------
+1|     16000|     16000|t     |t      
+2|     16000|     16000|t     |t      
+3|     16000|     16000|t     |t      
+(3 rows)
+
+mismatches
+----------
+         0
+(1 row)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/specs/repack_toast.spec b/src/test/modules/injection_points/specs/repack_toast.spec
index b878b19..fb8a557 100644
--- a/src/test/modules/injection_points/specs/repack_toast.spec
+++ b/src/test/modules/injection_points/specs/repack_toast.spec
@@ -13,6 +13,14 @@ setup
 		FROM generate_series(1, 2048) s(x);
 	$$;
 
+	-- 500 concatenated md5 hashes = 16000 hex chars; compresses but stays > 2KB.
+	CREATE FUNCTION gen_compressible(seed int)
+	RETURNS text
+	LANGUAGE sql IMMUTABLE as $$
+		SELECT string_agg(md5((seed * 1000 + x)::text), '')
+		FROM generate_series(1, 500) x;
+	$$;
+
 	CREATE TABLE repack_test(i int PRIMARY KEY, j text);
 	INSERT INTO repack_test(i, j) VALUES (1, get_long_string()),
 		(2, get_long_string()), (3, get_long_string());
@@ -21,13 +29,30 @@ setup
 
 	CREATE TABLE data_s1(i int, j text);
 	CREATE TABLE data_s2(i int, j text);
+
+	CREATE TABLE repack_ctest(i int PRIMARY KEY, j text);
+	INSERT INTO repack_ctest(i, j)
+	VALUES (1, gen_compressible(1)),
+	       (2, gen_compressible(2)),
+	       (3, gen_compressible(3));
+
+	CREATE TABLE expected_data AS
+	SELECT 1 AS i, gen_compressible(1) AS j
+	UNION ALL SELECT 2, gen_compressible(99)
+	UNION ALL SELECT 3, gen_compressible(3);
+
+	CREATE TABLE post_repack_data(i int, j text);
 }
 
 teardown
 {
 	DROP TABLE repack_test;
+	DROP TABLE repack_ctest;
+	DROP TABLE expected_data;
+	DROP TABLE post_repack_data;
 	DROP EXTENSION injection_points;
 	DROP FUNCTION get_long_string();
+	DROP FUNCTION gen_compressible(int);
 
 	DROP TABLE relfilenodes;
 	DROP TABLE data_s1;
@@ -67,6 +92,29 @@ step check1
 	FROM data_s1 d1 FULL JOIN data_s2 d2 USING (i, j)
 	WHERE d1.i ISNULL OR d2.i ISNULL;
 }
+step repack_compressed
+{
+	REPACK (CONCURRENTLY) repack_ctest;
+}
+step verify_compressed
+{
+	INSERT INTO post_repack_data
+	SELECT i, j FROM repack_ctest ORDER BY i;
+
+	SELECT e.i,
+	       length(e.j) AS expect_len,
+	       length(p.j) AS actual_len,
+	       (length(e.j) = length(p.j)) AS len_ok,
+	       (md5(e.j) = md5(p.j)) AS hash_ok
+	FROM expected_data e
+	JOIN post_repack_data p USING (i)
+	ORDER BY e.i;
+
+	SELECT count(*) AS mismatches
+	FROM expected_data e
+	JOIN post_repack_data p USING (i)
+	WHERE length(e.j) != length(p.j) OR md5(e.j) != md5(p.j);
+}
 teardown
 {
     SELECT injection_points_detach('repack-concurrently-before-lock');
@@ -101,6 +149,10 @@ step wakeup_before_lock
 {
 	SELECT injection_points_wakeup('repack-concurrently-before-lock');
 }
+step compressed_update
+{
+	UPDATE repack_ctest SET j = gen_compressible(99) WHERE i = 2;
+}
 
 # Test if data changes introduced while one session is performing REPACK
 # CONCURRENTLY find their way into the table.
@@ -110,3 +162,10 @@ permutation
 	check2
 	wakeup_before_lock
 	check1
+
+# Test that compressed TOAST values survive REPACK CONCURRENTLY.
+permutation
+	repack_compressed
+	compressed_update
+	wakeup_before_lock
+	verify_compressed


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-17 09:17  Chao Li <[email protected]>
  parent: SATYANARAYANA NARLAPURAM <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Chao Li @ 2026-04-17 09:17 UTC (permalink / raw)
  To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Álvaro Herrera <[email protected]>; [email protected]



> On Apr 16, 2026, at 14:13, SATYANARAYANA NARLAPURAM <[email protected]> wrote:
> 
> Hi hackers,
> 
> restore_tuple() in repack.c uses SET_VARSIZE() to reconstruct the varlena header when 
> reading back external attributes from the spill file. In this process, looks like the flag 
> SET_VARSIZE_COMPRESSED is silently lost. Because of this, when REPACK CONCURRENTLY
> run  any concurrently updated column whose value was TOAST-compressed ends up with raw 
> compressed bytes behind an "uncompressed" header returning garbled data on subsequent reads.
> It appears that existing tests are using random chars which are uncompressable.
> 
> Please find the attached 0001-Fix-restore_tuple-losing-varlena-compression-flag.patch to fix this. 
> Additionally I updated the existing repack_toast test to include the scenario I was talking about.
> 
> Thanks,
> Satya
> <0001-Fix-restore_tuple-losing-varlena-compression-flag.patch><0002-Add-compressed-TOAST-test-to-repack_toast.patch>

I managed to reproduce the bug manually, and confirmed your fix to work for me. The repro is not simple, so I won’t put it here. If somebody is interested in it, then I can provide.

I didn’t review the test in 0002, I guess we don’t have to add the test because once fixed, the bug won’t be there anymore, thus it’s not worthy extending the test time.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/









^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-17 15:45  Antonin Houska <[email protected]>
  parent: SATYANARAYANA NARLAPURAM <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Antonin Houska @ 2026-04-17 15:45 UTC (permalink / raw)
  To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; [email protected]

SATYANARAYANA NARLAPURAM <[email protected]> wrote:

> restore_tuple() in repack.c uses SET_VARSIZE() to reconstruct the varlena header when 
> reading back external attributes from the spill file. In this process, looks like the flag 
> SET_VARSIZE_COMPRESSED is silently lost. Because of this, when REPACK CONCURRENTLY
> run  any concurrently updated column whose value was TOAST-compressed ends up with raw 
> compressed bytes behind an "uncompressed" header returning garbled data on subsequent reads.
> It appears that existing tests are using random chars which are uncompressable.
> 
> Please find the attached 0001-Fix-restore_tuple-losing-varlena-compression-flag.patch to fix this. 
> Additionally I updated the existing repack_toast test to include the scenario I was talking about.

Good catch, thanks!

I'd slightly prefer to fix it w/o checking the varlena type, as
attached. However, your test fails to reproduce the issue here, so I'm not
able to verify the fix. I'll take a closer look early next week.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Attachments:

  [text/x-diff] fix_restore_tuple.diff (576B, 2-fix_restore_tuple.diff)
  download | inline diff:
diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index e2600a8888d..3c07e44d649 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -2736,7 +2736,7 @@ restore_tuple(BufFile *file, Relation relation, TupleTableSlot *slot)
 			varlensz = VARSIZE_ANY(&chunk_header);
 
 			value = palloc(varlensz);
-			SET_VARSIZE(value, VARSIZE_ANY(&chunk_header));
+			memcpy(value, &chunk_header, VARHDRSZ);
 			BufFileReadExact(file, (char *) value + VARHDRSZ, varlensz - VARHDRSZ);
 
 			slot->tts_values[i] = PointerGetDatum(value);


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-17 17:40  SATYANARAYANA NARLAPURAM <[email protected]>
  parent: Antonin Houska <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: SATYANARAYANA NARLAPURAM @ 2026-04-17 17:40 UTC (permalink / raw)
  To: Antonin Houska <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; [email protected]

Hi

On Fri, Apr 17, 2026 at 8:45 AM Antonin Houska <[email protected]> wrote:

> SATYANARAYANA NARLAPURAM <[email protected]> wrote:
>
> > restore_tuple() in repack.c uses SET_VARSIZE() to reconstruct the
> varlena header when
> > reading back external attributes from the spill file. In this process,
> looks like the flag
> > SET_VARSIZE_COMPRESSED is silently lost. Because of this, when REPACK
> CONCURRENTLY
> > run  any concurrently updated column whose value was TOAST-compressed
> ends up with raw
> > compressed bytes behind an "uncompressed" header returning garbled data
> on subsequent reads.
> > It appears that existing tests are using random chars which are
> uncompressable.
> >
> > Please find the attached
> 0001-Fix-restore_tuple-losing-varlena-compression-flag.patch to fix this.
> > Additionally I updated the existing repack_toast test to include the
> scenario I was talking about.
>
> Good catch, thanks!
>
> I'd slightly prefer to fix it w/o checking the varlena type, as
> attached. However, your test fails to reproduce the issue here, so I'm not
> able to verify the fix. I'll take a closer look early next week.
>

I started with that but tried to follow the existing code pattern. This
LGTM.
Please add a comment as well.


>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
>
>


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-17 21:32  Michael Paquier <[email protected]>
  parent: SATYANARAYANA NARLAPURAM <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Michael Paquier @ 2026-04-17 21:32 UTC (permalink / raw)
  To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: Antonin Houska <[email protected]>; PostgreSQL Hackers <[email protected]>; [email protected]

On Fri, Apr 17, 2026 at 10:40:39AM -0700, SATYANARAYANA NARLAPURAM wrote:
> I started with that but tried to follow the existing code pattern. This
> LGTM.
> Please add a comment as well.

Hmm.  I was reading restore_tuple(), and could it make sense to expand
a bit more the tests so as more types of varlena pointers could be
checked in this routine?  I am taking about MAIN, EXTENDED and
EXTERNAL, so as we could check more patterns with in-line
[non-]compressed, and external [non-]compressed, counting for the four
different possible patterns that could lead to repacked data.  See for
example strings.sql for such tests, that could be used as base
inspiration.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-20 11:40  Antonin Houska <[email protected]>
  parent: Michael Paquier <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Antonin Houska @ 2026-04-20 11:40 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; PostgreSQL Hackers <[email protected]>; [email protected]

Michael Paquier <[email protected]> wrote:

> On Fri, Apr 17, 2026 at 10:40:39AM -0700, SATYANARAYANA NARLAPURAM wrote:
> > I started with that but tried to follow the existing code pattern. This
> > LGTM.
> > Please add a comment as well.
> 
> Hmm.  I was reading restore_tuple(), and could it make sense to expand
> a bit more the tests so as more types of varlena pointers could be
> checked in this routine?  I am taking about MAIN, EXTENDED and
> EXTERNAL, so as we could check more patterns with in-line
> [non-]compressed, and external [non-]compressed, counting for the four
> different possible patterns that could lead to repacked data.  See for
> example strings.sql for such tests, that could be used as base
> inspiration.

ok, this is a patch to enhance the tests. It also simplifies
gen_compressible() a bit and adjusts it so it is (hopefully) more reliable in
generating compressible values. Now it can be used to reproduce the fix [1].

[1] https://www.postgresql.org/message-id/52301.1776440752%40localhost

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-25 16:56  Álvaro Herrera <[email protected]>
  parent: Antonin Houska <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Álvaro Herrera @ 2026-04-25 16:56 UTC (permalink / raw)
  To: Antonin Houska <[email protected]>; Michael Paquier <[email protected]>; +Cc: SATYANARAYANA NARLAPURAM <[email protected]>; L. pgsql-hackers <[email protected]>

Hi Antonin,

On 2026-04-20, Antonin Houska wrote:

> ok, this is a patch to enhance the tests. It also simplifies
> gen_compressible() a bit and adjusts it so it is (hopefully) more reliable in
> generating compressible values. Now it can be used to reproduce the fix [1].
>
> [1] https://www.postgresql.org/message-id/52301.1776440752%40localhost

Thanks, I think it looks good, but unfortunately I don't see any problems when running this test without your fix (I was assuming that the test would fail). I didn't immediately understand why.  Do you see any failures?

Regards,

-- 
Álvaro Herrera





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-25 17:33  Antonin Houska <[email protected]>
  parent: Álvaro Herrera <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Antonin Houska @ 2026-04-25 17:33 UTC (permalink / raw)
  To: [email protected]; +Cc: Michael Paquier <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; L. pgsql-hackers <[email protected]>

Álvaro Herrera <[email protected]> wrote:

> > ok, this is a patch to enhance the tests. It also simplifies
> > gen_compressible() a bit and adjusts it so it is (hopefully) more reliable in
> > generating compressible values. Now it can be used to reproduce the fix [1].
> >
> > [1] https://www.postgresql.org/message-id/52301.1776440752%40localhost
> 
> Thanks, I think it looks good, but unfortunately I don't see any problems
> when running this test without your fix (I was assuming that the test would
> fail). I didn't immediately understand why.  Do you see any failures?

The enhanced tests should reproduce it (please let me know if they dont):

https://www.postgresql.org/message-id/44015.1776685220%40localhost

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-26 07:12  Antonin Houska <[email protected]>
  parent: Antonin Houska <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Antonin Houska @ 2026-04-26 07:12 UTC (permalink / raw)
  To: [email protected]; +Cc: Michael Paquier <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; L. pgsql-hackers <[email protected]>

Antonin Houska <[email protected]> wrote:

> Álvaro Herrera <[email protected]> wrote:
> 
> > > ok, this is a patch to enhance the tests. It also simplifies
> > > gen_compressible() a bit and adjusts it so it is (hopefully) more reliable in
> > > generating compressible values. Now it can be used to reproduce the fix [1].
> > >
> > > [1] https://www.postgresql.org/message-id/52301.1776440752%40localhost
> > 
> > Thanks, I think it looks good, but unfortunately I don't see any problems
> > when running this test without your fix (I was assuming that the test would
> > fail). I didn't immediately understand why.  Do you see any failures?
> 
> The enhanced tests should reproduce it (please let me know if they dont):
> 
> https://www.postgresql.org/message-id/44015.1776685220%40localhost

I see you actually refer to the correct tests. I see no failure now as
well. I'll try to make the tests more stable. Sorry for the confusion.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY
@ 2026-04-30 22:03  Alvaro Herrera <[email protected]>
  parent: Antonin Houska <[email protected]>
  0 siblings, 0 replies; 10+ messages in thread

From: Alvaro Herrera @ 2026-04-30 22:03 UTC (permalink / raw)
  To: Antonin Houska <[email protected]>; +Cc: Michael Paquier <[email protected]>; SATYANARAYANA NARLAPURAM <[email protected]>; L. pgsql-hackers <[email protected]>; Chao Li <[email protected]>

On 2026-Apr-26, Antonin Houska wrote:

> I see you actually refer to the correct tests. I see no failure now as
> well. I'll try to make the tests more stable. Sorry for the confusion.

So, the reason this wasn't failing for me is that I had default TOAST
compression lz4, and the test case wasn't reaching size for being made
external, as opposed to pglz.  I changed the test to specify pglz
compression explicitly.  I also added some dropped columns and one more
toast column.  (On my machine, this expanded test fails for several
rows, not just one, when the fix is deactivated.)  I fear the buildfarm
may end up showing other failures, but there's no way to know other than
to try.

Thanks, Satya, for reporting the bug.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)





^ permalink  raw  reply  [nested|flat] 10+ messages in thread


end of thread, other threads:[~2026-04-30 22:03 UTC | newest]

Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-16 06:13 [PATCH] Compressed TOAST data corruption with REPACK CONCURRENTLY SATYANARAYANA NARLAPURAM <[email protected]>
2026-04-17 09:17 ` Chao Li <[email protected]>
2026-04-17 15:45 ` Antonin Houska <[email protected]>
2026-04-17 17:40   ` SATYANARAYANA NARLAPURAM <[email protected]>
2026-04-17 21:32     ` Michael Paquier <[email protected]>
2026-04-20 11:40       ` Antonin Houska <[email protected]>
2026-04-25 16:56         ` Álvaro Herrera <[email protected]>
2026-04-25 17:33           ` Antonin Houska <[email protected]>
2026-04-26 07:12             ` Antonin Houska <[email protected]>
2026-04-30 22:03               ` Alvaro Herrera <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox