Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tByFL-007hd5-KQ for pgsql-general@arkaria.postgresql.org; Fri, 15 Nov 2024 15:28:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tByFH-00AN02-Gk for pgsql-general@arkaria.postgresql.org; Fri, 15 Nov 2024 15:28:28 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tByFH-00AMzu-2X for pgsql-general@lists.postgresql.org; Fri, 15 Nov 2024 15:28:27 +0000 Received: from mail-lj1-x229.google.com ([2a00:1450:4864:20::229]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tByF9-0025sg-9N for pgsql-general@lists.postgresql.org; Fri, 15 Nov 2024 15:28:26 +0000 Received: by mail-lj1-x229.google.com with SMTP id 38308e7fff4ca-2fb388e64b0so19551971fa.0 for ; Fri, 15 Nov 2024 07:28:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731684499; x=1732289299; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ggoF7w3SdCd9MFD4+t0Y9WYYJTprANWBlY+BIgDZj9Q=; b=ZcSfZKbUJZ3PDHrr7FIyGsf6ArvZiZKwftoFSvWu2AxGJA7gR+qM0XH1i04AdmFmLr hR+mWn734DTxAggDi6am5PVlWxtI7/2vnZZFcrM+dPuKc3AlUrJ7Gi8El3bvIaw2rZ9J 34sT/3gqkHebekIUNBBfVijoYnydZu0YUfFwbFt6lfd+mzEDIBX08hxwTEpW039eYhEo f8Yh7KkHkxV76LEPJIO4OpABgJOb8SBEjooCSzFhjPwjx72ptvyFunyGD0PSU+0m0hpz WtQOKuv72kqPShew2fEi6J58H2WbBgwaNdrWL0EkDi1fT1DDSmDQmcTSyIqBAd2Xr/Yo BdsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731684499; x=1732289299; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ggoF7w3SdCd9MFD4+t0Y9WYYJTprANWBlY+BIgDZj9Q=; b=I4HvbRwsJCRQZwYU6mMfC/+/EvaWhFEpyUVnaw7dyrsp+qeRktPgKLnfh2wqHAvcNa 2KdRo16Wk2hWtRSZxHTwCoy/iQKLRXgGrGKxvNJNBZWMWXRlLpjjVTdYzYdDzjqKGbEO InbyM9kGgzWDAuYIy/cAUuB7avorOhdWyrYHRiO5KF1o+OWZy3PtR02He/FTJBBcABDM PDKPZhwS0PCCu2eNLTQ8iX1u1GRbxRFyTbM2wBNMRgK1oCCglEgba2xFalKqFD3q1Xpt A7Jz0rpeDF+6NQPj0AFgrjeTl3if8MIgA73uvx+SPv4oDgO7OXzRO5PWL4eF4ePwaXW0 gQpg== X-Gm-Message-State: AOJu0YxGG7BdTrVHdSkDsE8Tam3I80yU5DNfia7r/1dnVcpr7foxrbYC xT3YemQrBSEedBZNH6pnbpohYk2eNJXN6bOfdbHlr3bGCQ7Agh0XczeSqtZiI/59eAHK/uTHIbn aX+hACBf1aW7gXsjwsW5l/D1gPUEIRD9Mdw== X-Google-Smtp-Source: AGHT+IFPpT2E4dZ86VGm/mVdySK5pQKLCqm3Fcv8JX/XHYLHUC8O+O47B6OkTGC5YWK8siKCMHOLGoh8LoyAPCRHlFA= X-Received: by 2002:a2e:a99c:0:b0:2fb:570a:496e with SMTP id 38308e7fff4ca-2ff6062af19mr18637661fa.9.1731684498408; Fri, 15 Nov 2024 07:28:18 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: =?UTF-8?Q?Torsten_F=C3=B6rtsch?= Date: Fri, 15 Nov 2024 16:28:06 +0100 Message-ID: Subject: Re: Validating check constraints without a table scan? To: Philip Couling Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000640e090626f537e1" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000640e090626f537e1 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Nov 15, 2024 at 9:38=E2=80=AFAM Philip Couling = wrote: > Is there a solid reason why adding a check constraint does not use > existing indexes for validation. > > We are currently looking at partitioning a multi TB table leaving all > existing data in place and simply attaching it as a partition to a new > table. To prevent locking, we are trying to add an INVALID check constrai= nt > first and then validate it. > > I can trivially prove the invalid constraint is valid with a simple SELEC= T > which will use an existing index and return instantaneously. But AFAIK > Theres no way to mark a constraint as valid without scanning all the rows= . > Most likely your query is not exactly the same as the check constraint. Think about NULL and similar. > This operation is really problematic on a production database with heavy > IO load. > > Is there a solid ready why validating check constraints cannot use > existing indexes? If I can prove the constraint is valid so trivially wit= h > a SELECT, then why can Postgres not do the same (or similar)? > Here is what has worked for me many times: 1. create the check constraint as NOT VALID. From now on no new or updated row is allowed to violate it. 2. check if the constraint holds with a query on a binary replica. Make sure the query starts only when the constraint is visible on the replica. 3. on the master: UPDATE pg_constraint SET conisvalidated=3Dtrue WHERE conname=3D'your_constraint_name' AND conrelid=3D'schema.table'::regclass How you perform step 2 is irrelevant. Checking it on a replica would simply avoid the load on the master. You just need to make sure there is no conflicting data in the table. WARNING, you need to be damn sure of your data if you do that. But if you are, it works. Here is the procedure how I solved the same problem for some of our multi-TB tables (PG14): The table has a column called transaction_time. We wanted to partition by that column. For some historical reason the column did not have a NOT NULL constraint. However, there was no way our processes could insert NULL in that column and there was no row with NULL in that field. So, first was to add the NOT NULL constraint: BEGIN; ALTER TABLE my.table ADD CONSTRAINT transaction_time_not_null CHECK(transaction_time IS NOT NULL) NOT VALID; UPDATE pg_constraint SET convalidated=3Dtrue WHERE conname =3D 'transaction_time_not_null' AND conrelid =3D 'my.table'::REGCLASS RETURNING conname, conrelid::REGCLASS, convalidated; COMMIT; Now for cosmetic purposes we first turn the check constraint above into a normal NOT NULL constraint: BEGIN; SET LOCAL client_min_messages =3D 'debug4'; -- expecting this message -- DEBUG: existing constraints on column "table.transaction_time" are sufficient to prove that it does not contain nulls ALTER TABLE my.table ALTER COLUMN transaction_time SET NOT NULL; RESET client_min_messages; ALTER TABLE my.table DROP CONSTRAINT transaction_time_not_null; COMMIT; If you set client_min_messages to something like debug4, then the database tells you if it wants to scan the table or if existing constraints are sufficient to prove the condition. transaction_time in our case is never in the future. Also database transactions are short-lived. Nothing exceeds 30 seconds. So, I can trust the effect of any action performed more than 30 seconds ago in the database is visible. So, I set the time after which new rows go to the new partition at least 10 minutes from now at the next hour boundary. 30 seconds would be good enough. I chose 10 minutes just for extra safety. SELECT date_trunc('hour', now()+'1h 10min'::interval)::timestamp AS switch_time\gset Next comes the actual change: BEGIN; -- rename the existing table ALTER TABLE my.table RENAME TO table_old; -- drop triggers. We will recreate them later. DROP TRIGGER ... ON my.table_old; DROP TRIGGER ...; -- create partitioned table CREATE TABLE my.table ( LIKE my.table_old INCLUDING DEFAULTS INCLUDING CONSTRAINTS ) PARTITION BY RANGE (transaction_time); -- recreate triggers CREATE TRIGGER ... BEFORE DELETE ON my.table FOR EACH ROW EXECUTE FUNCTION my,trigger_fun(); CREATE TRIGGER ...; -- create the partition for future rows CREATE TABLE my.table_current PARTITION OF my.table FOR VALUES FROM (:'switch_time') TO ('infinity'); -- and some indexes and FK ALTER TABLE my.table_current ADD PRIMARY KEY (id); CREATE INDEX ON my.table_current (...); ALTER TABLE my.table_current ADD CONSTRAINT fk_name FOREIGN KEY (...) REFERENCES ...(...) ON UPDATE ... ON DELETE ...; -- now attach the old table as partition ALTER TABLE my.table_old ADD CONSTRAINT partition_boundaries CHECK('-infinity'<=3Dtransaction_time AND transaction_time<:'switch_time') NOT VALID; -- for procedural reasons we know the constraint is valid. Let's make PG believe it too. UPDATE pg_constraint SET convalidated=3Dtrue WHERE conname=3D'partition_boundaries' AND conrelid=3D'my.table_old'::REGCLASS RETURNING conname, conrelid::REGCLASS, convalidated; -- now attach it. We use again debug4 to check if the table scan is avoided= . SET LOCAL client_min_messages =3D 'debug4'; -- expecting -- DEBUG: partition constraint for table "table_old" is implied by existing constraints ALTER TABLE my.table ATTACH PARTITION my.table_old FOR VALUES FROM ('-infinity') TO (:'switch_time'); RESET client_min_messages; -- drop the now unnecessary constraint ALTER TABLE my.table_old DROP CONSTRAINT partition_boundaries; COMMIT; Once the new partition gets too full, we will use a similar procedure to adjust the partition boundary of the new partition and then create the next partition. --000000000000640e090626f537e1 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Fri, Nov 15, 2024 at 9:38=E2=80=AFAM P= hilip Couling <couling@gmail.com> wrote:


Most likely your query is not = exactly the same as the check constraint. Think about NULL and similar.
=C2=A0
This operation is really problematic on= a production database with heavy IO load.

Is there a solid ready why validating check constraints = cannot use existing indexes? If I can prove the constraint is valid so triv= ially with a SELECT, then why can Postgres not do the same (or similar)?

Here is what has worked for me many tim= es:

1. create the check constraint as NOT VALID. F= rom now on no new or updated row is allowed to violate it.
2. che= ck if the constraint holds with a query on a binary replica. Make sure the = query starts only when the constraint is visible on the replica.
= 3. on the master: UPDATE pg_constraint SET=C2=A0conisvalidated=3Dtrue=C2=A0= WHERE conname=3D'your_constraint_name' AND conrelid=3D'schema.t= able'::regclass

How you perform step 2 is irre= levant. Checking it on a replica would simply avoid the load on the master.= You just need to make sure there is no conflicting data in the table.

WARNING, you need to be damn sure of your data if you = do that. But if you are, it=C2=A0works.

Here is th= e procedure how I solved the same problem for some of our multi-TB tables (= PG14):

The table has a column called transaction_t= ime. We wanted to partition by that column. For some historical reason the = column did not have a NOT NULL constraint. However, there was no way our=C2= =A0processes could insert NULL in that column and there was no row with NUL= L in that field. So, first was to add the NOT NULL constraint:
BEGIN;

ALTER TABLE my.table
ADD CONSTRAINT transacti= on_time_not_null CHECK(transaction_time IS NOT NULL) NOT VALID;

UPDA= TE pg_constraint
=C2=A0 =C2=A0SET convalidated=3Dtrue
=C2=A0WHERE con= name =3D 'transaction_time_not_null'
=C2=A0 =C2=A0AND conrelid = =3D 'my.table'::REGCLASS
RETURNING conname, conrelid::REGCLASS, = convalidated;

COMMIT;

Now for cosmetic purp= oses we first turn the check constraint above into a normal NOT NULL constr= aint:

BEGIN;

SET LOCAL client_min_messages = =3D 'debug4';
-- expecting this message
-- DEBUG: =C2=A0exist= ing constraints on column "table.transaction_time" are sufficient= to prove that it does not contain nulls
ALTER TABLE my.table
=C2=A0 = =C2=A0 =C2=A0 ALTER COLUMN transaction_time SET NOT NULL;
RESET client_m= in_messages;

ALTER TABLE my.table
=C2=A0 =C2=A0 =C2=A0 DROP CONST= RAINT transaction_time_not_null;

COMMIT;
=

If you set client_min_messages to something like debug4= , then the database tells you if it wants to scan the table or if existing = constraints are sufficient to prove the condition.

transaction_time in our case is never in the future. Also database tr= ansactions are short-lived. Nothing exceeds 30 seconds. So, I can trust the= effect of any action performed more than 30 seconds ago in the database is= visible.

So, I set the time after which new rows go to= the new partition at least 10 minutes from now at the next hour boundary. = 30 seconds would be good enough. I chose 10 minutes just for extra safety.<= /div>

SELECT date_trunc('hour', now()+'1h 10= min'::interval)::timestamp AS switch_time\gset

=
Next comes the actual change:

BEGIN;

-- rename the existing table
ALTER TABLE my.tabl= e RENAME TO table_old;

-- drop triggers. We will recreate= them later.
DROP TRIGGER ... ON my.table_old;
DROP TRIGGER ..= .;

-- create partitioned table
CREATE TABLE my.= table (
=C2=A0 =C2=A0 LIKE my.table_old
=C2=A0 =C2=A0 INCLUDING DEFAU= LTS
=C2=A0 =C2=A0 INCLUDING CONSTRAINTS
)
PARTITION BY RANGE (tran= saction_time);

-- recreate triggers
= CREATE TRIGGER ...
BEFORE DELETE ON my.table
=C2=A0 =C2=A0FOR EACH RO= W EXECUTE FUNCTION my,trigger_fun();
CREATE TRIGGER ...;

-- create the partition for future rows
CREA= TE TABLE my.table_current PARTITION OF my.table
=C2=A0 FOR VALUES FROM (= :'switch_time') TO ('infinity');

-- and some indexes and FK
ALTER TABLE my.table_current ADD PR= IMARY KEY (id);
CREATE INDEX ON my.table_current (...);
AL= TER TABLE my.table_current
=C2=A0 ADD CONSTRAINT fk_name
=C2=A0 =C2= =A0 =C2=A0 FOREIGN KEY (...)
=C2=A0 =C2=A0 =C2=A0 REFERENCES ...(...) ON= UPDATE ... ON DELETE ...;

-- now attach the o= ld table as partition
ALTER TABLE my.table_old
ADD CONSTRAINT = partition_boundaries
CHECK('-infinity'<=3Dtransaction_time AN= D transaction_time<:'switch_time')
=C2=A0 NOT VALID;

-= - for procedural reasons we know the constraint is valid. Let's make PG= believe it too.
UPDATE pg_constraint
=C2=A0 =C2=A0SET convali= dated=3Dtrue
=C2=A0WHERE conname=3D'partition_boundaries'
=C2= =A0 =C2=A0AND conrelid=3D'my.table_old'::REGCLASS
RETURNI= NG conname, conrelid::REGCLASS, convalidated;

-- now attach it. We u= se again debug4 to check if the table scan is avoided.
SET LOCAL client_= min_messages =3D 'debug4';
-- expecting
-- DEBUG: =C2=A0parti= tion constraint for table "table_old" is implied by existing cons= traints
ALTER TABLE my.table ATTACH PARTITION my.table_old
=C2=A0 FOR= VALUES FROM ('-infinity') TO (:'switch_time');
RESET cl= ient_min_messages;

-- drop the now unnecessary constraint
ALTER T= ABLE my.table_old
=C2=A0DROP CONSTRAINT partition_boundaries;
<= div>

--000000000000640e090626f537e1--