public inbox for [email protected]
help / color / mirror / Atom feedRe: how to know if the sql will run a seq scan
4+ messages / 2 participants
[nested] [flat]
* Re: how to know if the sql will run a seq scan
@ 2024-10-15 21:29 Adrian Klaver <[email protected]>
2024-10-16 07:02 ` Re: how to know if the sql will run a seq scan Vijaykumar Jain <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Adrian Klaver @ 2024-10-15 21:29 UTC (permalink / raw)
To: Vijaykumar Jain <[email protected]>; +Cc: pgsql-general
On 10/15/24 13:50, Vijaykumar Jain wrote:
> Sorry top posting, coz Gmail app on phone.
>
> Yeah, my point was for example we have a large table and we are
> attaching a table as a partition. Now it will scan the whole table to
> validate the constraint and that will create all sorts of problems.
Now you have changed the problem description.
To get a proper answer you will need to provide a more detailed
description of what you are doing with the following information:
1) Postgres version.
2) Definition of 'large'.
3) The command/process being used to create the partition.
4) The actual constraint definition.
5) The table definition.
> I understand the benefit of not valid constraint and then validating
> constraint to reduce blocking.
> But yeah monitoring locks for the statement should give me good enough
> hint of what will happen.
>
> Thanks for your reply. It helps.
>
>
>
> On Wed, Oct 16, 2024, 1:54 AM Adrian Klaver <[email protected]
> <mailto:[email protected]>> wrote:
>
> On 10/15/24 12:50, Vijaykumar Jain wrote:
> >
> > Hi,
> >
> > tl;dr
> > I am trying to learn what sql can result in a full seq scan.
> >
> > Basically there is a lot of info on the internet of what ddl
> change may
> > take an access exclusive lock while running a seq scan and hold
> for long.
> > And for some cases we can make use of
> > "not valid" constraint and then run a validate constraint as work
> > arounds to avoid long exclusive locks etc.
> > but how do we check the same. i mean for dmls there is a explain/
> > auto_explain.
> >
> > but for DDLs, how do we check the same.
> > i tried to isolate my setup and use pg_stat_user_tables and
> monitor the
> > same, which helped, but it is not useful as it does not link me
> to what
> > process/command invoked the seq scan.
> >
> > am i clear in my question ?
> >
> > if yes,
> > how do i log an alter table that may or may not do a seq scan,
> that may
> > or may not rewrite the table file on disk etc.
> > its a useless question, i am just playing with it for building
> > knowledge, no requirement as such.
>
> Look at the docs:
>
> https://www.postgresql.org/docs/current/sql-altertable.html
> <https://www.postgresql.org/docs/current/sql-altertable.html;
>
> "Scanning a large table to verify a new foreign key or check constraint
> can take a long time, and other updates to the table are locked out
> until the ALTER TABLE ADD CONSTRAINT command is committed. The main
> purpose of the NOT VALID constraint option is to reduce the impact of
> adding a constraint on concurrent updates. With NOT VALID, the ADD
> CONSTRAINT command does not scan the table and can be committed
> immediately. After that, a VALIDATE CONSTRAINT command can be issued to
> verify that existing rows satisfy the constraint. The validation step
> does not need to lock out concurrent updates, since it knows that other
> transactions will be enforcing the constraint for rows that they insert
> or update; only pre-existing rows need to be checked. Hence, validation
> acquires only a SHARE UPDATE EXCLUSIVE lock on the table being altered.
> (If the constraint is a foreign key then a ROW SHARE lock is also
> required on the table referenced by the constraint.) In addition to
> improving concurrency, it can be useful to use NOT VALID and VALIDATE
> CONSTRAINT in cases where the table is known to contain pre-existing
> violations. Once the constraint is in place, no new violations can be
> inserted, and the existing problems can be corrected at leisure until
> VALIDATE CONSTRAINT finally succeeds."
>
>
> > --
> > Thanks,
> > Vijay
> >
> > Open to work
> > Resume - Vijaykumar Jain <https://github.com/cabecada
> <https://github.com/cabecada>;
>
> --
> Adrian Klaver
> [email protected] <mailto:[email protected]>
>
--
Adrian Klaver
[email protected]
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: how to know if the sql will run a seq scan
2024-10-15 21:29 Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
@ 2024-10-16 07:02 ` Vijaykumar Jain <[email protected]>
2024-10-16 16:40 ` Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Vijaykumar Jain @ 2024-10-16 07:02 UTC (permalink / raw)
To: Adrian Klaver <[email protected]>; +Cc: pgsql-general
On Wed, 16 Oct 2024 at 02:59, Adrian Klaver <[email protected]>
wrote:
> On 10/15/24 13:50, Vijaykumar Jain wrote:
> > Sorry top posting, coz Gmail app on phone.
> >
> > Yeah, my point was for example we have a large table and we are
> > attaching a table as a partition. Now it will scan the whole table to
> > validate the constraint and that will create all sorts of problems.
>
> Now you have changed the problem description.
>
> To get a proper answer you will need to provide a more detailed
> description of what you are doing with the following information:
>
> 1) Postgres version.
>
> 2) Definition of 'large'.
>
> 3) The command/process being used to create the partition.
>
> 4) The actual constraint definition.
>
> 5) The table definition.
>
>
/*
postgres=# create table t(col1 int) partition by list(col1);
CREATE TABLE
postgres=# create table t1(col1 int)
postgres-# ;
CREATE TABLE
postgres=# insert into t1 select 0 from generate_series(1, 100000) x;
INSERT 0 100000
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age | seq_tup_read
---------+----------+---------------+-----+--------------
t1 | 0 | | | 0
(1 row)
postgres=# alter table t1 add constraint col10 check (col1 = 0);
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 1 | 2024-10-16 06:46:28.641281+00 | -00:00:03.258432 |
100000
(1 row)
postgres=# -- this results in a seq scan , which is ok, but then when i
attach the partition it does a seq scan again
postgres=# alter table t attach partition t1 for values in (0);
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 2 | 2024-10-16 06:46:59.512201+00 | -00:00:02.498771 |
200000
(1 row)
postgres=# -- why , when there is a constraint that helps with the
partition boundary/value
postgres=# alter table t detach partition t1;
ALTER TABLE
postgres=# alter table t attach partition t1 for values in (0);
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 3 | 2024-10-16 06:54:28.780145+00 | -00:00:03.358524 |
300000
(1 row)
-- despite there being a constraint, it does a full table scan to attach
the partition. why ? note the tup read is full table of t1.
*/
above is one of the cases i found.
my core question still was, how do i know which statement will cause a
full table rewrite
full table scan
how do i get to know that. i know implictly i can use the above stat tables
and pg_rel_filepath function etc to figure out the change in oid , update
in seq count etc.
but i want to pin point which statement made what change among 100 other
statements in production.
I mean is there a way that a certain alter table will do a table rewrite on
disk and other alter table will not.
access exclusive lock on tables does not help answer that question.
if i am not clear, maybe ignore my question. i have some issues explaining
things clearly, so i try to use demos.
Thanks,
Vijay
Open to work
Resume - Vijaykumar Jain <https://github.com/cabecada;
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: how to know if the sql will run a seq scan
2024-10-15 21:29 Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
2024-10-16 07:02 ` Re: how to know if the sql will run a seq scan Vijaykumar Jain <[email protected]>
@ 2024-10-16 16:40 ` Adrian Klaver <[email protected]>
2024-10-17 05:39 ` Re: how to know if the sql will run a seq scan Vijaykumar Jain <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Adrian Klaver @ 2024-10-16 16:40 UTC (permalink / raw)
To: Vijaykumar Jain <[email protected]>; +Cc: pgsql-general
On 10/16/24 00:02, Vijaykumar Jain wrote:
>
>
> postgres=# create table t(col1 int) partition by list(col1);
> CREATE TABLE
> postgres=# create table t1(col1 int)
> postgres-# ;
> CREATE TABLE
> postgres=# insert into t1 select 0 from generate_series(1, 100000) x;
> INSERT 0 100000
> postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
> current_timestamp), seq_tup_read from pg_stat_user_tables where relname
> = 't1';
> relname | seq_scan | last_seq_scan | age | seq_tup_read
> ---------+----------+---------------+-----+--------------
> t1 | 0 | | | 0
> (1 row)
>
> postgres=# alter table t1 add constraint col10 check (col1 = 0);
> ALTER TABLE
> postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
> current_timestamp), seq_tup_read from pg_stat_user_tables where relname
> = 't1';
> relname | seq_scan | last_seq_scan | age
> | seq_tup_read
> ---------+----------+-------------------------------+------------------+--------------
> t1 | 1 | 2024-10-16 06:46:28.641281+00 | -00:00:03.258432
> | 100000
> (1 row)
>
> postgres=# -- this results in a seq scan , which is ok, but then when i
> attach the partition it does a seq scan again
> postgres=# alter table t attach partition t1 for values in (0);
> ALTER TABLE
> postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
> current_timestamp), seq_tup_read from pg_stat_user_tables where relname
> = 't1';
> relname | seq_scan | last_seq_scan | age
> | seq_tup_read
> ---------+----------+-------------------------------+------------------+--------------
> t1 | 2 | 2024-10-16 06:46:59.512201+00 | -00:00:02.498771
> | 200000
> (1 row)
>
> postgres=# -- why , when there is a constraint that helps with the
> partition boundary/value
>
> postgres=# alter table t detach partition t1;
> ALTER TABLE
>
> postgres=# alter table t attach partition t1 for values in (0);
> ALTER TABLE
> postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
> current_timestamp), seq_tup_read from pg_stat_user_tables where relname
> = 't1';
> relname | seq_scan | last_seq_scan | age
> | seq_tup_read
> ---------+----------+-------------------------------+------------------+--------------
> t1 | 3 | 2024-10-16 06:54:28.780145+00 | -00:00:03.358524
> | 300000
> (1 row)
>
> -- despite there being a constraint, it does a full table scan to attach
> the partition. why ? note the tup read is full table of t1.
>
> */
>
> above is one of the cases i found.
> my core question still was, how do i know which statement will cause a
> full table rewrite
> full table scan
I don't have time now to create an example, but I can point you at:
https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
5.12.2.2. Partition Maintenance
"As an alternative to creating a new partition, it is sometimes more
convenient to create a new table separate from the partition structure
and attach it as a partition later. ... "
Read the section starting above.
>
> how do i get to know that. i know implictly i can use the above stat
> tables and pg_rel_filepath function etc to figure out the change in oid
> , update in seq count etc.
> but i want to pin point which statement made what change among 100 other
> statements in production.
>
> I mean is there a way that a certain alter table will do a table rewrite
> on disk and other alter table will not.
> access exclusive lock on tables does not help answer that question.
>
> if i am not clear, maybe ignore my question. i have some issues
> explaining things clearly, so i try to use demos.
>
>
>
>
>
>
>
> Thanks,
> Vijay
>
> Open to work
> Resume - Vijaykumar Jain <https://github.com/cabecada;
--
Adrian Klaver
[email protected]
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: how to know if the sql will run a seq scan
2024-10-15 21:29 Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
2024-10-16 07:02 ` Re: how to know if the sql will run a seq scan Vijaykumar Jain <[email protected]>
2024-10-16 16:40 ` Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
@ 2024-10-17 05:39 ` Vijaykumar Jain <[email protected]>
0 siblings, 0 replies; 4+ messages in thread
From: Vijaykumar Jain @ 2024-10-17 05:39 UTC (permalink / raw)
To: Adrian Klaver <[email protected]>; +Cc: pgsql-general
>
>
>
> I don't have time now to create an example, but I can point you at:
>
>
> https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
>
> 5.12.2.2. Partition Maintenance
>
> "As an alternative to creating a new partition, it is sometimes more
> convenient to create a new table separate from the partition structure
> and attach it as a partition later. ... "
>
> Read the section starting above.
>
Thanks Adrian,
I tried similar things as in the doc, but I found my mistake.
I casually did not put a constraint of not null on the partition column. as
a result i think to invalidate any of that data, it does a scan despite the
boundary constraint.
which when i used not null for the partition column, the scans stopped.
although my question was asking something else, but maybe i will not be
able to put it clearly.
at least this part got addressed well. thanks for your help and time.
/*
postgres=# drop table t;
DROP TABLE
postgres=# drop table t1;
ERROR: table "t1" does not exist
postgres=# create table t(col1 int) partition by list(col1);
CREATE TABLE
postgres=# create table t1 (LIKE t INCLUDING DEFAULTS INCLUDING
CONSTRAINTS); CREATE TABLE
postgres=# insert into t1 select 0 from generate_series(1, 100000) x;
INSERT 0 100000
postgres=# alter table t1 add constraint col10 check (col1 = 0);
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 1 | 2024-10-17 05:33:40.080319+00 | -00:00:02.488752 |
100000
(1 row)
postgres=# alter table t attach partition t1 for values in (0); --
note the scan count increased as col1 was not set to not null
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 2 | 2024-10-17 05:33:48.248001+00 | -00:00:01.510694 |
200000
(1 row)
postgres=# drop table t;
DROP TABLE
postgres=# drop table t1;
ERROR: table "t1" does not exist
postgres=# create table t(col1 int not null) partition by list(col1);
CREATE TABLE
postgres=# create table t1 (LIKE t INCLUDING DEFAULTS INCLUDING
CONSTRAINTS); CREATE TABLE
postgres=# insert into t1 select 0 from generate_series(1, 100000) x;
INSERT 0 100000
postgres=# alter table t1 add constraint col10 check (col1 = 0);
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+-----------------+--------------
t1 | 1 | 2024-10-17 05:34:41.363401+00 | -00:00:03.19836 |
100000
(1 row)
postgres=# alter table t attach partition t1 for values in (0); -- note no
scan count bump as not null on column to avoid scanning to filter those
values -
ALTER TABLE
postgres=# select relname,seq_scan,last_seq_scan, age(last_seq_scan,
current_timestamp), seq_tup_read from pg_stat_user_tables where relname =
't1';
relname | seq_scan | last_seq_scan | age |
seq_tup_read
---------+----------+-------------------------------+------------------+--------------
t1 | 1 | 2024-10-17 05:34:41.363401+00 | -00:00:08.241812 |
100000
(1 row)
*/
^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2024-10-17 05:39 UTC | newest]
Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-10-15 21:29 Re: how to know if the sql will run a seq scan Adrian Klaver <[email protected]>
2024-10-16 07:02 ` Vijaykumar Jain <[email protected]>
2024-10-16 16:40 ` Adrian Klaver <[email protected]>
2024-10-17 05:39 ` Vijaykumar Jain <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox