public inbox for [email protected]
help / color / mirror / Atom feedNested loop behaviour with pg_stat_activity
3+ messages / 2 participants
[nested] [flat]
* Nested loop behaviour with pg_stat_activity
@ 2024-03-12 07:30 Sheryl Prabhu David <[email protected]>
2024-03-13 14:30 ` Re: Nested loop behaviour with pg_stat_activity Tom Lane <[email protected]>
0 siblings, 1 reply; 3+ messages in thread
From: Sheryl Prabhu David @ 2024-03-12 07:30 UTC (permalink / raw)
To: [email protected]
Hi,
I am working on a solution to sample pg_stat_activity along the lines of
what is possible in Oracle, described as follows:
In an Oracle database using a SQL like the one below, we have the ability
to use, a series along the lines of Postgresqls generate_series() as seen
in the 'connect by' line as outer table of a forced nested loop, and
v$session(Oracles equivalent of pg_stat_activity) as the inner table, to
get many samples of v$session each second.
select
/*+ ORDERED NO_MERGE USE_NL(g t s) */
g.*,
t.*,
s.*
from
(
select /*+ NO_MERGE */ 1 from dual connect by level <= :v_samples
) g,
(
select /*+ NO_MERGE */ hsecs from v$timer
) t,
(
select
/*+ NO_MERGE */
s.sql_id,
plan_hash_value,
sql_exec_start,
sql_exec_id,
event,
state,
wait_class,
row_wait_obj#,
program,
s.module,
s.action,
status
from
v$session s
left outer join v$sqlarea sa on
s.sql_hash_value=sa.hash_value and s.sql_address=sa.address
where
type != 'BACKGROUND'
) s
order by
sql_id,
sql_exec_id,
plan_hash_value,
event,
row_wait_obj#
Trying something similar in Postgres does not produce an equivalent result.
Instead of 'Number of sample' times *DIFFERENT* copies of pg_stat_activity,
we are seeing 'Number of sample' times *SAME* copy of pg_stat_activity,
unlike Oracle. MVCC and Isolation guarantees for regular tables is expected
to produce this kind of a result, but I was hoping pg_stat_activity being a
portal into internal data structures will act similar to Oracles v$session
bypassing MVCC+Isolation. I am hoping to find out if there is anyway to
force Oracle type behaviour for pg_stat_activity, please help.
SQL I have tried in Postgres:
vagrant=# set enable_material = off;
SET
vagrant=# explain
with sampled as materialized (
select
*
from
(
select
pid,
datname,
usename,
query,
query_start,
wait_event_type,
wait_event,
state
from
pg_stat_activity i,
( select * from generate_series(1,100) ) o
offset 0
)
where
state = 'active'
)
select
query,
query_start,
wait_event_type,
wait_event,
count(*)
from
sampled
group by
query,
query_start,
wait_event_type,
wait_event;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
HashAggregate (cost=330.56..331.06 rows=50 width=112)
Group Key: sampled.query, sampled.query_start, sampled.wait_event_type,
sampled.wait_event
CTE sampled
-> Subquery Scan on unnamed_subquery (cost=2.39..328.93 rows=50
width=268)
Filter: (unnamed_subquery.state = 'active'::text)
-> Nested Loop (cost=2.39..203.93 rows=10000 width=268)
-> Hash Left Join (cost=2.38..3.93 rows=100 width=268)
Hash Cond: (s.usesysid = u.oid)
-> Hash Left Join (cost=1.05..2.32 rows=100
width=208)
Hash Cond: (s.datid = d.oid)
-> Function Scan on pg_stat_get_activity s
(cost=0.00..1.00 rows=100 width=148)
-> Hash (cost=1.02..1.02 rows=2 width=68)
-> Seq Scan on pg_database d
(cost=0.00..1.02 rows=2 width=68)
-> Hash (cost=1.15..1.15 rows=15 width=68)
-> Seq Scan on pg_authid u (cost=0.00..1.15
rows=15 width=68)
-> Function Scan on generate_series (cost=0.00..1.00
rows=100 width=0)
-> CTE Scan on sampled (cost=0.00..1.00 rows=50 width=104)
(17 rows)
Could not get the nested loop order correct with this.
vagrant=# explain
with sampled as (
select
*
from
(
with o as (
select
pid,
datname,
usename,
query,
query_start,
wait_event_type,
wait_event,
state
from
pg_stat_activity
offset 0
)
select
o2.*
from
o,
o as o1,
o as o2
offset 0
)
where
state = 'active'
)
select
query,
query_start,
wait_event_type,
wait_event,
count(*)
from
sampled
group by
query,
query_start,
wait_event_type,
wait_event;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=33015.13..33139.02 rows=4889 width=112)
Group Key: unnamed_subquery.query, unnamed_subquery.query_start,
unnamed_subquery.wait_event_type, unnamed_subquery.wait_event
-> Sort (cost=33015.13..33027.63 rows=5000 width=104)
Sort Key: unnamed_subquery.query, unnamed_subquery.query_start,
unnamed_subquery.wait_event_type, unnamed_subquery.wait_event
-> Subquery Scan on unnamed_subquery (cost=3.93..32707.93
rows=5000 width=104)
Filter: (unnamed_subquery.state = 'active'::text)
-> Nested Loop (cost=3.93..20207.93 rows=1000000 width=268)
CTE o
-> Hash Left Join (cost=2.38..3.93 rows=100
width=268)
Hash Cond: (s.usesysid = u.oid)
-> Hash Left Join (cost=1.05..2.32 rows=100
width=208)
Hash Cond: (s.datid = d.oid)
-> Function Scan on
pg_stat_get_activity s (cost=0.00..1.00 rows=100 width=148)
-> Hash (cost=1.02..1.02 rows=2
width=68)
-> Seq Scan on pg_database d
(cost=0.00..1.02 rows=2 width=68)
-> Hash (cost=1.15..1.15 rows=15 width=68)
-> Seq Scan on pg_authid u
(cost=0.00..1.15 rows=15 width=68)
-> Nested Loop (cost=0.00..203.00 rows=10000 width=0)
-> CTE Scan on o (cost=0.00..2.00 rows=100
width=0)
-> CTE Scan on o o1 (cost=0.00..2.00 rows=100
width=0)
-> CTE Scan on o o2 (cost=0.00..2.00 rows=100
width=136)
(21 rows)
With this I am able to get the nested loop order correct where the alias o2
from which we are taking the results is the inner table in join order, but
still we only get multiple copies of one state from pg_stat_activity.
Please suggest alternatives if any.
Regards
Prabhu David
^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Nested loop behaviour with pg_stat_activity
2024-03-12 07:30 Nested loop behaviour with pg_stat_activity Sheryl Prabhu David <[email protected]>
@ 2024-03-13 14:30 ` Tom Lane <[email protected]>
2024-03-14 07:29 ` Re: Nested loop behaviour with pg_stat_activity Sheryl Prabhu David <[email protected]>
0 siblings, 1 reply; 3+ messages in thread
From: Tom Lane @ 2024-03-13 14:30 UTC (permalink / raw)
To: Sheryl Prabhu David <[email protected]>; +Cc: [email protected]
Sheryl Prabhu David <[email protected]> writes:
> In an Oracle database using a SQL like the one below, we have the ability
> to use, a series along the lines of Postgresqls generate_series() as seen
> in the 'connect by' line as outer table of a forced nested loop, and
> v$session(Oracles equivalent of pg_stat_activity) as the inner table, to
> get many samples of v$session each second.
That's ... impressively ill-defined. How do you know you are getting
consistent "samples" at all?
> Trying something similar in Postgres does not produce an equivalent result.
> Instead of 'Number of sample' times *DIFFERENT* copies of pg_stat_activity,
> we are seeing 'Number of sample' times *SAME* copy of pg_stat_activity,
> unlike Oracle. MVCC and Isolation guarantees for regular tables is expected
> to produce this kind of a result, but I was hoping pg_stat_activity being a
> portal into internal data structures will act similar to Oracles v$session
> bypassing MVCC+Isolation. I am hoping to find out if there is anyway to
> force Oracle type behaviour for pg_stat_activity, please help.
A Postgres session captures a snapshot of pg_stat_activity on first
reference, and holds it until end of transaction or you call
pg_stat_clear_snapshot(). Without this behavior you would get total
garbage from queries as the data changes under you, especially so
from join queries which may require multiple scans of the input.
I don't think there's any way to precisely duplicate what you describe
doing in Postgres, but you can easily get a similar result by
alternating "SELECT pg_stat_clear_snapshot()" with selects from
pg_stat_activity.
regards, tom lane
^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Nested loop behaviour with pg_stat_activity
2024-03-12 07:30 Nested loop behaviour with pg_stat_activity Sheryl Prabhu David <[email protected]>
2024-03-13 14:30 ` Re: Nested loop behaviour with pg_stat_activity Tom Lane <[email protected]>
@ 2024-03-14 07:29 ` Sheryl Prabhu David <[email protected]>
0 siblings, 0 replies; 3+ messages in thread
From: Sheryl Prabhu David @ 2024-03-14 07:29 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
Thanks a lot Tom, Postgres community truly is impressive, I never thought I
will get a response this soon if at all.
In Oracle with v$session being the inner table of a nested loop for each
scan we get different copies of it even as the data changes under us, thats
the hack someone figured out, and it gives us tons of samples - in the
order of ~500 per second with a single SQL run.
In Postgres the reason I mention the behaviour is different is that the
count I get with the outer group by for each connected client is one record
with a count of 2000 if I use 2000 records from generate_series. Oracles
behaviour would be multiple records each with a different wait event adding
up to 2000.
As you describe this may not be possible with Postgres. I will try SELECT
pg_stat_clear_snapshot() but not sure if that can be done is a single SQL
statement. Will let you know if it works.
Thanks a lot again.
Regards
Prabhu David
On Wed, Mar 13, 2024 at 8:00 PM Tom Lane <[email protected]> wrote:
> Sheryl Prabhu David <[email protected]> writes:
> > In an Oracle database using a SQL like the one below, we have the ability
> > to use, a series along the lines of Postgresqls generate_series() as seen
> > in the 'connect by' line as outer table of a forced nested loop, and
> > v$session(Oracles equivalent of pg_stat_activity) as the inner table, to
> > get many samples of v$session each second.
>
> That's ... impressively ill-defined. How do you know you are getting
> consistent "samples" at all?
>
> > Trying something similar in Postgres does not produce an equivalent
> result.
> > Instead of 'Number of sample' times *DIFFERENT* copies of
> pg_stat_activity,
> > we are seeing 'Number of sample' times *SAME* copy of pg_stat_activity,
> > unlike Oracle. MVCC and Isolation guarantees for regular tables is
> expected
> > to produce this kind of a result, but I was hoping pg_stat_activity
> being a
> > portal into internal data structures will act similar to Oracles
> v$session
> > bypassing MVCC+Isolation. I am hoping to find out if there is anyway to
> > force Oracle type behaviour for pg_stat_activity, please help.
>
> A Postgres session captures a snapshot of pg_stat_activity on first
> reference, and holds it until end of transaction or you call
> pg_stat_clear_snapshot(). Without this behavior you would get total
> garbage from queries as the data changes under you, especially so
> from join queries which may require multiple scans of the input.
>
> I don't think there's any way to precisely duplicate what you describe
> doing in Postgres, but you can easily get a similar result by
> alternating "SELECT pg_stat_clear_snapshot()" with selects from
> pg_stat_activity.
>
> regards, tom lane
>
^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2024-03-14 07:29 UTC | newest]
Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-03-12 07:30 Nested loop behaviour with pg_stat_activity Sheryl Prabhu David <[email protected]>
2024-03-13 14:30 ` Tom Lane <[email protected]>
2024-03-14 07:29 ` Sheryl Prabhu David <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox