public inbox for [email protected]  
help / color / mirror / Atom feed
Regression from 9.4-9.6
6+ messages / 2 participants
[nested] [flat]

* Regression from 9.4-9.6
@ 2017-10-08 19:25  Jim Nasby <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Jim Nasby @ 2017-10-08 19:25 UTC (permalink / raw)
  To: pgsql-performance

I've got a query that's regressed from 9.4 to 9.6. I suspect it has 
something to do with the work done around bad plans from single-row 
estimates. There's a SeqScan happening even though the join is to the PK 
of bd_ident. Full plans are at [1,2,3], but here's the relevant bits...

9.4:
>    ->  Nested Loop Left Join  (cost=1.00..50816.55 rows=1 width=27) (actual time=979.406..3213.286 rows=508 loops=1)
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
>                Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
...
>          ->  Index Scan using bd_ident_pkey on bd_ident i  (cost=0.43..4.45 rows=1 width=11) (actual time=0.006..0.006 rows=1 loops=508)
>                Index Cond: (bdata_forks.ident_id = id)
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=508)
>                  Index Cond: (id = bdata_forks.destination_id)

9.6:
>    ->  Nested Loop Left Join  (cost=0.57..14994960.40 rows=1 width=71) (actual time=931.479..327972.891 rows=508 loops=1)
>          Join Filter: (bdata_forks.ident_id = i.id)
>          Rows Removed by Join Filter: 1713127892
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
...
>          ->  Seq Scan on bd_ident i  (cost=0.00..58566.00 rows=3372300 width=11) (actual time=0.002..280.966 rows=3372300 loops=508)
                ^^^^^^^^
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual time=0.009..0.009 rows=1 loops=508)
>                  Index Cond: (id = bdata_forks.destination_id)

Altering the predicates somewhat (removing one of the timestamp 
conditions) results in the input to the outer part of the nested loop 
estimating at 326 rows instead of 1, which generates a good plan:

>    ->  Nested Loop Left Join  (cost=1.00..14535906.91 rows=326 width=71) (actual time=23.670..4558.273 rows=3543 loops=1)
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14532973.05 rows=326 width=36) (actual time=23.647..4522.428 rows=3543 loops=1)
 
                           ^^^^^^^^
...
>          ->  Index Scan using bd_ident_pkey on bd_ident i  (cost=0.43..4.40 rows=1 width=11) (actual time=0.005..0.006 rows=1 loops=3543)
>                Index Cond: (bdata_forks.ident_id = id)
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=3543)
>                  Index Cond: (id = bdata_forks.destination_id)

1: https://explain.depesz.com/s/2A90
2: https://explain.depesz.com/s/jKdr
3: https://explain.depesz.com/s/nFh
-- 
Jim C. Nasby, Data Architect                       [email protected]
512.569.9461 (cell)                         http://jim.nasby.net


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: Regression from 9.4-9.6
@ 2017-10-08 19:34  Tom Lane <[email protected]>
  parent: Jim Nasby <[email protected]>
  0 siblings, 1 reply; 6+ messages in thread

From: Tom Lane @ 2017-10-08 19:34 UTC (permalink / raw)
  To: Jim Nasby <[email protected]>; +Cc: pgsql-performance

Jim Nasby <[email protected]> writes:
> I've got a query that's regressed from 9.4 to 9.6. I suspect it has 
> something to do with the work done around bad plans from single-row 
> estimates.

Why has this indexscan's cost estimate changed so much?

>> ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)

>> ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)

I think the reason it's discarding the preferable plan is that, with this
huge increment in the estimated cost getting added to both alternatives,
the two nestloop plans have fuzzily the same total cost, and it's picking
the one you don't want on the basis of some secondary criterion.

			regards, tom lane


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: Regression from 9.4-9.6
@ 2017-10-08 20:02  Jim Nasby <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 2 replies; 6+ messages in thread

From: Jim Nasby @ 2017-10-08 20:02 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: pgsql-performance

On 10/8/17 2:34 PM, Tom Lane wrote:
> Jim Nasby <[email protected]> writes:
>> I've got a query that's regressed from 9.4 to 9.6. I suspect it has
>> something to do with the work done around bad plans from single-row
>> estimates.
> 
> Why has this indexscan's cost estimate changed so much?
> 
>>> ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
> 
>>> ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
> 
> I think the reason it's discarding the preferable plan is that, with this
> huge increment in the estimated cost getting added to both alternatives,
> the two nestloop plans have fuzzily the same total cost, and it's picking
> the one you don't want on the basis of some secondary criterion.

Great question... the only thing that sticks out is the coalesce(). Let 
me see if an analyze with a higher stats target changes anything. FWIW, 
the 9.6 database is copied from the 9.4 one once a week and then 
pg_upgraded. I'm pretty sure an ANALYZE is part of that process.

9.4:
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
>                Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
>                Filter: (((view_www IS NULL) OR (view_www IS TRUE)) AND (sch_block_out IS NOT NULL) AND (diverted IS NOT TRUE) AND (true_cancel IS NOT TRUE) AND (sch_block_out >= '2017-07-23 05:00:00'::timestamp without time zone) AND (sch_block_out <= '2017-07-24 04:59:59'::timestamp without time zone) AND (COALESCE(actualarrivaltime, cancellation) >= actualdeparturetime) AND ((act_block_out - sch_block_out) >= '00:15:00'::interval) AND (((SubPlan 2))::text = 'KORD'::text))
>                Rows Removed by Filter: 2696593
>                SubPlan 2
>                  ->  Index Scan using bd_airport_pkey on bd_airport bd_airport_1  (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=21652)
>                        Index Cond: (id = bdata_forks.origin_id)

9.6:
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
>                Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
>                Filter: (((view_www IS NULL) OR (view_www IS TRUE)) AND (sch_block_out IS NOT NULL) AND (diverted IS NOT TRUE) AND (true_cancel IS NOT TRUE) AND (sch_block_out >= '2017-07-23 05:00:00'::timestamp without time zone) AND (sch_block_out <= '2017-07-24 04:59:59'::timestamp without time zone) AND (COALESCE(actualarrivaltime, cancellation) >= actualdeparturetime) AND ((act_block_out - sch_block_out) >= '00:15:00'::interval) AND (((SubPlan 2))::text = 'KORD'::text))
>                Rows Removed by Filter: 2696592
>                SubPlan 2
>                  ->  Index Scan using bd_airport_pkey on bd_airport bd_airport_1  (cost=0.56..4.58 rows=1 width=20) (actual time=0.004..0.004 rows=1 loops=21652)
>                        Index Cond: (id = bdata_forks.origin_id)


-- 
Jim C. Nasby, Data Architect                       [email protected]
512.569.9461 (cell)                         http://jim.nasby.net


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: Regression from 9.4-9.6
@ 2017-10-08 20:33  Jim Nasby <[email protected]>
  parent: Jim Nasby <[email protected]>
  1 sibling, 0 replies; 6+ messages in thread

From: Jim Nasby @ 2017-10-08 20:33 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: pgsql-performance

On 10/8/17 3:02 PM, Jim Nasby wrote:
>>
>>>> -> Index Scan using bdata_filed_departuretime on bdata_forks 
>>>> (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 
>>>> rows=508 loops=1)
>>
>>>> -> Index Scan using bdata_filed_departuretime on bdata_forks 
>>>> (cost=0.57..14894236.06 rows=1 width=36) (actual 
>>>> time=892.664..3025.653 rows=508 loops=1)
>>
>> I think the reason it's discarding the preferable plan is that, with this
>> huge increment in the estimated cost getting added to both alternatives,
>> the two nestloop plans have fuzzily the same total cost, and it's picking
>> the one you don't want on the basis of some secondary criterion.
> 
> Great question... the only thing that sticks out is the coalesce(). Let 
> me see if an analyze with a higher stats target changes anything. FWIW, 
> the 9.6 database is copied from the 9.4 one once a week and then 
> pg_upgraded. I'm pretty sure an ANALYZE is part of that process.

Turns out that analyze is the 'problem'. On the 9.4 database, pg_stats 
shows that the newest date in filed_departuretime is 3/18/2017, while 
the 9.6 database is up-to-date. If I change the query to use 2/9/2018 
instead of 7/20/2017 I get the same results.

So, the larger cost estimate is theoretically more correct. If I set 
random_page_cost = 1 I end up with a good plan.
-- 
Jim C. Nasby, Data Architect                       [email protected]
512.569.9461 (cell)                         http://jim.nasby.net


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: Regression from 9.4-9.6
@ 2017-10-08 20:37  Tom Lane <[email protected]>
  parent: Jim Nasby <[email protected]>
  1 sibling, 1 reply; 6+ messages in thread

From: Tom Lane @ 2017-10-08 20:37 UTC (permalink / raw)
  To: Jim Nasby <[email protected]>; +Cc: pgsql-performance

Jim Nasby <[email protected]> writes:
> On 10/8/17 2:34 PM, Tom Lane wrote:
>> Why has this indexscan's cost estimate changed so much?

> Great question... the only thing that sticks out is the coalesce(). Let 
> me see if an analyze with a higher stats target changes anything. FWIW, 
> the 9.6 database is copied from the 9.4 one once a week and then 
> pg_upgraded. I'm pretty sure an ANALYZE is part of that process.

Hm, now that I see the SubPlan in there, I wonder whether 9.6 is
accounting more conservatively for the cost of the subplan.  It
probably is assuming that the subplan gets run for each row fetched
from the index, although the loops and rows-removed counts show
that the previous filter conditions reject 99% of the fetched rows.

But that code looks the same in 9.4, so I don't understand why
the 9.4 estimate isn't equally large ...

			regards, tom lane


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: Regression from 9.4-9.6
@ 2017-10-08 21:07  Jim Nasby <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 0 replies; 6+ messages in thread

From: Jim Nasby @ 2017-10-08 21:07 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: pgsql-performance

On 10/8/17 3:37 PM, Tom Lane wrote:
> Jim Nasby <[email protected]> writes:
>> On 10/8/17 2:34 PM, Tom Lane wrote:
>>> Why has this indexscan's cost estimate changed so much?
> 
>> Great question... the only thing that sticks out is the coalesce(). Let
>> me see if an analyze with a higher stats target changes anything. FWIW,
>> the 9.6 database is copied from the 9.4 one once a week and then
>> pg_upgraded. I'm pretty sure an ANALYZE is part of that process.
> 
> Hm, now that I see the SubPlan in there, I wonder whether 9.6 is
> accounting more conservatively for the cost of the subplan.  It
> probably is assuming that the subplan gets run for each row fetched
> from the index, although the loops and rows-removed counts show
> that the previous filter conditions reject 99% of the fetched rows.
> 
> But that code looks the same in 9.4, so I don't understand why
> the 9.4 estimate isn't equally large ...

Besides the analyze issue, the other part of this is

[email protected]/20106> select 
pg_size_pretty(pg_relation_size('bdata_forks'));
  pg_size_pretty
----------------
  106 GB
(1 row)

[email protected]/20106> select relpages::bigint*8192/reltuples from 
pg_class where relname='bdata_forks';
      ?column?
------------------
  185.559397863791
(1 row)

With an effective_cache_size of 200GB that's not really helping things. 
But it's also another example of the planner's reluctance towards index 
scans.
-- 
Jim C. Nasby, Data Architect                       [email protected]
512.569.9461 (cell)                         http://jim.nasby.net


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




^ permalink  raw  reply  [nested|flat] 6+ messages in thread


end of thread, other threads:[~2017-10-08 21:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2017-10-08 19:25 Regression from 9.4-9.6 Jim Nasby <[email protected]>
2017-10-08 19:34 ` Tom Lane <[email protected]>
2017-10-08 20:02   ` Jim Nasby <[email protected]>
2017-10-08 20:33     ` Jim Nasby <[email protected]>
2017-10-08 20:37     ` Tom Lane <[email protected]>
2017-10-08 21:07       ` Jim Nasby <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox