Suddenly all queries moved to seq scan

public inbox for [email protected]  

4+ messages / 3 participants

* Suddenly all queries moved to seq scan
@ 2024-11-20 10:50  Sreejith P <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Sreejith P @ 2024-11-20 10:50 UTC (permalink / raw)
  To: [email protected]

Hi,

We are using PostgreSQL 10 in our production database. We handle around 890
requests/s at peak time.

We have 1 primary and 4 secondary databases in the same Postgres
cluster.

Two days ago we applied some patches to the primary server and restarted it.
We did not change anything on the secondary servers.

The next day, about 18 hours later, all our queries on the secondary servers
started taking far too long. Queries that had been completing in 2 seconds
started taking 80 seconds. Almost all queries behaved the same way.

After half an hour of outage we restarted all DB servers and the system
returned to normal.

We are still not able to understand the root cause. We could not find any
errors or fatal errors in the logs. During the incident, the disk on one of
the read servers was full. We did not see any replication lag or query
cancellation due to replication.

Please help.

Regards
Sreejith

-- 

*Solutions for Care Anywhere*
*dWise HealthCare IT Solutions Pvt. Ltd.* | www.lifetrenz.com <http://www.lifetrenz.com>
*Disclaimer*:
The information and attachments contained in this email are intended for
exclusive use of the addressee(s) and may contain confidential or
privileged information. If you are not the intended recipient, please
notify the sender immediately and destroy all copies of this message and
any attachments. The views expressed in this email are, unless otherwise
stated, those of the author and not those of dWise HealthCare IT Solutions
or its management.


* Re: Suddenly all queries moved to seq scan
@ 2024-11-20 13:02  Daniel Gustafsson <[email protected]>
  parent: Sreejith P <[email protected]>
  0 siblings, 2 replies; 4+ messages in thread

From: Daniel Gustafsson @ 2024-11-20 13:02 UTC (permalink / raw)
  To: Sreejith P <[email protected]>; +Cc: [email protected]

> On 20 Nov 2024, at 11:50, Sreejith P <[email protected]> wrote:

> We are using PostgresQL 10 in our production database.  We have around 890 req /s request on peak time.

PostgreSQL 10 is well out of support and does not receive bugfixes or security
fixes; you should plan a migration to a supported version sooner rather than
later.

> 2 days back we applied some patches in the primary server and restarted. We didn't do anything on the secondary server.

Patches to the operating system, postgres, another application?

> Next day, After 18 hours all our queries from secondary servers started taking too much time.  queries were working in 2 sec started taking 80 seconds. Almost all queries behaved the same way.
> 
> After half an hour of outage we restarted all db servers and system back to normal.
> 
> Still we are not able to understand the root case. We couldn't find any error log or fatal errors.  During the incident, in  one of the read server disks was full. We couldn't see any replication lag or query cancellation due to replication.

You say that all queries started doing sequential scans. Is that an assumption
based on the queries being slow, or did you capture plans for the queries which
can be compared against "normal" production plans?
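
One way to make such a comparison concrete is to capture `EXPLAIN (FORMAT JSON)`
output for the same query during normal operation and during an incident, then
diff the scan nodes. The following is a minimal Python sketch under that
assumption; the table name `visits`, the index name, and both sample plans are
hypothetical and not taken from this thread.

```python
import json


def scan_nodes(explain_json):
    """Return sorted (node type, relation) pairs for every plan node that reads a relation."""
    # EXPLAIN (FORMAT JSON) returns a one-element list whose item has a "Plan" key.
    plan_root = json.loads(explain_json)[0]["Plan"]
    found = []
    stack = [plan_root]
    while stack:
        node = stack.pop()
        if "Relation Name" in node:
            found.append((node["Node Type"], node["Relation Name"]))
        # Child nodes, if any, live under "Plans".
        stack.extend(node.get("Plans", []))
    return sorted(found)


def regressed_to_seq_scan(normal_json, incident_json):
    """Relations that are sequentially scanned in the incident plan but not in the normal one."""
    normal = {rel for kind, rel in scan_nodes(normal_json) if kind == "Seq Scan"}
    incident = {rel for kind, rel in scan_nodes(incident_json) if kind == "Seq Scan"}
    return sorted(incident - normal)


# Hypothetical plans, trimmed to the keys this sketch reads.
normal_plan = json.dumps(
    [{"Plan": {"Node Type": "Index Scan", "Relation Name": "visits",
               "Index Name": "visits_pkey"}}]
)
incident_plan = json.dumps(
    [{"Plan": {"Node Type": "Seq Scan", "Relation Name": "visits"}}]
)

print(regressed_to_seq_scan(normal_plan, incident_plan))
```

With real data, the two JSON strings would come from plans saved on the same
server for the same query text, so that differences reflect the planner's
choice rather than a different query.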

--
Daniel Gustafsson








* Re: Suddenly all queries moved to seq scan
@ 2024-11-20 13:21  Efrain J. Berdecia <[email protected]>
  parent: Daniel Gustafsson <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Efrain J. Berdecia @ 2024-11-20 13:21 UTC (permalink / raw)
  To: Sreejith P <[email protected]>; Daniel Gustafsson <[email protected]>; +Cc: [email protected] <[email protected]>

Make sure to run ANALYZE on the entire database; using vacuumdb may be faster.
Also, check for invalid indexes.
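
As a rough illustration of both suggestions, the Python sketch below builds a
`vacuumdb` invocation that only refreshes planner statistics, and keeps a
catalog query that lists invalid indexes. The host name and the job count are
placeholders, not values from this thread. ANALYZE would have to run on the
primary, since hot standbys are read-only and receive statistics through
replication.

```python
import shlex

# Lists indexes that PostgreSQL has marked invalid, for example after a
# failed CREATE INDEX CONCURRENTLY. Run it with psql in each database.
INVALID_INDEX_SQL = """
SELECT n.nspname AS schema_name, c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;
"""


def vacuumdb_analyze_command(host, jobs=4):
    """Build a vacuumdb call that analyzes every database without vacuuming."""
    return [
        "vacuumdb",
        "--host", host,       # placeholder host, assumed to be the primary
        "--all",              # every database in the cluster
        "--analyze-only",     # update statistics only, skip the vacuum step
        "--jobs", str(jobs),  # run several ANALYZE commands in parallel
    ]


cmd = vacuumdb_analyze_command("primary.example.internal")
print(shlex.join(cmd))
print(INVALID_INDEX_SQL.strip())
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the
host and credentials match the real environment.
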
Efrain J. Berdecia 


* Re: Suddenly all queries moved to seq scan
@ 2024-11-20 14:08  Sreejith P <[email protected]>
  parent: Daniel Gustafsson <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Sreejith P @ 2024-11-20 14:08 UTC (permalink / raw)
  To: Daniel Gustafsson <[email protected]>; +Cc: [email protected]



> On 20 Nov 2024, at 6:32 PM, Daniel Gustafsson <[email protected]> wrote:
> 
>> On 20 Nov 2024, at 11:50, Sreejith P <[email protected]> wrote:
> 
>> We are using PostgresQL 10 in our production database.  We have around 890 req /s request on peak time.
> 
> PostgreSQL 10 is well out of support and does not receive bugfixes or security
> fixes, you should plan a migration to a supported version sooner rather than
> later.
> 
>> 2 days back we applied some patches in the primary server and restarted. We didn't do anything on the secondary server.
> 
> Patches to the operating system, postgres, another application? 
PostgreSQL Common 10.23-6 
> 
>> Next day, After 18 hours all our queries from secondary servers started taking too much time.  queries were working in 2 sec started taking 80 seconds. Almost all queries behaved the same way.
>> 
>> After half an hour of outage we restarted all db servers and system back to normal.
>> 
>> Still we are not able to understand the root case. We couldn't find any error log or fatal errors.  During the incident, in  one of the read server disks was full. We couldn't see any replication lag or query cancellation due to replication.
> 
> You say that all queries started doing sequential scans, is that an assumption
> from queries being slow or did you capture plans for the queries which be
> compared against "normal" production plans?.

Queries that were taking 20 ms started taking 60 seconds, so we ran an SQL analyse to understand the query plan. There we found that the query planner was choosing a seq scan instead of an index scan.

I would like to add one more point: a DELETE query had been running in the DB for 2 days, deleting around 80 million records.
> 
> --
> Daniel Gustafsson
> 


