Subject: Re: Suddenly all queries moved to seq scan
From: Daniel Gustafsson
Date: Wed, 20 Nov 2024 14:02:01 +0100
To: Sreejith P
Cc: pgsql-general@lists.postgresql.org

> On 20 Nov 2024, at 11:50, Sreejith P wrote:

> We are using PostgresQL 10 in our production database. We have around 890 req /s request on peak time.

PostgreSQL 10 is well out of support and does not receive bugfixes or security fixes; you should plan a migration to a supported version sooner rather than later.

> 2 days back we applied some patches in the primary server and restarted. We didn't do anything on the secondary server.

Patches to the operating system, postgres, another application?

> Next day, After 18 hours all our queries from secondary servers started taking too much time. queries were working in 2 sec started taking 80 seconds. Almost all queries behaved the same way.
>
> After half an hour of outage we restarted all db servers and system back to normal.
>
> Still we are not able to understand the root case. We couldn't find any error log or fatal errors. During the incident, in one of the read server disks was full. We couldn't see any replication lag or query cancellation due to replication.

You say that all queries started doing sequential scans. Is that an assumption from the queries being slow, or did you capture plans for the queries which can be compared against "normal" production plans?

--
Daniel Gustafsson
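
If plans were captured, comparing them against the "normal" production plans can be partly automated. The following is a minimal sketch, not something from the thread: it assumes both plans were saved as `EXPLAIN (FORMAT JSON)` output, and the table and index names (`orders`, `orders_customer_idx`) as well as the helper names are made up for illustration.

```python
import json


def scan_types(explain_json):
    """Map each relation in an EXPLAIN (FORMAT JSON) plan to its scan node types."""
    plans = json.loads(explain_json)
    by_relation = {}
    stack = [plans[0]["Plan"]]  # the JSON output is a one-element list with a "Plan" key
    while stack:
        node = stack.pop()
        relation = node.get("Relation Name")
        if relation is not None:
            by_relation.setdefault(relation, set()).add(node["Node Type"])
        stack.extend(node.get("Plans", []))  # child nodes, if any
    return by_relation


def plan_changes(baseline_json, incident_json):
    """Return {relation: (old node types, new node types)} where the two plans differ."""
    before = scan_types(baseline_json)
    after = scan_types(incident_json)
    changes = {}
    for relation in sorted(set(before) | set(after)):
        old = sorted(before.get(relation, set()))
        new = sorted(after.get(relation, set()))
        if old != new:
            changes[relation] = (old, new)
    return changes


# Hypothetical plans: illustrative data only, not taken from the reported incident.
BASELINE = json.dumps([{"Plan": {"Node Type": "Index Scan",
                                 "Relation Name": "orders",
                                 "Index Name": "orders_customer_idx"}}])
INCIDENT = json.dumps([{"Plan": {"Node Type": "Seq Scan",
                                 "Relation Name": "orders"}}])

print(plan_changes(BASELINE, INCIDENT))
# {'orders': (['Index Scan'], ['Seq Scan'])}
```

An empty result means both plans scan the same relations with the same node types, which would point away from a plan change as the cause of the slowdown.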