public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jelte Fennema-Nio <[email protected]>
To: Andres Freund <[email protected]>
Cc: [email protected]
Cc: Robert Haas <[email protected]>
Cc: Bruce Momjian <[email protected]>
Subject: Re: libpq maligning postgres stability
Date: Tue, 26 May 2026 09:22:07 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <up7whuxn263agr7whdzzoak7mo7ywryc7z7z3ejdvgbbke4vji@3puckkgm4ide>
References: <up7whuxn263agr7whdzzoak7mo7ywryc7z7z3ejdvgbbke4vji@3puckkgm4ide>

On Thu, 27 Mar 2025 at 16:19, Andres Freund <[email protected]> wrote:
> And we don't even just add this message when the connection was actually
> closed unexpectedly, we often do it even when we *did* get a FATAL, as in this
> example:
>
> psql -c 'select pg_terminate_backend(pg_backend_pid())'
> FATAL:  57P01: terminating connection due to administrator command
> LOCATION:  ProcessInterrupts, postgres.c:3351
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> connection to server was lost
>
>
> I think this one is mostly a weakness in how libpq tracks connection state,
> but it kind of shows the silliness of claiming postgres probably crashed.

I ran into this for the nth time (this time while trying to have psql
handle certain FATAL errors differently). Turns out fixing this is
actually really simple. All that's needed is to mark a connection as
CONNECTION_BAD whenever a FATAL or PANIC error is received by the
client.

(this change is intended for PG20)


Attachments:

  [text/x-patch] v1-0001-libpq-Consider-a-connection-with-a-FATAL-error-to.patch (2.5K, 2-v1-0001-libpq-Consider-a-connection-with-a-FATAL-error-to.patch)
  download | inline diff:
From a299e7c7779331dbc09d2c8bfc5923153e15763a Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Tue, 26 May 2026 09:05:56 +0200
Subject: [PATCH v1] libpq: Consider a connection with a FATAL error to be
 closed

This starts marking a connection as closed (i.e. CONNECTION_BAD) when
the client receives a FATAL/PANIC error. Previously any FATAL error would get the
the "server closed the connection unexpectedly" string appended like such:

FATAL:  57P01: terminating connection due to administrator command
LOCATION:  ProcessInterrupts, postgres.c:3431
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

This addition to the error is just plain incorrect, the server told the
client that it was closing the connection. So it's not unexpected, nor
did the server terminate abnormally. It also makes the error harder to
parse by a client, because it would lose the ability to use
PQresultErrorField on the final PGresult.
---
 src/interfaces/libpq/fe-protocol3.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 840e018cd18..504e8592196 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -955,6 +955,25 @@ pqGetErrorNotice3(PGconn *conn, bool isError)
 					sizeof(conn->last_sqlstate));
 		else if (id == PG_DIAG_STATEMENT_POSITION)
 			have_position = true;
+		else if (isError && id == PG_DIAG_SEVERITY_NONLOCALIZED &&
+				 (strcmp(workBuf.data, "FATAL") == 0 ||
+				  strcmp(workBuf.data, "PANIC") == 0))
+		{
+			/*
+			 * A FATAL or PANIC from the server means the backend is going to
+			 * tear the connection down right after delivering this message.
+			 * Mark the connection bad immediately so callers that drain
+			 * results (PQexecFinish, PQexecStart's discard loop, etc.) stop
+			 * reading from the socket after receiving this result. Further
+			 * reads from the socket will receive an EOF, which would cause us
+			 * to incorrectly report this as an unexpected connection closure
+			 * by appending "server closed the connection unexpectedly ..." to
+			 * the server's own error message. We read SEVERITY_NONLOCALIZED
+			 * rather than SEVERITY so the check is independent of the
+			 * server's lc_messages setting.
+			 */
+			conn->status = CONNECTION_BAD;
+		}
 	}
 
 	/*

base-commit: 2c4bd2bf5700db98be0602854a8b7fa2c16b5f4a
-- 
2.54.0



reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: libpq maligning postgres stability
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox