public inbox for [email protected]
* Add GoAway protocol message for graceful but fast server shutdown/switchover
@ 2025-10-23 13:04 Jelte Fennema-Nio <[email protected]>
2025-10-24 05:04 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Kirill Reshke <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
0 siblings, 2 replies; 15+ messages in thread
From: Jelte Fennema-Nio @ 2025-10-23 13:04 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>; +Cc: Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
This change introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that the client disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
This new message is used by Postgres during the already existing "smart"
shutdown procedure (i.e. when postmaster receives SIGTERM). When
Postgres is in "smart" shutdown mode existing clients can continue to
run queries as usual but new connection attempts are rejected. This mode
is primarily useful when triggering a switchover of a read replica. A
load balancer can route new connections only to the new read replica,
while the old read replica keeps serving the existing connections until
they disconnect. The problem is that this draining of connections can
often take a long time, even when clients only run very short
queries/transactions, because a session can be kept open much longer
(many connection pools use a 1 hour max connection lifetime by default).
With the introduction of the GoAway message Postgres now sends this
message to all connected clients when it enters smart shutdown mode.
If these clients respond to the message by reconnecting/disconnecting
earlier than their maximum connection lifetime the draining can complete
much quicker. Similar benefits to switchover duration can be achieved
for other applications or proxies implementing the Postgres protocol,
like when switching over a cluster of PgBouncer machines to a newer
version.
Applications/clients that use libpq can periodically check the result of
the new PQgoAwayReceived() function to find out whether they have been
asked to reconnect.
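To make the intended pool behaviour concrete, here is a minimal sketch of how a pool might react to the flag. It is not part of the patchset: `FakeConn` and `Pool` are hypothetical names, and the `consume_input()` / `goaway_received()` methods only mimic the wrappers added in the pytest patch below. The idea is simply that a connection is replaced while idle, never while in use.

```python
from collections import deque


class FakeConn:
    """Hypothetical stand-in for a libpq connection wrapper."""

    def __init__(self, name):
        self.name = name
        self.goaway = False
        self.closed = False

    def consume_input(self):
        # A real wrapper would call PQconsumeInput() here.
        return True

    def goaway_received(self):
        # A real wrapper would call PQgoAwayReceived() here.
        return self.goaway

    def close(self):
        self.closed = True


class Pool:
    """Tiny pool that swaps out idle connections the server asked to go away."""

    def __init__(self, connect, size):
        self._connect = connect
        self._idle = deque(connect() for _ in range(size))

    def acquire(self):
        conn = self._idle.popleft()
        # The connection has no user right now, so this is a convenient
        # moment to honour a pending GoAway by reconnecting.
        conn.consume_input()
        if conn.goaway_received():
            conn.close()
            conn = self._connect()
        return conn

    def release(self, conn):
        self._idle.append(conn)
```

A production pool would more likely run this check from a background reaper as well, so that idle connections are dropped even when nobody calls `acquire()`.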
Attachments:
[text/x-patch] nocfbot.v1-0003-Add-pytest-based-tests-for-GoAway-message.patch (6.9K, 2-nocfbot.v1-0003-Add-pytest-based-tests-for-GoAway-message.patch)
From c8e0d059686678c9f8205199348eb0842e5f0dcb Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Thu, 23 Oct 2025 14:31:52 +0200
Subject: [PATCH v1 3/3] Add pytest based tests for GoAway message
I used this patchset as a trial for the new pytest suite that Jacob is
trying to introduce. Feel free to look at it, but I'd say don't review
this test in detail until we have the pytest changes merged or at least
in a more agreed upon state. This patch is built on top of that v3
patchset. This test is not applied by cfbot.
Testing this any other way is actually quite difficult with the
infrastructure we currently have (of course I could change that, but I'd
much rather spend that energy/time on making the pytest test suite a
thing):
- pgregress and perl tests don't work because we need to call a new
libpq function that is not exposed in psql (I guess I could expose it
with some \goawayreceived command, but it doesn't seem very useful).
- libpq_pipeline cannot test this because it would need to restart
the Postgres server and all it has
---
src/test/pytest/libpq.py | 18 ++++++++++++
src/test/pytest/meson.build | 1 +
src/test/pytest/pypg/fixtures.py | 33 +++++++++++++++++++++
src/test/pytest/pypg/server.py | 50 ++++++++++++++++++++++++++++++++
4 files changed, 102 insertions(+)
diff --git a/src/test/pytest/libpq.py b/src/test/pytest/libpq.py
index b851a117b66..5536b605c16 100644
--- a/src/test/pytest/libpq.py
+++ b/src/test/pytest/libpq.py
@@ -133,6 +133,12 @@ def load_libpq_handle(libdir):
lib.PQftype.restype = ctypes.c_uint
lib.PQftype.argtypes = [_PGresult_p, ctypes.c_int]
+ lib.PQgoAwayReceived.restype = ctypes.c_int
+ lib.PQgoAwayReceived.argtypes = [_PGconn_p]
+
+ lib.PQconsumeInput.restype = ctypes.c_int
+ lib.PQconsumeInput.argtypes = [_PGconn_p]
+
return lib
@@ -340,6 +346,18 @@ class PGconn(contextlib.AbstractContextManager):
error_msg = res.error_message() or f"Unexpected status: {status}"
raise LibpqError(f"Query failed: {error_msg}\nQuery: {query}")
+ def consume_input(self) -> bool:
+ """
+ Consumes any available input from the server. Returns True on success.
+ """
+ return bool(self._lib.PQconsumeInput(self._handle))
+
+ def goaway_received(self) -> bool:
+ """
+ Returns True if a GoAway message was received from the server.
+ """
+ return bool(self._lib.PQgoAwayReceived(self._handle))
+
def connstr(opts: Dict[str, Any]) -> str:
"""
diff --git a/src/test/pytest/meson.build b/src/test/pytest/meson.build
index f53193e8686..3c8518243d9 100644
--- a/src/test/pytest/meson.build
+++ b/src/test/pytest/meson.build
@@ -12,6 +12,7 @@ tests += {
'tests': [
'pyt/test_something.py',
'pyt/test_libpq.py',
+ 'pyt/test_goaway.py',
],
},
}
diff --git a/src/test/pytest/pypg/fixtures.py b/src/test/pytest/pypg/fixtures.py
index cf22c8ec436..ba46f048beb 100644
--- a/src/test/pytest/pypg/fixtures.py
+++ b/src/test/pytest/pypg/fixtures.py
@@ -30,6 +30,30 @@ def remaining_timeout():
return lambda: max(deadline - time.monotonic(), 0)
+@pytest.fixture
+def wait_until(remaining_timeout):
+ def wait_until(error_message="Did not complete in time", timeout=None, interval=1):
+ """
+ Loop until the timeout is reached. If the timeout is reached, raise an
+ exception with the given error message.
+ """
+ if timeout is None:
+ timeout = remaining_timeout()
+
+ end = time.time() + timeout
+ print_progress = timeout / 10 > 4
+ last_printed_progress = 0
+ while time.time() < end:
+ if print_progress and time.time() - last_printed_progress > 4:
+ last_printed_progress = time.time()
+ print(f"{error_message} - will retry")
+ yield
+ time.sleep(interval)
+ raise TimeoutError(error_message)
+
+ return wait_until
+
+
@pytest.fixture(scope="session")
def libpq_handle(libdir):
"""
@@ -149,6 +173,15 @@ def pg_server_module(pg_server_global):
yield s
+@pytest.fixture(autouse=True, scope="function")
+def ensure_server_running(pg_server_global):
+ """
+ Autouse fixture that ensures the server is running before each test.
+ If a test shuts down the server, this will restart it for the next test.
+ """
+ pg_server_global.ensure_running()
+
+
@pytest.fixture
def pg(pg_server_module, remaining_timeout):
"""
diff --git a/src/test/pytest/pypg/server.py b/src/test/pytest/pypg/server.py
index d6675cde93d..f09651c089e 100644
--- a/src/test/pytest/pypg/server.py
+++ b/src/test/pytest/pypg/server.py
@@ -332,6 +332,56 @@ class PostgresServer:
# Server may have already been stopped
pass
+ def ensure_running(self):
+ """
+ Ensure that the PostgreSQL server is running and accepting connections.
+
+ If the server is not running, it will be restarted. This method waits
+ for any in-progress shutdown to complete before attempting to restart.
+ """
+ pid_file = os.path.join(self.datadir, "postmaster.pid")
+
+ # Wait for any in-progress shutdown to complete
+ socket_pattern = os.path.join(self.sockdir, f".s.PGSQL.{self.port}*")
+ for _ in range(100): # Wait up to 10 seconds
+ # Server is fully down when both PID file and sockets are gone
+ if not os.path.exists(pid_file) and len(glob.glob(socket_pattern)) == 0:
+ break
+ # Server is running if PID exists and we can connect
+ if os.path.exists(pid_file):
+ # Use pg_isready to check if server is accepting connections
+ try:
+ pg_isready = os.path.join(self._bindir, "pg_isready")
+ run(
+ pg_isready,
+ "-h",
+ self.sockdir,
+ "-p",
+ self.port,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL,
+ timeout=1,
+ )
+ # Server is up and ready
+ break
+ except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
+ # Server is not ready yet, keep waiting
+ pass
+ time.sleep(0.1)
+
+ # Now check if server needs to be started
+ if not os.path.exists(pid_file):
+ # Restart the server and wait for it to be ready
+ run(
+ self._pg_ctl,
+ "-D",
+ self.datadir,
+ "-l",
+ self._log,
+ "-w",
+ "start",
+ )
+
def cleanup(self):
"""Run all registered cleanup callbacks."""
self._cleanup_stack.close()
--
2.51.1
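The test file itself (pyt/test_goaway.py) is not included in the diff above, so as a rough illustration of how the polling helper and the two new wrappers could fit together, here is a standalone sketch. It is an assumption-laden simplification: `wait_until` is reimplemented as a plain generator (no pytest fixture, no progress printing, monotonic clock), and `FakeConn` and `wait_for_goaway` are hypothetical names.

```python
import time


def wait_until(error_message="Did not complete in time", timeout=5.0, interval=0.01):
    """Yield repeatedly until the deadline passes, then raise TimeoutError."""
    end = time.monotonic() + timeout
    while time.monotonic() < end:
        yield
        time.sleep(interval)
    raise TimeoutError(error_message)


class FakeConn:
    """Hypothetical connection that reports GoAway after a few polls."""

    def __init__(self, polls_before_goaway):
        self._remaining = polls_before_goaway

    def consume_input(self):
        # Stands in for PQconsumeInput(): pretend some bytes arrived.
        self._remaining -= 1
        return True

    def goaway_received(self):
        # Stands in for PQgoAwayReceived().
        return self._remaining <= 0


def wait_for_goaway(conn, timeout=5.0):
    """Poll the connection until GoAway shows up or the timeout expires."""
    for _ in wait_until("GoAway was not received", timeout=timeout):
        if conn.consume_input() and conn.goaway_received():
            return True
```

The loop body decides when to stop; the generator only owns the deadline, which keeps each test's condition readable at the call site.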
[text/x-patch] v1-0001-Bump-protocol-version-to-3.3.patch (3.7K, 3-v1-0001-Bump-protocol-version-to-3.3.patch)
From b329eb7426d47561c30d5faba8bdf0b0bf9838ae Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:30:48 +0200
Subject: [PATCH v1 1/3] Bump protocol version to 3.3
This commit increments the PostgreSQL frontend/backend protocol version
from 3.2 to 3.3 and does the required boilerplate changes. The next
commit will introduce an actual protocol change.
---
doc/src/sgml/protocol.sgml | 4 ++--
src/include/libpq/pqcomm.h | 2 +-
src/interfaces/libpq/fe-connect.c | 12 ++++++++++
.../modules/libpq_pipeline/libpq_pipeline.c | 22 ++++++++++++++++---
4 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9d755232873..8234079deba 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -18,8 +18,8 @@
</para>
<para>
- This document describes version 3.2 of the protocol, introduced in
- <productname>PostgreSQL</productname> version 18. The server and the libpq
+ This document describes version 3.3 of the protocol, introduced in
+ <productname>PostgreSQL</productname> version 19. The server and the libpq
client library are backwards compatible with protocol version 3.0,
implemented in <productname>PostgreSQL</productname> 7.4 and later.
</para>
diff --git a/src/include/libpq/pqcomm.h b/src/include/libpq/pqcomm.h
index f04ca135653..2423394b348 100644
--- a/src/include/libpq/pqcomm.h
+++ b/src/include/libpq/pqcomm.h
@@ -94,7 +94,7 @@ is_unixsock_path(const char *path)
*/
#define PG_PROTOCOL_EARLIEST PG_PROTOCOL(3,0)
-#define PG_PROTOCOL_LATEST PG_PROTOCOL(3,2)
+#define PG_PROTOCOL_LATEST PG_PROTOCOL(3,3)
typedef uint32 ProtocolVersion; /* FE/BE protocol version number */
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a3d12931fff..76f9ce3a23e 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -8301,6 +8301,18 @@ pqParseProtocolVersion(const char *value, ProtocolVersion *result, PGconn *conn,
return true;
}
+ if (strcmp(value, "3.3") == 0)
+ {
+ *result = PG_PROTOCOL(3, 3);
+ return true;
+ }
+
+ if (strcmp(value, "latest") == 0)
+ {
+ *result = PG_PROTOCOL_LATEST;
+ return true;
+ }
+
libpq_append_conn_error(conn, "invalid %s value: \"%s\"",
context, value);
return false;
diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index b3af70fa09b..0b113b271ea 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -1405,7 +1405,23 @@ test_protocol_version(PGconn *conn)
PQfinish(conn);
/*
- * Test max_protocol_version=latest. 'latest' currently means '3.2'.
+ * Test max_protocol_version=3.3
+ */
+ vals[max_protocol_version_index] = "3.3";
+ conn = PQconnectdbParams(keywords, vals, false);
+
+ if (PQstatus(conn) != CONNECTION_OK)
+ pg_fatal("Connection to database failed: %s",
+ PQerrorMessage(conn));
+
+ protocol_version = PQfullProtocolVersion(conn);
+ if (protocol_version != 30003)
+ pg_fatal("expected 30003, got %d", protocol_version);
+
+ PQfinish(conn);
+
+ /*
+ * Test max_protocol_version=latest. 'latest' currently means '3.3'.
*/
vals[max_protocol_version_index] = "latest";
conn = PQconnectdbParams(keywords, vals, false);
@@ -1415,8 +1431,8 @@ test_protocol_version(PGconn *conn)
PQerrorMessage(conn));
protocol_version = PQfullProtocolVersion(conn);
- if (protocol_version != 30002)
- pg_fatal("expected 30002, got %d", protocol_version);
+ if (protocol_version != 30003)
+ pg_fatal("expected 30003, got %d", protocol_version);
PQfinish(conn);
--
2.51.1
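For readers wondering how PG_PROTOCOL(3,3) relates to the 30003 checked in the test above, the following sketch shows the two encodings side by side. It assumes the usual definitions (major version in the high 16 bits for PG_PROTOCOL, and major * 10000 + minor for PQfullProtocolVersion()); the Python function names are made up for illustration.

```python
def pg_protocol(major, minor):
    """Mirror of the PG_PROTOCOL(m, n) macro: major in the high 16 bits."""
    return (major << 16) | minor


def full_protocol_version(proto):
    """Value in the style of PQfullProtocolVersion(): major * 10000 + minor."""
    return (proto >> 16) * 10000 + (proto & 0xFFFF)


def supports_goaway(proto):
    """GoAway is only sent to clients that negotiated 3.3 or later."""
    return proto >= pg_protocol(3, 3)


if __name__ == "__main__":
    latest = pg_protocol(3, 3)
    print(latest, full_protocol_version(latest), supports_goaway(latest))
```

Because the major version sits in the high bits, a plain integer comparison is enough for the "3.3 or later" check, which is the same comparison the backend does against MyProcPort->proto in the next patch.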
[text/x-patch] v1-0002-Add-GoAway-protocol-message-for-graceful-but-fast.patch (16.5K, 4-v1-0002-Add-GoAway-protocol-message-for-graceful-but-fast.patch)
From 9c76c28fd369e176dbadde13134bfad071e04a30 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:36:30 +0200
Subject: [PATCH v1 2/3] Add GoAway protocol message for graceful but fast
server shutdown/switchover
This commit introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that the client disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
This new message is used by Postgres during the already existing "smart"
shutdown procedure (i.e. when postmaster receives SIGTERM). When
Postgres is in "smart" shutdown mode existing clients can continue to
run queries as usual but new connection attempts are rejected. This mode
is primarily useful when triggering a switchover of a read replica. A
load balancer can route new connections only to the new read replica,
while the old read replica keeps serving the existing connections until
they disconnect. The problem is that this draining of connections can
often take a long time, even when clients only run very short
queries/transactions, because a session can be kept open much longer
(many connection pools use a 1 hour max connection lifetime by default).
With the introduction of the GoAway message Postgres now sends this
message to all connected clients when it enters smart shutdown mode.
If these clients respond to the message by reconnecting/disconnecting
earlier than their maximum connection lifetime the draining can complete
much quicker. Similar benefits to switchover duration can be achieved
for other applications or proxies implementing the Postgres protocol,
like when switching over a cluster of PgBouncer machines to a newer
version.
Applications/clients that use libpq can periodically check the result of
PQgoAwayReceived() at an inactive time to see whether they are asked to
reconnect.
---
doc/src/sgml/libpq.sgml | 29 ++++++++++++++++++++
doc/src/sgml/protocol.sgml | 41 ++++++++++++++++++++++++++++
src/backend/postmaster/postmaster.c | 24 ++++++++++++++++
src/backend/storage/ipc/procsignal.c | 3 ++
src/backend/tcop/postgres.c | 37 +++++++++++++++++++++++++
src/include/libpq/protocol.h | 1 +
src/include/storage/procsignal.h | 1 +
src/include/tcop/tcopprot.h | 1 +
src/interfaces/libpq/exports.txt | 1 +
src/interfaces/libpq/fe-connect.c | 1 +
src/interfaces/libpq/fe-exec.c | 27 ++++++++++++++++++
src/interfaces/libpq/fe-protocol3.c | 11 +++++++-
src/interfaces/libpq/libpq-fe.h | 5 ++++
src/interfaces/libpq/libpq-int.h | 1 +
14 files changed, 182 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5bf59a19855..7554d50a6c3 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2992,6 +2992,35 @@ int PQserverVersion(const PGconn *conn);
</listitem>
</varlistentry>
+ <varlistentry id="libpq-PQgoAwayReceived">
+ <term><function>PQgoAwayReceived</function><indexterm><primary>PQgoAwayReceived</primary></indexterm></term>
+ <listitem>
+ <para>
+ Returns true if the server has sent a <literal>GoAway</literal> message,
+ requesting the client to disconnect when convenient.
+
+<synopsis>
+int PQgoAwayReceived(PGconn *conn);
+</synopsis>
+ </para>
+
+ <para>
+ The <literal>GoAway</literal> message is sent by the server during a
+ smart shutdown to politely request that clients disconnect. This is
+ advisory only - the connection remains fully functional and queries
+ can continue to be executed. Applications should check this flag
+ periodically and disconnect gracefully when possible, such as after
+ completing the current transaction or unit of work.
+ </para>
+
+ <para>
+ This message is only sent to clients using protocol version 3.3 or
+ later. The function returns 1 if the <literal>GoAway</literal> message
+ was received, 0 otherwise.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="libpq-PQerrorMessage">
<term>
<function>PQerrorMessage</function><indexterm><primary>PQerrorMessage</primary></indexterm>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8234079deba..340514db007 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -5240,6 +5240,47 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>
+ <varlistentry id="protocol-message-formats-GoAway">
+ <term>GoAway (B)</term>
+ <listitem>
+ <variablelist>
+ <varlistentry>
+ <term>Byte1('g')</term>
+ <listitem>
+ <para>
+ Identifies the message as a polite shutdown request.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int32(4)</term>
+ <listitem>
+ <para>
+ Length of message contents in bytes, including self.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The server sends this message during a smart shutdown to politely
+ request that the client disconnect when convenient. This message is
+ advisory only - the connection remains fully functional and the client
+ may continue to execute queries. Applications should check for this
+ message using <function>PQgoAwayReceived()</function> and disconnect
+ gracefully when possible, such as after completing the current
+ transaction.
+ </para>
+
+ <para>
+ This message is only sent to clients using protocol version 3.3 or later.
+ The server will wait for clients to disconnect during a smart shutdown,
+ but may eventually force termination if shutdown takes too long.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="protocol-message-formats-GSSENCRequest">
<term>GSSENCRequest (F)</term>
<listitem>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..79f116a5a5e 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2126,7 +2126,31 @@ process_pm_shutdown_request(void)
* later state, do not change it.
*/
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
+ {
+ dlist_iter iter;
+
connsAllowed = false;
+
+ /*
+ * Signal all backends to send a GoAway message to their
+ * clients, to politely request that they disconnect.
+ */
+ dlist_foreach(iter, &ActiveChildList)
+ {
+ PMChild *bp = dlist_container(PMChild, elem, iter.cur);
+
+ /*
+ * Only signal regular backends and walsenders. Skip
+ * auxiliary processes and dead-end backends.
+ */
+ if (bp->bkend_type == B_BACKEND ||
+ bp->bkend_type == B_WAL_SENDER)
+ {
+ SendProcSignal(bp->pid, PROCSIG_SMART_SHUTDOWN,
+ INVALID_PROC_NUMBER);
+ }
+ }
+ }
else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
{
/* There should be no clients, so proceed to stop children */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6011c30d520 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -691,6 +691,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
+ if (CheckProcSignal(PROCSIG_SMART_SHUTDOWN))
+ HandleSmartShutdownInterrupt();
+
if (CheckProcSignal(PROCSIG_PARALLEL_APPLY_MESSAGE))
HandleParallelApplyMessageInterrupt();
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7dd75a490aa..ad00d865119 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -42,8 +42,10 @@
#include "common/pg_prng.h"
#include "jit/jit.h"
#include "libpq/libpq.h"
+#include "libpq/pqcomm.h"
#include "libpq/pqformat.h"
#include "libpq/pqsignal.h"
+#include "libpq/protocol.h"
#include "mb/pg_wchar.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
@@ -91,6 +93,14 @@ const char *debug_query_string; /* client-supplied query string */
/* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
CommandDest whereToSendOutput = DestDebug;
+/*
+ * Track whether we've been notified of smart shutdown and sent GoAway.
+ * SmartShutdownPending is set by the PROCSIG_SMART_SHUTDOWN signal handler.
+ * GoAwaySent tracks whether we've already sent the GoAway message.
+ */
+static volatile sig_atomic_t SmartShutdownPending = false;
+static bool GoAwaySent = false;
+
/* flag for logging end of session */
bool Log_disconnections = false;
@@ -508,6 +518,20 @@ ProcessClientReadInterrupt(bool blocked)
/* Check for general interrupts that arrived before/while reading */
CHECK_FOR_INTERRUPTS();
+ /* Send GoAway message if smart shutdown is pending */
+ if (SmartShutdownPending && !GoAwaySent &&
+ whereToSendOutput == DestRemote &&
+ MyProcPort && MyProcPort->proto >= PG_PROTOCOL(3, 3))
+ {
+ StringInfoData buf;
+
+ pq_beginmessage(&buf, PqMsg_GoAway);
+ pq_endmessage(&buf);
+ pq_flush();
+
+ GoAwaySent = true;
+ }
+
/* Process sinval catchup interrupts, if any */
if (catchupInterruptPending)
ProcessCatchupInterrupt();
@@ -3087,6 +3111,18 @@ FloatExceptionHandler(SIGNAL_ARGS)
"invalid operation, such as division by zero.")));
}
+/*
+ * Tell the next CHECK_FOR_INTERRUPTS() or main loop iteration to send a
+ * GoAway message to the client. Runs in a SIGUSR1 handler.
+ */
+void
+HandleSmartShutdownInterrupt(void)
+{
+ SmartShutdownPending = true;
+ InterruptPending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Tell the next CHECK_FOR_INTERRUPTS() to check for a particular type of
* recovery conflict. Runs in a SIGUSR1 handler.
@@ -3313,6 +3349,7 @@ ProcessInterrupts(void)
ProcDiePending = false;
QueryCancelPending = false; /* ProcDie trumps QueryCancel */
LockErrorCleanup();
+
/* As in quickdie, don't risk sending to client during auth */
if (ClientAuthInProgress && whereToSendOutput == DestRemote)
whereToSendOutput = DestNone;
diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index 7bf90053bcb..24fbc9f2613 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -53,6 +53,7 @@
#define PqMsg_FunctionCallResponse 'V'
#define PqMsg_CopyBothResponse 'W'
#define PqMsg_ReadyForQuery 'Z'
+#define PqMsg_GoAway 'g'
#define PqMsg_NoData 'n'
#define PqMsg_PortalSuspended 's'
#define PqMsg_ParameterDescription 't'
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..b629341a4af 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -36,6 +36,7 @@ typedef enum
PROCSIG_BARRIER, /* global barrier interrupt */
PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts */
PROCSIG_PARALLEL_APPLY_MESSAGE, /* Message from parallel apply workers */
+ PROCSIG_SMART_SHUTDOWN, /* notify backend of smart shutdown */
/* Recovery conflict reasons */
PROCSIG_RECOVERY_CONFLICT_FIRST,
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index c1bcfdec673..b7fd22c43bb 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -74,6 +74,7 @@ extern void die(SIGNAL_ARGS);
pg_noreturn extern void quickdie(SIGNAL_ARGS);
extern void StatementCancelHandler(SIGNAL_ARGS);
pg_noreturn extern void FloatExceptionHandler(SIGNAL_ARGS);
+extern void HandleSmartShutdownInterrupt(void);
extern void HandleRecoveryConflictInterrupt(ProcSignalReason reason);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index dbbae642d76..3385e65c389 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -210,3 +210,4 @@ PQgetAuthDataHook 207
PQdefaultAuthDataHook 208
PQfullProtocolVersion 209
appendPQExpBufferVA 210
+PQgoAwayReceived 211
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 76f9ce3a23e..54c79e5d214 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -698,6 +698,7 @@ pqDropServerData(PGconn *conn)
conn->password_needed = false;
conn->gssapi_used = false;
conn->write_failed = false;
+ conn->goaway_received = false;
free(conn->write_err_msg);
conn->write_err_msg = NULL;
conn->oauth_want_retry = false;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 0b1e37ec30b..16bf10280e5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2696,6 +2696,33 @@ PQnotifies(PGconn *conn)
return event;
}
+/*
+ * PQgoAwayReceived
+ * returns 1 if a GoAway message has been received from the server
+ * returns 0 if not
+ *
+ * Note that this function does not read any new data from the socket;
+ * caller should call PQconsumeInput() first if they want to ensure
+ * all available data has been read.
+ */
+int
+PQgoAwayReceived(PGconn *conn)
+{
+ if (!conn)
+ return 0;
+
+ if (conn->goaway_received)
+ return 1;
+
+ /*
+ * Parse any available data to see if a GoAway message has arrived.
+ */
+ pqParseInput3(conn);
+
+ return conn->goaway_received ? 1 : 0;
+}
+
+
/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index da7a8db68c8..9e54e3ff03d 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -157,6 +157,15 @@ pqParseInput3(PGconn *conn)
if (pqGetErrorNotice3(conn, false))
return;
}
+ else if (id == PqMsg_GoAway)
+ {
+ /*
+ * GoAway is an asynchronous message sent during smart shutdown.
+ * Process it immediately regardless of connection state.
+ */
+ conn->goaway_received = true;
+ conn->inCursor += msgLength;
+ }
else if (conn->asyncStatus != PGASYNC_BUSY)
{
/* If not IDLE state, just wait ... */
@@ -168,7 +177,7 @@ pqParseInput3(PGconn *conn)
* ERROR messages are handled using the notice processor;
* ParameterStatus is handled normally; anything else is just
* dropped on the floor after displaying a suitable warning
- * notice. (An ERROR is very possibly the backend telling us why
+ * notice. (An ERROR is very possibly the backend telling us why
* it is about to close the connection, so we don't want to just
* discard it...)
*/
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 0852584edae..1f3bd671a4b 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -63,6 +63,10 @@ extern "C"
/* Indicates presence of the PQAUTHDATA_PROMPT_OAUTH_DEVICE authdata hook */
#define LIBPQ_HAS_PROMPT_OAUTH_DEVICE 1
+/* Features added in PostgreSQL v19: */
+/* Indicates presence of PQgoAwayReceived */
+#define LIBPQ_HAS_GOAWAY 1
+
/*
* Option flags for PQcopyResult
*/
@@ -411,6 +415,7 @@ extern const char *PQparameterStatus(const PGconn *conn,
extern int PQprotocolVersion(const PGconn *conn);
extern int PQfullProtocolVersion(const PGconn *conn);
extern int PQserverVersion(const PGconn *conn);
+extern int PQgoAwayReceived(PGconn *conn);
extern char *PQerrorMessage(const PGconn *conn);
extern int PQsocket(const PGconn *conn);
extern int PQbackendPID(const PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 02c114f1405..7dc6e858a16 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -511,6 +511,7 @@ struct pg_conn
bool sigpipe_flag; /* can we mask SIGPIPE via MSG_NOSIGNAL? */
bool write_failed; /* have we had a write failure on sock? */
char *write_err_msg; /* write error message, or NULL if OOM */
+ bool goaway_received; /* true if server sent GoAway message */
bool auth_required; /* require an authentication challenge from
* the server? */
--
2.51.1
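For authors of non-libpq clients or proxies, the wire format documented in the patch above is small enough to sketch directly: a type byte 'g' followed by an Int32 length of 4 and no payload. The snippet below is only an illustration under that reading of the docs; `encode_goaway` and `split_messages` are hypothetical helpers, not functions from any driver.

```python
import struct

GOAWAY_TYPE = b"g"


def encode_goaway():
    """Build a GoAway frame: type byte plus a length field that counts itself."""
    return GOAWAY_TYPE + struct.pack("!I", 4)


def split_messages(buf):
    """Split a byte buffer into (type, payload) tuples.

    Returns the complete messages and any trailing bytes that do not yet
    form a whole message.
    """
    messages = []
    pos = 0
    while len(buf) - pos >= 5:
        msg_type = buf[pos:pos + 1]
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        end = pos + 1 + length
        if end > len(buf):
            break  # message is incomplete, wait for more data
        messages.append((msg_type, buf[pos + 5:end]))
        pos = end
    return messages, buf[pos:]


def saw_goaway(messages):
    """True if any of the parsed messages is a GoAway."""
    return any(msg_type == GOAWAY_TYPE for msg_type, _ in messages)
```

Since the message can arrive at any time, a client would need to accept it in every connection state, in the same way the libpq change handles it before the asyncStatus check.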
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2025-10-24 05:04 ` Kirill Reshke <[email protected]>
2025-10-24 11:54 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
1 sibling, 1 reply; 15+ messages in thread
From: Kirill Reshke @ 2025-10-24 05:04 UTC (permalink / raw)
To: Jelte Fennema-Nio <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
On Thu, 23 Oct 2025 at 18:05, Jelte Fennema-Nio <[email protected]> wrote:
>
> This change introduces a new GoAway backend-to-frontend protocol
> message (byte 'g') that the server can send to the client to politely
> request that client to disconnect/reconnect when convenient. This message is
> advisory only - the connection remains fully functional and clients may
> continue executing queries and starting new transactions. "When
> convenient" is obviously not very well defined, but the primary target
> clients are clients that maintain a connection pool. Such clients should
> disconnect/reconnect a connection in the pool when there's no user of
> that connection. This is similar to how such clients often currently
> remove a connection from the pool after the connection hits a maximum
> lifetime of e.g. 1 hour.
>
> This new message is used by Postgres during the already existing "smart"
> shutdown procedure (i.e. when postmaster receives SIGTERM). When
> Postgres is in "smart" shutdown mode existing clients can continue to
> run queries as usual but new connection attempts are rejected. This mode
> is primarily useful when triggering a switchover of a read replica. A
> load balancer can route new connections only to the new read replica,
> while the old load balancer keeps serving the existing connections until
> they disconnect. The problem is that this draining of connections could
> often take a long time. Even when clients only run very short
> queries/transactions because the session can be kept open much longer
> (many connection pools use 1 hour max lifetime of a connection by default).
> With the introduction of the GoAway message Postgres now sends this
> message to all connected clients when it enters smart shutdown mode.
> If these clients respond to the message by reconnecting/disconnecting
> earlier than their maximum connection lifetime the draining can complete
> much quicker. Similar benefits to switchover duration can be achieved
> for other applications or proxies implementing the Postgres protocol,
> like when switching over a cluster of PgBouncer machines to a newer
> version.
>
> Applications/clients that use libpq can periodically check the result of
> the new PQgoAwayReceived() function to find out whether they have been
> asked to reconnect.
Hi!
I'm +1 on this idea. This is something I wanted back in 2020, when
implementing the 'online restart' feature for odyssey[0], but never
bothered to create a thread.
Due to the complexity of its async engine, odyssey cannot simply reuse TCP
connections from the 'old' binary, so we accept new connections in the new
binary and try to drop connections in the old binary at some rate.
About patches:
in 0001:
>+
>+ if (strcmp(value, "latest") == 0)
>+ {
>+ *result = PG_PROTOCOL_LATEST;
>+ return true;
>+ }
Not needed? We already have this check at the beginning of
pqParseProtocolVersion.
In 0002:
> + The <literal>GoAway</literal> message is sent by the server during a
> + smart shutdown to politely request that clients disconnect.
I'm not sure this wording is foolproof. First of all, should it be
'client', not 'clients'? It looks like this doc should describe the
interaction between a single client and a single server.
Maybe also change the end of the sentence to '... to instruct clients to
disconnect.'? Maybe that wording is not great either, but I want the doc
to reflect that disconnection is strongly advised, yet not obligatory.
> + Applications should check this flag
> + periodically and disconnect gracefully when possible, such as after
> + completing the current transaction or unit of work.
What flag? Also, 'Applications should': no, they don't have to, isn't it
just an option? Maybe we should change the wording to something like
'Applications can treat it as a recommendation to close (or maybe
re-open) their connection with the server as soon as they get at least
one 'GoAway' msg.'
Also, can the server send more than one 'GoAway' msg? If yes, should
we document this?
> - * notice. (An ERROR is very possibly the backend telling us why
> + * notice. (An ERROR is very possibly the backend telling us why
This change is unrelated
Other coding changes look straightforward and are fine to me.
[0] https://github.com/yandex/odyssey
--
Best regards,
Kirill Reshke
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2025-10-24 05:04 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Kirill Reshke <[email protected]>
@ 2025-10-24 11:54 ` Jelte Fennema-Nio <[email protected]>
2025-12-01 05:05 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Ajit Awekar <[email protected]>
0 siblings, 1 reply; 15+ messages in thread
From: Jelte Fennema-Nio @ 2025-10-24 11:54 UTC (permalink / raw)
To: Kirill Reshke <[email protected]>; Jelte Fennema-Nio <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
On Fri Oct 24, 2025 at 7:04 AM CEST, Kirill Reshke wrote:
> On Thu, 23 Oct 2025 at 18:05, Jelte Fennema-Nio <[email protected]> wrote:
> I'm +1 on this idea. This is something I wanted back in 2020, when
> implementing the 'online restart' feature for odyssey[0], but never
> bothered to create a thread.
Yeah, to be clear: A big goal of this is definitely to be used by
poolers/proxies/middleware. Those systems will often be more frequently
restarted than the actual database servers, so being able to do that
quickly without disrupting active connections is much more important
there than with plain PostgreSQL servers.
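
To make the pooler case concrete, here is a minimal sketch of how a client-side pool might honor GoAway. It is not taken from the patchset: FakeConn, Pool and their methods are hypothetical names, and the goaway attribute stands in for whatever a real driver would expose (for libpq, the result of PQgoAwayReceived()). The only point is that a flagged connection is retired when no user holds it, never while it is in use.

```python
from collections import deque


class FakeConn:
    """Hypothetical stand-in for a client connection (not a libpq binding)."""

    def __init__(self, name):
        self.name = name
        self.goaway = False  # would be set once the server sends GoAway
        self.closed = False

    def goaway_received(self):
        return self.goaway

    def close(self):
        self.closed = True


class Pool:
    """Hypothetical pool that retires connections flagged with GoAway."""

    def __init__(self, connect, size):
        self._connect = connect
        self._idle = deque(connect() for _ in range(size))

    def acquire(self):
        # Drop idle connections that were asked to go away, then hand out
        # a usable one, opening a fresh connection if none is left.
        while self._idle:
            conn = self._idle.popleft()
            if conn.goaway_received():
                conn.close()
                continue
            return conn
        return self._connect()

    def release(self, conn):
        # The connection is retired only now that no user holds it.
        if conn.goaway_received():
            conn.close()
        else:
            self._idle.append(conn)
```

A connection that receives GoAway while checked out keeps working until it is released; the replacement is opened lazily on the next acquire(), which in a switchover would land on the new server.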
> About patches:
Thanks for the review. Attached is a new patchset. I think I addressed
all of your comments (I almost fully rewrote the docs). I also fixed
two other issues that I found:
- updating docs for 3.3 in more places
- handling the GoAway message in more code paths on the client side
Attachments:
[text/x-patch] nocfbot.v2-0003-Add-pytest-based-tests-for-GoAway-message.patch (6.9K, 2-nocfbot.v2-0003-Add-pytest-based-tests-for-GoAway-message.patch)
From 5e037b9cf644e23fd9e9806a0b72690ddb867f75 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Thu, 23 Oct 2025 14:31:52 +0200
Subject: [PATCH v2 3/3] Add pytest based tests for GoAway message
I used this patchset as a trial for the new pytest suite that Jacob is
trying to introduce. Feel free to look at it, but I'd say don't review
this test in detail until we have the pytest changes merged or at least
in a more agreed upon state. This patch is built on top of that v3
patchset. This test is not applied by cfbot.
Testing this any other way is actually quite difficult with the
infrastructure we currently have (of course I could change that, but I'd
much rather spend that energy/time on making the pytest test suite a
thing):
- pg_regress and Perl tests don't work because we need to call a new
libpq function that is not exposed in psql (I guess I could expose it
with some \goawayreceived command, but it doesn't seem very useful).
- libpq_pipeline cannot test this because it would need to restart
the Postgres server and all it has
---
src/test/pytest/libpq.py | 18 ++++++++++++
src/test/pytest/meson.build | 1 +
src/test/pytest/pypg/fixtures.py | 33 +++++++++++++++++++++
src/test/pytest/pypg/server.py | 50 ++++++++++++++++++++++++++++++++
4 files changed, 102 insertions(+)
diff --git a/src/test/pytest/libpq.py b/src/test/pytest/libpq.py
index b851a117b66..5536b605c16 100644
--- a/src/test/pytest/libpq.py
+++ b/src/test/pytest/libpq.py
@@ -133,6 +133,12 @@ def load_libpq_handle(libdir):
lib.PQftype.restype = ctypes.c_uint
lib.PQftype.argtypes = [_PGresult_p, ctypes.c_int]
+ lib.PQgoAwayReceived.restype = ctypes.c_int
+ lib.PQgoAwayReceived.argtypes = [_PGconn_p]
+
+ lib.PQconsumeInput.restype = ctypes.c_int
+ lib.PQconsumeInput.argtypes = [_PGconn_p]
+
return lib
@@ -340,6 +346,18 @@ class PGconn(contextlib.AbstractContextManager):
error_msg = res.error_message() or f"Unexpected status: {status}"
raise LibpqError(f"Query failed: {error_msg}\nQuery: {query}")
+ def consume_input(self) -> bool:
+ """
+ Consumes any available input from the server. Returns True on success.
+ """
+ return bool(self._lib.PQconsumeInput(self._handle))
+
+ def goaway_received(self) -> bool:
+ """
+ Returns True if a GoAway message was received from the server.
+ """
+ return bool(self._lib.PQgoAwayReceived(self._handle))
+
def connstr(opts: Dict[str, Any]) -> str:
"""
diff --git a/src/test/pytest/meson.build b/src/test/pytest/meson.build
index f53193e8686..3c8518243d9 100644
--- a/src/test/pytest/meson.build
+++ b/src/test/pytest/meson.build
@@ -12,6 +12,7 @@ tests += {
'tests': [
'pyt/test_something.py',
'pyt/test_libpq.py',
+ 'pyt/test_goaway.py',
],
},
}
diff --git a/src/test/pytest/pypg/fixtures.py b/src/test/pytest/pypg/fixtures.py
index cf22c8ec436..ba46f048beb 100644
--- a/src/test/pytest/pypg/fixtures.py
+++ b/src/test/pytest/pypg/fixtures.py
@@ -30,6 +30,30 @@ def remaining_timeout():
return lambda: max(deadline - time.monotonic(), 0)
[email protected]
+def wait_until(remaining_timeout):
+ def wait_until(error_message="Did not complete in time", timeout=None, interval=1):
+ """
+ Loop until the timeout is reached. If the timeout is reached, raise an
+ exception with the given error message.
+ """
+ if timeout is None:
+ timeout = remaining_timeout()
+
+ end = time.time() + timeout
+ print_progress = timeout / 10 > 4
+ last_printed_progress = 0
+ while time.time() < end:
+ if print_progress and time.time() - last_printed_progress > 4:
+ last_printed_progress = time.time()
+ print(f"{error_message} - will retry")
+ yield
+ time.sleep(interval)
+ raise TimeoutError(error_message)
+
+ return wait_until
+
+
@pytest.fixture(scope="session")
def libpq_handle(libdir):
"""
@@ -149,6 +173,15 @@ def pg_server_module(pg_server_global):
yield s
[email protected](autouse=True, scope="function")
+def ensure_server_running(pg_server_global):
+ """
+ Autouse fixture that ensures the server is running before each test.
+ If a test shuts down the server, this will restart it for the next test.
+ """
+ pg_server_global.ensure_running()
+
+
@pytest.fixture
def pg(pg_server_module, remaining_timeout):
"""
diff --git a/src/test/pytest/pypg/server.py b/src/test/pytest/pypg/server.py
index d6675cde93d..f09651c089e 100644
--- a/src/test/pytest/pypg/server.py
+++ b/src/test/pytest/pypg/server.py
@@ -332,6 +332,56 @@ class PostgresServer:
# Server may have already been stopped
pass
+ def ensure_running(self):
+ """
+ Ensure that the PostgreSQL server is running and accepting connections.
+
+ If the server is not running, it will be restarted. This method waits
+ for any in-progress shutdown to complete before attempting to restart.
+ """
+ pid_file = os.path.join(self.datadir, "postmaster.pid")
+
+ # Wait for any in-progress shutdown to complete
+ socket_pattern = os.path.join(self.sockdir, f".s.PGSQL.{self.port}*")
+ for _ in range(100): # Wait up to 10 seconds
+ # Server is fully down when both PID file and sockets are gone
+ if not os.path.exists(pid_file) and len(glob.glob(socket_pattern)) == 0:
+ break
+ # Server is running if PID exists and we can connect
+ if os.path.exists(pid_file):
+ # Use pg_isready to check if server is accepting connections
+ try:
+ pg_isready = os.path.join(self._bindir, "pg_isready")
+ run(
+ pg_isready,
+ "-h",
+ self.sockdir,
+ "-p",
+ self.port,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL,
+ timeout=1,
+ )
+ # Server is up and ready
+ break
+ except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
+ # Server is not ready yet, keep waiting
+ pass
+ time.sleep(0.1)
+
+ # Now check if server needs to be started
+ if not os.path.exists(pid_file):
+ # Restart the server and wait for it to be ready
+ run(
+ self._pg_ctl,
+ "-D",
+ self.datadir,
+ "-l",
+ self._log,
+ "-w",
+ "start",
+ )
+
def cleanup(self):
"""Run all registered cleanup callbacks."""
self._cleanup_stack.close()
--
2.51.1
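
The wait_until fixture above is a generator that is meant to be driven by a for loop: the caller breaks out (or returns) once its condition holds, and the generator raises TimeoutError otherwise. The following self-contained sketch shows that pattern together with the consume_input()/goaway_received() pair. StubConn and wait_for_goaway are hypothetical, the stub only imitates the PGconn wrapper methods from libpq.py, and the actual contents of pyt/test_goaway.py are not shown in this patch, so none of this should be read as the real test.

```python
import time


def wait_until(error_message="Did not complete in time", timeout=5.0, interval=0.01):
    """Simplified variant of the fixture: yield until the caller stops iterating."""
    end = time.monotonic() + timeout
    while time.monotonic() < end:
        yield
        time.sleep(interval)
    raise TimeoutError(error_message)


class StubConn:
    """Hypothetical stand-in that 'receives' GoAway after a few polls."""

    def __init__(self, polls_before_goaway):
        self._remaining = polls_before_goaway
        self._goaway = False

    def consume_input(self):
        # Pretend the GoAway message shows up once the countdown hits zero.
        if self._remaining > 0:
            self._remaining -= 1
        else:
            self._goaway = True
        return True

    def goaway_received(self):
        return self._goaway


def wait_for_goaway(conn, timeout=5.0):
    """Poll the connection until GoAway is seen or the timeout expires."""
    for _ in wait_until("Did not receive GoAway message", timeout=timeout):
        if not conn.consume_input():
            raise ConnectionError("connection lost while waiting for GoAway")
        if conn.goaway_received():
            return True
```

Reading input before checking the flag mirrors the documented advice to call PQconsumeInput() before PQgoAwayReceived().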
[text/x-patch] v2-0001-Bump-protocol-version-to-3.3.patch (7.4K, 3-v2-0001-Bump-protocol-version-to-3.3.patch)
From 587df6b1a81833b14cf8c2d659b21cd54a3c55f6 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:30:48 +0200
Subject: [PATCH v2 1/3] Bump protocol version to 3.3
This commit increments the PostgreSQL frontend/backend protocol version
from 3.2 to 3.3 and does the required boilerplate changes. The next
commit will introduce an actual protocol change.
---
doc/src/sgml/libpq.sgml | 12 ++++++----
doc/src/sgml/protocol.sgml | 22 ++++++++++++-------
src/include/libpq/pqcomm.h | 2 +-
src/interfaces/libpq/fe-connect.c | 6 +++++
.../modules/libpq_pipeline/libpq_pipeline.c | 22 ++++++++++++++++---
5 files changed, 48 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5bf59a19855..0f43ee6b039 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2200,10 +2200,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
The current supported values are
- <literal>3.0</literal>, <literal>3.2</literal>,
+ <literal>3.0</literal>,
+ <literal>3.2</literal>,
+ <literal>3.3</literal>
and <literal>latest</literal>. The <literal>latest</literal> value is
equivalent to the latest protocol version supported by the libpq
- version being used, which is currently <literal>3.2</literal>.
+ version being used, which is currently <literal>3.3</literal>.
</para>
</listitem>
</varlistentry>
@@ -2226,10 +2228,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
The current supported values are
- <literal>3.0</literal>, <literal>3.2</literal>,
+ <literal>3.0</literal>,
+ <literal>3.2</literal>,
+ <literal>3.3</literal>
and <literal>latest</literal>. The <literal>latest</literal> value is
equivalent to the latest protocol version supported by the libpq
- version being used, which is currently <literal>3.2</literal>.
+ version being used, which is currently <literal>3.3</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9d755232873..7665bd7dcb8 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -18,8 +18,8 @@
</para>
<para>
- This document describes version 3.2 of the protocol, introduced in
- <productname>PostgreSQL</productname> version 18. The server and the libpq
+ This document describes version 3.3 of the protocol, introduced in
+ <productname>PostgreSQL</productname> version 19. The server and the libpq
client library are backwards compatible with protocol version 3.0,
implemented in <productname>PostgreSQL</productname> 7.4 and later.
</para>
@@ -192,7 +192,7 @@
<title>Protocol Versions</title>
<para>
- The current, latest version of the protocol is version 3.2. However, for
+ The current, latest version of the protocol is version 3.3. However, for
backwards compatibility with old server versions and middleware that don't
support the version negotiation yet, libpq still uses protocol version 3.0
by default.
@@ -206,7 +206,7 @@
this would occur if the client requested protocol version 4.0, which does
not exist as of this writing). If the minor version requested by the
client is not supported by the server (e.g., the client requests version
- 3.2, but the server supports only 3.0), the server may either reject the
+ 3.3, but the server supports only 3.0), the server may either reject the
connection or may respond with a NegotiateProtocolVersion message
containing the highest minor protocol version which it supports. The
client may then choose either to continue with the connection using the
@@ -238,10 +238,16 @@
</thead>
<tbody>
+ <row>
+ <entry>3.3</entry>
+ <entry>PostgreSQL 19 and later</entry>
+ <entry>Current latest version.
+ </entry>
+ </row>
<row>
<entry>3.2</entry>
<entry>PostgreSQL 18 and later</entry>
- <entry>Current latest version. The secret key used in query
+ <entry>The secret key used in query
cancellation was enlarged from 4 bytes to a variable length field. The
BackendKeyData message was changed to accommodate that, and the CancelRequest
message was redefined to have a variable length payload.
@@ -6076,9 +6082,9 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
<para>
The protocol version number. The most significant 16 bits are
the major version number. The least significant 16 bits are the minor
- version number. As an example protocol version 3.2 is represented as
- <literal>196610</literal> in decimal or more clearly as
- <literal>0x00030002</literal> in hexadecimal.
+ version number. As an example protocol version 3.3 is represented as
+ <literal>196611</literal> in decimal or more clearly as
+ <literal>0x00030003</literal> in hexadecimal.
</para>
</listitem>
</varlistentry>
diff --git a/src/include/libpq/pqcomm.h b/src/include/libpq/pqcomm.h
index f04ca135653..2423394b348 100644
--- a/src/include/libpq/pqcomm.h
+++ b/src/include/libpq/pqcomm.h
@@ -94,7 +94,7 @@ is_unixsock_path(const char *path)
*/
#define PG_PROTOCOL_EARLIEST PG_PROTOCOL(3,0)
-#define PG_PROTOCOL_LATEST PG_PROTOCOL(3,2)
+#define PG_PROTOCOL_LATEST PG_PROTOCOL(3,3)
typedef uint32 ProtocolVersion; /* FE/BE protocol version number */
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a3d12931fff..62d35e0c789 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -8301,6 +8301,12 @@ pqParseProtocolVersion(const char *value, ProtocolVersion *result, PGconn *conn,
return true;
}
+ if (strcmp(value, "3.3") == 0)
+ {
+ *result = PG_PROTOCOL(3, 3);
+ return true;
+ }
+
libpq_append_conn_error(conn, "invalid %s value: \"%s\"",
context, value);
return false;
diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index b3af70fa09b..0b113b271ea 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -1405,7 +1405,23 @@ test_protocol_version(PGconn *conn)
PQfinish(conn);
/*
- * Test max_protocol_version=latest. 'latest' currently means '3.2'.
+ * Test max_protocol_version=3.3
+ */
+ vals[max_protocol_version_index] = "3.3";
+ conn = PQconnectdbParams(keywords, vals, false);
+
+ if (PQstatus(conn) != CONNECTION_OK)
+ pg_fatal("Connection to database failed: %s",
+ PQerrorMessage(conn));
+
+ protocol_version = PQfullProtocolVersion(conn);
+ if (protocol_version != 30003)
+ pg_fatal("expected 30003, got %d", protocol_version);
+
+ PQfinish(conn);
+
+ /*
+ * Test max_protocol_version=latest. 'latest' currently means '3.3'.
*/
vals[max_protocol_version_index] = "latest";
conn = PQconnectdbParams(keywords, vals, false);
@@ -1415,8 +1431,8 @@ test_protocol_version(PGconn *conn)
PQerrorMessage(conn));
protocol_version = PQfullProtocolVersion(conn);
- if (protocol_version != 30002)
- pg_fatal("expected 30002, got %d", protocol_version);
+ if (protocol_version != 30003)
+ pg_fatal("expected 30003, got %d", protocol_version);
PQfinish(conn);
--
2.51.1
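
As a quick sanity check of the numbers in this patch, the sketch below recomputes the two representations of the protocol version. The helper names are made up for illustration. The first follows the startup-packet description in protocol.sgml (major in the high 16 bits, minor in the low 16 bits); the second assumes the major * 10000 + minor convention, which is consistent with the 30003 value the libpq_pipeline test expects from PQfullProtocolVersion().

```python
def wire_protocol_version(major, minor):
    """Startup-packet encoding: major in the high 16 bits, minor in the low 16."""
    return (major << 16) | minor


def full_protocol_version(major, minor):
    """Assumed PQfullProtocolVersion() convention: major * 10000 + minor."""
    return major * 10000 + minor


print(wire_protocol_version(3, 3))       # 196611
print(hex(wire_protocol_version(3, 3)))  # 0x30003
print(full_protocol_version(3, 3))       # 30003
```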
[text/x-patch] v2-0002-Add-GoAway-protocol-message-for-graceful-but-fast.patch (19.3K, 4-v2-0002-Add-GoAway-protocol-message-for-graceful-but-fast.patch)
From fd486d8a83112504114dbaeb4da40106af3361b9 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:36:30 +0200
Subject: [PATCH v2 2/3] Add GoAway protocol message for graceful but fast
server shutdown/switchover
This commit introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that client to disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
This new message is used by Postgres during the already existing "smart"
shutdown procedure (i.e. when postmaster receives SIGTERM). When
Postgres is in "smart" shutdown mode existing clients can continue to
run queries as usual but new connection attempts are rejected. This mode
is primarily useful when triggering a switchover of a read replica. A
load balancer can route new connections only to the new read replica,
while the old read replica keeps serving the existing connections until
they disconnect. The problem is that this draining of connections could
often take a long time, even when clients only run very short
queries/transactions, because the session can be kept open much longer
(many connection pools use 1 hour max lifetime of a connection by default).
With the introduction of the GoAway message Postgres now sends this
message to all connected clients when it enters smart shutdown mode.
If these clients respond to the message by reconnecting/disconnecting
earlier than their maximum connection lifetime the draining can complete
much quicker. Similar benefits to switchover duration can be achieved
for other applications or proxies implementing the Postgres protocol,
like when switching over a cluster of PgBouncer machines to a newer
version.
Applications/clients that use libpq can periodically check the result of
PQgoAwayReceived() at an inactive time to see whether they are asked to
reconnect.
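
For client authors who do not use libpq, the sketch below shows one way the message could be recognized on the wire. It assumes the format documented in this patch (type byte 'g' followed by an Int32 length of 4 and no payload) and the standard type-byte-plus-length framing of protocol 3.x. split_messages and ClientState are hypothetical names, not part of any existing driver. Repeated GoAway messages simply leave the flag set, matching the rule that later ones have no further effect.

```python
import struct

GOAWAY = b"g"  # message type byte introduced by this patch


def split_messages(buf):
    """Split a buffer into (type, payload) pairs plus the unparsed remainder."""
    messages = []
    pos = 0
    while len(buf) - pos >= 5:
        msg_type = buf[pos:pos + 1]
        # The Int32 length counts itself but not the type byte.
        (length,) = struct.unpack("!i", buf[pos + 1:pos + 5])
        if length < 4:
            raise ValueError(f"invalid message length {length}")
        end = pos + 1 + length
        if end > len(buf):
            break  # incomplete message, wait for more data
        messages.append((msg_type, buf[pos + 5:end]))
        pos = end
    return messages, buf[pos:]


class ClientState:
    """Tracks whether the server has asked this connection to go away."""

    def __init__(self):
        self.goaway_received = False

    def feed(self, buf):
        messages, rest = split_messages(buf)
        for msg_type, payload in messages:
            if msg_type == GOAWAY:
                if payload:
                    raise ValueError("GoAway must not carry a payload")
                self.goaway_received = True  # idempotent for repeats
            # All other message types are ignored in this sketch.
        return rest
```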
---
doc/src/sgml/libpq.sgml | 41 ++++++++++++++++++++
doc/src/sgml/protocol.sgml | 58 +++++++++++++++++++++++++++-
src/backend/postmaster/postmaster.c | 24 ++++++++++++
src/backend/storage/ipc/procsignal.c | 3 ++
src/backend/tcop/postgres.c | 36 +++++++++++++++++
src/include/libpq/protocol.h | 1 +
src/include/storage/procsignal.h | 1 +
src/include/tcop/tcopprot.h | 1 +
src/interfaces/libpq/exports.txt | 1 +
src/interfaces/libpq/fe-connect.c | 1 +
src/interfaces/libpq/fe-exec.c | 27 +++++++++++++
src/interfaces/libpq/fe-protocol3.c | 17 +++++++-
src/interfaces/libpq/libpq-fe.h | 5 +++
src/interfaces/libpq/libpq-int.h | 1 +
14 files changed, 214 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0f43ee6b039..563e25d9f27 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2996,6 +2996,47 @@ int PQserverVersion(const PGconn *conn);
</listitem>
</varlistentry>
+ <varlistentry id="libpq-PQgoAwayReceived">
+ <term><function>PQgoAwayReceived</function><indexterm><primary>PQgoAwayReceived</primary></indexterm></term>
+ <listitem>
+ <para>
+ Returns true if the server has sent a <literal>GoAway</literal> message,
+ requesting the client to disconnect when convenient.
+
+<synopsis>
+int PQgoAwayReceived(PGconn *conn);
+</synopsis>
+ </para>
+
+ <para>
+ The <literal>GoAway</literal> message is sent by the server during a
+ smart shutdown to politely request that clients disconnect. This is
+ advisory only - the connection remains fully functional and queries
+ can continue to be executed. Applications can choose to honor the request
+ by calling this function periodically and disconnecting gracefully when
+ possible, such as after completing the current transaction.
+ </para>
+
+ <para>
+ This message is only sent to clients using protocol version 3.3 or
+ later. The function returns 1 if the <literal>GoAway</literal> message
+ was received, 0 otherwise.
+ </para>
+
+ <para>
+ <function>PQgoAwayReceived</function> does not actually read data from the
+ server; it just reports whether a <literal>GoAway</literal> message was
+ previously absorbed by another <application>libpq</application> function.
+ So normally you would first
+ call <xref linkend="libpq-PQconsumeInput"/>, then check
+ <function>PQgoAwayReceived</function>. You can use
+ <function>select()</function> to wait for data to arrive from the
+ server, thereby using no <acronym>CPU</acronym> power unless there is
+ something to do. (See <xref linkend="libpq-PQsocket"/> to obtain the file
+ descriptor number to use with <function>select()</function>.)
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="libpq-PQerrorMessage">
<term>
<function>PQerrorMessage</function><indexterm><primary>PQerrorMessage</primary></indexterm>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 7665bd7dcb8..484e2970e98 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -242,6 +242,7 @@
<entry>3.3</entry>
<entry>PostgreSQL 19 and later</entry>
<entry>Current latest version.
+ The GoAway message was introduced.
</entry>
</row>
<row>
@@ -596,6 +597,17 @@
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>GoAway</term>
+ <listitem>
+ <para>
+ The server politely requests the client to close the connection at its
+ earliest convenient moment. See <xref linkend="protocol-async"/> for
+ more details.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -1344,7 +1356,7 @@ SELCT 1/0;<!-- this typo is intentional -->
</para>
<para>
- It is possible for NoticeResponse and ParameterStatus messages to be
+ It is possible for NoticeResponse, ParameterStatus and GoAway messages to be
interspersed between CopyData messages; frontends must handle these cases,
and should be prepared for other asynchronous message types as well (see
<xref linkend="protocol-async"/>). Otherwise, any message type other than
@@ -1455,6 +1467,25 @@ SELCT 1/0;<!-- this typo is intentional -->
parameters that it does not understand or care about.
</para>
+ <para>
+ A GoAway message is sent by the server to politely request the client to
+ disconnect when convenient and reconnect when the client needs a connection
+ again. This is advisory only - the connection remains fully functional and
+ queries can continue to be executed. "When convenient" is very vaguely
+ defined on purpose because it depends on the client and application whether
+ such a moment even exists. So clients are allowed to completely ignore this
+ message and disconnect whenever they otherwise would have. An important
+ type of client that can actually honor the request to disconnect early is a
+ client that maintains a connection pool. Such a client can honor the
+ request by disconnecting a connection that has received a GoAway message
+ when it's not in use by a user of the pool. It is allowed for a server to
+ send multiple GoAway messages on the same connection, but any subsequent
+ GoAway messages after the first GoAway have no effect on the client's
+ behavior. The GoAway message is currently sent by Postgres during the
+ "smart" shutdown procedure (i.e. when postmaster receives
+ <systemitem>SIGTERM</systemitem>).
+ </para>
+
<para>
If a frontend issues a <command>LISTEN</command> command, then the
backend will send a NotificationResponse message (not to be
@@ -5246,6 +5277,31 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>
+ <varlistentry id="protocol-message-formats-GoAway">
+ <term>GoAway (B)</term>
+ <listitem>
+ <variablelist>
+ <varlistentry>
+ <term>Byte1('g')</term>
+ <listitem>
+ <para>
+ Identifies the message as a polite request for the client to disconnect.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int32(4)</term>
+ <listitem>
+ <para>
+ Length of message contents in bytes, including self.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="protocol-message-formats-GSSENCRequest">
<term>GSSENCRequest (F)</term>
<listitem>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..79f116a5a5e 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2126,7 +2126,31 @@ process_pm_shutdown_request(void)
* later state, do not change it.
*/
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
+ {
+ dlist_iter iter;
+
connsAllowed = false;
+
+ /*
+ * Signal all backends to send a GoAway message to their
+ * clients, to politely request that they disconnect.
+ */
+ dlist_foreach(iter, &ActiveChildList)
+ {
+ PMChild *bp = dlist_container(PMChild, elem, iter.cur);
+
+ /*
+ * Only signal regular backends and walsenders. Skip
+ * auxiliary processes and dead-end backends.
+ */
+ if (bp->bkend_type == B_BACKEND ||
+ bp->bkend_type == B_WAL_SENDER)
+ {
+ SendProcSignal(bp->pid, PROCSIG_SMART_SHUTDOWN,
+ INVALID_PROC_NUMBER);
+ }
+ }
+ }
else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
{
/* There should be no clients, so proceed to stop children */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6011c30d520 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -691,6 +691,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
+ if (CheckProcSignal(PROCSIG_SMART_SHUTDOWN))
+ HandleSmartShutdownInterrupt();
+
if (CheckProcSignal(PROCSIG_PARALLEL_APPLY_MESSAGE))
HandleParallelApplyMessageInterrupt();
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7dd75a490aa..7a59e65f2d3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -42,8 +42,10 @@
#include "common/pg_prng.h"
#include "jit/jit.h"
#include "libpq/libpq.h"
+#include "libpq/pqcomm.h"
#include "libpq/pqformat.h"
#include "libpq/pqsignal.h"
+#include "libpq/protocol.h"
#include "mb/pg_wchar.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
@@ -91,6 +93,14 @@ const char *debug_query_string; /* client-supplied query string */
/* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
CommandDest whereToSendOutput = DestDebug;
+/*
+ * Track whether we've been notified of smart shutdown and sent GoAway.
+ * SmartShutdownPending is set by the PROCSIG_SMART_SHUTDOWN signal handler.
+ * GoAwaySent tracks whether we've already sent the GoAway message.
+ */
+static volatile sig_atomic_t SmartShutdownPending = false;
+static bool GoAwaySent = false;
+
/* flag for logging end of session */
bool Log_disconnections = false;
@@ -508,6 +518,20 @@ ProcessClientReadInterrupt(bool blocked)
/* Check for general interrupts that arrived before/while reading */
CHECK_FOR_INTERRUPTS();
+ /* Send GoAway message if smart shutdown is pending */
+ if (SmartShutdownPending && !GoAwaySent &&
+ whereToSendOutput == DestRemote &&
+ MyProcPort && MyProcPort->proto >= PG_PROTOCOL(3, 3))
+ {
+ StringInfoData buf;
+
+ pq_beginmessage(&buf, PqMsg_GoAway);
+ pq_endmessage(&buf);
+ pq_flush();
+
+ GoAwaySent = true;
+ }
+
/* Process sinval catchup interrupts, if any */
if (catchupInterruptPending)
ProcessCatchupInterrupt();
@@ -3087,6 +3111,18 @@ FloatExceptionHandler(SIGNAL_ARGS)
"invalid operation, such as division by zero.")));
}
+/*
+ * Tell the next CHECK_FOR_INTERRUPTS() or main loop iteration to send a
+ * GoAway message to the client. Runs in a SIGUSR1 handler.
+ */
+void
+HandleSmartShutdownInterrupt(void)
+{
+ SmartShutdownPending = true;
+ InterruptPending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Tell the next CHECK_FOR_INTERRUPTS() to check for a particular type of
* recovery conflict. Runs in a SIGUSR1 handler.
diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index 7bf90053bcb..24fbc9f2613 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -53,6 +53,7 @@
#define PqMsg_FunctionCallResponse 'V'
#define PqMsg_CopyBothResponse 'W'
#define PqMsg_ReadyForQuery 'Z'
+#define PqMsg_GoAway 'g'
#define PqMsg_NoData 'n'
#define PqMsg_PortalSuspended 's'
#define PqMsg_ParameterDescription 't'
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..b629341a4af 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -36,6 +36,7 @@ typedef enum
PROCSIG_BARRIER, /* global barrier interrupt */
PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts */
PROCSIG_PARALLEL_APPLY_MESSAGE, /* Message from parallel apply workers */
+ PROCSIG_SMART_SHUTDOWN, /* notify backend of smart shutdown */
/* Recovery conflict reasons */
PROCSIG_RECOVERY_CONFLICT_FIRST,
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index c1bcfdec673..b7fd22c43bb 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -74,6 +74,7 @@ extern void die(SIGNAL_ARGS);
pg_noreturn extern void quickdie(SIGNAL_ARGS);
extern void StatementCancelHandler(SIGNAL_ARGS);
pg_noreturn extern void FloatExceptionHandler(SIGNAL_ARGS);
+extern void HandleSmartShutdownInterrupt(void);
extern void HandleRecoveryConflictInterrupt(ProcSignalReason reason);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index dbbae642d76..3385e65c389 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -210,3 +210,4 @@ PQgetAuthDataHook 207
PQdefaultAuthDataHook 208
PQfullProtocolVersion 209
appendPQExpBufferVA 210
+PQgoAwayReceived 211
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 62d35e0c789..f9ee52f0c0a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -698,6 +698,7 @@ pqDropServerData(PGconn *conn)
conn->password_needed = false;
conn->gssapi_used = false;
conn->write_failed = false;
+ conn->goaway_received = false;
free(conn->write_err_msg);
conn->write_err_msg = NULL;
conn->oauth_want_retry = false;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 0b1e37ec30b..16bf10280e5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2696,6 +2696,33 @@ PQnotifies(PGconn *conn)
return event;
}
+/*
+ * PQgoAwayReceived
+ * returns 1 if a GoAway message has been received from the server
+ * returns 0 if not
+ *
+ * Note that this function does not read any new data from the socket;
+ * caller should call PQconsumeInput() first if they want to ensure
+ * all available data has been read.
+ */
+int
+PQgoAwayReceived(PGconn *conn)
+{
+ if (!conn)
+ return 0;
+
+ if (conn->goaway_received)
+ return 1;
+
+ /*
+ * Parse any available data to see if a GoAway message has arrived.
+ */
+ pqParseInput3(conn);
+
+ return conn->goaway_received ? 1 : 0;
+}
+
+
/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index da7a8db68c8..f60fe188806 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -132,8 +132,8 @@ pqParseInput3(PGconn *conn)
}
/*
- * NOTIFY and NOTICE messages can happen in any state; always process
- * them right away.
+ * NOTIFY, NOTICE and GoAway messages can happen in any state;
+ * always process them right away.
*
* Most other messages should only be processed while in BUSY state.
* (In particular, in READY state we hold off further parsing until
@@ -157,6 +157,10 @@ pqParseInput3(PGconn *conn)
if (pqGetErrorNotice3(conn, false))
return;
}
+ else if (id == PqMsg_GoAway)
+ {
+ conn->goaway_received = true;
+ }
else if (conn->asyncStatus != PGASYNC_BUSY)
{
/* If not IDLE state, just wait ... */
@@ -303,6 +307,9 @@ pqParseInput3(PGconn *conn)
if (getParameterStatus(conn))
return;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
case PqMsg_BackendKeyData:
/*
@@ -1841,6 +1848,9 @@ getCopyDataMessage(PGconn *conn)
if (getParameterStatus(conn))
return 0;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
case PqMsg_CopyData:
return msgLength;
case PqMsg_CopyDone:
@@ -2334,6 +2344,9 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
if (getParameterStatus(conn))
continue;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
default:
/* The backend violates the protocol. */
libpq_append_conn_error(conn, "protocol error: id=0x%x", id);
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 0852584edae..1f3bd671a4b 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -63,6 +63,10 @@ extern "C"
/* Indicates presence of the PQAUTHDATA_PROMPT_OAUTH_DEVICE authdata hook */
#define LIBPQ_HAS_PROMPT_OAUTH_DEVICE 1
+/* Features added in PostgreSQL v19: */
+/* Indicates presence of PQgoAwayReceived */
+#define LIBPQ_HAS_GOAWAY 1
+
/*
* Option flags for PQcopyResult
*/
@@ -411,6 +415,7 @@ extern const char *PQparameterStatus(const PGconn *conn,
extern int PQprotocolVersion(const PGconn *conn);
extern int PQfullProtocolVersion(const PGconn *conn);
extern int PQserverVersion(const PGconn *conn);
+extern int PQgoAwayReceived(PGconn *conn);
extern char *PQerrorMessage(const PGconn *conn);
extern int PQsocket(const PGconn *conn);
extern int PQbackendPID(const PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 02c114f1405..7dc6e858a16 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -511,6 +511,7 @@ struct pg_conn
bool sigpipe_flag; /* can we mask SIGPIPE via MSG_NOSIGNAL? */
bool write_failed; /* have we had a write failure on sock? */
char *write_err_msg; /* write error message, or NULL if OOM */
+ bool goaway_received; /* true if server sent GoAway message */
bool auth_required; /* require an authentication challenge from
* the server? */
--
2.51.1
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2025-10-24 05:04 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Kirill Reshke <[email protected]>
2025-10-24 11:54 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2025-12-01 05:05 ` Ajit Awekar <[email protected]>
0 siblings, 0 replies; 15+ messages in thread
From: Ajit Awekar @ 2025-12-01 05:05 UTC (permalink / raw)
To: Jelte Fennema-Nio <[email protected]>; +Cc: Kirill Reshke <[email protected]>; Jelte Fennema-Nio <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
Hi Jelte,
Thank you for proposing the GoAway protocol message.
I've developed a patch that serves as a strong, immediate use case for its
inclusion.
https://www.postgresql.org/message-id/flat/CAER375OvH3_ONmc-SgUFpA6gv_d6eNj2KdZktzo-f_uqNwwWNw%40mai...
Thanks & Best Regards,
Ajit
On Fri, 24 Oct 2025 at 17:24, Jelte Fennema-Nio <[email protected]> wrote:
> On Fri Oct 24, 2025 at 7:04 AM CEST, Kirill Reshke wrote:
> > On Thu, 23 Oct 2025 at 18:05, Jelte Fennema-Nio <[email protected]> wrote:
> > I'm +1 on this idea. This is something I wanted back in 2020, when
> > implementing the 'online restart' feature for odyssey[0], but never
> > bothered to create a thread.
>
> Yeah, to be clear: A big goal of this is definitely to be used by
> poolers/proxies/middleware. Those systems will often be more frequently
> restarted than the actual database servers, so being able to do that
> quickly without disrupting active connections is much more important
> there than with plain PostgreSQL servers.
>
> > About patches:
>
> Thanks for the review. Attached is a new patchset. I think I addressed
> all of your comments (I almost fully rewrote the docs). I also fixed
> two other issues that I found:
> - updating docs for 3.3 in more places
> - handling the GoAway message in more code paths on the client side
>
>
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-02-10 21:59 ` Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
1 sibling, 1 reply; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-02-10 21:59 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
On Sun Feb 8, 2026 at 11:03 PM CET, Jelte Fennema-Nio wrote:
> Attached is v5
Attached is v6 which fixes a rebase conflict.
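For anyone implementing this in a client that does not use libpq, here is
a rough sketch of recognizing the message on the wire. It assumes the
framing described in the protocol.sgml hunk of the attached v6-0001 patch
(type byte 'g', then an Int32 length of 4 and no payload). The names
split_messages and goaway_seen are made up for illustration and are not
part of the patch.

```python
import struct

GOAWAY_TYPE = b"g"  # message type byte used by the patch (PqMsg_GoAway)


def split_messages(buf: bytes):
    """Split a buffer of backend messages into (type, payload) pairs.

    Returns the parsed messages plus any trailing bytes that do not yet
    form a complete message.
    """
    messages = []
    pos = 0
    while len(buf) - pos >= 5:
        msg_type = buf[pos:pos + 1]
        # The length field counts itself (4 bytes) but not the type byte.
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        if length < 4:
            raise ValueError("invalid message length")
        end = pos + 1 + length
        if end > len(buf):
            break  # incomplete message, wait for more data
        messages.append((msg_type, buf[pos + 5:end]))
        pos = end
    return messages, buf[pos:]


def goaway_seen(messages) -> bool:
    """Return True if any parsed message is a GoAway."""
    return any(msg_type == GOAWAY_TYPE for msg_type, _ in messages)


# GoAway has no payload: type byte 'g' followed by Int32 length 4.
goaway = b"g" + struct.pack("!I", 4)
# ReadyForQuery ('Z') with transaction status 'I' (idle): length 5.
ready = b"Z" + struct.pack("!I", 5) + b"I"

# The trailing two bytes are an incomplete message and are handed back.
msgs, rest = split_messages(ready + goaway + b"Z\x00")
```

Since GoAway can arrive in any state, a client along these lines would
only record the flag while parsing and act on it later, once the
connection is idle.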
Attachments:
[text/x-patch] nocfbot.v6-0002-Add-pytest-based-tests-for-GoAway-message.patch (5.9K, 2-nocfbot.v6-0002-Add-pytest-based-tests-for-GoAway-message.patch)
From 7ebbf64b84bb2e2c1c6016bd8022ba38a0c40c1a Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Thu, 23 Oct 2025 14:31:52 +0200
Subject: [PATCH v6 2/2] Add pytest based tests for GoAway message
I used this patchset as a trial for the new pytest suite that Jacob is
trying to introduce. Feel free to look at it, but I'd say don't review
this test in detail until we have the pytest changes merged or at least
in a more agreed upon state. This patch is built on top of that v3
patchset. This test is not applied by cfbot.
Testing this any other way is actually quite difficult with the
infrastructure we currently have (of course I could change that, but I'd
much rather spend that energy/time on making the pytest test suite a
thing):
- pg_regress and Perl tests don't work because we need to call a new
libpq function that is not exposed in psql (I guess I could expose it
with some \goawayreceived command, but it doesn't seem very useful).
- libpq_pipeline cannot test this because it would need to restart
the Postgres server and all it has
---
src/interfaces/libpq/meson.build | 1 +
src/interfaces/libpq/pyt/test_goaway.py | 47 +++++++++++++++++++++++++
src/test/pytest/libpq/_core.py | 18 ++++++++++
src/test/pytest/pypg/fixtures.py | 24 +++++++++++++
4 files changed, 90 insertions(+)
create mode 100644 src/interfaces/libpq/pyt/test_goaway.py
diff --git a/src/interfaces/libpq/meson.build b/src/interfaces/libpq/meson.build
index 56790dd92a9..983af1d5bea 100644
--- a/src/interfaces/libpq/meson.build
+++ b/src/interfaces/libpq/meson.build
@@ -163,6 +163,7 @@ tests += {
'pytest': {
'tests': [
'pyt/test_load_balance.py',
+ 'pyt/test_goaway.py',
],
},
}
diff --git a/src/interfaces/libpq/pyt/test_goaway.py b/src/interfaces/libpq/pyt/test_goaway.py
new file mode 100644
index 00000000000..a03b533f3f4
--- /dev/null
+++ b/src/interfaces/libpq/pyt/test_goaway.py
@@ -0,0 +1,47 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+"""
+Tests for the GoAway protocol message during smart shutdown.
+
+The GoAway message is sent by the server during smart shutdown to politely
+request that clients disconnect when convenient. The connection remains
+functional after receiving the message.
+"""
+
+
+def test_goaway_smart_shutdown(pg, wait_until):
+ """
+ Test that GoAway message is sent during smart shutdown.
+
+ This test:
+ 1. Connects to a running PostgreSQL server via Unix socket
+ 2. Verifies GoAway is not received initially
+ 3. Initiates a smart shutdown
+ 4. Verifies that GoAway is received
+ 5. Verifies that queries still work after GoAway
+ """
+
+ # Connect to the server via Unix socket, libpq will request the
+ # _pq_.goaway protocol extension
+ conn = pg.connect(max_protocol_version="latest")
+
+ # Initially, GoAway should not be received
+ assert not conn.goaway_received(), "GoAway should not be received initially"
+
+ # Execute a simple query to ensure connection is working
+ conn.sql("SELECT 1")
+
+ pg.pg_ctl("stop", "--mode", "smart", "--no-wait")
+
+ for _ in wait_until("Did not receive GoAway after smart shutdown"):
+ # Consume any data the backend may have sent (like GoAway)
+ assert conn.consume_input()
+ if conn.goaway_received():
+ break
+
+ # Connection should still be functional after receiving GoAway
+ conn.sql("SELECT 2")
+ conn.sql("SELECT 3")
+
+ # Verify GoAway is still flagged
+ assert conn.goaway_received(), "GoAway flag should remain set"
diff --git a/src/test/pytest/libpq/_core.py b/src/test/pytest/libpq/_core.py
index 1c059b9b446..c137688a1aa 100644
--- a/src/test/pytest/libpq/_core.py
+++ b/src/test/pytest/libpq/_core.py
@@ -147,6 +147,12 @@ def load_libpq_handle(libdir, bindir):
lib.PQresultErrorField.restype = ctypes.c_char_p
lib.PQresultErrorField.argtypes = [_PGresult_p, ctypes.c_int]
+ lib.PQgoAwayReceived.restype = ctypes.c_int
+ lib.PQgoAwayReceived.argtypes = [_PGconn_p]
+
+ lib.PQconsumeInput.restype = ctypes.c_int
+ lib.PQconsumeInput.argtypes = [_PGconn_p]
+
return lib
@@ -419,6 +425,18 @@ class PGconn(contextlib.AbstractContextManager):
else:
res.raise_error()
+ def consume_input(self) -> bool:
+ """
+ Consumes any available input from the server. Returns True on success.
+ """
+ return bool(self._lib.PQconsumeInput(self._handle))
+
+ def goaway_received(self) -> bool:
+ """
+ Returns True if a GoAway message was received from the server.
+ """
+ return bool(self._lib.PQgoAwayReceived(self._handle))
+
def connstr(opts: Dict[str, Any]) -> str:
"""
diff --git a/src/test/pytest/pypg/fixtures.py b/src/test/pytest/pypg/fixtures.py
index 8c0cb60daa5..4aa6c73349d 100644
--- a/src/test/pytest/pypg/fixtures.py
+++ b/src/test/pytest/pypg/fixtures.py
@@ -57,6 +57,30 @@ def remaining_timeout_module():
return lambda: max(deadline - time.monotonic(), 0)
[email protected]
+def wait_until(remaining_timeout):
+ def wait_until(error_message="Did not complete in time", timeout=None, interval=1):
+ """
+ Loop until the timeout is reached. If the timeout is reached, raise an
+ exception with the given error message.
+ """
+ if timeout is None:
+ timeout = remaining_timeout()
+
+ end = time.time() + timeout
+ print_progress = timeout / 10 > 4
+ last_printed_progress = 0
+ while time.time() < end:
+ if print_progress and time.time() - last_printed_progress > 4:
+ last_printed_progress = time.time()
+ print(f"{error_message} - will retry")
+ yield
+ time.sleep(interval)
+ raise TimeoutError(error_message)
+
+ return wait_until
+
+
@pytest.fixture(scope="session")
def libpq_handle(libdir, bindir):
"""
--
2.52.0
[text/x-patch] v6-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch (23.9K, 3-v6-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch)
From 4ef8432550ff9feac3b3d32a0affc2d942c55c25 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:30:48 +0200
Subject: [PATCH v6 1/2] Add GoAway protocol message for graceful but fast
server shutdown/switchover
This commit introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that the client disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
This new message is used by Postgres during the already existing "smart"
shutdown procedure (i.e. when postmaster receives SIGTERM). When
Postgres is in "smart" shutdown mode existing clients can continue to
run queries as usual but new connection attempts are rejected. This mode
is primarily useful when triggering a switchover of a read replica. A
load balancer can route new connections only to the new read replica,
while the old read replica keeps serving the existing connections until
they disconnect. The problem is that this draining of connections can
often take a long time, even when clients only run very short
queries/transactions, because the session can be kept open much longer
(many connection pools use a 1 hour max lifetime of a connection by default).
With the introduction of the GoAway message Postgres now sends this
message to all connected clients when it enters smart shutdown mode.
If these clients respond to the message by reconnecting/disconnecting
earlier than their maximum connection lifetime the draining can complete
much quicker. Similar benefits to switchover duration can be achieved
for other applications or proxies implementing the Postgres protocol,
such as when switching over a cluster of PgBouncer machines to a newer
version.
Applications/clients that use libpq can periodically check the result of
PQgoAwayReceived() while a connection is idle to see whether they are asked to
reconnect.
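As a rough illustration of the pool behaviour described above, here is a
minimal sketch. It is not taken from any real pooler or driver: FakeConn
and Pool are hypothetical stand-ins, and goaway_received() plays the role
that PQgoAwayReceived() would play in a libpq-based client (which would
normally call PQconsumeInput() first).

```python
class FakeConn:
    """Stand-in for a client connection; not a real driver class."""

    def __init__(self, name):
        self.name = name
        self.goaway = False  # what PQgoAwayReceived() would report
        self.closed = False

    def goaway_received(self):
        return self.goaway

    def close(self):
        self.closed = True


class Pool:
    """Tiny pool that replaces idle connections flagged with GoAway."""

    def __init__(self, connect, size):
        self._connect = connect
        self._idle = [connect() for _ in range(size)]

    def _refresh(self, conn):
        # The connection has no user right now, so this is the
        # "when convenient" moment: drop it and open a fresh one.
        if conn.goaway_received():
            conn.close()
            conn = self._connect()
        return conn

    def acquire(self):
        # Sketch only: raises IndexError if no idle connection is left.
        return self._refresh(self._idle.pop())

    def release(self, conn):
        self._idle.append(self._refresh(conn))


counter = iter(range(100))
pool = Pool(lambda: FakeConn(f"conn{next(counter)}"), size=1)

conn = pool.acquire()    # hands out conn0
conn.goaway = True       # server entered smart shutdown while in use
pool.release(conn)       # conn0 is closed, conn1 takes its place
fresh = pool.acquire()
```

The flag is only checked at acquire and release time, so a connection
that is in the middle of a transaction is never interrupted.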
---
doc/src/sgml/libpq.sgml | 52 +++++++++++++++++++++
doc/src/sgml/protocol.sgml | 70 ++++++++++++++++++++++++++--
src/backend/postmaster/postmaster.c | 23 +++++++++
src/backend/storage/ipc/procsignal.c | 3 ++
src/backend/tcop/backend_startup.c | 21 +++++++--
src/backend/tcop/postgres.c | 36 ++++++++++++++
src/include/libpq/libpq-be.h | 5 ++
src/include/libpq/protocol.h | 1 +
src/include/storage/procsignal.h | 3 +-
src/include/tcop/tcopprot.h | 1 +
src/interfaces/libpq/exports.txt | 1 +
src/interfaces/libpq/fe-connect.c | 1 +
src/interfaces/libpq/fe-exec.c | 27 +++++++++++
src/interfaces/libpq/fe-protocol3.c | 36 ++++++++++++--
src/interfaces/libpq/libpq-fe.h | 5 ++
src/interfaces/libpq/libpq-int.h | 1 +
16 files changed, 274 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 21e1ba34a4e..9a63847cfeb 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2231,6 +2231,14 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
equivalent to the latest protocol version supported by the libpq
version being used, which is currently <literal>3.2</literal>.
</para>
+
+ <para>
+ When using protocol version <literal>3.2</literal> or higher,
+ <application>libpq</application> requests all supported
+ <link linkend="protocol-extensions">protocol extensions</link>
+ from the server, such as <literal>goaway</literal>
+ which enables <xref linkend="libpq-PQgoAwayReceived"/>.
+ </para>
</listitem>
</varlistentry>
@@ -2987,6 +2995,50 @@ int PQserverVersion(const PGconn *conn);
</listitem>
</varlistentry>
+ <varlistentry id="libpq-PQgoAwayReceived">
+ <term><function>PQgoAwayReceived</function><indexterm><primary>PQgoAwayReceived</primary></indexterm></term>
+ <listitem>
+ <para>
+ Returns true if the server has sent a <literal>GoAway</literal> message,
+ requesting the client to disconnect when convenient.
+
+<synopsis>
+int PQgoAwayReceived(PGconn *conn);
+</synopsis>
+ </para>
+
+ <para>
+ The <literal>GoAway</literal> message is sent by the server during a
+ smart shutdown to politely request that clients disconnect. This is
+ advisory only - the connection remains fully functional and queries
+ can continue to be executed. Applications can choose to honor the request
+ by calling this function periodically and disconnecting gracefully when
+ possible, such as after completing the current transaction.
+ </para>
+
+ <para>
+ This message is only sent to clients that request the
+ <literal>goaway</literal> protocol extension in the startup message,
+ which <application>libpq</application> does when using protocol version
+ 3.2 or higher (see <xref linkend="libpq-connect-max-protocol-version"/>).
+ The function returns 1 if the <literal>GoAway</literal> message was
+ received, 0 otherwise.
+ </para>
+
+ <para>
+ <function>PQgoAwayReceived</function> does not actually read data from the
+ server; it just reports whether a <literal>GoAway</literal> message was previously absorbed by another
+ <application>libpq</application> function. So normally you would first
+ call <xref linkend="libpq-PQconsumeInput"/>, then check
+ <function>PQgoAwayReceived</function>. You can use
+ <function>select()</function> to wait for data to arrive from the
+ server, thereby using no <acronym>CPU</acronym> power unless there is
+ something to do. (See <xref linkend="libpq-PQsocket"/> to obtain the file
+ descriptor number to use with <function>select()</function>.)
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="libpq-PQerrorMessage">
<term>
<function>PQerrorMessage</function><indexterm><primary>PQerrorMessage</primary></indexterm>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 89ac680efd5..e430a0614fc 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -335,8 +335,14 @@
<tbody>
<row>
- <entry namest="last" align="center" valign="middle">
- <emphasis>(No supported protocol extensions are currently defined.)</emphasis>
+ <entry><literal>goaway</literal></entry>
+ <entry><literal>1</literal></entry>
+ <entry>PostgreSQL 19</entry>
+ <entry>
+ Enables the server to send
+ <link linkend="protocol-message-formats-GoAway">GoAway</link>
+ messages to request the client to disconnect when convenient.
+ See <xref linkend="protocol-async"/> for more details.
</entry>
</row>
</tbody>
@@ -699,6 +705,17 @@
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>GoAway</term>
+ <listitem>
+ <para>
+ The server politely requests that the client close the connection at its
+ earliest convenient moment. See <xref linkend="protocol-async"/> for
+ more details.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -1447,7 +1464,7 @@ SELCT 1/0;<!-- this typo is intentional -->
</para>
<para>
- It is possible for NoticeResponse and ParameterStatus messages to be
+ It is possible for NoticeResponse, ParameterStatus and GoAway messages to be
interspersed between CopyData messages; frontends must handle these cases,
and should be prepared for other asynchronous message types as well (see
<xref linkend="protocol-async"/>). Otherwise, any message type other than
@@ -1558,6 +1575,28 @@ SELCT 1/0;<!-- this typo is intentional -->
parameters that it does not understand or care about.
</para>
+ <para>
+ A GoAway message is sent by the server to politely request the client to
+ disconnect when convenient and reconnect when the client needs a connection
+ again. This is advisory only - the connection remains fully functional and
+ queries can continue to be executed. "When convenient" is very vaguely
+ defined on purpose because it depends on the client and application whether
+ such a moment even exists. So clients are allowed to completely ignore this
+ message and disconnect whenever they otherwise would have. An important
+ type of client that can actually honor the request to disconnect early is a
+ client that maintains a connection pool. Such a client can honor the
+ request by disconnecting a connection that has received a GoAway message
+ when it's not in use by a user of the pool. It is allowed for a server to
+ send multiple GoAway messages on the same connection, but any subsequent
+ GoAway messages after the first GoAway have no effect on the client's
+ behavior. The GoAway message is currently sent by Postgres during the
+ "smart" shutdown procedure (i.e. when postmaster receives
+ <systemitem>SIGTERM</systemitem>). The server only sends the GoAway
+ message to clients that request the <literal>goaway</literal> protocol
+ extension by setting <literal>_pq_.goaway</literal> to <literal>1</literal>
+ in the StartupMessage.
+ </para>
+
<para>
If a frontend issues a <command>LISTEN</command> command, then the
backend will send a NotificationResponse message (not to be
@@ -5348,6 +5387,31 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>
+ <varlistentry id="protocol-message-formats-GoAway">
+ <term>GoAway (B)</term>
+ <listitem>
+ <variablelist>
+ <varlistentry>
+ <term>Byte1('g')</term>
+ <listitem>
+ <para>
+ Identifies the message as a polite request for the client to disconnect.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int32(4)</term>
+ <listitem>
+ <para>
+ Length of message contents in bytes, including self.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="protocol-message-formats-GSSENCRequest">
<term>GSSENCRequest (F)</term>
<listitem>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d6133bfebc6..7d2bc5910c6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2134,7 +2134,30 @@ process_pm_shutdown_request(void)
* later state, do not change it.
*/
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
+ {
+ dlist_iter iter;
+
connsAllowed = false;
+
+ /*
+ * Signal all backends to send a GoAway message to their
+ * clients, to politely request that they disconnect.
+ */
+ dlist_foreach(iter, &ActiveChildList)
+ {
+ PMChild *bp = dlist_container(PMChild, elem, iter.cur);
+
+ /*
+ * Only signal regular backends, since those need to notify
+ * their clients using a GoAway message.
+ */
+ if (bp->bkend_type == B_BACKEND)
+ {
+ SendProcSignal(bp->pid, PROCSIG_SMART_SHUTDOWN,
+ INVALID_PROC_NUMBER);
+ }
+ }
+ }
else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
{
/* There should be no clients, so proceed to stop children */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 5d33559926a..512a705dac6 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -694,6 +694,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
+ if (CheckProcSignal(PROCSIG_SMART_SHUTDOWN))
+ HandleSmartShutdownInterrupt();
+
if (CheckProcSignal(PROCSIG_PARALLEL_APPLY_MESSAGE))
HandleParallelApplyMessageInterrupt();
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index c517115927c..17ab063bcab 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -779,11 +779,24 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
{
/*
* Any option beginning with _pq_. is reserved for use as a
- * protocol-level option, but at present no such options are
- * defined.
+ * protocol-level option.
*/
- unrecognized_protocol_options =
- lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ if (strcmp(nameptr, "_pq_.goaway") == 0)
+ {
+ /* Client wants to receive GoAway messages. */
+ if (strcmp(valptr, "1") != 0)
+ ereport(FATAL,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid value for protocol option \"%s\": \"%s\"",
+ nameptr, valptr),
+ errhint("Valid values are: \"1\".")));
+ port->goaway_negotiated = true;
+ }
+ else
+ {
+ unrecognized_protocol_options =
+ lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ }
}
else
{
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21de158adbb..1cc00f23385 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -42,8 +42,10 @@
#include "common/pg_prng.h"
#include "jit/jit.h"
#include "libpq/libpq.h"
+#include "libpq/pqcomm.h"
#include "libpq/pqformat.h"
#include "libpq/pqsignal.h"
+#include "libpq/protocol.h"
#include "mb/pg_wchar.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
@@ -92,6 +94,14 @@ const char *debug_query_string; /* client-supplied query string */
/* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
CommandDest whereToSendOutput = DestDebug;
+/*
+ * Track whether we've been notified of smart shutdown and sent GoAway.
+ * SmartShutdownPending is set by the PROCSIG_SMART_SHUTDOWN signal handler.
+ * GoAwaySent tracks whether we've already sent the GoAway message.
+ */
+static volatile sig_atomic_t SmartShutdownPending = false;
+static bool GoAwaySent = false;
+
/* flag for logging end of session */
bool Log_disconnections = false;
@@ -507,6 +517,20 @@ ProcessClientReadInterrupt(bool blocked)
/* Check for general interrupts that arrived before/while reading */
CHECK_FOR_INTERRUPTS();
+ /* Send GoAway message if smart shutdown is pending */
+ if (SmartShutdownPending && !GoAwaySent &&
+ whereToSendOutput == DestRemote &&
+ MyProcPort && MyProcPort->goaway_negotiated)
+ {
+ StringInfoData buf;
+
+ pq_beginmessage(&buf, PqMsg_GoAway);
+ pq_endmessage(&buf);
+ pq_flush();
+
+ GoAwaySent = true;
+ }
+
/* Process sinval catchup interrupts, if any */
if (catchupInterruptPending)
ProcessCatchupInterrupt();
@@ -3066,6 +3090,18 @@ FloatExceptionHandler(SIGNAL_ARGS)
"invalid operation, such as division by zero.")));
}
+/*
+ * Tell the next ProcessClientReadInterrupt() call to send a GoAway message to
+ * the client. Runs in a SIGUSR1 handler.
+ */
+void
+HandleSmartShutdownInterrupt(void)
+{
+ SmartShutdownPending = true;
+ InterruptPending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Tell the next CHECK_FOR_INTERRUPTS() to process recovery conflicts. Runs
* in a SIGUSR1 handler.
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 921b2daa4ff..b8769412c72 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -202,6 +202,11 @@ typedef struct Port
void *gss;
#endif
+ /*
+ * Protocol extensions.
+ */
+ bool goaway_negotiated; /* client supports GoAway message */
+
/*
* SSL structures.
*/
diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index eae8f0e7238..fd3c195daa2 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -53,6 +53,7 @@
#define PqMsg_FunctionCallResponse 'V'
#define PqMsg_CopyBothResponse 'W'
#define PqMsg_ReadyForQuery 'Z'
+#define PqMsg_GoAway 'g'
#define PqMsg_NoData 'n'
#define PqMsg_PortalSuspended 's'
#define PqMsg_ParameterDescription 't'
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..bba38b92f22 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -39,9 +39,10 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
* PGPROC->pendingRecoveryConflicts for the
* reason */
+ PROCSIG_SMART_SHUTDOWN, /* notify backend of smart shutdown */
} ProcSignalReason;
-#define NUM_PROCSIGNALS (PROCSIG_RECOVERY_CONFLICT + 1)
+#define NUM_PROCSIGNALS (PROCSIG_SMART_SHUTDOWN + 1)
typedef enum
{
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 5bc5bcfb20d..7447609b3cf 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -74,6 +74,7 @@ extern void die(SIGNAL_ARGS);
pg_noreturn extern void quickdie(SIGNAL_ARGS);
extern void StatementCancelHandler(SIGNAL_ARGS);
pg_noreturn extern void FloatExceptionHandler(SIGNAL_ARGS);
+extern void HandleSmartShutdownInterrupt(void);
extern void HandleRecoveryConflictInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index dbbae642d76..3385e65c389 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -210,3 +210,4 @@ PQgetAuthDataHook 207
PQdefaultAuthDataHook 208
PQfullProtocolVersion 209
appendPQExpBufferVA 210
+PQgoAwayReceived 211
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a0d2f749811..c7580874b5a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -700,6 +700,7 @@ pqDropServerData(PGconn *conn)
conn->password_needed = false;
conn->gssapi_used = false;
conn->write_failed = false;
+ conn->goaway_received = false;
free(conn->write_err_msg);
conn->write_err_msg = NULL;
conn->oauth_want_retry = false;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 203d388bdbf..bc1d95324ed 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2702,6 +2702,33 @@ PQnotifies(PGconn *conn)
return event;
}
+/*
+ * PQgoAwayReceived
+ * returns 1 if a GoAway message has been received from the server
+ * returns 0 if not
+ *
+ * Note that this function does not read any new data from the socket;
+ * caller should call PQconsumeInput() first if they want to ensure
+ * all available data has been read.
+ */
+int
+PQgoAwayReceived(PGconn *conn)
+{
+ if (!conn)
+ return 0;
+
+ if (conn->goaway_received)
+ return 1;
+
+ /*
+ * Parse any available data to see if a GoAway message has arrived.
+ */
+ parseInput(conn);
+
+ return conn->goaway_received ? 1 : 0;
+}
+
+
/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 90bbb2eba1f..d76f12c4e60 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -134,8 +134,8 @@ pqParseInput3(PGconn *conn)
}
/*
- * NOTIFY and NOTICE messages can happen in any state; always process
- * them right away.
+ * NOTIFY, NOTICE and GoAway messages can happen in any state;
+ * always process them right away.
*
* Most other messages should only be processed while in BUSY state.
* (In particular, in READY state we hold off further parsing until
@@ -159,6 +159,10 @@ pqParseInput3(PGconn *conn)
if (pqGetErrorNotice3(conn, false))
return;
}
+ else if (id == PqMsg_GoAway)
+ {
+ conn->goaway_received = true;
+ }
else if (conn->asyncStatus != PGASYNC_BUSY)
{
/* If not IDLE state, just wait ... */
@@ -1511,8 +1515,7 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
conn->pversion = their_version;
/*
- * We don't currently request any protocol extensions, so we don't expect
- * the server to reply with any either.
+ * Process the list of unsupported protocol extensions.
*/
for (int i = 0; i < num; i++)
{
@@ -1525,6 +1528,17 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
libpq_append_conn_error(conn, "received invalid protocol negotiation message: server reported unsupported parameter name without a \"%s\" prefix (\"%s\")", "_pq_.", conn->workBuffer.data);
goto failure;
}
+
+ /*
+ * Handle known protocol extensions. Unknown extensions that were not
+ * requested by us are an error.
+ */
+ if (strcmp(conn->workBuffer.data, "_pq_.goaway") == 0)
+ {
+ /* Server doesn't support GoAway, that's fine */
+ continue;
+ }
+
libpq_append_conn_error(conn, "received invalid protocol negotiation message: server reported an unsupported parameter that was not requested (\"%s\")", conn->workBuffer.data);
goto failure;
}
@@ -1868,6 +1882,9 @@ getCopyDataMessage(PGconn *conn)
if (getParameterStatus(conn))
return 0;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
case PqMsg_CopyData:
return msgLength;
case PqMsg_CopyDone:
@@ -2361,6 +2378,9 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
if (getParameterStatus(conn))
continue;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
default:
/* The backend violates the protocol. */
libpq_append_conn_error(conn, "protocol error: id=0x%x", id);
@@ -2486,6 +2506,14 @@ build_startup_packet(const PGconn *conn, char *packet,
}
}
+ /*
+ * Request protocol extensions. Only do this for protocol version 3.2 and
+ * later, to avoid confusing old proxies that don't understand _pq_.*
+ * options.
+ */
+ if (conn->pversion >= PG_PROTOCOL(3, 2))
+ ADD_STARTUP_OPTION("_pq_.goaway", "1");
+
/* Add trailing terminator */
if (packet)
packet[packet_len] = '\0';
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 905f2f33ab8..96efdcc523d 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -63,6 +63,10 @@ extern "C"
/* Indicates presence of the PQAUTHDATA_PROMPT_OAUTH_DEVICE authdata hook */
#define LIBPQ_HAS_PROMPT_OAUTH_DEVICE 1
+/* Features added in PostgreSQL v19: */
+/* Indicates presence of PQgoAwayReceived */
+#define LIBPQ_HAS_GOAWAY 1
+
/*
* Option flags for PQcopyResult
*/
@@ -411,6 +415,7 @@ extern const char *PQparameterStatus(const PGconn *conn,
extern int PQprotocolVersion(const PGconn *conn);
extern int PQfullProtocolVersion(const PGconn *conn);
extern int PQserverVersion(const PGconn *conn);
+extern int PQgoAwayReceived(PGconn *conn);
extern char *PQerrorMessage(const PGconn *conn);
extern int PQsocket(const PGconn *conn);
extern int PQbackendPID(const PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index fb6a7cbf15d..d2026a6a849 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -511,6 +511,7 @@ struct pg_conn
bool sigpipe_flag; /* can we mask SIGPIPE via MSG_NOSIGNAL? */
bool write_failed; /* have we had a write failure on sock? */
char *write_err_msg; /* write error message, or NULL if OOM */
+ bool goaway_received; /* true if server sent GoAway message */
bool auth_required; /* require an authentication challenge from
* the server? */
base-commit: 0f4c8d33d49da012a04076159a008c9fa80bcc47
prerequisite-patch-id: 8730e20fabcc9b062ea1cb95b02804ee2a0c7a26
prerequisite-patch-id: 1974dae8d4bedf3677795137c2281e2e5df8d955
prerequisite-patch-id: d85fed3b2719364b5a905d1ace23371be6f3ed03
prerequisite-patch-id: 0630ead46d1b8c40177a554b7793444a1cb57829
prerequisite-patch-id: d5bd9d7bbef967905058cd78007e4766b29e7833
--
2.52.0
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-02-25 15:08 ` Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
0 siblings, 1 reply; 15+ messages in thread
From: Zsolt Parragi @ 2026-02-25 15:08 UTC (permalink / raw)
To: Jelte Fennema-Nio <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
+ /*
+ * Only signal regular backends, since those need to notify
+ * their clients using a GoAway message.
+ */
+ if (bp->bkend_type == B_BACKEND)
This condition is slightly different to how SignalChildren works, is
that intentional? I don't think it causes any practical difference,
and I don't see an easy way to reuse SignalChildren for this, but
maybe it could still follow the same pattern.
Otherwise I don't see any other issues, and this also doesn't seem to
be an important comment.
Since the pytest framework seems unlikely to be included in PG19, have
you considered a different test implementation, to have at least some
minimal coverage?
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
@ 2026-03-13 09:39 ` Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
0 siblings, 1 reply; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-03-13 09:39 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
On Wed Feb 25, 2026 at 4:08 PM CET, Zsolt Parragi wrote:
> + /*
> + * Only signal regular backends, since those need to notify
> + * their clients using a GoAway message.
> + */
> + if (bp->bkend_type == B_BACKEND)
>
> This condition is slightly different to how SignalChildren works, is
> that intentional? I don't think it causes any practical difference,
> and I don't see an easy way to reuse SignalChildren for this, but
> maybe it could still follow the same pattern.
Changed it to be consistent now, and resolved a rebase conflict.
> Since the pytest framework seems unlikely to be included in PG19, have
> you considered a different test implementation, to have at least some
> minimal coverage?
I now included some basic support for GoAway in psql and added a perl
test based on that.
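For anyone who wants to exercise this outside psql: going by the protocol docs in the patch, the message on the wire is just the type byte 'g' followed by an Int32 length of 4 (the length counts itself and there is no payload). A minimal sketch of that framing follows. The helper names encode_goaway() and is_goaway() are made up for illustration and do not exist in the patch or in libpq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GOAWAY_MSG_TYPE 'g'		/* same byte as PqMsg_GoAway in the patch */
#define GOAWAY_MSG_LEN	4		/* Int32 length field, counts only itself */

/*
 * Hypothetical helper: write a GoAway message into buf.
 * Returns the number of bytes written (5), or 0 if buf is too small.
 */
size_t
encode_goaway(uint8_t *buf, size_t buflen)
{
	if (buflen < 5)
		return 0;

	buf[0] = GOAWAY_MSG_TYPE;
	/* length in network byte order (big endian) */
	buf[1] = 0;
	buf[2] = 0;
	buf[3] = 0;
	buf[4] = GOAWAY_MSG_LEN;
	return 5;
}

/*
 * Hypothetical helper: return 1 if buf starts with a complete, well-formed
 * GoAway message, 0 otherwise.
 */
int
is_goaway(const uint8_t *buf, size_t buflen)
{
	uint32_t	len;

	if (buflen < 5 || buf[0] != GOAWAY_MSG_TYPE)
		return 0;

	len = ((uint32_t) buf[1] << 24) |
		((uint32_t) buf[2] << 16) |
		((uint32_t) buf[3] << 8) |
		(uint32_t) buf[4];

	return len == GOAWAY_MSG_LEN;
}
```

A proxy or test harness could call is_goaway() on the bytes it reads from the server side. A real implementation would of course go through its normal message-framing code rather than a special case like this.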
Attachments:
[text/x-patch] v7-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch (26.8K, 2-v7-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch)
From 720a92a23ffb17bce2095b3701e85938a254b8a5 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:30:48 +0200
Subject: [PATCH v7] Add GoAway protocol message for graceful but fast server
shutdown/switchover
This commit introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that the client disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
This new message is used by Postgres during the already existing "smart"
shutdown procedure (i.e. when postmaster receives SIGTERM). When
Postgres is in "smart" shutdown mode existing clients can continue to
run queries as usual but new connection attempts are rejected. This mode
is primarily useful when triggering a switchover of a read replica. A
load balancer can route new connections only to the new read replica,
while the old read replica keeps serving the existing connections until
they disconnect. The problem is that this draining of connections can
often take a long time, even when clients only run very short
queries/transactions, because the session can be kept open much longer
(many connection pools use a 1 hour max lifetime of a connection by default).
With the introduction of the GoAway message Postgres now sends this
message to all connected clients when it enters smart shutdown mode.
If these clients respond to the message by reconnecting/disconnecting
earlier than their maximum connection lifetime the draining can complete
much quicker. Similar benefits to switchover duration can be achieved
for other applications or proxies implementing the Postgres protocol,
like when switching over a cluster of PgBouncer machines to a newer
version.
Applications/clients that use libpq can periodically check the result of
PQgoAwayReceived() at an inactive time to see whether they are asked to
reconnect.
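
As a rough illustration of the intended pool behavior (a sketch only, not code from this patch), the following models a pool that recycles a connection once it has seen GoAway and is not checked out. PoolConn, pool_conn_should_recycle() and pool_sweep() are hypothetical names. The goaway_received field stands in for whatever PQgoAwayReceived() last reported for that connection, so that the sketch compiles without libpq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool entry; a real pool would also hold a PGconn * here. */
typedef struct PoolConn
{
	bool		in_use;			/* currently checked out by a pool user? */
	bool		goaway_received;	/* last result of PQgoAwayReceived() */
	bool		closed;			/* already closed by the pool? */
} PoolConn;

/* True if the connection should be closed and replaced right now. */
bool
pool_conn_should_recycle(const PoolConn *pc)
{
	assert(pc != NULL);
	return !pc->closed && pc->goaway_received && !pc->in_use;
}

/*
 * Periodic sweep: close every idle connection that received GoAway.
 * Connections that are in use are left alone until they are returned.
 * Returns the number of connections closed.
 */
size_t
pool_sweep(PoolConn *conns, size_t n)
{
	size_t		recycled = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (pool_conn_should_recycle(&conns[i]))
		{
			/* real code would call PQfinish() and open a replacement */
			conns[i].closed = true;
			recycled++;
		}
	}
	return recycled;
}
```

The point of the design is that GoAway never interrupts a user of the pool: a busy connection is only recycled on a later sweep, after it has been returned.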
---
doc/src/sgml/libpq.sgml | 52 +++++++++++++++++++++
doc/src/sgml/protocol.sgml | 70 ++++++++++++++++++++++++++--
src/backend/postmaster/postmaster.c | 26 +++++++++++
src/backend/storage/ipc/procsignal.c | 3 ++
src/backend/tcop/backend_startup.c | 21 +++++++--
src/backend/tcop/postgres.c | 36 ++++++++++++++
src/bin/psql/common.c | 7 +++
src/bin/psql/meson.build | 1 +
src/bin/psql/t/040_goaway.pl | 46 ++++++++++++++++++
src/include/libpq/libpq-be.h | 5 ++
src/include/libpq/protocol.h | 1 +
src/include/storage/procsignal.h | 3 +-
src/include/tcop/tcopprot.h | 1 +
src/interfaces/libpq/exports.txt | 1 +
src/interfaces/libpq/fe-connect.c | 1 +
src/interfaces/libpq/fe-exec.c | 27 +++++++++++
src/interfaces/libpq/fe-protocol3.c | 39 ++++++++++++++--
src/interfaces/libpq/libpq-fe.h | 3 ++
src/interfaces/libpq/libpq-int.h | 1 +
19 files changed, 331 insertions(+), 13 deletions(-)
create mode 100644 src/bin/psql/t/040_goaway.pl
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6db823808fc..b304bca765c 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2248,6 +2248,14 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
equivalent to the latest protocol version supported by the libpq
version being used, which is currently <literal>3.2</literal>.
</para>
+
+ <para>
+ When using protocol version <literal>3.2</literal> or higher,
+ <application>libpq</application> requests all supported
+ <link linkend="protocol-extensions">protocol extensions</link>
+ from the server, such as <literal>goaway</literal>
+ which enables <xref linkend="libpq-PQgoAwayReceived"/>.
+ </para>
</listitem>
</varlistentry>
@@ -3004,6 +3012,50 @@ int PQserverVersion(const PGconn *conn);
</listitem>
</varlistentry>
+ <varlistentry id="libpq-PQgoAwayReceived">
+ <term><function>PQgoAwayReceived</function><indexterm><primary>PQgoAwayReceived</primary></indexterm></term>
+ <listitem>
+ <para>
+ Returns true if the server has sent a <literal>GoAway</literal> message,
+ requesting the client to disconnect when convenient.
+
+<synopsis>
+int PQgoAwayReceived(PGconn *conn);
+</synopsis>
+ </para>
+
+ <para>
+ The <literal>GoAway</literal> message is sent by the server during a
+ smart shutdown to politely request that clients disconnect. This is
+ advisory only - the connection remains fully functional and queries
+ can continue to be executed. Applications can choose to honor the request
+ by calling this function periodically and disconnect gracefully when
+ possible, such as after completing the current transaction.
+ </para>
+
+ <para>
+ This message is only sent to clients that request the
+ <literal>goaway</literal> protocol extension in the startup message,
+ which <application>libpq</application> does when using protocol version
+ 3.2 or higher (see <xref linkend="libpq-connect-max-protocol-version"/>).
+ The function returns 1 if the <literal>GoAway</literal> message was
+ received, 0 otherwise.
+ </para>
+
+ <para>
+ <function>PQgoAwayReceived</function> does not actually read data from the
+ server; it just reports whether a <literal>GoAway</literal> message was
+ previously absorbed by another <application>libpq</application> function. So normally you would first
+ call <xref linkend="libpq-PQconsumeInput"/>, then check
+ <function>PQgoAwayReceived</function>. You can use
+ <function>select()</function> to wait for data to arrive from the
+ server, thereby using no <acronym>CPU</acronym> power unless there is
+ something to do. (See <xref linkend="libpq-PQsocket"/> to obtain the file
+ descriptor number to use with <function>select()</function>.)
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="libpq-PQerrorMessage">
<term>
<function>PQerrorMessage</function><indexterm><primary>PQerrorMessage</primary></indexterm>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 49f81676712..eec13b29439 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -346,8 +346,14 @@
<tbody>
<row>
- <entry namest="last" align="center" valign="middle">
- <emphasis>(No supported protocol extensions are currently defined.)</emphasis>
+ <entry><literal>goaway</literal></entry>
+ <entry><literal>1</literal></entry>
+ <entry>PostgreSQL 19</entry>
+ <entry>
+ Enables the server to send
+ <link linkend="protocol-message-formats-GoAway">GoAway</link>
+ messages to request the client to disconnect when convenient.
+ See <xref linkend="protocol-async"/> for more details.
</entry>
</row>
</tbody>
@@ -710,6 +716,17 @@
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>GoAway</term>
+ <listitem>
+ <para>
+ The server requests the client politely to close the connection at its
+ earliest convenient moment. See <xref linkend="protocol-async"/> for
+ more details.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -1458,7 +1475,7 @@ SELCT 1/0;<!-- this typo is intentional -->
</para>
<para>
- It is possible for NoticeResponse and ParameterStatus messages to be
+ It is possible for NoticeResponse, ParameterStatus and GoAway messages to be
interspersed between CopyData messages; frontends must handle these cases,
and should be prepared for other asynchronous message types as well (see
<xref linkend="protocol-async"/>). Otherwise, any message type other than
@@ -1569,6 +1586,28 @@ SELCT 1/0;<!-- this typo is intentional -->
parameters that it does not understand or care about.
</para>
+ <para>
+ A GoAway message is sent by the server to politely request the client to
+ disconnect when convenient and reconnect when the client needs a connection
+ again. This is advisory only - the connection remains fully functional and
+ queries can continue to be executed. "When convenient" is very vaguely
+ defined on purpose because it depends on the client and application whether
+ such a moment even exists. So clients are allowed to completely ignore this
+ message and disconnect whenever they otherwise would have. An important
+ type of client that can actually honor the request to disconnect early is a
+ client that maintains a connection pool. Such a client can honor the
+ request by disconnecting a connection that has received a GoAway message
+ when it's not in use by a user of the pool. It is allowed for a server to
+ send multiple GoAway messages on the same connection, but any subsequent
+ GoAway messages after the first GoAway have no effect on the client's
+ behavior. The GoAway message is currently sent by Postgres during the
+ "smart" shutdown procedure (i.e. when postmaster receives
+ <systemitem>SIGTERM</systemitem>). The server only sends the GoAway
+ message to clients that request the <literal>goaway</literal> protocol
+ extension by setting <literal>_pq_.goaway</literal> to <literal>1</literal>
+ in the StartupMessage.
+ </para>
+
<para>
If a frontend issues a <command>LISTEN</command> command, then the
backend will send a NotificationResponse message (not to be
@@ -5359,6 +5398,31 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>
+ <varlistentry id="protocol-message-formats-GoAway">
+ <term>GoAway (B)</term>
+ <listitem>
+ <variablelist>
+ <varlistentry>
+ <term>Byte1('g')</term>
+ <listitem>
+ <para>
+ Identifies the message as a polite request for the client to disconnect.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int32(4)</term>
+ <listitem>
+ <para>
+ Length of message contents in bytes, including self.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="protocol-message-formats-GSSENCRequest">
<term>GSSENCRequest (F)</term>
<listitem>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..9a7baaba9e4 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2134,7 +2134,33 @@ process_pm_shutdown_request(void)
* later state, do not change it.
*/
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
+ {
+ dlist_iter iter;
+
connsAllowed = false;
+
+ /*
+ * Signal all backends to send a GoAway message to their
+ * clients, to politely request that they disconnect.
+ */
+ dlist_foreach(iter, &ActiveChildList)
+ {
+ PMChild *bp = dlist_container(PMChild, elem, iter.cur);
+
+ /*
+ * Only signal regular backends, since those are the ones
+ * that need to notify their clients using a GoAway
+ * message. Follow the same pattern as SignalChildren to
+ * correctly distinguish backends from WAL senders.
+ */
+ if (bp->bkend_type == B_BACKEND &&
+ !IsPostmasterChildWalSender(bp->child_slot))
+ {
+ SendProcSignal(bp->pid, PROCSIG_SMART_SHUTDOWN,
+ INVALID_PROC_NUMBER);
+ }
+ }
+ }
else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
{
/* There should be no clients, so proceed to stop children */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..dba32b91865 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -697,6 +697,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
+ if (CheckProcSignal(PROCSIG_SMART_SHUTDOWN))
+ HandleSmartShutdownInterrupt();
+
if (CheckProcSignal(PROCSIG_PARALLEL_APPLY_MESSAGE))
HandleParallelApplyMessageInterrupt();
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index c517115927c..17ab063bcab 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -779,11 +779,24 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
{
/*
* Any option beginning with _pq_. is reserved for use as a
- * protocol-level option, but at present no such options are
- * defined.
+ * protocol-level option.
*/
- unrecognized_protocol_options =
- lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ if (strcmp(nameptr, "_pq_.goaway") == 0)
+ {
+ /* Client wants to receive GoAway messages. */
+ if (strcmp(valptr, "1") != 0)
+ ereport(FATAL,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid value for protocol option \"%s\": \"%s\"",
+ nameptr, valptr),
+ errhint("Valid values are: \"1\".")));
+ port->goaway_negotiated = true;
+ }
+ else
+ {
+ unrecognized_protocol_options =
+ lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ }
}
else
{
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..4dcdb21c98b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -42,8 +42,10 @@
#include "common/pg_prng.h"
#include "jit/jit.h"
#include "libpq/libpq.h"
+#include "libpq/pqcomm.h"
#include "libpq/pqformat.h"
#include "libpq/pqsignal.h"
+#include "libpq/protocol.h"
#include "mb/pg_wchar.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
@@ -92,6 +94,14 @@ const char *debug_query_string; /* client-supplied query string */
/* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
CommandDest whereToSendOutput = DestDebug;
+/*
+ * Track whether we've been notified of smart shutdown and sent GoAway.
+ * SmartShutdownPending is set by the PROCSIG_SMART_SHUTDOWN signal handler.
+ * GoAwaySent tracks whether we've already sent the GoAway message.
+ */
+static volatile sig_atomic_t SmartShutdownPending = false;
+static bool GoAwaySent = false;
+
/* flag for logging end of session */
bool Log_disconnections = false;
@@ -507,6 +517,20 @@ ProcessClientReadInterrupt(bool blocked)
/* Check for general interrupts that arrived before/while reading */
CHECK_FOR_INTERRUPTS();
+ /* Send GoAway message if smart shutdown is pending */
+ if (SmartShutdownPending && !GoAwaySent &&
+ whereToSendOutput == DestRemote &&
+ MyProcPort && MyProcPort->goaway_negotiated)
+ {
+ StringInfoData buf;
+
+ pq_beginmessage(&buf, PqMsg_GoAway);
+ pq_endmessage(&buf);
+ pq_flush();
+
+ GoAwaySent = true;
+ }
+
/* Process sinval catchup interrupts, if any */
if (catchupInterruptPending)
ProcessCatchupInterrupt();
@@ -3066,6 +3090,18 @@ FloatExceptionHandler(SIGNAL_ARGS)
"invalid operation, such as division by zero.")));
}
+/*
+ * Tell the next ProcessClientReadInterrupt() call to send a GoAway message to
+ * the client. Runs in a SIGUSR1 handler.
+ */
+void
+HandleSmartShutdownInterrupt(void)
+{
+ SmartShutdownPending = true;
+ InterruptPending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Tell the next CHECK_FOR_INTERRUPTS() to process recovery conflicts. Runs
* in a SIGUSR1 handler.
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2eadd391a9c..4e99ed69841 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -739,6 +739,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout,
static void
PrintNotifications(void)
{
+ static bool goaway_reported = false;
PGnotify *notify;
PQconsumeInput(pset.db);
@@ -755,6 +756,12 @@ PrintNotifications(void)
PQfreemem(notify);
PQconsumeInput(pset.db);
}
+
+ if (!goaway_reported && PQgoAwayReceived(pset.db))
+ {
+ pg_log_info("Server sent GoAway, requesting disconnect when convenient.");
+ goaway_reported = true;
+ }
}
diff --git a/src/bin/psql/meson.build b/src/bin/psql/meson.build
index 922b2845267..047b40cc40e 100644
--- a/src/bin/psql/meson.build
+++ b/src/bin/psql/meson.build
@@ -78,6 +78,7 @@ tests += {
't/010_tab_completion.pl',
't/020_cancel.pl',
't/030_pager.pl',
+ 't/040_goaway.pl',
],
},
}
diff --git a/src/bin/psql/t/040_goaway.pl b/src/bin/psql/t/040_goaway.pl
new file mode 100644
index 00000000000..a53653308bf
--- /dev/null
+++ b/src/bin/psql/t/040_goaway.pl
@@ -0,0 +1,46 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+my $psql = $node->background_psql('postgres');
+
+# Confirm connection works
+my $result = $psql->query_safe("SELECT 'before_goaway'");
+like($result, qr/before_goaway/, 'connection works before smart shutdown');
+
+# Initiate smart shutdown without waiting for it to complete
+$node->command_ok(
+ [ 'pg_ctl', 'stop', '-D', $node->data_dir, '-m', 'smart', '--no-wait' ],
+ 'pg_ctl smart shutdown');
+
+# The backend sends GoAway once it processes the smart shutdown signal.
+# Poll with queries until psql reports it.
+my $saw_goaway = 0;
+for (my $i = 0; $i < 100; $i++)
+{
+ my $out = $psql->query("SELECT 'after_goaway'");
+ if ($psql->{stderr} =~ /Server sent GoAway, requesting disconnect when convenient/)
+ {
+ $saw_goaway = 1;
+ # The query should still have succeeded
+ like($out, qr/after_goaway/, 'query still works after GoAway');
+ last;
+ }
+ usleep(50_000);
+}
+ok($saw_goaway, 'psql reported GoAway notice during smart shutdown');
+
+$psql->quit;
+
+done_testing();
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 921b2daa4ff..b8769412c72 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -202,6 +202,11 @@ typedef struct Port
void *gss;
#endif
+ /*
+ * Protocol extensions.
+ */
+ bool goaway_negotiated; /* client supports GoAway message */
+
/*
* SSL structures.
*/
diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index eae8f0e7238..fd3c195daa2 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -53,6 +53,7 @@
#define PqMsg_FunctionCallResponse 'V'
#define PqMsg_CopyBothResponse 'W'
#define PqMsg_ReadyForQuery 'Z'
+#define PqMsg_GoAway 'g'
#define PqMsg_NoData 'n'
#define PqMsg_PortalSuspended 's'
#define PqMsg_ParameterDescription 't'
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..bba38b92f22 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -39,9 +39,10 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
* PGPROC->pendingRecoveryConflicts for the
* reason */
+ PROCSIG_SMART_SHUTDOWN, /* notify backend of smart shutdown */
} ProcSignalReason;
-#define NUM_PROCSIGNALS (PROCSIG_RECOVERY_CONFLICT + 1)
+#define NUM_PROCSIGNALS (PROCSIG_SMART_SHUTDOWN + 1)
typedef enum
{
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 5bc5bcfb20d..7447609b3cf 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -74,6 +74,7 @@ extern void die(SIGNAL_ARGS);
pg_noreturn extern void quickdie(SIGNAL_ARGS);
extern void StatementCancelHandler(SIGNAL_ARGS);
pg_noreturn extern void FloatExceptionHandler(SIGNAL_ARGS);
+extern void HandleSmartShutdownInterrupt(void);
extern void HandleRecoveryConflictInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1e3d5bd5867..f57aae46b65 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -211,3 +211,4 @@ PQdefaultAuthDataHook 208
PQfullProtocolVersion 209
appendPQExpBufferVA 210
PQgetThreadLock 211
+PQgoAwayReceived 212
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index db9b4c8edbf..4d81d5f03bb 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -701,6 +701,7 @@ pqDropServerData(PGconn *conn)
conn->password_needed = false;
conn->gssapi_used = false;
conn->write_failed = false;
+ conn->goaway_received = false;
free(conn->write_err_msg);
conn->write_err_msg = NULL;
conn->oauth_want_retry = false;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 203d388bdbf..bc1d95324ed 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2702,6 +2702,33 @@ PQnotifies(PGconn *conn)
return event;
}
+/*
+ * PQgoAwayReceived
+ * returns 1 if a GoAway message has been received from the server
+ * returns 0 if not
+ *
+ * Note that this function does not read any new data from the socket;
+ * caller should call PQconsumeInput() first if they want to ensure
+ * all available data has been read.
+ */
+int
+PQgoAwayReceived(PGconn *conn)
+{
+ if (!conn)
+ return 0;
+
+ if (conn->goaway_received)
+ return 1;
+
+ /*
+ * Parse any available data to see if a GoAway message has arrived.
+ */
+ parseInput(conn);
+
+ return conn->goaway_received ? 1 : 0;
+}
+
+
/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 8c1fda5caf0..d09002a22ac 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -134,8 +134,8 @@ pqParseInput3(PGconn *conn)
}
/*
- * NOTIFY and NOTICE messages can happen in any state; always process
- * them right away.
+ * NOTIFY and NOTICE and GoAway messages can happen in any state;
+ * always process them right away.
*
* Most other messages should only be processed while in BUSY state.
* (In particular, in READY state we hold off further parsing until
@@ -159,6 +159,10 @@ pqParseInput3(PGconn *conn)
if (pqGetErrorNotice3(conn, false))
return;
}
+ else if (id == PqMsg_GoAway)
+ {
+ conn->goaway_received = true;
+ }
else if (conn->asyncStatus != PGASYNC_BUSY)
{
/* If not IDLE state, just wait ... */
@@ -1446,6 +1450,7 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
int num;
bool found_test_protocol_negotiation;
bool expect_test_protocol_negotiation;
+ bool requested_goaway = conn->pversion >= PG_PROTOCOL(3, 2);
/*
* During 19beta only, if protocol grease is in use, assume that it's the
@@ -1521,8 +1526,7 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
conn->pversion = their_version;
/*
- * Check that all expected unsupported parameters are reported by the
- * server.
+ * Process the list of unsupported protocol extensions.
*/
found_test_protocol_negotiation = false;
expect_test_protocol_negotiation = (conn->max_pversion == PG_PROTOCOL_GREASE);
@@ -1539,12 +1543,23 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
goto failure;
}
- /* Check if this is the expected test parameter */
+ /*
+ * Handle protocol extensions that we requested. Extensions that were
+ * not requested by us are an error.
+ */
if (expect_test_protocol_negotiation &&
strcmp(conn->workBuffer.data, "_pq_.test_protocol_negotiation") == 0)
{
found_test_protocol_negotiation = true;
}
+ else if (requested_goaway && strcmp(conn->workBuffer.data, "_pq_.goaway") == 0)
+ {
+ /*
+ * Server doesn't support GoAway, that's fine. It will simply never
+ * send the message.
+ */
+ continue;
+ }
else
{
libpq_append_conn_error(conn, "received invalid protocol negotiation message: server reported an unsupported parameter that was not requested (\"%s\")",
@@ -1905,6 +1920,9 @@ getCopyDataMessage(PGconn *conn)
if (getParameterStatus(conn))
return 0;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
case PqMsg_CopyData:
return msgLength;
case PqMsg_CopyDone:
@@ -2398,6 +2416,9 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
if (getParameterStatus(conn))
continue;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
default:
/* The backend violates the protocol. */
libpq_append_conn_error(conn, "protocol error: id=0x%x", id);
@@ -2531,6 +2552,14 @@ build_startup_packet(const PGconn *conn, char *packet,
}
}
+ /*
+ * Request protocol extensions. Only do this for protocol version 3.2 and
+ * later, to avoid confusing old proxies that don't understand _pq_.*
+ * options.
+ */
+ if (conn->pversion >= PG_PROTOCOL(3, 2))
+ ADD_STARTUP_OPTION("_pq_.goaway", "1");
+
/* Add trailing terminator */
if (packet)
packet[packet_len] = '\0';
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index f06e7a972c3..d1a6dd54407 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,6 +68,8 @@ extern "C"
#define LIBPQ_HAS_GET_THREAD_LOCK 1
/* Indicates presence of the PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 authdata hook */
#define LIBPQ_HAS_OAUTH_BEARER_TOKEN_V2 1
+/* Indicates presence of PQgoAwayReceived */
+#define LIBPQ_HAS_GOAWAY 1
/*
* Option flags for PQcopyResult
@@ -419,6 +421,7 @@ extern const char *PQparameterStatus(const PGconn *conn,
extern int PQprotocolVersion(const PGconn *conn);
extern int PQfullProtocolVersion(const PGconn *conn);
extern int PQserverVersion(const PGconn *conn);
+extern int PQgoAwayReceived(PGconn *conn);
extern char *PQerrorMessage(const PGconn *conn);
extern int PQsocket(const PGconn *conn);
extern int PQbackendPID(const PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index bd7eb59f5f8..2018927d0cb 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -511,6 +511,7 @@ struct pg_conn
bool sigpipe_flag; /* can we mask SIGPIPE via MSG_NOSIGNAL? */
bool write_failed; /* have we had a write failure on sock? */
char *write_err_msg; /* write error message, or NULL if OOM */
+ bool goaway_received; /* true if server sent GoAway message */
bool auth_required; /* require an authentication challenge from
* the server? */
base-commit: f30cebb9542358702ca0f2c4be2cd504a2568606
--
2.53.0
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-03-20 19:20 ` Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jacob Champion <[email protected]>
2026-03-23 22:34 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jim Nasby <[email protected]>
2026-03-24 14:27 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
0 siblings, 3 replies; 15+ messages in thread
From: Tomas Vondra @ 2026-03-20 19:20 UTC (permalink / raw)
To: Jelte Fennema-Nio <[email protected]>; Zsolt Parragi <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
Hi,
I've looked at this patch today, to check if there's something we could
get done in PG19.
I find it a bit unfortunate that we didn't get much feedback from people
working on the client/downstream side - clients, connection
poolers/middleware, that sort of stuff. OK, we did hear from Kirill, and
he seems to like it.
I'm not very involved in the protocol stuff, so I'm sure there's a lot
of details I'm missing. It'd be very helpful if there was some sort of PoC
support on the pooler/client side, so that I can experiment with it and
see how helpful the new protocol message is. But I realize that's a bit
too much to ask for.
A couple thoughts about this (some of this may be missing what the patch
aims to do).
* Does it make sense to tie this to smart shutdowns? I realize it's just
an example, and it probably makes sense to send the GoAway message
before a shutdown. But isn't this a bit similar to cancel/terminate of a
backend? Why not to have a pg_goaway_backend() function, that'd send the
message to a single backend? It might be useful for load-balancing, if
we could pick a "heavy" backend and ask it to reconnect / move to a
different replica. (Could that be handled by a middleware?)
* In fact, does it improve the smart shutdown case in practice? Let's
say we have a single instance, and we're restarting it. It'll send
GoAway to all the clients, the good clients will try to reconnect. But
if there's even a single "bad" client ignoring the GoAway, all the
well-behaved clients will get stuck. Ofc, that can happen without the
GoAway message too - a client may disconnect because of timeout etc. But
it makes it more likely, and it'll affect the well-behaved clients.
* Would it make sense to have some payload in the GoAway message? I'm
thinking about (a) some deadline by which the client should disconnect,
e.g. time of planned restart / shutdown, (b) priority, expressing how
much the client should try to disconnect (and maybe take more drastic
actions).
Also, two minor comments:
* The sgml docs say the function is defined as
int PQgoAwayReceived(const PGconn *conn);
but in the .h file it's defined without the "const".
* The new entry in protocol.sgml (in the "Supported Protocol Extensions"
table) says
<entry><literal>goaway</literal></entry>
but the following table includes "_pq_" in the entry name. Should the
new entry do the same?
regards
--
Tomas Vondra
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
@ 2026-03-20 20:11 ` Jacob Champion <[email protected]>
2026-03-24 14:29 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2 siblings, 1 reply; 15+ messages in thread
From: Jacob Champion @ 2026-03-20 20:11 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Jelte Fennema-Nio <[email protected]>; Zsolt Parragi <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Heikki Linnakangas <[email protected]>
On Fri, Mar 20, 2026 at 12:20 PM Tomas Vondra <[email protected]> wrote:
> * In fact, does it improve the smart shutdown case in practice? Let's
> say we have a single instance, and we're restarting it. It'll send
> GoAway to all the clients, the good clients will try to reconnect. But
> if there's even a single "bad" client ignoring the GoAway, all the
> well-behaved clients will get stuck. Ofc, that can happen without the
> GoAway message too - a client may disconnect because of timeout etc. But
> it makes it more likely, and it'll affect the well-behaved clients.
>
> * Would it make sense to have some payload in the GoAway message? I'm
> thinking about (a) some deadline by which the client should disconnect,
> e.g. time of planned restart / shutdown, (b) priority, expressing how
> much the client should try to disconnect (and maybe take more drastic
> actions).
I'd been wondering about these as well, but in the context of the
tangential thread [1]. HTTP has much stronger semantics for its GOAWAY
frames, for instance.
--Jacob
[1] https://postgr.es/m/CAOYmi%2BmSn8xQ7ExqY07V6G2oFXN2nY%2B7f4yf_RV2%3D7xNCKwW-Q%40mail.gmail.com
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jacob Champion <[email protected]>
@ 2026-03-24 14:29 ` Jelte Fennema-Nio <[email protected]>
2026-03-24 22:21 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-30 16:23 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
0 siblings, 2 replies; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-03-24 14:29 UTC (permalink / raw)
To: Jacob Champion <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Zsolt Parragi <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Heikki Linnakangas <[email protected]>
On Fri, 20 Mar 2026 at 21:11, Jacob Champion <[email protected]> wrote:
> I'd been wondering about these as well, but in the context of the
> tangential thread [1]. HTTP has much stronger semantics for its GOAWAY
> frames, for instance.
I reread the HTTP/3 GOAWAY spec[1], but I think our protocol is too
different from HTTP/3 to take any lessons from it (at the moment at
least). HTTP/3 "streams" are independent, we have no such concept. Our
whole session is a single stream, due to all of our session state. So
the semantics that on a single connection a client cannot open newer
streams do not really mean anything useful for us, i.e. there's no way to
open a new stream. Even the "which messages have definitely not been
processed" feature can already be inferred from the server right now, by
tracking what responses have been received before the server closes the
connection. So I cannot think of any useful payload to add to the GoAway
message.
[1]: https://www.rfc-editor.org/rfc/rfc9114.html#connection-shutdown
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jacob Champion <[email protected]>
2026-03-24 14:29 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-03-24 22:21 ` Zsolt Parragi <[email protected]>
1 sibling, 0 replies; 15+ messages in thread
From: Zsolt Parragi @ 2026-03-24 22:21 UTC (permalink / raw)
To: Jelte Fennema-Nio <[email protected]>; +Cc: Jacob Champion <[email protected]>; Tomas Vondra <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Heikki Linnakangas <[email protected]>
+ if (!goaway_reported && PQgoAwayReceived(pset.db))
+ {
+ pg_log_info("Server sent GoAway, requesting disconnect when convenient.");
+ goaway_reported = true;
+ }
Shouldn't this variable be reset in pqDropServerData?
+ It is possible for NoticeResponse, ParameterStatus and GoAway
messages to be
interspersed between CopyData messages; frontends must handle these cases,
The patch currently doesn't actually do this, did you add this as
future proofing?
> I thought some more about this, but ultimately, the payloads you suggest
> only seem useful if a client has something inbetween "disconnect hard
> now" and "disconnect when the connection is unused"
Couldn't a client optimize the reconnect time if it knows about the
deadline? If it knows that it still has 10 minutes before the server
kicks it out, it might choose to finish a 3-4 minute task, reconnect,
and then continue.
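That trade-off could be sketched roughly as follows. This assumes a
hypothetical deadline payload (the v8 patch sends no payload at all), and
choose_goaway_action() with its parameters is a made-up name used only for
illustration, not part of libpq or the patch.

```c
#include <assert.h>

/* Hypothetical client-side decision after receiving GoAway with a deadline. */
typedef enum
{
    GOAWAY_RECONNECT_NOW,
    GOAWAY_FINISH_TASK_FIRST
} GoAwayAction;

/*
 * Finish the current task only if it, plus a safety margin for the
 * reconnect itself, still fits before the (assumed) server deadline.
 * All arguments are in milliseconds.
 */
GoAwayAction
choose_goaway_action(long ms_until_deadline, long task_ms_estimate,
                     long safety_margin_ms)
{
    assert(task_ms_estimate >= 0 && safety_margin_ms >= 0);

    if (task_ms_estimate + safety_margin_ms <= ms_until_deadline)
        return GOAWAY_FINISH_TASK_FIRST;
    return GOAWAY_RECONNECT_NOW;
}
```

With 10 minutes left and a 4 minute task the sketch picks "finish first";
without any deadline information a client can only fall back to "reconnect
when the connection is unused".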
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jacob Champion <[email protected]>
2026-03-24 14:29 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-03-30 16:23 ` Jelte Fennema-Nio <[email protected]>
2026-03-31 07:12 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
1 sibling, 1 reply; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-03-30 16:23 UTC (permalink / raw)
To: Tatsuo Ishii <[email protected]>; +Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
On Wed, 25 Mar 2026 at 01:36, Tatsuo Ishii <[email protected]> wrote:
> 1. If clients do not disconnect a session even if they have received
> the GoAway message, PostgreSQL server will give up the shutdown
> sequence. In this case, shouldn't the PostgreSQL server send a
> message indicating "I have given up the smart shutdown request"?
> Otherwise, the fact that GoAway has been received will remain in
> the client, and if the client does not check the receiving timely,
> the client may exit the session unnecessarily.
You mean effectively being able to undo sending the GoAway message?
As the patch is right now, the server will never give up on the
shutdown sequence. It will wait indefinitely until it can shut down.
This is unchanged from the current smart shutdown mode behaviour on
master.
I think it's an interesting idea, but I don't think it's worth the
cost of implementing and maintaining. I don't think it's common for
programs to support cancelling a shutdown sequence. I can't think of
any databases or network services I worked with that support it.
Generally if you want something to shut down, there's a slow graceful
way, and a fast non-graceful way. Personally, when the slow graceful
shutdown did not finish "in time for me", I have never felt the need
to cancel the shutdown sequence. Instead I normally trigger the fast
non-graceful shutdown at that point (either manually or through some
automated timeout).
> 2. Can we use a NOTICE message instead of the new protocol GoAway for
> the purpose?
It's a good question. Sadly not easily, e.g. setting `client_min_messages`
to WARNING or ERROR would prevent the NOTICE from being sent to the client.
I also thought about adding a read-only parameter and using
ParameterStatus instead of a new message, i.e. similar to how a server
reports its hot_standby status.
A significant issue also arises when the connection isn't made
directly to Postgres, but instead goes through a pooler like PgBouncer.
The GoAway signal is conceptually a link-level message, i.e. it's
about the connection directly to Postgres that should be disconnected.
Proxies, however, generally forward all messages from Postgres to the
client. If a proxy does that with this signal, the client will
disconnect, while the pooler keeps the connection open to the server. So
now the client disconnected from PgBouncer for no reason, but the
connection to Postgres is still there. I don't think that's behaviour we
want to happen.
That's why I think it would actually be good to have the GoAway
message be opt-in at the connection level, so if an old PgBouncer
doesn't know how to deal with it, it won't forward it accidentally to
the client.
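To make the forwarding concern concrete, here is a minimal sketch of how a
pooler might treat GoAway as a link-level message: it consumes the message,
marks the server connection for retirement, and never relays it to the
client. Only the message type byte 'g' comes from the patch description; the
ServerLink struct and both functions are hypothetical names for
illustration, not PgBouncer or Pgpool-II code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* GoAway message type byte, as stated in the patch description. */
#define MSG_GOAWAY 'g'

/* Hypothetical pooler-side state for one connection to Postgres. */
typedef struct ServerLink
{
    bool goaway_received;   /* server asked us to drop this link */
} ServerLink;

/*
 * Inspect the type byte of one backend message. Returns true if the
 * message should be relayed to the client, false if the pooler consumes
 * it itself.
 */
bool
pooler_should_forward(ServerLink *server, char msgtype)
{
    assert(server != NULL);

    if (msgtype != MSG_GOAWAY)
        return true;

    /* Link-level: remember it for this server link, do not relay it. */
    server->goaway_received = true;
    return false;
}

/*
 * Called when a server link becomes idle: a link that received GoAway
 * should be closed instead of being handed out again.
 */
bool
server_link_reusable(const ServerLink *server)
{
    assert(server != NULL);
    return !server->goaway_received;
}
```

A pooler that predates GoAway has no such branch and would relay the
message unchanged, which is the situation the connection-level opt-in is
meant to prevent.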
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jacob Champion <[email protected]>
2026-03-24 14:29 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-30 16:23 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
@ 2026-03-31 07:12 ` Jelte Fennema-Nio <[email protected]>
0 siblings, 0 replies; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-03-31 07:12 UTC (permalink / raw)
To: Tatsuo Ishii <[email protected]>; +Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
On Tue, 31 Mar 2026 at 01:17, Tatsuo Ishii <[email protected]> wrote:
> Interesting. In Pgpool-II case, client disconnection is not a problem,
> the connection to PostgreSQL is kept open anyway (if connection
> expiration is not set). Probably this is because Pgpool-II only
> supports "session level" connection pooling.
I guess I was not clear, because the same problem exists for
Pgpool-II. When pgpool forwards the message to the client, the client
will disconnect from pgpool. But if connection_life_time is not
reached (and by default this is unlimited) pgpool will not disconnect
from the postgres server. So the postgres server has not actually
achieved the intended goal, but the client still disconnected for no
benefit, only downsides.
Maybe the problem is not too bad, i.e. it will cause some unnecessary
disconnects from the client, but it shouldn't cause big problems. So
maybe the ParameterStatus approach is worth exploring again (I
remember I ran into some problems, due to us normally only sending
ParameterStatus at the end of a query, but I'm pretty sure that can be
worked around somehow).
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
@ 2026-03-23 22:34 ` Jim Nasby <[email protected]>
2 siblings, 0 replies; 15+ messages in thread
From: Jim Nasby @ 2026-03-23 22:34 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Jelte Fennema-Nio <[email protected]>; Zsolt Parragi <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>
On Fri, Mar 20, 2026 at 2:20 PM Tomas Vondra <[email protected]> wrote:
> * Does it make sense to tie this to smart shutdowns? I realize it's just
> an example, and it probably makes sense to send the GoAway message
> before a shutdown. But isn't this a bit similar to cancel/terminate of a
> backend? Why not to have a pg_goaway_backend() function, that'd send the
> message to a single backend? It might be useful for load-balancing, if
> we could pick a "heavy" backend and ask it to reconnect / move to a
> different replica. (Could that be handled by a middleware?)
>
+1. Another scenario that comes to mind is asking for a reconnect based on
backend memory consumption, since there's a bunch of internal structures
(relcache, etc) that can grow in an unbounded fashion.
* Re: Add GoAway protocol message for graceful but fast server shutdown/switchover
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-10 21:59 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Re: Add GoAway protocol message for graceful but fast server shutdown/switchover Tomas Vondra <[email protected]>
@ 2026-03-24 14:27 ` Jelte Fennema-Nio <[email protected]>
2 siblings, 0 replies; 15+ messages in thread
From: Jelte Fennema-Nio @ 2026-03-24 14:27 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Zsolt Parragi <[email protected]>; PostgreSQL Hackers <[email protected]>; Dave Cramer <[email protected]>; Jacob Champion <[email protected]>; Heikki Linnakangas <[email protected]>; [email protected]
On Fri, 20 Mar 2026 at 20:20, Tomas Vondra <[email protected]> wrote:
> It'd be very helpful if there was some sort of PoC
> support on the pooler/client side, so that I can experiment with it and
> see how helpful the new protocol message is. But I realize that's a bit
> too much to ask for.
I'll see if I can whip something up, it shouldn't be too hard.
> Why not to have a pg_goaway_backend() function, that'd send the
> message to a single backend?
I like this idea a lot. So I added it in the attached v8 patch. This
also allowed me to add low-level tests using the libpq_pipeline
testsuite.
> * In fact, does it improve the smart shutdown case in practice? Let's
> say we have a single instance, and we're restarting it. It'll send
> GoAway to all the clients, the good clients will try to reconnect. But
> if there's even a single "bad" client ignoring the GoAway, all the
> well-behaved clients will get stuck. Ofc, that can happen without the
> GoAway message too - a client may disconnect because of timeout etc. But
> it makes it more likely, and it'll affect the well-behaved clients.
For primary server restarts, I don't think anyone should be using smart
shutdown right now either. Any new connections to the database will be
failing for an indeterminate amount of time. I agree that sending GoAway
might worsen the problem in some cases, but it's already terrible to
start with. Fast shutdown is the only sensible restart mode for a
primary server. This seems to be generally accepted knowledge, given
that we use SIGINT (fast shutdown) in our systemd example[1].
Sending a GoAway on smart shutdown makes that shutdown mode very useful
for read replicas during a planned switch-over to another replica. Now
clients can finish their work and quickly reconnect to the new read
replica, minimizing switchover time while preventing errors.
Even when restarting primary servers, triggering a smart shutdown has
benefits, as long as it's followed by a fast shutdown after a short
delay (e.g., 1 second). This causes slightly longer downtime (the
additional delay), but it allows most clients to disconnect on their own
terms instead of in the middle of a query. Connection errors can often
be retried transparently more easily than errors in the middle of a
query. In effect, for many applications, this could mean a reduction in
errors and only an increase in latency during a restart.
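A minimal sketch of that two-step restart, assuming a POSIX system and that
the caller already knows the postmaster PID. SIGTERM (smart) and SIGINT
(fast) are the signals mentioned in this thread; the function name and the
grace period parameter are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask for a smart shutdown, wait a short grace period, then escalate to
 * a fast shutdown. Returns 0 on success, -1 if a signal could not be sent.
 */
int
smart_then_fast_shutdown(pid_t postmaster_pid, unsigned int grace_seconds)
{
    /* Guard: kill() with pid <= 0 would target process groups instead. */
    assert(postmaster_pid > 0);

    /* Smart shutdown: with the patch, opted-in clients receive GoAway. */
    if (kill(postmaster_pid, SIGTERM) != 0)
        return -1;

    /* Grace period; sleep() may return early if a signal handler runs. */
    sleep(grace_seconds);

    /* Fast shutdown; ESRCH just means the postmaster already exited. */
    if (kill(postmaster_pid, SIGINT) != 0 && errno != ESRCH)
        return -1;

    return 0;
}
```

In practice the same effect is usually achieved with pg_ctl or a service
manager timeout; the point is only that the grace period bounds the extra
downtime.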
> * Would it make sense to have some payload in the GoAway message? I'm
> thinking about (a) some deadline by which the client should disconnect,
> e.g. time of planned restart / shutdown, (b) priority, expressing how
> much the client should try to disconnect (and maybe take more drastic
> actions).
I thought some more about this, but ultimately, the payloads you suggest
only seem useful if a client has something inbetween "disconnect hard
now" and "disconnect when the connection is unused". I cannot think of
any such cases, i.e. what other "drastic actions" could a client take
instead of simply closing the connection? If that's the only
possibility, why not simply have the server close the connection in that
case?
Overall, I agree that having no payload in this new message feels a bit
weird. But ultimately, clients don't need any payload to do something
useful.
> Also, two minor comments:
Fixed.
Attachments:
[text/x-patch] v8-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch (38.9K, 2-v8-0001-Add-GoAway-protocol-message-for-graceful-but-fast.patch)
download | inline diff:
From aae99054edbb594fceac18c26ccbdca38de63980 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <[email protected]>
Date: Sun, 19 Oct 2025 00:30:48 +0200
Subject: [PATCH v8] Add GoAway protocol message for graceful but fast server
shutdown/switchover
This commit introduces a new GoAway backend-to-frontend protocol
message (byte 'g') that the server can send to the client to politely
request that the client disconnect/reconnect when convenient. This message is
advisory only - the connection remains fully functional and clients may
continue executing queries and starting new transactions. "When
convenient" is obviously not very well defined, but the primary target
clients are clients that maintain a connection pool. Such clients should
disconnect/reconnect a connection in the pool when there's no user of
that connection. This is similar to how such clients often currently
remove a connection from the pool after the connection hits a maximum
lifetime of e.g. 1 hour.
A new pg_goaway_backend function is introduced to tell a backend to send
the GoAway message to its connected client. This makes it effectively
the graceful counterpart to pg_terminate_backend. This function can be
useful to trigger a graceful reconnect, for example when a backend is
using a lot of memory and an operator wants the process to clear its
catalog caches.
Apart from being sent by pg_goaway_backend, the GoAway message is used by
Postgres during the already existing "smart" shutdown procedure (i.e.
when postmaster receives SIGTERM). When Postgres is in "smart" shutdown
mode existing clients can continue to run queries as usual but new
connection attempts are rejected. This mode is primarily useful when
triggering a switchover of a read replica. A load balancer can route new
connections only to the new read replica, while the old read replica
keeps serving the existing connections until they disconnect. The
problem is that this draining of connections can often take a long
time, even when clients only run very short queries/transactions, because
the session can be kept open much longer (many connection pools use 1
hour max lifetime of a connection by default). With the introduction of
the GoAway message, Postgres now sends this message to all connected
clients when it enters smart shutdown mode. If these clients respond to
the message by reconnecting/disconnecting earlier than their maximum
connection lifetime, the draining can complete much quicker. Similar
benefits to switchover duration can be achieved for other applications
or proxies implementing the Postgres protocol, like when switching over
a cluster of PgBouncer machines to a newer version.
Applications/clients that use libpq can periodically check the result of
PQgoAwayReceived() at an inactive time to see whether they are asked to
reconnect.
---
doc/src/sgml/func/func-admin.sgml | 33 +++++
doc/src/sgml/libpq.sgml | 52 ++++++++
doc/src/sgml/protocol.sgml | 70 ++++++++++-
src/backend/postmaster/postmaster.c | 26 ++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/signalfuncs.c | 119 ++++++++++++++++--
src/backend/tcop/backend_startup.c | 21 +++-
src/backend/tcop/postgres.c | 36 ++++++
src/bin/psql/common.c | 7 ++
src/bin/psql/meson.build | 1 +
src/bin/psql/t/040_goaway.pl | 46 +++++++
src/include/catalog/pg_proc.dat | 5 +
src/include/libpq/libpq-be.h | 5 +
src/include/libpq/protocol.h | 1 +
src/include/storage/procsignal.h | 3 +-
src/include/tcop/tcopprot.h | 1 +
src/interfaces/libpq/exports.txt | 1 +
src/interfaces/libpq/fe-connect.c | 1 +
src/interfaces/libpq/fe-exec.c | 27 ++++
src/interfaces/libpq/fe-protocol3.c | 39 +++++-
src/interfaces/libpq/libpq-fe.h | 3 +
src/interfaces/libpq/libpq-int.h | 1 +
.../modules/libpq_pipeline/libpq_pipeline.c | 73 +++++++++++
23 files changed, 553 insertions(+), 21 deletions(-)
create mode 100644 src/bin/psql/t/040_goaway.pl
diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 210b1118bdf..3338169bebb 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -163,6 +163,39 @@
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_goaway_backend</primary>
+ </indexterm>
+ <function>pg_goaway_backend</function> ( <parameter>pid</parameter> <type>integer</type>, <parameter>timeout</parameter> <type>bigint</type> <literal>DEFAULT</literal> <literal>0</literal> )
+ <returnvalue>boolean</returnvalue>
+ </para>
+ <para>
+ Requests the backend process with the specified process ID to send a
+ <literal>GoAway</literal> message to its client, politely asking it to
+ disconnect when convenient. The connection remains fully functional
+ after receiving the message. The same permission rules as
+ <function>pg_cancel_backend</function> apply.
+ </para>
+ <para>
+ If <parameter>timeout</parameter> is not specified or zero, this
+ function returns <literal>true</literal> whether the client actually
+ disconnects or not, indicating only that the request was successfully
+ sent to the backend. If the <parameter>timeout</parameter> is
+ specified (in milliseconds) and greater than zero, the function waits
+ until the backend has exited. If it exits within the timeout, the
+ function returns <literal>true</literal>. On timeout, a warning is
+ emitted and <literal>false</literal> is returned.
+ </para>
+ <para>
+ This is useful for gracefully draining connections from a server, for
+ example during a switchover. Clients that do not support
+ the <literal>goaway</literal> <link linkend="protocol-extensions-table">protocol extension</link> will silently ignore
+ the request.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6db823808fc..0b408665a6f 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2248,6 +2248,14 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
equivalent to the latest protocol version supported by the libpq
version being used, which is currently <literal>3.2</literal>.
</para>
+
+ <para>
+ When using protocol version <literal>3.2</literal> or higher,
+ <application>libpq</application> requests all supported
+ <link linkend="protocol-extensions">protocol extensions</link>
+ from the server, such as <literal>goaway</literal>
+ which enables <xref linkend="libpq-PQgoAwayReceived"/>.
+ </para>
</listitem>
</varlistentry>
@@ -3004,6 +3012,50 @@ int PQserverVersion(const PGconn *conn);
</listitem>
</varlistentry>
+ <varlistentry id="libpq-PQgoAwayReceived">
+ <term><function>PQgoAwayReceived</function><indexterm><primary>PQgoAwayReceived</primary></indexterm></term>
+ <listitem>
+ <para>
+ Returns true if the server has sent a <literal>GoAway</literal> message,
+ requesting the client to disconnect when convenient.
+
+<synopsis>
+int PQgoAwayReceived(PGconn *conn);
+</synopsis>
+ </para>
+
+ <para>
+ The <literal>GoAway</literal> message is sent by the server during a
+ smart shutdown to politely request that clients disconnect. This is
+ advisory only - the connection remains fully functional and queries
+ can continue to be executed. Applications can choose to honor the request
+ by calling this function periodically and disconnect gracefully when
+ possible, such as after completing the current transaction.
+ </para>
+
+ <para>
+ This message is only sent to clients that request the
+ <literal>goaway</literal> protocol extension in the startup message,
+ which <application>libpq</application> does when using protocol version
+ 3.2 or higher (see <xref linkend="libpq-connect-max-protocol-version"/>).
+ The function returns 1 if the <literal>GoAway</literal> message was
+ received, 0 otherwise.
+ </para>
+
+ <para>
+ <function>PQgoAwayReceived</function> does not actually read data from the
+ server; it just reports whether a <literal>GoAway</literal> message was previously absorbed by another
+ <application>libpq</application> function. So normally you would first
+ call <xref linkend="libpq-PQconsumeInput"/>, then check
+ <function>PQgoAwayReceived</function>. You can use
+ <function>select()</function> to wait for data to arrive from the
+ server, thereby using no <acronym>CPU</acronym> power unless there is
+ something to do. (See <xref linkend="libpq-PQsocket"/> to obtain the file
+ descriptor number to use with <function>select()</function>.)
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="libpq-PQerrorMessage">
<term>
<function>PQerrorMessage</function><indexterm><primary>PQerrorMessage</primary></indexterm>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 49f81676712..d3b3d836893 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -346,8 +346,14 @@
<tbody>
<row>
- <entry namest="last" align="center" valign="middle">
- <emphasis>(No supported protocol extensions are currently defined.)</emphasis>
+ <entry><literal>_pq_.goaway</literal></entry>
+ <entry><literal>1</literal></entry>
+ <entry>PostgreSQL 19</entry>
+ <entry>
+ Enables the server to send
+ <link linkend="protocol-message-formats-GoAway">GoAway</link>
+ messages to request the client to disconnect when convenient.
+ See <xref linkend="protocol-async"/> for more details.
</entry>
</row>
</tbody>
@@ -710,6 +716,17 @@
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>GoAway</term>
+ <listitem>
+ <para>
+ The server politely requests that the client close the connection at the
+ client's earliest convenient moment. See <xref linkend="protocol-async"/> for
+ more details.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -1458,7 +1475,7 @@ SELCT 1/0;<!-- this typo is intentional -->
</para>
<para>
- It is possible for NoticeResponse and ParameterStatus messages to be
+ It is possible for NoticeResponse, ParameterStatus and GoAway messages to be
interspersed between CopyData messages; frontends must handle these cases,
and should be prepared for other asynchronous message types as well (see
<xref linkend="protocol-async"/>). Otherwise, any message type other than
@@ -1569,6 +1586,28 @@ SELCT 1/0;<!-- this typo is intentional -->
parameters that it does not understand or care about.
</para>
+ <para>
+ A GoAway message is sent by the server to politely request the client to
+ disconnect when convenient and reconnect when the client needs a connection
+ again. This is advisory only - the connection remains fully functional and
+ queries can continue to be executed. "When convenient" is very vaguely
+ defined on purpose because it depends on the client and application whether
+ such a moment even exists. So clients are allowed to completely ignore this
+ message and disconnect whenever they otherwise would have. An important
+ type of client that can actually honor the request to disconnect early is a
+ client that maintains a connection pool. Such a client can honor the
+ request by disconnecting a connection that has received a GoAway message
+ when it's not in use by a user of the pool. A server may send multiple
+ GoAway messages on the same connection, but GoAway messages after the
+ first have no effect on the client's behavior.
+ The GoAway message is currently sent by Postgres during the
+ "smart" shutdown procedure (i.e. when postmaster receives
+ <systemitem>SIGTERM</systemitem>). The server only sends the GoAway
+ message to clients that request the <literal>goaway</literal> protocol
+ extension by setting <literal>_pq_.goaway</literal> to <literal>1</literal>
+ in the StartupMessage.
+ </para>
+
<para>
If a frontend issues a <command>LISTEN</command> command, then the
backend will send a NotificationResponse message (not to be
@@ -5359,6 +5398,31 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>
+ <varlistentry id="protocol-message-formats-GoAway">
+ <term>GoAway (B)</term>
+ <listitem>
+ <variablelist>
+ <varlistentry>
+ <term>Byte1('g')</term>
+ <listitem>
+ <para>
+ Identifies the message as a polite request for the client to disconnect.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int32(4)</term>
+ <listitem>
+ <para>
+ Length of message contents in bytes, including self.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="protocol-message-formats-GSSENCRequest">
<term>GSSENCRequest (F)</term>
<listitem>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..b79ac383503 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2134,7 +2134,33 @@ process_pm_shutdown_request(void)
* later state, do not change it.
*/
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
+ {
+ dlist_iter iter;
+
connsAllowed = false;
+
+ /*
+ * Signal all backends to send a GoAway message to their
+ * clients, to politely request that they disconnect.
+ */
+ dlist_foreach(iter, &ActiveChildList)
+ {
+ PMChild *bp = dlist_container(PMChild, elem, iter.cur);
+
+ /*
+ * Only signal regular backends, since those are the ones
+ * that need to notify their clients using a GoAway
+ * message. Follow the same pattern as SignalChildren to
+ * correctly distinguish backends from WAL senders.
+ */
+ if (bp->bkend_type == B_BACKEND &&
+ !IsPostmasterChildWalSender(bp->child_slot))
+ {
+ SendProcSignal(bp->pid, PROCSIG_GOAWAY,
+ INVALID_PROC_NUMBER);
+ }
+ }
+ }
else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
{
/* There should be no clients, so proceed to stop children */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..d1c9e2c44fd 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -697,6 +697,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
+ if (CheckProcSignal(PROCSIG_GOAWAY))
+ HandleGoAwayInterrupt();
+
if (CheckProcSignal(PROCSIG_PARALLEL_APPLY_MESSAGE))
HandleParallelApplyMessageInterrupt();
diff --git a/src/backend/storage/ipc/signalfuncs.c b/src/backend/storage/ipc/signalfuncs.c
index 800b699de21..4bd31c36a40 100644
--- a/src/backend/storage/ipc/signalfuncs.c
+++ b/src/backend/storage/ipc/signalfuncs.c
@@ -23,15 +23,17 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/procsignal.h"
#include "utils/acl.h"
#include "utils/fmgrprotos.h"
#include "utils/wait_event.h"
/*
- * Send a signal to another backend.
+ * Check whether the current user has permission to signal the backend with
+ * the given PID.
*
- * The signal is delivered if the user is either a superuser or the same
+ * Sending a signal is allowed if the user is either a superuser or the same
* role as the backend being signaled. For "dangerous" signals, an explicit
* check for superuser needs to be done prior to calling this function.
*
@@ -42,6 +44,10 @@
* In the event of a general failure (return code 1), a warning message will
* be emitted. For permission errors, doing that is the responsibility of
* the caller.
+ *
+ * The caller can optionally obtain a pointer to the backend's PGPROC struct
+ * using the proc_return argument. If the caller doesn't need that, it can
+ * pass NULL.
*/
#define SIGNAL_BACKEND_SUCCESS 0
#define SIGNAL_BACKEND_ERROR 1
@@ -49,17 +55,22 @@
#define SIGNAL_BACKEND_NOSUPERUSER 3
#define SIGNAL_BACKEND_NOAUTOVAC 4
static int
-pg_signal_backend(int pid, int sig)
+pg_check_signal_backend(int pid, PGPROC **proc_return)
{
+
PGPROC *proc = BackendPidGetProc(pid);
+ if (proc_return != NULL)
+ *proc_return = proc;
+
/*
* BackendPidGetProc returns NULL if the pid isn't valid; but by the time
- * we reach kill(), a process for which we get a valid proc here might
- * have terminated on its own. There's no way to acquire a lock on an
- * arbitrary process to prevent that. But since so far all the callers of
- * this mechanism involve some request for ending the process anyway, that
- * it might end on its own first is not a problem.
+ * the caller actually tries to send a signal, a process for which we get
+ * a valid proc here might have terminated on its own. There's no way to
+ * acquire a lock on an arbitrary process to prevent that. But since so
+ * far all the callers of this mechanism involve some request for ending
+ * the process anyway, that it might end on its own first is not a
+ * problem.
*
* Note that proc will also be NULL if the pid refers to an auxiliary
* process or the postmaster (neither of which can be signaled via
@@ -100,6 +111,22 @@ pg_signal_backend(int pid, int sig)
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
return SIGNAL_BACKEND_NOPERMISSION;
+ return SIGNAL_BACKEND_SUCCESS;
+}
+
+/*
+ * Send a signal to another backend, after checking permissions.
+ *
+ * See pg_check_signal_backend for return codes.
+ */
+static int
+pg_signal_backend(int pid, int sig)
+{
+ int r = pg_check_signal_backend(pid, NULL);
+
+ if (r != SIGNAL_BACKEND_SUCCESS)
+ return r;
+
/*
* Can the process we just validated above end, followed by the pid being
* recycled for a new process, before reaching here? Then we'd be trying
@@ -276,6 +303,82 @@ pg_terminate_backend(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
}
+/*
+ * Request a backend to send a GoAway message to its client, politely asking
+ * it to disconnect when convenient. The connection remains fully functional
+ * after receiving GoAway.
+ *
+ * If timeout is 0, returns true as soon as the signal is sent successfully.
+ * If timeout is nonzero, waits up to that many milliseconds for the backend
+ * to exit (i.e. for the client to honor the GoAway and disconnect). Returns
+ * true if the backend exits within the timeout, false (with a warning) if not.
+ *
+ * Permission checking follows the same rules as pg_cancel_backend.
+ */
+Datum
+pg_goaway_backend(PG_FUNCTION_ARGS)
+{
+ int pid;
+ int r;
+ int timeout; /* milliseconds */
+ PGPROC *proc;
+
+ pid = PG_GETARG_INT32(0);
+ timeout = PG_GETARG_INT64(1);
+
+ if (timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"timeout\" must not be negative")));
+
+ r = pg_check_signal_backend(pid, &proc);
+
+ if (r == SIGNAL_BACKEND_NOSUPERUSER)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to send GoAway"),
+ errdetail("Only roles with the %s attribute may send GoAway to backends of roles with the %s attribute.",
+ "SUPERUSER", "SUPERUSER")));
+
+ if (r == SIGNAL_BACKEND_NOAUTOVAC)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to send GoAway"),
+ errdetail("Only roles with privileges of the \"%s\" role may send GoAway to autovacuum workers.",
+ "pg_signal_autovacuum_worker")));
+
+ if (r == SIGNAL_BACKEND_NOPERMISSION)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to send GoAway"),
+ errdetail("Only roles with privileges of the role whose backend is being signaled or with privileges of the \"%s\" role may send GoAway to this backend.",
+ "pg_signal_backend")));
+
+ if (r != SIGNAL_BACKEND_SUCCESS)
+ PG_RETURN_BOOL(false);
+
+ if (proc->backendType != B_BACKEND)
+ {
+ ereport(WARNING,
+ (errmsg("PID %d is not a regular backend process", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ if (SendProcSignal(pid, PROCSIG_GOAWAY,
+ INVALID_PROC_NUMBER) != 0)
+ {
+ ereport(WARNING,
+ (errmsg("could not send GoAway signal to process %d: %m", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ /* Wait only if actually requested */
+ if (timeout > 0)
+ PG_RETURN_BOOL(pg_wait_until_termination(pid, timeout));
+
+ PG_RETURN_BOOL(true);
+}
+
/*
* Signal to reload the database configuration
*
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index 5abf276c898..1bbd5deb448 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -779,11 +779,24 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
{
/*
* Any option beginning with _pq_. is reserved for use as a
- * protocol-level option, but at present no such options are
- * defined.
+ * protocol-level option.
*/
- unrecognized_protocol_options =
- lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ if (strcmp(nameptr, "_pq_.goaway") == 0)
+ {
+ /* Client wants to receive GoAway messages. */
+ if (strcmp(valptr, "1") != 0)
+ ereport(FATAL,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid value for protocol option \"%s\": \"%s\"",
+ nameptr, valptr),
+ errhint("Valid values are: \"1\".")));
+ port->goaway_negotiated = true;
+ }
+ else
+ {
+ unrecognized_protocol_options =
+ lappend(unrecognized_protocol_options, pstrdup(nameptr));
+ }
}
else
{
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..c1df84cc547 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -42,8 +42,10 @@
#include "common/pg_prng.h"
#include "jit/jit.h"
#include "libpq/libpq.h"
+#include "libpq/pqcomm.h"
#include "libpq/pqformat.h"
#include "libpq/pqsignal.h"
+#include "libpq/protocol.h"
#include "mb/pg_wchar.h"
#include "mb/stringinfo_mb.h"
#include "miscadmin.h"
@@ -93,6 +95,14 @@ const char *debug_query_string; /* client-supplied query string */
/* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
CommandDest whereToSendOutput = DestDebug;
+/*
+ * Track whether we've been asked to send GoAway and whether we've sent it.
+ * GoAwayPending is set by the PROCSIG_GOAWAY signal handler.
+ * GoAwaySent tracks whether we've already sent the GoAway message.
+ */
+static volatile sig_atomic_t GoAwayPending = false;
+static bool GoAwaySent = false;
+
/* flag for logging end of session */
bool Log_disconnections = false;
@@ -508,6 +518,20 @@ ProcessClientReadInterrupt(bool blocked)
/* Check for general interrupts that arrived before/while reading */
CHECK_FOR_INTERRUPTS();
+ /* Send GoAway message if requested */
+ if (GoAwayPending && !GoAwaySent &&
+ whereToSendOutput == DestRemote &&
+ MyProcPort && MyProcPort->goaway_negotiated)
+ {
+ StringInfoData buf;
+
+ pq_beginmessage(&buf, PqMsg_GoAway);
+ pq_endmessage(&buf);
+ pq_flush();
+
+ GoAwaySent = true;
+ }
+
/* Process sinval catchup interrupts, if any */
if (catchupInterruptPending)
ProcessCatchupInterrupt();
@@ -3067,6 +3091,18 @@ FloatExceptionHandler(SIGNAL_ARGS)
"invalid operation, such as division by zero.")));
}
+/*
+ * Tell the next ProcessClientReadInterrupt() call to send a GoAway message to
+ * the client. Runs in a SIGUSR1 handler.
+ */
+void
+HandleGoAwayInterrupt(void)
+{
+ GoAwayPending = true;
+ InterruptPending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Tell the next CHECK_FOR_INTERRUPTS() to process recovery conflicts. Runs
* in a SIGUSR1 handler.
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2eadd391a9c..4e99ed69841 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -739,6 +739,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout,
static void
PrintNotifications(void)
{
+ static bool goaway_reported = false;
PGnotify *notify;
PQconsumeInput(pset.db);
@@ -755,6 +756,12 @@ PrintNotifications(void)
PQfreemem(notify);
PQconsumeInput(pset.db);
}
+
+ if (!goaway_reported && PQgoAwayReceived(pset.db))
+ {
+ pg_log_info("Server sent GoAway, requesting disconnect when convenient.");
+ goaway_reported = true;
+ }
}
diff --git a/src/bin/psql/meson.build b/src/bin/psql/meson.build
index 922b2845267..047b40cc40e 100644
--- a/src/bin/psql/meson.build
+++ b/src/bin/psql/meson.build
@@ -78,6 +78,7 @@ tests += {
't/010_tab_completion.pl',
't/020_cancel.pl',
't/030_pager.pl',
+ 't/040_goaway.pl',
],
},
}
diff --git a/src/bin/psql/t/040_goaway.pl b/src/bin/psql/t/040_goaway.pl
new file mode 100644
index 00000000000..a53653308bf
--- /dev/null
+++ b/src/bin/psql/t/040_goaway.pl
@@ -0,0 +1,46 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+my $psql = $node->background_psql('postgres');
+
+# Confirm connection works
+my $result = $psql->query_safe("SELECT 'before_goaway'");
+like($result, qr/before_goaway/, 'connection works before smart shutdown');
+
+# Initiate smart shutdown without waiting for it to complete
+$node->command_ok(
+ [ 'pg_ctl', 'stop', '-D', $node->data_dir, '-m', 'smart', '--no-wait' ],
+ 'pg_ctl smart shutdown');
+
+# The backend sends GoAway once it processes the smart shutdown signal.
+# Poll with queries until psql reports it.
+my $saw_goaway = 0;
+for (my $i = 0; $i < 100; $i++)
+{
+ my $out = $psql->query("SELECT 'after_goaway'");
+ if ($psql->{stderr} =~ /Server sent GoAway, requesting disconnect when convenient/)
+ {
+ $saw_goaway = 1;
+ # The query should still have succeeded
+ like($out, qr/after_goaway/, 'query still works after GoAway');
+ last;
+ }
+ usleep(50_000);
+}
+ok($saw_goaway, 'psql reported GoAway notice during smart shutdown');
+
+$psql->quit;
+
+done_testing();
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 84e7adde0e5..e8584a3e458 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6765,6 +6765,11 @@
proargtypes => 'int4 int8', proargnames => '{pid,timeout}',
proargdefaults => '{0}',
prosrc => 'pg_terminate_backend' },
+{ oid => '9018', descr => 'request a server process to send GoAway to its client',
+ proname => 'pg_goaway_backend', provolatile => 'v', prorettype => 'bool',
+ proargtypes => 'int4 int8', proargnames => '{pid,timeout}',
+ proargdefaults => '{0}',
+ prosrc => 'pg_goaway_backend' },
{ oid => '2172', descr => 'prepare for taking an online backup',
proname => 'pg_backup_start', provolatile => 'v', proparallel => 'r',
prorettype => 'pg_lsn', proargtypes => 'text bool',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 921b2daa4ff..b8769412c72 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -202,6 +202,11 @@ typedef struct Port
void *gss;
#endif
+ /*
+ * Protocol extensions.
+ */
+ bool goaway_negotiated; /* client supports GoAway message */
+
/*
* SSL structures.
*/
diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index eae8f0e7238..fd3c195daa2 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -53,6 +53,7 @@
#define PqMsg_FunctionCallResponse 'V'
#define PqMsg_CopyBothResponse 'W'
#define PqMsg_ReadyForQuery 'Z'
+#define PqMsg_GoAway 'g'
#define PqMsg_NoData 'n'
#define PqMsg_PortalSuspended 's'
#define PqMsg_ParameterDescription 't'
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..ecbc62b7fd7 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -39,9 +39,10 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
* PGPROC->pendingRecoveryConflicts for the
* reason */
+ PROCSIG_GOAWAY, /* request backend to send GoAway to client */
} ProcSignalReason;
-#define NUM_PROCSIGNALS (PROCSIG_RECOVERY_CONFLICT + 1)
+#define NUM_PROCSIGNALS (PROCSIG_GOAWAY + 1)
typedef enum
{
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 5bc5bcfb20d..a36547a11f7 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -74,6 +74,7 @@ extern void die(SIGNAL_ARGS);
pg_noreturn extern void quickdie(SIGNAL_ARGS);
extern void StatementCancelHandler(SIGNAL_ARGS);
pg_noreturn extern void FloatExceptionHandler(SIGNAL_ARGS);
+extern void HandleGoAwayInterrupt(void);
extern void HandleRecoveryConflictInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1e3d5bd5867..f57aae46b65 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -211,3 +211,4 @@ PQdefaultAuthDataHook 208
PQfullProtocolVersion 209
appendPQExpBufferVA 210
PQgetThreadLock 211
+PQgoAwayReceived 212
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index db9b4c8edbf..4d81d5f03bb 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -701,6 +701,7 @@ pqDropServerData(PGconn *conn)
conn->password_needed = false;
conn->gssapi_used = false;
conn->write_failed = false;
+ conn->goaway_received = false;
free(conn->write_err_msg);
conn->write_err_msg = NULL;
conn->oauth_want_retry = false;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 203d388bdbf..bc1d95324ed 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2702,6 +2702,33 @@ PQnotifies(PGconn *conn)
return event;
}
+/*
+ * PQgoAwayReceived
+ * returns 1 if a GoAway message has been received from the server
+ * returns 0 if not
+ *
+ * Note that this function does not read any new data from the socket;
+ * caller should call PQconsumeInput() first if they want to ensure
+ * all available data has been read.
+ */
+int
+PQgoAwayReceived(PGconn *conn)
+{
+ if (!conn)
+ return 0;
+
+ if (conn->goaway_received)
+ return 1;
+
+ /*
+ * Parse any available data to see if a GoAway message has arrived.
+ */
+ parseInput(conn);
+
+ return conn->goaway_received ? 1 : 0;
+}
+
+
/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 8c1fda5caf0..d09002a22ac 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -134,8 +134,8 @@ pqParseInput3(PGconn *conn)
}
/*
- * NOTIFY and NOTICE messages can happen in any state; always process
- * them right away.
+ * NOTIFY, NOTICE and GoAway messages can happen in any state;
+ * always process them right away.
*
* Most other messages should only be processed while in BUSY state.
* (In particular, in READY state we hold off further parsing until
@@ -159,6 +159,10 @@ pqParseInput3(PGconn *conn)
if (pqGetErrorNotice3(conn, false))
return;
}
+ else if (id == PqMsg_GoAway)
+ {
+ conn->goaway_received = true;
+ }
else if (conn->asyncStatus != PGASYNC_BUSY)
{
/* If not IDLE state, just wait ... */
@@ -1446,6 +1450,7 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
int num;
bool found_test_protocol_negotiation;
bool expect_test_protocol_negotiation;
+ bool requested_goaway = conn->pversion >= PG_PROTOCOL(3, 2);
/*
* During 19beta only, if protocol grease is in use, assume that it's the
@@ -1521,8 +1526,7 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
conn->pversion = their_version;
/*
- * Check that all expected unsupported parameters are reported by the
- * server.
+ * Process the list of unsupported protocol extensions.
*/
found_test_protocol_negotiation = false;
expect_test_protocol_negotiation = (conn->max_pversion == PG_PROTOCOL_GREASE);
@@ -1539,12 +1543,23 @@ pqGetNegotiateProtocolVersion3(PGconn *conn)
goto failure;
}
- /* Check if this is the expected test parameter */
+ /*
+ * Handle protocol extensions that we requested. Extensions that were
+ * not requested by us are an error.
+ */
if (expect_test_protocol_negotiation &&
strcmp(conn->workBuffer.data, "_pq_.test_protocol_negotiation") == 0)
{
found_test_protocol_negotiation = true;
}
+ else if (requested_goaway && strcmp(conn->workBuffer.data, "_pq_.goaway") == 0)
+ {
+ /*
+ * Server doesn't support GoAway, that's fine. It will simply never
+ * send the message.
+ */
+ continue;
+ }
else
{
libpq_append_conn_error(conn, "received invalid protocol negotiation message: server reported an unsupported parameter that was not requested (\"%s\")",
@@ -1905,6 +1920,9 @@ getCopyDataMessage(PGconn *conn)
if (getParameterStatus(conn))
return 0;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
case PqMsg_CopyData:
return msgLength;
case PqMsg_CopyDone:
@@ -2398,6 +2416,9 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
if (getParameterStatus(conn))
continue;
break;
+ case PqMsg_GoAway:
+ conn->goaway_received = true;
+ break;
default:
/* The backend violates the protocol. */
libpq_append_conn_error(conn, "protocol error: id=0x%x", id);
@@ -2531,6 +2552,14 @@ build_startup_packet(const PGconn *conn, char *packet,
}
}
+ /*
+ * Request protocol extensions. Only do this for protocol version 3.2 and
+ * later, to avoid confusing old proxies that don't understand _pq_.*
+ * options.
+ */
+ if (conn->pversion >= PG_PROTOCOL(3, 2))
+ ADD_STARTUP_OPTION("_pq_.goaway", "1");
+
/* Add trailing terminator */
if (packet)
packet[packet_len] = '\0';
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index f06e7a972c3..d1a6dd54407 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,6 +68,8 @@ extern "C"
#define LIBPQ_HAS_GET_THREAD_LOCK 1
/* Indicates presence of the PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 authdata hook */
#define LIBPQ_HAS_OAUTH_BEARER_TOKEN_V2 1
+/* Indicates presence of PQgoAwayReceived */
+#define LIBPQ_HAS_GOAWAY 1
/*
* Option flags for PQcopyResult
@@ -419,6 +421,7 @@ extern const char *PQparameterStatus(const PGconn *conn,
extern int PQprotocolVersion(const PGconn *conn);
extern int PQfullProtocolVersion(const PGconn *conn);
extern int PQserverVersion(const PGconn *conn);
+extern int PQgoAwayReceived(PGconn *conn);
extern char *PQerrorMessage(const PGconn *conn);
extern int PQsocket(const PGconn *conn);
extern int PQbackendPID(const PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index bd7eb59f5f8..2018927d0cb 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -511,6 +511,7 @@ struct pg_conn
bool sigpipe_flag; /* can we mask SIGPIPE via MSG_NOSIGNAL? */
bool write_failed; /* have we had a write failure on sock? */
char *write_err_msg; /* write error message, or NULL if OOM */
+ bool goaway_received; /* true if server sent GoAway message */
bool auth_required; /* require an authentication challenge from
* the server? */
diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index aa0a6bbe762..f2b58d53392 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -2101,6 +2101,76 @@ process_result(PGconn *conn, PGresult *res, int results, int numsent)
}
+static void
+test_goaway(PGconn *conn)
+{
+ PGconn *otherConn;
+ PGresult *res;
+ int pid;
+ char pid_str[32];
+ int i;
+
+ fprintf(stderr, "test goaway... ");
+
+ otherConn = copy_connection(conn);
+ Assert(PQstatus(otherConn) == CONNECTION_OK);
+
+ if (!PQconsumeInput(conn))
+ pg_fatal("PQconsumeInput failed: %s", PQerrorMessage(conn));
+
+ pid = PQbackendPID(conn);
+ snprintf(pid_str, sizeof(pid_str), "%d", pid);
+
+ /* Verify GoAway has not been received yet */
+ if (PQgoAwayReceived(conn))
+ pg_fatal("GoAway received before sending request");
+
+ /* Ask the target backend to send GoAway */
+ {
+ const char *paramValues[1] = {pid_str};
+
+ res = PQexecParams(otherConn,
+ "SELECT pg_goaway_backend($1)",
+ 1, NULL, paramValues,
+ NULL, NULL, 0);
+ }
+ if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ pg_fatal("pg_goaway_backend failed: %s", PQerrorMessage(otherConn));
+ if (strcmp(PQgetvalue(res, 0, 0), "t") != 0)
+ pg_fatal("pg_goaway_backend returned false");
+ PQclear(res);
+
+ /*
+ * The GoAway signal is delivered asynchronously. Poll PQgoAwayReceived on
+ * the target connection until it returns true.
+ */
+ for (i = 0; i < 100; i++)
+ {
+ if (!PQconsumeInput(conn))
+ pg_fatal("PQconsumeInput failed: %s", PQerrorMessage(conn));
+
+ if (PQgoAwayReceived(conn))
+ break;
+
+ pg_usleep(50000); /* 50ms */
+ }
+
+ if (!PQgoAwayReceived(conn))
+ pg_fatal("GoAway not received after sending pg_goaway_backend");
+
+ /* Verify the connection still works after GoAway */
+ res = PQexec(conn, "SELECT 'still_alive'");
+ if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ pg_fatal("query failed after GoAway: %s", PQerrorMessage(conn));
+ if (strcmp(PQgetvalue(res, 0, 0), "still_alive") != 0)
+ pg_fatal("unexpected query result after GoAway");
+ PQclear(res);
+
+ PQfinish(otherConn);
+
+ fprintf(stderr, "ok\n");
+}
+
static void
usage(const char *progname)
{
@@ -2118,6 +2188,7 @@ print_test_list(void)
{
printf("cancel\n");
printf("disallowed_in_pipeline\n");
+ printf("goaway\n");
printf("multi_pipelines\n");
printf("nosync\n");
printf("pipeline_abort\n");
@@ -2225,6 +2296,8 @@ main(int argc, char **argv)
test_cancel(conn);
else if (strcmp(testname, "disallowed_in_pipeline") == 0)
test_disallowed_in_pipeline(conn);
+ else if (strcmp(testname, "goaway") == 0)
+ test_goaway(conn);
else if (strcmp(testname, "multi_pipelines") == 0)
test_multi_pipelines(conn);
else if (strcmp(testname, "nosync") == 0)
base-commit: 322bab79744dfb8f7ddb5191b3102cf7986d14a0
--
2.53.0
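
For readers who want to see the wire format from the protocol.sgml hunk
in isolation, here is a minimal sketch. It is not part of the patch and
does not use libpq. It assumes only what the patch documents: GoAway is
Byte1('g') followed by Int32(4), and like every backend message the length
field counts itself but not the type byte. The names read_be32,
scan_for_goaway, PooledConn and should_retire are hypothetical helpers
invented for this illustration; the pool rule (retire a connection once it
has seen GoAway and is idle) is one possible reading of "when convenient",
not something the patch mandates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Message type byte proposed by the patch (PqMsg_GoAway). */
#define MSG_GOAWAY 'g'

/* Read a 32-bit integer in network (big-endian) byte order. */
static uint32_t
read_be32(const unsigned char *p)
{
	return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
		((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/*
 * Hypothetical helper: walk the complete backend messages in buf and
 * report whether any of them is a GoAway.  Scanning stops at the first
 * incomplete or malformed message.  If consumed is not NULL, it receives
 * the number of bytes that belonged to complete messages.
 */
static bool
scan_for_goaway(const unsigned char *buf, size_t len, size_t *consumed)
{
	size_t		pos = 0;
	bool		seen = false;

	assert(buf != NULL || len == 0);

	/* A message needs at least a type byte plus a 4-byte length. */
	while (len - pos >= 5)
	{
		uint32_t	msglen = read_be32(buf + pos + 1);

		if (msglen < 4)
			break;				/* malformed: length must include itself */
		if ((size_t) msglen > len - pos - 1)
			break;				/* incomplete: wait for more data */

		/* GoAway carries no payload, so its length is exactly 4. */
		if (buf[pos] == MSG_GOAWAY && msglen == 4)
			seen = true;

		pos += 1 + (size_t) msglen;
	}

	if (consumed != NULL)
		*consumed = pos;
	return seen;
}

/* Hypothetical pool bookkeeping for a single connection. */
typedef struct PooledConn
{
	bool		in_use;				/* checked out by a pool user? */
	bool		goaway_received;	/* server asked us to go away? */
} PooledConn;

/* Retire a connection only once it is idle, so no user is interrupted. */
static bool
should_retire(const PooledConn *c)
{
	return c->goaway_received && !c->in_use;
}
```

A real libpq-based pool would not parse the stream itself; with the patch
applied it would call PQconsumeInput() and then PQgoAwayReceived(), and
apply a rule like should_retire() when a connection is returned to the pool.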
end of thread, other threads:[~2026-03-31 07:12 UTC | newest]
Thread overview: 15+ messages
2025-10-23 13:04 Add GoAway protocol message for graceful but fast server shutdown/switchover Jelte Fennema-Nio <[email protected]>
2025-10-24 05:04 ` Kirill Reshke <[email protected]>
2025-10-24 11:54 ` Jelte Fennema-Nio <[email protected]>
2025-12-01 05:05 ` Ajit Awekar <[email protected]>
2026-02-10 21:59 ` Jelte Fennema-Nio <[email protected]>
2026-02-25 15:08 ` Zsolt Parragi <[email protected]>
2026-03-13 09:39 ` Jelte Fennema-Nio <[email protected]>
2026-03-20 19:20 ` Tomas Vondra <[email protected]>
2026-03-20 20:11 ` Jacob Champion <[email protected]>
2026-03-24 14:29 ` Jelte Fennema-Nio <[email protected]>
2026-03-24 22:21 ` Zsolt Parragi <[email protected]>
2026-03-30 16:23 ` Jelte Fennema-Nio <[email protected]>
2026-03-31 07:12 ` Jelte Fennema-Nio <[email protected]>
2026-03-23 22:34 ` Jim Nasby <[email protected]>
2026-03-24 14:27 ` Jelte Fennema-Nio <[email protected]>