Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w6vGl-004miq-0R for pgsql-hackers@arkaria.postgresql.org; Sun, 29 Mar 2026 18:53:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w6vGi-00H3Ie-17 for pgsql-hackers@arkaria.postgresql.org; Sun, 29 Mar 2026 18:53:52 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w6vGi-00H3IW-0C for pgsql-hackers@lists.postgresql.org; Sun, 29 Mar 2026 18:53:52 +0000 Received: from udcm-wwu2.uni-muenster.de ([128.176.118.28]) by makus.postgresql.org with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1w6vGf-00000001gTy-4C7w for pgsql-hackers@lists.postgresql.org; Sun, 29 Mar 2026 18:53:51 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=uni-muenster.de; i=@uni-muenster.de; q=dns/txt; s=uniout; t=1774810430; x=1806346430; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=hPSYQUnxKfhLzMuQxKRboVQr/ua/hD/w/PM/wv/5jJE=; b=G5juvm+C6Fah+5woUUB1RQv+YNlHpSXdb15p+g16SC/gmzOPdoCH+TDn 7VYhqSr64MX6ja9c81OuhYcNJ0sqZBp7dscQdhL2TvS/Qf766do0BXAmt YO8qceByYYctuZ6LAhvY1NIpnVPct96QWUiIciRG0ocXTdOjPrN5MtpjH 8A9mzRlPkgVV2nTglMV9kps6sbLd02itjO81R4Gy98jw1OiLZYPmZZ7dP v7xBG4GI/1YIgyXZ960iJ6zjVIP9f7OWYviHEzXuIfBW3xvVIYaPkPSuh RdUD9mDYJ1FPRDe/oygiOVdw8Yo+4kODB4zk+CkHQfEi0e7uLvS79ToG8 g==; X-CSE-ConnectionGUID: 0xshXWoMQBG0RKl2A1bVcA== X-CSE-MsgGUID: ycxhrjNYRE2Jwp6UQFjXig== X-IronPort-AV: E=Sophos;i="6.23,148,1770591600"; d="scan'208";a="389124832" Received: from secmail.uni-muenster.de ([128.176.118.4]) by UDCM-RELAY2.UNI-MUENSTER.DE with ESMTP; 29 Mar 2026 20:53:48 +0200 Received: from [192.168.178.27] (dynamic-078-048-064-065.78.48.pool.telefonica.de [78.48.64.65]) by SECMAIL.UNI-MUENSTER.DE (Postfix) with ESMTPSA id AABA620ADF00; Sun, 29 Mar 2026 20:53:47 +0200 (CEST) Message-ID: Date: Sun, 29 Mar 2026 20:53:46 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Add max_wal_replay_size connection parameter to libpq To: SATYANARAYANA NARLAPURAM Cc: PostgreSQL Hackers References: <126eb1e4-d98e-4647-b629-517adbcad28e@uni-muenster.de> Content-Language: de-DE, en-GB From: Jim Jones In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 29/03/2026 20:31, SATYANARAYANA NARLAPURAM wrote: > What if none of them meets the criteria? You fail the connection? > Wouldn't it cause an availability issue? Yes, the connection fails if no host meets the threshold. This is intentional, and it is consistent with the existing behaviour of target_session_attrs: if you set target_session_attrs=standby and no standby is reachable, the connection fails too. > If pg_last_wal_receive_lsn() is NULL (e.g. no active WAL receiver due to > missing primary_conninfo or a disconnected upstream), the backlog cannot > be determined. In that case, the standby is treated as exceeding the > threshold and is skipped. > > > When a standby is replaying archiving log, it can still be caught up. > This doesn't seem right to me. I totally see your point here. The issue is that pg_last_wal_receive_lsn() returns NULL when there is no WAL receiver process -- regardless of how current the data actually is. Without a receive LSN, the metric this parameter is based on (receive_lsn - replay_lsn) is simply undefined for that standby. Please let me know if I am missing something here. > > This parameter measures only the apply lag on the standby itself, i.e., > how much already-received WAL remains to be replayed. It does not > attempt to measure how far the standby is behind the primary. In > particular, a standby that is slow to receive WAL but fast to replay it > may report a small backlog here while still being significantly behind. > > > IMHO, this change appears to not meet the objective of routing > connections/queries to the most up-to-date standby. The parameter's objective is not to route to the most up-to-date standby; it is to skip standbys whose apply lag exceeds a given threshold. Thanks for the quick feedback. Much appreciated! Best, Jim