Date: Tue, 06 Jan
 2026 22:01:23 +0900 (JST)
Message-Id: <20260106.220123.1297671994270546143.ishii@postgresql.org>
To: pgpool-hackers@lists.postgresql.org
Subject: Re: Feature: reduce sync messages
From: Tatsuo Ishii
In-Reply-To: <20251225.162600.1040566325464308671.ishii@postgresql.org>
References: <20251225.162600.1040566325464308671.ishii@postgresql.org>

> Currently pgpool forwards sync messages to all configured backend
> nodes regardless of backend_weight or load_balance_mode. This is not
> only a waste of CPU cycles, but also degrades performance, since the
> total message round trip time grows as the number of backend nodes
> increases. This is conspicuous if backend nodes are in a distant
> location. We should send sync messages only to necessary backend
> nodes.
>
> I have created a patch to improve this: send sync messages only to
> necessary backends.
>
> Suppose there is a pgpool cluster with 3 backend nodes. Before patch:
>
> test=# select 1 \bind \g
> NOTICE:  DB node id: 0 statement: Parse: select 1
> NOTICE:  DB node id: 0 statement: Bind: select 1
> NOTICE:  DB node id: 0 statement: D message
> NOTICE:  DB node id: 0 statement: Execute: select 1
> NOTICE:  DB node id: 0 statement: Sync
> NOTICE:  DB node id: 1 statement: Sync
> NOTICE:  DB node id: 2 statement: Sync
>  ?column?
> ----------
>         1
> (1 row)
>
> As you can see, the sync messages are sent to backend nodes 0, 1 and
> 2, although only node 0 is involved in the query "select 1". So
> sending sync messages to nodes 1 and 2 is just a waste of time.
>
> After patch:
>
> test=# select 1 \bind \g
> NOTICE:  DB node id: 2 statement: Parse: select 1
> NOTICE:  DB node id: 2 statement: Bind: select 1
> NOTICE:  DB node id: 2 statement: D message
> NOTICE:  DB node id: 2 statement: Execute: select 1
> NOTICE:  DB node id: 2 statement: Sync
>  ?column?
> ----------
>         1
> (1 row)
>
> The sync message is only sent to node 2.
>
> Now the implementation details.
>
> The idea is, if pgpool has not sent any query message to a backend
> since the last ReadyForQuery message, we don't need to send a sync
> message to that backend node.
>
> To decide which nodes need the sync message, add a new struct member
> "sync_map" to the session context. A sync_map is an array of bool.
> Each array member corresponds to one backend node. When pgpool
> forwards a message to a backend, the corresponding sync_map member is
> set to true.
>
> When a sync message is received from the frontend, a sync pending
> message is added as we already do. The difference is, previously no
> query context was added to the sync pending message. Now we add a
> query context with the sync map translated into the where_to_send map
> in the query context. Then we send a sync message to the backends,
> consulting the where_to_send map. This way, we don't need to send a
> sync message to a backend node to which pgpool has not sent a query
> since the last ReadyForQuery message.
>
> When a ReadyForQuery message arrives, we decide which backends to
> read from according to the where_to_send map, which is different from
> what we do today: read from all available backend nodes. After
> receiving the ReadyForQuery message, sync_map members are all set to
> false. The sync message query context is also destroyed at this
> point.
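
The sync_map bookkeeping described above can be sketched roughly as
follows. This is only an illustration under assumptions, not code from
the patch: apart from the names sync_map and where_to_send, every
identifier here (NUM_BACKENDS, SessionSketch, QuerySketch and the three
functions) is hypothetical, and the real session and query contexts
carry far more state. The fallback for an all-false map follows edge
case (2) further down in this message.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_BACKENDS 3			/* hypothetical: the example cluster has 3 nodes */

/* Hypothetical stand-in for the session context; only sync_map is from the proposal. */
typedef struct
{
	/* true: a message was forwarded to this node since the last ReadyForQuery */
	bool		sync_map[NUM_BACKENDS];
} SessionSketch;

/* Hypothetical stand-in for the query context attached to the sync pending message. */
typedef struct
{
	bool		where_to_send[NUM_BACKENDS];
} QuerySketch;

/* Record that a message was forwarded to backend node_id. */
void
mark_forwarded(SessionSketch *s, int node_id)
{
	if (node_id >= 0 && node_id < NUM_BACKENDS)
		s->sync_map[node_id] = true;
}

/*
 * On Sync from the frontend: translate sync_map into where_to_send.
 * Returns the number of backends that will receive the Sync.
 */
int
build_sync_targets(const SessionSketch *s, QuerySketch *q)
{
	int			i;
	int			n = 0;

	for (i = 0; i < NUM_BACKENDS; i++)
	{
		q->where_to_send[i] = s->sync_map[i];
		if (q->where_to_send[i])
			n++;
	}

	if (n == 0)
	{
		/* Nothing was sent since the last ReadyForQuery: target all nodes. */
		for (i = 0; i < NUM_BACKENDS; i++)
			q->where_to_send[i] = true;
		n = NUM_BACKENDS;
	}
	return n;
}

/* After ReadyForQuery has been read: clear the map for the next cycle. */
void
reset_sync_map(SessionSketch *s)
{
	memset(s->sync_map, 0, sizeof(s->sync_map));
}
```

In this sketch, the "After patch" example corresponds to
mark_forwarded() being called for node 2 only, so that
build_sync_targets() selects just that node.
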
>
> There are a few other edge cases:
>
> (1) When an ErrorResponse is received,
> pool_discard_except_sync_and_ready_for_query() is called to remove
> any pending messages (and backend buffer data) except the sync
> pending message and the ReadyForQuery message. If no sync pending
> message is found in the queue and a sync message is then received
> from the frontend, a new sync pending message is added, consulting
> the sync_map as described above, and the sync message is forwarded to
> the backends.
>
> (2) If no query is sent and a sync message is received, the sync_map
> is all false. This doesn't make sense since there's no point in
> sending a sync message, but the protocol does not prohibit this. In
> this case we set sync_map all true, which means we send sync messages
> to all backends. This may not be the best way in terms of
> performance, but it makes things simpler (and compatible with what we
> are doing today).

I have just pushed the patch with slight modifications (the copyright
year was changed).

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp