Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1undQn-009JG6-Nj for pgpool-hackers@arkaria.postgresql.org; Sun, 17 Aug 2025 13:28:19 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1undQl-00A3P9-8W for pgpool-hackers@arkaria.postgresql.org; Sun, 17 Aug 2025 13:28:15 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1undQl-00A3P1-14 for pgpool-hackers@lists.postgresql.org; Sun, 17 Aug 2025 13:28:15 +0000 Received: from mail-yw1-x1134.google.com ([2607:f8b0:4864:20::1134]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1undQi-000Knc-2n for pgpool-hackers@lists.postgresql.org; Sun, 17 Aug 2025 13:28:15 +0000 Received: by mail-yw1-x1134.google.com with SMTP id 00721157ae682-71d60150590so24908277b3.0 for ; Sun, 17 Aug 2025 06:28:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tailorbrands.com; s=google; t=1755437290; x=1756042090; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=wo76adYCE9jsKdiLmvXLURBfeS1lAsOarHqL4tMncug=; b=G+DZF0pLpgQDxV8B2ec3p36QFcgjN59sDye+Y0k5KePZg8hLVbm4nCYDDsMHag30su TgrfVhQ1U2Zu66trJoLqZpQL7u8eYizGnRkrNvE1OOg/EMdQm/NPr1iB6zhh2Q1ObedD 2ACDS+fJ5bMKh127l8X0tZQuR1ATBaM6m+zYi1/f1LVco/+txNxZyxduWIoVX7kpDaSo b/5aWb+4TTTbMFX1YUTIRNUDyjvk9Gv4DWWa/5W/TxvfaCUKmJjiAV7RV/DHdw8GRLQM rO6iUZkAVzRMjZCbRzgXdWlKeJyzRbNAHYWiWfKPhd1Ht8dZ5Vo4xDEq+hKVs5dN9iEb evrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1755437290; x=1756042090; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=wo76adYCE9jsKdiLmvXLURBfeS1lAsOarHqL4tMncug=; b=jio1f106RRpjn9euStqVRm/IfNV3nOsEOHgW21o2/NiuZl7AeBcM6zyEzlCdFilTq/ MmUhGMm9S612u4rV04Ys81aROy1DuDIdO4EF6S1Mi0o9WPbnhlmJKHFX5WeefkvSGDfn DLJsUZsy8DfpMCzeVX6lMV2Ygri8yIPNtHc3ABXINnah518jqIOQavani9AtSuin7XWM hGbNFMhlu4AP1Qae5h+00Ca1cJ7EnYkkTqiLQPikEmddQ5/5bMXYFSZ+JhZ1Na55wgrm lrDt5VQoj4KalHULDPQnCYsmYkAnqMbPGvxXMBOJm/HCFMG7KUnPJXTlz9z8gm640G4l 7qNQ== X-Gm-Message-State: AOJu0YwyuSA6z6HWiYm2+e+fbB8VpvNmU0Lx3LQ4G8zaLoQWdsE4+o2K +rumbMNfLAtmk7XnfaPjrFuTrBGjXoTnXUkkwTF77HGq45J/XCwC8qEfJr4gvj4PDcDl1PfANAo 4jarkFhYo7ltYgW2ofdp0cUZ7o/+9ptK3Uuv57bQcpvqiFX5K5cfJhYceAaXp X-Gm-Gg: ASbGnct+GSNz7IELrgNJkUc6OoAXH4A1sdJnate7ghn0xE4LPBmcXH29XO9rJt5yYUA 22wiSK0HpcuvPQrJbVN0xbhHGjve8Iml0wITP/fDZ2nthla71qZBEMtDxe1k/fqrbVGBMDtnqKa 5GmuOTDGNFfQfsMLH0bvC7m+pxzdipyO9Xa6AlPTpxz7zhBavGMst3bi21N99Y5gTmVFicx1rC9 bqFEGOnLwLfvf/i X-Google-Smtp-Source: AGHT+IEYsY5uJA4c4GKKbyM2sj1Ln1IYTKFKjQ27gX1Jv5QCYI3nTlRZafdDcqE29oozz3I+hBqx+sHyM1nz0p7yrC8= X-Received: by 2002:a05:690c:61c8:b0:71e:6f13:955 with SMTP id 00721157ae682-71e77508079mr69701637b3.34.1755437290401; Sun, 17 Aug 2025 06:28:10 -0700 (PDT) MIME-Version: 1.0 From: Nadav Shatz Date: Sun, 17 Aug 2025 16:27:59 +0300 X-Gm-Features: Ac12FXwmnvHEV6XSYjo7JXdEnEccAzIbk1TAysn4BOVA1iQgcWK3oAWXFAVNzd4 Message-ID: Subject: Proposal: recent access based routing for primary-replica setups To: pgpool-hackers@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000001edb3c063c8f9887" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000001edb3c063c8f9887 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hello all, My name is Nadav Shatz, I=E2=80=99m the CTO at Tailor Brands and have been = working with PostgreSQL in high-traffic, distributed environments for many years. Most of my focus has been on backend architecture, scaling, and performance optimization, and I=E2=80=99m a long-time user and admirer of the Postgres ecosystem. I=E2=80=99d like to propose adding a feature to pgpool-II for *recent acces= s based routing* in primary-replica setups. The idea is similar to what we=E2=80=99= ve described in this article , and is also reflected in this pgcat PR . The core concept is to route read queries to the primary if they occur shortly after a write, reducing replica lag inconsistencies while still benefiting from read scaling. *How it would work (high-level):* - *External =E2=80=9Ceffective lag=E2=80=9D via config (hot-reloaded): *Inste= ad of relying on pgpool-II=E2=80=99s replication delay checks (which don=E2=80=99t map we= ll to Aurora semantics), we=E2=80=99ll expose a *config value* representing the effec= tive replica lag (or directly the TTL to use for =E2=80=9Crecency=E2=80=9D). = This value is *pushed by an external controller* and *hot-reloaded* (no restarts). The relevant knobs might look like: - enable_recent_access_routing (boolean, default off) - recent_access_ttl_ms (integer, default 0, can be hot-reloaded) - enable_query_parser (boolean, required for this feature, default off) - *In-memory recent-access map: *Each worker maintains a lightweight per-DB in-memory map of *recently written relations*. On any write (INSERT/UPDATE/DELETE/UPSERT/TRUNCATE), we record the touched relations with a TTL derived from recent_access_ttl_ms. Entries expire automatically; writes refresh them. - *Routing + query parsing: *For incoming statements we parse enough to answer two questions: (1) is it a read or a write? and (2) which relatio= ns are referenced? If a read touches any =E2=80=9Crecently written=E2=80=9D= relation, we *force route to primary*; otherwise we allow normal read load-balancing to replicas. *Notes on behavior & ops:* - *Config & hot reload:* Operators (or an external controller) can update recent_access_ttl_ms dynamically and trigger hot reload to adapt to changing conditions=E2=80=94no reliance on Aurora internals. - *Safety levers:* a global max TTL, optional allow/deny lists, and metrics (e.g., =E2=80=9Creads forced to primary due to recency=E2=80=9D)= for visibility. - *Defaults & compatibility:* all defaults are safe/off; enabling requires explicit opt-in. I=E2=80=99ll prepare the code changes and send a patch/PR, but before divin= g in I wanted to check if anyone has *objections, concerns, or preferred alternatives*=E2=80=94particularly around parser hooks, shared memory use, = or hot-reload mechanics in pgpool-II. Thanks for considering, --=20 Nadav Shatz Tailor Brands | CTO --0000000000001edb3c063c8f9887 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Hello all,

My name is Nadav Shatz, I=E2=80=99m the CTO at Tailor= Brands and have been working with PostgreSQL in high-traffic, distributed = environments for many years. Most of my focus has been on backend architect= ure, scaling, and performance optimization, and I=E2=80=99m a long-time use= r and admirer of the Postgres ecosystem.

I=E2=80=99d like to propose adding a feature to pgpoo= l-II for recent access based routing= in primary-replica setups. The idea is similar to what we=E2=80=99ve descr= ibed in this article, and is also refl= ected in this pgca= t PR. The core concept is to route read queries to the primary if they = occur shortly after a write, reducing replica lag inconsistencies while sti= ll benefiting from read scaling.

How it would w= ork (high-level):

  • External =E2=80=9Ceffective lag=E2=80=9D via confi= g (hot-reloaded):
    Instead of relying on pgpool-II=E2=80=99s replicat= ion delay checks (which don=E2=80=99t map well to Aurora semantics), we=E2= =80=99ll expose a config value repre= senting the effective replica lag (or directly the TTL to use for =E2=80=9C= recency=E2=80=9D). This value is pushed by an e= xternal controller and hot-reloaded<= /b> (no restarts). The relevant knobs might look like:

    • enable_recent_access_routing= (boolean, default off)

    • recent_access_ttl_ms = (integer, default 0, can be hot-reloaded)

    • enable_query_parser (boolean, required for= this feature, default off)

  • In-memory recent-access map:
    Each worker ma= intains a lightweight per-DB in-memory map of r= ecently written relations. On any write (INSERT/UPDATE/DELETE/UP= SERT/TRUNCATE), we record the touched relations with a TTL derived from recent_access_ttl_ms. Entries expire automatic= ally; writes refresh them.

  • Routing + query parsing:
    For incoming state= ments we parse enough to answer two questions: (1) is it a read or a write?= and (2) which relations are referenced? If a read touches any =E2=80=9Crec= ently written=E2=80=9D relation, we force route= to primary; otherwise we allow normal read load-balancing to re= plicas.

Notes on behavior & ops:<= /b>

  • Config & hot reload:<= /b> Operators (or an external controller) can update recent_access_ttl_ms dynamically and trigger hot reload to= adapt to changing conditions=E2=80=94no reliance on Aurora internals.

  • Safety levers:= a global max TTL, optional allow/deny lists, and metrics (e.g., =E2=80=9Cr= eads forced to primary due to recency=E2=80=9D) for visibility.

  • Defaults & compatibil= ity: all defaults are safe/off; enabling requires explicit opt-i= n.

  • I=E2=80=99ll prepare the code changes= and send a patch/PR, but before diving in I wanted to check if anyone has = objections, concerns, or preferred alternatives= =E2=80=94particularly around parser hooks, shared memory use, or= hot-reload mechanics in pgpool-II.

    Thanks for con= sidering,

    --
    =
    Nadav Shatz
    Tailor Brands=C2=A0| CTO
    <= /div>
    --0000000000001edb3c063c8f9887--