Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w4gCs-002RYz-1y for pgsql-hackers@arkaria.postgresql.org; Mon, 23 Mar 2026 14:24:38 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w4gCr-000fzJ-05 for pgsql-hackers@arkaria.postgresql.org; Mon, 23 Mar 2026 14:24:37 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w4gCq-000fzA-20 for pgsql-hackers@lists.postgresql.org; Mon, 23 Mar 2026 14:24:37 +0000 Received: from mail-ot1-x332.google.com ([2607:f8b0:4864:20::332]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w4gCp-00000000dWg-0jeW for pgsql-hackers@lists.postgresql.org; Mon, 23 Mar 2026 14:24:36 +0000 Received: by mail-ot1-x332.google.com with SMTP id 46e09a7af769-7d7f592b8beso2278167a34.1 for ; Mon, 23 Mar 2026 07:24:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774275874; cv=none; d=google.com; s=arc-20240605; b=cFi6dsakbqZt79kreiEOoKXs863MvlhPvFgQi8mnTLG7LL5iR+Zlo6gkblK3O2at0p SiAjS/qUiku19Z6F2A5Xmsdjc6gO6wuJnJOjoxDD4QqCw/CvaTy2Av1dYa7JxZMf8weH 2HQ8zdnE4VKUh8mgfacolNE1kUVNjFwtlxbrjZCIAc8f6EMl30BTVbEARqLTNmtYdO82 FhUGzkf+NOlUWB0ZKfksoZ/KWpkEWgEecfTsHacfNAUXJaOgCi4nD99dQmfkshpyUdDl aENDoMziPDRyIhxuFA1fBypuBPANDdTYWBFf9xdf4bN/8KsyZ4zmBgIYCSSU2xyyjlVN P+cg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=jzNKGtEUvv7MxvhMb32MB+shCfzd1EOakz+c49b12V0=; fh=4PEyEgnrNCXBb06IAIllTAtxFESb07aZUXNYQpwl3Mg=; b=QI8LioXUW50BO+hzNRz12+rLbR0MV58Kh6h0cKFxTpDSR2CisJEM5ToNCe4YvUKIUN 7iFssmfK5NPP3cscZAUvkuwlUQynzEQLwCUO9qNEqy6dD5+HziF+NS6YLxVCWIHv1CY3 +5eKIUSBRusbi/vq0a7hub2WJ3GNc/v3q9DorkuNP2hwVgvfO5+RChAdGwC2CuZ4ZYlP XHSX449dyXr9BUHVSjao5xv9FwKA7XD2+c48eVdg0o7QS6NC94esR2auV4w5fOSRh3J1 G+5mV99j61TJ3RoUFrJm2CL2ZM7RkiPyogl0QlCSJ3XnYj5wd8SF46Ga32C6sum8CUK0 XekQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1774275874; x=1774880674; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=jzNKGtEUvv7MxvhMb32MB+shCfzd1EOakz+c49b12V0=; b=OH1nerKiAEz0IRprK9Rq7NAI/w5myZipc/qzqjEbQyZMTqx82b2IR8+HkahqwT1+cE DQlT0mkSiU4lC9cCjmfXm/DoYCGgqZKXdChYcIggQc/V+oaBn+BMKlrlQzXTmh1ruEFC 8kAxMUggET475DbCC04kLZTRTx7+pFAfc1miI/Krp85dQ//o5ll2CBatmf+NAMKADjak 8S7OUTeQNLtwcMXDfcAjyosvfXBWjwhQlTWPRdUKvE8RvKtmlWASD2Mla+WpiS/942mO U+l4jdE4E24TCtoeiI1hNDtnCVPMMPM3AL3lI1hIdigRuFKh4jwwiCZLbrrP6YBu0Our Myeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774275874; x=1774880674; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=jzNKGtEUvv7MxvhMb32MB+shCfzd1EOakz+c49b12V0=; b=O7A8DKXVvBl5NCJQbKxbtrCbPfmgOi10kAx8biNxJ9PbeNzx2mOzn8RP19MtTp0z9E WDTnIn/Wnmxii6OPwJPf1vIbwU2Np7bEhsurI0Ak1SfX7tU6hB20OPZPP2Ix95lbB7Zj M4pBZBNRszX7jEykiF2fbblopzV3Xz3+lcPtmc8XWErE40nFwLBypT2nGU7F2W+Do991 Km7f3gNT1+FXeAtbywrdENQpTXbZFLIsBH4q7S/kx3I7CSjtD5TTMhrPQXdlB66s8IHd iePTQHmoXf54PfkcGaX20CcinfmzlSgAD+l38m9ewqfpo6Raucm9UVlEGQb7UroSdenQ Fq6Q== X-Gm-Message-State: AOJu0YyyyAQRAb93ueeKcuYhBy5/1vyTHDtiquX994CjSVEzwZGLHRLC bvVwLnJvD73IbWXkjysdz9/tUIPgltAugMDHabDZABQhc48Zgwam6euDAYPdsPoCDt1qPvN1f/s yGxjffE8INP/Y0TUx/iYxVa3F2cQ5dyg= X-Gm-Gg: ATEYQzymp/QlgPrPjU0JPlO+I3sJfXOoC8kamJOE1RyU8CgOnye9r8afzCncvlVxtU8 uBUnQTMoAAYHOxMk6plBrJBAbA+XGnEkZcOw7erVIssw/CINcismkuxFSdNU40IWlyRECpDX7iM kDHhJcPRMIA+EJWcXDNL/fL4zsVbsFLCe4NIAbRUHimkgIqwB55pSQB6nHFEJKtWWnatVWuqvCb CNwHZkcIY7ogrDRp4w+SQIPjiDlPkAv36g6zP49BFBbzYMdr3rVAj18yrX322h7koMYpzT/xTgU Ptz7TS22pASIzvliQi5q80RvpxDxXPvI2so0aoA/nMbi90yNJg== X-Received: by 2002:a05:6820:4b18:b0:67c:15b2:476d with SMTP id 006d021491bc7-67c22f8574dmr9335663eaf.42.1774275874028; Mon, 23 Mar 2026 07:24:34 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Bharath Rupireddy Date: Mon, 23 Mar 2026 07:24:23 -0700 X-Gm-Features: AaiRm508Ur4IgZb3rZJ5MBPioI_9EjzJycyio0dwiT9S5RMxFoRqryfBRfMJCwo Message-ID: Subject: Re: Add logical_decoding_spill_limit to cap spill file disk usage per slot To: shawn wang Cc: pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On Mon, Mar 23, 2026 at 6:20=E2=80=AFAM shawn wang wrote: > > Hi hackers, Thank you for proposing this new feature. > =3D=3D Motivation =3D=3D > > We operate a fleet of PostgreSQL instances with logical replication. On s= everal occasions, we have experienced production incidents where logical de= coding spill files (pg_replslot//xid-*.spill) grew uncontrollably =E2= =80=94 consuming tens of gigabytes and eventually filling up the data disk.= This caused the entire instance to go read-only, impacting not just replic= ation but all write workloads. > > The typical scenario is a large transaction (e.g. bulk data load or a lon= g-running DDL) combined with a subscriber that is either slow or temporaril= y disconnected. The reorder buffer exceeds logical_decoding_work_mem and st= arts spilling, but there is no upper bound on how much can be spilled. The = only backstop today is the OS returning ENOSPC, at which point the damage i= s already done. Having a lot of spill files also increases crash/recovery times. However, files spilling to disk causing no-space-left-on-disk issues leading to downtime applies to WAL files, historical catalog snapshot files, subtransaction overflow files, CLOG (and all the subsystems backed by SLRU data structure), etc. - basically any Postgres subsystem writing files to disk. I'm a bit worried that we may end up solving disk space issues, which IMHO are outside of the database scope, in the database. Others may have different opinions though. How common is this issue? Could you please add a test case to the proposed patch that without this feature would otherwise hit the issue described? Having said that, were alternatives like disabling subscriptions when seen occupying the disk space considered? > We looked for existing protections: > > max_slot_wal_keep_size: limits WAL retention, but does not affect spill f= iles at all. > logical_decoding_work_mem: controls *when* spilling starts, but not *how = much* can be spilled. > There is no existing GUC, patch, or commitfest entry that addresses spill= file disk quota. Interesting! > The "Report reorder buffer size" patch (CF #6053, by Ashutosh Bapat) impr= oves observability of reorder buffer state, which is complementary =E2=80= =94 but observability alone cannot prevent disk-full incidents. With the proposed reorder buffer stats above, would it be possible to have a monitoring solution (an extension or a tool) to disable subscriptions and notify the admin? Would something like this work? > =3D=3D Proposed solution =3D=3D > > The attached patch adds a new GUC: > logical_decoding_spill_limit (integer, unit kB, default 0) > > When set to a positive value, it limits the total size of on-disk spill f= iles per replication slot. Key design points: > > Tracking: We add two new fields: - ReorderBuffer.spillBytesOnDisk =E2=80= =94 current total on-disk spill size for this slot (unlike spillBytes which= is a cumulative statistic counter, this is a live gauge). - ReorderBufferT= XN.serialized_size =E2=80=94 per-transaction on-disk size, so we can accura= tely decrement the global counter during cleanup. > Increment: In ReorderBufferSerializeChange(), after a successful write(),= both counters are incremented by the size written. > Decrement: In ReorderBufferRestoreCleanup(), when spill files are unlinke= d, the global counter is decremented by the transaction's serialized_size. > Enforcement: In ReorderBufferCheckMemoryLimit(), before calling ReorderBu= fferSerializeTXN(), we check: if (spillBytesOnDisk + txn->size > spill_limi= t) ereport(ERROR, ...) This is only checked on the spill-to-disk path =E2= =80=94 not on the streaming path (which involves no disk I/O). > Behavior on limit exceeded: An ERROR is raised with ERRCODE_CONFIGURATION= _LIMIT_EXCEEDED. The walsender exits, but the slot's restart_lsn and confir= med_flush are preserved. The subscriber can reconnect after the DBA: > > increases logical_decoding_spill_limit, or > increases logical_decoding_work_mem (to reduce spilling), or > switches to a streaming-capable output plugin (which avoids spilling enti= rely). When the logical_decoding_spill_limit is exceeded, ERRORing out in the walsender is even more problematic, right? The replication slot would be inactive, causing bloat and preventing tuple freezing, WAL files growth and eventually the system may hit disk-space issues - it is like "we avoided disk space issues for one subsystem, but introduced it for another". This looks a bit problematic IMHO. Others may have different opinions though. -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com