Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uP12Q-00Fi8H-Gm for pgsql-general@arkaria.postgresql.org; Tue, 10 Jun 2025 15:37:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uP12O-005vS7-Eq for pgsql-general@arkaria.postgresql.org; Tue, 10 Jun 2025 15:37:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uP12O-005vNz-3S for pgsql-general@lists.postgresql.org; Tue, 10 Jun 2025 15:37:20 +0000 Received: from mail-ej1-x629.google.com ([2a00:1450:4864:20::629]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uP12M-001G3p-1z for pgsql-general@lists.postgresql.org; Tue, 10 Jun 2025 15:37:19 +0000 Received: by mail-ej1-x629.google.com with SMTP id a640c23a62f3a-ade48b24c97so476039266b.2 for ; Tue, 10 Jun 2025 08:37:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749569836; x=1750174636; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=qUqNpdxMNNWPewYOB40PLmXa58aWcRWSu+NohN9GFbM=; b=mt2xwKgDN1TR+Ghoom9lf7Y/gkr2wOnFiPfUpcIRwToNVmiL9hSRjmta1JvAJoakSx vMHAEykCC7Cm2/OpE4nn8FaLGw0lO8piMo4gLjgUzXstlr6QzJCzpvZPRRk5F2x8WnoF kTqoXpSJO8TxlH/TE1P8ezFgzL8Ub+1csgZDTBMFd3zSuOYSguQ4qKDtHECgJ9cl3Gup s6v74B/O2IdIAs8aPEj7cFMtC/OZazB8JMpK0Xh9+0Lzq32CLFV+kvDnzqJgbaGtqra3 I8XpVem6t3lP+HuUmBvUVWaMkyVWVkdIR+W3nMyWtbvUdQwodiESDSrtkLlMr1+lcmU6 6E5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749569836; x=1750174636; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=qUqNpdxMNNWPewYOB40PLmXa58aWcRWSu+NohN9GFbM=; b=EJcRfIbahbGUU74aF9SPpaRf6h5GOWIHqKEsbIatzUQx8OrBft/A7MGtr2Mo5la9hu SilwZePa8hbM2sqyGGyot28/HQzOtz9zMy/K5cAZmQHt+zOQP5B/s3i78jGh0NXN0aZP yf3QezN26Fp1GxpfyuDAOu5cLVPp08hyhfXuHJUiNqWJPIP+6yBuXcgUBsXJHlIvzpJE bnRNTpjh8beAM41TyVWnxlb6pFcjUUVVt4qaf0gOMMjGy1AeWU281WLVABKcDU3zTuQd kQ0KHglC6xgj7eCFA3W7o62Xa6b0sS1dY503VedkZ0IUKM+W6W86phSzNinqyKqX4fpa D98A== X-Gm-Message-State: AOJu0YyH6AFnamDE60d2Tni+EI0oasT2fW5gt2inWT+R01iXx7Y6wHg1 vU1oDKbMYcgRH+BTVkpOD5NoL7K+2h6HljY8Ie0kZOZXtznwA1j0fbJPpdFXnege3mgK8MFqELk SGEknTO6GUdhnpYpCkYbMvFlxGBr7K1d8LVKG X-Gm-Gg: ASbGncvwzL2s/VmYtiHL7PsLW8mslyl3BWwF/DIuKo9CV9ypmFZjfeKJTjJSroDrhKp 3eKgQZftZzmlklv6kaYRlPpoVcW4zWvv34XViODmBlttBjlVPVU2InADLPkXVItBlCQe0o1fWMI MmAcCmBzWpiZRvQNmhTSIpKDO2BW4PiKSC3foOUp8LNRUIAQ== X-Google-Smtp-Source: AGHT+IFRvcj3/fRtEOJ/fKUnwUbYeKT3+MSCUKQWjXobDbWGR/50pqkSNzDsBwi5heu+SDRoN2bgW06Aml7bk1O0eNE= X-Received: by 2002:a17:907:db03:b0:ad8:9041:7724 with SMTP id a640c23a62f3a-ade1aa688f0mr1451034566b.61.1749569835486; Tue, 10 Jun 2025 08:37:15 -0700 (PDT) MIME-Version: 1.0 From: Max Madden Date: Tue, 10 Jun 2025 16:37:03 +0100 X-Gm-Features: AX0GCFtzJ068YWuHxGC2XCcl5_X2BD11bgvtzOKiKEIwoSgGjBHaQaGEoR4dAwA Message-ID: Subject: Logical Replication Memory Allocation Error - "invalid memory alloc request size" To: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000008dd12b063739785c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000008dd12b063739785c Content-Type: text/plain; charset="UTF-8" Hello, I'm encountering a consistent issue with PostgreSQL 15 logical replication and would appreciate any guidance on debugging or resolving this problem. *Setup:* - Source: PostgreSQL 15.x - Target: PostgreSQL 15.x - Replication: Logical replication using publication/subscription (pgoutput) - Tables: 3 tables (details below) *Table Details:* - Table 1: ~1,300 records, 7 columns, no large objects - Table 2: ~100,000 records, 7 columns, no large objects - Table 3: ~100,000 records, 17 columns, no large objects *Problem:* The initial snapshot and data copy complete successfully for all tables. However, anywhere from 5 minutes to 2 hours after the initial sync, the subscription consistently fails with memory allocation errors like: ``` 2025-06-10 14:14:56.800 UTC [299] ERROR: could not receive data from WAL stream: ERROR: invalid memory alloc request size 1238451248 2025-06-10 14:14:56.805 UTC [1] LOG: background worker "logical replication worker" (PID 299) exited with exit code 1 ``` This occurs whether I replicate all 3 tables together or individually. My initial hypothesis is that large transactions are creating WAL segments that exceed memory limits when sent to the subscriber. However, I haven't been able to confirm this / find the cause. *Questions:* 1. What's the best approach to debug this memory allocation issue? 2. Are there specific PostgreSQL settings I should check ? 3. How can I identify if large transactions are indeed the root cause? *Additional Context:* - This happens consistently across multiple replication attempts - The error size varies but is always requesting > 1GB - No custom logical replication settings currently applied - Subscriber machine has 256 GB of RAM and Ubuntu 20.04 - Can recreate it on different machines I should also mention that we're operating in a managed environment on DigitalOcean, which means we don't have direct access to the WAL logs on the publisher node. This is why the log information above is limited. I understand this constraint makes it more difficult to provide help, but I would really appreciate any insights or suggestions you might have. Thanks, Max --0000000000008dd12b063739785c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hello,

I'm encountering a consistent issue with= PostgreSQL 15 logical replication and would appreciate any guidance on deb= ugging or resolving this problem.

Setup:
- Source: Postgr= eSQL 15.x
- Target: PostgreSQL 15.x
- Replication: Logical replicatio= n using publication/subscription (pgoutput)
- Tables: 3 tables (details = below)

Table Details:
- Table 1: ~1,300 records, 7 column= s, no large objects=C2=A0
- Table 2: ~100,000 records, 7 columns, no lar= ge objects
- Table 3: ~100,000 records, 17 columns, no large objects
=
Problem:

The initial snapshot and data = copy complete successfully for all tables. However, anywhere from 5 minutes= to 2 hours after the initial sync, the subscription consistently fails wit= h memory allocation errors like:

```
2025-06-10 14:14:56.800 UTC = [299] ERROR: could not receive data from WAL stream: ERROR: invalid memory = alloc request size 1238451248
2025-06-10 14:14:56.805 UTC [1] LOG: back= ground worker "logical replication worker" (PID 299) exited with = exit code 1
```

This occurs whether I replicate all 3 tables toge= ther or individually.

My initial hypothesis is= that large transactions are creating WAL segments that exceed memory limit= s when sent to the subscriber. However, I haven't been able to confirm = this / find the cause.

Questions:

1. What's the best = approach to debug this memory allocation issue?
2. Are there specific Po= stgreSQL settings I should check ?
3. How can I identify if large transa= ctions are indeed the root cause?

Additional Context:
- Th= is happens consistently across multiple replication attempts
- The error= size varies but is always requesting > 1GB
- No custom logical repli= cation settings currently applied
- Subscriber machine has 256 GB of RAM= and Ubuntu 20.04
- Can recreate it on different machines

= I should also mention that we're operating in a managed environment on = DigitalOcean, which means we don't have direct access to the WAL logs o= n the publisher node. This is why the log information above is limited. I u= nderstand this constraint makes it more difficult to provide help, but I wo= uld really appreciate any insights or suggestions you might have.

Th= anks,
=C2=A0
Max
--0000000000008dd12b063739785c--