Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vsJIr-00Fvqi-1X for pgsql-bugs@arkaria.postgresql.org; Tue, 17 Feb 2026 11:31:41 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vsJIq-009vqT-06 for pgsql-bugs@arkaria.postgresql.org; Tue, 17 Feb 2026 11:31:40 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vs5T3-005gmT-1Q for pgsql-bugs@lists.postgresql.org; Mon, 16 Feb 2026 20:45:17 +0000 Received: from mahout.postgresql.org ([2001:4800:3e1:1::227]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1vs5Sy-0000000159M-01GF for pgsql-bugs@lists.postgresql.org; Mon, 16 Feb 2026 20:45:15 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Message-ID:Date:Reply-To:Cc:From:To:Subject: Content-Transfer-Encoding:MIME-Version:Content-Type:Sender:Content-ID: Content-Description:In-Reply-To:References; bh=l06YmWd/JygEFhLyGcVBoywTScEbD1gEL75wvtTBiFs=; b=rrdwbTFuiIxk6lWInQs8Hjfc3U xmz/hZS5j9NkA9NyyqlQB0dTXLZy7M7SE96CzZ2f6Vc0otkt9nfRbJy9UScTnUIgTMpTqc6crMKkl h7K05QedvhcmFIxoniDu6bGeztAPj+DX22j7Riyxp0iqAO3gSRnqh4ep8vIYg+ratSHjHldOBmjqw /Nt+vpMXMR6V6ziPYGlDg+51O/U2dsnH4vvtOiRiblapJ1+b5W16gMDLmz87kbjsni8ultVnO7jcw TsnTNN5i7f4aVpxUNkfrNPVBWghfd+lMWh1Tm77ol/e/7Bbn2Cgip2rGSQv8Vv2MGhzHLCQNTCWzU /Hw+coOw==; Received: from wrigleys.postgresql.org ([2a02:16a8:dc51::60]) by mahout.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vs5Sv-0038Db-38 for pgsql-bugs@lists.postgresql.org; Mon, 16 Feb 2026 20:45:10 +0000 Received: from localhost ([127.0.0.1] helo=wrigleys.postgresql.org) by wrigleys.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vs5Sp-001M6Z-0z for pgsql-bugs@lists.postgresql.org; Mon, 16 Feb 2026 20:45:03 +0000 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Subject: BUG #19411: libpq 16.x exhibits a memory leak when connections are repeatedly created and destroyed To: pgsql-bugs@lists.postgresql.org From: PG Bug reporting form Cc: mcarter@twosigma.com Reply-To: mcarter@twosigma.com, pgsql-bugs@lists.postgresql.org Date: Mon, 16 Feb 2026 20:44:35 +0000 Message-ID: <19411-0440c8897a04b638@postgresql.org> X-Auto-Response-Suppress: All Auto-Submitted: auto-generated List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk The following bug has been logged on the website: Bug reference: 19411 Logged by: Matt Carter Email address: mcarter@twosigma.com PostgreSQL version: 16.7 Operating system: Debian GNU/Linux 12 with kernel 6.12.51-ts1-amd64 Description: =20 This report is for the libpq client library for PostgreSQL. I noticed the issue because my Apache Airflow task supervisor processes were experiencing memory growth of ~600 KB/min. When running hundreds of tasks, my 32 GB machine OOM-crashed every few hours. I traced the problem to when I upgraded libpq from version 13 to 16. Varying Python version, and PostgreSQL DB version didn't make a difference. When I updated to psycopg3, the leak rate dropped dramatically. Confirmed leaking: libpq 16.0.4 libpq 16.0.7 Confirmed NOT leaking: libpq 13.0.11 libpq 13.11 My environment for the below reproducer is: Client libpq: 16.0.4 or 16.0.7 (runtime version) Python: 3.10.16 or 3.11.11 psycopg2: 2.9.10 SQLAlchemy: 1.4.46 PostgreSQL Server: 13.11 OR 16.7 (both affected equally) Platform: Linux, K8s containers Reliable reproducer: ``` #!/usr/bin/env python3 """ Reproduces ~800 KB/min memory leak in libpq 16.x MUST monitor with 'ps aux', NOT Python's resource module! """ import datetime, os, subprocess from time import sleep from sqlalchemy import Column, Integer, types, create_engine from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import scoped_session, sessionmaker from sqlalchemy.pool import NullPool # Verify libpq version import psycopg2 print(f"psycopg2 compiled against libpq: {psycopg2.__libpq_version__}") # Check ACTUAL runtime libpq (critical!) so_path =3D os.path.dirname(psycopg2.__file__) + "/_psycopg.cpython-311-x86_64-linux-gnu.so" result =3D subprocess.run(["ldd", so_path], capture_output=3DTrue, text=3DT= rue) for line in result.stdout.split('\n'): if 'libpq.so' in line: print(f"ACTUAL runtime libpq: {line.split('=3D>')[1].split('(')[0].strip()}") print(f"\nMonitor this PID: {os.getpid()}") print("Run in another terminal: watch -n 10 'ps -p -o pid,rss,vsz,cmd'") print() # Setup SQLAlchemy with NullPool (creates/destroys connections) Base =3D declarative_base(metadata=3DNone) class TestTable(Base): __tablename__ =3D "test_leak_table" id =3D Column(Integer, primary_key=3DTrue) updated_at =3D Column(types.DateTime(timezone=3DTrue)) obj =3D TestTable() engine =3D create_engine( "postgresql://your-server/your-database", poolclass=3DNullPool, # Critical: no pooling, create/destroy each time encoding=3D"utf-8" ) Session =3D scoped_session(sessionmaker( autocommit=3DFalse, autoflush=3DFalse, bind=3Dengine, expire_on_commit=3DFalse, )) session =3D Session() session.add(obj) # Monitor memory using ps aux (CRITICAL - internal monitoring doesn't show leak!) def get_rss_kb(): """Get RSS from ps aux - DO NOT use resource.getrusage()!""" pid =3D os.getpid() result =3D subprocess.run(["ps", "aux"], capture_output=3DTrue, text=3D= True) for line in result.stdout.split('\n'): if str(pid) in line and 'python' in line: return int(line.split()[5]) # RSS is 6th column return None print("Starting 10-minute test...") print(f"{'Time(s)':<8} | {'Iteration':<10} | {'RSS(KB)':<10} | {'Growth(KB)':<10}") print("-" * 55) start_time =3D datetime.datetime.now() baseline_rss =3D None for i in range(1, 601): # 600 iterations =3D 10 minutes # Update and commit (creates/destroys connection with NullPool) obj.updated_at =3D datetime.datetime.now(datetime.timezone.utc) session.commit() # Report every 30 iterations if i % 30 =3D=3D 0: rss =3D get_rss_kb() if rss: if i =3D=3D 30: baseline_rss =3D rss growth =3D 0 else: growth =3D rss - baseline_rss elapsed =3D (datetime.datetime.now() - start_time).total_second= s() print(f"{int(elapsed):<8} | {i:<10} | {rss:<10} | {growth:<10}", flush=3DTrue) sleep(1) # Final results final_rss =3D get_rss_kb() total_elapsed =3D (datetime.datetime.now() - start_time).total_seconds() growth_total =3D final_rss - baseline_rss leak_rate =3D growth_total / (total_elapsed / 60) print() print("=3D" * 60) print(f"Baseline RSS (t=3D30s): {baseline_rss} KB") print(f"Final RSS (t=3D600s): {final_rss} KB") print(f"Total Growth: {growth_total} KB") print(f"Leak Rate: {leak_rate:.1f} KB/min") print("=3D" * 60) ``` The DB set-up for this reproducer is simple: ``` CREATE TABLE test_leak_table ( id SERIAL PRIMARY KEY, updated_at TIMESTAMP WITH TIME ZONE ); INSERT INTO test_leak_table (updated_at) VALUES (NOW()); ``` When I run this with libpq 13.x, I get: Baseline: ~41,000 KB Final: ~41,200 KB Growth: ~200 KB (stabilizes) Leak Rate: ~14 KB/min But when I run it with libpq 16.x, I reproduce the leak behavior: Baseline: ~42,000 KB Final: ~52,000 KB Growth: ~10,000 KB (continuous linear growth) Leak Rate: ~800 KB/min Notes: 1. PostgreSQL server version does NOT matter PG 13.11: ~797 KB/min average PG 16.7: ~802 KB/min average Difference: <1% (within measurement error) 2. Leak is linear and continuous Grows ~530 KB every 30 seconds No stabilization No sawtooth pattern Suggests unbounded memory accumulation 3. Leak is in C Library Memory The leak is: - Visible in `ps aux` RSS (operating system view) - NOT visible in Python's resource.getrusage() (application view) - NOT any Python object (I checked with the Python VM's `gc.get_objects()`) This indicates the leak is in libpq's C-level allocations, not Python heap. 4. The leak requires connection churn. In particular, it needs: - NullPool (no connection pooling) - Repeated connection create/destroy - UPDATE + COMMIT operations each iteration Note that the only psycopg2 version I tested with was 2.9.10, as it is the only version available in my test environment. Also, I only tested with SQLAlchemy version 1.4.46, as v2 would require a rewrite due to an incompatible API. Workarounds: Use libpq 13.x (43x improvement: 14-22 KB/min vs 800 KB/min). Use connection pooling to reduce connection churn. Use psycopg3. Recycle processes frequently.