Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2lsZ-000aan-2r for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 08:03:48 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w2liY-008VMu-3C for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 07:53:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2liY-008VMk-22 for pgsql-hackers@lists.postgresql.org; Wed, 18 Mar 2026 07:53:26 +0000 Received: from mail-pf1-x436.google.com ([2607:f8b0:4864:20::436]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w2liV-00000000tSC-3Rsu for pgsql-hackers@lists.postgresql.org; Wed, 18 Mar 2026 07:53:26 +0000 Received: by mail-pf1-x436.google.com with SMTP id d2e1a72fcca58-82a124f3a5bso3723742b3a.0 for ; Wed, 18 Mar 2026 00:53:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773820401; x=1774425201; darn=lists.postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=p/80w2dOKcR++YMizVVxcLVcxW1i5AyzjeXCT9lW9O0=; b=YGORHGvfGAvWRZ0KLJq8nHtCvT5e1z0LjkKUVVtALlHmeyiHkUji7bnKrddRsCwxSZ SNCTREiD9dQHtgjAz88NcPlKzZP7qtpoQj6axSCkARym4mwi+fUBxkQ62L1ggSuZA254 HBmRPvtbPntjYaEJ7OBGTkiximn2p61BohNaO0tD+JR8g0szbXxD67pc+TABKZERui9w jhGNBz0r17hm1R1K7fiOD1L3gFeh6hzS8EgtNefKUWdc5y8dKND8XIz0sn14q6+tkxg/ SnNUP+ktec8MV6M6jZlfJhSs6ZRMs0r2bfQ0O0plODPMpRmr55ZpWIfDAwkYJFGxzlN1 IzZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773820401; x=1774425201; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=p/80w2dOKcR++YMizVVxcLVcxW1i5AyzjeXCT9lW9O0=; b=K/tUf5Ny6O8mPRqzZzy8s9FupGe5DPsayENUhoXrhO5WmyfUJE9kZm3hel/IWE61nt y/GmQ+G/c76MngZ6Qad6WqdDrxlXkYvcNeZewCnEuo2lApZNPrliyd8cc8Zy36dJdnje cJZEHrQN9muhFbc7CMGAZOJu0PrIGOY5+HFionpFBCPuRv7Tz7TKxEcBuo7usBsyGmVs 9FV/03gTRaHYQakFmXjiKudWvV+eBkGHiXEFBJ0mw4IApuv7ZqIjXTy4QeM4EiF4xk5H da0UN01VkA1mqN3rpz1hfOoEtq3g+yg6sbXCxjq8aHuSZWigbmqp3NznIz8zQfAdizrH F0ZA== X-Gm-Message-State: AOJu0Ywouxi6uNWqTh0KuTiZb10bjpGYWiWcsoIkjFoCHY/TP2Tan8wL veaUXLqOjorG/iP4PYd+CsiRGkXXq/P60rv/QgGX0jUsKsJ6+rX/7w0h X-Gm-Gg: ATEYQzwBLDRAsujOd8T8+gKwOlau/71uqQIarZKjuB4mP4JyvzGRnj5oj3IStdoFpya wfSMldl87uX64oZ4CmGJ7PUqWxaXTRsplswC/+0AndZkP20KQUk1yZ1iz1/G+LUMHfjaKKkXR7r 6cvtC42dI1ImX9wqTDMEdOaelhpjXBgdOU5d0EbWcZDOcLlCpleTsBk5Qkb44xxy7mnr2xHcXlr xP/r/Ge1+yJkuSHaDKYIT33bQYl3s+W80FUsKQ4krb4xUNp03Qo/kYGQQ/DtuJ6BUGDyOuH6mi5 OWMfFlHxf9R3Obo9X8+RCpqfqkSZJUnIH1UZJhhXNUX26LCElrbzk+oQVfkQBsbyuI80I5FaVgw i/ZyD1877hWLyvKFvYuxfN1++F4qXw7O116qwEO06LZStBkaBatg3moGZjaOSIXLO4MN67SaRTe uklK48b+T1Kzob3uuIUpfiSm8UnOKIiLg= X-Received: by 2002:a05:6a00:ab85:b0:82a:13ef:5ab0 with SMTP id d2e1a72fcca58-82a6adc7011mr2575592b3a.5.1773820400660; Wed, 18 Mar 2026 00:53:20 -0700 (PDT) Received: from smtpclient.apple ([45.32.121.103]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82a6b542037sm2189239b3a.5.2026.03.18.00.53.18 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 18 Mar 2026 00:53:20 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.400.21\)) Subject: Re: Avoiding memory leakage in jsonpath evaluation From: Chao Li In-Reply-To: <569394.1773783211@sss.pgh.pa.us> Date: Wed, 18 Mar 2026 15:52:40 +0800 Cc: pgsql-hackers@lists.postgresql.org Content-Transfer-Encoding: quoted-printable Message-Id: References: <569394.1773783211@sss.pgh.pa.us> To: Tom Lane X-Mailer: Apple Mail (2.3864.400.21) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On Mar 18, 2026, at 05:33, Tom Lane wrote: >=20 > I got an off-list report that a query like this consumes > an unreasonable amount of memory: >=20 > SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM = generate_series(1,10000) i), > '$[*] ? (@ < $)'); >=20 > For me, that eats about 6GB by the time it's done executing. > If that doesn't seem like a lot to you, just add another zero to the > generate_series call, and then it'll be more like 600GB, because the > leakage is O(N^2). >=20 > Admittedly, this isn't an especially useful query: its runtime is > also O(N^2), because that path expression basically requires us to > compare every element of the input JSON array to every other element. > But it's not cool that it leaks so much memory while at it. >=20 > I poked into this and found that the leakage is entirely composed of > "JsonValueList"s that are built during path evaluation and then just > left to rot until the end of jsonb_path_query(). We can fix it by > being careful to free those lists on the way out of each jsonpath > evaluation function that creates one. However, just doing that would > mean adding pfree overhead on top of palloc overhead, so I went a bit > further and reimplemented JsonValueList to be more compact and cheaper > to allocate/free. The attached seems to be a bit faster than the > existing code as well as not leaking so much memory. See the draft > commit message for more details. >=20 > regards, tom lane This patch looks like a big win. It not only saves memory, but also = makes the query much faster. I tested the query on my MacBook M4, increasing the iteration count from = 10000 to 50000. Current master (3b4c2b9db25): ``` evantest=3D# SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM = generate_series(1,50000) i), '$[*] ? (@ < $)'); Time: 208581.771 ms (03:28.582) evantest=3D# SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM = generate_series(1,50000) i), '$[*] ? (@ < $)'); Time: 217269.595 ms (03:37.270) ``` With the patch: ``` evantest=3D# SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM = generate_series(1,50000) i), '$[*] ? (@ < $)'); Time: 18674.580 ms (00:18.675) evantest=3D# SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM = generate_series(1,50000) i), '$[*] ? (@ < $)'); Time: 18889.329 ms (00:18.889) ``` My observations were: * Before the patch, the backend process memory usage fluctuated between = roughly 50GB and 145GB, while CPU usage stayed around 30%. * With the patch, the backend process memory usage stayed stable at = around 30MB, while CPU usage stayed around 100%. After reviewing the patch, I thought JsonValueListLength() might be = worth optimizing, since it is O(n). I tried adding an ntotal_items field = to JsonValueList to track the total number of items, similar to the last = pointer that is only meaningful in the base chunk. But that did not help = in my test, and I realized JsonValueListLength() is not on the hottest = path, so I dropped that idea. =46rom the MacOS Instruments tool, the most expensive parts seem to be = fillJsonbValue, JsonbIteratorNext, cmp_var_common, and cmp_numerics. But = those look like separate topics. Overall, this looks like a solid patch. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/