Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5EiY-0032fE-2X for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Mar 2026 03:15:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w5EiX-00B2XR-0f for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Mar 2026 03:15:37 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5EiW-00B2WB-1E for pgsql-hackers@lists.postgresql.org; Wed, 25 Mar 2026 03:15:37 +0000 Received: from fhigh-a6-smtp.messagingengine.com ([103.168.172.157]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1w5EiU-00000000tdH-2tza for pgsql-hackers@postgresql.org; Wed, 25 Mar 2026 03:15:35 +0000 Received: from phl-compute-05.internal (phl-compute-05.internal [10.202.2.45]) by mailfhigh.phl.internal (Postfix) with ESMTP id DF78914001DD; Tue, 24 Mar 2026 23:15:33 -0400 (EDT) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-05.internal (MEProxy); Tue, 24 Mar 2026 23:15:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1774408533; x=1774494933; bh=DqyD1nmuI4CO+ajah8zgzPQFwbofz24LcqyYZRAtY6U=; b= 15A7pXeebRVPDzBo0D5gMGMP6U940v5Dcoj5/3li3JVuVLrSxj8xftrFgqzP6EkB Rl3t2vM1FpuCJpo8K105H3RYtST1AKcmHx0y0g2b8wRA8yD5McT5iWOn3x3Ez4KH gNInFNn2OAIVtP1Cn78g0ng1oXT2NWlFosV8cK9XAPWvrPGdP5yo3Aj1L9QNXRi1 IMdbwSpw5mJuJvqhGzO7vlceNIDgMRIz6iGlOxywq+acD0Xtfgcr1DRumJP+HpEg u0e63m72QfhmjCIlfMWahCw4GdX9AEN+cUk9uDJG9salK1Uq1qGSUyZ2YB3ueV9H LC/ceF5dOfvoSlEnpnteGw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1774408533; x= 1774494933; bh=DqyD1nmuI4CO+ajah8zgzPQFwbofz24LcqyYZRAtY6U=; b=T +3weVNQytU60PP//pF4jjE/qIPoTVtKAIZkPbd2TfipI0SNlZ24C3yIDoKStp6nl wwHYqUZrPIu+HQFZDSswhkmcw2MiO1H1BX3f++CZqlvmqiEFfsdRvSWk1HgopV7G +Q8laCivQ3Xsp2spph4c+MPSIBA1Z3BlYkm6FhSWIcEOzhPrt5g/IU5GMXrz9ael P64JQqJWLnDNvf3j5e9Ek128xzsbrtRVhuqMCH84WxrA2sP/bp6SQSO6egWZ8YlD z3WJ3PswxpiKSO6+3flFtBKAo0IRYhbr2NmdTZqd/Dst8Ds4AixUUHmmx7vjvUEM eczRjd8wsbjzfzjB76XVA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgdefvdeffeeiucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhepfffhvfevuffkfhggtggugfgjsehtkefstddttdejnecuhfhrohhmpeetnhgurhgv shcuhfhrvghunhguuceorghnughrvghssegrnhgrrhgriigvlhdruggvqeenucggtffrrg htthgvrhhnpedtleelvdfgjedvffeiueekfeeuleffhfegfffhgfffkeevueehieehhfei gffhvdenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpe grnhgurhgvshesrghnrghrrgiivghlrdguvgdpnhgspghrtghpthhtohephedpmhhouggv pehsmhhtphhouhhtpdhrtghpthhtohepughilhhiphgsrghlrghuthesghhmrghilhdrtg homhdprhgtphhtthhopehthhhomhgrshdrmhhunhhrohesghhmrghilhdrtghomhdprhgt phhtthhopehpohhsthhgrhgvshesjhgvlhhtvghfrdhnlhdprhgtphhtthhopehnohgrhh eslhgvrggusghorghtrdgtohhmpdhrtghpthhtohepphhgshhqlhdqhhgrtghkvghrshes phhoshhtghhrvghsqhhlrdhorhhg X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 24 Mar 2026 23:15:33 -0400 (EDT) Date: Tue, 24 Mar 2026 23:15:32 -0400 From: Andres Freund To: pgsql-hackers@postgresql.org Cc: Dilip Kumar , Jelte Fennema-Nio , Thomas Munro , Noah Misch Subject: Re: Test timings are increasing too fast for cfbot Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2026-03-24 17:30:46 -0400, Andres Freund wrote: > Due to the number of times we run the main regression tests, they have a > particularly large effect on test resources. > > time cycles syscalls > 17 37.13 239.2T 1.573M > 18 44.27 295.7T 1.715M > master 48.19 323.5T 1.854M A surprisingly large source of this is psql internal queries. I first ran the regression tests with pg_stat_statements activated [1] and noticed that a lot of the queries, by exec and or plan time, are psql queries. I patched psql to add "-- psql internal\n" to every internal query and then used log_min_duration_statement=0 to see how much of the time is spent in psql - ~13% of the reported duration of all queries. A lot of that is because describeTbleDetails() ends up with a sequential scan: regression[1020428][1]=# EXPLAIN ANALYZE SELECT c.oid, n.nspname, c.relname FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relname OPERATOR(pg_catalog.~) '^(pg_class)$' COLLATE pg_catalog.default AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY 2, 3; ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Sort (cost=324.79..324.80 rows=1 width=132) (actual time=7.192..7.194 rows=1.00 loops=1) │ │ Sort Key: n.nspname, c.relname │ │ Sort Method: quicksort Memory: 25kB │ │ Buffers: shared hit=234 │ │ -> Nested Loop Left Join (cost=0.00..324.78 rows=1 width=132) (actual time=2.489..7.121 rows=1.00 loops=1) │ │ Join Filter: (n.oid = c.relnamespace) │ │ Buffers: shared hit=231 │ │ -> Seq Scan on pg_class c (cost=0.00..313.79 rows=1 width=72) (actual time=2.473..7.105 rows=1.00 loops=1) │ │ Filter: ((relname ~ '^(pg_class)$'::text) AND pg_table_is_visible(oid)) │ │ Rows Removed by Filter: 2260 │ │ Buffers: shared hit=229 │ │ -> Seq Scan on pg_namespace n (cost=0.00..8.22 rows=222 width=68) (actual time=0.011..0.011 rows=1.00 loops=1) │ │ Buffers: shared hit=2 │ │ Planning: │ │ Buffers: shared hit=6 │ │ Planning Time: 0.520 ms │ │ Execution Time: 7.265 ms │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (17 rows) It seems decidedly not optimal that "\d tablename", without any patterns, ends up doing a seqscan. That's bad enough in the regression database, but there are many PG instances with many many entries in pg_class. I don't think this was always the case? If I remove the COLLATE pg_catalog.default, a sane plan is chosen. That's obviously not the right fix, but seemed interesting enough to mention. That alone seems makes a surprising 4.5% improvement in an unoptimized -O0 build. Another psql query that stands out is \d's query for publications. Not because of runtime, but because of the planning cost. Nuking that (obviously not correct), is worth another 3% of test time (and some test failues). I've not analyzed the psql queries any further, but I'd bet that the rest also do add up. The most expensive query in the entire regression tests is: DO $$ BEGIN FOR r IN 1..1350 LOOP DELETE FROM dedup_unique_test_table; INSERT INTO dedup_unique_test_table SELECT 1; END LOOP; END$$; which makes sense, as that's pretty clearly O(N^2). There are a surprisingly large number of temporary file creations in the regression tests. A lot of them due to shared file sets in the context of parallelism. I wonder whether that is a partial cause for the slowness on windows, which has very slow metadata operations and slower data caching. The slowest test is stats_ext.sql - Not surprising, it does sequential scans of tables with ~1000-10000 rows over and over again. I don't see why it has to do that with as many rows as it does. Another thing we spend a decent amount of time, distributed over many places, is FROM generate_series(x, y), which ends up slow due to nodeFunctionscan.c always using a tuplestore. Code wise, the most immediately noticeable things are 1) MEMORY_CONTEXT_CHECKING 2) AssertCheckRanges() (only in the brin test, but there a very large portion of the runtime) 3) verify_compact_attribute(), pretty spread around Greetings, Andres Freund [1] which annoyingly generates a bunch of test failures due to "Query Identifier" getting added to tests