Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQrND-001OiV-0S for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 18:14:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vQrNC-00FGwT-0K for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 18:14:42 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQrNB-00FGwH-2b for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 18:14:42 +0000 Received: from mail-ed1-x52a.google.com ([2a00:1450:4864:20::52a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vQrN9-002xkY-2u for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 18:14:41 +0000 Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-640c1fda178so10132a12.1 for ; Wed, 03 Dec 2025 10:14:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1764785678; x=1765390478; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=j5nI4V+r7aJ0Gxn1iji8utAgHDqTJ6O3BOOgryKWk8g=; b=DEn0IjAjlpqYj4XV4Mb1XrceLBIxJ8vxv/a5ygvkBJ1u9+3oCR4pT03NwyOCC4AAbs 9HRr+tXtr3/Zd1F+WgNW0DOIyX2zX7wnzXKVm8SeFFQ3VSlU2C7lwQSkJzONhO6rYjsy FBrVXbU9bNN8sF4e7RrUzrexD3PTyKffRI2IQfkc/0rwQpsMZxCABs6ud8LIKtodINWf n++yKDvKE/byFodEejSc5QzR66OXe3NcFWlNyvpINaaM5UrKJHDg9XREMWyIERIaSRRz VZoeicuuQEVQFUTKbVSbqzr5ZPqHkmMHfVvF3TJ+MQBuLBSZOUE68AK5cbK5P5d9QBwL SCvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764785678; x=1765390478; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=j5nI4V+r7aJ0Gxn1iji8utAgHDqTJ6O3BOOgryKWk8g=; b=lIpLzVY1/DO/aVc3Qt9EIyr5M81gnDSIE206OPpVNNq1QahcGhwMANf5F2kWF8kKXr TarKGk+KhXWVEb47i6+qIE9cDqrALrofWtY9BBpr0fHoHp2iTmfr9kembw0tARqczBm0 nbZ7OFYkYwc2znm8getX/Y9O0Vu+wYYy3Bag/vUxai/SkW3tTpqyJhbeeaJ9krDQ9hUC bahuSZeMcmz3y2k2SoyS4uI7WjxYAEcLI2rM3UjnNm3vtmDtcjIIyW1Vm/t7NPxswaIJ M21Gh+H3r5sLRoAWJjtCBX6VS7HKOBM/0/mqRWFYeyUvAoHQKUHr/zRBPQDRYyOXTXIY NOWg== X-Forwarded-Encrypted: i=1; AJvYcCWwkD6cIoh3WtGuKETQQXnKjv0lKll5qRs1HwFrPebVNf0rMDj1dRdgWiiYTD8cUfmgqmSGsKHH3yu4QOEF@lists.postgresql.org X-Gm-Message-State: AOJu0Ywo3k5KiBwdLBMazZwSkb6k3GXXiM55fGfeUgkpSXVPjvvyhMY9 rhKt7V+fp/356+kfYAI8bp/2kGWCm6BTxL3CJSBgTvIxuZyv+OaI1Gv6gLa/QkxfbRYZH47ZYsv yks9gb8HDZOzZP7vn0j1o8APbsVpawoM= X-Gm-Gg: ASbGncssjMYuFEXVyzrdGLmZvpDPVw9f9tPqAbOJfS1wZNeZRq1EiNwLvu+rHnMeTdq QXYICPkL4VTw/5cRSp+vVO/A6ThgZ+fiAVmufFZuiLzvCMAyJrRyz+10owYx1q06JmtAk2zsGvW wV1IIohA3BR+QyiGUzyW8UaZAUVmAEDszOPK4URVSjjJBbXnMFuwqWWpc/PgrGmkDKbRhUcbESJ h3PPSlCYe/rsVwcQcOWl78OqMlk6n2de5kcEAqxqg9uvi/q8ulI1hs/uxNlFuGFUfc= X-Google-Smtp-Source: AGHT+IGOV47A8Kw2IgDtXrGjTrgWEJUH4mRSHC4TCb4s1eOn88Fj5NyUWVGb8JkmcAfeSctWxj2TQfR1NIcs1Cotlb0= X-Received: by 2002:a05:6402:1d53:b0:640:9c99:bfac with SMTP id 4fb4d7f45d1cf-6479c47c53emr3173921a12.13.1764785677666; Wed, 03 Dec 2025 10:14:37 -0800 (PST) MIME-Version: 1.0 References: <28f47336-84e2-445e-8216-d1ce7d3ddc3e@iki.fi> In-Reply-To: From: Sami Imseih Date: Wed, 3 Dec 2025 12:14:25 -0600 X-Gm-Features: AWmQ_bkogL_FKaQcYSMpQFDVutZDZKqWpVAJlyb8x_9nIquXQRmOktei1WtbEik Message-ID: Subject: Re: Support loser tree for k-way merge To: cca5507 Cc: Heikki Linnakangas , pgsql-hackers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, Thanks for raising this. > Now we use 'heap' during the k-way merge, it's O(n log k). The 'loser tree' is also O(n log k), but > it's usually has fewer comparisons than the 'heap'. When the tuple comparator is complex, the > 'loser tree' can significantly speed up the k-way merge. I was playing around with this and I do also see the performance benefits that you mention with loser tree. ``` DROP TABLE IF EXISTS t; CREATE TABLE t ( col1 INT, col2 INT ); INSERT INTO t (col1, col2) SELECT (i % 10000000), (i % 100) FROM generate_series(1, 10000000) AS s(i); ``` Using something like the above, testing an "external merge" on the high cardinality column there is a bit of an improvement with loser tree, but there is a bit of a loss on low-cardinality. In general though, I see a bit of an issue with allowing a GUC to change strategies. What happens if there are multiple "external merge" nodes and they can each benefit from different strategies? Maybe that's not an issue since all the other enable_ GUCs have that problem, but it is something to consider. Can we drive the decision for what to do based on optimizer stats, i.e. n_distinct and row counts? Not sure what the calculation would be specifically, but something else to consider. We can still provide the GUC to override the optimizer decisions, but at least the optimizer, given up-to-date stats, may get it right most of the time. -- Sami Imseih Amazon Web Services (AWS)