Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t7uz0-0014Eu-3O for pgsql-general@arkaria.postgresql.org; Mon, 04 Nov 2024 11:10:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t7uxz-000LNA-6W for pgsql-general@arkaria.postgresql.org; Mon, 04 Nov 2024 11:09:51 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t7uxy-000LN2-RD for pgsql-general@lists.postgresql.org; Mon, 04 Nov 2024 11:09:51 +0000 Received: from mail-lf1-x134.google.com ([2a00:1450:4864:20::134]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t7uxw-0007Pg-VV for pgsql-general@postgresql.org; Mon, 04 Nov 2024 11:09:50 +0000 Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-53c78ebe580so4137944e87.1 for ; Mon, 04 Nov 2024 03:09:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1730718587; x=1731323387; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=USKBXdk/NgzyIxNMghKHb4u7o+60vkrL821HYs0PC1g=; b=dh6ikEjtJK0ke95z+UIgkJSchil+vBlKTf9bBqzyLSjBl8oN+j/0lxA7x0JcpxNAL8 Q3i13ffyLaiog6lQTRcNYw5Rc94ebkQcC/tD1QZJty1g0qc5klUVYS2/p4tYtQpwNlM0 xQ/XJ/lPjEZuZKNB5JGPtWhFeMbUWLsC3YpILev5lhHxKzAzRWa/x+W9z3L/SeG26F2l gTRhkIWsqJIrNmMfZg9fByfYvQs8ah95QYAVV/Tku/ub31MOEBZAZfEBPy3hXbUIxtIR gfRzaUZLBzT5NKdvYKCFxWJ49s9wBK2g75PdRIEDlCQfL9HUZOX/W4DsZbZr2NKPnLAH 6Njg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730718587; x=1731323387; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=USKBXdk/NgzyIxNMghKHb4u7o+60vkrL821HYs0PC1g=; b=kdL8ZOgcCubTQaNwWt/1Czr7dJgbMF4J8DK5KUETabInTZMFZ5UoF9A6JrRE4LEH5o V8rMw4SvO5dmtna7LMBdXtH+tEI83P0bReowGKiufKf7FH4txBuwkbGTOdT5uGA5Vgtp oJM8tQtj+WMqkC4SLOf0mYd8CfNXFx8fKRTbHmPtqQoDmVkT89P+0AasOUBZ4/9BL9g2 FB0O7BYclrJK4miNo+B2aJcdwfvJk02DTY9zjw/XUUG/EJmD/8FahqxvC1/cBuNKC0uP YPqJzfl9xBSxr3wa3SOpAHEJNrZxTeKtwZCIQOxUcg9hGN1WBepKJZtvnPywG/ZV03ux YcPA== X-Gm-Message-State: AOJu0YxT5gguYr3veO6zMGOJ9PmlWdpVbHwQNo7Zo6wQK/EXRv6/+Iax 316TOwBMB3RZpsZetxQLfQKQ/ewkLEhtz98GjhVXNbHUowXacWx0PhRWJOSc7VJLOLUpILEu17B vzYthnL8hRNLLm58KBaYawSHUzmM= X-Google-Smtp-Source: AGHT+IGYHogz0u492nEjbefd+QZ2BsG1ky6807lng8e6L4BvCq3xfSdzHEp1ZMh5mHLqJ/2dSyfKqkblGyqCquibc9g= X-Received: by 2002:ac2:5fb0:0:b0:539:e4ad:8bab with SMTP id 2adb3069b0e04-53c7bbee91emr4195129e87.16.1730718586823; Mon, 04 Nov 2024 03:09:46 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: David Rowley Date: Tue, 5 Nov 2024 00:09:35 +1300 Message-ID: Subject: Re: Why not do distinct before SetOp To: ma lz Cc: "pgsql-general@postgresql.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, 4 Nov 2024 at 22:52, ma lz wrote: > > some sql like ' select a from t1 intersect select a from t1 ' > > if t1 has large number rows but has few distinct rows > > select distinct a from t1 intersect select distinct a from t1; =E2=80= =94 this is faster than origin sql > > can postgres do this optimize during plan-queries? No, the planner does not attempt that optimisation. INTERSECT really isn't very well optimised. If we did want to improve this area, I think the first thing we'd want to do is use standard join types rather than HashSetOp Intersect to implement INTERSECT (without ALL). To do that efficiently, we'd need to do a bit more work on the standard join types to have them efficiently support IS NOT DISTINCT FROM clauses as the join keys. There's a fair bit of work to do and it's likely not been done as INTERSECT isn't used that commonly. There was a bit of work done in PG17 to teach the query planner some new tricks around UNION. I think UNION is a much more commonly used setop than INTERSECT, so you might have to wait a while. For now, it's best to adjust your query. David