Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1slTIJ-007yqY-Qt for pgsql-admin@arkaria.postgresql.org; Tue, 03 Sep 2024 13:10:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1slTII-00BcQc-Hq for pgsql-admin@arkaria.postgresql.org; Tue, 03 Sep 2024 13:10:02 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1slTII-00BcQT-6E for pgsql-admin@lists.postgresql.org; Tue, 03 Sep 2024 13:10:02 +0000 Received: from forward102b.mail.yandex.net ([2a02:6b8:c02:900:1:45:d181:d102]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1slTIE-000S8f-Dm for pgsql-admin@lists.postgresql.org; Tue, 03 Sep 2024 13:10:01 +0000 Received: from mail-nwsmtp-smtp-production-main-19.sas.yp-c.yandex.net (mail-nwsmtp-smtp-production-main-19.sas.yp-c.yandex.net [IPv6:2a02:6b8:c24:3b2:0:640:ff71:0]) by forward102b.mail.yandex.net (Yandex) with ESMTPS id 38F9B60910 for ; Tue, 3 Sep 2024 16:09:54 +0300 (MSK) Received: by mail-nwsmtp-smtp-production-main-19.sas.yp-c.yandex.net (smtp/Yandex) with ESMTPSA id p9bfHT9HcSw0-o49GT8xM; Tue, 03 Sep 2024 16:09:53 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tantorlabs.com; s=mail; t=1725368993; bh=2qAGB8UTJXbIUS6B0Ggl2pCM1C9q0orwZkCzf8tBx+Y=; h=Subject:To:Message-ID:Date:From; b=JR4NqDBac15inf/HdIT3tQIuXBS/c60d4f8Nf2An6ldpiTK+MGS6wbukGLYDMY0X1 WSKsWAhzlPFeAflQdEzsnv/C/ewI8USPKcQW3EJ4564wNLA1sKtJIs2jlHotdnn2NO GWiB2b+khnt1hJzAxKzkOEjVYqlAN+lWrD/GMot0= Authentication-Results: mail-nwsmtp-smtp-production-main-19.sas.yp-c.yandex.net; dkim=pass header.i=@tantorlabs.com Content-Type: multipart/alternative; boundary="------------KH07GEyuWZsuVuq0tBuzU48j" Message-ID: <0251f755-00bb-41c7-8f76-14a3604f6d18@tantorlabs.com> Date: Tue, 3 Sep 2024 16:09:50 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: pgsql-admin@lists.postgresql.org From: Ilia Evdokimov Subject: Duplicate Extended Statistics List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------KH07GEyuWZsuVuq0tBuzU48j Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Hello everyone, I have a question regarding extended statistics in PostgreSQL. Why is it possible to create duplicate extended statistics? To make it clearer, here’s an example: CREATE TABLE t(a int, b int); INSERT INTO t(a, b) VALUES (...); CREATE STATISTICS ON a, b FROM t; ANALYZE t; .... CREATE STATISTICS ON a, b FROM t; ANALYZE t; After executing these queries, the following issues might arise: 1. ANALYZE will take longer to run because, for example, MCV extended statistics would need to be gathered twice. 2. Duplicate information will be stored. 3. The planner might take longer to find the relevant statistics since it has to search through them in a loop. Or do duplicate extended statistics practically never occur in practice? Thanks in advance for your response. -- Regards, Ilia Evdokimov, Tantor Labs LCC. --------------KH07GEyuWZsuVuq0tBuzU48j Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

Hello everyone,

I have a question regarding extended statistics in PostgreSQL. Why is it possible to create duplicate extended statistics? To make it clearer, here’s an example:

CREATE TABLE t(a int, b int);
INSERT INTO t(a, b) VALUES (...);
CREATE STATISTICS ON a, b FROM t;
ANALYZE t;
....
CREATE STATISTICS ON a, b FROM t;
ANALYZE t;

After executing these queries, the following issues might arise:

  1. ANALYZE will take longer to run because, for example, MCV extended statistics would need to be gathered twice.
  2. Duplicate information will be stored.
  3. The planner might take longer to find the relevant statistics since it has to search through them in a loop.

Or do duplicate extended statistics practically never occur in practice?

Thanks in advance for your response.

-- 
Regards,
Ilia Evdokimov,
Tantor Labs LCC.
--------------KH07GEyuWZsuVuq0tBuzU48j--