Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sbZ8j-00HTms-PC for pgsql-general@arkaria.postgresql.org; Wed, 07 Aug 2024 05:23:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sbZ8h-004edA-JG for pgsql-general@arkaria.postgresql.org; Wed, 07 Aug 2024 05:23:11 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sbZ8h-004ebg-5X for pgsql-general@lists.postgresql.org; Wed, 07 Aug 2024 05:23:11 +0000 Received: from mail-lj1-x22a.google.com ([2a00:1450:4864:20::22a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sbZ8e-003Zot-UG for pgsql-general@lists.postgresql.org; Wed, 07 Aug 2024 05:23:10 +0000 Received: by mail-lj1-x22a.google.com with SMTP id 38308e7fff4ca-2eeb1ba0481so19075231fa.2 for ; Tue, 06 Aug 2024 22:23:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723008187; x=1723612987; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=eq7Y9AlhOm4lUPCAb7pT7IFwSdp3WWD6bQ7/h3Fwrao=; b=BlBTYgw9vSyCa+n9Vf6Sy6NoffIkmrM0VAJOUjOPN2cxV8oVp0R4huUiMmQzZQuFjE BND/oBYDTlZ2HFsqrbKrGnXX0Q0Gn8k6yFia+ZcxktY2WrYys7qp7X/OpwcnIMxYPCpw QrpizkaSmZ45VtNBl4ay2RhKzIEIrzJztbJc4r6Q6EidN0lCx9zQ8Kt0zJxtDQRrhuPY ZwuB6C3PaS8jwz0/lvnW3shhZrJSIPdC6kJv3N0hhZesXLHfuaJjI5yJgKFomeyil+kX MGCzBxptCMmn6DwDfcZeDwetw/844tX+tSuydyxTvd7wXAm33pbCgTYHCMH9k5pGut1r 7iTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723008187; x=1723612987; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eq7Y9AlhOm4lUPCAb7pT7IFwSdp3WWD6bQ7/h3Fwrao=; b=fFiax67bpNayOrEmOZZOVdHFiVh+EXWsnIoMF/IpAXTo57KRfx++rAcVOKDeAi82oK NJ8ffA9I1xwqW9PLOEP4M1Dna/mQzY/aYmJR81DBTxYHizMT+X7KDQZjyPdqn+Sl8fSW 4XgvfOYEddyoKVq2IEFfZbrodI2GGkoLm6w4cKkVLkezzHXqIaUz6TAqseUtrpaNvmHJ vqt9mQvMGRLsOUtVZlnTAfiJSGEGvdJnOIvYfZGwAh0YNzyf52wvijlOZR4NA6PylU8S lNOMmQ7m//08PlpN1XBDwvhTleZPTjJ9Hoa8bNIU389PcWNLtE13KF1I/nh2pyevdvL+ nKfQ== X-Gm-Message-State: AOJu0YzvYOUh0jVnSxDu2AStP2EcZ313Etk4NfBrOcS1ThVWGA6rNCgU t0iSm+tCGMRFh1+KvxCHlnX3LYOh2SGVd2QtGaB+MdiwH4xlI+mvoJBsfM+wHzf78GjH1lTBE7J MKg97+wTHVuPztR+/buSFVWbmVCs= X-Google-Smtp-Source: AGHT+IHzQz7HjcKevUJbXzQCG790Kb5gQ5Hg8mlXxEJJVlvMnvzrQSKyRc1aZa6Ap+YTSp8Rjl3Vv9IzMq7kjDiP5t4= X-Received: by 2002:a2e:b602:0:b0:2ef:2df7:3d9a with SMTP id 38308e7fff4ca-2f15aa84e17mr109737941fa.9.1723008186297; Tue, 06 Aug 2024 22:23:06 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: David Rowley Date: Wed, 7 Aug 2024 17:22:51 +1200 Message-ID: Subject: Re: ANALYZE on partitioned tables vs on individual partitions To: Michael Harris Cc: pgsql-general@lists.postgresql.org, Christophe Pettus Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, 7 Aug 2024 at 16:44, Christophe Pettus wrote: > Child partitions should be autovacuumed and autoanalyzed just like any ot= her table; they are not prohibited from autovacuum in any way by default. = It's probably a good idea to investigate why they are not being picked up b= y autovacuum. If they are created by a bulk load process, it's not a bad i= dea to do a VACUUM ANALYZE on them once the bulk load is complete. I think the complaint was about no autovacuum on the partitioned table, not the partitions. This is expected as we don't track the counters (in particular n_mod_since_analyze) shown in pg_stat_all_tables at the partitioned table level, so the trigger points that normally cause autovacuum to analyze or vacuum a table just won't be triggered for a partitioned table. For VACUUM, that's fine as, as you mentioned, no rows are stored. But for analyze, that does present a problem. To name the aspects of planning that rely on statistics of the partitioned table, basically anything above the Append or MergeAppend which joins the partitioned results together. So that doesn't include the scans of each partition and any quals that are pushed down to the scan level as those are able to use the partition level statistics. However, it does include things like joins, group by, distinct as those require n_distinct estimates for the partitioned table. It's not all bad though as the row estimates for each individual partition will be totalled up through the Append / MergeAppend simply by adding up the row estimates for each Append / MergeAppend child plan. So, it's really only an estimation problem for any node that comes after a join node or a group by node as the output rows for those nodes will depend on a good n_distinct estimate for the partitioned table. Partition-wise joins and aggregates do change things a bit as those features do permit moving those operations below the Append / Merge Append, in which case the statistics for the individual partition can be used. You could consider manually setting the n_distinct_inherited estimates for the columns that you join on or group by in the partitioned table. You might find that you're able to choose a suitable value for that if you review the documentation for that setting. In particular, please review what is mentioned about using negative numbers for that setting. You may be able to choose a value that scales correctly with the row estimate that doesn't get outdated as you add more rows to the partitions. You'll need to determine that based on the data you're storing. David