Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qQs2c-001EvO-Sh for pgsql-admin@arkaria.postgresql.org; Tue, 01 Aug 2023 16:16:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qQs2a-0085IS-4n for pgsql-admin@arkaria.postgresql.org; Tue, 01 Aug 2023 16:16:09 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qQs2Z-0085IK-NG for pgsql-admin@lists.postgresql.org; Tue, 01 Aug 2023 16:16:09 +0000 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qQs2X-000EAM-2N for pgsql-admin@lists.postgresql.org; Tue, 01 Aug 2023 16:16:08 +0000 Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-3fb4146e8ceso54388265e9.0 for ; Tue, 01 Aug 2023 09:16:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cybertec-at.20221208.gappssmtp.com; s=20221208; t=1690906563; x=1691511363; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:to:from:subject:message-id:from:to:cc:subject:date :message-id:reply-to; bh=n1FI8Bsjve5+UOoeRjSkx2F+TkXLHY7DzeyCbAzZmp8=; b=1QZRebQiIhc5iNbFBxrtnZU9x97gImtr0poegik+T1rAmmsyvYupmntst4M6wW9wb+ Sg/29c9FC5di+ngEv5grk5l3IxL0NP0D4v0PPsai58OxEn5ITgWYSGCN29hi48av5B9j 8vNZt+aZwBWY3/FTlchsOO/Yp2F/CSjMaNyWpKHJvpweExcqTEbsSs7uMNM4LT0lwLDm VtbjpDjdl64i6UJaXjFjRPt4naX7lw8ABJ3GoM0Qo/IL9UZtR1cjRXvEuAWrdngNVet0 NPBvVfZO2O1LUp0CZvBEjVzfJxGUSVcgpjEqQXLWRw3w0eJtDgIpjSgqTwc2Ps5AuhRR vvfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690906563; x=1691511363; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:to:from:subject:message-id:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=n1FI8Bsjve5+UOoeRjSkx2F+TkXLHY7DzeyCbAzZmp8=; b=K7poDc5s3IIA+9p7It+UAUD3kpajyRaAjniRMIZxu4FRv406VhhgWYHAXyqfn/rZUh 5cH8yYlIxDkZL+KHqwnl26eOPaMvKtATEzfF0cQT3P24NqL4e2n1kluzN1IGT3VWgWxX oZYoWr2TTl6zT0FVKlzqmxS6sqcixrf7BSEW7FV0bv6hwXi6u9XS1ju1PQsu1AgnwOiF qwjpqK/vq1OGiJVMMjNPuAuHAsKdi5RZ/GrhnlT2EK73DxQpNFp/KrbA+2v/IOYpvwBK yCyo3x33FFt0G3wt+soxHCh8Vb0B+f3SFUlQoQN0U0sSRgIgXUwyjdOYGd43HNwwPUVP VTsg== X-Gm-Message-State: ABy/qLbKJFuY53xB8nIIzneJJZdx0BLbJt2xJyr7KjAAbPJSqOL08kt/ JrronGVWWjXM5px3P+eFtGDgbw== X-Google-Smtp-Source: APBJJlFNiTkBx7EXmcpjeBLk2OmZYlVuqOnxhE19KTryX8hTpUZbdo0oGBa7Oivtyki5ON3IymbUXA== X-Received: by 2002:a5d:4d11:0:b0:313:edaa:24fd with SMTP id z17-20020a5d4d11000000b00313edaa24fdmr2343647wrt.21.1690906562868; Tue, 01 Aug 2023 09:16:02 -0700 (PDT) Received: from localhost.localdomain ([2001:871:5e:cf26:d659:7164:252e:5616]) by smtp.gmail.com with ESMTPSA id c18-20020a5d4f12000000b0030647449730sm16574714wru.74.2023.08.01.09.16.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Aug 2023 09:16:02 -0700 (PDT) Message-ID: Subject: Re: analyze foreign tables From: Laurenz Albe To: richard coleman , Pgsql-admin Date: Tue, 01 Aug 2023 18:16:01 +0200 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.48.4 (3.48.4-1.fc38) MIME-Version: 1.0 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote: > In PostgreSQL foreign tables are not automatically analyzed and analyze m= ust be > specifically called on each table.=C2=A0 In the case of sharing tables be= tween > PostgreSQL clusters, there is the=C2=A0use_remote_estimate foreign server= option. >=20 > In some of our multi terabyte databases, manually running analyze on all = of > the foreign tables can take more than a day.=C2=A0 This is per database c= ontaining > the foreign schema.=C2=A0 Since we have certain large schema that we have= centrally > located and share to all of our other database clusters, this really adds= up. >=20 > use_remote_estimate isn't really a solution as it adds way too much overh= ead > and processing time to every query run. >=20 > Since these tables are being continuously analyzed in the database that h= osts > the data, is there some way that they statistics could be easily passed t= hrough > the foreign server mechanism to the remote database that's calling the qu= ery? >=20 > Unless I'm missing something we can either: > 1. manually run analyze on each foreign table in each database that point= s to > the host table > 2. set use_remote_estimate =3D true which will cause PostgreSQL to re-obt= ain > statistics on a per query basis. >=20 > What I am hoping for is either: > 1. pass through the results of analyze from the source database to the on= e > where the foreign query is being run > 2. add the ability to automatically run analyze on foreign tables just as= they > are currently run on local tables. >=20 > Of the two, #1 would seem to be the easiest and least wasteful of resourc= es. Unfortunately, both your wishes don't look feasible: - Transferring table statistics would mean that PostgreSQL understands stat= istics from other server versions. This is complicated, and we have decided not= to do this for pg_upgrade, so I don't think we'll try to do it here. - Autoanalyzing foreign tables would mean that we have some idea how much d= ata has changed on the remote server. How should we do that? What I can imagine is that instead of reading the complete remote table dur= ing ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part. That could be = a workable enhancement. Yours, Laurenz Albe