Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dQ0Oc-00066z-EF for pgsql-performance@arkaria.postgresql.org; Tue, 27 Jun 2017 23:59:50 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dQ0Ob-0001Bk-RZ for pgsql-performance@arkaria.postgresql.org; Tue, 27 Jun 2017 23:59:49 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dQ0Mo-0006Tv-6S for pgsql-performance@postgresql.org; Tue, 27 Jun 2017 23:57:58 +0000 Received: from mail-ua0-x22a.google.com ([2607:f8b0:400c:c08::22a]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dQ0Mf-0006Kd-3R for pgsql-performance@postgresql.org; Tue, 27 Jun 2017 23:57:57 +0000 Received: by mail-ua0-x22a.google.com with SMTP id w19so7472225uac.0 for ; Tue, 27 Jun 2017 16:57:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=nbmy4D86ED+oePzEP4ajrinixsQdjkM9Hc655PBYxj4=; b=DEyKcjqi0MMAhNoGVp/yueLhrgLvl10baHx/zklm4UEYREZ6vsqOexCX76pftYV3jm pEgAco9ez5ziC6Jc5KX6X9DjJHQRJwBf/wOw/IsLoP1VOW6xJRe6CdmlHlgNN4Uouare K4JPhoDM4iEQicvNFTGahc673uGOM2kqFR6tQgDdBX5/V9mfvlzcjm6jLF/FXqGxaX5G tI4zCU9tQc0r9tRM14DDH3Lf15wj1Q2Mus+kRSWAdRqfFvE/orY6ixPXyGHPQDzH8OiJ 34stYz6TwwSXc/1DSaiUn356Ohhjd0drbcyjcfOgkcm3mwlZHJcx05fCAY47hVhWzHIL 8JpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=nbmy4D86ED+oePzEP4ajrinixsQdjkM9Hc655PBYxj4=; b=i8zBW2rym4nJKs8JzazO93mFQkwXw0pnHtmOR+HPrv7WpbsE6Vv0L/4cDFDI1808CH O5Q3uqhNjvXSEqFAm/3ghAa6rzbxMg3k2yqy9bNeZPGD1s7/c4t/aZJnMf+UK9lkxX7w ZHkKUQ0KwmBiDG2iyiJeNAfvh+jh8t0rD1DasmdR/mF3T1vSRuoZqnbkkfUjAkiORy4L kupT4b2hh7H0By2CbpN51ktYYppdKhvJ7E4x2TXXmYDSPzmv/DIFGzh4FtdvdcvcAp5L e0Byoy+R1cunef/Z6+8ch85VE/jfUyOdANH0XtqfDr2m87wkZfEDBBJvgfJoM4jmVp/x WCaA== X-Gm-Message-State: AKS2vOwseN5rBB80+DtSjVV7cjTlY18X5Lf7dotmRLczwZAhFnWBAX61 RchYhXvPpfzRRIboVzSGytfrY6xO7nd8aqX/Tw== X-Received: by 10.176.95.69 with SMTP id z5mr4738930uah.113.1498607866990; Tue, 27 Jun 2017 16:57:46 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.143.155 with HTTP; Tue, 27 Jun 2017 16:57:46 -0700 (PDT) From: =?UTF-8?Q?Ulf_Lohbr=C3=BCgge?= Date: Wed, 28 Jun 2017 01:57:46 +0200 Message-ID: Subject: Performance of information_schema with many schemata and tables To: pgsql-performance@postgresql.org Content-Type: multipart/alternative; boundary="f403045fdfa491dd020552f9d525" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --f403045fdfa491dd020552f9d525 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi all, we use schemata to separate our customers in a multi-tenant setup (9.5.7, Debian stable). Each tenant is managed in his own schema with all the tables that only he can access. All tables in all schemata are the same in terms of their DDL: Every tenant uses e.g. his own table 'address'. We currently manage around 1200 schemata (i.e. tenants) on one cluster. Every schema consists currently of ~200 tables - so we end up with ~240000 tables plus constraints, indexes, sequences et al. Our current approach is quite nice in terms of data privacy because every tenant is isolated from all other tenants. A tenant uses his own user that gives him only access to the corresponding schema. Performance is great for us - we didn't expect Postgres to scale so well! But performance is pretty bad when we query things in the information_schema: SELECT * FROM information_schema.tables WHERE table_schema =3D 'foo' AND table_name =3D 'bar';`` Above query results in a large sequence scan with a filter that removes 1305161 rows: QUERY PLAN ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ------------------------------------------------------- Nested Loop Left Join (cost=3D0.70..101170.18 rows=3D3 width=3D265) (actu= al time=3D383.505..383.505 rows=3D0 loops=3D1) -> Nested Loop (cost=3D0.00..101144.65 rows=3D3 width=3D141) (actual time=3D383.504..383.504 rows=3D0 loops=3D1) Join Filter: (nc.oid =3D c.relnamespace) -> Seq Scan on pg_class c (cost=3D0.00..101023.01 rows=3D867 width=3D77) (actual time=3D383.502..383.502 rows=3D0 loops=3D1) Filter: ((relkind =3D ANY ('{r,v,f}'::"char"[])) AND (((relname)::information_schema.sql_identifier)::text =3D 'bar'::text) AND (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text))) Rows Removed by Filter: 1305161 -> Materialize (cost=3D0.00..56.62 rows=3D5 width=3D68) (never executed) -> Seq Scan on pg_namespace nc (cost=3D0.00..56.60 rows=3D= 5 width=3D68) (never executed) Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text =3D 'foo'::text)) -> Nested Loop (cost=3D0.70..8.43 rows=3D1 width=3D132) (never execute= d) -> Index Scan using pg_type_oid_index on pg_type t (cost=3D0.42..8.12 rows=3D1 width=3D72) (never executed) Index Cond: (c.reloftype =3D oid) -> Index Scan using pg_namespace_oid_index on pg_namespace nt (cost=3D0.28..0.30 rows=3D1 width=3D68) (never executed) Index Cond: (oid =3D t.typnamespace) Planning time: 0.624 ms Execution time: 383.784 ms (16 rows) We noticed the degraded performance first when using the psql cli. Pressing tab after beginning a WHERE clause results in a query against the information_schema which is pretty slow and ends in "lag" when trying to enter queries. We also use Flyway (https://flywaydb.org/) to handle our database migrations. Unfortunately Flyway is querying the information_schema to check if specific tables exist (I guess this is one of the reasons information_schema exists) and therefore vastly slows down the migration of our tenants. Our last migration run on all tenants (schemata) almost took 2h because the above query is executed multiple times per tenant. The migration run consisted of multiple sql files to be executed and triggered more than 10 queries on information_schema per tenant. I don't think that Flyway is to blame because querying the information_schema should be a fast operation (and was fast for us when we had less schemata). I tried to speedup querying pg_class by adding indexes (after enabling allow_system_table_mods) but didn't succeed. The function call 'pg_has_role' is probably not easy to optimize. Postgres is really doing a great job to handle those many schemata and tables but doesn't scale well when querying information_schema. I actually don't want to change my current multi-tenant setup (one schema per tenant) as it is working great but the slow information_schema is killing our deployments. Are there any other options besides switching from one-schema-per-tenant-approach? Any help is greatly appreciated! Regards, Ulf --f403045fdfa491dd020552f9d525 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi all,

we use schemata to s= eparate our customers in a multi-tenant setup (9.5.7, Debian stable). Each = tenant is managed in his own schema with all the tables that only he can ac= cess. All tables in all schemata are the same in terms of their DDL: Every = tenant uses e.g. his own table 'address'. We currently manage aroun= d 1200 schemata (i.e. tenants) on one cluster. Every schema consists curren= tly of ~200 tables - so we end up with ~240000 tables plus constraints, ind= exes, sequences et al.

Our current approach is qui= te nice in terms of data privacy because every tenant is isolated from all = other tenants. A tenant uses his own user that gives him only access to the= corresponding schema. Performance is great for us - we didn't expect P= ostgres to scale so well!

But performance is prett= y bad when we query things in the information_schema:

<= div>SELECT
=C2=A0 *
FROM information_schema.tables
WHERE table_schema =3D 'foo'
AND table_name =3D = 9;bar';``

Above query results in a large seque= nce scan with a filter that removes 1305161 rows:

= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 QUERY PLAN
----------------------------------------= ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ---------------------------------------------------------------------------= ---------------
=C2=A0Nested Loop Left Join =C2=A0(cost=3D0.70..1= 01170.18 rows=3D3 width=3D265) (actual time=3D383.505..383.505 rows=3D0 loo= ps=3D1)
=C2=A0 =C2=A0-> =C2=A0Nested Loop =C2=A0(cost=3D0.00..= 101144.65 rows=3D3 width=3D141) (actual time=3D383.504..383.504 rows=3D0 lo= ops=3D1)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Join Filter: (nc.oid = =3D c.relnamespace)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2= =A0Seq Scan on pg_class c =C2=A0(cost=3D0.00..101023.01 rows=3D867 width=3D= 77) (actual time=3D383.502..383.502 rows=3D0 loops=3D1)
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Filter: ((relkind =3D ANY (= '{r,v,f}'::"char"[])) AND (((relname)::information_schema= .sql_identifier)::text =3D 'bar'::text) AND (pg_has_role(relowner, = 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPD= ATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_pr= ivilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Rows Removed by Fi= lter: 1305161
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Mater= ialize =C2=A0(cost=3D0.00..56.62 rows=3D5 width=3D68) (never executed)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Se= q Scan on pg_namespace nc =C2=A0(cost=3D0.00..56.60 rows=3D5 width=3D68) (n= ever executed)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0Filter: ((NOT pg_is_other_temp_schema(oid)) AND = (((nspname)::information_schema.sql_identifier)::text =3D 'foo'::te= xt))
=C2=A0 =C2=A0-> =C2=A0Nested Loop =C2=A0(cost=3D0.70..8.4= 3 rows=3D1 width=3D132) (never executed)
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0-> =C2=A0Index Scan using pg_type_oid_index on pg_type t =C2= =A0(cost=3D0.42..8.12 rows=3D1 width=3D72) (never executed)
=C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Index Cond: (c.reloftyp= e =3D oid)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Index Sc= an using pg_namespace_oid_index on pg_namespace nt =C2=A0(cost=3D0.28..0.30= rows=3D1 width=3D68) (never executed)
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Index Cond: (oid =3D t.typnamespace)
=C2=A0Planning time: 0.624 ms
=C2=A0Execution time: 383.784 ms=
(16 rows)

We noticed the degraded perfo= rmance first when using the psql cli. Pressing tab after beginning a WHERE = clause results in a query against the information_schema which is pretty sl= ow and ends in "lag" when trying to enter queries.

=
We also use Flyway (https://fl= ywaydb.org/) to handle our database migrations. Unfortunately Flyway is= querying the information_schema to check if specific tables exist (I guess= this is one of the reasons information_schema exists) and therefore vastly= slows down the migration of our tenants. Our last migration run on all ten= ants (schemata) almost took 2h because the above query is executed multiple= times per tenant. The migration run consisted of multiple sql files to be = executed and triggered more than 10 queries on information_schema per tenan= t.

I don't think that Flyway is to blame becau= se querying the information_schema should be a fast operation (and was fast= for us when we had less schemata). I tried to speedup querying pg_class by= adding indexes (after enabling allow_system_table_mods) but didn't suc= ceed. The function call 'pg_has_role' is probably not easy to optim= ize.

Postgres is really doing a great job to handl= e those many schemata and tables but doesn't scale well when querying i= nformation_schema. I actually don't want to change my current multi-ten= ant setup (one schema per tenant) as it is working great but the slow infor= mation_schema is killing our deployments.

Are ther= e any other options besides switching from one-schema-per-tenant-approach? = Any help is greatly appreciated!

Regards,
Ulf --f403045fdfa491dd020552f9d525--