Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dQ9SG-0005M7-SX for pgsql-performance@arkaria.postgresql.org; Wed, 28 Jun 2017 09:40:13 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dQ9SF-0007Of-J0 for pgsql-performance@arkaria.postgresql.org; Wed, 28 Jun 2017 09:40:11 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dQ9QU-0004JZ-6L for pgsql-performance@postgresql.org; Wed, 28 Jun 2017 09:38:22 +0000 Received: from mail-ua0-x22d.google.com ([2607:f8b0:400c:c08::22d]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dQ9QP-0000jf-2W for pgsql-performance@postgresql.org; Wed, 28 Jun 2017 09:38:21 +0000 Received: by mail-ua0-x22d.google.com with SMTP id w19so13287497uac.0 for ; Wed, 28 Jun 2017 02:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=e4JeNOCJH+BSCYVj1Q1hBOXlmXEU1dbrOL+fbhKrdYI=; b=EyaklL94HfRhTtKMN5Qh+si6E9mZTc/f1IedJVbGXq4s1+m9FGmWYKEeZElXGPXnPs /LzN0dMColhDj1rIxmmBwCkonvYkkTLsfrA18Ic1cafVORfuTcCAMvagQlzSf0VXpCZ0 GL3lDo9zNV/VDT6VAPmocsQcq4RANiecPyXwO+vPXw3RnfnyQcXOOSnzHtLvPdWmC7jn 0Jnw214/7xLnBiDdYCYpauZMwwDQA8dwUJgrfW5xDtiGqDQ7NpC+6+ZiCfldj3wRh9iG cfqqk0/5DTLmC4YSqT31LXQaYukwL1ohggaINEbLOz/aqnJAZjwcmRuQtISFO0ddsmqa n0IA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=e4JeNOCJH+BSCYVj1Q1hBOXlmXEU1dbrOL+fbhKrdYI=; b=ajmjpZjyFaOryPvx5U2TZmxRfNrqtJCOAMErl+4qJdwdZMcl7ASDpUcUE9OJvHEeEi D2RntgZ6s1QjzwwmuyDdoQyhmmjB+fHY/4sp/llmvJypgV4lTTmejNCIC9yoDRRHQn48 L4UoyBr8I5U/Qq8GLGYNFujZcbrnQ+ebBLeJeTuu3P4JvunvfqQqsko1s7QxsMWmTiz5 KzvK6z8gEBZDeH++9OGEORpStw1z7CgsKMqIMy0yEbfaeqGaRw2qq4x39R1NM3Ty91GX pKzL9UBwjpqytP2JQi6ZmLl/vgN4MTU1rEXHkxR6KsTQZQmrz9D1IonYvf+fGdrKAk0y quWg== X-Gm-Message-State: AKS2vOw5RiNP8OaB5DGKqOxgH1hb4GUqGflPnjBBhot7CAypYe/m3q0E G8bku9OZb/Qb/HszgACV3X5TcNbpXbS9IEc= X-Received: by 10.176.2.84 with SMTP id 78mr6012448uas.80.1498642694706; Wed, 28 Jun 2017 02:38:14 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.143.155 with HTTP; Wed, 28 Jun 2017 02:38:14 -0700 (PDT) In-Reply-To: References: From: =?UTF-8?Q?Ulf_Lohbr=C3=BCgge?= Date: Wed, 28 Jun 2017 11:38:14 +0200 Message-ID: Subject: Re: Performance of information_schema with many schemata and tables To: pgsql-performance@postgresql.org Content-Type: multipart/alternative; boundary="001a11376f00769a3a055301f1cc" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a11376f00769a3a055301f1cc Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Nope, I didn't try that yet. But I don't have the impression that reindexing the indexes in information_schema will help. The table information_schema.tables consists of the following indexes: "pg_class_oid_index" UNIQUE, btree (oid) "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace) "pg_class_tblspc_relfilenode_index" btree (reltablespace, relfilenode) The costly sequence scan in question on pg_class happens with the following WHERE clause: WHERE (c.relkind =3D ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"])) AN= D NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner, 'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)); Besides pg_class_oid_index none of the referenced columns is indexed. I tried to add an index on relowner but didn't succeed because the column is used in the function call pg_has_role and the query is still forced to do a sequence scan. Regards, Ulf 2017-06-28 3:31 GMT+02:00 Pritam Baral : > On Wednesday 28 June 2017 05:27 AM, Ulf Lohbr=C3=BCgge wrote: > > Hi all, > > > > we use schemata to separate our customers in a multi-tenant setup > (9.5.7, Debian stable). Each tenant is managed in his own schema with all > the tables that only he can access. All tables in all schemata are the sa= me > in terms of their DDL: Every tenant uses e.g. his own table 'address'. We > currently manage around 1200 schemata (i.e. tenants) on one cluster. Ever= y > schema consists currently of ~200 tables - so we end up with ~240000 tabl= es > plus constraints, indexes, sequences et al. > > > > Our current approach is quite nice in terms of data privacy because > every tenant is isolated from all other tenants. A tenant uses his own us= er > that gives him only access to the corresponding schema. Performance is > great for us - we didn't expect Postgres to scale so well! > > > > But performance is pretty bad when we query things in the > information_schema: > > > > SELECT > > * > > FROM information_schema.tables > > WHERE table_schema =3D 'foo' > > AND table_name =3D 'bar';`` > > > > Above query results in a large sequence scan with a filter that removes > 1305161 rows: > > > > > > QUERY PLAN > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------- > > Nested Loop Left Join (cost=3D0.70..101170.18 rows=3D3 width=3D265) (= actual > time=3D383.505..383.505 rows=3D0 loops=3D1) > > -> Nested Loop (cost=3D0.00..101144.65 rows=3D3 width=3D141) (actu= al > time=3D383.504..383.504 rows=3D0 loops=3D1) > > Join Filter: (nc.oid =3D c.relnamespace) > > -> Seq Scan on pg_class c (cost=3D0.00..101023.01 rows=3D867 > width=3D77) (actual time=3D383.502..383.502 rows=3D0 loops=3D1) > > Filter: ((relkind =3D ANY ('{r,v,f}'::"char"[])) AND > (((relname)::information_schema.sql_identifier)::text =3D 'bar'::text) AN= D > (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT= , > INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR > has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)= )) > > Rows Removed by Filter: 1305161 > > -> Materialize (cost=3D0.00..56.62 rows=3D5 width=3D68) (nev= er > executed) > > -> Seq Scan on pg_namespace nc (cost=3D0.00..56.60 row= s=3D5 > width=3D68) (never executed) > > Filter: ((NOT pg_is_other_temp_schema(oid)) AND > (((nspname)::information_schema.sql_identifier)::text =3D 'foo'::text)) > > -> Nested Loop (cost=3D0.70..8.43 rows=3D1 width=3D132) (never exe= cuted) > > -> Index Scan using pg_type_oid_index on pg_type t > (cost=3D0.42..8.12 rows=3D1 width=3D72) (never executed) > > Index Cond: (c.reloftype =3D oid) > > -> Index Scan using pg_namespace_oid_index on pg_namespace nt > (cost=3D0.28..0.30 rows=3D1 width=3D68) (never executed) > > Index Cond: (oid =3D t.typnamespace) > > Planning time: 0.624 ms > > Execution time: 383.784 ms > > (16 rows) > > > > We noticed the degraded performance first when using the psql cli. > Pressing tab after beginning a WHERE clause results in a query against th= e > information_schema which is pretty slow and ends in "lag" when trying to > enter queries. > > > > We also use Flyway (https://flywaydb.org/) to handle our database > migrations. Unfortunately Flyway is querying the information_schema to > check if specific tables exist (I guess this is one of the reasons > information_schema exists) and therefore vastly slows down the migration = of > our tenants. Our last migration run on all tenants (schemata) almost took > 2h because the above query is executed multiple times per tenant. The > migration run consisted of multiple sql files to be executed and triggere= d > more than 10 queries on information_schema per tenant. > > > > I don't think that Flyway is to blame because querying the > information_schema should be a fast operation (and was fast for us when w= e > had less schemata). I tried to speedup querying pg_class by adding indexe= s > (after enabling allow_system_table_mods) but didn't succeed. The function > call 'pg_has_role' is probably not easy to optimize. > > > > Postgres is really doing a great job to handle those many schemata and > tables but doesn't scale well when querying information_schema. I actuall= y > don't want to change my current multi-tenant setup (one schema per tenant= ) > as it is working great but the slow information_schema is killing our > deployments. > > > > Are there any other options besides switching from one-schema-per-tenan= t-approach? > Any help is greatly appreciated! > > Have you tried a `REINDEX SYSTEM `? > > > > > Regards, > > Ulf > > -- > #!/usr/bin/env regards > Chhatoi Pritam Baral > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org= ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > --001a11376f00769a3a055301f1cc Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Nope, I didn't try th= at yet. But I don't have the impression that reindexing the indexes in = information_schema will help. The table information_schema.tables consists = of the following indexes:

<= div style=3D"font-size:12.8px">
=C2=A0 =C2=A0 "pg_class_oid_index&= quot; UNIQUE, btree (oid)
=C2=A0 =C2=A0 "pg_class_relname_ns= p_index" UNIQUE, btree (relname, relnamespace)
=C2=A0 =C2=A0= "pg_class_tblspc_relfilenode_index" btree (reltablespace, r= elfilenode)

The costly sequence scan in question on pg_class happ= ens with the following WHERE clause:
<= br>
WHERE (c.relkind =3D ANY (ARRAY[&#= 39;r'::"char", 'v'::"char", 'f'::&q= uot;char"])) AND NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_= role(c.relowner, 'USAGE'::text) OR has_table_privilege(c.oid, '= SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) O= R has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFEREN= CES'::text));

Besides pg_class_oid_index none of the referenced= columns is indexed. I tried to add an index on relowner but didn't suc= ceed because the column is used in the function call pg_has_role and the qu= ery is still forced to do a sequence scan.

Regards,
Ulf

2017-06-28 3:31 GMT+02:00 Pritam Baral &= lt;pritam@prita= mbaral.com>:
On Wednesday 28 June 2017 05:27 AM, Ulf Lohbr=C3= =BCgge wrote:
> Hi all,
>
> we use schemata to separate our customers in a multi-tenant setup (9.5= .7, Debian stable). Each tenant is managed in his own schema with all the t= ables that only he can access. All tables in all schemata are the same in t= erms of their DDL: Every tenant uses e.g. his own table 'address'. = We currently manage around 1200 schemata (i.e. tenants) on one cluster. Eve= ry schema consists currently of ~200 tables - so we end up with ~240000 tab= les plus constraints, indexes, sequences et al.
>
> Our current approach is quite nice in terms of data privacy because ev= ery tenant is isolated from all other tenants. A tenant uses his own user t= hat gives him only access to the corresponding schema. Performance is great= for us - we didn't expect Postgres to scale so well!
>
> But performance is pretty bad when we query things in the information_= schema:
>
> SELECT
>=C2=A0 =C2=A0*
> FROM information_schema.tables
> WHERE table_schema =3D 'foo'
> AND table_name =3D 'bar';``
>
> Above query results in a large sequence scan with a filter that remove= s 1305161 rows:
>
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0QUERY PLAN
> ------------------------------------------------------------= -----------------------------------------------------------------= -----------------------------------------------------------------= -----------------------------------------------------------------= -----------------------------------------------------------------= -----------------------------------
>=C2=A0 Nested Loop Left Join=C2=A0 (cost=3D0.70..101170.18 rows=3D3 wid= th=3D265) (actual time=3D383.505..383.505 rows=3D0 loops=3D1)
>=C2=A0 =C2=A0 ->=C2=A0 Nested Loop=C2=A0 (cost=3D0.00..101144.65 row= s=3D3 width=3D141) (actual time=3D383.504..383.504 rows=3D0 loops=3D1)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Join Filter: (nc.oid =3D c.relnamesp= ace)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ->=C2=A0 Seq Scan on pg_class c= =C2=A0 (cost=3D0.00..101023.01 rows=3D867 width=3D77) (actual time=3D383.50= 2..383.502 rows=3D0 loops=3D1)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Filter: ((relki= nd =3D ANY ('{r,v,f}'::"char"[])) AND (((relname)::inform= ation_schema.sql_identifier)::text =3D 'bar'::text) AND (pg_ha= s_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SE= LECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR = has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES':= :text)))
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Rows Removed by= Filter: 1305161
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ->=C2=A0 Materialize=C2=A0 (cost= =3D0.00..56.62 rows=3D5 width=3D68) (never executed)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ->=C2=A0 Seq= Scan on pg_namespace nc=C2=A0 (cost=3D0.00..56.60 rows=3D5 width=3D68) (ne= ver executed)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::informa= tion_schema.sql_identifier)::text =3D 'foo'::text))
>=C2=A0 =C2=A0 ->=C2=A0 Nested Loop=C2=A0 (cost=3D0.70..8.43 rows=3D1= width=3D132) (never executed)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ->=C2=A0 Index Scan using pg_type= _oid_index on pg_type t=C2=A0 (cost=3D0.42..8.12 rows=3D1 width=3D72) (neve= r executed)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Index Cond: (c.= reloftype =3D oid)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ->=C2=A0 Index Scan using pg_name= space_oid_index on pg_namespace nt=C2=A0 (cost=3D0.28..0.30 rows=3D1 width= =3D68) (never executed)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Index Cond: (oi= d =3D t.typnamespace)
>=C2=A0 Planning time: 0.624 ms
>=C2=A0 Execution time: 383.784 ms
> (16 rows)
>
> We noticed the degraded performance first when using the psql cli. Pre= ssing tab after beginning a WHERE clause results in a query against the inf= ormation_schema which is pretty slow and ends in "lag" when tryin= g to enter queries.
>
> We also use Flyway (https://flywaydb.org/) to handle our database migr= ations. Unfortunately Flyway is querying the information_schema to check if= specific tables exist (I guess this is one of the reasons information_sche= ma exists) and therefore vastly slows down the migration of our tenants. Ou= r last migration run on all tenants (schemata) almost took 2h because the a= bove query is executed multiple times per tenant. The migration run consist= ed of multiple sql files to be executed and triggered more than 10 queries = on information_schema per tenant.
>
> I don't think that Flyway is to blame because querying the informa= tion_schema should be a fast operation (and was fast for us when we had les= s schemata). I tried to speedup querying pg_class by adding indexes (after = enabling allow_system_table_mods) but didn't succeed. The function call= 'pg_has_role' is probably not easy to optimize.
>
> Postgres is really doing a great job to handle those many schemata and= tables but doesn't scale well when querying information_schema. I actu= ally don't want to change my current multi-tenant setup (one schema per= tenant) as it is working great but the slow information_schema is killing = our deployments.
>
> Are there any other options besides switching from one-schema-per-tena= nt-approach? Any help is greatly appreciated!

Have you tried a `REINDEX SYSTEM <dbname>`?

>
> Regards,
> Ulf

--
#!/usr/bin/env regards
Chhatoi Pritam Baral



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-pe= rformance

--001a11376f00769a3a055301f1cc--