Performance of information_schema with many schemata and tables

public inbox for [email protected]  
help / color / mirror / Atom feed

Performance of information_schema with many schemata and tables
4+ messages / 2 participants
[nested] [flat]

* Performance of information_schema with many schemata and tables
@ 2017-06-27 23:57 Ulf Lohbrügge <[email protected]>
  2017-06-28 01:31 ` Re: Performance of information_schema with many schemata and tables Pritam Baral <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Ulf Lohbrügge @ 2017-06-27 23:57 UTC (permalink / raw)
  To: pgsql-performance

Hi all,

we use schemata to separate our customers in a multi-tenant setup (9.5.7,
Debian stable). Each tenant is managed in his own schema with all the
tables that only he can access. All tables in all schemata are the same in
terms of their DDL: Every tenant uses e.g. his own table 'address'. We
currently manage around 1200 schemata (i.e. tenants) on one cluster. Every
schema consists currently of ~200 tables - so we end up with ~240000 tables
plus constraints, indexes, sequences et al.

Our current approach is quite nice in terms of data privacy because every
tenant is isolated from all other tenants. A tenant uses his own user that
gives him only access to the corresponding schema. Performance is great for
us - we didn't expect Postgres to scale so well!

But performance is pretty bad when we query things in the
information_schema:

SELECT
  *
FROM information_schema.tables
WHERE table_schema = 'foo'
AND table_name = 'bar';``

Above query results in a large sequence scan with a filter that removes
1305161 rows:

                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.70..101170.18 rows=3 width=265) (actual
time=383.505..383.505 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..101144.65 rows=3 width=141) (actual
time=383.504..383.504 rows=0 loops=1)
         Join Filter: (nc.oid = c.relnamespace)
         ->  Seq Scan on pg_class c  (cost=0.00..101023.01 rows=867
width=77) (actual time=383.502..383.502 rows=0 loops=1)
               Filter: ((relkind = ANY ('{r,v,f}'::"char"[])) AND
(((relname)::information_schema.sql_identifier)::text = 'bar'::text) AND
(pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT,
INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR
has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
               Rows Removed by Filter: 1305161
         ->  Materialize  (cost=0.00..56.62 rows=5 width=68) (never
executed)
               ->  Seq Scan on pg_namespace nc  (cost=0.00..56.60 rows=5
width=68) (never executed)
                     Filter: ((NOT pg_is_other_temp_schema(oid)) AND
(((nspname)::information_schema.sql_identifier)::text = 'foo'::text))
   ->  Nested Loop  (cost=0.70..8.43 rows=1 width=132) (never executed)
         ->  Index Scan using pg_type_oid_index on pg_type t
 (cost=0.42..8.12 rows=1 width=72) (never executed)
               Index Cond: (c.reloftype = oid)
         ->  Index Scan using pg_namespace_oid_index on pg_namespace nt
 (cost=0.28..0.30 rows=1 width=68) (never executed)
               Index Cond: (oid = t.typnamespace)
 Planning time: 0.624 ms
 Execution time: 383.784 ms
(16 rows)

We noticed the degraded performance first when using the psql cli. Pressing
tab after beginning a WHERE clause results in a query against the
information_schema which is pretty slow and ends in "lag" when trying to
enter queries.

We also use Flyway (https://flywaydb.org/) to handle our database
migrations. Unfortunately Flyway is querying the information_schema to
check if specific tables exist (I guess this is one of the reasons
information_schema exists) and therefore vastly slows down the migration of
our tenants. Our last migration run on all tenants (schemata) almost took
2h because the above query is executed multiple times per tenant. The
migration run consisted of multiple sql files to be executed and triggered
more than 10 queries on information_schema per tenant.

I don't think that Flyway is to blame because querying the
information_schema should be a fast operation (and was fast for us when we
had less schemata). I tried to speedup querying pg_class by adding indexes
(after enabling allow_system_table_mods) but didn't succeed. The function
call 'pg_has_role' is probably not easy to optimize.

Postgres is really doing a great job to handle those many schemata and
tables but doesn't scale well when querying information_schema. I actually
don't want to change my current multi-tenant setup (one schema per tenant)
as it is working great but the slow information_schema is killing our
deployments.

Are there any other options besides switching from
one-schema-per-tenant-approach? Any help is greatly appreciated!

Regards,
Ulf

^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Performance of information_schema with many schemata and tables
  2017-06-27 23:57 Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
@ 2017-06-28 01:31 ` Pritam Baral <[email protected]>
  2017-06-28 09:38   ` Re: Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
  2017-06-28 14:25   ` Re: Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
  0 siblings, 2 replies; 4+ messages in thread

From: Pritam Baral @ 2017-06-28 01:31 UTC (permalink / raw)
  To: pgsql-performance

On Wednesday 28 June 2017 05:27 AM, Ulf Lohbrügge wrote:
> Hi all,
>
> we use schemata to separate our customers in a multi-tenant setup (9.5.7, Debian stable). Each tenant is managed in his own schema with all the tables that only he can access. All tables in all schemata are the same in terms of their DDL: Every tenant uses e.g. his own table 'address'. We currently manage around 1200 schemata (i.e. tenants) on one cluster. Every schema consists currently of ~200 tables - so we end up with ~240000 tables plus constraints, indexes, sequences et al.
>
> Our current approach is quite nice in terms of data privacy because every tenant is isolated from all other tenants. A tenant uses his own user that gives him only access to the corresponding schema. Performance is great for us - we didn't expect Postgres to scale so well!
>
> But performance is pretty bad when we query things in the information_schema:
>
> SELECT
>   *
> FROM information_schema.tables
> WHERE table_schema = 'foo'
> AND table_name = 'bar';``
>
> Above query results in a large sequence scan with a filter that removes 1305161 rows:
>
>                                                                                                                                                                             QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=0.70..101170.18 rows=3 width=265) (actual time=383.505..383.505 rows=0 loops=1)
>    ->  Nested Loop  (cost=0.00..101144.65 rows=3 width=141) (actual time=383.504..383.504 rows=0 loops=1)
>          Join Filter: (nc.oid = c.relnamespace)
>          ->  Seq Scan on pg_class c  (cost=0.00..101023.01 rows=867 width=77) (actual time=383.502..383.502 rows=0 loops=1)
>                Filter: ((relkind = ANY ('{r,v,f}'::"char"[])) AND (((relname)::information_schema.sql_identifier)::text = 'bar'::text) AND (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
>                Rows Removed by Filter: 1305161
>          ->  Materialize  (cost=0.00..56.62 rows=5 width=68) (never executed)
>                ->  Seq Scan on pg_namespace nc  (cost=0.00..56.60 rows=5 width=68) (never executed)
>                      Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text = 'foo'::text))
>    ->  Nested Loop  (cost=0.70..8.43 rows=1 width=132) (never executed)
>          ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.42..8.12 rows=1 width=72) (never executed)
>                Index Cond: (c.reloftype = oid)
>          ->  Index Scan using pg_namespace_oid_index on pg_namespace nt  (cost=0.28..0.30 rows=1 width=68) (never executed)
>                Index Cond: (oid = t.typnamespace)
>  Planning time: 0.624 ms
>  Execution time: 383.784 ms
> (16 rows)
>
> We noticed the degraded performance first when using the psql cli. Pressing tab after beginning a WHERE clause results in a query against the information_schema which is pretty slow and ends in "lag" when trying to enter queries.
>
> We also use Flyway (https://flywaydb.org/) to handle our database migrations. Unfortunately Flyway is querying the information_schema to check if specific tables exist (I guess this is one of the reasons information_schema exists) and therefore vastly slows down the migration of our tenants. Our last migration run on all tenants (schemata) almost took 2h because the above query is executed multiple times per tenant. The migration run consisted of multiple sql files to be executed and triggered more than 10 queries on information_schema per tenant.
>
> I don't think that Flyway is to blame because querying the information_schema should be a fast operation (and was fast for us when we had less schemata). I tried to speedup querying pg_class by adding indexes (after enabling allow_system_table_mods) but didn't succeed. The function call 'pg_has_role' is probably not easy to optimize.
>
> Postgres is really doing a great job to handle those many schemata and tables but doesn't scale well when querying information_schema. I actually don't want to change my current multi-tenant setup (one schema per tenant) as it is working great but the slow information_schema is killing our deployments.
>
> Are there any other options besides switching from one-schema-per-tenant-approach? Any help is greatly appreciated!

Have you tried a `REINDEX SYSTEM <dbname>`?

>
> Regards,
> Ulf

-- 
#!/usr/bin/env regards
Chhatoi Pritam Baral



-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Performance of information_schema with many schemata and tables
  2017-06-27 23:57 Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
  2017-06-28 01:31 ` Re: Performance of information_schema with many schemata and tables Pritam Baral <[email protected]>
@ 2017-06-28 09:38   ` Ulf Lohbrügge <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Ulf Lohbrügge @ 2017-06-28 09:38 UTC (permalink / raw)
  To: pgsql-performance

Nope, I didn't try that yet. But I don't have the impression that
reindexing the indexes in information_schema will help. The table
information_schema.tables consists of the following indexes:

    "pg_class_oid_index" UNIQUE, btree (oid)
    "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace)
    "pg_class_tblspc_relfilenode_index" btree (reltablespace, relfilenode)

The costly sequence scan in question on pg_class happens with the following
WHERE clause:

WHERE (c.relkind = ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"])) AND
NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner,
'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE,
DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR
has_any_column_privilege(c.oid,
'SELECT, INSERT, UPDATE, REFERENCES'::text));

Besides pg_class_oid_index none of the referenced columns is indexed. I
tried to add an index on relowner but didn't succeed because the column is
used in the function call pg_has_role and the query is still forced to do a
sequence scan.

Regards,
Ulf

2017-06-28 3:31 GMT+02:00 Pritam Baral <[email protected]>:

> On Wednesday 28 June 2017 05:27 AM, Ulf Lohbrügge wrote:
> > Hi all,
> >
> > we use schemata to separate our customers in a multi-tenant setup
> (9.5.7, Debian stable). Each tenant is managed in his own schema with all
> the tables that only he can access. All tables in all schemata are the same
> in terms of their DDL: Every tenant uses e.g. his own table 'address'. We
> currently manage around 1200 schemata (i.e. tenants) on one cluster. Every
> schema consists currently of ~200 tables - so we end up with ~240000 tables
> plus constraints, indexes, sequences et al.
> >
> > Our current approach is quite nice in terms of data privacy because
> every tenant is isolated from all other tenants. A tenant uses his own user
> that gives him only access to the corresponding schema. Performance is
> great for us - we didn't expect Postgres to scale so well!
> >
> > But performance is pretty bad when we query things in the
> information_schema:
> >
> > SELECT
> >   *
> > FROM information_schema.tables
> > WHERE table_schema = 'foo'
> > AND table_name = 'bar';``
> >
> > Above query results in a large sequence scan with a filter that removes
> 1305161 rows:
> >
> >
>
>                        QUERY PLAN
> > ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> -------------------------------------------------------
> >  Nested Loop Left Join  (cost=0.70..101170.18 rows=3 width=265) (actual
> time=383.505..383.505 rows=0 loops=1)
> >    ->  Nested Loop  (cost=0.00..101144.65 rows=3 width=141) (actual
> time=383.504..383.504 rows=0 loops=1)
> >          Join Filter: (nc.oid = c.relnamespace)
> >          ->  Seq Scan on pg_class c  (cost=0.00..101023.01 rows=867
> width=77) (actual time=383.502..383.502 rows=0 loops=1)
> >                Filter: ((relkind = ANY ('{r,v,f}'::"char"[])) AND
> (((relname)::information_schema.sql_identifier)::text = 'bar'::text) AND
> (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT,
> INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR
> has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
> >                Rows Removed by Filter: 1305161
> >          ->  Materialize  (cost=0.00..56.62 rows=5 width=68) (never
> executed)
> >                ->  Seq Scan on pg_namespace nc  (cost=0.00..56.60 rows=5
> width=68) (never executed)
> >                      Filter: ((NOT pg_is_other_temp_schema(oid)) AND
> (((nspname)::information_schema.sql_identifier)::text = 'foo'::text))
> >    ->  Nested Loop  (cost=0.70..8.43 rows=1 width=132) (never executed)
> >          ->  Index Scan using pg_type_oid_index on pg_type t
> (cost=0.42..8.12 rows=1 width=72) (never executed)
> >                Index Cond: (c.reloftype = oid)
> >          ->  Index Scan using pg_namespace_oid_index on pg_namespace nt
> (cost=0.28..0.30 rows=1 width=68) (never executed)
> >                Index Cond: (oid = t.typnamespace)
> >  Planning time: 0.624 ms
> >  Execution time: 383.784 ms
> > (16 rows)
> >
> > We noticed the degraded performance first when using the psql cli.
> Pressing tab after beginning a WHERE clause results in a query against the
> information_schema which is pretty slow and ends in "lag" when trying to
> enter queries.
> >
> > We also use Flyway (https://flywaydb.org/) to handle our database
> migrations. Unfortunately Flyway is querying the information_schema to
> check if specific tables exist (I guess this is one of the reasons
> information_schema exists) and therefore vastly slows down the migration of
> our tenants. Our last migration run on all tenants (schemata) almost took
> 2h because the above query is executed multiple times per tenant. The
> migration run consisted of multiple sql files to be executed and triggered
> more than 10 queries on information_schema per tenant.
> >
> > I don't think that Flyway is to blame because querying the
> information_schema should be a fast operation (and was fast for us when we
> had less schemata). I tried to speedup querying pg_class by adding indexes
> (after enabling allow_system_table_mods) but didn't succeed. The function
> call 'pg_has_role' is probably not easy to optimize.
> >
> > Postgres is really doing a great job to handle those many schemata and
> tables but doesn't scale well when querying information_schema. I actually
> don't want to change my current multi-tenant setup (one schema per tenant)
> as it is working great but the slow information_schema is killing our
> deployments.
> >
> > Are there any other options besides switching from one-schema-per-tenant-approach?
> Any help is greatly appreciated!
>
> Have you tried a `REINDEX SYSTEM <dbname>`?
>
> >
> > Regards,
> > Ulf
>
> --
> #!/usr/bin/env regards
> Chhatoi Pritam Baral
>
>
>
> --
> Sent via pgsql-performance mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Performance of information_schema with many schemata and tables
  2017-06-27 23:57 Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
  2017-06-28 01:31 ` Re: Performance of information_schema with many schemata and tables Pritam Baral <[email protected]>
@ 2017-06-28 14:25   ` Ulf Lohbrügge <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Ulf Lohbrügge @ 2017-06-28 14:25 UTC (permalink / raw)
  To: pgsql-performance

2017-06-28 10:43 GMT+02:00 Pritam Baral <[email protected]>:

>
>
> On Wednesday 28 June 2017 02:00 PM, Ulf Lohbrügge wrote:
> > Nope, I didn't try that yet. But I don't have the impression that
> reindexing the indexes in information_schema will help. The table
> information_schema.tables consists of the following indexes:
> >
> >     "pg_class_oid_index" UNIQUE, btree (oid)
> >     "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace)
> >     "pg_class_tblspc_relfilenode_index" btree (reltablespace,
> relfilenode)
>
> information_schema.tables is not a table, it's a view; at least on 9.5[0].
> These indexes you list are actually indexes on the pg_catalog.pg_class
> table.
>

Yes, it's a view. \d+ information_schema.tables gives:

View definition:
 SELECT current_database()::information_schema.sql_identifier AS
table_catalog,
    nc.nspname::information_schema.sql_identifier AS table_schema,
    c.relname::information_schema.sql_identifier AS table_name,
        CASE
            WHEN nc.oid = pg_my_temp_schema() THEN 'LOCAL TEMPORARY'::text
            WHEN c.relkind = 'r'::"char" THEN 'BASE TABLE'::text
            WHEN c.relkind = 'v'::"char" THEN 'VIEW'::text
            WHEN c.relkind = 'f'::"char" THEN 'FOREIGN TABLE'::text
            ELSE NULL::text
        END::information_schema.character_data AS table_type,
    NULL::character varying::information_schema.sql_identifier AS
self_referencing_column_name,
    NULL::character varying::information_schema.character_data AS
reference_generation,
        CASE
            WHEN t.typname IS NOT NULL THEN current_database()
            ELSE NULL::name
        END::information_schema.sql_identifier AS user_defined_type_catalog,
    nt.nspname::information_schema.sql_identifier AS
user_defined_type_schema,
    t.typname::information_schema.sql_identifier AS user_defined_type_name,
        CASE
            WHEN c.relkind = 'r'::"char" OR (c.relkind = ANY
(ARRAY['v'::"char", 'f'::"char"])) AND
(pg_relation_is_updatable(c.oid::regclass, false) & 8) = 8 THEN 'YES'::text
            ELSE 'NO'::text
        END::information_schema.yes_or_no AS is_insertable_into,
        CASE
            WHEN t.typname IS NOT NULL THEN 'YES'::text
            ELSE 'NO'::text
        END::information_schema.yes_or_no AS is_typed,
    NULL::character varying::information_schema.character_data AS
commit_action
   FROM pg_namespace nc
     JOIN pg_class c ON nc.oid = c.relnamespace
     LEFT JOIN (pg_type t
     JOIN pg_namespace nt ON t.typnamespace = nt.oid) ON c.reloftype = t.oid
  WHERE (c.relkind = ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"]))
AND NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner,
'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE,
DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR
has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE,
REFERENCES'::text));


>
> >
> > The costly sequence scan in question on pg_class happens with the
> following WHERE clause:
> >
> > WHERE (c.relkind = ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"]))
> AND NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner,
> 'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE,
> DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(c.oid,
> 'SELECT, INSERT, UPDATE, REFERENCES'::text));
>
> This is not the bottleneck WHERE clause the query plan from your first
> mail shows. That one is:
>
> ((relkind = ANY ('{r,v,f}'::"char"[])) AND (((relname)::information_
> schema.sql_identifier)::text = 'bar'::text) AND (pg_has_role(relowner,
> 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE,
> TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid,
> 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
>

The part you copied is from the EXPLAIN ANALYZE output. The WHERE clause I
posted earlier (or see view definition) above does unfortunately not
contain the relname.


>
> I can say with certainty that an index on pg_catalog.pg_class.relname is
> going to speed this up. Postgres doesn't allow modifying system catalogs,
> but the `REINDEX SYSTEM <dbname>;` command should rebuild the system
> indexes and pg_catalog.pg_class.relname should be included in them (I
> tested on 9.6).
>
> Do try that once. If you still see sequential scans, check what indexes
> are present on pg_catalog.pg_class.
>

I just fired a 'REINDEX SYSTEM <dbname>;' but the output of EXPLAIN ANALYZE
is unchanged and the query duration did not change.

Best Regards,
Ulf


>
>
> >
> > Besides pg_class_oid_index none of the referenced columns is indexed. I
> tried to add an index on relowner but didn't succeed because the column is
> used in the function call pg_has_role and the query is still forced to do a
> sequence scan.
> >
> > Regards,
> > Ulf
> >
> > 2017-06-28 3:31 GMT+02:00 Pritam Baral <[email protected] <mailto:
> [email protected]>>:
> >
> >     On Wednesday 28 June 2017 05:27 AM, Ulf Lohbrügge wrote:
> >     > Hi all,
> >     >
> >     > we use schemata to separate our customers in a multi-tenant setup
> (9.5.7, Debian stable). Each tenant is managed in his own schema with all
> the tables that only he can access. All tables in all schemata are the same
> in terms of their DDL: Every tenant uses e.g. his own table 'address'. We
> currently manage around 1200 schemata (i.e. tenants) on one cluster. Every
> schema consists currently of ~200 tables - so we end up with ~240000 tables
> plus constraints, indexes, sequences et al.
> >     >
> >     > Our current approach is quite nice in terms of data privacy
> because every tenant is isolated from all other tenants. A tenant uses his
> own user that gives him only access to the corresponding schema.
> Performance is great for us - we didn't expect Postgres to scale so well!
> >     >
> >     > But performance is pretty bad when we query things in the
> information_schema:
> >     >
> >     > SELECT
> >     >   *
> >     > FROM information_schema.tables
> >     > WHERE table_schema = 'foo'
> >     > AND table_name = 'bar';``
> >     >
> >     > Above query results in a large sequence scan with a filter that
> removes 1305161 rows:
> >     >
> >     >
>
>                              QUERY PLAN
> >     > ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------------------------------------
> -------------------------------------------------------
> >     >  Nested Loop Left Join  (cost=0.70..101170.18 rows=3 width=265)
> (actual time=383.505..383.505 rows=0 loops=1)
> >     >    ->  Nested Loop  (cost=0.00..101144.65 rows=3 width=141)
> (actual time=383.504..383.504 rows=0 loops=1)
> >     >          Join Filter: (nc.oid = c.relnamespace)
> >     >          ->  Seq Scan on pg_class c  (cost=0.00..101023.01
> rows=867 width=77) (actual time=383.502..383.502 rows=0 loops=1)
> >     >                Filter: ((relkind = ANY ('{r,v,f}'::"char"[])) AND
> (((relname)::information_schema.sql_identifier)::text = 'bar'::text) AND
> (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT,
> INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR
> has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)))
> >     >                Rows Removed by Filter: 1305161
> >     >          ->  Materialize  (cost=0.00..56.62 rows=5 width=68)
> (never executed)
> >     >                ->  Seq Scan on pg_namespace nc  (cost=0.00..56.60
> rows=5 width=68) (never executed)
> >     >                      Filter: ((NOT pg_is_other_temp_schema(oid))
> AND (((nspname)::information_schema.sql_identifier)::text = 'foo'::text))
> >     >    ->  Nested Loop  (cost=0.70..8.43 rows=1 width=132) (never
> executed)
> >     >          ->  Index Scan using pg_type_oid_index on pg_type t
> (cost=0.42..8.12 rows=1 width=72) (never executed)
> >     >                Index Cond: (c.reloftype = oid)
> >     >          ->  Index Scan using pg_namespace_oid_index on
> pg_namespace nt  (cost=0.28..0.30 rows=1 width=68) (never executed)
> >     >                Index Cond: (oid = t.typnamespace)
> >     >  Planning time: 0.624 ms
> >     >  Execution time: 383.784 ms
> >     > (16 rows)
> >     >
> >     > We noticed the degraded performance first when using the psql cli.
> Pressing tab after beginning a WHERE clause results in a query against the
> information_schema which is pretty slow and ends in "lag" when trying to
> enter queries.
> >     >
> >     > We also use Flyway (https://flywaydb.org/) to handle our database
> migrations. Unfortunately Flyway is querying the information_schema to
> check if specific tables exist (I guess this is one of the reasons
> information_schema exists) and therefore vastly slows down the migration of
> our tenants. Our last migration run on all tenants (schemata) almost took
> 2h because the above query is executed multiple times per tenant. The
> migration run consisted of multiple sql files to be executed and triggered
> more than 10 queries on information_schema per tenant.
> >     >
> >     > I don't think that Flyway is to blame because querying the
> information_schema should be a fast operation (and was fast for us when we
> had less schemata). I tried to speedup querying pg_class by adding indexes
> (after enabling allow_system_table_mods) but didn't succeed. The function
> call 'pg_has_role' is probably not easy to optimize.
> >     >
> >     > Postgres is really doing a great job to handle those many schemata
> and tables but doesn't scale well when querying information_schema. I
> actually don't want to change my current multi-tenant setup (one schema per
> tenant) as it is working great but the slow information_schema is killing
> our deployments.
> >     >
> >     > Are there any other options besides switching from
> one-schema-per-tenant-approach? Any help is greatly appreciated!
> >
> >     Have you tried a `REINDEX SYSTEM <dbname>`?
> >
> >     >
> >     > Regards,
> >     > Ulf
> >
> >     --
> >     #!/usr/bin/env regards
> >     Chhatoi Pritam Baral
> >
> >
> >
> >     --
> >     Sent via pgsql-performance mailing list (
> [email protected] <mailto:[email protected]
> >)
> >     To make changes to your subscription:
> >     http://www.postgresql.org/mailpref/pgsql-performance <
> http://www.postgresql.org/mailpref/pgsql-performance;
> >
> >
>
> [0]: https://www.postgresql.org/docs/9.5/static/infoschema-tables.html
>
> --
> #!/usr/bin/env regards
> Chhatoi Pritam Baral
>
>


^ permalink  raw  reply  [nested|flat] 4+ messages in thread

end of thread, other threads:[~2017-06-28 14:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2017-06-27 23:57 Performance of information_schema with many schemata and tables Ulf Lohbrügge <[email protected]>
2017-06-28 01:31 ` Pritam Baral <[email protected]>
2017-06-28 09:38   ` Ulf Lohbrügge <[email protected]>
2017-06-28 14:25   ` Ulf Lohbrügge <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox