Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ucaZI-00EadG-VE for pgsql-general@arkaria.postgresql.org; Fri, 18 Jul 2025 02:11:25 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ucaZH-006rJb-2b for pgsql-general@arkaria.postgresql.org; Fri, 18 Jul 2025 02:11:23 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ucaZG-006rJR-NK for pgsql-general@lists.postgresql.org; Fri, 18 Jul 2025 02:11:23 +0000 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ucaZF-007rMZ-1C for pgsql-general@lists.postgresql.org; Fri, 18 Jul 2025 02:11:22 +0000 Received: by mail-pf1-x42a.google.com with SMTP id d2e1a72fcca58-74af4af04fdso1909092b3a.1 for ; Thu, 17 Jul 2025 19:11:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; t=1752804680; x=1753409480; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=DJgNO4m59v5t+qS8djd8XOhKTIVv32LuYlHZ9/+0Jro=; b=W5pVSBacS0UqggazVPpsV8TBc65O0sRWUplbIX10qLYZCNji+tRNC0yHXgDXM5Io2g wjxszTs8aHxoFI25QO9JoXgfpAFzRTXrHoUf69KIjbI13kW4LmABf6MBziQLHV4jIdXK 2nTLAyU5ZwXe3ae4t++8nOqGbVrLTbsrwZND97xQbfgeqY+4I2PlxemiQ77F0Jma1L1o kaG43K+SJcjwi+9n9VmsUgNbu2vKbQw8l9iSg2q/6I9l58SkzD2KWGIlqO7QJa5p1O3B 75mMv8dREWLmZnV3plmqWoNa3WP5FnBHQEHCpVXvMTYYGWHMQiSyBVdIenrd8wjwfzjH LbGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752804680; x=1753409480; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DJgNO4m59v5t+qS8djd8XOhKTIVv32LuYlHZ9/+0Jro=; b=WsV11h4YsaNiHZzZmi4nGZkyITLUS0Ws0IovZB8Up4bbT3hZOxeB49QIQ42F/7/Yy2 /l5SNxpkVlt3wxgMVayGaWTRyTJyTWdgx7oPND33JLXjTGgEjxOuPPt8ypiRopOWVekk emIyDpzQTHkQdECx+qajQ4bFqcL/1dDfrjZtLpGeRWckC+HBmfiCR0POB1vdyq+QDFmy 2Djeg/Y1ymiMSifUKWRaijLa3kyHoZ4K5RmY7x6VLPyF9kMMq4UZ+JU68fBZDeYXOA5V +mpS6Leqrb0JLfNYSQTcWBRepcMjvuu+CSPFA2zKYeFG88TArY2MrOYe3HzaqrFKpnbw VUcA== X-Gm-Message-State: AOJu0Yy2UfGqrRPmIICXIfhROjUQ4+trQ8/a72jQYHpfS/40JJKB38YU 1uWGl3dwxJkfwaPmyX5XlKmRHgve8ugHprY8zZtIb7z3kXPFooljOMm111m0Imvczu5GOussJbs yQcfiwh6XJuP6Ze8z9cz98Keds5sNLc7rlU9fJa1MkkUf/IqQsGL0tA== X-Gm-Gg: ASbGncvnz+6ghp3imK3SepBm7vGhXIqIMq2cmIAtZFrcqxEUZkOfO5rimu2uzVi2cr2 2ENhp/SiTHlRqbwqM/QPTbvNME4Q6NIw2jqYmzXx23fb+CM+rSiYQZNK0ArupJO/JeJG1CoCoEh 4Uzduj9kLZD1al5Im2qgRN1uqcBaQZNgILfmMMcTmKZvETpcWuElVFzQeQf+Yh5AymA8Hl1o9VQ c7VvQ== X-Google-Smtp-Source: AGHT+IFyiEtBxBryVVIR0CES+NXe/nsp2EyYNYb7eIfj+ry81DTt44JlMu5f6HYBVQjbhDLxTdAxnhSlwGjzydRb2Pk= X-Received: by 2002:a05:6a21:998b:b0:238:351a:f960 with SMTP id adf61e73a8af0-2390c84dd57mr8975675637.23.1752804679803; Thu, 17 Jul 2025 19:11:19 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Craig Ringer Date: Fri, 18 Jul 2025 14:10:43 +1200 X-Gm-Features: Ac12FXwxeO_O4kQ_isxPQzUvEGoBkKIUaubQcLllbH6rPm8YrW9tglS95cFcE90 Message-ID: Subject: Re: Should we document the cost of pg_database_size()? Alternatives? To: pgsql-general@lists.postgresql.org Cc: Gregory Bulloch , Shan Shan Zhao Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, 18 Jul 2025 at 12:54, Craig Ringer wrote: > [...] > I recently ran into an issue where a monitoring component was calling > pg_catalog.pg_database_size() at a ~15s interval on a large > schema-sharded database. It took so long that the query was timing out > before the next 15s scrape interval would start. Fast-follow on this, because I thought to do a local experiment on performance. I suspect that there's more at work with the reported issue than just the cost of statting all the files, because even if I create a 100,000 table db with at least 2 indexes per table (and all the auxiliary forks etc that will result in) I created 100,000 tables to fake up a DB that looks like a schema-sharded one, then tested timing on pg_database_size(). test=# select count(1) from pg_class; count ------- 40496 (1 row) and now test=# select pg_database_size(current_database()); pg_database_size ------------------ 660013571 (1 row) Time: 75.614 ms ... it still takes hardly any time at all. If I drop my disk cache on my workstation: echo 3 > /proc/sys/vm/drop_caches then repeating pg_database_size() only takes 254 ms the first time, then back to 74ms. In other words, even if the DB has many millions of tables, there's no way that pg_database_size() could reasonably take minutes to run and contribute meaningfully to a timeout of a monitoring or scrape process. Not unless there's something (or several somethings) else badly wrong on the DB instance - extreme inode cache thrashing, excessive block device read-ahead, unreasonably high block device I/O latencies, etc. I still think it's worth mentioning pg_database_size() needing to stat every file in the docs, but it's clear there's more going on in the particular case I'm seeing than that alone. I'll try to report back if I learn anything interesting that explains the rest of the performance issues. FYI the crude method used to create the schema since I couldn't be bothered scripting something sensible up was: turn fsync off (never do this unless you're happy to completely destroy all data in your postgres instance), set max_locks_per_transaction=1000 and restart Pg then DO LANGUAGE plpgsql $$ BEGIN FOR i IN 1..10000 LOOP EXECUTE format('CREATE TABLE %I(x serial primary key, y integer unique)', 'padding_table_b_'||i); EXECUTE format('INSERT INTO %I(y) VALUES (0)', 'padding_table_b_'||i); IF i % 100 = 0 THEN RAISE NOTICE 'up to: %', i; END IF; END LOOP; END; $$; -- Craig Ringer EnterpriseDB