From: Bossart, Nathan <[email protected]>
To: Magnus Hagander <[email protected]>
To: Michael Paquier <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Mark Dilger <[email protected]>
Cc: Don Seiler <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Estimating HugePages Requirements?
Date: Fri, 27 Aug 2021 18:16:01 +0000
Message-ID: <[email protected]>
In-Reply-To: <CABUevEwGHcsmp4rgZdcbinZOdEs_cSCabihDvRty=0zz1H95kw@mail.gmail.com>
References: <CAHJZqBBLHFNs6it-fcJ6LEUXeC5t73soR3h50zUSFpg7894qfQ@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CABUevEwGHcsmp4rgZdcbinZOdEs_cSCabihDvRty=0zz1H95kw@mail.gmail.com>

On 8/27/21, 7:41 AM, "Magnus Hagander" <[email protected]> wrote:
> On Fri, Aug 27, 2021 at 8:46 AM Michael Paquier <[email protected]> wrote:
>> On Wed, Aug 11, 2021 at 11:23:52PM +0000, Bossart, Nathan wrote:
>> > While testing this new option, I noticed that you can achieve similar
>> > results today with the following command, although this one will
>> > actually try to create the shared memory, too.
>>
>> That may not be the best option.
>
> I would say that can be a disastrous option.
>
> First of all it would probably not work if you already have something
> running -- especially when using huge pages. And if it does work, in
> that or other scenarios, it can potentially have significant impact on
> a running cluster to suddenly allocate many GB of more memory...

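The v3 patch actually didn't work if the server was already running.
I removed that restriction in v4: --output-shmem no longer creates the
data directory lock file, and it only calculates the shared memory size
instead of creating the segment.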

>> > IMO the new option is still handy, but I can see the argument that it
>> > might not be necessary.
>>
>> A separate option looks handy.  Wouldn't it be better to document it
>> in postgres-ref.sgml then?
>
> I'd say a lot more than just handy. I don't think the workaround is
> really all that useful.

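I added some documentation in v4.  The option is now described in
postgres-ref.sgml, and the huge pages example in runtime.sgml uses it
in place of the pmap-based approach.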
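
As a rough illustration of the arithmetic in the updated huge pages
example, here is a minimal sketch of the rounding step.  The helper name
huge_pages_needed() is hypothetical and is not part of the patch; it
only assumes you already have the byte count printed by --output-shmem
and the huge page size in bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns how many huge pages
 * are needed to hold shmem_bytes, rounding up to a whole page.
 */
uint64_t
huge_pages_needed(uint64_t shmem_bytes, uint64_t huge_page_bytes)
{
	/* A zero page size would make the division meaningless. */
	assert(huge_page_bytes > 0);

	/* Integer ceiling division without risking overflow on the sum. */
	return shmem_bytes / huge_page_bytes +
		(shmem_bytes % huge_page_bytes != 0 ? 1 : 0);
}
```

With the numbers from the documentation example,
huge_pages_needed(6646198272, 2097152) gives 3170, which matches the
"at least 3170 huge pages" figure in runtime.sgml.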

Nathan  



Attachments:

  [application/octet-stream] v4-0001-introduce-option-for-retreiving-shmem-size.patch (13.5K, 2-v4-0001-introduce-option-for-retreiving-shmem-size.patch)
From dff1d2b57a65ada412386b3dce2f1cf6d8409cac Mon Sep 17 00:00:00 2001
From: Nathan Bossart <[email protected]>
Date: Fri, 27 Aug 2021 17:54:26 +0000
Subject: [PATCH v4 1/1] introduce option for retrieving shmem size

---
 doc/src/sgml/ref/postgres-ref.sgml |  17 +++++
 doc/src/sgml/runtime.sgml          |  17 ++---
 src/backend/bootstrap/bootstrap.c  |  25 +++++--
 src/backend/main/main.c            |   6 +-
 src/backend/storage/ipc/ipci.c     | 142 ++++++++++++++++++++++---------------
 src/include/bootstrap/bootstrap.h  |   2 +-
 src/include/storage/ipc.h          |   1 +
 7 files changed, 132 insertions(+), 78 deletions(-)

diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml
index 4aaa7abe1a..da20fb559a 100644
--- a/doc/src/sgml/ref/postgres-ref.sgml
+++ b/doc/src/sgml/ref/postgres-ref.sgml
@@ -280,6 +280,23 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--output-shmem</option></term>
+      <listitem>
+       <para>
+        Prints the amount of shared memory required for the current
+        configuration and exits.  This can be used on a running server.
+        This must be the first argument on the command line.
+       </para>
+
+       <para>
+        This option is useful for determining the number of huge pages needed
+        for the server.  For more information, see
+        <xref linkend="linux-huge-pages"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-p <replaceable class="parameter">port</replaceable></option></term>
       <listitem>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index f1cbc1d9e9..535eacadb2 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1442,17 +1442,14 @@ export PG_OOM_ADJUST_VALUE=0
     with <varname>CONFIG_HUGETLBFS=y</varname> and
     <varname>CONFIG_HUGETLB_PAGE=y</varname>. You will also have to configure
     the operating system to provide enough huge pages of the desired size.
-    To estimate the number of huge pages needed, start
-    <productname>PostgreSQL</productname> without huge pages enabled and check
-    the postmaster's anonymous shared memory segment size, as well as the
-    system's default and supported huge page sizes, using the
-    <filename>/proc</filename> and <filename>/sys</filename> file systems.
+    To estimate the number of huge pages needed, use the
+    <command>postgres</command> command to determine the amount of shared memory
+    needed, and use the <filename>/proc</filename> and <filename>/sys</filename>
+    file systems to find the system's default and supported huge page sizes.
     This might look like:
 <programlisting>
-$ <userinput>head -1 $PGDATA/postmaster.pid</userinput>
-4170
-$ <userinput>pmap 4170 | awk '/rw-s/ &amp;&amp; /zero/ {print $2}'</userinput>
-6490428K
+$ <userinput>postgres --output-shmem -D $PGDATA</userinput>
+6646198272 bytes
 $ <userinput>grep ^Hugepagesize /proc/meminfo</userinput>
 Hugepagesize:       2048 kB
 $ <userinput>ls /sys/kernel/mm/hugepages</userinput>
@@ -1463,7 +1460,7 @@ hugepages-1048576kB  hugepages-2048kB
      either 2MB or 1GB with <xref linkend="guc-huge-page-size"/>.
 
      Assuming <literal>2MB</literal> huge pages,
-     <literal>6490428</literal> / <literal>2048</literal> gives approximately
+     <literal>6646198272</literal> / <literal>2097152</literal> gives approximately
      <literal>3169.154</literal>, so in this example we need at
      least <literal>3170</literal> huge pages.  A larger setting would be
      appropriate if other programs on the machine also need huge pages.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 48615c0ebc..5bb491da8b 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -198,9 +198,13 @@ CheckerModeMain(void)
  *	 the current configuration, particularly the passed in options pertaining
  *	 to shared memory sizing, options work (or at least do not cause an error
  *	 up to shared memory creation).
+ *
+ *	 When output_shmem is true, startup is done only far enough to calculate the
+ *	 amount of shared memory required for the current configuration.  The result
+ *	 of this calculation is printed.
  */
 void
-BootstrapModeMain(int argc, char *argv[], bool check_only)
+BootstrapModeMain(int argc, char *argv[], bool check_only, bool output_shmem)
 {
 	int			i;
 	char	   *progname = argv[0];
@@ -214,10 +218,11 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	/* Set defaults, to be overridden by explicit options below */
 	InitializeGUCOptions();
 
-	/* an initial --boot or --check should be present */
+	/* an initial --boot, --check, or --output-shmem should be present */
 	Assert(argc > 1
 		   && (strcmp(argv[1], "--boot") == 0
-			   || strcmp(argv[1], "--check") == 0));
+			   || strcmp(argv[1], "--check") == 0
+			   || strcmp(argv[1], "--output-shmem") == 0));
 	argv++;
 	argc--;
 
@@ -317,21 +322,29 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 	checkDataDir();
 	ChangeToDataDir();
 
-	CreateDataDirLockFile(false);
+	/*
+	 * We do not create the lock file when --output-shmem is specified so
+	 * that it can be used while the server is running.
+	 */
+	if (!output_shmem)
+		CreateDataDirLockFile(false);
 
 	SetProcessingMode(BootstrapProcessing);
 	IgnoreSystemIndexes = true;
 
 	InitializeMaxBackends();
 
-	CreateSharedMemoryAndSemaphores();
+	if (output_shmem)
+		printf("%zu bytes\n", CalculateShmemSize(NULL));
+	else
+		CreateSharedMemoryAndSemaphores();
 
 	/*
 	 * XXX: It might make sense to move this into its own function at some
 	 * point. Right now it seems like it'd cause more code duplication than
 	 * it's worth.
 	 */
-	if (check_only)
+	if (check_only || output_shmem)
 	{
 		SetProcessingMode(NormalProcessing);
 		CheckerModeMain();
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 3a2a0d598c..c141ae3d1c 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -182,9 +182,11 @@ main(int argc, char *argv[])
 	 */
 
 	if (argc > 1 && strcmp(argv[1], "--check") == 0)
-		BootstrapModeMain(argc, argv, true);
+		BootstrapModeMain(argc, argv, true, false);
+	else if (argc > 1 && strcmp(argv[1], "--output-shmem") == 0)
+		BootstrapModeMain(argc, argv, false, true);
 	else if (argc > 1 && strcmp(argv[1], "--boot") == 0)
-		BootstrapModeMain(argc, argv, false);
+		BootstrapModeMain(argc, argv, false, false);
 #ifdef EXEC_BACKEND
 	else if (argc > 1 && strncmp(argv[1], "--fork", 6) == 0)
 		SubPostmasterMain(argc, argv);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 3e4ec53a97..b225b1ee70 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -75,6 +75,87 @@ RequestAddinShmemSpace(Size size)
 	total_addin_request = add_size(total_addin_request, size);
 }
 
+/*
+ * CalculateShmemSize
+ *		Calculates the amount of shared memory and number of semaphores needed.
+ *
+ * If num_semaphores is not NULL, it will be set to the number of semaphores
+ * required.
+ *
+ * Note that this function freezes the additional shared memory request size
+ * from loadable modules.
+ */
+Size
+CalculateShmemSize(int *num_semaphores)
+{
+	Size size;
+	int numSemas;
+
+	/* Compute number of semaphores we'll need */
+	numSemas = ProcGlobalSemas();
+	numSemas += SpinlockSemas();
+
+	/* Return the number of semaphores if requested by the caller */
+	if (num_semaphores)
+		*num_semaphores = numSemas;
+
+	/*
+	 * Size of the Postgres shared-memory block is estimated via moderately-
+	 * accurate estimates for the big hogs, plus 100K for the stuff that's too
+	 * small to bother with estimating.
+	 *
+	 * We take some care to ensure that the total size request doesn't overflow
+	 * size_t.  If this gets through, we don't need to be so careful during the
+	 * actual allocation phase.
+	 */
+	size = 100000;
+	size = add_size(size, PGSemaphoreShmemSize(numSemas));
+	size = add_size(size, SpinlockSemaSize());
+	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
+											 sizeof(ShmemIndexEnt)));
+	size = add_size(size, dsm_estimate_size());
+	size = add_size(size, BufferShmemSize());
+	size = add_size(size, LockShmemSize());
+	size = add_size(size, PredicateLockShmemSize());
+	size = add_size(size, ProcGlobalShmemSize());
+	size = add_size(size, XLOGShmemSize());
+	size = add_size(size, CLOGShmemSize());
+	size = add_size(size, CommitTsShmemSize());
+	size = add_size(size, SUBTRANSShmemSize());
+	size = add_size(size, TwoPhaseShmemSize());
+	size = add_size(size, BackgroundWorkerShmemSize());
+	size = add_size(size, MultiXactShmemSize());
+	size = add_size(size, LWLockShmemSize());
+	size = add_size(size, ProcArrayShmemSize());
+	size = add_size(size, BackendStatusShmemSize());
+	size = add_size(size, SInvalShmemSize());
+	size = add_size(size, PMSignalShmemSize());
+	size = add_size(size, ProcSignalShmemSize());
+	size = add_size(size, CheckpointerShmemSize());
+	size = add_size(size, AutoVacuumShmemSize());
+	size = add_size(size, ReplicationSlotsShmemSize());
+	size = add_size(size, ReplicationOriginShmemSize());
+	size = add_size(size, WalSndShmemSize());
+	size = add_size(size, WalRcvShmemSize());
+	size = add_size(size, PgArchShmemSize());
+	size = add_size(size, ApplyLauncherShmemSize());
+	size = add_size(size, SnapMgrShmemSize());
+	size = add_size(size, BTreeShmemSize());
+	size = add_size(size, SyncScanShmemSize());
+	size = add_size(size, AsyncShmemSize());
+#ifdef EXEC_BACKEND
+	size = add_size(size, ShmemBackendArraySize());
+#endif
+
+	/* freeze the addin request size and include it */
+	addin_request_allowed = false;
+	size = add_size(size, total_addin_request);
+
+	/* might as well round it off to a multiple of a typical page size */
+	size = add_size(size, 8192 - (size % 8192));
+
+	return size;
+}
 
 /*
  * CreateSharedMemoryAndSemaphores
@@ -102,65 +183,8 @@ CreateSharedMemoryAndSemaphores(void)
 		Size		size;
 		int			numSemas;
 
-		/* Compute number of semaphores we'll need */
-		numSemas = ProcGlobalSemas();
-		numSemas += SpinlockSemas();
-
-		/*
-		 * Size of the Postgres shared-memory block is estimated via
-		 * moderately-accurate estimates for the big hogs, plus 100K for the
-		 * stuff that's too small to bother with estimating.
-		 *
-		 * We take some care during this phase to ensure that the total size
-		 * request doesn't overflow size_t.  If this gets through, we don't
-		 * need to be so careful during the actual allocation phase.
-		 */
-		size = 100000;
-		size = add_size(size, PGSemaphoreShmemSize(numSemas));
-		size = add_size(size, SpinlockSemaSize());
-		size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
-												 sizeof(ShmemIndexEnt)));
-		size = add_size(size, dsm_estimate_size());
-		size = add_size(size, BufferShmemSize());
-		size = add_size(size, LockShmemSize());
-		size = add_size(size, PredicateLockShmemSize());
-		size = add_size(size, ProcGlobalShmemSize());
-		size = add_size(size, XLOGShmemSize());
-		size = add_size(size, CLOGShmemSize());
-		size = add_size(size, CommitTsShmemSize());
-		size = add_size(size, SUBTRANSShmemSize());
-		size = add_size(size, TwoPhaseShmemSize());
-		size = add_size(size, BackgroundWorkerShmemSize());
-		size = add_size(size, MultiXactShmemSize());
-		size = add_size(size, LWLockShmemSize());
-		size = add_size(size, ProcArrayShmemSize());
-		size = add_size(size, BackendStatusShmemSize());
-		size = add_size(size, SInvalShmemSize());
-		size = add_size(size, PMSignalShmemSize());
-		size = add_size(size, ProcSignalShmemSize());
-		size = add_size(size, CheckpointerShmemSize());
-		size = add_size(size, AutoVacuumShmemSize());
-		size = add_size(size, ReplicationSlotsShmemSize());
-		size = add_size(size, ReplicationOriginShmemSize());
-		size = add_size(size, WalSndShmemSize());
-		size = add_size(size, WalRcvShmemSize());
-		size = add_size(size, PgArchShmemSize());
-		size = add_size(size, ApplyLauncherShmemSize());
-		size = add_size(size, SnapMgrShmemSize());
-		size = add_size(size, BTreeShmemSize());
-		size = add_size(size, SyncScanShmemSize());
-		size = add_size(size, AsyncShmemSize());
-#ifdef EXEC_BACKEND
-		size = add_size(size, ShmemBackendArraySize());
-#endif
-
-		/* freeze the addin request size and include it */
-		addin_request_allowed = false;
-		size = add_size(size, total_addin_request);
-
-		/* might as well round it off to a multiple of a typical page size */
-		size = add_size(size, 8192 - (size % 8192));
-
+		/* Compute the size of the shared-memory block */
+		size = CalculateShmemSize(&numSemas);
 		elog(DEBUG3, "invoking IpcMemoryCreate(size=%zu)", size);
 
 		/*
diff --git a/src/include/bootstrap/bootstrap.h b/src/include/bootstrap/bootstrap.h
index 7d3b78e374..8e420574f5 100644
--- a/src/include/bootstrap/bootstrap.h
+++ b/src/include/bootstrap/bootstrap.h
@@ -32,7 +32,7 @@ extern Form_pg_attribute attrtypes[MAXATTR];
 extern int	numattr;
 
 
-extern void BootstrapModeMain(int argc, char *argv[], bool check_only) pg_attribute_noreturn();
+extern void BootstrapModeMain(int argc, char *argv[], bool check_only, bool output_shmem) pg_attribute_noreturn();
 
 extern void closerel(char *name);
 extern void boot_openrel(char *name);
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index 753a6dd4d7..80e191d407 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
 /* ipci.c */
 extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
 
+extern Size CalculateShmemSize(int *num_semaphores);
 extern void CreateSharedMemoryAndSemaphores(void);
 
 #endif							/* IPC_H */
-- 
2.16.6


