From: Heikki Linnakangas <[email protected]>
To: Tomas Vondra <[email protected]>
To: [email protected] <[email protected]>
To: Robert Haas <[email protected]>
To: Rahila Syed <[email protected]>
Subject: Re: Shared hash table allocations
Date: Wed, 1 Apr 2026 00:25:44 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On 31/03/2026 01:02, Heikki Linnakangas wrote:
> I wonder if we should change the defaults somehow. In usual 
> configurations, people are currently getting much more lock space than 
> you'd expect based on max_connections and max_locks_per_transaction, and 
> after these patches, they'll get far fewer locks. It might be prudent to
> bump up the default max_locks_per_transaction setting so that you'd get
> roughly the same amount of locks in the default configuration.

I did some testing of the memory usage and how removing the wiggle room 
affects the number of locks you can acquire. Attached are the test 
procedures I used, and proposed patches. The patches are new, designed 
to just change the parameters of the hash tables and shmem calculations 
with no other changes. They don't include the refactorings we've 
discussed so far in this thread. My plan is to commit these new patches 
first, and those other refactorings after that. Once these new patches 
are committed, the refactorings won't materially change the overall 
memory usage or how it's divided between different hash tables; all 
those effects are in these new patches.


master: With the default configuration on master, the attached test 
procedure can take 14927 locks before hitting an "out of shared memory" 
error. At that point, all the "wiggle room" has been assigned to the LOCK 
hash table. A different scenario could make the PROCLOCK hash table 
consume all the wiggle room instead, but I believe running out of LOCK 
space is more common, and I don't think hitting the ceiling with 
PROCLOCK instead changes the big picture.

0001: While looking at this, I noticed that we add a 10% "safety margin" 
to the shmem calculations in predicate.c, but we had already marked the 
predicate.c hash tables as HASH_FIXED_SIZE so they were never able to 
make use of the safety margin. Oops. The extra memory was available for 
the lock.c hash tables, though. After removing that bogus 10% safety 
margin from predicate.c, memory usage was reduced by 200 kB, but the 
number of locks you could take went down from 14927 to 14159.

0002: As the next step, I also removed the 10% safety margin from 
lock.c. That reduced memory usage by another 320 kB, and the number of 
locks went down from 14159 to 12815.
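
Since both margins were computed as size / 10, the reported savings also 
hint at how large the underlying estimates were. A minimal sketch of that 
arithmetic follows. The helper names are hypothetical and not part of the 
patches, the overflow checking done by add_size() is left out, and the kB 
figures are rounded, so the results are only approximate:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirrors the removed "size = add_size(size, size / 10)" step.
 * Hypothetical helper; unlike add_size(), it does not check for overflow.
 */
static uint64_t
with_safety_margin(uint64_t size)
{
	return size + size / 10;
}

/*
 * Inverse view: if removing the margin saved "margin" units, the estimate
 * the margin was applied to was roughly ten times that.
 */
static uint64_t
implied_base_from_margin(uint64_t margin)
{
	assert(margin <= UINT64_MAX / 10);
	return margin * 10;
}
```

Under those assumptions, implied_base_from_margin(200) gives about 2000 kB 
for the predicate.c hash table estimates and implied_base_from_margin(320) 
gives about 3200 kB for the lock.c estimates at the default settings.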

0003: After those changes, there's only a little extra memory sloshing 
around that's not accounted for by any data structure. ipci.c reserves a 
constant 100 kB, but that's pretty much it. However, there's still 
flexibility between the LOCK and the PROCLOCK hash tables. The PROCLOCK 
hash table is estimated to be 2x the size of the LOCK table, but when 
it's not, the space can get assigned to the LOCK table instead. In patch 
0003 I removed that flexibility by marking them both with 
HASH_FIXED_SIZE, and making init_size equal to max_size. That also stops 
the hash tables from using any of the other remaining wiggle room, 
making them truly fixed-size. This doesn't change the overall shared 
memory allocated, but the number of locks that the test procedure could 
acquire went down from 12815 to 8767, mostly because it cannot "steal" 
space from PROCLOCK anymore.
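
The effect of losing that flexibility can be illustrated with a toy model. 
This is only a sketch: the struct, the function names and the entry sizes 
are hypothetical, it assumes a workload with exactly one holder per lock 
(so each lock needs one LOCK entry and one PROCLOCK entry), and it ignores 
hash table overhead and the remaining wiggle room:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; the numbers fed into it are illustrative only. */
typedef struct
{
	size_t		lock_entry_size;		/* bytes per LOCK entry */
	size_t		proclock_entry_size;	/* bytes per PROCLOCK entry */
	size_t		nlockents;				/* nominal LOCK table size */
} LockSpaceModel;

/*
 * Fixed-size tables: LOCK holds nlockents entries and PROCLOCK holds twice
 * that, so a one-holder-per-lock workload stops at the smaller capacity.
 */
static size_t
max_locks_fixed(const LockSpaceModel *m)
{
	size_t		lock_cap = m->nlockents;
	size_t		proclock_cap = 2 * m->nlockents;

	return lock_cap < proclock_cap ? lock_cap : proclock_cap;
}

/*
 * Shared pool: the same number of bytes, but either table may grow into
 * space the other one is not using.
 */
static size_t
max_locks_shared_pool(const LockSpaceModel *m)
{
	size_t		pool = m->nlockents * m->lock_entry_size +
		2 * m->nlockents * m->proclock_entry_size;
	size_t		per_lock = m->lock_entry_size + m->proclock_entry_size;

	assert(per_lock > 0);
	return pool / per_lock;
}
```

With equal (made-up) entry sizes of 100 bytes and nlockents = 1000, the 
fixed-size variant tops out at 1000 locks while the shared pool reaches 
1500. That is the same direction as the measured drop from 12815 to 8767, 
but the model is not meant to reproduce those numbers.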

0004: To buy back that lock manager space in common out-of-the-box 
situations, I propose to bump up the default for 
max_locks_per_transaction from 64 to 128. That increases memory usage 
again by 3216 kB, making it 2696 kB higher than on master (remember that 
the previous changes reduced memory usage). The number of locks you can 
take after that is 17535, which is more than on master (14927).
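
With fixed-size tables, the lock count should scale linearly with the 
setting, because lock.c sizes the LOCK table as NLOCKENTS(), that is 
max_locks_per_xact times (MaxBackends + max_prepared_xacts). The sketch 
below shows that product. The function name is hypothetical, and the slot 
count of 137 used in the note after it is inferred from the reported 
figures, not stated anywhere in this message:

```c
#include <assert.h>

/*
 * Nominal LOCK table capacity: the per-transaction setting multiplied by
 * the number of backend and prepared-transaction slots.  Hypothetical
 * helper that mirrors the shape of NLOCKENTS() without overflow checks.
 */
static long
nominal_lock_entries(long max_locks_per_xact, long slots)
{
	assert(max_locks_per_xact > 0 && slots > 0);
	return max_locks_per_xact * slots;
}
```

nominal_lock_entries(64, 137) is 8768 and nominal_lock_entries(128, 137) is 
17536; the measured 8767 and 17535 are each one short of those products. 
Reading 137 as the slot count of the default configuration is an 
assumption.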

Increasing the default won't affect users who have already set 
max_locks_per_transaction to a non-default value. They will see that the 
number of locks they can acquire with their existing configuration will 
be reduced, again because of the lost wiggle room and flexibility 
between LOCK and PROCLOCK. Not sure if we could or should do something 
about that. Probably best to just document in the release notes that if 
you had raised max_locks_per_transaction, you might need to 
raise it further to be able to accommodate the same number of locks as 
before.

Here's all that in table form:

| Patch                                 | Shmem (kB) | Locks |
|---------------------------------------|------------|-------|
| master                                |     153560 | 14927 |
| 0001: remove 10% from predicate.c     |     153360 | 14159 |
| 0002: remove 10% from lock.c          |     153040 | 12815 |
| 0003: make lock.c tables fixed size   |     153040 |  8767 |
| 0004: max_locks_per_transaction=128   |     156256 | 17535 |

This increase in memory usage is not great, but it's not that big in the 
grand scheme of things. I think it's well worth it, and better than the 
sloppy scheme we have today.
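
The per-step figures quoted in the text can be cross-checked against the 
table with a few lines of C. This is just a sketch; the struct and function 
names are hypothetical, and the data is copied from the table above:

```c
#include <assert.h>
#include <stddef.h>

/* One row of the measurement table. */
typedef struct
{
	const char *step;
	long		shmem_kb;
	long		locks;
} Measurement;

static const Measurement measurements[] = {
	{"master", 153560, 14927},
	{"0001", 153360, 14159},
	{"0002", 153040, 12815},
	{"0003", 153040, 8767},
	{"0004", 156256, 17535},
};

#define N_MEASUREMENTS (sizeof(measurements) / sizeof(measurements[0]))

/* Change in shared memory (kB) going from row "from" to row "to". */
static long
shmem_delta_kb(size_t from, size_t to)
{
	assert(from < N_MEASUREMENTS && to < N_MEASUREMENTS);
	return measurements[to].shmem_kb - measurements[from].shmem_kb;
}

/* Change in the number of acquirable locks between two rows. */
static long
locks_delta(size_t from, size_t to)
{
	assert(from < N_MEASUREMENTS && to < N_MEASUREMENTS);
	return measurements[to].locks - measurements[from].locks;
}
```

For example, shmem_delta_kb(0, 4) is the net 2696 kB increase over master, 
and locks_delta(0, 4) is the net gain of 2608 locks.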

Any thoughts, objections?

- Heikki


Attachments:

  [application/sql] shmem-test.sql (2.1K, 2-shmem-test.sql)

  [text/x-patch] v3-0001-Remove-bogus-safety-margin-from-predicate.c-shmem.patch (1.6K, 3-v3-0001-Remove-bogus-safety-margin-from-predicate.c-shmem.patch)
From 8f55a325a4bfeeeb87ec2582c7d30d8a20d0abe9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 31 Mar 2026 12:06:44 +0300
Subject: [PATCH v3 1/4] Remove bogus "safety margin" from predicate.c shmem
 estimates

The 10% safety margin was copy-pasted from lock.c when the predicate
locking code was originally added. However, we later (commit
7c797e7194) added the HASH_FIXED_SIZE flag to the hash tables, which
means that they cannot actually use the safety margin that we're
calculating for them.

The extra memory was mainly used by the main lock manager, which is
the only shmem hash table of non-trivial size that does not use the
HASH_FIXED_SIZE flag. If we wanted to have more space for the lock
manager, we should reserve it directly in lock.c. As this patch
stands, the lock manager will just have less memory available than
before.
---
 src/backend/storage/lmgr/predicate.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index ae0e96aee5f..efa47ec1684 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1383,12 +1383,6 @@ PredicateLockShmemSize(void)
 	size = add_size(size, hash_estimate_size(max_predicate_locks,
 											 sizeof(PREDICATELOCK)));
 
-	/*
-	 * Since NPREDICATELOCKTARGETENTS is only an estimate, add 10% safety
-	 * margin.
-	 */
-	size = add_size(size, size / 10);
-
 	/* transaction list */
 	max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
 	size = add_size(size, PredXactListDataSize);
-- 
2.47.3



  [text/x-patch] v3-0002-Remove-10-safety-marging-from-lock-hash-table-est.patch (818B, 4-v3-0002-Remove-10-safety-marging-from-lock-hash-table-est.patch)
From 53e0a614fdfaa42003841b4776cc4aab72f11982 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 31 Mar 2026 23:14:36 +0300
Subject: [PATCH v3 2/4] Remove 10% safety margin from lock hash table
 estimates

---
 src/backend/storage/lmgr/lock.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 234643e4dd7..2159de9015a 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3778,11 +3778,6 @@ LockManagerShmemSize(void)
 	max_table_size *= 2;
 	size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
 
-	/*
-	 * Since NLOCKENTS is only an estimate, add 10% safety margin.
-	 */
-	size = add_size(size, size / 10);
-
 	return size;
 }
 
-- 
2.47.3



  [text/x-patch] v3-0003-Make-the-lock-hash-tables-fixed-sized.patch (2.1K, 5-v3-0003-Make-the-lock-hash-tables-fixed-sized.patch)
From edc25ff76b77244880120803ca0f495fa734a02c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 31 Mar 2026 18:39:03 +0300
Subject: [PATCH v3 3/4] Make the lock hash tables fixed-sized

---
 src/backend/storage/lmgr/lock.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 2159de9015a..ac7c5ed0604 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -445,16 +445,14 @@ void
 LockManagerShmemInit(void)
 {
 	HASHCTL		info;
-	int64		init_table_size,
-				max_table_size;
+	int64		max_table_size;
 	bool		found;
 
 	/*
-	 * Compute init/max size to request for lock hashtables.  Note these
-	 * calculations must agree with LockManagerShmemSize!
+	 * Compute sizes for lock hashtables.  Note that these calculations must
+	 * agree with LockManagerShmemSize!
 	 */
 	max_table_size = NLOCKENTS();
-	init_table_size = max_table_size / 2;
 
 	/*
 	 * Allocate hash table for LOCK structs.  This stores per-locked-object
@@ -465,14 +463,14 @@ LockManagerShmemInit(void)
 	info.num_partitions = NUM_LOCK_PARTITIONS;
 
 	LockMethodLockHash = ShmemInitHash("LOCK hash",
-									   init_table_size,
+									   max_table_size,
 									   max_table_size,
 									   &info,
-									   HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
+									   HASH_ELEM | HASH_BLOBS |
+									   HASH_PARTITION  | HASH_FIXED_SIZE);
 
 	/* Assume an average of 2 holders per lock */
 	max_table_size *= 2;
-	init_table_size *= 2;
 
 	/*
 	 * Allocate hash table for PROCLOCK structs.  This stores
@@ -484,10 +482,11 @@ LockManagerShmemInit(void)
 	info.num_partitions = NUM_LOCK_PARTITIONS;
 
 	LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
-										   init_table_size,
+										   max_table_size,
 										   max_table_size,
 										   &info,
-										   HASH_ELEM | HASH_FUNCTION | HASH_PARTITION);
+										   HASH_ELEM | HASH_FUNCTION |
+										   HASH_FIXED_SIZE | HASH_PARTITION);
 
 	/*
 	 * Allocate fast-path structures.
-- 
2.47.3



  [text/x-patch] v3-0004-Change-default-max_locks_per_transactions-128.patch (4.1K, 6-v3-0004-Change-default-max_locks_per_transactions-128.patch)
From 692c0727d84d534a2575e0442c96e4df3752a7b4 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 00:09:35 +0300
Subject: [PATCH v3 4/4] Change default max_locks_per_transaction=128

---
 doc/src/sgml/config.sgml                      | 2 +-
 src/backend/utils/init/postinit.c             | 2 +-
 src/backend/utils/misc/guc_parameters.dat     | 2 +-
 src/backend/utils/misc/postgresql.conf.sample | 2 +-
 src/bin/pg_resetwal/pg_resetwal.c             | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 229f41353eb..20706b56158 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11470,7 +11470,7 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
         can lock more objects as long as the locks of all transactions
         fit in the lock table.  This is <emphasis>not</emphasis> the number of
         rows that can be locked; that value is unlimited.  The default,
-        64, has historically proven sufficient, but you might need to
+        128, has historically proven sufficient, but you might need to
         raise this value if you have queries that touch many different
         tables in a single transaction, e.g., query of a parent table with
         many children.  This parameter can only be set at server start.
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 577ef5effbb..783a7400464 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -593,7 +593,7 @@ InitializeFastPathLocks(void)
 	 * value at FP_LOCK_GROUPS_PER_BACKEND_MAX and insist the value is at
 	 * least 1.
 	 *
-	 * The default max_locks_per_transaction = 64 means 4 groups by default.
+	 * The default max_locks_per_transaction = 128 means 8 groups by default.
 	 */
 	FastPathLockGroupsPerBackend =
 		Max(Min(pg_nextpower2_32(max_locks_per_xact) / FP_LOCK_SLOTS_PER_GROUP,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0a862693fcd..a3e0dda34af 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1979,7 +1979,7 @@
   short_desc => 'Sets the maximum number of locks per transaction.',
   long_desc => 'The shared lock table is sized on the assumption that at most "max_locks_per_transaction" objects per server process or prepared transaction will need to be locked at any one time.',
   variable => 'max_locks_per_xact',
-  boot_val => '64',
+  boot_val => '128',
   min => '10',
   max => 'INT_MAX',
 },
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cf15597385b..6e376d85d61 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -856,7 +856,7 @@
 #------------------------------------------------------------------------------
 
 #deadlock_timeout = 1s
-#max_locks_per_transaction = 64         # min 10
+#max_locks_per_transaction = 128        # min 10
                                         # (change requires restart)
 #max_pred_locks_per_transaction = 64    # min 10
                                         # (change requires restart)
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index ab766c34d4b..44f2b446e5d 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -722,7 +722,7 @@ GuessControlValues(void)
 	ControlFile.max_wal_senders = 10;
 	ControlFile.max_worker_processes = 8;
 	ControlFile.max_prepared_xacts = 0;
-	ControlFile.max_locks_per_xact = 64;
+	ControlFile.max_locks_per_xact = 128;
 
 	ControlFile.maxAlign = MAXIMUM_ALIGNOF;
 	ControlFile.floatFormat = FLOATFORMAT_VALUE;
@@ -931,7 +931,7 @@ RewriteControlFile(void)
 	ControlFile.max_wal_senders = 10;
 	ControlFile.max_worker_processes = 8;
 	ControlFile.max_prepared_xacts = 0;
-	ControlFile.max_locks_per_xact = 64;
+	ControlFile.max_locks_per_xact = 128;
 
 	/* The control file gets flushed here. */
 	update_controlfile(".", &ControlFile, true);
-- 
2.47.3


