From: Ashutosh Bapat
Date: Thu, 2 Apr 2026 20:22:32 +0530
Subject: Re: Shared hash table allocations
To: Heikki Linnakangas
Cc: Tomas Vondra, "pgsql-hackers@postgresql.org", Robert Haas, Rahila Syed

On Thu, Apr 2, 2026 at 7:44 PM Heikki Linnakangas wrote:
>
> On 02/04/2026 15:55, Ashutosh Bapat wrote:
> > When we "allocate" shared memory, we are just allocating space on
> > systems which use mmap. The memory gets allocated only when it is
> > touched. The wiggle room as a whole is never touched during
> > initialization. Those pages get allocated when wiggle room is used -
> > i.e. when the entries beyond initial number are allocated. By
> > allocating maximal hash tables, I was worried that we will allocate
> > more memory than required.
> > But that's not true since a 4K memory page
> > fits only 50-60 entries - far less than the default configuration
> > permits. Most of the memory for the hash table will be allocated as
> > the entries are used.
>
> Hmm, that's a good point about untouched memory not being allocated. I
> think it's fine, though.
>
> With small changes on top of the earlier refactorings from this
> thread, we could stop pre-allocating all the elements when a shared
> memory hash table is created, and have ShmemHashAlloc() allocate them on
> the fly, but instead of doing them as anonymous allocations like we do
> with ShmemAlloc() today, the allocations could come from the
> pre-allocated region dedicated to the hash table. You'd still get the
> same determinism and visibility in pg_shmem_allocations, but you could
> avoid actually touching the pages until they're needed. Not sure it's
> worth the trouble.

With shared hash table refactoring + shared memory structure refactoring +
resizable structures, we should be able to get resizable shared hash
tables as well. But that's not required immediately. I feel large hash
tables like the buffer hash table and the lock hash tables can benefit
from this kind of thing.

>
> > The second hazard of increasing hash table size is the hash table
> > access becomes slower as it becomes sparse [1]. I don't think it shows
> > up in performance but maybe worth trying a trivial pgbench run, just
> > to make sure that default performance doesn't regress.
>
> Interesting, but yeah I don't think that's going to be measurable.
> I did some quick testing with a test function that just locks and unlocks
> relations:
>
> PG_FUNCTION_INFO_V1(test_lock_bench);
> Datum
> test_lock_bench(PG_FUNCTION_ARGS)
> {
>     int32       num_distinct_locks = PG_GETARG_INT32(0);
>     int32       num_acquires = PG_GETARG_INT32(1);
>
>     LOCKMODE    lockmode = AccessExclusiveLock;
>
> #define FIRST_RELID 1000000000
>
>     for (int32 i = 0; i < num_acquires; i++)
>     {
>         /* cycle through num_distinct_locks consecutive relation OIDs */
>         Oid         relid = FIRST_RELID + i % num_distinct_locks;
>
>         /* after the first cycle, release the lock taken one cycle ago */
>         if (i >= num_distinct_locks)
>             UnlockRelationOid(relid, lockmode);
>
>         if (!ConditionalLockRelationOid(relid, lockmode))
>         {
>             elog(LOG, "could not acquire lock, iteration %d", i);
>             break;
>         }
>     }
>
>     PG_RETURN_VOID();
> }
>
> With test_lock_bench(1, 5000000), I don't see any meaningful difference,
> i.e. it's within 1-2 %, with anything from max_locks_per_transaction=10
> to max_locks_per_transaction=128.
>
> With more distinct locks involved, the caching effects might be bigger,
> and maybe you'd see a difference because of more or fewer collisions.
> Spot testing some values on my laptop, I don't see anything that would
> worry me though.

Great. This agrees with my experiments with the sparse buffer lookup table.

>
> > The increase in memory usage is 3MB, which is fine usually. I mean, we
> > didn't hear any complaints when we increased the default size of the
> > shared buffer pool - this is much less than that. But why do you want
> > to double the max_locks_per_transaction? I first thought it's because
> > the hash table size is anyway a power of 2. But then the size of the
> > hash table is actually max_locks_per_transaction * (number of backends
> > + number of prepared transactions). What we want is the default
> > max_locks_per_transaction such that 14927 locks are allowed. Playing
> > with max_locks_per_transaction using your script, 109 seems to be the
> > number which will give us 14951 locks. It looks (and is) an odd
> > number.
> > If we are worried about memory increase, that's the number we
> > should use as default and then write a long paragraph about why we
> > chose such an odd-looking number :D.
>
> My first thought was actually to set max_locks_per_transaction=100,
> making it a nice round number :-). But then the neighboring default of
> max_pred_locks_per_transaction=64 looks weird. We could reduce it to
> max_pred_locks_per_transaction=50 to make it fit in. But it feels a
> little arbitrary to change just for aesthetic reasons.

+1. Let's keep it at 128 and see if there are complaints. We can set it to
100 or 109 if the complaints look serious.

-- 
Best Wishes,
Ashutosh Bapat
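
A minimal sketch of the sizing arithmetic discussed above, assuming only
the nominal formula quoted in this mail: max_locks_per_transaction *
(number of backends + number of prepared transactions). The helper names
nominal_lock_entries() and min_locks_per_xact() are hypothetical and do
not exist in the tree. The 14927 and 14951 figures come from the script
mentioned in the thread and are not reproduced by this formula alone.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Nominal lock table capacity: max_locks_per_xact entries for each
 * backend slot and each prepared-transaction slot.  (Hypothetical helper,
 * illustrating the formula only.)
 */
static size_t
nominal_lock_entries(size_t max_locks_per_xact, size_t backends,
                     size_t prepared_xacts)
{
    return max_locks_per_xact * (backends + prepared_xacts);
}

/*
 * Smallest per-transaction setting whose nominal capacity is at least
 * 'target' entries, i.e. ceil(target / slots).  (Hypothetical helper.)
 */
static size_t
min_locks_per_xact(size_t target, size_t backends, size_t prepared_xacts)
{
    size_t      slots = backends + prepared_xacts;

    assert(slots > 0);
    return (target + slots - 1) / slots;
}
```

With a made-up count of 100 slots, nominal_lock_entries(128, 100, 0) is
12800, and min_locks_per_xact(6401, 100, 0) rounds up to 65. The real slot
count depends on the server configuration.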