Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w25Br-000rbb-2Q for pgsql-hackers@arkaria.postgresql.org; Mon, 16 Mar 2026 10:28:52 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w25Bq-009IaL-1g for pgsql-hackers@arkaria.postgresql.org; Mon, 16 Mar 2026 10:28:51 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w25Bq-009IaD-0e for pgsql-hackers@lists.postgresql.org; Mon, 16 Mar 2026 10:28:50 +0000 Received: from mail-wr1-x42a.google.com ([2a00:1450:4864:20::42a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w25Bo-00000000PJy-18ri for pgsql-hackers@postgresql.org; Mon, 16 Mar 2026 10:28:50 +0000 Received: by mail-wr1-x42a.google.com with SMTP id ffacd0b85a97d-439b8a3f2bcso3109426f8f.3 for ; Mon, 16 Mar 2026 03:28:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773656927; cv=none; d=google.com; s=arc-20240605; b=gl5Pw6fbNjqKTCroEQBMaMMKdWBxPvwDx1s5Lx7AjFWh5G/sNRKbXMnet59bvzSKx2 +dg2QrBtZl3Im9VyVe5EMYP38g5hmqB37ZcsCfJ25Qd9sQ6dFSXpFTIlIPFM8C5gsFGN 1HFAgZrXIpR8QGyIsJZKZ7mUuQKgvo6/SczNZK3+bUlVPCmKZvuP6nOFAKhg4sIlhtNb QwPoAJInDPEjg+czvScXd+A8Ya2KvUu/Pge0gN9PvbVswsEld4ZI5AoTRBbe1fgNFWyw z8e88DdS2DjQ7b4p3n7HDpI2ZzRiVjgPo650/HljK7rDGIL+Hb571TTh6zsZCEfqybZq FIsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=JfnVIxZlAcbyRQb0GBS1FL8kzYLgFf7ACRf2Jr6FcPk=; fh=qvnY1h7oOb4dQD9lt1FNf4WbbhJM9CyT4qsG/ae2BQQ=; b=Isl3vVZIYlosVsuyJLr4LS7dRCvTkVl/kLK7sfUuY31weqA7PpMtX9EkJCG3O7KID1 a9hx0IRSYvr4D841j+gH9Q4p5gBZ94bRVxtJrTKya+xTRuPAodrg9JmI8DwhfRFWnZ5p Nij7BUMxGKtFequpbIXZxEuaSYaO4ifPkFfLSXeU0cWn/aI0e3FTz1+XDOKR7xZ8GFbq hNCvgxD/2EuBR4DJyRYDueByEfFWNftCTpUg+qxE/v2hB10j88LJLQ3atzWh/E4axylD mwx1KU89omoUCDjxTCOETGyyO2nutuq6xiSJ6q7M95O3GG/HdnEfwu+xRpXDWN/xukNh PlJQ==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773656927; x=1774261727; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=JfnVIxZlAcbyRQb0GBS1FL8kzYLgFf7ACRf2Jr6FcPk=; b=bFuxoypkWtHkGNDo+a5uWtgrf2MjiclwuUn0kqT5cQMT9zaJc7uYe2Vn/7ezvE6NQD /3BFFWq8iGg+aKyZ35Or+HqGnRdIzlPRKR07p4BlRF1537PMmLHyLQ6T2VDl7x/inBFS ixStm2ZKJ8BJgyRX+k3B9B6qmXlutB6fHOQlc8JtkTHqJJREUJ+6Z+wHMMH+zxyebbaN MTYwgHgTaLU6bsIrhMGiRQF6y2AQ5SzaxAmWjVR0EXIVs8LzXPshL44P7V5TqOsmNQjd KAcWNijqA70PcPHAn8iIhZ5XwHMuLIPWIB0+d1B5Uq2M4yIcqWOpoz8E8OEId4OXjsPa CpZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773656927; x=1774261727; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=JfnVIxZlAcbyRQb0GBS1FL8kzYLgFf7ACRf2Jr6FcPk=; b=HpbQVEohA82Hg4hPvmNn8QX/jwxXhilvKbHKtnhnDluQxYdWj4q8PV2u98QPhuwSEk 4EPouf0ZWexwUd7vklIt7w/p2mLfwPPttUVCg4j945yjOX9ue/idbFq6P+Op6KDbBz+w 5nIPdN4DMZ5sO2lGMafISj42yAXc241EfQlBMiLxWFz7YlvPJ6jjZUiCCfuFs8nB08Ys En4bTd//TB3OezojHMXjeN3at12xPJCw3s0vJcxb9w6fSICwDeK+a0roAsxKgnxBPJH9 UFNvw2eUGjfNdTdyxmwflTc8VLclfvJm9Iry+MZNKdSC2m7FpwmtOJ4Z/o6dDp9wR9dy jWPg== X-Forwarded-Encrypted: i=1; AJvYcCUtUEqIHuM5a2S/rkdLvnJ7v3mOyoWgfnWnXzJnLKYSxyteWP+L7WUMyTHJh1dPgenZKJ5OpSHdWI/T0U1r@postgresql.org X-Gm-Message-State: AOJu0YxrCIBFqW+AH7qYqIeDN3ZIEGdvuex+8YgV2OpqX1rJhZaxve8l S4VNhde/IrYO3a1HaPNLnBv3/Ny2rDCQ406pJgbLbSbO07NB/J6MQ4oX/rU6TBG42vigqkyl75H K35ptLsAP3cxL8lqFHNivz9R8ewvc41Q= X-Gm-Gg: ATEYQzyRlnMvXFWLxYVXk1lrCAQ0oVPH50S9m1su3YNYPAwfyD+7j3yx599k6o9nwqP /wmTCRKh3sFY649FJXYODG9zOStBNQ8NUxXGr/M0mVeOI4FqDQM/c1kjV6pSZg3GPivNyxdf3IR IEjBaIIrBBw/lOSPxV3MU1zVAaWhHyl/hCLEfXAWGQ61GOSDs0Y/kI9/MjWfF6FSzrVVjWO+UxR jexBX6klf7YWYwDUiUg2wbzmcLI2QUhQwkmIG5bMINjmnebOh1qlxeTm1CPBTdU3KGx9Raq4xoF mu8omyjdQrovppZOX92CcKRtLUc2mW3W2Yyk/lQVBS6TA5eyA3MG X-Received: by 2002:a5d:5d13:0:b0:439:a95b:3c43 with SMTP id ffacd0b85a97d-43a04d86498mr20963020f8f.21.1773656927133; Mon, 16 Mar 2026 03:28:47 -0700 (PDT) MIME-Version: 1.0 References: <5a37c2e3-619d-4816-84d7-0b27e3e6797f@iki.fi> <26c766d6-db0f-43d3-a618-44f8d40a3121@iki.fi> <7de8e39c-d1c4-4783-91b9-129d90e91411@iki.fi> In-Reply-To: <7de8e39c-d1c4-4783-91b9-129d90e91411@iki.fi> From: Ashutosh Bapat Date: Mon, 16 Mar 2026 15:58:34 +0530 X-Gm-Features: AaiRm51o-0qXCNyMxkx1sYThJDJDfy6lCLomtR6ZC02Ll9pUEV4jetBAenALCEg Message-ID: Subject: Re: Better shared data structure management and resizable shared data structures To: Heikki Linnakangas Cc: Andres Freund , pgsql-hackers , chaturvedipalak1911@gmail.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Sat, Mar 14, 2026 at 2:39=E2=80=AFAM Heikki Linnakangas wrote: > > On 06/03/2026 16:12, Heikki Linnakangas wrote: > > Firstly, I'm not sure what to do with ShmemRegisterHash() and the > > 'HASHCTL *infoP' argument to it. I feel it'd be nicer if the HASHCTL wa= s > > just part of the ShmemHashDesc struct, but I'm not sure if that fits al= l > > the callers. I'll have to try that out I guess. > I took a stab at that, and it turned out to be straightforward. I'm not > sure why I hesitated on that earlier. > Yeah. I wondered about that too. > Here's a new version with that change, and a ton of little comment > cleanups and such. Here are initial comments on these patches. 0001 @@ -3236,6 +3239,8 @@ PostmasterStateMachine(void) LocalProcessControlFile(true); /* re-create shared memory and semaphores */ + ResetShmemAllocator(); This name is misleading. The function does not touch ShmemAllocator at all. Instead it resets the ShmemStruct registry. I suggest ResetShememRegistry() instead. + * + * There are two kinds of shared memory data structures: fixed-size struct= ures + * and hash tables. In future we will have resizable "fixed" structures and we may also have resizable hash tables i.e. hash tables whose directory would be resizable. The later would be help support resizable shared buffers lookup table. It will be good to write the above sentence so that we can just add more types of data structures without needing to rewrite everything. If we could find a good term for "fixed-size structures" which are really "structures that require contiguous memory", we will be able to write the above sentence as "There are two kinds of shared memory structures: contiguous structures and hash tables.". When we add resizable structures, we can just add a sentence "A contiguous structure may be fixed size or resizable". When we add resizable hash tables, we can just replace that with "Both of these kinds can be fixed-size or resizable". I am not sure whether "contiguous structures" is a good term though (for one, word contiguous can be confused with continuous). Whatever term we use should be something that we can carry further in the remaining paragraphs. + Fixed-size structures contain things like global + * variables for a module and should never be allocated after the shared + * memory initialization phase. I think the existing comment is not accurate. The term "global variables" in the sentence can be confused with process global variables. We should be using the term "shared variables" or better "shared state". If we adopt "contiguous structures" as the term for the first kind of data structure, we can write "Contiguous structures contain shared state, maintained in a contiguous chunk of memory, for a module. It should never be allocated after the shared memory initialization phase.". + * postmaster calls ShmemInitRegistered(), which calls the init_fn callbac= ks + * of each registered area, in the order that they were registered. ... calls the init_fn, if any, of each registered area .... - infoP->dsize =3D infoP->max_dsize =3D hash_select_dirsize(max_size); - infoP->alloc =3D ShmemAllocNoError; - hash_flags |=3D HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE; + desc->hash_info.dsize =3D desc->hash_info.max_dsize =3D hash_select_dirsize(desc->max_size); + desc->hash_info.alloc =3D ShmemAllocNoError; + desc->hash_flags |=3D HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE; /* look it up in the shmem index */ The next several lines of code look up shmem index. Should we remove this comments or modify it to say "Register and initialize the hash table". +HTAB * +ShmemInitHash(const char *name, /* table string name for shmem index */ + int64 init_size, /* initial table size */ + int64 max_size, /* max size of the table */ + HASHCTL *infoP, /* info about key and bucket size */ + int hash_flags) /* info about infoP */ +{ + ShmemHashDesc *desc; + ... snip ... + + ShmemRegisterHash(desc); + return *desc->ptr; +} I like the way these functions are written using the new API. I think we should keep these legacy interface at the end of section of shmem APIs, rather than keeping those at the end of the file where we have monitoring and arithmetic functions. If you want to get rid of the legacy APIs in this release itself, I think it's ok to keep them at the end of the file. ShmemInitStruct() now calls ShmemRegisterStruct(). Earlier it could be called from any backend, in any state to fetch a pointer to a shared memory structure. It didn't add a new structure. Now it can add a new structure. I am wondering whether that can cause registry in different backends to get out of sync. Should we limit the window when it can be called just like how shmem_request_hook call is limited. In that sense ShmemRegisterStruct() looks something to be called from a shmem_register_hook which is also called from EXEC_BACKEND. Sorry to expand it here rather in my previous reply. In case we replace all the current calls to ShmemInitStruct() with ShmemRegisterStruct(), we may be able to get rid of the Shmem Index altogether; after all it's used only for fetching the pointers to the shared memory areas in EXEC_BACKEND mode. I thought that we could save the registry in the shared memory. In EXEC_BACKEND mode, we go over the registry calling attach_fn for each entry. But since the binary is overwritten in EXEC_BACKEND case, attach/init fns are not guaranteed to have the same address in all the backends. Maybe we have to resort to launch_backend() to transfer the registry to the backend through the file (to keep it in sycn in all the backends): a solution you may not like. + void **ptr; +} ShmemStructDesc; I think the comments for each member should highlight which of these fields are required (non-zero) and which can be optional (zero'ed out). + */ + ShmemStructDesc base_desc; Once we have calculated the base_desc in ShmemRegisterHash() and called ShmemRegisterStruct(), we don't need base_desc anymore. Even the pointer to the allocated hash table memory is available through *ptr. Probably we could just remove this member from here. ShmemRegisterHash() can declare a variable of type ShmemStructDesc, populate it based on the members in this structure and pass it to ShmemRegisterStruct(). I am not comfortable with specification structure being modified by the registration function. 0003 - pages =3D (size / FPM_PAGE_SIZE) - first_page; - FreePageManagerPut(fpm, first_page, pages); - } + ShmemRegisterStruct(&dsm_main_space_shmem_desc); Shouldn't we be setting dsm_main_space_shmem_desc.size here to size before calling ShmemRegisterStruct()? @@ -102,15 +102,14 @@ CalculateShmemSize(void) size =3D add_size(size, ShmemRegisteredSize()); size =3D add_size(size, dsm_estimate_size()); We have defined dsm_main_space_shmem_desc, but we still use dsm_estimate_size() here and initialize the memory in dsm_shmem_init(), which is explicitily called from CreateOrAttachShmemStructs(). Why is that? Shouldn't we be registering the structure in RegisterShmemStructs(), and let ShmemInitRegistered() initialize it? Am I missing something here? I will continue to review the patches further. --=20 Best Wishes, Ashutosh Bapat