Date: Wed, 3 Dec 2025 11:17:43 +0000
From: Bertrand Drouvot
To: Jelte Fennema-Nio
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Safer hash table initialization macro
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="Mm7e/lTiPFa8Qm1V"
Content-Disposition: inline

--Mm7e/lTiPFa8Qm1V
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

On Mon, Dec 01, 2025 at 03:44:41PM +0100, Jelte Fennema-Nio wrote:
> On Mon, 1 Dec 2025 at 14:45, Bertrand Drouvot
> wrote:
> > Thoughts?
>
> I think the hashtable creation API in postgres is so terrible that it
> actively discourages usage.

Thanks for sharing your thoughts!

> So I'm definitely in favor of improving this API (probably by adding a
> few new functions). I have a few main thoughts on what could be
> improved:
>
> 1. Automatically determine keysize and entrysize given a keymember and
> entrytype (like you suggested).

PFA a patch introducing and using the new macro. Note that it also introduces
HASH_ELEM_INIT_FULL for the rare cases where the whole struct is the key.

Also, one case remains untouched:

$ git grep "entrysize = sizeof" "*.c"
src/backend/replication/logical/relation.c: ctl.entrysize = sizeof(LogicalRepRelMapEntry);

That's because the key is a member of a nested struct, so the new macro
cannot be used. As there is only one such occurrence, I think we can keep it
as it is. But we could create a dedicated macro for those cases if we feel the
need. Now that I'm writing this, that might be a better idea: that way we'd
avoid any "entrysize/keysize = " in the .c files.
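To make the idea concrete, here is a minimal standalone sketch of what such
macros could look like. It is not the code from the attached patch:
DemoHashCtl, DEMO_HASH_ELEM_INIT, DEMO_HASH_ELEM_INIT_FULL and the two demo
structs are hypothetical names used only for illustration, and the sketch
assumes the key is the first member of the entry struct.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof */

/* Hypothetical stand-in for the two size fields of HASHCTL. */
typedef struct DemoHashCtl
{
    size_t keysize;     /* size of the hash key, in bytes */
    size_t entrysize;   /* size of a whole entry, in bytes */
} DemoHashCtl;

/* Size of a struct member, computed without needing an instance. */
#define DEMO_MEMBER_SIZE(type, member) sizeof(((type *) 0)->member)

/*
 * Key is a single member of the entry struct. Both sizes are derived from
 * the type, so they cannot drift apart from the struct definition. The
 * static_assert encodes this sketch's assumption that the key comes first.
 */
#define DEMO_HASH_ELEM_INIT(ctl, entrytype, keymember) \
    do { \
        static_assert(offsetof(entrytype, keymember) == 0, \
                      "key must be the first member of the entry"); \
        (ctl).keysize = DEMO_MEMBER_SIZE(entrytype, keymember); \
        (ctl).entrysize = sizeof(entrytype); \
    } while (0)

/* The whole struct is the key: keysize and entrysize are the same. */
#define DEMO_HASH_ELEM_INIT_FULL(ctl, entrytype) \
    do { \
        (ctl).keysize = sizeof(entrytype); \
        (ctl).entrysize = sizeof(entrytype); \
    } while (0)

/* Hypothetical entry whose key is its first member. */
typedef struct DemoEntry
{
    unsigned int id;    /* hash key */
    int          counter;
    double       total;
} DemoEntry;

/* Hypothetical entry where the whole struct acts as the key. */
typedef struct DemoKey
{
    unsigned int dbid;
    unsigned int relid;
} DemoKey;
```

With these definitions, DEMO_HASH_ELEM_INIT(ctl, DemoEntry, id) replaces the
usual pair of "ctl.keysize = sizeof(...)" and "ctl.entrysize = sizeof(...)"
assignments, and DEMO_HASH_ELEM_INIT_FULL(ctl, DemoKey) covers the
whole-struct-is-the-key case.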

Also, a nice side effect of using the macros:

138 insertions(+), 203 deletions(-)

> 2. Autodetect most of the flags.
> a. HASH_PARTITION, HASH_SEGMENT, HASH_FUNCTION, HASH_DIRSIZE,
> HASH_COMPARE, HASH_KEYCOPY, HASH_ALLOC can all simply be detected
> based on the relevant fields from HASHCTL. Passing them in explicitly
> is just duplication that causes code noise and is easy to forget
> accidentally.
> b. HASH_ELEM is useless noise because it is required
> c. HASH_BLOBS could be the default (and maybe use HASH_STRINGS by
> default if the keytype is char*)
> 3. Default to CurrentMemoryContext instead of TopMemoryContext. Like
> all of the other functions that allocate, because right now it's way
> too easy to accidentally use TopMemoryContext when you did not intend
> to.
> 4. Have a creation function that doesn't require HASHCTL at all (just
> takes entrytype and keymember and maybe a version of this that takes a
> memorycontext).

Thanks for the above suggestions! I have not thought about this as deeply as
you did during your Citus time, but I will think about those too. I suggest we
move forward one step at a time, the first step being the new macros. Does
that make sense to you?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

--Mm7e/lTiPFa8Qm1V
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="v1-0001-Safer-hash-table-initialization-macro.patch"