From: Jelte Fennema-Nio
Date: Mon, 1 Dec 2025 15:44:41 +0100
Subject: Re: Safer hash table initialization macro
To: Bertrand Drouvot
Cc: pgsql-hackers@lists.postgresql.org

On Mon, 1 Dec 2025 at 14:45, Bertrand Drouvot wrote:
> Thoughts?

I think the hashtable creation API in postgres is so terrible that it actively discourages usage.
At Citus we definitely had the problem that we would use Lists in cases where a hashtable was preferable perf-wise, just because the API was so terrible. That's why I eventually implemented some wrapper helper functions to make it less verbose and error-prone to use for by far the most common patterns we had[1].

So I'm definitely in favor of improving this API (probably by adding a few new functions). I have a few main thoughts on what could be improved:

1. Automatically determine keysize and entrysize given a keymember and entrytype (like you suggested).
2. Autodetect most of the flags.
   a. HASH_PARTITION, HASH_SEGMENT, HASH_FUNCTION, HASH_DIRSIZE, HASH_COMPARE, HASH_KEYCOPY, HASH_ALLOC can all simply be detected based on the relevant fields from HASHCTL. Passing them in explicitly is just duplication that causes code noise and is easy to forget accidentally.
   b. HASH_ELEM is useless noise because it is required.
   c. HASH_BLOBS could be the default (and maybe use HASH_STRINGS by default if the keytype is char*).
3. Default to CurrentMemoryContext instead of TopMemoryContext, like all of the other functions that allocate, because right now it's way too easy to accidentally use TopMemoryContext when you did not intend to.
4. Have a creation function that doesn't require HASHCTL at all (just takes entrytype and keymember, and maybe a version of this that takes a memorycontext).

[1]: https://github.com/citusdata/citus/blob/ae2eb65be082d52db646b68a644474e24bc6cea1/src/include/distributed/hash_helpers.h#L74
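
To make points 1 and 2 a bit more concrete, here is a rough standalone sketch. It is not PostgreSQL code and not the Citus helper linked above: DemoHashCtl, the DEMO_HASH_* flags, DEMO_HASHCTL_INIT, demo_detect_flags, and DemoEntry are all hypothetical names invented for illustration. It only shows that the sizes can be derived from an entry type plus key member, and that flags can be inferred from which control fields are set. The check that the key is the first member reflects my understanding that dynahash expects the key at the start of the entry.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof, NULL */
#include <stdint.h>   /* uint32_t */

/* Hypothetical stand-ins for illustration; NOT the real dynahash flags. */
#define DEMO_HASH_FUNCTION  0x01
#define DEMO_HASH_COMPARE   0x02
#define DEMO_HASH_BLOBS     0x04

/* Hypothetical, heavily reduced stand-in for a HASHCTL-like struct. */
typedef struct DemoHashCtl
{
    size_t      keysize;
    size_t      entrysize;
    uint32_t  (*hash) (const void *key, size_t keysize);
    int       (*match) (const void *a, const void *b, size_t keysize);
} DemoHashCtl;

/*
 * Size of a struct member without needing an instance. sizeof does not
 * evaluate its operand, so the null pointer is never dereferenced.
 */
#define DEMO_MEMBER_SIZE(entrytype, member) \
    sizeof(((entrytype *) 0)->member)

/*
 * Point 1: zero the control struct, then fill in keysize and entrysize
 * from the entry type and the name of its key member. The static_assert
 * rejects, at compile time, an entry type whose key is not its first member.
 */
#define DEMO_HASHCTL_INIT(ctl, entrytype, keymember) \
    do { \
        static_assert(offsetof(entrytype, keymember) == 0, \
                      "hash key must be the first member of the entry"); \
        *(ctl) = (DemoHashCtl) {0}; \
        (ctl)->keysize = DEMO_MEMBER_SIZE(entrytype, keymember); \
        (ctl)->entrysize = sizeof(entrytype); \
    } while (0)

/*
 * Point 2: infer flags from which fields the caller filled in, instead
 * of making the caller repeat that information as a bitmask.
 */
int
demo_detect_flags(const DemoHashCtl *ctl)
{
    int         flags = 0;

    if (ctl->hash != NULL)
        flags |= DEMO_HASH_FUNCTION;
    else
        flags |= DEMO_HASH_BLOBS;   /* point 2c: binary keys by default */
    if (ctl->match != NULL)
        flags |= DEMO_HASH_COMPARE;
    return flags;
}

/* Example entry type: the key (id) is the first member. */
typedef struct DemoEntry
{
    uint32_t    id;
    double      value;
} DemoEntry;

/* Example custom hash function, only used to show flag detection. */
uint32_t
demo_id_hash(const void *key, size_t keysize)
{
    (void) keysize;
    return *(const uint32_t *) key * 2654435761u;
}
```

With something like this, a caller would write DEMO_HASHCTL_INIT(&ctl, DemoEntry, id) and never spell out a sizeof or a flag by hand. Setting ctl.hash afterwards is enough for demo_detect_flags to report the function flag instead of the blobs default.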