Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vfvl1-006BGj-0L for pgsql-hackers@arkaria.postgresql.org; Wed, 14 Jan 2026 07:57:35 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vfvl0-008gNH-0x for pgsql-hackers@arkaria.postgresql.org; Wed, 14 Jan 2026 07:57:34 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vfvl0-008gN9-00 for pgsql-hackers@lists.postgresql.org; Wed, 14 Jan 2026 07:57:34 +0000 Received: from mail-wm1-x330.google.com ([2a00:1450:4864:20::330]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vfvkx-000LIv-2Q for pgsql-hackers@lists.postgresql.org; Wed, 14 Jan 2026 07:57:33 +0000 Received: by mail-wm1-x330.google.com with SMTP id 5b1f17b1804b1-47a95efd2ceso70655885e9.2 for ; Tue, 13 Jan 2026 23:57:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1768377451; x=1768982251; darn=lists.postgresql.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=qhvwdxlM51kN3QYcj88Qy63lm2a+6PPr2iLvVU4/K4U=; b=dWzWiweVBuCwgbLAoL6QSHUdBNvp3XcpAD9apLaE4RN3NOuxWa4XbfEuL0yOmre1AJ o1tW9Pv63xQPclkP3+MNetMv3kVaMwu9fEjYyYspaGBhAEUaDxxAWnDlqI2PXVJSsEn5 FEnivzRX/s5cpDKubqhfly+UlVzkcq1BlJaMsgGwruOkZJHk7rqzQdztUlWJGSMPFDQ3 AMvXIB097Q0mMxB1HXtw7sx6dRVe4fJI5gf+axPP0knpZB1VpPUDQFq7haG6zaZZ8ok9 1k0udElt9ghvwv5Cr/9HrsPfWCNkRH5hU7ANCNtXzzE+qpNUYDsTJHlia9gwEGtSJ6VL JocQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1768377451; x=1768982251; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qhvwdxlM51kN3QYcj88Qy63lm2a+6PPr2iLvVU4/K4U=; b=COjmmJC1ZHhfPHX7Ez15WL/iFcyle8Gz5VLa2n+DSBDw4n//z8I7ASmmzN3a1dYRHW 5nlsJ0aVg+981nsiXV2ZqfCRphcNvyt+m2GMTuq8zxVCku7ZZVWpNzj+n4enXbxNwkTX lF9p2MZsIXpO8XoB8mE4/1OpT4JrFhuxe6Xl2xBv4d67lFyuNoCZ6dnAVzwMIRR4X1N0 gY8ybR5w55/jfT2qqCL6bqTAS0sY5w0GGyL6QhgcSWaCFphEh9u6J3CZsC0UaIKyW2Lc ranu0C8zjRgDaxTdnV91ErR0HmXB/iMBNIPj9BzVHzUBM3KhYaHLLSYTqHzeLdJJkOyL H6VA== X-Forwarded-Encrypted: i=1; AJvYcCV5bnrNlI7jfE8PHknZMsJvmrlXwx5cmAQdlfZtPYRd4LAnP4uJx3Cea7dH+PhIFGv+3yy8oDwHf+NDUo3T@lists.postgresql.org X-Gm-Message-State: AOJu0YyX6f5IsDfIdy9OlsbsDWflJP3HsBbbsypuQQP89xXm4BXdQXed 5BuMbCllKmqUgy36OqwZ3Snv314hgtar91kRvsAhnYVqy8/Qxmserqhs X-Gm-Gg: AY/fxX6mG+bw2Pr879i/bfUMtQs8t7/3YCHsqal7qYgiyaw4mchMs5LUYMeAn8XMaqu cenQH0DC8uzXY0TU5am4/wNk/pkfkwWXzY3Z/sl7Pe0/prysCswt9qnpKxCCcYdo9zsI8RZKxvz xfRlyT8i12kkr+mJ1RKr4jQqb6vDKdOMq09DX9B2UlsuXVjOjyhcBGjalE9Tkw1YNts9jectBB5 nvqYsBG2p6z/pitP7Y/HdX6aZPr6tpETgMKsGQIwG/ftUnJD0SWrapMcyvy3sj2zVFpLtsstbM0 q1Emw4O7SsJiinM4m9RgOTu4QdA5aUJrnJBw8m360FypAKQVgD8XF6zOCic6g7uXbkUyoCnwIIy PMJyhMy0cFXj4UaWU6oAJWfP2vlzAW+xcudzbS+0J2GxsWvDkF+r8LBT5t2RlNp62m7BrX2X6bF eB6ZB5TaPZ/IQ3fbQy8yDt1R4u4CvvNeEVszVxYY99avvHyTnEsK9y5cwtZf4Dy6RmP6riU+AoQ yMrnl5XwcVBzG7iMus+sHzQyeJDRsjIbkg8fVM22hB85Q== X-Received: by 2002:a05:6000:220d:b0:430:b100:f591 with SMTP id ffacd0b85a97d-4342d39f650mr1200964f8f.28.1768377450737; Tue, 13 Jan 2026 23:57:30 -0800 (PST) Received: from ip-10-97-1-34.eu-west-3.compute.internal (ec2-15-237-197-144.eu-west-3.compute.amazonaws.com. [15.237.197.144]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-432bd5ee243sm48803195f8f.31.2026.01.13.23.57.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jan 2026 23:57:30 -0800 (PST) Date: Wed, 14 Jan 2026 07:57:29 +0000 From: Bertrand Drouvot To: Jelte Fennema-Nio Cc: Thomas Munro , pgsql-hackers@lists.postgresql.org Subject: Re: Safer hash table initialization macro Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On Tue, Jan 13, 2026 at 10:31:18AM +0100, Jelte Fennema-Nio wrote: > On Tue Jan 13, 2026 at 8:27 AM CET, Bertrand Drouvot wrote: > > I've probably a too paranoid concern: what if someone use char[N] to store > > binary stuff with embedded null? That would detect it as string and then > > make use of strncmp() and then would provide incorrect result. > > > > While the risk is probably very low, I think it is there. > > Added a warning in the comment for these macros. For non of our > usages this was the case (i.e. the char arrays were all storing null > terminated strings). Agreed, I did check that too before doing the initial comment. > So I'm not too worried about this being a problem > in practice. I agree, it's very low risk that one adds a new "bad" one in the future. Adding a comment looks enough then. + * WARNING: If you use char[N] to store binary data that is not null-terminated, + * the automatic detection will incorrectly treat it as a string and use string + * comparison. In such cases, use hash_make_ext() with .force_blobs = true to + * override the automatic detection maybe s/is not null-terminated/may contain null bytes/? Also, nit, "Note or NOTE" looks more commonly used that "WARNING". We might want to use that instead. > Especially because in most cases there will be no null byte > in the key, and instead you'll start reading out of bounds, which wil > cause problems pretty much immediately during development. Agreed. > Especially, because to make this macro nice to > use in the two cases that it would apply to we'd have to make it treat 0 > as a special value. Not necessary, we could also just add a foreach_hash_with_hash_value() to make things more consistent? > Finally, I converted the last couple of hash_seq_init stragglers (some > I had missed/were added) and others needed the now newly added > foreach_hash_reset macro to be converted. I see that you added foreach_hash_restart(), I think that makes sense (even if it's used only in 3 places). Two more comments: === 1 -static void -cfunc_hashtable_init(void) -{ - /* don't allow double-initialization */ - Assert(cfunc_hashtable == NULL); Most of the hash_make_fn_cxt() callers check that the destination is already NULL so that removing the Assert() is not that worrying for them. There are 2 places where it's not the case: InitializeAttoptCache() and build_guc_variables() , should we add an Assert (or an if check) in them? I think that an Assert is more appropriate for those 2. === 2 " At the very least we should choose a few places where we use the new macros to make sure they have coverage." I do agree that the refactoring is quite large. I do think there is no rush to do all the changes at once. We could update a subset of files at a time per month so that, that would: - keep changes consistent within each file - ease the review(s) - avoid "large" rebases for patches waiting in the commitfest Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com