public inbox for [email protected]  
help / color / mirror / Atom feed
From: Nazir Bilal Yavuz <[email protected]>
To: Rahila Syed <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Improve monitoring of shared memory allocations
Date: Fri, 21 Mar 2025 14:15:43 +0300
Message-ID: <CAN55FZ3cJxy0VkeXpuO3K4BpjzJo3S6oU+iMyc00P6gEjqPztw@mail.gmail.com> (raw)
In-Reply-To: <CAH2L28uGLhkXBKDWFKm5XZtp_0nNqpYQ3Hc35vG++mM7wuOhgg@mail.gmail.com>
References: <CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com>
	<k6f6ynjvb7lvebhygaiqsfrdohq672uughk3q4ve4q5jqljywz@7jbrz724epeq>
	<CAH2L28uL-8EQTRSeyTpW1DqAXsDXRkXCkT1dus2u6p4HYDrxAg@mail.gmail.com>
	<CAH2L28uGLhkXBKDWFKm5XZtp_0nNqpYQ3Hc35vG++mM7wuOhgg@mail.gmail.com>

Hi,

On Wed, 12 Mar 2025 at 13:46, Rahila Syed <[email protected]> wrote:
> I have now made the changes uniformly across shared and non-shared hash tables,
> making the code fix look cleaner. I hope this aligns with your suggestions.
> Please find attached updated and rebased versions of both patches.

Thank you for working on this!

I have a couple of comments, I have only reviewed 0001 so far.

You may need to run pgindent, it makes some changes.

diff --git a/src/backend/utils/hash/dynahash.c
b/src/backend/utils/hash/dynahash.c
index cd5a00132f..3bdf3d6fd5 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c

+        /*
+         * If table is shared, calculate the offset at which to find the the
+         * first partition of elements
+         */
+
+        nsegs = compute_buckets_and_segs(nelem, &nbuckets,
hctl->num_partitions, hctl->ssize);

Blank line between the comment and the code.

+    bool        element_alloc = true;
+    Size        elementSize = MAXALIGN(sizeof(HASHELEMENT)) +
MAXALIGN(info->entrysize);

It looks odd to me that camelCase and snake_case are used together but
it is already used like that in this file; so I think it should be
okay.

 /*
  * allocate some new elements and link them into the indicated free list
  */
-static bool
-element_alloc(HTAB *hashp, int nelem, int freelist_idx)
+static HASHELEMENT *
+element_alloc(HTAB *hashp, int nelem)

Comment needs an update. This function no longer links elements into
the free list.

+static int
+compute_buckets_and_segs(long nelem, int *nbuckets, long
num_partitions, long ssize)
+{
...
+    /*
+     * In a partitioned table, nbuckets must be at least equal to
+     * num_partitions; were it less, keys with apparently different partition
+     * numbers would map to the same bucket, breaking partition independence.
+     * (Normally nbuckets will be much bigger; this is just a safety check.)
+     */
+    while ((*nbuckets) < num_partitions)
+        (*nbuckets) <<= 1;

I have some worries about this function, I am not sure what I said
below has real life implications as you already said 'Normally
nbuckets will be much bigger; this is just a safety check.'.

1- num_partitions is long and nbuckets is int, so could there be any
case where num_partition is bigger than MAX_INT and cause an infinite
loop?
2- Although we assume both nbuckets and num_partition initialized as
the same type, (*nbuckets) <<= 1 will cause an infinite loop if
num_partition is bigger than MAX_TYPE / 2.

So I think that the solution is to confirm that num_partition <
MAX_NBUCKETS_TYPE / 2, what do you think?

--
Regards,
Nazir Bilal Yavuz
Microsoft





view thread (20+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Improve monitoring of shared memory allocations
  In-Reply-To: <CAN55FZ3cJxy0VkeXpuO3K4BpjzJo3S6oU+iMyc00P6gEjqPztw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox