<div dir="ltr"><div>Hi,</div><div><br></div><div>Thank you for the review.<br><br><br></div><div class="gmail_quote gmail_quote_container"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
cfbot found a few compiler warnings:<br>
<br>
<a href="https://cirrus-ci.com/task/6526903542087680" rel="noreferrer" target="_blank">https://cirrus-ci.com/task/6526903542087680</a><br>
[16:47:46.964] make -s -j${BUILD_JOBS} clean<br>
[16:47:47.452] time make -s -j${BUILD_JOBS} world-bin<br>
[16:49:10.496] lwlock.c: In function ‘CreateLWLocks’:<br>
[16:49:10.496] lwlock.c:467:22: error: unused variable ‘found’ [-Werror=unused-variable]<br>
[16:49:10.496]   467 |                 bool found;<br>
[16:49:10.496]       |                      ^~~~~<br>
[16:49:10.496] cc1: all warnings being treated as errors<br>
[16:49:10.496] make[4]: *** [&lt;builtin&gt;: lwlock.o] Error 1<br>
[16:49:10.496] make[3]: *** [../../../src/backend/common.mk:37: lmgr-recursive] Error 2<br>
[16:49:10.496] make[3]: *** Waiting for unfinished jobs....<br>
[16:49:11.881] make[2]: *** [common.mk:37: storage-recursive] Error 2<br>
[16:49:11.881] make[2]: *** Waiting for unfinished jobs....<br>
[16:49:20.195] dynahash.c: In function ‘hash_create’:<br>
[16:49:20.195] dynahash.c:643:37: error: ‘curr_offset’ may be used uninitialized [-Werror=maybe-uninitialized]<br>
[16:49:20.195]   643 |                         curr_offset = (((char *)curr_offset) + (temp * elementSize));<br>
[16:49:20.195]       |                         ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
[16:49:20.195] dynahash.c:588:23: note: ‘curr_offset’ was declared here<br>
[16:49:20.195]   588 |                 void *curr_offset;<br>
[16:49:20.195]       |                       ^~~~~~~~~~~<br>
[16:49:20.195] cc1: all warnings being treated as errors<br>
[16:49:20.196] make[4]: *** [&lt;builtin&gt;: dynahash.o] Error 1<br><br></blockquote><div> </div><div> Fixed these. </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
&gt; diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c<br>
&gt; index cd5a00132f..5203f5b30b 100644<br>
&gt; --- a/src/backend/utils/hash/dynahash.c<br>
&gt; +++ b/src/backend/utils/hash/dynahash.c<br>
&gt; @@ -120,7 +120,6 @@<br>
&gt;   * a good idea of the maximum number of entries!).  For non-shared hash<br>
&gt;   * tables, the initial directory size can be left at the default.<br>
&gt;   */<br>
&gt; -#define DEF_SEGSIZE                     256<br>
&gt;  #define DEF_SEGSIZE_SHIFT       8    /* must be log2(DEF_SEGSIZE) */<br>
&gt;  #define DEF_DIRSIZE                     256<br>
<br>
Why did you move this to the header? Afaict it&#39;s just used in<br>
hash_get_shared_size(), which is also in dynahash.c?<br>
<br></blockquote><div> </div><div>Yes. This was accidentally left behind by the previous version of the</div><div>patch, so I undid the change. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
&gt;  static void register_seq_scan(HTAB *hashp);<br>
&gt;  static void deregister_seq_scan(HTAB *hashp);<br>
&gt;  static bool has_seq_scans(HTAB *hashp);<br>
&gt; -<br>
&gt; +static int find_num_of_segs(long nelem, int *nbuckets, long num_partitions, long ssize);<br>
&gt;  <br>
&gt;  /*<br>
&gt;   * memory allocation support<br>
<br>
You removed a newline here that probably shouldn&#39;t be removed.</blockquote><div> </div><div>Fixed this. </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
&gt; @@ -468,7 +466,11 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)<br>
&gt;       else<br>
&gt;               hashp-&gt;keycopy = memcpy;<br>
&gt;  <br>
&gt; -     /* And select the entry allocation function, too. */<br>
&gt; +     /*<br>
&gt; +      * And select the entry allocation function, too. XXX should this also<br>
&gt; +      * Assert that flags &amp; HASH_SHARED_MEM is true, since HASH_ALLOC is<br>
&gt; +      * currently only set with HASH_SHARED_MEM *<br>
&gt; +      */<br>
&gt;       if (flags &amp; HASH_ALLOC)<br>
&gt;               hashp-&gt;alloc = info-&gt;alloc;<br>
&gt;       else<br>
&gt; @@ -518,6 +520,7 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)<br>
&gt;  <br>
&gt;       hashp-&gt;frozen = false;<br>
&gt;  <br>
&gt; +     /* Initializing the HASHHDR variables with default values */<br>
&gt;       hdefault(hashp);<br>
&gt;  <br>
&gt;       hctl = hashp-&gt;hctl;<br>
<br>
I assume these were just observations you made while looking into this? They<br>
seem unrelated to the change itself?</blockquote><div> </div><div>Yes. I removed the first one and left the second one as a code comment. <br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
&gt; @@ -582,7 +585,8 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)<br>
&gt;                                       freelist_partitions,<br>
&gt;                                       nelem_alloc,<br>
&gt;                                       nelem_alloc_first;<br>
&gt; -<br>
&gt; +             void *curr_offset;<br>
&gt; +     <br>
&gt;               /*<br>
&gt;                * If hash table is partitioned, give each freelist an equal share of<br>
&gt;                * the initial allocation.  Otherwise only freeList[0] is used.<br>
&gt; @@ -592,6 +596,20 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)<br>
&gt;               else<br>
&gt;                       freelist_partitions = 1;<br>
&gt;  <br>
&gt; +             /*<br>
&gt; +              * If table is shared, calculate the offset at which to find the<br>
&gt; +              * the first partition of elements<br>
&gt; +              */<br>
&gt; +             if (hashp-&gt;isshared)<br>
&gt; +             {<br>
&gt; +                     int                     nsegs;<br>
&gt; +                     int                     nbuckets;<br>
&gt; +                     nsegs = find_num_of_segs(nelem, &amp;nbuckets, hctl-&gt;num_partitions, hctl-&gt;ssize);<br>
&gt; +                     <br>
&gt; +                     curr_offset =  (((char *) hashp-&gt;hctl) + sizeof(HASHHDR) + (info-&gt;dsize * sizeof(HASHSEGMENT)) +<br>
&gt; +                        + (sizeof(HASHBUCKET) * hctl-&gt;ssize * nsegs));<br>
&gt; +             }<br>
&gt; +<br>
<br>
Why only do this for shared hashtables? Couldn&#39;t we allocate the elements<br>
together with the rest for non-shared hashtables too?<br></blockquote><div><br>I think it is possible to consolidate the allocations for non-shared hash tables<br>too. However, the initial number of elements is much smaller in non-shared hash tables, since they <br>can easily be expanded later. Therefore, there is probably less benefit in doing <br>that for non-shared tables.<br>In addition, the proposed changes are aimed at improving the monitoring in </div><div>pg_shmem_allocations, which does not apply to non-shared hash tables. <br>While I believe it is feasible, I am uncertain about the utility of such a change. <br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Seems a bit ugly to go through element_alloc() when pre-allocating.  Perhaps<br>
it&#39;s the best thing we can do to avoid duplicating code, but it seems worth<br>
checking if we can do better. Perhaps we could split element_alloc() into<br>
element_alloc() and element_add() or such?  With the latter doing everything<br>
after hashp-&gt;alloc().<br>
<br></blockquote><div> </div><div>Makes sense. I split element_alloc() into element_alloc() and element_add().</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
&gt; -<br>
&gt; +     <br>
&gt;       /*<br>
&gt;        * initialize mutexes if it&#39;s a partitioned table<br>
&gt;        */<br>
<br>
Spurious change.<br>
<br></blockquote><div> </div><div>Fixed. <br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
A function called find_num_of_segs() that also sets nbuckets seems a bit<br>
confusing.  I also don&#39;t like &quot;find_*&quot;, as that sounds like it&#39;s searching<br>
some datastructure, rather than just doing a bit of math.<br></blockquote><div> </div><div> I renamed it to compute_buckets_and_segs(). I am open to better suggestions. <br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
&gt;       segp = (HASHSEGMENT) hashp-&gt;alloc(sizeof(HASHBUCKET) * hashp-&gt;ssize);<br>
&gt;  <br>
&gt;       if (!segp)<br>
<br>
Spurious change.<br></blockquote><div><br></div><div>Fixed. <br><br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
&gt; -static int<br>
&gt; +int<br>
&gt;  next_pow2_int(long num)<br>
&gt;  {<br>
&gt;       if (num &gt; INT_MAX / 2)<br>
&gt; @@ -1957,3 +1995,31 @@ AtEOSubXact_HashTables(bool isCommit, int nestDepth)<br>
&gt;               }<br>
&gt;       }<br>
&gt;  }<br>
<br>
Why export this?<br>
<br></blockquote><div>It was a stale change; I have removed it now. </div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
&gt; diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h<br>
&gt; index 932cc4f34d..5e16bd4183 100644<br>
&gt; --- a/src/include/utils/hsearch.h<br>
&gt; +++ b/src/include/utils/hsearch.h<br>
&gt; @@ -151,7 +151,7 @@ extern void hash_seq_term(HASH_SEQ_STATUS *status);<br>
&gt;  extern void hash_freeze(HTAB *hashp);<br>
&gt;  extern Size hash_estimate_size(long num_entries, Size entrysize);<br>
&gt;  extern long hash_select_dirsize(long num_entries);<br>
&gt; -extern Size hash_get_shared_size(HASHCTL *info, int flags);<br>
&gt; +extern Size hash_get_shared_size(HASHCTL *info, int flags, long init_size);<br>
&gt;  extern void AtEOXact_HashTables(bool isCommit);<br>
&gt;  extern void AtEOSubXact_HashTables(bool isCommit, int nestDepth);<br>
<br>
It&#39;s imo a bit weird that we have very related logic in hash_estimate_size()<br>
and hash_get_shared_size(). Why do we need to duplicate it?<br><br></blockquote><div> </div><div>hash_estimate_size() estimates the size using default values, whereas hash_get_shared_size()<br>calculates it using the specific values selected by the flags associated with the hash<br>table.  For instance, the segment size used by the former is DEF_SEGSIZE, while <br>the latter uses info-&gt;ssize, which is set when the HASH_SEGMENT flag is set.<br>Hence, the two functions might return different shared memory sizes.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
These I&#39;d just combine with the ShmemInitStruct(&quot;PredXactList&quot;), by allocating<br>
the additional space. The pointer math is a bit annoying, but it makes much<br>
more sense to have one entry in pg_shmem_allocations.<br>
<br></blockquote><div>Fixed accordingly.<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
&gt; -             (TransactionId *) ShmemAlloc(TotalProcs * sizeof(*ProcGlobal-&gt;xids));<br>
&gt; +             (TransactionId *) ShmemInitStruct(&quot;Proc Transaction Ids&quot;, TotalProcs * sizeof(*ProcGlobal-&gt;xids), &amp;found);<br>
&gt;       MemSet(ProcGlobal-&gt;xids, 0, TotalProcs * sizeof(*ProcGlobal-&gt;xids));<br>
&gt; -     ProcGlobal-&gt;subxidStates = (XidCacheStatus *) ShmemAlloc(TotalProcs * sizeof(*ProcGlobal-&gt;subxidStates));<br>
&gt; +     ProcGlobal-&gt;subxidStates = (XidCacheStatus *) ShmemInitStruct(&quot;Proc Sub-transaction id states&quot;, TotalProcs * sizeof(*ProcGlobal-&gt;subxidStates), &amp;found);<br>
&gt;       MemSet(ProcGlobal-&gt;subxidStates, 0, TotalProcs * sizeof(*ProcGlobal-&gt;subxidStates));<br>
&gt; -     ProcGlobal-&gt;statusFlags = (uint8 *) ShmemAlloc(TotalProcs * sizeof(*ProcGlobal-&gt;statusFlags));<br>
&gt; +     ProcGlobal-&gt;statusFlags = (uint8 *) ShmemInitStruct(&quot;Proc Status Flags&quot;, TotalProcs * sizeof(*ProcGlobal-&gt;statusFlags), &amp;found);<br>
&gt;       MemSet(ProcGlobal-&gt;statusFlags, 0, TotalProcs * sizeof(*ProcGlobal-&gt;statusFlags));<br>
&gt;  <br>
&gt;       /*<br>
<br>
Same.<br>
<br>
Although here I&#39;d say it&#39;s worth padding the size of each separate<br>
&quot;allocation&quot; by PG_CACHE_LINE_SIZE.<br></blockquote><div><br></div><div>Made this change.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
&gt; -     fpPtr = ShmemAlloc(TotalProcs * (fpLockBitsSize + fpRelIdSize));<br>
&gt; +     fpPtr = ShmemInitStruct(&quot;Fast path lock arrays&quot;, TotalProcs * (fpLockBitsSize + fpRelIdSize), &amp;found);<br>
&gt;       MemSet(fpPtr, 0, TotalProcs * (fpLockBitsSize + fpRelIdSize));<br>
&gt;  <br>
&gt;       /* For asserts checking we did not overflow. */<br>
<br>
This one might actually make sense to keep separate, depending on the<br>
configuration it can be reasonably big (max_connections = 1k,<br>
max_locks_per_transaction=1k results in ~5MB)..<br>
<br></blockquote><div>OK <br><br>Please find attached the rebased patches with the above changes.<br><br>Kindly let me know your views.<br><br>Thank you,<br>Rahila Syed</div></div></div>