Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1trjU6-008RdK-SQ for pgsql-hackers@arkaria.postgresql.org; Mon, 10 Mar 2025 20:12:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1trjU5-00HJvn-IM for pgsql-hackers@arkaria.postgresql.org; Mon, 10 Mar 2025 20:12:21 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1trfIX-00EDfD-QM for pgsql-hackers@lists.postgresql.org; Mon, 10 Mar 2025 15:44:10 +0000 Received: from mail-ej1-x62f.google.com ([2a00:1450:4864:20::62f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1trfIV-0023JA-2J for pgsql-hackers@postgresql.org; Mon, 10 Mar 2025 15:44:08 +0000 Received: by mail-ej1-x62f.google.com with SMTP id a640c23a62f3a-ac2963dc379so171660966b.2 for ; Mon, 10 Mar 2025 08:44:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741621446; x=1742226246; darn=postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=r0N/UWQARDrzzwRYQ6UIJpxQezCFBCOjjI0WYXeOUno=; b=m0FKzM+AI3Mhi9k4GDvhA+ZENh/bZdpWwbrWSQKCeu7fs6G7etkjU2B1AFCqvbNXRa E1de9qP/L6awb3O9Nnn04B8GJyoeGQ3WyBnPgF9cq15WIUwwFbj9L4bLRQ5VOQ/4eUpc Zqc81znoPK+hnjhkUiDqZxn6qGLaVsPS8ZjOVhWZzTvhANLXv1ocWUGEMqPH4zkcyOcw ytmp8DO8gAkOPuXmtDOLZxywSac/laaZbAhaTFsdFIaT9fCIZtFtnswLaAgqtpjqzTTV /nZnm1NuIeKFZuJmlYf+RyMZY6E15w+xiD/dbjv5II7m3kQ+/O/1LZxAZCQ5hvdII4If 7ALA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741621446; x=1742226246; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=r0N/UWQARDrzzwRYQ6UIJpxQezCFBCOjjI0WYXeOUno=; b=sTky20wZwPS8fbL+jt5Zw/CMFRDzUUleibE7OsZXFi+BLtnpQy15XeDWESBgrvgxFi WY4m8egwKnKEISwdHKaH5t2lcgPOk03VVUcYckb9Pb/xzuWPcCZKgGSkl9fdnlajeeF1 dirOfWpW5O6SsPi/ZeXVGpBg1+flclONdShmR2XB04l5uFWSdAvEziAS9AZGusAS3wFZ MZIFaZHu3ZcrhX6+LkB4GIfoQfAxHAYT6KGf3yYNdLzOHKid6/CFuUy5257AQilqM4L2 6qb8OORZBRDCXzorhaDmBe6xFUy5tIyjBLVq0H1TrYpLcXSju0VvRA0dY9RKYFGzDf4V Ayvw== X-Gm-Message-State: AOJu0Yyk1ENNkEFihoEjgbvxghb9YhkNY7LoVOd+TuLajBS/P9G66VB5 LY1pfQ4f8khqwGtePIYK9Enh1uSdvdD5Gi0jxlYlPvM0pY+S24cEJ6cpCfaJeT4XIV3qeATaDPU e6tKRnpabaRurLBtwyQVYBbN7MaaqKG3n1OxeXA== X-Gm-Gg: ASbGncuJFjARNOO9IndILA8VD3GDKZlGiWji8PftI2zLP7XN8LgVNMJRCd7buEGYKrp 94eUE5/xxxVKhS1HeyVMAUhFBk8BtEMUwKja3R5mY0Sbdww1P3Tm/guXwYmU6BHO/g64OcT86vp OhGj/cczG7dlL4oIcB+NCG56rIpVKm1L6m X-Google-Smtp-Source: AGHT+IEZaR0iE2iwC5yY20Erux/DGYiVqjoZJG1LHlJFIuTOCCkTsgtZhbiD0M1cj76Z0EuctnzNjTzPdrQBkZ1XR3o= X-Received: by 2002:a05:6402:5201:b0:5df:6a:54ea with SMTP id 4fb4d7f45d1cf-5e5e22b92e8mr35453808a12.11.1741621445575; Mon, 10 Mar 2025 08:44:05 -0700 (PDT) MIME-Version: 1.0 From: Naga Appani Date: Mon, 10 Mar 2025 10:43:54 -0500 X-Gm-Features: AQ5f1JrGtjGSDFZIc901H0nxhf1KBBiTQ-hKWT4z_B5qmp1-Of6jb5PtXwxm6xo Message-ID: Subject: [Proposal] Expose internal MultiXact member count function for efficient monitoring To: pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="00000000000098d182062ffed75d" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000098d182062ffed75d Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi, I would like to propose exposing an internal PostgreSQL function called ReadMultiXactCounts() to allow for efficient monitoring of MultiXact member usage. This function provides an accurate, real-time view of MultiXact activity by directly retrieving the actual member count, rather than relying on storage-based calculations. *Current Challenges: *The existing approach we are currently using to estimate MultiXact member usage has several drawbacks: - *Filesystem scanning overhead: *These functions recursively scan the pg_multixact directory, iterating over potentially thousands or millions of files, and retrieving file sizes using stat() calls, which introduces significant I/O overhead. - *Potential performance bottleneck:* On systems with high transaction throughput generating large numbers of MultiXact members, the filesystem-based approach scales poorly due to the latency of stat() cal= ls, especially on network-based filesystems like RDS/Aurora. - *Not a real-time or memory-efficient solution:* The current approach does not provide a direct, in-memory view of MultiXact activity. *Proposed Solution*The internal ReadMultiXactCounts() function, implemented in multixact.c, directly calculates the number of MultiXact members by reading live state from shared memory. This approach avoids the performance issues of the current filesystem-based estimation methods. By exposing ReadMultiXactCounts() for external use, we can provide PostgreSQL users with an efficient way to monitor MultiXact member usage. This could be particularly useful for integrating with tools like Amazon RDS Performance Insights and Amazon CloudWatch to provide enhanced database insights and proactive managed monitoring for users. The performance comparison between the current and proposed approaches shows a significant improvement, with the proposed solution taking only a fraction of a millisecond to retrieve the MultiXact member count, compared to tens or hundreds of milliseconds for the current filesystem-based approach. Following is the comparison of performance between calculating storage of MultiXact members directory and retrieving the count of members. Implementation Used size MultiXact members (approx) Time taken (ms) Time factor (vs Baseline) EC2 community (RDS version of pg_ls_multixactdir) 8642 MB 1.8 billion 96.87= 9 1.00 Linux du command 8642 MB 1.8 billion 96 NA Proposal (ReadMultiXactCounts) N/A 1.99 billion 0.167 580 times faster I believe exposing ReadMultiXactCounts() would be a valuable addition to the PostgreSQL ecosystem, providing users with a more reliable and efficient way to monitor MultiXact usage. Appreciate your feedback or discussion on this proposal. Please let me know if this approach is acceptable, so I=E2=80=99ll go ahea= d and submit a patch. Thank you! Best regards, Naga Appani Postgres Database Engineer Amazon Web Services --00000000000098d182062ffed75d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,

I would like to propose exposing an internal P= ostgreSQL function called=C2=A0ReadMultiXactCounts()=C2=A0to a= llow for efficient monitoring of MultiXact member usage. This function prov= ides an accurate, real-time view of MultiXact activity by directly retrievi= ng the actual member count, rather than relying on storage-based calculatio= ns.

Current Challenges:=C2=A0The existing approac= h we are currently using to estimate MultiXact member usage has several dra= wbacks:
  • Filesystem scanning overhead:=C2=A0These function= s recursively scan the=C2=A0pg_multixact=C2=A0directory, itera= ting over potentially thousands or millions of files, and retrieving file s= izes using=C2=A0stat()=C2=A0calls, which introduces significan= t I/O overhead.
  • Potential performance bottleneck:=C2=A0On sy= stems with high transaction throughput generating large numbers of MultiXac= t members, the filesystem-based approach scales poorly due to the latency o= f=C2=A0stat()=C2=A0calls, especially on network-based filesyst= ems like RDS/Aurora.
  • Not a real-time or memory-efficient solutio= n:=C2=A0The current approach does not provide a direct, in-memory view = of MultiXact activity.
Proposed SolutionThe internal=C2=A0<= code>ReadMultiXactCounts()=C2=A0function, implemented in=C2=A0= multixact.c, directly calculates the number of MultiXact members by = reading live state from shared memory. This approach avoids the performance= issues of the current filesystem-based estimation methods.

B= y exposing=C2=A0ReadMultiXactCounts()=C2=A0for external use, w= e can provide PostgreSQL users with an efficient way to monitor MultiXact m= ember usage. This could be particularly useful for integrating with tools l= ike Amazon RDS Performance Insights and Amazon CloudWatch to provide enhanc= ed database insights and proactive managed monitoring for users.

The performance comparison between the current and proposed approac= hes shows a significant improvement, with the proposed solution taking only= a fraction of a millisecond to retrieve the MultiXact member count, compar= ed to tens or hundreds of milliseconds for the current filesystem-based app= roach.=C2=A0

Following is the comparison of perfor= mance between calculating storage of MultiXact members directory and retrie= ving=C2=A0the count of members.=C2=A0

Impl= ementation Used = size Mul= tiXact members (approx) Time= taken (ms) Tim= e factor (vs Baseline)
EC2 community (RDS = version of pg_ls_multixactdir) 8642 MB 1.8 billion 96.879 1.00
Linux du command 8642 MB 1.8 billion 96 NA
Proposal (ReadMulti= XactCounts) N/A 1.99 billion 0.167 580 times faster

I believe exposing=C2=A0ReadMultiXactC= ounts()=C2=A0would be a valuable addition to the PostgreSQL ecosyste= m, providing users with a more reliable and efficient way to monitor MultiX= act usage. Appreciate your feedback or discussion on this proposal.=C2=A0

Please let me know if this approach is acceptable,= =C2=A0 so I=E2=80=99ll go ahead and submit a patch.

Thank you!

Best regards,=C2=A0
Naga Appani
Postgres Database Engineer
Amazon Web Services
--00000000000098d182062ffed75d--