public inbox for [email protected]  
help / color / mirror / Atom feed
From: Heikki Linnakangas <[email protected]>
To: Andres Freund <[email protected]>
To: Bertrand Drouvot <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals)
Date: Tue, 10 Feb 2026 22:53:58 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On 10/02/2026 21:46, Andres Freund wrote:
> On 2026-02-10 19:15:27 +0000, Bertrand Drouvot wrote:
>> On Tue, Feb 10, 2026 at 01:15:01PM -0500, Andres Freund wrote:
>>> On 2026-02-10 19:14:44 +0200, Heikki Linnakangas wrote:
>>> Yea, I don't think we need to be perfect here. Just a bit less bad. And, as
>>> you say, the current order doesn't make a lot of sense.
>>> Just grouping things like
>>> - pid, pgxactoff, backendType (i.e. barely if ever changing)
>>> - wait_event_info, waitStart (i.e. very frequently changing, but typically
>>>    accessed within one proc)
>>> - sem, lwWaiting, waitLockMode (i.e. stuff that is updated frequently and
>>>    accessed across processes)
>>
>> With an ordering like in the attached (to apply on top of Heikki's patch), we're
>> back to 832 bytes.
> 
> You'd really need to insert padding between the sections to make it work...

Here's my attempt at grouping things more logically. I didn't insert 
padding and also didn't try to avoid alignment padding. I tried to 
optimize for readability rather than size or performance. That said, I 
would assume that grouping things logically like this would also help to 
avoid false sharing. If not, inserting explicit padding seems like a a 
good fix.

I also think we should split 'links' into two fields. For clarity.

With this, sizeof(PGPROC) == 864 without the explicit alignment to 
PG_CACHE_LINE_SIZE, and 896 with it.

- Heikki


Attachments:

  [text/x-patch] v2-0001-Align-PGPROC-to-cache-line-boundary.patch (1.5K, 2-v2-0001-Align-PGPROC-to-cache-line-boundary.patch)
  download | inline diff:
From 83481187f6e2f8c177bc85c522917d64e1b78b4b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 10 Feb 2026 18:53:31 +0200
Subject: [PATCH v2 1/4] Align PGPROC to cache line boundary

---
 src/include/storage/proc.h | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index ac0df4aeaaa..53acce8a5a1 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -182,7 +182,7 @@ typedef enum
  *
  * See PROC_HDR for details.
  */
-struct PGPROC
+typedef struct PGPROC
 {
 	dlist_node	links;			/* list link if process is in a list */
 	dlist_head *procgloballist; /* procglobal list that owns this PGPROC */
@@ -337,10 +337,18 @@ struct PGPROC
 	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
 	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
 	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
-};
-
-/* NOTE: "typedef struct PGPROC PGPROC" appears in storage/lock.h. */
+}
 
+/*
+ * If compiler understands aligned pragma, use it to align the struct at cache
+ * line boundaries.  This is just for performance, to (a) avoid false sharing
+ * and (b) to make the multiplication / division to convert between PGPROC *
+ * and ProcNumber be a little cheaper.
+ */
+#if defined(pg_attribute_aligned)
+			pg_attribute_aligned(PG_CACHE_LINE_SIZE)
+#endif
+PGPROC;
 
 extern PGDLLIMPORT PGPROC *MyProc;
 
-- 
2.47.3



  [text/x-patch] v2-0002-Remove-useless-store.patch (900B, 3-v2-0002-Remove-useless-store.patch)
  download | inline diff:
From f5485ae9eb4b12eb7e57b8b7e7fbdaa0df7c5575 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 10 Feb 2026 21:54:56 +0200
Subject: [PATCH v2 2/4] Remove useless store

It was a mishap introduced in commit 5764f611e1, which converted the
loop to use dclist_foreach
---
 src/backend/storage/lmgr/lock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 7f0cd784f79..e1168ad3837 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -4148,7 +4148,6 @@ GetSingleProcBlockerStatusData(PGPROC *blocked_proc, BlockedProcsData *data)
 		if (queued_proc == blocked_proc)
 			break;
 		data->waiter_pids[data->npids++] = queued_proc->pid;
-		queued_proc = (PGPROC *) queued_proc->links.next;
 	}
 
 	bproc->num_locks = data->nlocks - bproc->first_lock;
-- 
2.47.3



  [text/x-patch] v2-0003-Split-PGPROC-links-field-into-two-for-clarity.patch (13.0K, 4-v2-0003-Split-PGPROC-links-field-into-two-for-clarity.patch)
  download | inline diff:
From b4cc9e67a3fb8344a7c4934723fde3d012f2105e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 10 Feb 2026 22:45:08 +0200
Subject: [PATCH v2 3/4] Split PGPROC 'links' field into two, for clarity

The same field was mainly used for the position in a LOCK's wait
queue, but also as the position in a the freelist when the PGPROC
entry was not in use. The reuse saves a some memory, at the expense of
readability, which seems like a bad tradeoff. If we wanted to make the
struct smaller there's other things we could do, but we're actually
just discussing adding padding to the struct for performance reasons.
---
 src/backend/access/transam/twophase.c |  2 +-
 src/backend/storage/lmgr/deadlock.c   | 12 ++++-----
 src/backend/storage/lmgr/lock.c       |  6 ++---
 src/backend/storage/lmgr/proc.c       | 37 ++++++++++++++-------------
 src/include/storage/proc.h            |  8 +++---
 5 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index eabc4d48208..e4340b59640 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -447,7 +447,6 @@ MarkAsPreparingGuts(GlobalTransaction gxact, FullTransactionId fxid,
 
 	/* Initialize the PGPROC entry */
 	MemSet(proc, 0, sizeof(PGPROC));
-	dlist_node_init(&proc->links);
 	proc->waitStatus = PROC_WAIT_STATUS_OK;
 	if (LocalTransactionIdIsValid(MyProc->vxid.lxid))
 	{
@@ -474,6 +473,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, FullTransactionId fxid,
 	proc->lwWaiting = LW_WS_NOT_WAITING;
 	proc->lwWaitMode = 0;
 	proc->waitLock = NULL;
+	dlist_node_init(&proc->waitLink);
 	proc->waitProcLock = NULL;
 	pg_atomic_init_u64(&proc->waitStart, 0);
 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 0a8dd5eb7c2..d642500d4c9 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -260,7 +260,7 @@ DeadLockCheck(PGPROC *proc)
 		/* Reset the queue and re-add procs in the desired order */
 		dclist_init(waitQueue);
 		for (int j = 0; j < nProcs; j++)
-			dclist_push_tail(waitQueue, &procs[j]->links);
+			dclist_push_tail(waitQueue, &procs[j]->waitLink);
 
 #ifdef DEBUG_DEADLOCK
 		PrintLockQueue(lock, "rearranged to:");
@@ -502,7 +502,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
 	 * If the process is waiting, there is an outgoing waits-for edge to each
 	 * process that blocks it.
 	 */
-	if (checkProc->links.next != NULL && checkProc->waitLock != NULL &&
+	if (!dlist_node_is_detached(&checkProc->waitLink) &&
 		FindLockCycleRecurseMember(checkProc, checkProc, depth, softEdges,
 								   nSoftEdges))
 		return true;
@@ -520,7 +520,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
 
 		memberProc = dlist_container(PGPROC, lockGroupLink, iter.cur);
 
-		if (memberProc->links.next != NULL && memberProc->waitLock != NULL &&
+		if (!dlist_node_is_detached(&memberProc->waitLink) && memberProc->waitLock != NULL &&
 			memberProc != checkProc &&
 			FindLockCycleRecurseMember(memberProc, checkProc, depth, softEdges,
 									   nSoftEdges))
@@ -713,7 +713,7 @@ FindLockCycleRecurseMember(PGPROC *checkProc,
 		{
 			dclist_foreach(proc_iter, waitQueue)
 			{
-				proc = dlist_container(PGPROC, links, proc_iter.cur);
+				proc = dlist_container(PGPROC, waitLink, proc_iter.cur);
 
 				if (proc->lockGroupLeader == checkProcLeader)
 					lastGroupMember = proc;
@@ -728,7 +728,7 @@ FindLockCycleRecurseMember(PGPROC *checkProc,
 		{
 			PGPROC	   *leader;
 
-			proc = dlist_container(PGPROC, links, proc_iter.cur);
+			proc = dlist_container(PGPROC, waitLink, proc_iter.cur);
 
 			leader = proc->lockGroupLeader == NULL ? proc :
 				proc->lockGroupLeader;
@@ -877,7 +877,7 @@ TopoSort(LOCK *lock,
 	i = 0;
 	dclist_foreach(proc_iter, waitQueue)
 	{
-		proc = dlist_container(PGPROC, links, proc_iter.cur);
+		proc = dlist_container(PGPROC, waitLink, proc_iter.cur);
 		topoProcs[i++] = proc;
 	}
 	Assert(i == queue_size);
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index e1168ad3837..d930c66cdbd 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2052,13 +2052,13 @@ RemoveFromWaitQueue(PGPROC *proc, uint32 hashcode)
 
 	/* Make sure proc is waiting */
 	Assert(proc->waitStatus == PROC_WAIT_STATUS_WAITING);
-	Assert(proc->links.next != NULL);
+	Assert(!dlist_node_is_detached(&proc->waitLink));
 	Assert(waitLock);
 	Assert(!dclist_is_empty(&waitLock->waitProcs));
 	Assert(0 < lockmethodid && lockmethodid < lengthof(LockMethods));
 
 	/* Remove proc from lock's wait queue */
-	dclist_delete_from_thoroughly(&waitLock->waitProcs, &proc->links);
+	dclist_delete_from_thoroughly(&waitLock->waitProcs, &proc->waitLink);
 
 	/* Undo increments of request counts by waiting process */
 	Assert(waitLock->nRequested > 0);
@@ -4143,7 +4143,7 @@ GetSingleProcBlockerStatusData(PGPROC *blocked_proc, BlockedProcsData *data)
 	/* Collect PIDs from the lock's wait queue, stopping at blocked_proc */
 	dclist_foreach(proc_iter, waitQueue)
 	{
-		PGPROC	   *queued_proc = dlist_container(PGPROC, links, proc_iter.cur);
+		PGPROC	   *queued_proc = dlist_container(PGPROC, waitLink, proc_iter.cur);
 
 		if (queued_proc == blocked_proc)
 			break;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 31ccdb1ef89..dbc2ad931b8 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -331,25 +331,25 @@ InitProcGlobal(void)
 		if (i < MaxConnections)
 		{
 			/* PGPROC for normal backend, add to freeProcs list */
-			dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
+			dlist_push_tail(&ProcGlobal->freeProcs, &proc->freeProcsLink);
 			proc->procgloballist = &ProcGlobal->freeProcs;
 		}
 		else if (i < MaxConnections + autovacuum_worker_slots + NUM_SPECIAL_WORKER_PROCS)
 		{
 			/* PGPROC for AV or special worker, add to autovacFreeProcs list */
-			dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
+			dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->freeProcsLink);
 			proc->procgloballist = &ProcGlobal->autovacFreeProcs;
 		}
 		else if (i < MaxConnections + autovacuum_worker_slots + NUM_SPECIAL_WORKER_PROCS + max_worker_processes)
 		{
 			/* PGPROC for bgworker, add to bgworkerFreeProcs list */
-			dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
+			dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->freeProcsLink);
 			proc->procgloballist = &ProcGlobal->bgworkerFreeProcs;
 		}
 		else if (i < MaxBackends)
 		{
 			/* PGPROC for walsender, add to walsenderFreeProcs list */
-			dlist_push_tail(&ProcGlobal->walsenderFreeProcs, &proc->links);
+			dlist_push_tail(&ProcGlobal->walsenderFreeProcs, &proc->freeProcsLink);
 			proc->procgloballist = &ProcGlobal->walsenderFreeProcs;
 		}
 
@@ -438,7 +438,7 @@ InitProcess(void)
 
 	if (!dlist_is_empty(procgloballist))
 	{
-		MyProc = dlist_container(PGPROC, links, dlist_pop_head_node(procgloballist));
+		MyProc = dlist_container(PGPROC, freeProcsLink, dlist_pop_head_node(procgloballist));
 		SpinLockRelease(ProcStructLock);
 	}
 	else
@@ -471,7 +471,7 @@ InitProcess(void)
 	 * Initialize all fields of MyProc, except for those previously
 	 * initialized by InitProcGlobal.
 	 */
-	dlist_node_init(&MyProc->links);
+	dlist_node_init(&MyProc->freeProcsLink);
 	MyProc->waitStatus = PROC_WAIT_STATUS_OK;
 	MyProc->fpVXIDLock = false;
 	MyProc->fpLocalTransactionId = InvalidLocalTransactionId;
@@ -493,6 +493,7 @@ InitProcess(void)
 	MyProc->lwWaiting = LW_WS_NOT_WAITING;
 	MyProc->lwWaitMode = 0;
 	MyProc->waitLock = NULL;
+	dlist_node_init(&MyProc->waitLink);
 	MyProc->waitProcLock = NULL;
 	pg_atomic_write_u64(&MyProc->waitStart, 0);
 #ifdef USE_ASSERT_CHECKING
@@ -672,7 +673,7 @@ InitAuxiliaryProcess(void)
 	 * Initialize all fields of MyProc, except for those previously
 	 * initialized by InitProcGlobal.
 	 */
-	dlist_node_init(&MyProc->links);
+	dlist_node_init(&MyProc->freeProcsLink);
 	MyProc->waitStatus = PROC_WAIT_STATUS_OK;
 	MyProc->fpVXIDLock = false;
 	MyProc->fpLocalTransactionId = InvalidLocalTransactionId;
@@ -689,6 +690,7 @@ InitAuxiliaryProcess(void)
 	MyProc->lwWaiting = LW_WS_NOT_WAITING;
 	MyProc->lwWaitMode = 0;
 	MyProc->waitLock = NULL;
+	dlist_node_init(&MyProc->waitLink);
 	MyProc->waitProcLock = NULL;
 	pg_atomic_write_u64(&MyProc->waitStart, 0);
 #ifdef USE_ASSERT_CHECKING
@@ -849,7 +851,7 @@ LockErrorCleanup(void)
 	partitionLock = LockHashPartitionLock(lockAwaited->hashcode);
 	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
-	if (!dlist_node_is_detached(&MyProc->links))
+	if (!dlist_node_is_detached(&MyProc->waitLink))
 	{
 		/* We could not have been granted the lock yet */
 		RemoveFromWaitQueue(MyProc, lockAwaited->hashcode);
@@ -981,7 +983,7 @@ ProcKill(int code, Datum arg)
 
 				/* Leader exited first; return its PGPROC. */
 				SpinLockAcquire(ProcStructLock);
-				dlist_push_head(procgloballist, &leader->links);
+				dlist_push_head(procgloballist, &leader->freeProcsLink);
 				SpinLockRelease(ProcStructLock);
 			}
 		}
@@ -1026,7 +1028,7 @@ ProcKill(int code, Datum arg)
 		Assert(dlist_is_empty(&proc->lockGroupMembers));
 
 		/* Return PGPROC structure (and semaphore) to appropriate freelist */
-		dlist_push_tail(procgloballist, &proc->links);
+		dlist_push_tail(procgloballist, &proc->freeProcsLink);
 	}
 
 	/* Update shared estimate of spins_per_delay */
@@ -1215,7 +1217,7 @@ JoinWaitQueue(LOCALLOCK *locallock, LockMethod lockMethodTable, bool dontWait)
 
 		dclist_foreach(iter, waitQueue)
 		{
-			PGPROC	   *proc = dlist_container(PGPROC, links, iter.cur);
+			PGPROC	   *proc = dlist_container(PGPROC, waitLink, iter.cur);
 
 			/*
 			 * If we're part of the same locking group as this waiter, its
@@ -1279,9 +1281,9 @@ JoinWaitQueue(LOCALLOCK *locallock, LockMethod lockMethodTable, bool dontWait)
 	 * Insert self into queue, at the position determined above.
 	 */
 	if (insert_before)
-		dclist_insert_before(waitQueue, &insert_before->links, &MyProc->links);
+		dclist_insert_before(waitQueue, &insert_before->waitLink, &MyProc->waitLink);
 	else
-		dclist_push_tail(waitQueue, &MyProc->links);
+		dclist_push_tail(waitQueue, &MyProc->waitLink);
 
 	lock->waitMask |= LOCKBIT_ON(lockmode);
 
@@ -1715,13 +1717,13 @@ ProcSleep(LOCALLOCK *locallock)
 void
 ProcWakeup(PGPROC *proc, ProcWaitStatus waitStatus)
 {
-	if (dlist_node_is_detached(&proc->links))
+	if (dlist_node_is_detached(&proc->waitLink))
 		return;
 
 	Assert(proc->waitStatus == PROC_WAIT_STATUS_WAITING);
 
 	/* Remove process from wait queue */
-	dclist_delete_from_thoroughly(&proc->waitLock->waitProcs, &proc->links);
+	dclist_delete_from_thoroughly(&proc->waitLock->waitProcs, &proc->waitLink);
 
 	/* Clean up process' state and pass it the ok/fail signal */
 	proc->waitLock = NULL;
@@ -1752,7 +1754,7 @@ ProcLockWakeup(LockMethod lockMethodTable, LOCK *lock)
 
 	dclist_foreach_modify(miter, waitQueue)
 	{
-		PGPROC	   *proc = dlist_container(PGPROC, links, miter.cur);
+		PGPROC	   *proc = dlist_container(PGPROC, waitLink, miter.cur);
 		LOCKMODE	lockmode = proc->waitLockMode;
 
 		/*
@@ -1816,8 +1818,7 @@ CheckDeadLock(void)
 	 * We check by looking to see if we've been unlinked from the wait queue.
 	 * This is safe because we hold the lock partition lock.
 	 */
-	if (MyProc->links.prev == NULL ||
-		MyProc->links.next == NULL)
+	if (dlist_node_is_detached(&MyProc->waitLink))
 	{
 		result = DS_NO_DEADLOCK;
 		goto check_done;
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 53acce8a5a1..f642830fcf8 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -154,10 +154,6 @@ typedef enum
  * Each backend has a PGPROC struct in shared memory.  There is also a list of
  * currently-unused PGPROC structs that will be reallocated to new backends.
  *
- * links: list link for any list the PGPROC is in.  When waiting for a lock,
- * the PGPROC is linked into that lock's waitProcs queue.  A recycled PGPROC
- * is linked into ProcGlobal's freeProcs list.
- *
  * Note: twophase.c also sets up a dummy PGPROC struct for each currently
  * prepared transaction.  These PGPROCs appear in the ProcArray data structure
  * so that the prepared transactions appear to be still running and are
@@ -184,8 +180,9 @@ typedef enum
  */
 typedef struct PGPROC
 {
-	dlist_node	links;			/* list link if process is in a list */
 	dlist_head *procgloballist; /* procglobal list that owns this PGPROC */
+	dlist_node	freeProcsLink;	/* link in procgloballist, when in recycled
+								 * state */
 
 	PGSemaphore sem;			/* ONE semaphore to sleep on */
 	ProcWaitStatus waitStatus;
@@ -263,6 +260,7 @@ typedef struct PGPROC
 	/* Info about lock the process is currently waiting for, if any. */
 	/* waitLock and waitProcLock are NULL if not currently waiting. */
 	LOCK	   *waitLock;		/* Lock object we're sleeping on ... */
+	dlist_node	waitLink;		/* position in waitLock->waitProcs queue */
 	PROCLOCK   *waitProcLock;	/* Per-holder info for awaited lock */
 	LOCKMODE	waitLockMode;	/* type of lock we're waiting for */
 	LOCKMASK	heldLocks;		/* bitmask for lock types already held on this
-- 
2.47.3



  [text/x-patch] v2-0004-Rearrange-fields-in-PGPROC-for-clarity.patch (8.0K, 5-v2-0004-Rearrange-fields-in-PGPROC-for-clarity.patch)
  download | inline diff:
From 455db18cb525d151f0dee40738023efbcdecdae4 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 10 Feb 2026 22:47:22 +0200
Subject: [PATCH v2 4/4] Rearrange fields in PGPROC, for clarity

---
 src/include/storage/proc.h | 124 +++++++++++++++++++++----------------
 1 file changed, 71 insertions(+), 53 deletions(-)

diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index f642830fcf8..b1077562cb5 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -184,27 +184,31 @@ typedef struct PGPROC
 	dlist_node	freeProcsLink;	/* link in procgloballist, when in recycled
 								 * state */
 
-	PGSemaphore sem;			/* ONE semaphore to sleep on */
-	ProcWaitStatus waitStatus;
-
-	Latch		procLatch;		/* generic latch for process */
+	/*---- Backend identity ----*/
 
+	/*
+	 * These fields that don't change after backend startup, or only very
+	 * rarely
+	 */
+	int			pid;			/* Backend's process ID; 0 if prepared xact */
+	BackendType backendType;	/* what kind of process is this? */
 
-	TransactionId xid;			/* id of top-level transaction currently being
-								 * executed by this proc, if running and XID
-								 * is assigned; else InvalidTransactionId.
-								 * mirrored in ProcGlobal->xids[pgxactoff] */
-
-	TransactionId xmin;			/* minimal running XID as it was when we were
-								 * starting our xact, excluding LAZY VACUUM:
-								 * vacuum must not remove tuples deleted by
-								 * xid >= xmin ! */
+	/* These fields are zero while a backend is still starting up: */
+	Oid			databaseId;		/* OID of database this backend is using */
+	Oid			roleId;			/* OID of role using this backend */
 
-	int			pid;			/* Backend's process ID; 0 if prepared xact */
+	Oid			tempNamespaceId;	/* OID of temp schema this backend is
+									 * using */
 
 	int			pgxactoff;		/* offset into various ProcGlobal->arrays with
 								 * data mirrored from this PGPROC */
 
+	uint8		statusFlags;	/* this backend's status flags, see PROC_*
+								 * above. mirrored in
+								 * ProcGlobal->statusFlags[pgxactoff] */
+
+	/*---- Transactions and snapshots ----*/
+
 	/*
 	 * Currently running top-level transaction's virtual xid. Together these
 	 * form a VirtualTransactionId, but we don't use that struct because this
@@ -224,14 +228,27 @@ typedef struct PGPROC
 									 * InvalidLocalTransactionId */
 	}			vxid;
 
-	/* These fields are zero while a backend is still starting up: */
-	Oid			databaseId;		/* OID of database this backend is using */
-	Oid			roleId;			/* OID of role using this backend */
+	TransactionId xid;			/* id of top-level transaction currently being
+								 * executed by this proc, if running and XID
+								 * is assigned; else InvalidTransactionId.
+								 * mirrored in ProcGlobal->xids[pgxactoff] */
 
-	Oid			tempNamespaceId;	/* OID of temp schema this backend is
-									 * using */
+	TransactionId xmin;			/* minimal running XID as it was when we were
+								 * starting our xact, excluding LAZY VACUUM:
+								 * vacuum must not remove tuples deleted by
+								 * xid >= xmin ! */
 
-	BackendType backendType;	/* what kind of process is this? */
+	XidCacheStatus subxidStatus;	/* mirrored with
+									 * ProcGlobal->subxidStates[i] */
+	struct XidCache subxids;	/* cache for subtransaction XIDs */
+
+	/*---- Inter-process signaling ----*/
+
+	Latch		procLatch;		/* generic latch for process */
+
+	PGSemaphore sem;			/* ONE semaphore to sleep on */
+
+	int			delayChkptFlags;	/* for DELAY_CHKPT_* flags */
 
 	/*
 	 * While in hot standby mode, shows that a conflict signal has been sent
@@ -243,6 +260,8 @@ typedef struct PGPROC
 	 */
 	pg_atomic_uint32 pendingRecoveryConflicts;
 
+	/*---- LWLock waiting ----*/
+
 	/*
 	 * Info about LWLock the process is currently waiting for, if any.
 	 *
@@ -257,6 +276,16 @@ typedef struct PGPROC
 	/* Support for condition variables. */
 	proclist_node cvWaitLink;	/* position in CV wait list */
 
+	/*---- Lock manager data ----*/
+
+	/*
+	 * Support for lock groups.  Use LockHashPartitionLockByProc on the group
+	 * leader to get the LWLock protecting these fields.
+	 */
+	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
+	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
+	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
+
 	/* Info about lock the process is currently waiting for, if any. */
 	/* waitLock and waitProcLock are NULL if not currently waiting. */
 	LOCK	   *waitLock;		/* Lock object we're sleeping on ... */
@@ -265,14 +294,28 @@ typedef struct PGPROC
 	LOCKMODE	waitLockMode;	/* type of lock we're waiting for */
 	LOCKMASK	heldLocks;		/* bitmask for lock types already held on this
 								 * lock object by this backend */
+
 	pg_atomic_uint64 waitStart; /* time at which wait for lock acquisition
 								 * started */
 
-	int			delayChkptFlags;	/* for DELAY_CHKPT_* flags */
+	ProcWaitStatus waitStatus;
 
-	uint8		statusFlags;	/* this backend's status flags, see PROC_*
-								 * above. mirrored in
-								 * ProcGlobal->statusFlags[pgxactoff] */
+	/*
+	 * All PROCLOCK objects for locks held or awaited by this backend are
+	 * linked into one of these lists, according to the partition number of
+	 * their lock.
+	 */
+	dlist_head	myProcLocks[NUM_LOCK_PARTITIONS];
+
+	/*-- recording fast-path locks taken by this backend. --*/
+	LWLock		fpInfoLock;		/* protects per-backend fast-path state */
+	uint64	   *fpLockBits;		/* lock modes held for each fast-path slot */
+	Oid		   *fpRelId;		/* slots for rel oids */
+	bool		fpVXIDLock;		/* are we holding a fast-path VXID lock? */
+	LocalTransactionId fpLocalTransactionId;	/* lxid for fast-path VXID
+												 * lock */
+
+	/*---- Synchronous replication waiting ----*/
 
 	/*
 	 * Info to allow us to wait for synchronous replication, if needed.
@@ -284,18 +327,8 @@ typedef struct PGPROC
 	int			syncRepState;	/* wait state for sync rep */
 	dlist_node	syncRepLinks;	/* list link if process is in syncrep queue */
 
-	/*
-	 * All PROCLOCK objects for locks held or awaited by this backend are
-	 * linked into one of these lists, according to the partition number of
-	 * their lock.
-	 */
-	dlist_head	myProcLocks[NUM_LOCK_PARTITIONS];
-
-	XidCacheStatus subxidStatus;	/* mirrored with
-									 * ProcGlobal->subxidStates[i] */
-	struct XidCache subxids;	/* cache for subtransaction XIDs */
+	/*---- Support for group XID clearing. ----*/
 
-	/* Support for group XID clearing. */
 	/* true, if member of ProcArray group waiting for XID clear */
 	bool		procArrayGroupMember;
 	/* next ProcArray group member waiting for XID clear */
@@ -307,9 +340,7 @@ typedef struct PGPROC
 	 */
 	TransactionId procArrayGroupMemberXid;
 
-	uint32		wait_event_info;	/* proc's wait information */
-
-	/* Support for group transaction status update. */
+	/*---- Support for group transaction status update. ----*/
 	bool		clogGroupMember;	/* true, if member of clog group */
 	pg_atomic_uint32 clogGroupNext; /* next clog group member */
 	TransactionId clogGroupMemberXid;	/* transaction id of clog group member */
@@ -320,21 +351,8 @@ typedef struct PGPROC
 	XLogRecPtr	clogGroupMemberLsn; /* WAL location of commit record for clog
 									 * group member */
 
-	/* Lock manager data, recording fast-path locks taken by this backend. */
-	LWLock		fpInfoLock;		/* protects per-backend fast-path state */
-	uint64	   *fpLockBits;		/* lock modes held for each fast-path slot */
-	Oid		   *fpRelId;		/* slots for rel oids */
-	bool		fpVXIDLock;		/* are we holding a fast-path VXID lock? */
-	LocalTransactionId fpLocalTransactionId;	/* lxid for fast-path VXID
-												 * lock */
-
-	/*
-	 * Support for lock groups.  Use LockHashPartitionLockByProc on the group
-	 * leader to get the LWLock protecting these fields.
-	 */
-	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
-	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
-	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
+	/*---- Status reporting ----*/
+	uint32		wait_event_info;	/* proc's wait information */
 }
 
 /*
-- 
2.47.3



view thread (22+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals)
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox