public inbox for [email protected]  
help / color / mirror / Atom feed
From: Ayush Tiwari <[email protected]>
To: Andrey Borodin <[email protected]>
Cc: Marko Tiikkaja <[email protected]>
Cc: [email protected]
Cc: PostgreSQL mailing lists <[email protected]>
Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
Date: Thu, 21 May 2026 13:15:07 +0530
Message-ID: <CAJTYsWV6X4D7eyg=WWLwF+ucq1BvvF4nffgy0XGxiA59ty1C6w@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<CAL9smLBMxKBCmsA9UGcmf93bT2_MsZ+POH-oHREuwKdmMU7jfQ@mail.gmail.com>
	<[email protected]>

Hi,

On Thu, 21 May 2026 at 12:55, Andrey Borodin <[email protected]> wrote:

>
>
> > On 21 May 2026, at 00:12, Marko Tiikkaja <[email protected]> wrote:
> >
> > #8  0x0000654c8ae2acba in SimpleLruWriteAll (ctl=0x654c8b63e400
>
> Thanks!
>
> This clearly points to SimpleLruWriteAll() added in 77dff5d937b1.
> If by chance you will have a backtrace of another deadlocking process -
> please post it.
>
> But it's not strictly necessary for analysis, I think we can figure out
> what
> happened from the backtrace you already posted.
>

I had a look at the code that Marko's backtrace pointed at and I
believe this is a straightforward self-deadlock introduced by
77dff5d937b.

In RecordNewMultiXact() on REL_16_STABLE:

  LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);

  ...

  if (InRecovery && next_pageno != pageno)
  {
      ...
      if (last_initialized_offsets_page == -1)
      {
          SimpleLruWriteAll(MultiXactOffsetCtl, false);  /* <-- here */
          init_needed = !SimpleLruDoesPhysicalPageExist(MultiXactOffsetCtl,
next_pageno);
      }
      else
          init_needed = (last_initialized_offsets_page == pageno);
      ...
  }

The outer LWLockAcquire takes MultiXactOffsetSLRULock EXCLUSIVE.
SimpleLruWriteAll() in REL_16_STABLE then does

  LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);

and for the MultiXactOffsetCtl SLRU, shared->ControlLock is
MultiXactOffsetSLRULock (set up by SimpleLruInit(...
MultiXactOffsetSLRULock ...)).
So it tries to take the very lock the same backend already holds.
LWLockAcquire does not detect that and parks the process on
LWLock:MultiXactOffsetSLRU forever.

That matches every datum in the report:

  -  wait_event = LWLock:MultiXactOffsetSLRU.
   - pg_stat_slru shows zero MultiXact activity, because the
    SimpleLruWriteAll loop never gets past LWLockAcquire to actually
    write a page.
  - Restart unwedges things briefly.
  - The deadlock only triggers when last_initialized_offsets_page is
    still -1, i.e. before any XLOG_MULTIXACT_ZERO_OFF_PAGE record has
    been replayed in this recovery session, which is at most once per
    startup and consistent with the "recurs after catch-up" behaviour.

The "safety flush" the comment justifies is it needed?
Every offsets page that this code path initializes is synchronously written
via SimpleLruWritePage() a few lines below the SimpleLruZeroPage(),
with an Assert that the page is clean afterwards.  So at the moment
we call SimpleLruDoesPhysicalPageExist(), there shouldn't be a relevant
dirty offsets page in the SLRU buffer cache that would lead to a
false negative.  Dropping the SimpleLruWriteAll() call therefore
removes the self-deadlock without changing correctness.

Maybe I'm missing something here. Thoughts?

Regards,
Ayush


reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
  In-Reply-To: <CAJTYsWV6X4D7eyg=WWLwF+ucq1BvvF4nffgy0XGxiA59ty1C6w@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox