Re: Comments on Custom RMGRs

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Danil Anisimow <[email protected]>
To: Simon Riggs <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Jeff Davis <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Comments on Custom RMGRs
Date: Mon, 26 Feb 2024 23:29:26 +0700
Message-ID: <CABm2Ma7v4=W-1vH=kiA4M352F=Nd-Nvd_QBZufJ6EMGsS7Q-_A@mail.gmail.com> (raw)
In-Reply-To: <CANbhV-FFgAFeNZ=qNc7Fr991Asym+sfj=R5hb02XeRE6067=hg@mail.gmail.com>
References: <CANbhV-E4pTWeF-DsdaGsOrjJNFWPaR+Dstjrnkqvf9JFFgOKKQ@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CANbhV-G7SfdRLR6r7sxY0uL=TRU0S=cxkWJGWyTdss3u8LFK=g@mail.gmail.com>
	<[email protected]>
	<CANbhV-FFgAFeNZ=qNc7Fr991Asym+sfj=R5hb02XeRE6067=hg@mail.gmail.com>

Hi,

The checkpoint hook looks very useful, especially for extensions that have
their own storage, like pg_stat_statements.
For example, we can keep work data in shared memory and save it only during
checkpoints.
When recovering, we need to read all the data from the disk and then repeat
the latest changes from the WAL.

On Mon, Feb 26, 2024 at 2:42 PM Simon Riggs <[email protected]>
wrote:
> On Fri, 13 May 2022 at 00:42, Andres Freund <[email protected]> wrote:
> >
> > On 2022-05-12 22:26:51 +0100, Simon Riggs wrote:
> > > On Thu, 12 May 2022 at 04:40, Andres Freund <[email protected]>
wrote:
> > > > I'm not happy with the idea of random code being executed in the
middle of
> > > > CheckPointGuts(), without any documentation of what is legal to do
at that
> > > > point.
> > >
> > > The "I'm not happy.." ship has already sailed with pluggable rmgrs.
> >
> > I don't agree. The ordering within a checkpoint is a lot more fragile
than
> > what you do in an individual redo routine.
>
> Example?
>
>
> > > Checkpoints exist for one purpose - as the starting place for
recovery.
> > >
> > > Why would we allow pluggable recovery without *also* allowing
> > > pluggable checkpoints?
> >
> > Because one can do a lot of stuff with just pluggable WAL records,
without
> > integrating into checkpoints?
> >
> > Note that I'm *not* against making checkpoint extensible - I just think
it
> > needs a good bit of design work around when the hook is called etc.
>
> When was any such work done previously for any other hook?? That isn't
needed.
>
> Checkpoints aren't complete until all data structures have
> checkpointed, so there are no problems from a partial checkpoint being
> written.
>
> As a result, the order of actions in CheckpointGuts() is mostly
> independent of each other. The SLRUs are all independent of each
> other, as is CheckPointBuffers().
>
> The use cases I'm trying to support aren't tricksy modifications of
> existing code, they are just entirely new data structures which are
> completely independent of other Postgres objects.
>
>
> > I definitely think it's too late in the cycle to add checkpoint
extensibility
> > now.
> >
> >
> > > > To actually be useful we'd likely need multiple calls to such an
rmgr
> > > > callback, with a parameter where in CheckPointGuts() we are. Right
now the
> > > > sequencing is explicit in CheckPointGuts(), but with the proposed
callback,
> > > > that'd not be the case anymore.
> > >
> > > It is useful without the extra complexity you mention.
> >
> > Shrug. The documentation work definitely is needed. Perhaps we can get
away
> > without multiple callbacks within a checkpoint, I think it'll become
more
> > apparent when writing information about the precise point in time the
> > checkpoint callback is called.
>
> You seem to be thinking in terms of modifying the existing actions in
> CheckpointGuts(). I don't care about that. Anybody that wishes to do
> that can work out the details of their actions.
>
> There is nothing to document, other than "don't do things that won't
> work". How can anyone enumerate all the things that wouldn't work??
>
> There is no list of caveats for any other hook. Why is it needed here?

There are easily reproducible issues where rm_checkpoint() throws an ERROR.
When it occurs at the end-of-recovery checkpoint, the server fails to start
with a message like this:
ERROR:  Test error
FATAL:  checkpoint request failed
HINT:  Consult recent messages in the server log for details.

Even if we remove the broken extension from shared_preload_libraries, we
get the following message in the server log:
FATAL:  resource manager with ID 128 not registered
HINT:  Include the extension module that implements this resource manager
in shared_preload_libraries.

In both cases, with or without the extension in shared_preload_libraries,
the server cannot start.

This seems like a programmer's problem, but what should the user do after
receiving such messages?

Maybe it would be safer to use something like after_checkpoint_hook, which
would be called after the checkpoint is completed?
This is enough for some cases when we only need to save shared memory to
disk.

--
Regards,
Daniil Anisimov
Postgres Professional: http://postgrespro.com

view thread (5+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Comments on Custom RMGRs
  In-Reply-To: <CABm2Ma7v4=W-1vH=kiA4M352F=Nd-Nvd_QBZufJ6EMGsS7Q-_A@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox