public inbox for [email protected]
help / color / mirror / Atom feedFrom: Noah Misch <[email protected]>
To: Andrew Dunstan <[email protected]>
Cc: Mahendra Singh Thalor <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: jian he <[email protected]>
Cc: Srinath Reddy <[email protected]>
Cc: [email protected]
Subject: Re: Non-text mode for pg_dumpall
Date: Mon, 21 Jul 2025 17:53:39 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<CAKYtNAoE1xUcMmBq-1qCaZGUDQLfMSVa6hzBVA-PqbmSTzPf5A@mail.gmail.com>
<CAKYtNAq530UTAtnVs5xfONQ0j6vS-Sys50p5+SNfC7G7_ghCVQ@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAKYtNAr-6_CCz+tnJ0T2597Ar4iAZXkE1qPimsO9EY-Kn7LvzQ@mail.gmail.com>
<[email protected]>
<CAKYtNAqXTvfAw-y4FHvzprg72cFrC63cge9xMVDO_R4=NHc5rA@mail.gmail.com>
<CAKYtNAppY=vVtTMdkxRPV48cHKyzLeYF6mrPk15Wczme7e9=ww@mail.gmail.com>
<[email protected]>
On Mon, Jul 21, 2025 at 04:41:03PM -0400, Andrew Dunstan wrote:
> On 2025-07-17 Th 6:18 AM, Mahendra Singh Thalor wrote
> > > > > > > --- a/src/bin/pg_dump/pg_restore.c
> > > > > > > +++ b/src/bin/pg_dump/pg_restore.c
> > > > > > > +/*
> > > > > > > + * read_one_statement
> > > > > > > + *
> > > > > > > + * This will start reading from passed file pointer using fgetc and read till
> > > > > > > + * semicolon(sql statement terminator for global.dat file)
> > > > > > > + *
> > > > > > > + * EOF is returned if end-of-file input is seen; time to shut down.
> > > > > > What makes it okay to use this particular subset of SQL lexing?
> > > > > To support complex syntax, we used this code from another file.
> > > > I'm hearing that you copied this code from somewhere. Running
> > > > "git grep 'time to shut down'" suggests you copied it from
> > > > InteractiveBackend(). Is that right? I do see other similarities between
> > > > read_one_statement() and InteractiveBackend().
> > > >
> > > > Copying InteractiveBackend() provides negligible assurance that this is the
> > > > right subset of SQL lexing. Only single-user mode uses InteractiveBackend().
> > > > Single-user mode survives mostly as a last resort for recovering from having
> > > > reached xidStopLimit, is rarely used, and only superusers write queries to it.
> > > Yes, we copied this from InteractiveBackend to read statements from
> > > global.dat file.
>
> Maybe we should ensure that identifiers with CR or LF are turned into
> Unicode quoted identifiers, so each SQL statement would always only occupy
> one line.
Interesting. That might work.
> Or just reject role and tablespace names with CR or LF altogether,
> just as we do for database names.
There are other ways to get multi-line statements. Non-exhaustive list:
- pg_db_role_setting.setconfig
- pg_shdescription.description
- pg_shseclabel.label
- pg_tablespace.spcoptions (if we add a text option in the future)
I think this decision about lexing also ties to other unfinished open item
work of aligning "pg_dumpall -Fd;pg_restore [options]" behavior with "pg_dump
-Fd;pg_restore [options]". "pg_restore --no-privileges" should not restore
pg_tablespace.spcacl, and "pg_restore --no-comments" should not emit COMMENT
statements.
I suspect this is going to end with a structured dump like we use on the
pg_dump (per-database) side. It's not an accident that v17 pg_restore doesn't
lex text files to do its job. pg_dumpall deals with a more-limited set of
statements than pg_dump deals with, but they're not _that much_ more limited.
I won't veto a lexing-based approach if it gets the behaviors right, but I
don't have high hopes for it getting the behaviors right and staying that way.
(I almost said "pg_restore --no-owner" should not restore
pg_tablespace.spcowner, but v17 "pg_dumpall --no-owner" does restore it. One
could argue for or against aligning $SUBJECT behavior w/ v17's mistake there.)
view thread (100+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Non-text mode for pg_dumpall
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox