public inbox for [email protected]  
help / color / mirror / Atom feed
From: Noah Misch <[email protected]>
To: Andrew Dunstan <[email protected]>
Cc: jian he <[email protected]>
Cc: Mahendra Singh Thalor <[email protected]>
Cc: tushar <[email protected]>
Cc: Vaibhav Dalvi <[email protected]>
Cc: [email protected]
Subject: Re: Non-text mode for pg_dumpall
Date: Sat, 6 Jun 2026 17:02:18 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<CAKYtNAq73_UeL4-0wgu9rawctHdS2zhpzWtqVuOChCS3M+2Ymg@mail.gmail.com>
	<CACJufxFs9NXXTeb78i2MD+pnZootayAtiGcNHNeU35SmQgMjbA@mail.gmail.com>
	<[email protected]>
	<CACJufxEnmJ8otfmN7Kp110Wqi=M4YE0-zn-W6SK+eTpWuZw_fg@mail.gmail.com>
	<CAKYtNAqxjQMLVPfz6XESqszhxZKgigi7UEBzBhz_Wj_Bnasdag@mail.gmail.com>
	<[email protected]>
	<CACJufxE-zJiBAaZABtJBJr8FZdj=xP5adf_excmu9S6Op=KpLA@mail.gmail.com>
	<[email protected]>
	<[email protected]>

On Thu, Feb 26, 2026 at 09:02:48AM -0500, Andrew Dunstan wrote:
> pushed with a slight tweak.

Having now reviewed commit 763aaa0, I don't think it's ready to remain part of
v19.  While some points from my v18 review are now resolved, other points
still seem unresolved.  I didn't find discussion of the unresolved points.  I
also see new issues.


Wider issues:

> @@ -796,24 +964,45 @@ dropRoles(PGconn *conn)

> +		if (archDumpFormat == archNull)
> +		{
> +			appendPQExpBuffer(delQry, "DROP ROLE %s%s;\n",
> +							  if_exists ? "IF EXISTS " : "",
> +							  fmtId(rolename));
> +			fprintf(OPF, "%s", delQry->data);
> +		}
> +		else
> +		{
> +			appendPQExpBuffer(delQry, "DROP ROLE IF EXISTS %s;\n",
> +							  fmtId(rolename));
> +
> +			ArchiveEntry(fout,
> +						 nilCatalogId,	/* catalog ID */
> +						 createDumpId(),	/* dump ID */
> +						 ARCHIVE_OPTS(.tag = psprintf("ROLE %s", fmtId(rolename)),
> +									  .description = "DROP_GLOBAL",
> +									  .section = SECTION_PRE_DATA,
> +									  .createStmt = delQry->data));
> +		}

pg_dumpall.c should call ArchiveEntry() in the same way pg_dump.c does.  In
pg_dump.c, per-object-class support code calls ArchiveEntry() unconditionally,
and the object-independent infra of pg_backup_archiver.c deals with the
difference between plain and non-plain formats.

There should be one appendPQExpBuffer(delQry, "DROP ROLE ..."), not one for
plain format and another for non-plain formats.  Having two creates excess
risk of format-specific bugs, something pg_dump.c has long avoided well.  (I'm
echoing my postgr.es/m/[email protected] review.  I wrote,
"The strength of the archiver architecture shows in how rarely new features
need format-specific logic and how rarely format-specific bugs get reported."
That holds for the way pg_dump.c uses the archiver, but it doesn't hold for
the way pg_dumpall.c now uses the archiver.)

Separately, DROP should be in dropStmt, not in createStmt.  This likely
entails refactoring to merge dumpRoles() and dropRoles() into one function,
per the style of pg_dump.c.

> +		/*
> +		 * For pg_dumpall archives, --clean implies --if-exists since global
> +		 * objects may not exist in the target cluster.
> +		 */
> +		if (opts->dropSchema && !opts->if_exists)
> +		{
> +			opts->if_exists = 1;
> +			pg_log_info("--if-exists is implied by --clean for pg_dumpall archives");
> +		}

The last comment of postgr.es/m/[email protected] disagreed
with this decision, so that remains unresolved.

> +		/* If database is already created, then don't set createDB flag. */
> +		if (tmpopts->cparams.dbname)
> +		{
> +			PGconn	   *test_conn;
> +
> +			test_conn = ConnectDatabase(dbidname->str, NULL, tmpopts->cparams.pghost,
> +										tmpopts->cparams.pgport, tmpopts->cparams.username, TRI_DEFAULT,
> +										false, progname, NULL, NULL, NULL, NULL);
> +			if (test_conn)
> +			{
> +				PQfinish(test_conn);
> +
> +				/* Use already created database for connection. */
> +				tmpopts->createDB = 0;
> +				tmpopts->cparams.dbname = dbidname->str;
> +			}
> +			else
> +			{
> +				/* We'll have to create it */
> +				tmpopts->createDB = 1;
> +				tmpopts->cparams.dbname = connected_db;
> +			}

postgr.es/m/[email protected] called for this to change.

None of this is meant to say the feature is impossible. But I don’t think this
commit is at the point where post-commit fixups are the right workflow.  I
recommend reverting, posting a new version, and letting commitfest review
finish.


Localized issues:

There's new code to write map and globals files, but I don't see corresponding
code to fsync those files, like we fsync a plain-format dump.

> @@ -3027,6 +3049,16 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
>  			return 0;
>  	}
>  
> +	/*
> +	 * Global object TOC entries (e.g., ROLEs or TABLESPACEs) must not be
> +	 * ignored.
> +	 */
> +	if (strcmp(te->desc, "ROLE") == 0 ||
> +		strcmp(te->desc, "ROLE PROPERTIES") == 0 ||
> +		strcmp(te->desc, "TABLESPACE") == 0 ||
> +		strcmp(te->desc, "DROP_GLOBAL") == 0)
> +		return REQ_SCHEMA;
> +

If I delete this addition, no src/bin/pg_dump test fails.  Would you explain
the rationale for this addition?

> +			if (comment_buf->data[0] != '\0')
> +				ArchiveEntry(fout,
> +							 nilCatalogId,	/* catalog ID */
> +							 createDumpId(),	/* dump ID */
> +							 ARCHIVE_OPTS(.tag = tag,
> +										  .description = "COMMENT",
> +										  .section = SECTION_PRE_DATA,
> +										  .createStmt = comment_buf->data));
> +
> +			if (seclabel_buf->data[0] != '\0')
> +				ArchiveEntry(fout,
> +							 nilCatalogId,	/* catalog ID */
> +							 createDumpId(),	/* dump ID */
> +							 ARCHIVE_OPTS(.tag = tag,
> +										  .description = "SECURITY LABEL",
> +										  .section = SECTION_PRE_DATA,
> +										  .createStmt = seclabel_buf->data));

COMMENT and SECURITY LABEL should use .deps and .nDeps to record a dependency
on the main object.  Representative example from pg_dump.c:

		ArchiveEntry(fout, nilCatalogId, createDumpId(),
					 ARCHIVE_OPTS(.tag = target->data,
								  .namespace = tbinfo->dobj.namespace->dobj.name,
								  .owner = tbinfo->rolname,
								  .description = "SECURITY LABEL",
								  .section = SECTION_NONE,
								  .createStmt = query->data,
								  .deps = &(tbinfo->dobj.dumpId),
								  .nDeps = 1));

This probably has no user-visible consequences, since "/* Parallel execution
is not supported for global object restoration. */".  Still, best to follow
the standard.

> +		/*
> +		 * If this is not a plain format dump, then append dboid and dbname to
> +		 * the map.dat file.
> +		 */

The first six lines inside the block aren't for the purpose described here.
The one line that is for this purpose has its own comment.  Please delete this
comment.

> +		if (archDumpFormat != archNull)
> +		{
> +			if (archDumpFormat == archCustom)
> +				snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid);
> +			else if (archDumpFormat == archTar)
> +				snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid);
> +			else
> +				snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid);

My review in postgr.es/m/[email protected] called for this
to change.  I'm finding no discussion of the rationale for not changing it.

> +
> +			/* Put one line entry for dboid and dbname in map file. */
> +			fprintf(map_file, "%s %s\n", oid, dbname);
> +		}

> +runPgDump(const char *dbname, const char *create_opts, char *dbfile)

> +	if (archDumpFormat != archNull)
> +	{
> +		printfPQExpBuffer(&cmd, "\"%s\" %s -f %s %s", pg_dump_bin,
> +						  pgdumpopts->data, dbfile, create_opts);

A different "-f" argument is already in pgdumpopts.  That works, but it's
untidy.

> +			pg_fatal("option %s cannot exclude %s when restoring a pg_dumpall archive",
> +					 "--section", "--pre-data");

s/--pre-data/pre-data/

> +		/*
> +		 * Always restore global objects, even if --exclude-database results
> +		 * in zero databases to process. If 'globals-only' is set, exit
> +		 * immediately.

I think this comment is out of date, because the code doesn't have an exit:

> +		 */
> +		snprintf(global_path, MAXPGPATH, "%s/toc.glo", inputFileSpec);
> +
> +		n_errors = restore_global_objects(global_path, tmpopts);
> +
> +		if (globals_only)
> +			pg_log_info("database restoring skipped because option %s was specified",
> +						"-g/--globals-only");
> +		else
> +		{
> +			/* Now restore all the databases from map.dat */
> +			n_errors = n_errors + restore_all_databases(inputFileSpec, db_exclude_patterns,
> +														opts, numWorkers);
> +		}
> +
> +		/* Free db pattern list. */
> +		simple_string_list_destroy(&db_exclude_patterns);
> +	}

> +	/* Don't output TOC entry comments when restoring globals */
> +	((ArchiveHandle *) AH)->noTocComments = 1;

Why not?

> +static int
> +get_dbnames_list_to_restore(PGconn *conn,
> +							SimplePtrList *dbname_oid_list,
> +							SimpleStringList db_exclude_patterns)
> +{
...
> +			if (pg_strcasecmp(dbidname->str, pat_cell->val) == 0)
> +				skip_db_restore = true;
> +			/* Otherwise, try a pattern match if there is a connection */

The code assumes a connection unconditionally (which seems right to me):

> +			else
> +			{
> +				int			dotcnt;
> +
> +				appendPQExpBufferStr(query, "SELECT 1 ");
> +				processSQLNamePattern(conn, query, pat_cell->val, false,
> +									  false, NULL, db_lit->data,
> +									  NULL, NULL, NULL, &dotcnt);
> +
> +				if (dotcnt > 0)
> +				{
> +					pg_log_error("improper qualified name (too many dotted names): %s",
> +								 dbidname->str);
> +					PQfinish(conn);
> +					exit_nicely(1);
> +				}
> +
> +				res = executeQuery(conn, query->data);






view thread (111+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Non-text mode for pg_dumpall
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox