Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZZt2-005PiE-LB for pgsql-hackers@arkaria.postgresql.org; Wed, 09 Jul 2025 18:51:20 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uZZt0-0030nE-G7 for pgsql-hackers@arkaria.postgresql.org; Wed, 09 Jul 2025 18:51:19 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZZt0-0030n5-0i for pgsql-hackers@lists.postgresql.org; Wed, 09 Jul 2025 18:51:18 +0000 Received: from mail-qt1-x82b.google.com ([2607:f8b0:4864:20::82b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uZZsy-006Rmj-1H for pgsql-hackers@lists.postgresql.org; Wed, 09 Jul 2025 18:51:17 +0000 Received: by mail-qt1-x82b.google.com with SMTP id d75a77b69052e-4a77ffcb795so2312751cf.0 for ; Wed, 09 Jul 2025 11:51:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1752087075; x=1752691875; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=OYWDjo6xhHdXYjvxxpR3Lz7KWEQ4leKzKfCnieKNSVc=; b=lX/NGZQsZJua1tkeX3xku14pj2xAgy7tCyRDiSMwdnNxgTToijGtIA/CsHtx45HDOl kBV3VzA19DVZqcIGET1sZ4rYtzQ/WJtjhw2N6BzSwrPq8RNHfymWRI5p7sg9IrnZxdNJ FMacH7xAao/KbU6vfMkpY7j8SQdHulYNBZPMFqX66Y1e0R1G154IqVyqNrjC6kqpUl2P aux7ZAqm6hXmgX2y8//AuDMUQWI2AnWYgaBna8PpQlrUseNXTApPNQFRMf8cW/agQsEQ B/K3qhm2UalC47Nztu/gmX0CRsqreWctTekFRoDwjoau8gsCyOMGOYzcW2h8TnmE2COV U5og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752087075; x=1752691875; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=OYWDjo6xhHdXYjvxxpR3Lz7KWEQ4leKzKfCnieKNSVc=; b=ei8bSZ7LTVrEbGZiY8w0VrutJNQoMtDvj/yIG9LZlCHRu46ibsIylWvB2tPmOv9m3c Q0v7ZmNSS5T3nnJBNOax4Edz0yfuQ2sR981PlAARDNwvTDEWJd25zil1/ShTOLXJ15pA YgAXZIOsn6k7U2Tym1mu1usT9a6lF5xq+xb+GMUDP95hnPQ9AZDyKGUrSksUF3aoZgK5 EsZB7cjYpfJIiYFWFxPMP9lELlCar1+rbj7A8CNHVGCkQdTzEJct7k8keyJ8yHJbqUmC JiuADEIbdXUnLxeGXVhgokWLExe0U2p9Db6xylq/GPGnM45UGddBZTlXB7IkqRKm55Zc aGJA== X-Forwarded-Encrypted: i=1; AJvYcCX6CnKDU2630NBmKgCE0Dbw7x39RcnUeuzoXIOUyd1lvUbb8ZMlo0RvVtGP3aVYsyZHziqGkSgalbSVMaTQ@lists.postgresql.org X-Gm-Message-State: AOJu0YwIAFrMrSRbDFuP0ZneZJOXzMBmsLcf2TO7vk4fAGWiNfV8J95E UEe1X6MnSJTDloNbqV6qgEkRmfDYi7Qp1b9CAId5vfwL7E0esTfognFGTTdlXT+LiNAUGETiowy SbZquqclnwEfWUGqlgz0ZEXhNrybTLpA= X-Gm-Gg: ASbGncv0zw/INccOp1SS/nu3ICkROb4fqniIpZF0d1bTyW2Uex12jHrn30manCAU1Bd 4R7meLDslgc/f3/lUVggbz/lSmrUSWzkML1n2VRltz9OIgpcXgegKIb36gEsEz4FxWXk9oCBGYv 70cegIusTcMTrAqYtGMgT8KGrDItm7O74D3fUE0PPjLeqnFQb4DkVqWNksa+vRPHVC8UNwhWxp6 jIfUg== X-Google-Smtp-Source: AGHT+IEGBEGWeAzDxchI9n0VulHho4lSxLN6NQjTPHN5r2PhlcIEzPW7VBZDL02PcDSseGAwV/6Dd4TBawSBpUdTRfc= X-Received: by 2002:a05:622a:142:b0:4a9:96f9:262e with SMTP id d75a77b69052e-4a9dec8bad5mr56001481cf.20.1752087075409; Wed, 09 Jul 2025 11:51:15 -0700 (PDT) MIME-Version: 1.0 References: <202503311812.vxg5b7rzfgf6@alvherre.pgsql> <616efe2c-3986-43cf-b88c-4435849acf9e@dunslane.net> <948154fe-0278-4f4c-8f5a-085e12f03163@dunslane.net> <20250708212819.09.nmisch@google.com> In-Reply-To: <20250708212819.09.nmisch@google.com> From: Mahendra Singh Thalor Date: Thu, 10 Jul 2025 00:21:03 +0530 X-Gm-Features: Ac12FXwmo6eJTYERYWMOURrllABgK2ejajAciyXLOzFNQcsjB3G3cEtmIRGo6UA Message-ID: Subject: Re: Non-text mode for pg_dumpall To: Noah Misch Cc: Andrew Dunstan , =?UTF-8?Q?=C3=81lvaro_Herrera?= , jian he , Srinath Reddy , pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Thanks Noah for the comments. On Wed, 9 Jul 2025 at 02:58, Noah Misch wrote: > > On Fri, Apr 04, 2025 at 04:11:05PM -0400, Andrew Dunstan wrote: > > Thanks. I have pushed these now with a few further small tweaks. > > This drops all databases: > > pg_dumpall --clean -Fd -f /tmp/dump > pg_restore -d template1 --globals-only /tmp/dump > > That didn't match my expectations given this help text: > > $ pg_restore --help|grep global > -g, --globals-only restore only global objects, no databases Databases are global objects so due to --clean command, we are putting drop commands in global.dat for all the databases. While restoring, we used the "--globals-only" option so we are dropping all these databases by global.dat file. Please let us know your expectations for this specific case. > > This happens in dropDBs(). I found that by searching pg_dumpall.c for "OPF", > which finds all the content we can write to globals.dat. > > commit 1495eff wrote: > > --- a/src/bin/pg_dump/pg_dumpall.c > > +++ b/src/bin/pg_dump/pg_dumpall.c > > > @@ -1612,9 +1683,27 @@ dumpDatabases(PGconn *conn) > > continue; > > } > > > > + /* > > + * If this is not a plain format dump, then append dboid and dbname to > > + * the map.dat file. > > + */ > > + if (archDumpFormat != archNull) > > + { > > + if (archDumpFormat == archCustom) > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid); > > + else if (archDumpFormat == archTar) > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid); > > + else > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid); > > Use appendShellString() instead. Plain mode already does that for the > "pg_dumpall -f" argument, which is part of db_subdir here. We don't want > weird filename characters to work out differently for plain vs. non-plain > mode. Also, it's easier to search for appendShellString() than to search for > open-coded shell quoting. Yes, we can use appendShellString also. We are using snprintf in the pg_dump.c file also. Ex: snprintf(tagbuf, sizeof(tagbuf), "LARGE OBJECTS %u..%u", loinfo->looids[0], loinfo->looids[loinfo->numlos - 1]); If we want to use appendShellString, I can write a patch for these. Please let me know your opinion. > > > @@ -1641,19 +1727,30 @@ dumpDatabases(PGconn *conn) > > if (filename) > > fclose(OPF); > > > > - ret = runPgDump(dbname, create_opts); > > + ret = runPgDump(dbname, create_opts, dbfilepath, archDumpFormat); > > if (ret != 0) > > pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); > > > > if (filename) > > { > > - OPF = fopen(filename, PG_BINARY_A); > > + char global_path[MAXPGPATH]; > > + > > + if (archDumpFormat != archNull) > > + snprintf(global_path, MAXPGPATH, "%s/global.dat", filename); > > + else > > + snprintf(global_path, MAXPGPATH, "%s", filename); > > + > > + OPF = fopen(global_path, PG_BINARY_A); > > if (!OPF) > > pg_fatal("could not re-open the output file \"%s\": %m", > > - filename); > > + global_path); > > Minor item: plain mode benefits from reopening, because pg_dump appended to > the plain output file. There's no analogous need to reopen global.dat, since > just this one process writes to global.dat. yes, only once we need to open global.dat file but to keep simple code, we kept old code. > > > @@ -1672,17 +1770,36 @@ runPgDump(const char *dbname, const char *create_opts) > > initPQExpBuffer(&connstrbuf); > > initPQExpBuffer(&cmd); > > > > - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > > - pgdumpopts->data, create_opts); > > - > > /* > > - * If we have a filename, use the undocumented plain-append pg_dump > > - * format. > > + * If this is not a plain format dump, then append file name and dump > > + * format to the pg_dump command to get archive dump. > > */ > > - if (filename) > > - appendPQExpBufferStr(&cmd, " -Fa "); > > + if (archDumpFormat != archNull) > > + { > > + printfPQExpBuffer(&cmd, "\"%s\" -f %s %s", pg_dump_bin, > > + dbfile, create_opts); > > + > > + if (archDumpFormat == archDirectory) > > + appendPQExpBufferStr(&cmd, " --format=directory "); > > + else if (archDumpFormat == archCustom) > > + appendPQExpBufferStr(&cmd, " --format=custom "); > > + else if (archDumpFormat == archTar) > > + appendPQExpBufferStr(&cmd, " --format=tar "); > > + } > > else > > - appendPQExpBufferStr(&cmd, " -Fp "); > > + { > > + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > > + pgdumpopts->data, create_opts); > > This uses pgdumpopts for plain mode only, so many pg_dumpall options silently > have no effect in non-plain mode. Example: > > strace -f pg_dumpall --lock-wait-timeout=10 2>&1 >/dev/null | grep exec > strace -f pg_dumpall --lock-wait-timeout=10 -Fd -f /tmp/dump3 2>&1 >/dev/null | grep exec Agreed. We can add pgdumpopts->data to all the dump formats. > > > --- a/src/bin/pg_dump/pg_restore.c > > +++ b/src/bin/pg_dump/pg_restore.c > > > +/* > > + * read_one_statement > > + * > > + * This will start reading from passed file pointer using fgetc and read till > > + * semicolon(sql statement terminator for global.dat file) > > + * > > + * EOF is returned if end-of-file input is seen; time to shut down. > > What makes it okay to use this particular subset of SQL lexing? To support complex syntax, we used this code from another file. > > > +/* > > + * get_dbnames_list_to_restore > > + * > > + * This will mark for skipping any entries from dbname_oid_list that pattern match an > > + * entry in the db_exclude_patterns list. > > + * > > + * Returns the number of database to be restored. > > + * > > + */ > > +static int > > +get_dbnames_list_to_restore(PGconn *conn, > > + SimpleOidStringList *dbname_oid_list, > > + SimpleStringList db_exclude_patterns) > > +{ > > + int count_db = 0; > > + PQExpBuffer query; > > + PGresult *res; > > + > > + query = createPQExpBuffer(); > > + > > + if (!conn) > > + pg_log_info("considering PATTERN as NAME for --exclude-database option as no db connection while doing pg_restore."); > > When do we not have a connection here? We'd need to document this behavior > variation if it stays, but I'd prefer if we can just rely on having a > connection. Yes, we can document this behavior. > > > + /* If database is already created, then don't set createDB flag. */ > > + if (opts->cparams.dbname) > > + { > > + PGconn *test_conn; > > + > > + test_conn = ConnectDatabase(db_cell->str, NULL, opts->cparams.pghost, > > + opts->cparams.pgport, opts->cparams.username, TRI_DEFAULT, > > + false, progname, NULL, NULL, NULL, NULL); > > + if (test_conn) > > + { > > + PQfinish(test_conn); > > + > > + /* Use already created database for connection. */ > > + opts->createDB = 0; > > + opts->cparams.dbname = db_cell->str; > > + } > > + else > > + { > > + /* we'll have to create it */ > > + opts->createDB = 1; > > + opts->cparams.dbname = connected_db; > > + } > > In released versions, "pg_restore --create" fails if the database exists, and > pg_restore w/o --create fails unless the database exists. I think we should > continue that pattern in this new feature. If not, pg_restore should document > how it treats pg_dumpall-sourced dumps with the "create if not exists" > semantics appearing here. Yes, we can document this behavior also. I am working on all these review comments and I will post a patch in the coming days. Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com