Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZFrY-000tAq-9e for pgsql-hackers@arkaria.postgresql.org; Tue, 08 Jul 2025 21:28:28 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uZFrW-00C7c8-Aw for pgsql-hackers@arkaria.postgresql.org; Tue, 08 Jul 2025 21:28:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZFrV-00C7by-Mq for pgsql-hackers@lists.postgresql.org; Tue, 08 Jul 2025 21:28:26 +0000 Received: from mail-pl1-x632.google.com ([2607:f8b0:4864:20::632]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uZFrU-006Wuh-1H for pgsql-hackers@lists.postgresql.org; Tue, 08 Jul 2025 21:28:26 +0000 Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-234b9dfb842so44896535ad.1 for ; Tue, 08 Jul 2025 14:28:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=leadboat.com; s=google; t=1752010102; x=1752614902; darn=lists.postgresql.org; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=oDXf7daPqSsJNQldKVUmBHvHhMjZ4X7Lm4CrjtXsobw=; b=GHxzQZ2JWrSX4LAMrTS8cYX6TJxxvGs3m33rmUEjTwkYs9JQTto+stXYbX5jCDbPDo xyWh96BIWd/o3q9ajQD31Wyj8Cn5aGrUG/1JvT3nmWP6Z50avgkNUZiG7O0wNT7mY4mv zSDpVHr0uD3If34iUydo0jjrK24MxgX8wOIiw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752010102; x=1752614902; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oDXf7daPqSsJNQldKVUmBHvHhMjZ4X7Lm4CrjtXsobw=; b=qwLYbkH393ebqEm1M6QcnVB/6sBWdOFyM5jvwBosxoVgbhlF8FKyRFgsUBmUfadGkS LG2Okwb4JG9TZ/3/+yNq5D7aj6MbZVa+e9pnuGNzBal+VuSvCDIM3BuBoOX6ubPIDpQe jGrCUr+QP5ONvdw3I5d0ictcrpMnJshE2Nvm9RdT+vc9RpDD4Ladd6T+BjJTWkn06hmh KISyFbzS28bQnKdeQIiS24BWPblE/dxHHlKlZ27I5kt+TvvxsUi5LyH9JdC/ZYC8MCsz G0ffrfsgPDKAdYDWgsqMmKs4i7hhCrXGQ1RFv3ffjslbJr7D3Jzi2AIvX7HPOkLx6xyK /+BQ== X-Forwarded-Encrypted: i=1; AJvYcCUEvyCWwliUiZidgUsV5qNJCH2WaT/PwJeDXM/N8zC1E1vrfcf/m2hFd5eRlYCAYl3IvPJQeH5kuBMsRZx0@lists.postgresql.org X-Gm-Message-State: AOJu0YytwpiezMKCZED0p58lwGr1h5C5ZR02CB4hbI0RMnpWYzAK4ucj Aa290MKwYFs3x30vDx0usuJVJn6NasNzAkj6q5ef1kctn9+FY7JJJHBi2dYUtHNnZQ== X-Gm-Gg: ASbGncsoEmNl6s/fSs7Tu26j4sVQg+wnRJlMc+WVddqc332gZfglNHNmLYzF63sybU0 e46WArBfiE2WgcCoU/wuTgsr2ifa2SgujYnmeqlv+hxrCcp3gL4Tfeh27vD7sXy+3BuuWmFatd/ tsBYU4ijOrkdNu+IZV3KGCAHukNbB3kMPLPLUUdNCo9mNsHvqmmPcsn6HSRd0I6/FJdrurvDBee azjPPvgbHI+kxm/q+QlqeOhUW8R6j8NRFHqj63x147MuJNpZVmnusQc/7YjCPtA/NWS2Wqn0a6X /z5fAmZLJ95t77Ck9ZxSGvzFQVfit4ObjssmlN7GBRH1l7qbGiuqCH7l4g0I4Hi4mwJxIeRI/dC 31tHRXrWsHK6w4jg= X-Google-Smtp-Source: AGHT+IHQZNRxnOzytQAxleqVprTOqzVNRcsGwhwW232yh8Oeci5kavb4JXeXWPN6U95e+69lHQ1OQA== X-Received: by 2002:a17:902:ef48:b0:236:94ac:cc11 with SMTP id d9443c01a7336-23ddb1973afmr533185ad.7.1752010101967; Tue, 08 Jul 2025 14:28:21 -0700 (PDT) Received: from google.com (c-76-132-193-99.hsd1.ca.comcast.net. [76.132.193.99]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-23c8431a527sm117716965ad.26.2025.07.08.14.28.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 08 Jul 2025 14:28:21 -0700 (PDT) Date: Tue, 8 Jul 2025 14:28:19 -0700 From: Noah Misch To: Andrew Dunstan Cc: Mahendra Singh Thalor , =?iso-8859-1?Q?=C1lvaro?= Herrera , jian he , Srinath Reddy , pgsql-hackers@lists.postgresql.org Subject: Re: Non-text mode for pg_dumpall Message-ID: <20250708212819.09.nmisch@google.com> References: <202503311812.vxg5b7rzfgf6@alvherre.pgsql> <616efe2c-3986-43cf-b88c-4435849acf9e@dunslane.net> <948154fe-0278-4f4c-8f5a-085e12f03163@dunslane.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <948154fe-0278-4f4c-8f5a-085e12f03163@dunslane.net> User-Agent: Mutt/2.2.12 (2023-09-09) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Apr 04, 2025 at 04:11:05PM -0400, Andrew Dunstan wrote: > Thanks. I have pushed these now with a few further small tweaks. This drops all databases: pg_dumpall --clean -Fd -f /tmp/dump pg_restore -d template1 --globals-only /tmp/dump That didn't match my expectations given this help text: $ pg_restore --help|grep global -g, --globals-only restore only global objects, no databases This happens in dropDBs(). I found that by searching pg_dumpall.c for "OPF", which finds all the content we can write to globals.dat. commit 1495eff wrote: > --- a/src/bin/pg_dump/pg_dumpall.c > +++ b/src/bin/pg_dump/pg_dumpall.c > @@ -1612,9 +1683,27 @@ dumpDatabases(PGconn *conn) > continue; > } > > + /* > + * If this is not a plain format dump, then append dboid and dbname to > + * the map.dat file. > + */ > + if (archDumpFormat != archNull) > + { > + if (archDumpFormat == archCustom) > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid); > + else if (archDumpFormat == archTar) > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid); > + else > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid); Use appendShellString() instead. Plain mode already does that for the "pg_dumpall -f" argument, which is part of db_subdir here. We don't want weird filename characters to work out differently for plain vs. non-plain mode. Also, it's easier to search for appendShellString() than to search for open-coded shell quoting. > @@ -1641,19 +1727,30 @@ dumpDatabases(PGconn *conn) > if (filename) > fclose(OPF); > > - ret = runPgDump(dbname, create_opts); > + ret = runPgDump(dbname, create_opts, dbfilepath, archDumpFormat); > if (ret != 0) > pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); > > if (filename) > { > - OPF = fopen(filename, PG_BINARY_A); > + char global_path[MAXPGPATH]; > + > + if (archDumpFormat != archNull) > + snprintf(global_path, MAXPGPATH, "%s/global.dat", filename); > + else > + snprintf(global_path, MAXPGPATH, "%s", filename); > + > + OPF = fopen(global_path, PG_BINARY_A); > if (!OPF) > pg_fatal("could not re-open the output file \"%s\": %m", > - filename); > + global_path); Minor item: plain mode benefits from reopening, because pg_dump appended to the plain output file. There's no analogous need to reopen global.dat, since just this one process writes to global.dat. > @@ -1672,17 +1770,36 @@ runPgDump(const char *dbname, const char *create_opts) > initPQExpBuffer(&connstrbuf); > initPQExpBuffer(&cmd); > > - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > - pgdumpopts->data, create_opts); > - > /* > - * If we have a filename, use the undocumented plain-append pg_dump > - * format. > + * If this is not a plain format dump, then append file name and dump > + * format to the pg_dump command to get archive dump. > */ > - if (filename) > - appendPQExpBufferStr(&cmd, " -Fa "); > + if (archDumpFormat != archNull) > + { > + printfPQExpBuffer(&cmd, "\"%s\" -f %s %s", pg_dump_bin, > + dbfile, create_opts); > + > + if (archDumpFormat == archDirectory) > + appendPQExpBufferStr(&cmd, " --format=directory "); > + else if (archDumpFormat == archCustom) > + appendPQExpBufferStr(&cmd, " --format=custom "); > + else if (archDumpFormat == archTar) > + appendPQExpBufferStr(&cmd, " --format=tar "); > + } > else > - appendPQExpBufferStr(&cmd, " -Fp "); > + { > + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > + pgdumpopts->data, create_opts); This uses pgdumpopts for plain mode only, so many pg_dumpall options silently have no effect in non-plain mode. Example: strace -f pg_dumpall --lock-wait-timeout=10 2>&1 >/dev/null | grep exec strace -f pg_dumpall --lock-wait-timeout=10 -Fd -f /tmp/dump3 2>&1 >/dev/null | grep exec > --- a/src/bin/pg_dump/pg_restore.c > +++ b/src/bin/pg_dump/pg_restore.c > +/* > + * read_one_statement > + * > + * This will start reading from passed file pointer using fgetc and read till > + * semicolon(sql statement terminator for global.dat file) > + * > + * EOF is returned if end-of-file input is seen; time to shut down. What makes it okay to use this particular subset of SQL lexing? > +/* > + * get_dbnames_list_to_restore > + * > + * This will mark for skipping any entries from dbname_oid_list that pattern match an > + * entry in the db_exclude_patterns list. > + * > + * Returns the number of database to be restored. > + * > + */ > +static int > +get_dbnames_list_to_restore(PGconn *conn, > + SimpleOidStringList *dbname_oid_list, > + SimpleStringList db_exclude_patterns) > +{ > + int count_db = 0; > + PQExpBuffer query; > + PGresult *res; > + > + query = createPQExpBuffer(); > + > + if (!conn) > + pg_log_info("considering PATTERN as NAME for --exclude-database option as no db connection while doing pg_restore."); When do we not have a connection here? We'd need to document this behavior variation if it stays, but I'd prefer if we can just rely on having a connection. > + /* If database is already created, then don't set createDB flag. */ > + if (opts->cparams.dbname) > + { > + PGconn *test_conn; > + > + test_conn = ConnectDatabase(db_cell->str, NULL, opts->cparams.pghost, > + opts->cparams.pgport, opts->cparams.username, TRI_DEFAULT, > + false, progname, NULL, NULL, NULL, NULL); > + if (test_conn) > + { > + PQfinish(test_conn); > + > + /* Use already created database for connection. */ > + opts->createDB = 0; > + opts->cparams.dbname = db_cell->str; > + } > + else > + { > + /* we'll have to create it */ > + opts->createDB = 1; > + opts->cparams.dbname = connected_db; > + } In released versions, "pg_restore --create" fails if the database exists, and pg_restore w/o --create fails unless the database exists. I think we should continue that pattern in this new feature. If not, pg_restore should document how it treats pg_dumpall-sourced dumps with the "create if not exists" semantics appearing here.