Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ubpsW-005t3w-3i for pgsql-hackers@arkaria.postgresql.org; Wed, 16 Jul 2025 00:20:08 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ubpsU-006ktO-7G for pgsql-hackers@arkaria.postgresql.org; Wed, 16 Jul 2025 00:20:06 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ubpsT-006kt6-J7 for pgsql-hackers@lists.postgresql.org; Wed, 16 Jul 2025 00:20:06 +0000 Received: from mail-pl1-x62f.google.com ([2607:f8b0:4864:20::62f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ubpsR-007y4B-1q for pgsql-hackers@lists.postgresql.org; Wed, 16 Jul 2025 00:20:05 +0000 Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-234fcadde3eso74993415ad.0 for ; Tue, 15 Jul 2025 17:20:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=leadboat.com; s=google; t=1752625201; x=1753230001; darn=lists.postgresql.org; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=rbTm6I7p/ZIJuTk/h/9jKjIamGDOtbdmbtnDhjp0Bhk=; b=J4k932CsqrAuCb/q/ZHq6JO5RaMBkfWtKYFZ7nYAI0V41MgVX/oMCp/tcUC11cF8yo GNzyo9239jmwG2unILtS6eqrj195R9RlQsKfCX+8tQdSmli2EKJgsD14DESURlB3l2JJ +WsqPC1rjIdQdTqGGvZYDMIH2IjAa4Kn44TOU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752625201; x=1753230001; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rbTm6I7p/ZIJuTk/h/9jKjIamGDOtbdmbtnDhjp0Bhk=; b=KlO2JQ/nvbLik1bnt561uQ4mVDckOvnM+96O3CLbXZCoo0Qu4oW0dxULNkCRh5OGE4 Clclg9ASk7abVDAowoRf20rJR11reoCFN9hJZpczSeQvG5Pz+svqV8O5rWuUFdDB1tYs 64UMuclucm/RHPxugkt2S4xy39p7YhG+Z6Szm4IPeC2M6A9JOpgMQRVDNKxAqZfBMpkQ yEroZjuRzZKXicOn3TLWq7Cff8T4deb6sVnC12fT6PP4aELWNcXwRFLwfFQTVtGBjjsu zwR2OsYBS52UyjWuNVO9IN8W8Rs2cZz02tf37ah1JzU/4hWWn/IO3yRVULTNF1ju/Qnj H0XA== X-Forwarded-Encrypted: i=1; AJvYcCUKlch4CcZ9DF9CHedehMa5QBSxgsZPvHvNHF3YtdZrP8eGvQnjhQLVMr/On96EP4ZAHQxCeEtdpPORXjoj@lists.postgresql.org X-Gm-Message-State: AOJu0YzVJNRqotrT5P2Vr7vzSz0vSLCjmzFjKx/7z3sXRgZvhJgrINxv Y7W+TBljI6AXHuX9YXDhSaQKO4J10sLdJ6TGLu2kOlWIxHD6ZTMyXSzfA4BG2aQHAg== X-Gm-Gg: ASbGncvtNHMypWa1OEsE2oOpRpu9VMtDKDraHY/qnrOHLE2EIDrRGpOuxIuyIe+dNlH XrW4THT2alBhHbhPubw5csp3JEbY0x2AeEFTrWfbZDuNPEeUx+ma0sELTBP1kZ+NbKdblqO5paT DkMTJZuSYbC6D7F+pnhRdzGT8vean6BIf2zvLcJAi2DzjMwNUZgmZT+OGov679cIgbZMT17qQYE 6mLsI4VlA01nP2jfxzdIfuwQg66HJC8p7l755BjLcQHw4E7QvC1ETSM/X3WWge0+s88qtmpNZbH x/5EymC0Q9ouP9LlNCUwEd9o/cMTgLxtolmsTCW03v32I3eA2l6vvC/hGo/VOmKntT57kazKqW2 sTL9uNWUiw+VoQ4XzsKSl2LKs2vsegg86n4T29l3lb0EBd8nVMP+jUTlz4lpn7g== X-Google-Smtp-Source: AGHT+IFj6/YM5f/U/XsUYB/sI3LYQKDy+9rWZ5aNgoURoy6GePf92A/3X1B8Yd6Se0E1MgX/A4XiGw== X-Received: by 2002:a17:902:cf05:b0:224:23be:c569 with SMTP id d9443c01a7336-23e256d77b5mr10103895ad.22.1752625201010; Tue, 15 Jul 2025 17:20:01 -0700 (PDT) Received: from google.com (c-73-15-160-255.hsd1.ca.comcast.net. [73.15.160.255]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-23de4323e36sm114579595ad.114.2025.07.15.17.19.59 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 15 Jul 2025 17:20:00 -0700 (PDT) Date: Tue, 15 Jul 2025 17:19:57 -0700 From: Noah Misch To: Mahendra Singh Thalor Cc: Andrew Dunstan , =?iso-8859-1?Q?=C1lvaro?= Herrera , jian he , Srinath Reddy , pgsql-hackers@lists.postgresql.org Subject: Re: Non-text mode for pg_dumpall Message-ID: <20250716001957.c6.nmisch@google.com> References: <202503311812.vxg5b7rzfgf6@alvherre.pgsql> <616efe2c-3986-43cf-b88c-4435849acf9e@dunslane.net> <948154fe-0278-4f4c-8f5a-085e12f03163@dunslane.net> <20250708212819.09.nmisch@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/2.2.12 (2023-09-09) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Jul 10, 2025 at 12:21:03AM +0530, Mahendra Singh Thalor wrote: > On Wed, 9 Jul 2025 at 02:58, Noah Misch wrote: > > On Fri, Apr 04, 2025 at 04:11:05PM -0400, Andrew Dunstan wrote: > > > Thanks. I have pushed these now with a few further small tweaks. > > > > This drops all databases: > > > > pg_dumpall --clean -Fd -f /tmp/dump > > pg_restore -d template1 --globals-only /tmp/dump > > > > That didn't match my expectations given this help text: > > > > $ pg_restore --help|grep global > > -g, --globals-only restore only global objects, no databases > > Databases are global objects so due to --clean command, we are putting > drop commands in global.dat for all the databases. While restoring, we > used the "--globals-only" option so we are dropping all these > databases by global.dat file. > > Please let us know your expectations for this specific case. Be consistent with "pg_dump". A quick check suggests "pg_dump --clean" affects plain format only. For non-plain formats, only the pg_restore argument governs the final commands: $ rm -r /tmp/dump; pg_dump --clean -Fd -f /tmp/dump && pg_restore -f- /tmp/dump | grep DROP $ rm -r /tmp/dump; pg_dump --clean -Fd -f /tmp/dump && pg_restore --clean -f- /tmp/dump | grep DROP DROP TABLE public.example; That said, you should audit code referencing the --clean flag and see if there's more to it than that quick test suggests. Note that aligning with pg_dump will require changes for object types beyond databases. "pg_restore --clean" of a global dump should emit DROP TABLESPACE and DROP ROLE as appropriate, regardless of whether the original pg_dumpall had --clean. For my earlier example (pg_dumpall --clean; pg_restore --globals-only) I expect the same outcome as plain-format "pg_dumpall --globals-only", which is no databases dropped or created. The help line says "no databases". Plain "pg_dumpall --globals-only" and even "pg_dumpall --globals-only --clean" do not drop or create databases. > > commit 1495eff wrote: > > > --- a/src/bin/pg_dump/pg_dumpall.c > > > +++ b/src/bin/pg_dump/pg_dumpall.c > > > > > @@ -1612,9 +1683,27 @@ dumpDatabases(PGconn *conn) > > > continue; > > > } > > > > > > + /* > > > + * If this is not a plain format dump, then append dboid and dbname to > > > + * the map.dat file. > > > + */ > > > + if (archDumpFormat != archNull) > > > + { > > > + if (archDumpFormat == archCustom) > > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid); > > > + else if (archDumpFormat == archTar) > > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid); > > > + else > > > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid); > > > > Use appendShellString() instead. Plain mode already does that for the > > "pg_dumpall -f" argument, which is part of db_subdir here. We don't want > > weird filename characters to work out differently for plain vs. non-plain > > mode. Also, it's easier to search for appendShellString() than to search for > > open-coded shell quoting. > > Yes, we can use appendShellString also. We are using snprintf in the > pg_dump.c file also. > Ex: snprintf(tagbuf, sizeof(tagbuf), "LARGE OBJECTS %u..%u", > loinfo->looids[0], loinfo->looids[loinfo->numlos - 1]); It's true snprintf() is not banned in these programs, but don't use it to do the quoting for OS shell command lines or fragments thereof. dbfilepath is a fragment of an OS shell command line. The LARGE OBJECTS string is not one of those. Hence, the LARGE OBJECTS scenario should keep using snprintf(). > If we want to use appendShellString, I can write a patch for these. > Please let me know your opinion. Use appendShellString() for shell quoting. Don't attempt to use it for other purposes. > > > --- a/src/bin/pg_dump/pg_restore.c > > > +++ b/src/bin/pg_dump/pg_restore.c > > > > > +/* > > > + * read_one_statement > > > + * > > > + * This will start reading from passed file pointer using fgetc and read till > > > + * semicolon(sql statement terminator for global.dat file) > > > + * > > > + * EOF is returned if end-of-file input is seen; time to shut down. > > > > What makes it okay to use this particular subset of SQL lexing? > > To support complex syntax, we used this code from another file. I'm hearing that you copied this code from somewhere. Running "git grep 'time to shut down'" suggests you copied it from InteractiveBackend(). Is that right? I do see other similarities between read_one_statement() and InteractiveBackend(). Copying InteractiveBackend() provides negligible assurance that this is the right subset of SQL lexing. Only single-user mode uses InteractiveBackend(). Single-user mode survives mostly as a last resort for recovering from having reached xidStopLimit, is rarely used, and only superusers write queries to it. > > > +/* > > > + * get_dbnames_list_to_restore > > > + * > > > + * This will mark for skipping any entries from dbname_oid_list that pattern match an > > > + * entry in the db_exclude_patterns list. > > > + * > > > + * Returns the number of database to be restored. > > > + * > > > + */ > > > +static int > > > +get_dbnames_list_to_restore(PGconn *conn, > > > + SimpleOidStringList *dbname_oid_list, > > > + SimpleStringList db_exclude_patterns) > > > +{ > > > + int count_db = 0; > > > + PQExpBuffer query; > > > + PGresult *res; > > > + > > > + query = createPQExpBuffer(); > > > + > > > + if (!conn) > > > + pg_log_info("considering PATTERN as NAME for --exclude-database option as no db connection while doing pg_restore."); > > > > When do we not have a connection here? We'd need to document this behavior > > variation if it stays, but I'd prefer if we can just rely on having a > > connection. > > Yes, we can document this behavior. My review asked a question there. I don't see an answer to that question. Would you answer that question?