public inbox for [email protected]help / color / mirror / Atom feed
Non-text mode for pg_dumpall 15+ messages / 5 participants [nested] [flat]
* Non-text mode for pg_dumpall @ 2024-06-10 12:58 Andrew Dunstan <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Andrew Dunstan @ 2024-06-10 12:58 UTC (permalink / raw) To: pgsql-hackers Tom and Nathan opined recently that providing for non-text mode for pg_dumpall would be a Good Thing (TM). Not having it has been a long-standing complaint, so I've decided to give it a go. I think we would need to restrict it to directory mode, at least to begin with. I would have a toc.dat with a different magic block (say "PGGLO" instead of "PGDMP") containing the global entries (roles, tablespaces, databases). Then for each database there would be a subdirectory (named for its toc entry) with a standard directory mode dump for that database. These could be generated in parallel (possibly by pg_dumpall calling pg_dump for each database). pg_restore on detecting a global type toc.data would restore the globals and then each of the databases (again possibly in parallel). I'm sure there are many wrinkles I haven't thought of, but I don't see any insurmountable obstacles, just a significant amount of code. Barring the unforeseen my main is to have a preliminary patch by the September CF. Following that I would turn my attention to using it in pg_upgrade. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 14:14 Nathan Bossart <[email protected]> parent: Andrew Dunstan <[email protected]> 0 siblings, 2 replies; 15+ messages in thread From: Nathan Bossart @ 2024-06-10 14:14 UTC (permalink / raw) To: Andrew Dunstan <[email protected]>; +Cc: pgsql-hackers On Mon, Jun 10, 2024 at 08:58:49AM -0400, Andrew Dunstan wrote: > Tom and Nathan opined recently that providing for non-text mode for > pg_dumpall would be a Good Thing (TM). Not having it has been a > long-standing complaint, so I've decided to give it a go. Thank you! > I think we would need to restrict it to directory mode, at least to begin > with. I would have a toc.dat with a different magic block (say "PGGLO" > instead of "PGDMP") containing the global entries (roles, tablespaces, > databases). Then for each database there would be a subdirectory (named for > its toc entry) with a standard directory mode dump for that database. These > could be generated in parallel (possibly by pg_dumpall calling pg_dump for > each database). pg_restore on detecting a global type toc.data would restore > the globals and then each of the databases (again possibly in parallel). I'm curious why we couldn't also support the "custom" format. > Following that I would turn my attention to using it in pg_upgrade. +1 -- nathan ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 14:51 Andrew Dunstan <[email protected]> parent: Nathan Bossart <[email protected]> 1 sibling, 1 reply; 15+ messages in thread From: Andrew Dunstan @ 2024-06-10 14:51 UTC (permalink / raw) To: Nathan Bossart <[email protected]>; +Cc: pgsql-hackers On 2024-06-10 Mo 10:14, Nathan Bossart wrote: > On Mon, Jun 10, 2024 at 08:58:49AM -0400, Andrew Dunstan wrote: >> Tom and Nathan opined recently that providing for non-text mode for >> pg_dumpall would be a Good Thing (TM). Not having it has been a >> long-standing complaint, so I've decided to give it a go. > Thank you! > >> I think we would need to restrict it to directory mode, at least to begin >> with. I would have a toc.dat with a different magic block (say "PGGLO" >> instead of "PGDMP") containing the global entries (roles, tablespaces, >> databases). Then for each database there would be a subdirectory (named for >> its toc entry) with a standard directory mode dump for that database. These >> could be generated in parallel (possibly by pg_dumpall calling pg_dump for >> each database). pg_restore on detecting a global type toc.data would restore >> the globals and then each of the databases (again possibly in parallel). > I'm curious why we couldn't also support the "custom" format. We could, but the housekeeping would be a bit harder. We'd need to keep pointers to the offsets of the per-database TOCs (I don't want to have a single per-cluster TOC). And we can't produce it in parallel, so I'd rather start with something we can produce in parallel. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 14:52 Magnus Hagander <[email protected]> parent: Nathan Bossart <[email protected]> 1 sibling, 1 reply; 15+ messages in thread From: Magnus Hagander @ 2024-06-10 14:52 UTC (permalink / raw) To: Nathan Bossart <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; pgsql-hackers On Mon, Jun 10, 2024 at 4:14 PM Nathan Bossart <[email protected]> wrote: > On Mon, Jun 10, 2024 at 08:58:49AM -0400, Andrew Dunstan wrote: > > Tom and Nathan opined recently that providing for non-text mode for > > pg_dumpall would be a Good Thing (TM). Not having it has been a > > long-standing complaint, so I've decided to give it a go. > > Thank you! > Indeed, this has been quite annoying! > I think we would need to restrict it to directory mode, at least to begin > > with. I would have a toc.dat with a different magic block (say "PGGLO" > > instead of "PGDMP") containing the global entries (roles, tablespaces, > > databases). Then for each database there would be a subdirectory (named > for > > its toc entry) with a standard directory mode dump for that database. > These > > could be generated in parallel (possibly by pg_dumpall calling pg_dump > for > > each database). pg_restore on detecting a global type toc.data would > restore > > the globals and then each of the databases (again possibly in parallel). > > I'm curious why we couldn't also support the "custom" format. > Or maybe even a combo - a directory of custom format files? Plus that one special file being globals? I'd say that's what most use cases I've seen would prefer. -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/; Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/; ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 14:52 Nathan Bossart <[email protected]> parent: Andrew Dunstan <[email protected]> 0 siblings, 0 replies; 15+ messages in thread From: Nathan Bossart @ 2024-06-10 14:52 UTC (permalink / raw) To: Andrew Dunstan <[email protected]>; +Cc: pgsql-hackers On Mon, Jun 10, 2024 at 10:51:42AM -0400, Andrew Dunstan wrote: > On 2024-06-10 Mo 10:14, Nathan Bossart wrote: >> I'm curious why we couldn't also support the "custom" format. > > We could, but the housekeeping would be a bit harder. We'd need to keep > pointers to the offsets of the per-database TOCs (I don't want to have a > single per-cluster TOC). And we can't produce it in parallel, so I'd rather > start with something we can produce in parallel. Got it. -- nathan ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 15:03 Nathan Bossart <[email protected]> parent: Magnus Hagander <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Nathan Bossart @ 2024-06-10 15:03 UTC (permalink / raw) To: Magnus Hagander <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; pgsql-hackers On Mon, Jun 10, 2024 at 04:52:06PM +0200, Magnus Hagander wrote: > On Mon, Jun 10, 2024 at 4:14 PM Nathan Bossart <[email protected]> > wrote: >> I'm curious why we couldn't also support the "custom" format. > > Or maybe even a combo - a directory of custom format files? Plus that one > special file being globals? I'd say that's what most use cases I've seen > would prefer. Is there a particular advantage to that approach as opposed to just using "directory" mode for everything? I know pg_upgrade uses "custom" mode for each of the databases, so a combo approach would be a closer match to the existing behavior, but that doesn't strike me as an especially strong reason to keep doing it that way. -- nathan ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 15:45 Magnus Hagander <[email protected]> parent: Nathan Bossart <[email protected]> 0 siblings, 2 replies; 15+ messages in thread From: Magnus Hagander @ 2024-06-10 15:45 UTC (permalink / raw) To: Nathan Bossart <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; pgsql-hackers On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <[email protected]> wrote: > On Mon, Jun 10, 2024 at 04:52:06PM +0200, Magnus Hagander wrote: > > On Mon, Jun 10, 2024 at 4:14 PM Nathan Bossart <[email protected] > > > > wrote: > >> I'm curious why we couldn't also support the "custom" format. > > > > Or maybe even a combo - a directory of custom format files? Plus that one > > special file being globals? I'd say that's what most use cases I've seen > > would prefer. > > Is there a particular advantage to that approach as opposed to just using > "directory" mode for everything? I know pg_upgrade uses "custom" mode for > each of the databases, so a combo approach would be a closer match to the > existing behavior, but that doesn't strike me as an especially strong > reason to keep doing it that way. > A gazillion files to deal with? Much easier to work with individual custom files if you're moving databases around and things like that. Much easier to monitor eg sizes/dates if you're using it for backups. It's not things that are make-it-or-break-it or anything, but there are some smaller things that definitely can be useful. -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/; Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/; ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 16:20 Nathan Bossart <[email protected]> parent: Magnus Hagander <[email protected]> 1 sibling, 0 replies; 15+ messages in thread From: Nathan Bossart @ 2024-06-10 16:20 UTC (permalink / raw) To: Magnus Hagander <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; pgsql-hackers On Mon, Jun 10, 2024 at 05:45:19PM +0200, Magnus Hagander wrote: > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <[email protected]> > wrote: >> Is there a particular advantage to that approach as opposed to just using >> "directory" mode for everything? I know pg_upgrade uses "custom" mode for >> each of the databases, so a combo approach would be a closer match to the >> existing behavior, but that doesn't strike me as an especially strong >> reason to keep doing it that way. > > A gazillion files to deal with? Much easier to work with individual custom > files if you're moving databases around and things like that. > Much easier to monitor eg sizes/dates if you're using it for backups. > > It's not things that are make-it-or-break-it or anything, but there are > some smaller things that definitely can be useful. Makes sense, thanks for elaborating. -- nathan ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 16:21 Tom Lane <[email protected]> parent: Magnus Hagander <[email protected]> 1 sibling, 2 replies; 15+ messages in thread From: Tom Lane @ 2024-06-10 16:21 UTC (permalink / raw) To: Magnus Hagander <[email protected]>; +Cc: Nathan Bossart <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers Magnus Hagander <[email protected]> writes: > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <[email protected]> > wrote: >> Is there a particular advantage to that approach as opposed to just using >> "directory" mode for everything? > A gazillion files to deal with? Much easier to work with individual custom > files if you're moving databases around and things like that. > Much easier to monitor eg sizes/dates if you're using it for backups. You can always tar up the directory tree after-the-fact if you want one file. Sure, that step's not parallelized, but I think we'd need some non-parallelized copying to create such a file anyway. regards, tom lane ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 17:27 Andrew Dunstan <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 0 replies; 15+ messages in thread From: Andrew Dunstan @ 2024-06-10 17:27 UTC (permalink / raw) To: Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; +Cc: Nathan Bossart <[email protected]>; pgsql-hackers On 2024-06-10 Mo 12:21, Tom Lane wrote: > Magnus Hagander <[email protected]> writes: >> On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <[email protected]> >> wrote: >>> Is there a particular advantage to that approach as opposed to just using >>> "directory" mode for everything? >> A gazillion files to deal with? Much easier to work with individual custom >> files if you're moving databases around and things like that. >> Much easier to monitor eg sizes/dates if you're using it for backups. > You can always tar up the directory tree after-the-fact if you want > one file. Sure, that step's not parallelized, but I think we'd need > some non-parallelized copying to create such a file anyway. > > Yeah. I think I can probably allow for Magnus' suggestion fairly easily, but if I have to choose I'm going to go for the format that can be produced with the maximum parallelism. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-06-10 19:36 Magnus Hagander <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 1 reply; 15+ messages in thread From: Magnus Hagander @ 2024-06-10 19:36 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Nathan Bossart <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <[email protected]> wrote: > Magnus Hagander <[email protected]> writes: > > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <[email protected] > > > > wrote: > >> Is there a particular advantage to that approach as opposed to just > using > >> "directory" mode for everything? > > > A gazillion files to deal with? Much easier to work with individual > custom > > files if you're moving databases around and things like that. > > Much easier to monitor eg sizes/dates if you're using it for backups. > > You can always tar up the directory tree after-the-fact if you want > one file. Sure, that step's not parallelized, but I think we'd need > some non-parallelized copying to create such a file anyway. > That would require double the disk space. But you can also just run pg_dump manually on each database and a pg_dumpall -g like people are doing today -- I thought this whole thing was about making it more convenient :) -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/; Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/; ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2024-12-31 18:23 Mahendra Singh Thalor <[email protected]> parent: Magnus Hagander <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Mahendra Singh Thalor @ 2024-12-31 18:23 UTC (permalink / raw) To: Magnus Hagander <[email protected]>; +Cc: Tom Lane <[email protected]>; Nathan Bossart <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers; Dilip Kumar <[email protected]> Hi all, With the help of Andrew and Dilip Kumar, I made a poc patch to dump all the databases in archive format and then restore them using pg_restore. Brief about the patch: new option to pg_dumpall: -F, --format=d|p (directory|plain) output file format (directory, plain text (default)) Ex: ./pg_dumpall --format=directory --file=dumpDirName dumps are as: global.dat ::: global sql commands in simple plain format map.dat. ::: dboid dbname ---entries for all databases in simple text form databases. ::: subdir dboid1 -> toc.dat and data files in archive format subdir dboid2. -> toc.dat and data files in archive format etc --------------------------------------------------------------------------- new options to pg_restore: -g, --globals-only restore only global objects, no databases --exclude-database=PATTERN exclude databases whose name matches PATTERN When we give -g/--globals-only option, then only restore globals, no db restoring. *Design*: When --format=directory is specified and there is no toc.dat file in the main directory, then check for global.dat and map.dat to restore all databases. If both files exist in a directory, then first restore all globals from global.dat and then restore all databases one by one from map.dat list. While restoring, skip the databases that are given with exclude-database. --------------------------------------------------------------------------- NOTE: if needed, restore single db by particular subdir Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5 -- here, 5 is the dboid of postgres db -- to get dboid, refer dbname in map.file -------------------------------------------------------------------------- Please let me know feedback for the attached patch. On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <[email protected]> wrote: > On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <[email protected]> wrote: > >> Magnus Hagander <[email protected]> writes: >> > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart < >> [email protected]> >> > wrote: >> >> Is there a particular advantage to that approach as opposed to just >> using >> >> "directory" mode for everything? >> >> > A gazillion files to deal with? Much easier to work with individual >> custom >> > files if you're moving databases around and things like that. >> > Much easier to monitor eg sizes/dates if you're using it for backups. >> >> You can always tar up the directory tree after-the-fact if you want >> one file. Sure, that step's not parallelized, but I think we'd need >> some non-parallelized copying to create such a file anyway. >> > > That would require double the disk space. > > But you can also just run pg_dump manually on each database and a > pg_dumpall -g like people are doing today -- I thought this whole thing was > about making it more convenient :) > > -- > Magnus Hagander > Me: https://www.hagander.net/ <http://www.hagander.net/; > Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/; > -- Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com Attachments: [application/octet-stream] v01_poc_pg_dumpall_with_directory_31dec.patch (33.2K, 3-v01_poc_pg_dumpall_with_directory_31dec.patch) download | inline diff: From 332530c17f2fb46af55791e7c7ee6393767a29eb Mon Sep 17 00:00:00 2001 From: Mahendra Singh Thalor <[email protected]> Date: Tue, 31 Dec 2024 09:58:20 -0800 Subject: [PATCH] pg_dumpall with directory format and restore it by pg_restore new option to pg_dumpall: -F, --format=d|p|directory|plain output file format (directory, plain text (default)) Ex: ./pg_dumpall --format=directory --file=dumpDirName dumps are as: global.dat ::: global sql commands in simple plain format map.dat. ::: dboid dbname ---entries for all databases in simple text form databases. ::: subdir dboid1 -> toc.dat and data files in archive format subdir dboid2. -> toc.dat and data files in archive format etc --------------------------------------------------------------------------- NOTE: if needed, restore single db by particular subdir Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5 -- here, 5 is the dboid of postgres db -- to get dboid, refer dbname in map.file -------------------------------------------------------------------------- new options to pg_restore: -g, --globals-only restore only global objects, no databases --exclude-database=PATTERN exclude databases whose name matches PATTERN When we give -g/--globals-only option, then only restore globals, no db restoring. Design: When --format=directory is specified and there is no toc.dat in main directory, then check for global.dat and map.dat to restore all databases. If both files are exists in directory, then firest restore all globals from global.dat and then restore all databases one by one from map.dat list. --- doc/src/sgml/ref/pg_dumpall.sgml | 30 ++ doc/src/sgml/ref/pg_restore.sgml | 30 ++ src/bin/pg_dump/pg_dumpall.c | 138 ++++++++-- src/bin/pg_dump/pg_restore.c | 571 ++++++++++++++++++++++++++++++++++++++- 4 files changed, 739 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml index 014f279..b6c9feb 100644 --- a/doc/src/sgml/ref/pg_dumpall.sgml +++ b/doc/src/sgml/ref/pg_dumpall.sgml @@ -582,6 +582,36 @@ exclude database <replaceable class="parameter">PATTERN</replaceable> </listitem> </varlistentry> + <varlistentry> + <term><option>-F <replaceable class="parameter">format</replaceable></option></term> + <term><option>--format=<replaceable class="parameter">format</replaceable></option></term> + <listitem> + <para> + Specify format of dump files. If we want to dump all the databases, then pass this as directory so that dump of all databases can be taken in separate subdirectory in archive format. +by default, this is plain. + + <variablelist> + <varlistentry> + <term><literal>d</literal></term> + <term><literal>directory</literal></term> + <listitem> + <para> + The archive is a directory archive. + </para> + </listitem> + </varlistentry> + + <variablelist> + <varlistentry> + <term><literal>p</literal></term> + <term><literal>plain</literal></term> + <listitem> + <para> + The archive is a plain archive.(by default also) + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-?</option></term> <term><option>--help</option></term> diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index b8b27e1..ab2e035 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -316,6 +316,16 @@ PostgreSQL documentation </varlistentry> <varlistentry> + <term><option>-g</option></term> + <term><option>--globals-only</option></term> + <listitem> + <para> + Restore only global objects (roles and tablespaces), no databases. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-I <replaceable class="parameter">index</replaceable></option></term> <term><option>--index=<replaceable class="parameter">index</replaceable></option></term> <listitem> @@ -932,6 +942,26 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--exclude-database=<replaceable class="parameter">pattern</replaceable></option></term> + <listitem> + <para> + Do not restore databases whose name matches + <replaceable class="parameter">pattern</replaceable>. + Multiple patterns can be excluded by writing multiple + <option>--exclude-database</option> switches. The + <replaceable class="parameter">pattern</replaceable> parameter is + interpreted as a pattern according to the same rules used by + <application>psql</application>'s <literal>\d</literal> + commands (see <xref linkend="app-psql-patterns"/>), + so multiple databases can also be excluded by writing wildcard + characters in the pattern. When using wildcards, be careful to + quote the pattern if needed to prevent shell wildcard expansion. + </para> + </listitem> + </varlistentry> + + </variablelist> </para> </refsect1> diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index 9a04e51..a748962 100644 --- a/src/bin/pg_dump/pg_dumpall.c +++ b/src/bin/pg_dump/pg_dumpall.c @@ -15,6 +15,7 @@ #include "postgres_fe.h" +#include <sys/stat.h> #include <time.h> #include <unistd.h> @@ -29,6 +30,7 @@ #include "filter.h" #include "getopt_long.h" #include "pg_backup.h" +#include "pg_backup_archiver.h" /* version string we expect back from pg_dump */ #define PGDUMP_VERSIONSTR "pg_dump (PostgreSQL) " PG_VERSION "\n" @@ -64,9 +66,10 @@ static void dropTablespaces(PGconn *conn); static void dumpTablespaces(PGconn *conn); static void dropDBs(PGconn *conn); static void dumpUserConfig(PGconn *conn, const char *username); -static void dumpDatabases(PGconn *conn); +static void dumpDatabases(PGconn *conn, bool directory_format); static void dumpTimestamp(const char *msg); -static int runPgDump(const char *dbname, const char *create_opts); +static int runPgDump(const char *dbname, const char *create_opts, + char *dbfile); static void buildShSecLabels(PGconn *conn, const char *catalog_name, Oid objectId, const char *objtype, const char *objname, @@ -147,6 +150,7 @@ main(int argc, char *argv[]) {"password", no_argument, NULL, 'W'}, {"no-privileges", no_argument, NULL, 'x'}, {"no-acl", no_argument, NULL, 'x'}, + {"format", required_argument, NULL, 'F'}, /* * the following options don't have an equivalent short option letter @@ -188,11 +192,13 @@ main(int argc, char *argv[]) char *pgdb = NULL; char *use_role = NULL; const char *dumpencoding = NULL; + const char *format; trivalue prompt_password = TRI_DEFAULT; bool data_only = false; bool globals_only = false; bool roles_only = false; bool tablespaces_only = false; + bool directory_format = false; PGconn *conn; int encoding; const char *std_strings; @@ -237,7 +243,7 @@ main(int argc, char *argv[]) pgdumpopts = createPQExpBuffer(); - while ((c = getopt_long(argc, argv, "acd:E:f:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) + while ((c = getopt_long(argc, argv, "acd:E:f:F:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) { switch (c) { @@ -265,7 +271,17 @@ main(int argc, char *argv[]) appendPQExpBufferStr(pgdumpopts, " -f "); appendShellString(pgdumpopts, filename); break; - + case 'F': + format = optarg; + if ((strcmp(format, "directory") == 0 || strcmp(format, "d") == 0)) + directory_format = true; + else if (strcmp(format, "plain") != 0 || strcmp(format, "p") == 0) + { + pg_log_error("invalid format specified: %s", format); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit_nicely(1); + } + break; case 'g': globals_only = true; break; @@ -497,9 +513,31 @@ main(int argc, char *argv[]) &database_exclude_names); /* - * Open the output file if required, otherwise use stdout + * Open the output file if required, otherwise use stdout. */ - if (filename) + if (directory_format) + { + char toc_path[MAXPGPATH]; + + /* + * If directory format is specified then we must provide the directory + * name. + */ + if (!filename || strcmp(filename, "") == 0) + pg_fatal("no output directory specified"); + + /* TODO: accept the empty existing directory. */ + if (mkdir(filename, 0700) < 0) + pg_fatal("could not create directory \"%s\": %m", + filename); + + snprintf(toc_path, MAXPGPATH, "%s/global.dat", filename); + + OPF = fopen(toc_path, "w"); + if (!OPF) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + } + else if (filename) { OPF = fopen(filename, PG_BINARY_W); if (!OPF) @@ -607,7 +645,7 @@ main(int argc, char *argv[]) } if (!globals_only && !roles_only && !tablespaces_only) - dumpDatabases(conn); + dumpDatabases(conn, directory_format); PQfinish(conn); @@ -620,7 +658,7 @@ main(int argc, char *argv[]) fclose(OPF); /* sync the resulting file, errors are not fatal */ - if (dosync) + if (dosync && !directory_format) (void) fsync_fname(filename, false); } @@ -637,6 +675,7 @@ help(void) printf(_("\nGeneral options:\n")); printf(_(" -f, --file=FILENAME output file name\n")); + printf(_(" -F, --format=d|p output file format (directory, plain text (default))\n")); printf(_(" -v, --verbose verbose mode\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n")); @@ -1487,10 +1526,13 @@ expand_dbname_patterns(PGconn *conn, * Dump contents of databases. */ static void -dumpDatabases(PGconn *conn) +dumpDatabases(PGconn *conn, bool directory_format) { PGresult *res; int i; + char db_subdir[MAXPGPATH]; + char dbfilepath[MAXPGPATH]; + FILE *map_file; /* * Skip databases marked not datallowconn, since we'd be unable to connect @@ -1504,7 +1546,7 @@ dumpDatabases(PGconn *conn) * doesn't have some failure mode with --clean. */ res = executeQuery(conn, - "SELECT datname " + "SELECT datname, oid " "FROM pg_database d " "WHERE datallowconn AND datconnlimit != -2 " "ORDER BY (datname <> 'template1'), datname"); @@ -1512,9 +1554,30 @@ dumpDatabases(PGconn *conn) if (PQntuples(res) > 0) fprintf(OPF, "--\n-- Databases\n--\n\n"); + /* + * If directory format is specified then create a subdirectory under the + * main directory and each database dump file will be created under the + * subdirectory in archive mode as per single db pg_dump. + */ + if (directory_format) + { + char map_file_path[MAXPGPATH]; + + snprintf(db_subdir, MAXPGPATH, "%s/databases", filename); + if (mkdir(db_subdir, 0755) != 0) + pg_log_error("could not create subdirectory \"%s\": %m", db_subdir); + + /* Create a map file (to store dboid and dbname) */ + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", filename); + map_file = fopen(map_file_path, "w"); + if (!map_file) + pg_fatal("could not open map file: %s", strerror(errno)); + } + for (i = 0; i < PQntuples(res); i++) { char *dbname = PQgetvalue(res, i, 0); + char *oid = PQgetvalue(res, i, 1); const char *create_opts; int ret; @@ -1522,6 +1585,14 @@ dumpDatabases(PGconn *conn) if (strcmp(dbname, "template0") == 0) continue; + if (directory_format) + { + snprintf(dbfilepath, MAXPGPATH, "-f %s/%s", db_subdir, oid); + + /* append dboid and dbname in map file. */ + fprintf(map_file, "%s %s\n", oid, dbname); + } + /* Skip any explicitly excluded database */ if (simple_string_list_member(&database_exclude_names, dbname)) { @@ -1531,7 +1602,8 @@ dumpDatabases(PGconn *conn) pg_log_info("dumping database \"%s\"", dbname); - fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); + if (!directory_format) + fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); /* * We assume that "template1" and "postgres" already exist in the @@ -1549,20 +1621,21 @@ dumpDatabases(PGconn *conn) { create_opts = ""; /* Since pg_dump won't emit a \connect command, we must */ - fprintf(OPF, "\\connect %s\n\n", dbname); + if (!directory_format) + fprintf(OPF, "\\connect %s\n\n", dbname); } } else create_opts = "--create"; - if (filename) + if (!directory_format && filename) fclose(OPF); - ret = runPgDump(dbname, create_opts); + ret = runPgDump(dbname, create_opts, dbfilepath); if (ret != 0) pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); - if (filename) + if (!directory_format && filename) { OPF = fopen(filename, PG_BINARY_A); if (!OPF) @@ -1571,6 +1644,10 @@ dumpDatabases(PGconn *conn) } } + /* close map file */ + if (directory_format) + fclose(map_file); + PQclear(res); } @@ -1580,7 +1657,7 @@ dumpDatabases(PGconn *conn) * Run pg_dump on dbname, with specified options. */ static int -runPgDump(const char *dbname, const char *create_opts) +runPgDump(const char *dbname, const char *create_opts, char *dbfile) { PQExpBufferData connstrbuf; PQExpBufferData cmd; @@ -1589,17 +1666,26 @@ runPgDump(const char *dbname, const char *create_opts) initPQExpBuffer(&connstrbuf); initPQExpBuffer(&cmd); - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, - pgdumpopts->data, create_opts); - - /* - * If we have a filename, use the undocumented plain-append pg_dump - * format. - */ - if (filename) - appendPQExpBufferStr(&cmd, " -Fa "); + if (dbfile) + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + dbfile, create_opts); + appendPQExpBufferStr(&cmd, " -F d "); + } else - appendPQExpBufferStr(&cmd, " -Fp "); + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + pgdumpopts->data, create_opts); + + /* + * If we have a filename, use the undocumented plain-append pg_dump + * format. + */ + if (filename) + appendPQExpBufferStr(&cmd, " -Fa "); + else + appendPQExpBufferStr(&cmd, " -Fp "); + } /* * Append the database name to the already-constructed stem of connection diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index 88ae39d..594ae27 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -41,27 +41,45 @@ #include "postgres_fe.h" #include <ctype.h> +#include <sys/stat.h> #ifdef HAVE_TERMIOS_H #include <termios.h> #endif +#include "common/connect.h" +#include "compress_io.h" +#include "common/string.h" #include "fe_utils/option_utils.h" +#include "fe_utils/string_utils.h" #include "filter.h" #include "getopt_long.h" #include "parallel.h" +#include "pg_backup_archiver.h" #include "pg_backup_utils.h" static void usage(const char *progname); static void read_restore_filters(const char *filename, RestoreOptions *opts); +static bool _fileExistsInDirectory(const char *dir, const char *filename); +static bool restoreOneDatabase(const char *inputFileSpec, + RestoreOptions *opts, int numWorkers); +static void expand_dbname_patterns(PGconn *conn, SimpleStringList *patterns, + SimpleStringList *names); +static PGconn *connectDatabase(const char *dbname, const char *connection_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error); +static PGresult *executeQuery(PGconn *conn, const char *query); +static int ReadOneStatement(StringInfo inBuf, FILE *f_glo); +static int restoreAllDatabases(const char *dumpdirpath, + SimpleStringList database_exclude_names, RestoreOptions *opts, + int numWorkers); +static void execute_global_sql_commands(PGconn *conn, const char *dumpdirpath); int main(int argc, char **argv) { RestoreOptions *opts; int c; - int exit_code; int numWorkers = 1; - Archive *AH; char *inputFileSpec; static int disable_triggers = 0; static int enable_row_security = 0; @@ -77,11 +95,14 @@ main(int argc, char **argv) static int strict_names = 0; bool data_only = false; bool schema_only = false; + SimpleStringList database_exclude_patterns = {NULL, NULL}; + bool globals_only = false; struct option cmdopts[] = { {"clean", 0, NULL, 'c'}, {"create", 0, NULL, 'C'}, {"data-only", 0, NULL, 'a'}, + {"globals-only", 0, NULL, 'g'}, {"dbname", 1, NULL, 'd'}, {"exit-on-error", 0, NULL, 'e'}, {"exclude-schema", 1, NULL, 'N'}, @@ -128,6 +149,7 @@ main(int argc, char **argv) {"no-security-labels", no_argument, &no_security_labels, 1}, {"no-subscriptions", no_argument, &no_subscriptions, 1}, {"filter", required_argument, NULL, 4}, + {"exclude-database", required_argument, NULL, 6}, {NULL, 0, NULL, 0} }; @@ -156,7 +178,7 @@ main(int argc, char **argv) } } - while ((c = getopt_long(argc, argv, "acCd:ef:F:h:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", + while ((c = getopt_long(argc, argv, "aAcCd:ef:F:gh:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", cmdopts, NULL)) != -1) { switch (c) @@ -183,11 +205,14 @@ main(int argc, char **argv) if (strlen(optarg) != 0) opts->formatName = pg_strdup(optarg); break; + case 'g': + /* restore only global.dat file from directory */ + globals_only = true; + break; case 'h': if (strlen(optarg) != 0) opts->cparams.pghost = pg_strdup(optarg); break; - case 'j': /* number of restore jobs */ if (!option_parse_int(optarg, "-j/--jobs", 1, PG_MAX_JOBS, @@ -302,6 +327,10 @@ main(int argc, char **argv) exit(1); opts->exit_on_error = true; break; + case 6: + /* list of databases those needs to skip while restoring */ + simple_string_list_append(&database_exclude_patterns, optarg); + break; default: /* getopt_long already emitted a complaint */ @@ -329,6 +358,16 @@ main(int argc, char **argv) if (!opts->cparams.dbname && !opts->filename && !opts->tocSummary) pg_fatal("one of -d/--dbname and -f/--file must be specified"); + if (database_exclude_patterns.head != NULL && globals_only) + { + pg_log_error("option --exclude-database cannot be used together with -g/--globals-only"); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit_nicely(1); + } + + if (globals_only && opts->cparams.dbname == NULL) + pg_fatal("option -g/--globals-only requires option -d/--dbname"); + /* Should get at most one of -d and -f, else user is confused */ if (opts->cparams.dbname) { @@ -406,6 +445,76 @@ main(int argc, char **argv) } } + /* + * If directory format, then first check that toc.dat file exist or not? + * + * if toc.dat exist, then no need to check for map.dat and global.dat + * + */ + if (opts->format == archDirectory && + inputFileSpec != NULL && + !_fileExistsInDirectory(inputFileSpec, "toc.dat")) + { + /* if global.dat and map.dat are exist, then open them */ + if (_fileExistsInDirectory(pg_strdup(inputFileSpec), "global.dat") + && _fileExistsInDirectory(pg_strdup(inputFileSpec), "map.dat")) + { + /* Found the global.dat and map.dat file so process. */ + PGconn *conn = NULL; + SimpleStringList database_exclude_names = {NULL, NULL}; + + if (opts->cparams.dbname == NULL) + pg_fatal(" -d/--dbanme should be given if using dump of dumpall and global.dat"); + + if (opts->createDB != 1) + pg_fatal("option -C/--create should be specified if using dump of dumpall with global.dat"); + + /* Connect to database so that we can execute global.dat */ + conn = connectDatabase(opts->cparams.dbname, NULL, + opts->cparams.pghost, opts->cparams.pgport, opts->cparams.username, + TRI_DEFAULT, false); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", opts->cparams.dbname); + + /* Open global.dat file and execute all the sql commands */ + execute_global_sql_commands(conn, inputFileSpec); + + /* if globals-only, then return from here */ + if (globals_only) + { + PQfinish(conn); /* close the connection */ + return 0; + } + + /* Get a list of database names that match the exclude patterns */ + expand_dbname_patterns(conn, &database_exclude_patterns, + &database_exclude_names); + + /* Close the db connection as we are done with globals */ + PQfinish(conn); + + /* Now restore all the databases from map.dat file */ + return restoreAllDatabases(inputFileSpec, database_exclude_names, + opts, numWorkers); + }/* end if */ + }/* end if */ + + return restoreOneDatabase(inputFileSpec, opts, numWorkers); +} + +/* + * restoreOneDatabase + * + * This will restore one database using toc.dat file. + */ +static bool +restoreOneDatabase(const char *inputFileSpec, RestoreOptions *opts, + int numWorkers) +{ + Archive *AH; + bool exit_code; + AH = OpenArchive(inputFileSpec, opts->format); SetArchiveOptions(AH, NULL, opts); @@ -471,6 +580,7 @@ usage(const char *progname) printf(_(" -c, --clean clean (drop) database objects before recreating\n")); printf(_(" -C, --create create the target database\n")); printf(_(" -e, --exit-on-error exit on error, default is to continue\n")); + printf(_(" -g, --globals-only restore only global objects, no databases\n")); printf(_(" -I, --index=NAME restore named index\n")); printf(_(" -j, --jobs=NUM use this many parallel jobs to restore\n")); printf(_(" -L, --use-list=FILENAME use table of contents from this file for\n" @@ -483,6 +593,7 @@ usage(const char *progname) printf(_(" -S, --superuser=NAME superuser user name to use for disabling triggers\n")); printf(_(" -t, --table=NAME restore named relation (table, view, etc.)\n")); printf(_(" -T, --trigger=NAME restore named trigger\n")); + printf(_(" --exclude-database=PATTERN exclude databases whose name matches PATTERN\n")); printf(_(" -x, --no-privileges skip restoration of access privileges (grant/revoke)\n")); printf(_(" -1, --single-transaction restore as a single transaction\n")); printf(_(" --disable-triggers disable triggers during data-only restore\n")); @@ -621,3 +732,455 @@ read_restore_filters(const char *filename, RestoreOptions *opts) filter_free(&fstate); } + +static bool +_fileExistsInDirectory(const char *dir, const char *filename) +{ + struct stat st; + char buf[MAXPGPATH]; + + if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH) + pg_fatal("directory name too long: \"%s\"", dir); + + return (stat(buf, &st) == 0 && S_ISREG(st.st_mode)); +} + +/* + * Find a list of database names that match the given patterns. + * See also expand_table_name_patterns() in pg_dump.c + */ +static void +expand_dbname_patterns(PGconn *conn, + SimpleStringList *patterns, + SimpleStringList *names) +{ + PQExpBuffer query; + PGresult *res; + + if (patterns->head == NULL) + return; /* nothing to do */ + + query = createPQExpBuffer(); + + /* + * The loop below runs multiple SELECTs, which might sometimes result in + * duplicate entries in the name list, but we don't care, since all we're + * going to do is test membership of the list. + */ + + for (SimpleStringListCell *cell = patterns->head; cell; cell = cell->next) + { + int dotcnt; + + appendPQExpBufferStr(query, + "SELECT datname FROM pg_catalog.pg_database n\n"); + processSQLNamePattern(conn, query, cell->val, false, + false, NULL, "datname", NULL, NULL, NULL, + &dotcnt); + + if (dotcnt > 0) + { + pg_log_error("improper qualified name (too many dotted names): %s", + cell->val); + PQfinish(conn); + exit_nicely(1); + } + + res = executeQuery(conn, query->data); + for (int i = 0; i < PQntuples(res); i++) + { + simple_string_list_append(names, PQgetvalue(res, i, 0)); + } + + PQclear(res); + resetPQExpBuffer(query); + } + + destroyPQExpBuffer(query); +} + +/* + * Make a database connection with the given parameters. An + * interactive password prompt is automatically issued if required. + * + * If fail_on_error is false, we return NULL without printing any message + * on failure, but preserve any prompted password for the next try. + * + */ +static PGconn * +connectDatabase(const char *dbname, const char *connection_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error) +{ + PGconn *conn; + bool new_pass; + const char *remoteversion_str; + int my_version; + const char **keywords = NULL; + const char **values = NULL; + PQconninfoOption *conn_opts = NULL; + static char *password = NULL; + static int server_version; + + if (prompt_password == TRI_YES && !password) + password = simple_prompt("Password: ", false); + + /* + * Start the connection. Loop until we have a password if requested by + * backend. + */ + do + { + int argcount = 6; + PQconninfoOption *conn_opt; + char *err_msg = NULL; + int i = 0; + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* + * Merge the connection info inputs given in form of connection string + * and other options. Explicitly discard any dbname value in the + * connection string; otherwise, PQconnectdbParams() would interpret + * that value as being itself a connection string. + */ + if (connection_string) + { + conn_opts = PQconninfoParse(connection_string, &err_msg); + if (conn_opts == NULL) + pg_fatal("%s", err_msg); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + argcount++; + } + + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + { + keywords[i] = conn_opt->keyword; + values[i] = conn_opt->val; + i++; + } + } + } + else + { + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + } + + if (pghost) + { + keywords[i] = "host"; + values[i] = pghost; + i++; + } + if (pgport) + { + keywords[i] = "port"; + values[i] = pgport; + i++; + } + if (pguser) + { + keywords[i] = "user"; + values[i] = pguser; + i++; + } + if (password) + { + keywords[i] = "password"; + values[i] = password; + i++; + } + if (dbname) + { + keywords[i] = "dbname"; + values[i] = dbname; + i++; + } + keywords[i] = "fallback_application_name"; + values[i] = progname; + i++; + + new_pass = false; + conn = PQconnectdbParams(keywords, values, true); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", dbname); + + if (PQstatus(conn) == CONNECTION_BAD && + PQconnectionNeedsPassword(conn) && + !password && + prompt_password != TRI_NO) + { + PQfinish(conn); + password = simple_prompt("Password: ", false); + new_pass = true; + } + } while (new_pass); + + /* check to see that the backend connection was successfully made */ + if (PQstatus(conn) == CONNECTION_BAD) + { + if (fail_on_error) + pg_fatal("%s", PQerrorMessage(conn)); + else + { + PQfinish(conn); + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + return NULL; + } + } + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* Check version */ + remoteversion_str = PQparameterStatus(conn, "server_version"); + if (!remoteversion_str) + pg_fatal("could not get server version"); + server_version = PQserverVersion(conn); + if (server_version == 0) + pg_fatal("could not parse server version \"%s\"", + remoteversion_str); + + my_version = PG_VERSION_NUM; + + /* + * We allow the server to be back to 9.2, and up to any minor release of + * our own major version. (See also version check in pg_dump.c.) + */ + if (my_version != server_version + && (server_version < 90200 || + (server_version / 100) > (my_version / 100))) + { + pg_log_error("aborting because of server version mismatch"); + pg_log_error_detail("server version: %s; %s version: %s", + remoteversion_str, progname, PG_VERSION); + exit_nicely(1); + } + + PQclear(executeQuery(conn, ALWAYS_SECURE_SEARCH_PATH_SQL)); + + return conn; +} + +/* + * Run a query, return the results, exit program on failure. + */ +static PGresult * +executeQuery(PGconn *conn, const char *query) +{ + PGresult *res; + + pg_log_info("executing %s", query); + + res = PQexec(conn, query); + if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) + { + pg_log_error("query failed: %s", PQerrorMessage(conn)); + pg_log_error_detail("Query was: %s", query); + PQfinish(conn); + exit_nicely(1); + } + + return res; +} + +/* ---------------- + * ReadOneStatement() + * + * This will start reading from passed file pointer using fgetc and read till + * semicolon(sql statement terminator for global.sql file) + * + * EOF is returned if end-of-file input is seen; time to shut down. + * ---------------- + */ + +static int +ReadOneStatement(StringInfo inBuf, FILE *f_glo) +{ + int c; /* character read from getc() */ + + resetStringInfo(inBuf); + + /* + * Read characters until EOF or the appropriate delimiter is seen. + */ + while ((c = fgetc(f_glo)) != EOF) + { + appendStringInfoChar(inBuf, (char) c); + + if (c == '\n') + { + if(inBuf->len > 1 && + inBuf->data[inBuf->len - 2] == ';') + break; + else + continue; + } + } + + /* No input before EOF signal means time to quit. */ + if (c == EOF && inBuf->len == 0) + return EOF; + + /* Add '\0' to make it look the same as message case. */ + appendStringInfoChar(inBuf, (char) '\0'); + + return 'Q'; +} + +/* + * This will restore databases those dumps are present in + * directory. + * + * This will process db one by one as names and dboid are mentioned in map.dat + * file. + */ +static int +restoreAllDatabases(const char *dumpdirpath, + SimpleStringList database_exclude_names, RestoreOptions *opts, + int numWorkers) +{ + char map_file_path[MAXPGPATH]; + char line[MAXPGPATH]; + int lineno = 0; + int exit_code = 0; + int processed_db = 0; + FILE *pfile; + + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", dumpdirpath); + + /* Open map.dat file. */ + pfile = fopen(map_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open map.dat file: %s", strerror(errno)); + + /* + * Read one line from map.dat and extract dbname and dboid to restore it. + */ + while((fgets(line, MAXPGPATH, pfile)) != NULL) + { + char subdirpath[MAXPGPATH]; + Oid dboid; + char dbname[MAXPGPATH + 1]; + int dbexit_code; + + lineno++; + + /* + * We need to reset override_dbname so that objects can be restored into + * already created database. (used with -d/--dbname option) + */ + if (opts->cparams.override_dbname) + { + pfree(opts->cparams.override_dbname); + opts->cparams.override_dbname = NULL; + } + + /* Extract dbname and dboid from line */ + sscanf(line, "%u %s" , &dboid, dbname); + + pg_log_info("found dbname as :%s and dboid:%d in map.dat file while restoring", dbname, dboid); + + /* Report error if file has any corrupted data. */ + if (!OidIsValid(dboid) || strlen(dbname) == 0) + pg_fatal("invalid entry in map.dat file at line : %d", lineno); + + snprintf(subdirpath, MAXPGPATH, "%s/databases/%u", dumpdirpath, dboid); + + /* + * Database -d/--dbname is already created so reset createDB to ignore + * database creation error. + */ + if (strcmp(dbname, opts->cparams.dbname) == 0) + opts->createDB = 0; + + /* Skip any explicitly excluded database */ + if (simple_string_list_member(&database_exclude_names, dbname)) + { + pg_log_info("excluding database \"%s\"", dbname); + continue; + } + + pg_log_info("restoring database \"%s\"", dbname); + + dbexit_code = restoreOneDatabase(subdirpath, opts, numWorkers); + + /* Store exit_code to report it back. */ + if (exit_code == 0 && dbexit_code != 0) + exit_code = dbexit_code; + + processed_db++; + + /* Set createDB option to create new database. */ + if (strcmp(dbname, opts->cparams.dbname) == 0) + opts->createDB = 1; + } /* end while */ + + /* Log number of processed databases.*/ + pg_log_info("number of restored databases are %d", processed_db); + + return exit_code; +} + +/* + * This will open global.dat file and will execute all global sql commands one + * by one statement. + * + * semicolon is considered as statement terminator. + */ +static void +execute_global_sql_commands(PGconn *conn, const char *dumpdirpath) +{ + char global_file_path[MAXPGPATH]; + PGresult *result; + StringInfoData sqlstatement; + FILE *pfile; + + snprintf(global_file_path, MAXPGPATH, "%s/global.dat", dumpdirpath); + + /* now open global.dat file */ + pfile = fopen(global_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + + /* Init sqlstatement to append commands */ + initStringInfo(&sqlstatement); + + /* Process file till EOF and execute sql statements */ + while (ReadOneStatement(&sqlstatement, pfile) != EOF) + { + result = PQexec(conn, sqlstatement.data); + + switch (PQresultStatus(result)) + { + case PGRES_COMMAND_OK: + case PGRES_TUPLES_OK: + case PGRES_EMPTY_QUERY: + case PGRES_COPY_IN: + break; + default: + pg_log_error("could not execute query: %s \nCommand was: %s", PQerrorMessage(conn), sqlstatement.data); + } + } +} -- 1.8.3.1 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2025-01-01 20:35 Mahendra Singh Thalor <[email protected]> parent: Mahendra Singh Thalor <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Mahendra Singh Thalor @ 2025-01-01 20:35 UTC (permalink / raw) To: Magnus Hagander <[email protected]>; +Cc: Tom Lane <[email protected]>; Nathan Bossart <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers; Dilip Kumar <[email protected]> Here, I am attaching an updated patch. I fixed some bugs of v01 patch and did some code cleanup also. TODO WIP 1: after excluding databases, we have paths of all the databases that are needed to restore so we can launch parallel workers for each database. I am studying for this part. TODO WIP 2: exclude-database=NAME, for pg_restore, I am using NAME as of now, I will try to make it PATTERN. PATTERN should be matched from map.dat file. Please have a look over the patch and let me know feedback. On Tue, 31 Dec 2024 at 23:53, Mahendra Singh Thalor <[email protected]> wrote: > Hi all, > With the help of Andrew and Dilip Kumar, I made a poc patch to dump all > the databases in archive format and then restore them using pg_restore. > > Brief about the patch: > new option to pg_dumpall: > -F, --format=d|p (directory|plain) output file format (directory, plain > text (default)) > > Ex: ./pg_dumpall --format=directory --file=dumpDirName > > dumps are as: > global.dat ::: global sql commands in simple plain format > map.dat. ::: dboid dbname ---entries for all databases in simple text > form > databases. ::: > subdir dboid1 -> toc.dat and data files in archive format > subdir dboid2. -> toc.dat and data files in archive format > etc > --------------------------------------------------------------------------- > > new options to pg_restore: > -g, --globals-only restore only global objects, no databases > --exclude-database=PATTERN exclude databases whose name matches PATTERN > > When we give -g/--globals-only option, then only restore globals, no db > restoring. > > *Design*: > When --format=directory is specified and there is no toc.dat file in the > main directory, then check > for global.dat and map.dat to restore all databases. If both files exist > in a directory, > then first restore all globals from global.dat and then restore all > databases one by one > from map.dat list. > While restoring, skip the databases that are given with exclude-database. > > --------------------------------------------------------------------------- > NOTE: > if needed, restore single db by particular subdir > > Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5 > -- here, 5 is the dboid of postgres db > -- to get dboid, refer dbname in map.file > -------------------------------------------------------------------------- > > Please let me know feedback for the attached patch. > > On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <[email protected]> wrote: > >> On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <[email protected]> wrote: >> >>> Magnus Hagander <[email protected]> writes: >>> > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart < >>> [email protected]> >>> > wrote: >>> >> Is there a particular advantage to that approach as opposed to just >>> using >>> >> "directory" mode for everything? >>> >>> > A gazillion files to deal with? Much easier to work with individual >>> custom >>> > files if you're moving databases around and things like that. >>> > Much easier to monitor eg sizes/dates if you're using it for backups. >>> >>> You can always tar up the directory tree after-the-fact if you want >>> one file. Sure, that step's not parallelized, but I think we'd need >>> some non-parallelized copying to create such a file anyway. >>> >> >> That would require double the disk space. >> >> But you can also just run pg_dump manually on each database and a >> pg_dumpall -g like people are doing today -- I thought this whole thing was >> about making it more convenient :) >> >> -- >> Magnus Hagander >> Me: https://www.hagander.net/ <http://www.hagander.net/; >> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/; >> > > > -- > Thanks and Regards > Mahendra Singh Thalor > EnterpriseDB: http://www.enterprisedb.com > -- Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com Attachments: [application/octet-stream] v02_poc_pg_dumpall_with_directory_2nd_jan.patch (35.4K, 3-v02_poc_pg_dumpall_with_directory_2nd_jan.patch) download | inline diff: From 8de74fda7825301e3f10b10ce132751386ea5fb7 Mon Sep 17 00:00:00 2001 From: Mahendra Singh Thalor <[email protected]> Date: Wed, 1 Jan 2025 12:19:08 -0800 Subject: [PATCH] pg_dumpall with directory format and restore it by pg_restore new option to pg_dumpall: -F, --format=d|p|directory|plain output file format (directory, plain text (default)) Ex: ./pg_dumpall --format=directory --file=dumpDirName dumps are as: global.dat ::: global sql commands in simple plain format map.dat. ::: dboid dbname ---entries for all databases in simple text form databases. ::: subdir dboid1 -> toc.dat and data files in archive format subdir dboid2. -> toc.dat and data files in archive format etc --------------------------------------------------------------------------- NOTE: if needed, restore single db by particular subdir Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5 -- here, 5 is the dboid of postgres db -- to get dboid, refer dbname in map.file -------------------------------------------------------------------------- new options to pg_restore: -g, --globals-only restore only global objects, no databases --exclude-database=NAME exclude database whose name matches name When we give -g/--globals-only option, then only restore globals, no db restoring. Design: When --format=directory is specified and there is no toc.dat in main directory, then check for global.dat and map.dat to restore all databases. If both files are exists in directory, then first restore all globals from global.dat and then restore all databases one by one from map.dat list. TODO: We can restore databases in parallel mode. --- doc/src/sgml/ref/pg_dumpall.sgml | 30 ++ doc/src/sgml/ref/pg_restore.sgml | 30 ++ src/bin/pg_dump/pg_dumpall.c | 138 ++++++-- src/bin/pg_dump/pg_restore.c | 661 ++++++++++++++++++++++++++++++++++++++- 4 files changed, 829 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml index 014f279..b6c9feb 100644 --- a/doc/src/sgml/ref/pg_dumpall.sgml +++ b/doc/src/sgml/ref/pg_dumpall.sgml @@ -582,6 +582,36 @@ exclude database <replaceable class="parameter">PATTERN</replaceable> </listitem> </varlistentry> + <varlistentry> + <term><option>-F <replaceable class="parameter">format</replaceable></option></term> + <term><option>--format=<replaceable class="parameter">format</replaceable></option></term> + <listitem> + <para> + Specify format of dump files. If we want to dump all the databases, then pass this as directory so that dump of all databases can be taken in separate subdirectory in archive format. +by default, this is plain. + + <variablelist> + <varlistentry> + <term><literal>d</literal></term> + <term><literal>directory</literal></term> + <listitem> + <para> + The archive is a directory archive. + </para> + </listitem> + </varlistentry> + + <variablelist> + <varlistentry> + <term><literal>p</literal></term> + <term><literal>plain</literal></term> + <listitem> + <para> + The archive is a plain archive.(by default also) + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-?</option></term> <term><option>--help</option></term> diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index b8b27e1..ab2e035 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -316,6 +316,16 @@ PostgreSQL documentation </varlistentry> <varlistentry> + <term><option>-g</option></term> + <term><option>--globals-only</option></term> + <listitem> + <para> + Restore only global objects (roles and tablespaces), no databases. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-I <replaceable class="parameter">index</replaceable></option></term> <term><option>--index=<replaceable class="parameter">index</replaceable></option></term> <listitem> @@ -932,6 +942,26 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--exclude-database=<replaceable class="parameter">pattern</replaceable></option></term> + <listitem> + <para> + Do not restore databases whose name matches + <replaceable class="parameter">pattern</replaceable>. + Multiple patterns can be excluded by writing multiple + <option>--exclude-database</option> switches. The + <replaceable class="parameter">pattern</replaceable> parameter is + interpreted as a pattern according to the same rules used by + <application>psql</application>'s <literal>\d</literal> + commands (see <xref linkend="app-psql-patterns"/>), + so multiple databases can also be excluded by writing wildcard + characters in the pattern. When using wildcards, be careful to + quote the pattern if needed to prevent shell wildcard expansion. + </para> + </listitem> + </varlistentry> + + </variablelist> </para> </refsect1> diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index 396f797..066197b 100644 --- a/src/bin/pg_dump/pg_dumpall.c +++ b/src/bin/pg_dump/pg_dumpall.c @@ -15,6 +15,7 @@ #include "postgres_fe.h" +#include <sys/stat.h> #include <time.h> #include <unistd.h> @@ -29,6 +30,7 @@ #include "filter.h" #include "getopt_long.h" #include "pg_backup.h" +#include "pg_backup_archiver.h" /* version string we expect back from pg_dump */ #define PGDUMP_VERSIONSTR "pg_dump (PostgreSQL) " PG_VERSION "\n" @@ -64,9 +66,10 @@ static void dropTablespaces(PGconn *conn); static void dumpTablespaces(PGconn *conn); static void dropDBs(PGconn *conn); static void dumpUserConfig(PGconn *conn, const char *username); -static void dumpDatabases(PGconn *conn); +static void dumpDatabases(PGconn *conn, bool directory_format); static void dumpTimestamp(const char *msg); -static int runPgDump(const char *dbname, const char *create_opts); +static int runPgDump(const char *dbname, const char *create_opts, + char *dbfile); static void buildShSecLabels(PGconn *conn, const char *catalog_name, Oid objectId, const char *objtype, const char *objname, @@ -147,6 +150,7 @@ main(int argc, char *argv[]) {"password", no_argument, NULL, 'W'}, {"no-privileges", no_argument, NULL, 'x'}, {"no-acl", no_argument, NULL, 'x'}, + {"format", required_argument, NULL, 'F'}, /* * the following options don't have an equivalent short option letter @@ -188,11 +192,13 @@ main(int argc, char *argv[]) char *pgdb = NULL; char *use_role = NULL; const char *dumpencoding = NULL; + const char *format; trivalue prompt_password = TRI_DEFAULT; bool data_only = false; bool globals_only = false; bool roles_only = false; bool tablespaces_only = false; + bool directory_format = false; PGconn *conn; int encoding; const char *std_strings; @@ -237,7 +243,7 @@ main(int argc, char *argv[]) pgdumpopts = createPQExpBuffer(); - while ((c = getopt_long(argc, argv, "acd:E:f:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) + while ((c = getopt_long(argc, argv, "acd:E:f:F:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) { switch (c) { @@ -265,7 +271,17 @@ main(int argc, char *argv[]) appendPQExpBufferStr(pgdumpopts, " -f "); appendShellString(pgdumpopts, filename); break; - + case 'F': + format = optarg; + if ((strcmp(format, "directory") == 0 || strcmp(format, "d") == 0)) + directory_format = true; + else if (strcmp(format, "plain") != 0 || strcmp(format, "p") == 0) + { + pg_log_error("invalid format specified: %s", format); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit_nicely(1); + } + break; case 'g': globals_only = true; break; @@ -497,9 +513,31 @@ main(int argc, char *argv[]) &database_exclude_names); /* - * Open the output file if required, otherwise use stdout + * Open the output file if required, otherwise use stdout. */ - if (filename) + if (directory_format) + { + char toc_path[MAXPGPATH]; + + /* + * If directory format is specified then we must provide the directory + * name. + */ + if (!filename || strcmp(filename, "") == 0) + pg_fatal("no output directory specified"); + + /* TODO: accept the empty existing directory. */ + if (mkdir(filename, 0700) < 0) + pg_fatal("could not create directory \"%s\": %m", + filename); + + snprintf(toc_path, MAXPGPATH, "%s/global.dat", filename); + + OPF = fopen(toc_path, "w"); + if (!OPF) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + } + else if (filename) { OPF = fopen(filename, PG_BINARY_W); if (!OPF) @@ -607,7 +645,7 @@ main(int argc, char *argv[]) } if (!globals_only && !roles_only && !tablespaces_only) - dumpDatabases(conn); + dumpDatabases(conn, directory_format); PQfinish(conn); @@ -620,7 +658,7 @@ main(int argc, char *argv[]) fclose(OPF); /* sync the resulting file, errors are not fatal */ - if (dosync) + if (dosync && !directory_format) (void) fsync_fname(filename, false); } @@ -637,6 +675,7 @@ help(void) printf(_("\nGeneral options:\n")); printf(_(" -f, --file=FILENAME output file name\n")); + printf(_(" -F, --format=d|p output file format (directory, plain text (default))\n")); printf(_(" -v, --verbose verbose mode\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n")); @@ -1487,10 +1526,13 @@ expand_dbname_patterns(PGconn *conn, * Dump contents of databases. */ static void -dumpDatabases(PGconn *conn) +dumpDatabases(PGconn *conn, bool directory_format) { PGresult *res; int i; + char db_subdir[MAXPGPATH]; + char dbfilepath[MAXPGPATH]; + FILE *map_file; /* * Skip databases marked not datallowconn, since we'd be unable to connect @@ -1504,7 +1546,7 @@ dumpDatabases(PGconn *conn) * doesn't have some failure mode with --clean. */ res = executeQuery(conn, - "SELECT datname " + "SELECT datname, oid " "FROM pg_database d " "WHERE datallowconn AND datconnlimit != -2 " "ORDER BY (datname <> 'template1'), datname"); @@ -1512,9 +1554,30 @@ dumpDatabases(PGconn *conn) if (PQntuples(res) > 0) fprintf(OPF, "--\n-- Databases\n--\n\n"); + /* + * If directory format is specified then create a subdirectory under the + * main directory and each database dump file will be created under the + * subdirectory in archive mode as per single db pg_dump. + */ + if (directory_format) + { + char map_file_path[MAXPGPATH]; + + snprintf(db_subdir, MAXPGPATH, "%s/databases", filename); + if (mkdir(db_subdir, 0755) != 0) + pg_log_error("could not create subdirectory \"%s\": %m", db_subdir); + + /* Create a map file (to store dboid and dbname) */ + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", filename); + map_file = fopen(map_file_path, "w"); + if (!map_file) + pg_fatal("could not open map file: %s", strerror(errno)); + } + for (i = 0; i < PQntuples(res); i++) { char *dbname = PQgetvalue(res, i, 0); + char *oid = PQgetvalue(res, i, 1); const char *create_opts; int ret; @@ -1522,6 +1585,14 @@ dumpDatabases(PGconn *conn) if (strcmp(dbname, "template0") == 0) continue; + if (directory_format) + { + snprintf(dbfilepath, MAXPGPATH, "-f %s/%s", db_subdir, oid); + + /* append dboid and dbname in map file. */ + fprintf(map_file, "%s %s\n", oid, dbname); + } + /* Skip any explicitly excluded database */ if (simple_string_list_member(&database_exclude_names, dbname)) { @@ -1531,7 +1602,8 @@ dumpDatabases(PGconn *conn) pg_log_info("dumping database \"%s\"", dbname); - fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); + if (!directory_format) + fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); /* * We assume that "template1" and "postgres" already exist in the @@ -1549,20 +1621,21 @@ dumpDatabases(PGconn *conn) { create_opts = ""; /* Since pg_dump won't emit a \connect command, we must */ - fprintf(OPF, "\\connect %s\n\n", dbname); + if (!directory_format) + fprintf(OPF, "\\connect %s\n\n", dbname); } } else create_opts = "--create"; - if (filename) + if (!directory_format && filename) fclose(OPF); - ret = runPgDump(dbname, create_opts); + ret = runPgDump(dbname, create_opts, dbfilepath); if (ret != 0) pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); - if (filename) + if (!directory_format && filename) { OPF = fopen(filename, PG_BINARY_A); if (!OPF) @@ -1571,6 +1644,10 @@ dumpDatabases(PGconn *conn) } } + /* close map file */ + if (directory_format) + fclose(map_file); + PQclear(res); } @@ -1580,7 +1657,7 @@ dumpDatabases(PGconn *conn) * Run pg_dump on dbname, with specified options. */ static int -runPgDump(const char *dbname, const char *create_opts) +runPgDump(const char *dbname, const char *create_opts, char *dbfile) { PQExpBufferData connstrbuf; PQExpBufferData cmd; @@ -1589,17 +1666,26 @@ runPgDump(const char *dbname, const char *create_opts) initPQExpBuffer(&connstrbuf); initPQExpBuffer(&cmd); - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, - pgdumpopts->data, create_opts); - - /* - * If we have a filename, use the undocumented plain-append pg_dump - * format. - */ - if (filename) - appendPQExpBufferStr(&cmd, " -Fa "); + if (dbfile) + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + dbfile, create_opts); + appendPQExpBufferStr(&cmd, " -F d "); + } else - appendPQExpBufferStr(&cmd, " -Fp "); + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + pgdumpopts->data, create_opts); + + /* + * If we have a filename, use the undocumented plain-append pg_dump + * format. + */ + if (filename) + appendPQExpBufferStr(&cmd, " -Fa "); + else + appendPQExpBufferStr(&cmd, " -Fp "); + } /* * Append the database name to the already-constructed stem of connection diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index 88ae39d..55d1862 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -41,27 +41,65 @@ #include "postgres_fe.h" #include <ctype.h> +#include <sys/stat.h> #ifdef HAVE_TERMIOS_H #include <termios.h> #endif +#include "common/connect.h" +#include "compress_io.h" +#include "common/string.h" #include "fe_utils/option_utils.h" +#include "fe_utils/string_utils.h" #include "filter.h" #include "getopt_long.h" #include "parallel.h" +#include "pg_backup_archiver.h" #include "pg_backup_utils.h" +typedef struct SimpleDBoidListCell +{ + struct SimpleDBoidListCell *next; + Oid dboid; + const char *dbname; +} SimpleDBoidListCell; + +typedef struct SimpleActionList +{ + SimpleDBoidListCell *head; + SimpleDBoidListCell *tail; +} SimpleDBoidList; + +static void +simple_dboid_list_append(SimpleDBoidList *list, Oid dboid, const char *dbname); + static void usage(const char *progname); static void read_restore_filters(const char *filename, RestoreOptions *opts); +static bool _fileExistsInDirectory(const char *dir, const char *filename); +static bool restoreOneDatabase(const char *inputFileSpec, + RestoreOptions *opts, int numWorkers); +static PGconn *connectDatabase(const char *dbname, const char *conn_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error); +static PGresult *executeQuery(PGconn *conn, const char *query); +static int ReadOneStatement(StringInfo inBuf, FILE *f_glo); +static int restoreAllDatabases(const char *dumpdirpath, + SimpleStringList database_exclude_names, RestoreOptions *opts, + int numWorkers); +static void execute_global_sql_commands(PGconn *conn, const char *dumpdirpath); +static int filter_dbnames_for_restore(SimpleDBoidList *dbname_oid_list, + SimpleStringList database_exclude_names); +static int get_dbname_oid_list_from_mfile(const char *dumpdirpath, + SimpleDBoidList *dbname_oid_list); +static void simple_dboid_list_append(SimpleDBoidList *list, Oid dboid, + const char *dbname); int main(int argc, char **argv) { RestoreOptions *opts; int c; - int exit_code; int numWorkers = 1; - Archive *AH; char *inputFileSpec; static int disable_triggers = 0; static int enable_row_security = 0; @@ -77,11 +115,14 @@ main(int argc, char **argv) static int strict_names = 0; bool data_only = false; bool schema_only = false; + SimpleStringList database_exclude_names = {NULL, NULL}; + bool globals_only = false; struct option cmdopts[] = { {"clean", 0, NULL, 'c'}, {"create", 0, NULL, 'C'}, {"data-only", 0, NULL, 'a'}, + {"globals-only", 0, NULL, 'g'}, {"dbname", 1, NULL, 'd'}, {"exit-on-error", 0, NULL, 'e'}, {"exclude-schema", 1, NULL, 'N'}, @@ -128,6 +169,7 @@ main(int argc, char **argv) {"no-security-labels", no_argument, &no_security_labels, 1}, {"no-subscriptions", no_argument, &no_subscriptions, 1}, {"filter", required_argument, NULL, 4}, + {"exclude-database", required_argument, NULL, 6}, {NULL, 0, NULL, 0} }; @@ -156,7 +198,7 @@ main(int argc, char **argv) } } - while ((c = getopt_long(argc, argv, "acCd:ef:F:h:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", + while ((c = getopt_long(argc, argv, "aAcCd:ef:F:gh:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", cmdopts, NULL)) != -1) { switch (c) @@ -183,11 +225,14 @@ main(int argc, char **argv) if (strlen(optarg) != 0) opts->formatName = pg_strdup(optarg); break; + case 'g': + /* restore only global.dat file from directory */ + globals_only = true; + break; case 'h': if (strlen(optarg) != 0) opts->cparams.pghost = pg_strdup(optarg); break; - case 'j': /* number of restore jobs */ if (!option_parse_int(optarg, "-j/--jobs", 1, PG_MAX_JOBS, @@ -302,6 +347,14 @@ main(int argc, char **argv) exit(1); opts->exit_on_error = true; break; + case 6: + /* list of databases those needs to skip while restoring */ + simple_string_list_append(&database_exclude_names, optarg); + /* + * XXX: TODO as of now, considering only db names but we can + * implement for patterns also. + */ + break; default: /* getopt_long already emitted a complaint */ @@ -329,6 +382,16 @@ main(int argc, char **argv) if (!opts->cparams.dbname && !opts->filename && !opts->tocSummary) pg_fatal("one of -d/--dbname and -f/--file must be specified"); + if (database_exclude_names.head != NULL && globals_only) + { + pg_log_error("option --exclude-database cannot be used together with -g/--globals-only"); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit_nicely(1); + } + + if (globals_only && opts->cparams.dbname == NULL) + pg_fatal("option -g/--globals-only requires option -d/--dbname"); + /* Should get at most one of -d and -f, else user is confused */ if (opts->cparams.dbname) { @@ -406,6 +469,68 @@ main(int argc, char **argv) } } + /* + * If directory format, then first check that toc.dat file exist or not? + * + * if toc.dat exist, then no need to check for map.dat and global.dat + * + */ + if (opts->format == archDirectory && + inputFileSpec != NULL && + !_fileExistsInDirectory(inputFileSpec, "toc.dat")) + { + /* if global.dat and map.dat are exist, then open them */ + if (_fileExistsInDirectory(pg_strdup(inputFileSpec), "global.dat") + && _fileExistsInDirectory(pg_strdup(inputFileSpec), "map.dat")) + { + /* Found the global.dat and map.dat file so process. */ + PGconn *conn = NULL; + + if (opts->cparams.dbname == NULL) + pg_fatal(" -d/--dbanme should be given if using dump of dumpall and global.dat"); + + if (opts->createDB != 1) + pg_fatal("option -C/--create should be specified if using dump of dumpall with global.dat"); + + /* Connect to database so that we can execute global.dat */ + conn = connectDatabase(opts->cparams.dbname, NULL, + opts->cparams.pghost, opts->cparams.pgport, opts->cparams.username, + TRI_DEFAULT, false); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", opts->cparams.dbname); + + /* Open global.dat file and execute all the sql commands */ + execute_global_sql_commands(conn, inputFileSpec); + + /* Close the db connection as we are done with globals */ + PQfinish(conn); + + /* if globals-only, then return from here */ + if (globals_only) + return 0; + + /* Now restore all the databases from map.dat file */ + return restoreAllDatabases(inputFileSpec, database_exclude_names, + opts, numWorkers); + }/* end if */ + }/* end if */ + + return restoreOneDatabase(inputFileSpec, opts, numWorkers); +} + +/* + * restoreOneDatabase + * + * This will restore one database using toc.dat file. + */ +static bool +restoreOneDatabase(const char *inputFileSpec, RestoreOptions *opts, + int numWorkers) +{ + Archive *AH; + bool exit_code; + AH = OpenArchive(inputFileSpec, opts->format); SetArchiveOptions(AH, NULL, opts); @@ -471,6 +596,7 @@ usage(const char *progname) printf(_(" -c, --clean clean (drop) database objects before recreating\n")); printf(_(" -C, --create create the target database\n")); printf(_(" -e, --exit-on-error exit on error, default is to continue\n")); + printf(_(" -g, --globals-only restore only global objects, no databases\n")); printf(_(" -I, --index=NAME restore named index\n")); printf(_(" -j, --jobs=NUM use this many parallel jobs to restore\n")); printf(_(" -L, --use-list=FILENAME use table of contents from this file for\n" @@ -483,6 +609,7 @@ usage(const char *progname) printf(_(" -S, --superuser=NAME superuser user name to use for disabling triggers\n")); printf(_(" -t, --table=NAME restore named relation (table, view, etc.)\n")); printf(_(" -T, --trigger=NAME restore named trigger\n")); + printf(_(" --exclude-database=NAME exclude databases whose name matches with name\n")); printf(_(" -x, --no-privileges skip restoration of access privileges (grant/revoke)\n")); printf(_(" -1, --single-transaction restore as a single transaction\n")); printf(_(" --disable-triggers disable triggers during data-only restore\n")); @@ -621,3 +748,529 @@ read_restore_filters(const char *filename, RestoreOptions *opts) filter_free(&fstate); } + +static bool +_fileExistsInDirectory(const char *dir, const char *filename) +{ + struct stat st; + char buf[MAXPGPATH]; + + if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH) + pg_fatal("directory name too long: \"%s\"", dir); + + return (stat(buf, &st) == 0 && S_ISREG(st.st_mode)); +} + +/* + * Make a database connection with the given parameters. An + * interactive password prompt is automatically issued if required. + * + * If fail_on_error is false, we return NULL without printing any message + * on failure, but preserve any prompted password for the next try. + * + */ +static PGconn * +connectDatabase(const char *dbname, const char *connection_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error) +{ + PGconn *conn; + bool new_pass; + const char *remoteversion_str; + int my_version; + const char **keywords = NULL; + const char **values = NULL; + PQconninfoOption *conn_opts = NULL; + static char *password = NULL; + static int server_version; + + if (prompt_password == TRI_YES && !password) + password = simple_prompt("Password: ", false); + + /* + * Start the connection. Loop until we have a password if requested by + * backend. + */ + do + { + int argcount = 6; + PQconninfoOption *conn_opt; + char *err_msg = NULL; + int i = 0; + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* + * Merge the connection info inputs given in form of connection string + * and other options. Explicitly discard any dbname value in the + * connection string; otherwise, PQconnectdbParams() would interpret + * that value as being itself a connection string. + */ + if (connection_string) + { + conn_opts = PQconninfoParse(connection_string, &err_msg); + if (conn_opts == NULL) + pg_fatal("%s", err_msg); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + argcount++; + } + + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + { + keywords[i] = conn_opt->keyword; + values[i] = conn_opt->val; + i++; + } + } + } + else + { + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + } + + if (pghost) + { + keywords[i] = "host"; + values[i] = pghost; + i++; + } + if (pgport) + { + keywords[i] = "port"; + values[i] = pgport; + i++; + } + if (pguser) + { + keywords[i] = "user"; + values[i] = pguser; + i++; + } + if (password) + { + keywords[i] = "password"; + values[i] = password; + i++; + } + if (dbname) + { + keywords[i] = "dbname"; + values[i] = dbname; + i++; + } + keywords[i] = "fallback_application_name"; + values[i] = progname; + i++; + + new_pass = false; + conn = PQconnectdbParams(keywords, values, true); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", dbname); + + if (PQstatus(conn) == CONNECTION_BAD && + PQconnectionNeedsPassword(conn) && + !password && + prompt_password != TRI_NO) + { + PQfinish(conn); + password = simple_prompt("Password: ", false); + new_pass = true; + } + } while (new_pass); + + /* check to see that the backend connection was successfully made */ + if (PQstatus(conn) == CONNECTION_BAD) + { + if (fail_on_error) + pg_fatal("%s", PQerrorMessage(conn)); + else + { + PQfinish(conn); + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + return NULL; + } + } + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* Check version */ + remoteversion_str = PQparameterStatus(conn, "server_version"); + if (!remoteversion_str) + pg_fatal("could not get server version"); + server_version = PQserverVersion(conn); + if (server_version == 0) + pg_fatal("could not parse server version \"%s\"", + remoteversion_str); + + my_version = PG_VERSION_NUM; + + /* + * We allow the server to be back to 9.2, and up to any minor release of + * our own major version. (See also version check in pg_dump.c.) + */ + if (my_version != server_version + && (server_version < 90200 || + (server_version / 100) > (my_version / 100))) + { + pg_log_error("aborting because of server version mismatch"); + pg_log_error_detail("server version: %s; %s version: %s", + remoteversion_str, progname, PG_VERSION); + exit_nicely(1); + } + + PQclear(executeQuery(conn, ALWAYS_SECURE_SEARCH_PATH_SQL)); + + return conn; +} + +/* + * Run a query, return the results, exit program on failure. + */ +static PGresult * +executeQuery(PGconn *conn, const char *query) +{ + PGresult *res; + + pg_log_info("executing %s", query); + + res = PQexec(conn, query); + if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) + { + pg_log_error("query failed: %s", PQerrorMessage(conn)); + pg_log_error_detail("Query was: %s", query); + PQfinish(conn); + exit_nicely(1); + } + + return res; +} + +/* ---------------- + * ReadOneStatement() + * + * This will start reading from passed file pointer using fgetc and read till + * semicolon(sql statement terminator for global.sql file) + * + * EOF is returned if end-of-file input is seen; time to shut down. + * ---------------- + */ + +static int +ReadOneStatement(StringInfo inBuf, FILE *f_glo) +{ + int c; /* character read from getc() */ + + resetStringInfo(inBuf); + + /* + * Read characters until EOF or the appropriate delimiter is seen. + */ + while ((c = fgetc(f_glo)) != EOF) + { + appendStringInfoChar(inBuf, (char) c); + + if (c == '\n') + { + if(inBuf->len > 1 && + inBuf->data[inBuf->len - 2] == ';') + break; + else + continue; + } + } + + /* No input before EOF signal means time to quit. */ + if (c == EOF && inBuf->len == 0) + return EOF; + + /* Add '\0' to make it look the same as message case. */ + appendStringInfoChar(inBuf, (char) '\0'); + + return 'Q'; +} + +/* + * This will remove names from all dblist that are given with exclude-database + * option. + * + * returns number of dbnames those will be restored. + */ +static int +filter_dbnames_for_restore(SimpleDBoidList *dbname_oid_list, + SimpleStringList database_exclude_names) +{ + int countdb = 0; + SimpleDBoidListCell *cell = dbname_oid_list->head; + SimpleDBoidListCell *precell = NULL; + + /* Return 0 if there is no db to restore. */ + if (cell == NULL) + return 0; + + while (cell != NULL) + { + bool skip_db = false; + + /* Now match this dbname with exclude-database list. */ + for (SimpleStringListCell *celldb = database_exclude_names.head; celldb; celldb = celldb->next) + { + if (strcmp(celldb->val, cell->dbname) == 0) + { + /* + * As we need to skip this dbname so set flag to remove it from + * list. + */ + skip_db = true; + break; + } + } + + /* Increment count if db needs to be restored. */ + if (!skip_db) + { + countdb++; + precell = cell; + cell = cell->next; + } + else + { + if (precell != NULL) + { + precell->next = cell->next; + pfree(cell); + cell = precell->next; + } + else + { + dbname_oid_list->head = cell->next; + pfree(cell); + cell = dbname_oid_list->head; + } + } + } + + return countdb; +} + +/* + * Open map.dat file and read line by line and then prepare a list of database + * names and correspoding dboid. + * + * Returns, total number of database names in map.dat file. + */ +static int +get_dbname_oid_list_from_mfile(const char *dumpdirpath, SimpleDBoidList *dbname_oid_list) +{ + FILE *pfile; + char map_file_path[MAXPGPATH]; + char line[MAXPGPATH]; + int count = 0; + + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", dumpdirpath); + + /* Open map.dat file. */ + pfile = fopen(map_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open map.dat file: %s", strerror(errno)); + + /* Append all the dbname and dboid to the list. */ + while((fgets(line, MAXPGPATH, pfile)) != NULL) + { + Oid dboid; + char dbname[MAXPGPATH + 1]; + + /* Extract dbname and dboid from line */ + sscanf(line, "%u %s" , &dboid, dbname); + pg_log_info("found dbname as :%s and dboid:%d in map.dat file while restoring", dbname, dboid); + + /* Report error if file has any corrupted data. */ + if (!OidIsValid(dboid) || strlen(dbname) == 0) + pg_fatal("invalid entry in map.dat file at line : %d", count + 1); + + /* + * TODO : before adding dbanme into list, we can verify that this db + * needs to skipped for restore or not. + */ + simple_dboid_list_append(dbname_oid_list, dboid, dbname); + count++; + } + + /* Close map.dat file. */ + fclose(pfile); + + return count; +} + +/* + * This will restore databases those dumps are present in + * directory based on map.dat file mapping. + * + * This will skip restoring for databases that are specified with + * exclude-database option. + */ +static int +restoreAllDatabases(const char *dumpdirpath, + SimpleStringList database_exclude_names, RestoreOptions *opts, + int numWorkers) +{ + SimpleDBoidList dbname_oid_list = {NULL, NULL}; + SimpleDBoidListCell *cell; + int exit_code = 0; + int num_db_restore; + int num_total_db; + + num_total_db = get_dbname_oid_list_from_mfile(dumpdirpath, &dbname_oid_list); + + /* If map.dat has no entry, return from here. */ + if (dbname_oid_list.head == NULL) + return 0; + + pg_log_info("found total %d database names in map.dat file", num_total_db); + + /* Skip any explicitly excluded database. */ + num_db_restore = filter_dbnames_for_restore(&dbname_oid_list, database_exclude_names); + + /* Exit if no db needs to be restored. */ + if (dbname_oid_list.head == NULL) + return 0; + + pg_log_info("needs to restore %d databases out of %d databases", num_db_restore, num_total_db); + + /* + * XXX: TODO till now, we made a list of databases, those needs to be restored + * after skipping names of exclude-database. Now we can launch parallel + * workers to restore these databases. + */ + cell = dbname_oid_list.head; + + while(cell != NULL) + { + char subdirpath[MAXPGPATH]; + int dbexit_code; + + /* + * We need to reset override_dbname so that objects can be restored into + * already created database. (used with -d/--dbname option) + */ + if (opts->cparams.override_dbname) + { + pfree(opts->cparams.override_dbname); + opts->cparams.override_dbname = NULL; + } + + snprintf(subdirpath, MAXPGPATH, "%s/databases/%u", dumpdirpath, cell->dboid); + + /* + * Database -d/--dbname is already created so reset createDB to ignore + * database creation error. + */ + if (strcmp(cell->dbname, opts->cparams.dbname) == 0) + opts->createDB = 0; + + pg_log_info("restoring database \"%s\"", cell->dbname); + + dbexit_code = restoreOneDatabase(subdirpath, opts, numWorkers); + + /* Store exit_code to report it back. */ + if (exit_code == 0 && dbexit_code != 0) + exit_code = dbexit_code; + + /* Set createDB option to create new database. */ + if (strcmp(cell->dbname, opts->cparams.dbname) == 0) + opts->createDB = 1; + + cell = cell->next; + } /* end while */ + + /* Log number of processed databases.*/ + pg_log_info("number of restored databases are %d", num_db_restore); + + return exit_code; +} + +/* + * This will open global.dat file and will execute all global sql commands one + * by one statement. + * + * semicolon is considered as statement terminator. + */ +static void +execute_global_sql_commands(PGconn *conn, const char *dumpdirpath) +{ + char global_file_path[MAXPGPATH]; + PGresult *result; + StringInfoData sqlstatement; + FILE *pfile; + + snprintf(global_file_path, MAXPGPATH, "%s/global.dat", dumpdirpath); + + /* now open global.dat file */ + pfile = fopen(global_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + + /* Init sqlstatement to append commands */ + initStringInfo(&sqlstatement); + + /* Process file till EOF and execute sql statements */ + while (ReadOneStatement(&sqlstatement, pfile) != EOF) + { + result = PQexec(conn, sqlstatement.data); + + switch (PQresultStatus(result)) + { + case PGRES_COMMAND_OK: + case PGRES_TUPLES_OK: + case PGRES_EMPTY_QUERY: + case PGRES_COPY_IN: + break; + default: + pg_log_error("could not execute query: %s \nCommand was: %s", PQerrorMessage(conn), sqlstatement.data); + } + } + + fclose(pfile); +} + +/* + * appends a node to the list in the end. + */ +static void +simple_dboid_list_append(SimpleDBoidList *list, Oid dboid, const char *dbname) +{ + SimpleDBoidListCell *cell; + + cell = pg_malloc_object(SimpleDBoidListCell); + + cell->next = NULL; + cell->dboid = dboid; + cell->dbname = pg_strdup(dbname); + + if (list->tail) + list->tail->next = cell; + else + list->head = cell; + list->tail = cell; +} -- 1.8.3.1 ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2025-01-06 17:35 Nathan Bossart <[email protected]> parent: Mahendra Singh Thalor <[email protected]> 0 siblings, 1 reply; 15+ messages in thread From: Nathan Bossart @ 2025-01-06 17:35 UTC (permalink / raw) To: Mahendra Singh Thalor <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers; Dilip Kumar <[email protected]> On Thu, Jan 02, 2025 at 02:05:13AM +0530, Mahendra Singh Thalor wrote: > Here, I am attaching an updated patch. I fixed some bugs of v01 patch and > did some code cleanup also. Thank you for picking this up! I started to review it, but the documentation changes didn't build, and a few tests in check-world are failing. Would you mind resolving those issues? Also, if you haven't already, please add an entry to the next commitfest [0] to ensure that 1) this feature is tracked and 2) the automated tests will run. + if (dbfile) + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + dbfile, create_opts); + appendPQExpBufferStr(&cmd, " -F d "); + } Have you given any thought to allowing a directory of custom format files, as discussed upthread [1]? Perhaps that is better handled as a follow-up patch, but it'd be good to understand the plan, anyway. [0] https://commitfest.postgresql.org [1] https://postgr.es/m/CABUevExoQ26jo%2BaQ9QZq%2BUMA1aD6gfpm9xBnh_t5e0DhaCeRYA%40mail.gmail.com -- nathan ^ permalink raw reply [nested|flat] 15+ messages in thread
* Re: Non-text mode for pg_dumpall @ 2025-01-07 19:04 Mahendra Singh Thalor <[email protected]> parent: Nathan Bossart <[email protected]> 0 siblings, 0 replies; 15+ messages in thread From: Mahendra Singh Thalor @ 2025-01-07 19:04 UTC (permalink / raw) To: Nathan Bossart <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>; pgsql-hackers; Dilip Kumar <[email protected]> On Mon, 6 Jan 2025 at 23:05, Nathan Bossart <[email protected]> wrote: > > On Thu, Jan 02, 2025 at 02:05:13AM +0530, Mahendra Singh Thalor wrote: > > Here, I am attaching an updated patch. I fixed some bugs of v01 patch and > > did some code cleanup also. > > Thank you for picking this up! I started to review it, but the > documentation changes didn't build, and a few tests in check-world are > failing. Would you mind resolving those issues? Also, if you haven't > already, please add an entry to the next commitfest [0] to ensure that 1) > this feature is tracked and 2) the automated tests will run. Thanks Nathan for the quick response. I fixed bugs of documentation changes and check-world in the latest patch. Now docs are building and check-world is passing. I added entry into commitfest for this patch.[0] <https://commitfest.postgresql.org/52/5495/; > > + if (dbfile) > + { > + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, > + dbfile, create_opts); > + appendPQExpBufferStr(&cmd, " -F d "); > + } > > Have you given any thought to allowing a directory of custom format files, > as discussed upthread [1]? Perhaps that is better handled as a follow-up > patch, but it'd be good to understand the plan, anyway. I will make these changes and will test. I will update my findings after doing some testing. Apart from these bugs, I added code to handle --exclude-database= PATTERN. Earlier I was using NAME only to skip databases for restore. *TODO: .pl test cases for new added options.* Here, I am attaching an updated patch for review and feedback. > > [0] https://commitfest.postgresql.org > [1] https://postgr.es/m/CABUevExoQ26jo%2BaQ9QZq%2BUMA1aD6gfpm9xBnh_t5e0DhaCeRYA%40mail.gmail.com > > -- > nathan -- Thanks and Regards Mahendra Singh Thalor EnterpriseDB: http://www.enterprisedb.com Attachments: [application/octet-stream] v03-pg_dumpall-with-directory-format-and-restore-08_jan.patch (38.2K, 3-v03-pg_dumpall-with-directory-format-and-restore-08_jan.patch) download | inline diff: From 9e854f93197c230b82047dfd802c1b64cb3d2903 Mon Sep 17 00:00:00 2001 From: Mahendra Singh Thalor <[email protected]> Date: Wed, 8 Jan 2025 00:15:54 +0530 Subject: [PATCH] pg_dumpall with directory format and restore it by pg_restore new option to pg_dumpall: -F, --format=d|p|directory|plain output file format (directory, plain text (default)) Ex: ./pg_dumpall --format=directory --file=dumpDirName dumps are as: global.dat ::: global sql commands in simple plain format map.dat. ::: dboid dbname ---entries for all databases in simple text form databases. ::: subdir dboid1 -> toc.dat and data files in archive format subdir dboid2. -> toc.dat and data files in archive format etc --------------------------------------------------------------------------- NOTE: if needed, restore single db by particular subdir Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5 -- here, 5 is the dboid of postgres db -- to get dboid, refer dbname in map.file -------------------------------------------------------------------------- new options to pg_restore: -g, --globals-only restore only global objects, no databases --exclude-database=PATTERN exclude database whose name matches pattern When we give -g/--globals-only option, then only restore globals, no db restoring. Design: When --format=directory is specified and there is no toc.dat in main directory, then check for global.dat and map.dat to restore all databases. If both files are exists in directory, then first restore all globals from global.dat and then restore all databases one by one from map.dat list. TODO: We can dump and restore databases in parallel mode. This needs more study. --- doc/src/sgml/ref/pg_dumpall.sgml | 35 ++ doc/src/sgml/ref/pg_restore.sgml | 30 ++ src/bin/pg_dump/pg_dumpall.c | 150 ++++-- src/bin/pg_dump/pg_restore.c | 760 ++++++++++++++++++++++++++++++- 4 files changed, 945 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml index 014f279258..51deaae0d1 100644 --- a/doc/src/sgml/ref/pg_dumpall.sgml +++ b/doc/src/sgml/ref/pg_dumpall.sgml @@ -125,6 +125,41 @@ PostgreSQL documentation </listitem> </varlistentry> +<varlistentry> + <term><option>-F <replaceable class="parameter">format</replaceable></option></term> + <term><option>--format=<replaceable class="parameter">format</replaceable></option></term> + <listitem> + <para> + Specify format of dump files. If we want to dump all the databases, + then pass this as directory so that dump of all databases can be taken + in separate subdirectory in archive format. + by default, this is plain. + + <variablelist> + <varlistentry> + <term><literal>d</literal></term> + <term><literal>directory</literal></term> + <listitem> + <para> + The archive is a directory archive. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>p</literal></term> + <term><literal>plain</literal></term> + <listitem> + <para> + The archive is a plain archive.(by default also) + </para> + </listitem> + </varlistentry> + </variablelist> + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>--filter=<replaceable class="parameter">filename</replaceable></option></term> <listitem> diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index b8b27e1719..ab2e035671 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -315,6 +315,16 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>-g</option></term> + <term><option>--globals-only</option></term> + <listitem> + <para> + Restore only global objects (roles and tablespaces), no databases. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-I <replaceable class="parameter">index</replaceable></option></term> <term><option>--index=<replaceable class="parameter">index</replaceable></option></term> @@ -932,6 +942,26 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--exclude-database=<replaceable class="parameter">pattern</replaceable></option></term> + <listitem> + <para> + Do not restore databases whose name matches + <replaceable class="parameter">pattern</replaceable>. + Multiple patterns can be excluded by writing multiple + <option>--exclude-database</option> switches. The + <replaceable class="parameter">pattern</replaceable> parameter is + interpreted as a pattern according to the same rules used by + <application>psql</application>'s <literal>\d</literal> + commands (see <xref linkend="app-psql-patterns"/>), + so multiple databases can also be excluded by writing wildcard + characters in the pattern. When using wildcards, be careful to + quote the pattern if needed to prevent shell wildcard expansion. + </para> + </listitem> + </varlistentry> + + </variablelist> </para> </refsect1> diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index 396f79781c..ceb4c908d8 100644 --- a/src/bin/pg_dump/pg_dumpall.c +++ b/src/bin/pg_dump/pg_dumpall.c @@ -15,6 +15,7 @@ #include "postgres_fe.h" +#include <sys/stat.h> #include <time.h> #include <unistd.h> @@ -29,6 +30,7 @@ #include "filter.h" #include "getopt_long.h" #include "pg_backup.h" +#include "pg_backup_archiver.h" /* version string we expect back from pg_dump */ #define PGDUMP_VERSIONSTR "pg_dump (PostgreSQL) " PG_VERSION "\n" @@ -64,9 +66,10 @@ static void dropTablespaces(PGconn *conn); static void dumpTablespaces(PGconn *conn); static void dropDBs(PGconn *conn); static void dumpUserConfig(PGconn *conn, const char *username); -static void dumpDatabases(PGconn *conn); +static void dumpDatabases(PGconn *conn, bool directory_format); static void dumpTimestamp(const char *msg); -static int runPgDump(const char *dbname, const char *create_opts); +static int runPgDump(const char *dbname, const char *create_opts, + char *dbfile); static void buildShSecLabels(PGconn *conn, const char *catalog_name, Oid objectId, const char *objtype, const char *objname, @@ -147,6 +150,7 @@ main(int argc, char *argv[]) {"password", no_argument, NULL, 'W'}, {"no-privileges", no_argument, NULL, 'x'}, {"no-acl", no_argument, NULL, 'x'}, + {"format", required_argument, NULL, 'F'}, /* * the following options don't have an equivalent short option letter @@ -188,11 +192,13 @@ main(int argc, char *argv[]) char *pgdb = NULL; char *use_role = NULL; const char *dumpencoding = NULL; + const char *formatName = NULL; trivalue prompt_password = TRI_DEFAULT; bool data_only = false; bool globals_only = false; bool roles_only = false; bool tablespaces_only = false; + bool directory_format = false; PGconn *conn; int encoding; const char *std_strings; @@ -237,7 +243,7 @@ main(int argc, char *argv[]) pgdumpopts = createPQExpBuffer(); - while ((c = getopt_long(argc, argv, "acd:E:f:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) + while ((c = getopt_long(argc, argv, "acd:E:f:F:gh:l:Op:rsS:tU:vwWx", long_options, &optindex)) != -1) { switch (c) { @@ -265,7 +271,9 @@ main(int argc, char *argv[]) appendPQExpBufferStr(pgdumpopts, " -f "); appendShellString(pgdumpopts, filename); break; - + case 'F': + formatName = optarg; + break; case 'g': globals_only = true; break; @@ -414,6 +422,26 @@ main(int argc, char *argv[]) exit_nicely(1); } + if (formatName) + { + switch (formatName[0]) + { + case 'd': + case 'D': + directory_format = true; + break; + + case 'p': + case 'P': + /* Default plain format. */ + break; + + default: + pg_fatal("unrecognized dump format \"%s\"; please specify \"d\", or \"p\" ", + formatName); + } + } + /* * If password values are not required in the dump, switch to using * pg_roles which is equally useful, just more likely to have unrestricted @@ -497,9 +525,31 @@ main(int argc, char *argv[]) &database_exclude_names); /* - * Open the output file if required, otherwise use stdout + * Open the output file if required, otherwise use stdout. */ - if (filename) + if (directory_format) + { + char toc_path[MAXPGPATH]; + + /* + * If directory format is specified then we must provide the directory + * name. + */ + if (!filename || strcmp(filename, "") == 0) + pg_fatal("no output directory specified"); + + /* TODO: accept the empty existing directory. */ + if (mkdir(filename, 0700) < 0) + pg_fatal("could not create directory \"%s\": %m", + filename); + + snprintf(toc_path, MAXPGPATH, "%s/global.dat", filename); + + OPF = fopen(toc_path, "w"); + if (!OPF) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + } + else if (filename) { OPF = fopen(filename, PG_BINARY_W); if (!OPF) @@ -607,7 +657,7 @@ main(int argc, char *argv[]) } if (!globals_only && !roles_only && !tablespaces_only) - dumpDatabases(conn); + dumpDatabases(conn, directory_format); PQfinish(conn); @@ -620,7 +670,7 @@ main(int argc, char *argv[]) fclose(OPF); /* sync the resulting file, errors are not fatal */ - if (dosync) + if (dosync && !directory_format) (void) fsync_fname(filename, false); } @@ -637,6 +687,7 @@ help(void) printf(_("\nGeneral options:\n")); printf(_(" -f, --file=FILENAME output file name\n")); + printf(_(" -F, --format=d|p output file format (directory, plain text (default))\n")); printf(_(" -v, --verbose verbose mode\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n")); @@ -1487,10 +1538,13 @@ expand_dbname_patterns(PGconn *conn, * Dump contents of databases. */ static void -dumpDatabases(PGconn *conn) +dumpDatabases(PGconn *conn, bool directory_format) { PGresult *res; int i; + char db_subdir[MAXPGPATH]; + char dbfilepath[MAXPGPATH]; + FILE *map_file; /* * Skip databases marked not datallowconn, since we'd be unable to connect @@ -1504,7 +1558,7 @@ dumpDatabases(PGconn *conn) * doesn't have some failure mode with --clean. */ res = executeQuery(conn, - "SELECT datname " + "SELECT datname, oid " "FROM pg_database d " "WHERE datallowconn AND datconnlimit != -2 " "ORDER BY (datname <> 'template1'), datname"); @@ -1512,9 +1566,30 @@ dumpDatabases(PGconn *conn) if (PQntuples(res) > 0) fprintf(OPF, "--\n-- Databases\n--\n\n"); + /* + * If directory format is specified then create a subdirectory under the + * main directory and each database dump file will be created under the + * subdirectory in archive mode as per single db pg_dump. + */ + if (directory_format) + { + char map_file_path[MAXPGPATH]; + + snprintf(db_subdir, MAXPGPATH, "%s/databases", filename); + if (mkdir(db_subdir, 0755) != 0) + pg_log_error("could not create subdirectory \"%s\": %m", db_subdir); + + /* Create a map file (to store dboid and dbname) */ + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", filename); + map_file = fopen(map_file_path, "w"); + if (!map_file) + pg_fatal("could not open map file: %s", strerror(errno)); + } + for (i = 0; i < PQntuples(res); i++) { char *dbname = PQgetvalue(res, i, 0); + char *oid = PQgetvalue(res, i, 1); const char *create_opts; int ret; @@ -1522,6 +1597,14 @@ dumpDatabases(PGconn *conn) if (strcmp(dbname, "template0") == 0) continue; + if (directory_format) + { + snprintf(dbfilepath, MAXPGPATH, "-f %s/%s", db_subdir, oid); + + /* append dboid and dbname in map file. */ + fprintf(map_file, "%s %s\n", oid, dbname); + } + /* Skip any explicitly excluded database */ if (simple_string_list_member(&database_exclude_names, dbname)) { @@ -1531,7 +1614,8 @@ dumpDatabases(PGconn *conn) pg_log_info("dumping database \"%s\"", dbname); - fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); + if (!directory_format) + fprintf(OPF, "--\n-- Database \"%s\" dump\n--\n\n", dbname); /* * We assume that "template1" and "postgres" already exist in the @@ -1549,20 +1633,21 @@ dumpDatabases(PGconn *conn) { create_opts = ""; /* Since pg_dump won't emit a \connect command, we must */ - fprintf(OPF, "\\connect %s\n\n", dbname); + if (!directory_format) + fprintf(OPF, "\\connect %s\n\n", dbname); } } else create_opts = "--create"; - if (filename) + if (!directory_format && filename) fclose(OPF); - ret = runPgDump(dbname, create_opts); + ret = runPgDump(dbname, create_opts, directory_format ? dbfilepath : NULL); if (ret != 0) pg_fatal("pg_dump failed on database \"%s\", exiting", dbname); - if (filename) + if (!directory_format && filename) { OPF = fopen(filename, PG_BINARY_A); if (!OPF) @@ -1571,6 +1656,10 @@ dumpDatabases(PGconn *conn) } } + /* close map file */ + if (directory_format) + fclose(map_file); + PQclear(res); } @@ -1580,7 +1669,7 @@ dumpDatabases(PGconn *conn) * Run pg_dump on dbname, with specified options. */ static int -runPgDump(const char *dbname, const char *create_opts) +runPgDump(const char *dbname, const char *create_opts, char *dbfile) { PQExpBufferData connstrbuf; PQExpBufferData cmd; @@ -1589,17 +1678,26 @@ runPgDump(const char *dbname, const char *create_opts) initPQExpBuffer(&connstrbuf); initPQExpBuffer(&cmd); - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, - pgdumpopts->data, create_opts); - - /* - * If we have a filename, use the undocumented plain-append pg_dump - * format. - */ - if (filename) - appendPQExpBufferStr(&cmd, " -Fa "); + if (dbfile) + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + dbfile, create_opts); + appendPQExpBufferStr(&cmd, " -F d "); + } else - appendPQExpBufferStr(&cmd, " -Fp "); + { + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin, + pgdumpopts->data, create_opts); + + /* + * If we have a filename, use the undocumented plain-append pg_dump + * format. + */ + if (filename) + appendPQExpBufferStr(&cmd, " -Fa "); + else + appendPQExpBufferStr(&cmd, " -Fp "); + } /* * Append the database name to the already-constructed stem of connection diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index 88ae39d938..273f2002f1 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -41,27 +41,69 @@ #include "postgres_fe.h" #include <ctype.h> +#include <sys/stat.h> #ifdef HAVE_TERMIOS_H #include <termios.h> #endif +#include "common/connect.h" +#include "compress_io.h" +#include "common/string.h" #include "fe_utils/option_utils.h" +#include "fe_utils/string_utils.h" #include "filter.h" #include "getopt_long.h" #include "parallel.h" +#include "pg_backup_archiver.h" #include "pg_backup_utils.h" +typedef struct SimpleDatabaseOidListCell +{ + struct SimpleDatabaseOidListCell *next; + Oid dboid; + const char *dbname; +} SimpleDatabaseOidListCell; + +typedef struct SimpleDatabaseOidList +{ + SimpleDatabaseOidListCell *head; + SimpleDatabaseOidListCell *tail; +} SimpleDatabaseOidList; + +static void +simple_dboid_list_append(SimpleDatabaseOidList *list, Oid dboid, const char *dbname); + static void usage(const char *progname); static void read_restore_filters(const char *filename, RestoreOptions *opts); +static bool _fileExistsInDirectory(const char *dir, const char *filename); +static bool restoreOneDatabase(const char *inputFileSpec, + RestoreOptions *opts, int numWorkers); +static PGconn *connectDatabase(const char *dbname, const char *conn_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error); +static PGresult *executeQuery(PGconn *conn, const char *query); +static int ReadOneStatement(StringInfo inBuf, FILE *f_glo); +static int restoreAllDatabases(PGconn *conn, const char *dumpdirpath, + SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers); +static void execute_global_sql_commands(PGconn *conn, const char *dumpdirpath); +static int filter_dbnames_for_restore(PGconn *conn, + SimpleDatabaseOidList *dbname_oid_list, SimpleStringList db_exclude_patterns); +static int get_dbname_oid_list_from_mfile(const char *dumpdirpath, + SimpleDatabaseOidList *dbname_oid_list); +static void simple_dboid_list_append(SimpleDatabaseOidList *list, Oid dboid, + const char *dbname); +static bool is_full_pattern(PGconn *conn, const char *str, const char *ptrn); +static void simple_string_list_delete(SimpleStringList *list, + SimpleStringListCell *cell, SimpleStringListCell *prev); +static void simple_dboid_list_delete(SimpleDatabaseOidList *list, + SimpleDatabaseOidListCell *cell, SimpleDatabaseOidListCell *prev); int main(int argc, char **argv) { RestoreOptions *opts; int c; - int exit_code; int numWorkers = 1; - Archive *AH; char *inputFileSpec; static int disable_triggers = 0; static int enable_row_security = 0; @@ -77,11 +119,14 @@ main(int argc, char **argv) static int strict_names = 0; bool data_only = false; bool schema_only = false; + SimpleStringList db_exclude_patterns = {NULL, NULL}; + bool globals_only = false; struct option cmdopts[] = { {"clean", 0, NULL, 'c'}, {"create", 0, NULL, 'C'}, {"data-only", 0, NULL, 'a'}, + {"globals-only", 0, NULL, 'g'}, {"dbname", 1, NULL, 'd'}, {"exit-on-error", 0, NULL, 'e'}, {"exclude-schema", 1, NULL, 'N'}, @@ -128,6 +173,7 @@ main(int argc, char **argv) {"no-security-labels", no_argument, &no_security_labels, 1}, {"no-subscriptions", no_argument, &no_subscriptions, 1}, {"filter", required_argument, NULL, 4}, + {"exclude-database", required_argument, NULL, 6}, {NULL, 0, NULL, 0} }; @@ -156,7 +202,7 @@ main(int argc, char **argv) } } - while ((c = getopt_long(argc, argv, "acCd:ef:F:h:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", + while ((c = getopt_long(argc, argv, "aAcCd:ef:F:gh:I:j:lL:n:N:Op:P:RsS:t:T:U:vwWx1", cmdopts, NULL)) != -1) { switch (c) @@ -183,11 +229,14 @@ main(int argc, char **argv) if (strlen(optarg) != 0) opts->formatName = pg_strdup(optarg); break; + case 'g': + /* restore only global.dat file from directory */ + globals_only = true; + break; case 'h': if (strlen(optarg) != 0) opts->cparams.pghost = pg_strdup(optarg); break; - case 'j': /* number of restore jobs */ if (!option_parse_int(optarg, "-j/--jobs", 1, PG_MAX_JOBS, @@ -302,6 +351,14 @@ main(int argc, char **argv) exit(1); opts->exit_on_error = true; break; + case 6: + /* list of databases those needs to skip while restoring */ + simple_string_list_append(&db_exclude_patterns, optarg); + /* + * XXX: TODO as of now, considering only db names but we can + * implement for patterns also. + */ + break; default: /* getopt_long already emitted a complaint */ @@ -329,6 +386,16 @@ main(int argc, char **argv) if (!opts->cparams.dbname && !opts->filename && !opts->tocSummary) pg_fatal("one of -d/--dbname and -f/--file must be specified"); + if (db_exclude_patterns.head != NULL && globals_only) + { + pg_log_error("option --exclude-database cannot be used together with -g/--globals-only"); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit_nicely(1); + } + + if (globals_only && opts->cparams.dbname == NULL) + pg_fatal("option -g/--globals-only requires option -d/--dbname"); + /* Should get at most one of -d and -f, else user is confused */ if (opts->cparams.dbname) { @@ -406,6 +473,69 @@ main(int argc, char **argv) } } + /* + * If directory format, then first check that toc.dat file exist or not? + * + * if toc.dat exist, then no need to check for map.dat and global.dat + * + */ + if (opts->format == archDirectory && + inputFileSpec != NULL && + !_fileExistsInDirectory(inputFileSpec, "toc.dat")) + { + /* if global.dat and map.dat are exist, then open them */ + if (_fileExistsInDirectory(pg_strdup(inputFileSpec), "global.dat") + && _fileExistsInDirectory(pg_strdup(inputFileSpec), "map.dat")) + { + /* Found the global.dat and map.dat file so process. */ + PGconn *conn = NULL; + + if (opts->cparams.dbname == NULL) + pg_fatal(" -d/--dbanme should be given if using dump of dumpall and global.dat"); + + if (opts->createDB != 1) + pg_fatal("option -C/--create should be specified if using dump of dumpall with global.dat"); + + /* Connect to database so that we can execute global.dat */ + conn = connectDatabase(opts->cparams.dbname, NULL, + opts->cparams.pghost, opts->cparams.pgport, opts->cparams.username, + TRI_DEFAULT, false); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", opts->cparams.dbname); + + /* Open global.dat file and execute all the sql commands */ + execute_global_sql_commands(conn, inputFileSpec); + + /* if globals-only, then return from here */ + if (globals_only) + { + PQfinish(conn); + return 0; + } + + /* Now restore all the databases from map.dat file */ + return restoreAllDatabases(conn, inputFileSpec, + db_exclude_patterns, + opts, numWorkers); + }/* end if */ + }/* end if */ + + return restoreOneDatabase(inputFileSpec, opts, numWorkers); +} + +/* + * restoreOneDatabase + * + * This will restore one database using toc.dat file. + */ +static bool +restoreOneDatabase(const char *inputFileSpec, RestoreOptions *opts, + int numWorkers) +{ + Archive *AH; + bool exit_code; + AH = OpenArchive(inputFileSpec, opts->format); SetArchiveOptions(AH, NULL, opts); @@ -471,6 +601,7 @@ usage(const char *progname) printf(_(" -c, --clean clean (drop) database objects before recreating\n")); printf(_(" -C, --create create the target database\n")); printf(_(" -e, --exit-on-error exit on error, default is to continue\n")); + printf(_(" -g, --globals-only restore only global objects, no databases\n")); printf(_(" -I, --index=NAME restore named index\n")); printf(_(" -j, --jobs=NUM use this many parallel jobs to restore\n")); printf(_(" -L, --use-list=FILENAME use table of contents from this file for\n" @@ -483,6 +614,7 @@ usage(const char *progname) printf(_(" -S, --superuser=NAME superuser user name to use for disabling triggers\n")); printf(_(" -t, --table=NAME restore named relation (table, view, etc.)\n")); printf(_(" -T, --trigger=NAME restore named trigger\n")); + printf(_(" --exclude-database=PATTERN exclude databases whose name matches with pattern\n")); printf(_(" -x, --no-privileges skip restoration of access privileges (grant/revoke)\n")); printf(_(" -1, --single-transaction restore as a single transaction\n")); printf(_(" --disable-triggers disable triggers during data-only restore\n")); @@ -621,3 +753,623 @@ read_restore_filters(const char *filename, RestoreOptions *opts) filter_free(&fstate); } + +static bool +_fileExistsInDirectory(const char *dir, const char *filename) +{ + struct stat st; + char buf[MAXPGPATH]; + + if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH) + pg_fatal("directory name too long: \"%s\"", dir); + + return (stat(buf, &st) == 0 && S_ISREG(st.st_mode)); +} + +/* + * Make a database connection with the given parameters. An + * interactive password prompt is automatically issued if required. + * + * If fail_on_error is false, we return NULL without printing any message + * on failure, but preserve any prompted password for the next try. + * + */ +static PGconn * +connectDatabase(const char *dbname, const char *connection_string, + const char *pghost, const char *pgport, const char *pguser, + trivalue prompt_password, bool fail_on_error) +{ + PGconn *conn; + bool new_pass; + const char *remoteversion_str; + int my_version; + const char **keywords = NULL; + const char **values = NULL; + PQconninfoOption *conn_opts = NULL; + static char *password = NULL; + static int server_version; + + if (prompt_password == TRI_YES && !password) + password = simple_prompt("Password: ", false); + + /* + * Start the connection. Loop until we have a password if requested by + * backend. + */ + do + { + int argcount = 6; + PQconninfoOption *conn_opt; + char *err_msg = NULL; + int i = 0; + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* + * Merge the connection info inputs given in form of connection string + * and other options. Explicitly discard any dbname value in the + * connection string; otherwise, PQconnectdbParams() would interpret + * that value as being itself a connection string. + */ + if (connection_string) + { + conn_opts = PQconninfoParse(connection_string, &err_msg); + if (conn_opts == NULL) + pg_fatal("%s", err_msg); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + argcount++; + } + + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + + for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++) + { + if (conn_opt->val != NULL && conn_opt->val[0] != '\0' && + strcmp(conn_opt->keyword, "dbname") != 0) + { + keywords[i] = conn_opt->keyword; + values[i] = conn_opt->val; + i++; + } + } + } + else + { + keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); + values = pg_malloc0((argcount + 1) * sizeof(*values)); + } + + if (pghost) + { + keywords[i] = "host"; + values[i] = pghost; + i++; + } + if (pgport) + { + keywords[i] = "port"; + values[i] = pgport; + i++; + } + if (pguser) + { + keywords[i] = "user"; + values[i] = pguser; + i++; + } + if (password) + { + keywords[i] = "password"; + values[i] = password; + i++; + } + if (dbname) + { + keywords[i] = "dbname"; + values[i] = dbname; + i++; + } + keywords[i] = "fallback_application_name"; + values[i] = progname; + i++; + + new_pass = false; + conn = PQconnectdbParams(keywords, values, true); + + if (!conn) + pg_fatal("could not connect to database \"%s\"", dbname); + + if (PQstatus(conn) == CONNECTION_BAD && + PQconnectionNeedsPassword(conn) && + !password && + prompt_password != TRI_NO) + { + PQfinish(conn); + password = simple_prompt("Password: ", false); + new_pass = true; + } + } while (new_pass); + + /* check to see that the backend connection was successfully made */ + if (PQstatus(conn) == CONNECTION_BAD) + { + if (fail_on_error) + pg_fatal("%s", PQerrorMessage(conn)); + else + { + PQfinish(conn); + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + return NULL; + } + } + + free(keywords); + free(values); + PQconninfoFree(conn_opts); + + /* Check version */ + remoteversion_str = PQparameterStatus(conn, "server_version"); + if (!remoteversion_str) + pg_fatal("could not get server version"); + server_version = PQserverVersion(conn); + if (server_version == 0) + pg_fatal("could not parse server version \"%s\"", + remoteversion_str); + + my_version = PG_VERSION_NUM; + + /* + * We allow the server to be back to 9.2, and up to any minor release of + * our own major version. (See also version check in pg_dump.c.) + */ + if (my_version != server_version + && (server_version < 90200 || + (server_version / 100) > (my_version / 100))) + { + pg_log_error("aborting because of server version mismatch"); + pg_log_error_detail("server version: %s; %s version: %s", + remoteversion_str, progname, PG_VERSION); + exit_nicely(1); + } + + PQclear(executeQuery(conn, ALWAYS_SECURE_SEARCH_PATH_SQL)); + + return conn; +} + +/* + * Run a query, return the results, exit program on failure. + */ +static PGresult * +executeQuery(PGconn *conn, const char *query) +{ + PGresult *res; + + pg_log_info("executing %s", query); + + res = PQexec(conn, query); + if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) + { + pg_log_error("query failed: %s", PQerrorMessage(conn)); + pg_log_error_detail("Query was: %s", query); + PQfinish(conn); + exit_nicely(1); + } + + return res; +} + +/* ---------------- + * ReadOneStatement() + * + * This will start reading from passed file pointer using fgetc and read till + * semicolon(sql statement terminator for global.sql file) + * + * EOF is returned if end-of-file input is seen; time to shut down. + * ---------------- + */ + +static int +ReadOneStatement(StringInfo inBuf, FILE *f_glo) +{ + int c; /* character read from getc() */ + + resetStringInfo(inBuf); + + /* + * Read characters until EOF or the appropriate delimiter is seen. + */ + while ((c = fgetc(f_glo)) != EOF) + { + appendStringInfoChar(inBuf, (char) c); + + if (c == '\n') + { + if(inBuf->len > 1 && + inBuf->data[inBuf->len - 2] == ';') + break; + else + continue; + } + } + + /* No input before EOF signal means time to quit. */ + if (c == EOF && inBuf->len == 0) + return EOF; + + /* Add '\0' to make it look the same as message case. */ + appendStringInfoChar(inBuf, (char) '\0'); + + return 'Q'; +} + +/* + * This will remove names from all dblist that are given with exclude-database + * option. + * + * returns number of dbnames those will be restored. + */ +static int +filter_dbnames_for_restore(PGconn *conn, SimpleDatabaseOidList *dbname_oid_list, + SimpleStringList db_exclude_patterns) +{ + int countdb = 0; + SimpleDatabaseOidListCell *cell = dbname_oid_list->head; + SimpleDatabaseOidListCell *precell = NULL; + + /* Return 0 if there is no db to restore. */ + if (cell == NULL) + return 0; + + while (cell != NULL) + { + bool skip_db_restore = false; + SimpleStringListCell *prev = NULL; + SimpleDatabaseOidListCell *next = cell->next; + + /* Now match this dbname with exclude-database list. */ + for (SimpleStringListCell *celldb = db_exclude_patterns.head; celldb; celldb = celldb->next) + { + if (is_full_pattern(conn, cell->dbname, celldb->val)) + { + /* + * As we need to skip this dbname so set flag to remove it from + * list. + */ + skip_db_restore = true; + + /* + * As this pattern is skipped, now delete entry from list to + * avoid multiple looping. + */ + simple_string_list_delete(&db_exclude_patterns, celldb, prev); + break; + } + + prev = celldb; + } + + /* Increment count if db needs to be restored. */ + if (skip_db_restore) + simple_dboid_list_delete(dbname_oid_list, cell, precell); + else + { + countdb++; /* Increment db couter. */ + precell = cell; + } + + cell = next; /* Process next dbname from dbname list. */ + } + + return countdb; +} + +/* + * Open map.dat file and read line by line and then prepare a list of database + * names and correspoding dboid. + * + * Returns, total number of database names in map.dat file. + */ +static int +get_dbname_oid_list_from_mfile(const char *dumpdirpath, SimpleDatabaseOidList *dbname_oid_list) +{ + FILE *pfile; + char map_file_path[MAXPGPATH]; + char line[MAXPGPATH]; + int count = 0; + + snprintf(map_file_path, MAXPGPATH, "%s/map.dat", dumpdirpath); + + /* Open map.dat file. */ + pfile = fopen(map_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open map.dat file: %s", strerror(errno)); + + /* Append all the dbname and dboid to the list. */ + while((fgets(line, MAXPGPATH, pfile)) != NULL) + { + Oid dboid; + char dbname[MAXPGPATH + 1]; + + /* Extract dbname and dboid from line */ + sscanf(line, "%u %s" , &dboid, dbname); + + pg_log_info("found dbname as :%s and dboid:%d in map.dat file while restoring", dbname, dboid); + + /* Report error if file has any corrupted data. */ + if (!OidIsValid(dboid) || strlen(dbname) == 0) + pg_fatal("invalid entry in map.dat file at line : %d", count + 1); + + /* + * XXX : before adding dbanme into list, we can verify that this db + * needs to skipped for restore or not but as of now, we are making + * a list of all the databases. + */ + simple_dboid_list_append(dbname_oid_list, dboid, dbname); + count++; + } + + /* Close map.dat file. */ + fclose(pfile); + + return count; +} + +/* + * This will restore databases those dumps are present in + * directory based on map.dat file mapping. + * + * This will skip restoring for databases that are specified with + * exclude-database option. + */ +static int +restoreAllDatabases(PGconn *conn, const char *dumpdirpath, + SimpleStringList db_exclude_patterns, RestoreOptions *opts, + int numWorkers) +{ + SimpleDatabaseOidList dbname_oid_list = {NULL, NULL}; + SimpleDatabaseOidListCell *cell; + int exit_code = 0; + int num_db_restore; + int num_total_db; + + num_total_db = get_dbname_oid_list_from_mfile(dumpdirpath, &dbname_oid_list); + + /* If map.dat has no entry, return from here. */ + if (dbname_oid_list.head == NULL) + return 0; + + pg_log_info("found total %d database names in map.dat file", num_total_db); + + /* Skip any explicitly excluded database. */ + num_db_restore = filter_dbnames_for_restore(conn, &dbname_oid_list, + db_exclude_patterns); + + /* Close the db connection as we are done globals and patterns. */ + PQfinish(conn); + + /* Exit if no db needs to be restored. */ + if (dbname_oid_list.head == NULL) + return 0; + + pg_log_info("needs to restore %d databases out of %d databases", num_db_restore, num_total_db); + + /* + * XXX: TODO till now, we made a list of databases, those needs to be restored + * after skipping names of exclude-database. Now we can launch parallel + * workers to restore these databases. + */ + cell = dbname_oid_list.head; + + while(cell != NULL) + { + char subdirpath[MAXPGPATH]; + int dbexit_code; + + /* + * We need to reset override_dbname so that objects can be restored into + * already created database. (used with -d/--dbname option) + */ + if (opts->cparams.override_dbname) + { + pfree(opts->cparams.override_dbname); + opts->cparams.override_dbname = NULL; + } + + snprintf(subdirpath, MAXPGPATH, "%s/databases/%u", dumpdirpath, cell->dboid); + + /* + * Database -d/--dbname is already created so reset createDB to ignore + * database creation error. + */ + if (strcmp(cell->dbname, opts->cparams.dbname) == 0) + opts->createDB = 0; + + pg_log_info("restoring database \"%s\"", cell->dbname); + + dbexit_code = restoreOneDatabase(subdirpath, opts, numWorkers); + + /* Store exit_code to report it back. */ + if (exit_code == 0 && dbexit_code != 0) + exit_code = dbexit_code; + + /* Set createDB option to create new database. */ + if (strcmp(cell->dbname, opts->cparams.dbname) == 0) + opts->createDB = 1; + + cell = cell->next; + } /* end while */ + + /* Log number of processed databases.*/ + pg_log_info("number of restored databases are %d", num_db_restore); + + return exit_code; +} + +/* + * This will open global.dat file and will execute all global sql commands one + * by one statement. + * + * semicolon is considered as statement terminator. + */ +static void +execute_global_sql_commands(PGconn *conn, const char *dumpdirpath) +{ + char global_file_path[MAXPGPATH]; + PGresult *result; + StringInfoData sqlstatement; + FILE *pfile; + + snprintf(global_file_path, MAXPGPATH, "%s/global.dat", dumpdirpath); + + /* now open global.dat file */ + pfile = fopen(global_file_path, "r"); + + if (pfile == NULL) + pg_fatal("could not open global.dat file: %s", strerror(errno)); + + /* Init sqlstatement to append commands */ + initStringInfo(&sqlstatement); + + /* Process file till EOF and execute sql statements */ + while (ReadOneStatement(&sqlstatement, pfile) != EOF) + { + result = PQexec(conn, sqlstatement.data); + + switch (PQresultStatus(result)) + { + case PGRES_COMMAND_OK: + case PGRES_TUPLES_OK: + case PGRES_EMPTY_QUERY: + case PGRES_COPY_IN: + break; + default: + pg_log_error("could not execute query: %s \nCommand was: %s", PQerrorMessage(conn), sqlstatement.data); + } + PQclear(result); + } + + fclose(pfile); +} + +/* + * appends a node to the list in the end. + */ +static void +simple_dboid_list_append(SimpleDatabaseOidList *list, Oid dboid, const char *dbname) +{ + SimpleDatabaseOidListCell *cell; + + cell = pg_malloc_object(SimpleDatabaseOidListCell); + + cell->next = NULL; + cell->dboid = dboid; + cell->dbname = pg_strdup(dbname); + + if (list->tail) + list->tail->next = cell; + else + list->head = cell; + list->tail = cell; +} + +/* + * delete cell from string list. + */ +static void +simple_string_list_delete(SimpleStringList *list, SimpleStringListCell *cell, + SimpleStringListCell *prev) +{ + if (prev == NULL) + { + list->head = cell->next; + pfree(cell); + } + else + { + prev->next = cell->next; + pfree(cell); + } +} + +/* + * delete cell from database and oid list. + */ +static void +simple_dboid_list_delete(SimpleDatabaseOidList *list, SimpleDatabaseOidListCell *cell, + SimpleDatabaseOidListCell *prev) +{ + if (prev == NULL) + { + list->head = cell->next; + pfree(cell); + } + else + { + prev->next = cell->next; + pfree(cell); + } +} + +/* + * is_full_pattern + * + * Returns true if we can constuct 1st string from given pattern. + * + */ +static bool +is_full_pattern(PGconn *conn, const char *str, const char *ptrn) +{ + PQExpBuffer query; + PGresult *result; + + query = createPQExpBuffer(); + + printfPQExpBuffer(query, + "SELECT substring ( " + " '%s' , " + " '%s' ) ", str, ptrn); + + result = executeQuery(conn, query->data); + + if (PQresultStatus(result) == PGRES_TUPLES_OK) + { + if (PQntuples(result) == 1) + { + const char *outstr; + + outstr = PQgetvalue(result, 0, 0); + + PQclear(result); + destroyPQExpBuffer(query); + + /* + * If output string of substring function is matches with str, then + * we can construct str from pattern. + */ + if (strcmp(outstr, str) == 0) + return true; + else + return false; + } + } + else + pg_log_error("could not execute query: %s \nCommand was: %s", PQerrorMessage(conn), query->data); + + PQclear(result); + destroyPQExpBuffer(query); + + return false; +} -- 2.39.3 ^ permalink raw reply [nested|flat] 15+ messages in thread
end of thread, other threads:[~2025-01-07 19:04 UTC | newest] Thread overview: 15+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-06-10 12:58 Non-text mode for pg_dumpall Andrew Dunstan <[email protected]> 2024-06-10 14:14 ` Nathan Bossart <[email protected]> 2024-06-10 14:51 ` Andrew Dunstan <[email protected]> 2024-06-10 14:52 ` Nathan Bossart <[email protected]> 2024-06-10 14:52 ` Magnus Hagander <[email protected]> 2024-06-10 15:03 ` Nathan Bossart <[email protected]> 2024-06-10 15:45 ` Magnus Hagander <[email protected]> 2024-06-10 16:20 ` Nathan Bossart <[email protected]> 2024-06-10 16:21 ` Tom Lane <[email protected]> 2024-06-10 17:27 ` Andrew Dunstan <[email protected]> 2024-06-10 19:36 ` Magnus Hagander <[email protected]> 2024-12-31 18:23 ` Mahendra Singh Thalor <[email protected]> 2025-01-01 20:35 ` Mahendra Singh Thalor <[email protected]> 2025-01-06 17:35 ` Nathan Bossart <[email protected]> 2025-01-07 19:04 ` Mahendra Singh Thalor <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox