Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sErXO-0047wX-VN for pgsql-general@arkaria.postgresql.org; Wed, 05 Jun 2024 14:22:52 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sErXO-0092KR-V0 for pgsql-general@arkaria.postgresql.org; Wed, 05 Jun 2024 14:22:50 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sErXO-0092Jo-JJ for pgsql-general@lists.postgresql.org; Wed, 05 Jun 2024 14:22:50 +0000 Received: from mail-yb1-xb30.google.com ([2607:f8b0:4864:20::b30]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sErXM-00089a-6Y for pgsql-general@postgresql.org; Wed, 05 Jun 2024 14:22:50 +0000 Received: by mail-yb1-xb30.google.com with SMTP id 3f1490d57ef6-dfa4876a5bbso6308521276.2 for ; Wed, 05 Jun 2024 07:22:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aweber.com; s=2018090500; t=1717597366; x=1718202166; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=d/aLTch3yzZ4d2rJSZGudzwb2ZXKeosBD/lBhiGUWa0=; b=lOF3z6BvZYiVZk6XVqhS1k07QgWA86N4QPjsGp20vz97948N0y5JR1PAvYdqbF7xgA qOx673XJbELW5wsQIfn5AHWKhUcorWi8nxkeVMu6y8hEifqeYL+x42KTiTFNgD8G5pCq mpR/wNqrU+iYrFzo6FhGGVlaDgONfOe+AFsBdnAM8zEBWdQZwI69lviTA8FOshS8n6Rn XEpYRgBxYixz5pQFDLIoU/G/4K8+7h97G7ZsCswP+3Y2U8/2Nd8Z8kptf5uOkYPUeJXX qBs349kMDTxll8AXskQirBM2GHjn3eziIYxhlsIZKe5h7bOOe8TPy5QnfEu2lYHyLuRb iLiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717597366; x=1718202166; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=d/aLTch3yzZ4d2rJSZGudzwb2ZXKeosBD/lBhiGUWa0=; b=P/bqo0tSrbCjDCsmh0ceaG6yGVeHn7PKMxAZKFytHdQURaNtrOn5Luht/cPVP/pBMO oDpnLkgpSr7dAVbXwhajtkHT18lu7d0WEaFviikfeLQaVG3rkylxL9lZKUxuKvZnPzP3 H4OXdQnS4smIPA+LQWAGpeFrnonH0/U/zyXQKZIHluQ83kBamezGYAKUa9pfdKIqcVhc xdSgaGpE6wOC/Jw+uDNa2mJ8AjozSOFYMEMprHhJ5AxC7ZkmCslorSnU+Lwvetay9iAf 1s7Jj82JCW67X+RU6ZAwfOd2qSWsBYeu/tN2tpr/T+Xs+p+T59szN9Z8Q3JCJ3zsBGkS XjEA== X-Gm-Message-State: AOJu0Yzmxu4MAiAP4jyfeQgsvjoikl15sckvDKPOFbp/epVbjtl/g+7Y LbF1rk6DVVMedo88LfOj5ynxNHa+XMWdLTfj6zL0owm0BqwPPS/N2Kn7l1lchSHE0Z5xsYu+ysX jHgS3AKbad2b067qY7rOQHljA3k/ITKluSEyuBQ== X-Google-Smtp-Source: AGHT+IHAnx3xyBAPvBwcl+Tc02lJloeJjEXumOwLaXcdZmicuDe/gIojGWHjPHm6rh0SVIOatv/LWvtYFDSW7ipjepw= X-Received: by 2002:a25:828a:0:b0:df4:e791:867c with SMTP id 3f1490d57ef6-dfacacf972fmr2690838276.43.1717597366046; Wed, 05 Jun 2024 07:22:46 -0700 (PDT) MIME-Version: 1.0 References: <25e9749c-ff38-4832-9b26-386cac33b3d8@aklaver.com> <7005ac8c-2f83-4122-9172-04bca268f987@aklaver.com> <4b28e899-d802-43ce-b20b-655a9077f08f@gmail.com> In-Reply-To: From: Gavin Roy Date: Wed, 5 Jun 2024 10:22:35 -0400 Message-ID: Subject: Re: Purpose of pg_dump tar archive format? To: Ron Johnson Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000def1b7061a254c24" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000def1b7061a254c24 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jun 4, 2024 at 7:36=E2=80=AFPM Ron Johnson wrote: > On Tue, Jun 4, 2024 at 3:47=E2=80=AFPM Gavin Roy wrot= e: > >> >> On Tue, Jun 4, 2024 at 3:15=E2=80=AFPM Ron Johnson >> wrote: >> >>> >>> But why tar instead of custom? That was part of my original question. >>> >> >> I've found it pretty useful for programmatically accessing data in a dum= p >> for large databases outside of the normal pg_dump/pg_restore workflow. Y= ou >> don't have to seek through one large binary file to get to the data sect= ion >> to get at the data. >> > > Interesting. Please explain, though, since a big tarball _is_ "one large > binary file" that you have to sequentially scan. (I don't know the > internal structure of custom format files, and whether they have file > pointers to each table.) > Not if you untar it first. > Is it because you need individual .dat "COPY" files for something other > than loading into PG tables (since pg_restore --table=3Dxxxx does that, t= oo), > and directory format archives can be inconvenient? > In the past I've used it for data analysis outside of Postgres. --=20 *Gavin M. Roy* CTO AWeber --000000000000def1b7061a254c24 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Tue, Jun 4, 2024 at 7:36=E2=80=AFPM Ro= n Johnson <= ronljohnsonjr@gmail.com> wrote:
On Tue, Jun 4, 202= 4 at 3:47=E2=80=AFPM Gavin Roy <gavinr@aweber.com> wrote:

On Tue, Jun 4, 2024 at 3:15=E2=80=AFPM Ron Johnson <ronljohnsonjr@gm= ail.com> wrote:

But why tar instead of custom? That was part of my original questio= n.

I've found it pre= tty useful for programmatically accessing data in a dump for large database= s outside of the normal pg_dump/pg_restore workflow. You don't have to = seek through one large binary file to get to the data section to get at the= data.

Interesting.=C2=A0 Please = explain, though, since a big tarball _is_ "one large binary file"= that you have to sequentially scan.=C2=A0 (I don't know the internal s= tructure of custom format files, and whether they have file pointers to eac= h=C2=A0table.)

Not if you= untar it first.
=C2=A0
Is it becaus= e you need individual .dat=C2=A0"COPY" files for something other = than loading into PG tables (since pg_restore --table=3Dxxxx does that, too= ), and directory format archives can be inconvenient?

In the past I've used it for data analysis outsi= de of Postgres.
--
Gavin M. Roy
CTO
AWeber
--000000000000def1b7061a254c24--