From: Ron Johnson <[email protected]>
To: pgsql-general <[email protected]>
Subject: Re: Load a csv or a avro?
Date: Fri, 5 Jul 2024 09:03:15 -0400
Message-ID: <CANzqJaC419Dfe3xQHdOtQBkQPPWwPAUbNUvdfo=MWxf9eNWb3g@mail.gmail.com> (raw)
In-Reply-To: <CAD=mzVUo9UpTw7F_8HKDK19ZmO6tE6Cfa4T-7i1J_QGfi6NpOw@mail.gmail.com>
References: <CAD=mzVUo9UpTw7F_8HKDK19ZmO6tE6Cfa4T-7i1J_QGfi6NpOw@mail.gmail.com>

On Fri, Jul 5, 2024 at 5:08 AM sud <[email protected]> wrote:

> Hello all,
>
> This is a Postgres database. We have the option of receiving files as CSV
> and/or as Avro-format messages from another system, to be loaded into our
> Postgres database. The volume will be 300 million messages per day, across
> many files, in batches.
>
> My question is: which format should we choose for faster data-loading
> performance?
>

What application will be loading the data? If psql, then go with CSV;
COPY is *really* efficient.
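
A minimal sketch of what driving that from a script might look like. It only
builds the psql command line for a client-side \copy of one CSV file; the
table name, file path, and database name are hypothetical, and HEADER true
assumes the files carry a header row.

```python
def build_psql_copy_argv(table, csv_path, dbname):
    """Return an argv list that asks psql to load one CSV file with its copy meta-command."""
    # Keep the sketch simple: refuse paths that would need escaping inside
    # the single-quoted filename.
    if "'" in csv_path or "\\" in csv_path:
        raise ValueError("path needs quoting this sketch does not handle")
    # \copy reads the file on the client and streams it to the server via COPY.
    # HEADER true is an assumption about the incoming files.
    meta = f"\\copy {table} FROM '{csv_path}' WITH (FORMAT csv, HEADER true)"
    return ["psql", "--dbname", dbname, "--command", meta]


# Hypothetical names, for illustration only.
argv = build_psql_copy_argv("events", "/data/batch_001.csv", "appdb")
```

The resulting list can be handed to subprocess.run(argv, check=True) once the
names match your setup.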

If the PG tables are already mapped to the Avro format, then Avro may be
faster.

> And should any other aspects be considered, apart from just loading
> performance?
>

If all the data comes in at night, drop as many indices as possible before
loading, then recreate them once the load has finished.
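
As a rough sketch of that ordering, the helper below just assembles the SQL
statements in sequence: drop, load, recreate. The index name, its definition,
and the COPY statement are all made up for illustration; in practice you
would save the real definitions (for example from pg_indexes) before
dropping anything.

```python
def plan_bulk_load(index_definitions, load_statement):
    """Return SQL statements in order: drop indexes, load, recreate indexes.

    index_definitions maps index name -> the CREATE INDEX statement that
    rebuilds it (hypothetical input, captured before the load).
    """
    drops = [f"DROP INDEX IF EXISTS {name};" for name in index_definitions]
    # Normalise the trailing semicolon so every statement ends with exactly one.
    creates = [ddl.rstrip().rstrip(";") + ";" for ddl in index_definitions.values()]
    return drops + [load_statement] + creates


# Hypothetical table and index, for illustration only.
plan = plan_bulk_load(
    {"events_created_at_idx": "CREATE INDEX events_created_at_idx ON events (created_at)"},
    "COPY events FROM STDIN WITH (FORMAT csv);",
)
```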

Load each file in as few DB connections as possible: the most efficient
binary format won't do you any good if you open and close a connection for
each and every row.
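
To illustrate the connection point, here is a small sketch that pushes
several batches through one connection and commits once. It uses Python's
built-in sqlite3 module purely as a stand-in so it runs without a server;
against Postgres you would pass a connection from your driver instead (and
the placeholder style would differ). The events table and its columns are
hypothetical.

```python
import sqlite3


def load_batches(conn, batches):
    """Insert every batch over the single connection passed in; return rows loaded."""
    total = 0
    cur = conn.cursor()
    for rows in batches:
        # One executemany call per batch, all on the same open connection.
        cur.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", rows)
        total += len(rows)
    conn.commit()  # a single commit at the end, not one per row
    return total


# In-memory SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
loaded = load_batches(conn, [[(1, "a"), (2, "b")], [(3, "c")]])
```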

