From: Muhammad Ikram <[email protected]>
To: Josef Šimánek <[email protected]>
Cc: sud <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: Load a csv or a avro?
Date: Fri, 5 Jul 2024 15:15:28 +0500
Message-ID: <CAGeimVo_EOJO+BViDnLoxEdFoFKQkeHU=gniEQ9e2GbjAUvUHg@mail.gmail.com> (raw)
In-Reply-To: <CAFp7QwoNgw_3rGsJF1nK5Mzceg2_+UL0-9pJ2WRCw04O71S4zA@mail.gmail.com>
References: <CAD=mzVUo9UpTw7F_8HKDK19ZmO6tE6Cfa4T-7i1J_QGfi6NpOw@mail.gmail.com>
	<CAFp7QwoNgw_3rGsJF1nK5Mzceg2_+UL0-9pJ2WRCw04O71S4zA@mail.gmail.com>

Hi,

Performance Considerations
    Avro files are usually smaller (compact binary encoding, often
compressed), so they need less I/O time, whereas CSV files are simpler but
larger, so reading and writing them takes more time.
    The COPY command works very well with CSV files, whereas an ETL step is
required to handle Avro, since COPY does not read Avro directly.
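
As a rough sketch of what that ETL step could look like (an illustration
under stated assumptions, not a tested pipeline): once the Avro messages
have been decoded into plain records by some Avro library, they can be
serialized to CSV in memory and streamed through COPY ... FROM STDIN. The
table name "messages", the column names, and the sample records below are
hypothetical, and the commented-out load assumes the psycopg2 driver.

```python
import csv
import io


def copy_statement(table, columns):
    """Build a COPY ... FROM STDIN statement for CSV input.

    Table and column names are interpolated as-is, so they must come
    from trusted configuration, never from user input.
    """
    column_list = ", ".join(columns)
    return f"COPY {table} ({column_list}) FROM STDIN WITH (FORMAT csv)"


def records_to_csv_buffer(records, columns):
    """Serialize decoded records (dicts) into an in-memory CSV stream."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        writer.writerow([record[name] for name in columns])
    buffer.seek(0)  # rewind so the consumer reads from the start
    return buffer


# Hypothetical records, standing in for messages already decoded from Avro.
records = [
    {"id": 1, "payload": "hello"},
    {"id": 2, "payload": "with, comma"},
]
columns = ["id", "payload"]
statement = copy_statement("messages", columns)
buffer = records_to_csv_buffer(records, columns)

# With a live connection (psycopg2 assumed), the load itself would be:
#     with conn.cursor() as cur:
#         cur.copy_expert(statement, buffer)
#     conn.commit()
```

If the source system can deliver CSV directly, the conversion function is
not needed and the file can be handed to COPY as it is.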

Regards,
Muhammad Ikram


On Fri, Jul 5, 2024 at 3:03 PM Josef Šimánek <[email protected]>
wrote:

> On Fri, Jul 5, 2024 at 11:08, sud <[email protected]> wrote:
> >
> > Hello all,
> >
> > It's a Postgres database. We have the option of getting files in CSV
> > and/or Avro format messages from another system to load into our
> > Postgres database. The volume will be 300 million messages per day
> > across many files in batches.
> >
> > My question is: which format should we choose for faster data-loading
> > performance? And are there any other aspects we should consider apart
> > from loading performance?
>
> We are able to load ~300 million rows per day using CSV and the COPY
> functions (
> https://www.postgresql.org/docs/current/libpq-copy.html#LIBPQ-COPY-SEND).
>
>
>
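
For a sense of scale, the 300 million messages per day mentioned above
comes to roughly 3,500 rows per second if loading is spread evenly over 24
hours. A small sketch of that arithmetic follows; the 8-hour window is a
hypothetical example, not a figure from this thread.

```python
def required_rows_per_second(rows_per_day, load_window_hours=24.0):
    """Average sustained rate needed to load a day's rows within the window."""
    return rows_per_day / (load_window_hours * 3600)


# Volume from the original question: 300 million messages per day.
full_day_rate = required_rows_per_second(300_000_000)

# Hypothetical case: the same volume squeezed into an 8-hour batch window.
batch_window_rate = required_rows_per_second(300_000_000, load_window_hours=8)

print(f"{full_day_rate:.0f} rows/s over 24 hours")
print(f"{batch_window_rate:.0f} rows/s over 8 hours")
```

These are averages only; bursts in the incoming batches would need
headroom above them.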

-- 
Muhammad Ikram

