public inbox for [email protected]  
help / color / mirror / Atom feed
From: Adrian Klaver <[email protected]>
To: yudhi s <[email protected]>
To: pgsql-general <[email protected]>
Subject: Re: Moving delta data faster
Date: Wed, 3 Apr 2024 21:46:56 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAEzWdqfGN5cHN4cwSJm-rruab4E0y_9tqzihR2jQGpMXHR7cqw@mail.gmail.com>
References: <CAEzWdqcYGi0U5_cK1FVykx2-OZHmEUD8EZ_VE=kpoVaZKYWJeg@mail.gmail.com>
	<[email protected]>
	<CAEzWdqfGN5cHN4cwSJm-rruab4E0y_9tqzihR2jQGpMXHR7cqw@mail.gmail.com>

On 4/3/24 20:54, yudhi s wrote:
> On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>     On 4/3/24 13:38, yudhi s wrote:
>      >   Hi All,
>      >   It's postgresql database version 15.4. We have a requirement in
>     which
>      > we will be initially moving full table data for 3-4 tables, from
>     source
>      > database to target(i.e. postgres) . Maximum number of rows will be
>      > ~10million rows in those tables. Then subsequently these rows
>     will be
>      > inserted/updated based on the delta number of rows that got
>      > inserted/updated in the source database. In some cases these changed
>      > data can flow multiple times per day to the downstream i.e. postgres
>      > database and in other cases once daily.
> 
>     What is the source database?
> 
>     Can it be reached with a FDW?:
> 
>     https://wiki.postgresql.org/wiki/Foreign_data_wrappers
>     <https://wiki.postgresql.org/wiki/Foreign_data_wrappers;
> 
>     Can the delta on the source be output as CSV?
> 
> 
> 
> Thank you Adrian.
> 
> And one thing i forgot to mention this target postgresql database would 
> be on AWS RDS whereas the source Oracle databases is on premise. I think 
> we don't have the FDW extension currently in place but we can get that. 
> I am just not able to understand clearly  though, but do you mean export 
> the data from source using CSV and do truncate and import on target. And 
> as these data will be moved through the network won't that cause slowness?
> 
> The source database here is Oracle database. Correct me if wrong, it 
> looks like foreign data wrapper is like a DB link. Or do you mean 
> writing a query on the target database (which can be UPSERT or MERGE) 
> but will be joining the table from the source database through the 
> DBlink/DDW? But my question was whether we should use UPSERT or MERGE 
> for comparing and loading the delta records to the target postgresql 
> database. Want to understand which is more performant , as I see in the 
> past Merge having performance issues in the past, but not very sure 
> about that.

My motivation was to get some basic information about your setup and 
what you are trying to achieve.

If I understand correctly you have:

1) An Oracle database with tables that you want to copy the complete 
data from to a Postgres database. For this sort of thing 
COPY(https://www.postgresql.org/docs/current/sql-copy.html) on the 
Postgres end using CSV data generated from the source is probably the 
quickest bulk load method.

2) After the initial load you want to do follow up INSERT/UPDATEs based 
on a delta of the source tables relative to the initial load. This is 
still a bit of mystery to me. How are determining the delta: a) On the 
source end entirely or b) Target relative to source? Also what is the 
anticipated size of the delta per transfer?

Additional information needed:

1) Network distance between source and target?

2) Network capacity?

3) Expected load on both source and target servers from other operations?

-- 
Adrian Klaver
[email protected]







view thread (5+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Moving delta data faster
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox