From: Adrian Klaver <[email protected]>
To: veem v <[email protected]>
To: Christophe Pettus <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: IO related waits
Date: Tue, 17 Sep 2024 08:54:49 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAB+=1TWdRd2sBw7-vXCovH_VHLANh+aSaU-WyJ2m8tL4TkF=8g@mail.gmail.com>
References: <CAB+=1TWZNvMhVthJ2iKs_Q4qBzMw-v_oaSz7HbFE_P_qC5jMFA@mail.gmail.com>
	<[email protected]>
	<CAB+=1TWdRd2sBw7-vXCovH_VHLANh+aSaU-WyJ2m8tL4TkF=8g@mail.gmail.com>

On 9/16/24 20:55, veem v wrote:
> 
> 
> On Tue, 17 Sept 2024 at 03:41, Adrian Klaver <[email protected]> wrote:
> 
> 
>     Are you referring to this?:
> 
>     https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/
> 
>     If not then you will need to be more specific.
> 
> 
> Yes, I was referring to this one. So what are the caveats of this 
> approach, considering the transactions are financial transactions and 
> are meant to be ACID compliant? Additionally, I was not aware of the 
> "synchronous_commit" parameter on the DB side, which controls 
> synchronous commit.
> 
> Would both of these give the same asynchronous behaviour and achieve 
> the same result, meaning that client data load throughput will increase 
> because the DB will not wait for the data to be written to the WAL 
> before giving a confirmation back to the client, and the client will 
> not wait for the DB to confirm whether the data has been persisted? 
> Also, since the WAL still has to be flushed to disk in the backend 
> (just later than before), can this method cause contention on the 
> database storage side if data is ingested from the client faster than 
> it can be written to disk, and can it in some way impact data 
> consistency for read queries?

This is not something that I am that familiar with. I suspect though 
this is more complicated than you think. From the link above:

"Prerequisites

As illustrated in the section above, implementing proper asynchronous 
I/O to a database (or key/value store) requires a client to that 
database that supports asynchronous requests. Many popular databases 
offer such a client.

In the absence of such a client, one can try and turn a synchronous 
client into a limited concurrent client by creating multiple clients and 
handling the synchronous calls with a thread pool. However, this 
approach is usually less efficient than a proper asynchronous client.
"

Which means that on the Flink end you need to:

1) Use Flink async I/O.

2) Find a client that supports async or fake it by using multiple 
synchronous clients.
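
To make the "fake it" option in 2) concrete, here is a minimal sketch of 
the pattern the Flink docs describe: several blocking clients driven by 
a thread pool. It is written in Python only to show the shape of the 
idea; it is not Flink code. FakeSyncClient, its insert() method, the 
50 ms latency and the pool size are all made up for illustration, and a 
real setup would use an actual database driver instead.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeSyncClient:
    """Hypothetical stand-in for a blocking (synchronous) database client."""

    def __init__(self, latency_s=0.05):
        self.latency_s = latency_s

    def insert(self, row):
        # Simulates the time spent waiting for the server to confirm.
        time.sleep(self.latency_s)
        return row["id"]


# Each worker thread keeps its own client, so no client is shared.
_local = threading.local()


def _init_worker():
    _local.client = FakeSyncClient()


def _insert(row):
    return _local.client.insert(row)


def insert_concurrently(rows, pool_size=8):
    """Run blocking inserts on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=pool_size,
                            initializer=_init_worker) as pool:
        return list(pool.map(_insert, rows))


if __name__ == "__main__":
    rows = [{"id": i} for i in range(16)]
    start = time.monotonic()
    ids = insert_concurrently(rows)
    print(f"inserted {len(ids)} rows in {time.monotonic() - start:.2f}s")
```

Each call still blocks its own thread, so the number of requests in 
flight is capped by the pool size. That is the "limited concurrent 
client" the docs mention, and why they call it less efficient than a 
proper asynchronous client.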

On the Postgres end there is this:

https://www.postgresql.org/docs/current/wal-async-commit.html

With synchronous_commit set to off, the server returns success to the 
client sooner, because it does not wait for the transaction's WAL 
records to be flushed to disk. However, the point of Flink async I/O is 
that it does not wait for the response before moving on, so I am not 
sure how much synchronous_commit = off would help.

-- 
Adrian Klaver
[email protected]






