[pgjdbc/pgjdbc] issue #2325: Support pipelining queries

pgjdbc/pgjdbc GitHub issues and pull requests (mirror)  
15+ messages / 5 participants

* [pgjdbc/pgjdbc] issue #2325: Support pipelining queries
@ 2021-10-27 09:33  "kustodian (@kustodian)" <[email protected]>
  0 siblings, 0 replies; 15+ messages in thread

From: kustodian (@kustodian) @ 2021-10-27 09:33 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

**I'm submitting a ...**
- feature request

**Describe the issue**
Postgres 14 libpq added [pipeline mode](https://www.postgresql.org/docs/14/libpq-pipeline-mode.html), so it would be great if JDBC also supported this feature, which can significantly improve performance over high-latency connections or for workloads with many small write (INSERT/UPDATE/DELETE) operations.
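
As a rough illustration of why pipelining helps on high-latency links, the sketch below compares the time spent with and without pipelining. The figures (100 statements, 20 ms round trip, 1 ms of server work per statement) and the class name are made up for the example, and the model ignores bandwidth and buffering limits.

```java
// Back-of-the-envelope model, not a benchmark: all figures are assumptions.
final class PipelineLatencyEstimate {
    // Without pipelining, every statement waits for its own round trip.
    static long sequentialMillis(int statements, long roundTripMs, long serverMsPerStatement) {
        return statements * (roundTripMs + serverMsPerStatement);
    }

    // With pipelining, statements are sent back to back, so ideally one round trip is paid.
    static long pipelinedMillis(int statements, long roundTripMs, long serverMsPerStatement) {
        return roundTripMs + statements * serverMsPerStatement;
    }

    public static void main(String[] args) {
        System.out.println(sequentialMillis(100, 20, 1)); // 2100
        System.out.println(pipelinedMillis(100, 20, 1));  // 120
    }
}
```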

**PostgreSQL Version?**
14

From: davecramer (@davecramer) @ 2021-10-27 09:36 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Unfortunately the driver is synchronous. Making the driver asynchronous is a pretty big blocker here.
Curious how you would even use this in the JDBC model? You would need to use futures, etc.

From: davecramer (@davecramer) @ 2021-12-17 14:00 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I'm not even sure how you would use this in JDBC. It's not asynchronous at all.

From: vlsi (@vlsi) @ 2021-12-17 16:39 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

We can try inventing a better "prepared statement batching".

1. The JDBC spec requires that a batch be executed only by the `executeBatch` call. In other words, the spec does not allow sending queries earlier. We could do better if we were allowed to send queries as soon as the user calls `addBatch`. Unfortunately, the spec also provides a `clearBatch` method, so eager sending should be opt-in (e.g. via a special `setBatchMode(forward_only)`).
On the other hand, we already have `rewrite batch inserts to insert values`, so we might add an option that would automatically pipeline after each `addBatch` and throw an exception if the user calls `clearBatch` when some of the data has already been pipelined. @davecramer, WDYT?

2. The JDBC spec requires all statements in a batch to have the same shape. This effectively blocks batching statements with different SQL. Of course, we might try adding our own API (e.g. something like `Copy`, where you add `PreparedStatement`s); however, it would be a non-standard API.
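
To make the opt-in "forward only" idea from the first point concrete, here is a minimal toy model. It is not pgjdbc API: the class and its methods are hypothetical, and a plain list stands in for the socket. It only shows the proposed contract, namely that `addBatch` sends eagerly and `clearBatch` fails once something has been pipelined.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the opt-in "forward only" batch idea; this is NOT pgjdbc API.
final class ForwardOnlyBatch {
    private final List<String> wire = new ArrayList<>(); // stands in for the socket
    private int pending;                                  // rows sent since the last executeBatch

    // Eagerly "sends" the row instead of buffering it until executeBatch.
    void addBatch(String row) {
        wire.add(row);
        pending++;
    }

    // Once rows are on the wire they cannot be taken back.
    void clearBatch() {
        if (pending > 0) {
            throw new IllegalStateException(pending + " row(s) already pipelined");
        }
    }

    // Only has to collect results, because the rows were sent earlier.
    int executeBatch() {
        int sent = pending;
        pending = 0;
        return sent;
    }

    // Copy of everything that has been "sent" so far.
    List<String> sentSoFar() {
        return new ArrayList<>(wire);
    }
}
```

In this model, calling `clearBatch` on an empty batch is harmless, while calling it after `addBatch` throws, which mirrors the exception suggested above.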

----

If we consider pipelining arbitrary SQLs, then it could be like:

```
nameResultSet = con.executeStatement("select name from object where id=1"); // pipelines query
descriptionResultSet = con.executeStatement("select description from object where id=1"); // pipelines query

nameResultSet.next(); // <-- blocks and waits for the result
descriptionResultSet.next(); // <-- the result is likely ready by this time
```

Unfortunately, we can't defer the exception from `executeStatement` until `nameResultSet.next()`, as that would change the semantics. An application might expect invalid SQL to raise an SQL exception when it calls `executeStatement`, and it would be surprised if the exception were delayed until the first `resultSet` operation or even until the statement is closed (e.g. if the result set was never really used).

On the other hand, I do not know if moving "exceptions" from `executeStatement` till "the first access of resultset" is really that breaking for the applications.
It might be fun to try `autoPipelineAllQueries=true` approach so `statement.executeQuery()` does not really wait for the response from the backend.
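
A minimal sketch of what the deferred-error behaviour could look like, assuming a fake backend built on `CompletableFuture`. All names here (`DeferredErrorDemo`, `LazyResult`, this `executeStatement`) are hypothetical and only mimic the idea; the "backend" simply rejects any SQL that does not start with `select`.

```java
import java.sql.SQLException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical sketch of deferred errors; none of these names exist in pgjdbc.
final class DeferredErrorDemo {
    // A result whose rows (or failure) arrive later.
    static final class LazyResult {
        private final CompletableFuture<List<String>> rows;
        private Iterator<String> it;
        private String current;

        LazyResult(CompletableFuture<List<String>> rows) {
            this.rows = rows;
        }

        // The first call blocks until the backend has answered; errors surface here.
        boolean next() throws SQLException {
            if (it == null) {
                try {
                    it = rows.join().iterator();
                } catch (CompletionException e) {
                    throw new SQLException("deferred failure", e.getCause());
                }
            }
            if (!it.hasNext()) {
                return false;
            }
            current = it.next();
            return true;
        }

        String getString() {
            return current;
        }
    }

    // Fake backend: returns immediately and answers asynchronously.
    static LazyResult executeStatement(String sql) {
        return new LazyResult(CompletableFuture.supplyAsync(() -> {
            if (!sql.startsWith("select")) {
                throw new IllegalArgumentException("syntax error: " + sql);
            }
            return List.of("row for: " + sql);
        }));
    }
}
```

With this model, `executeStatement("selct oops")` returns normally and the `SQLException` only appears on the first `next()` call, which is exactly the change in semantics discussed above.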

From: davecramer (@davecramer) @ 2021-12-17 16:55 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Good catch, I hadn't thought about the batching stuff. It would certainly be non-standard. The question becomes who has the bandwidth to implement this?



From: ebrandsberg (@ebrandsberg) @ 2022-07-20 14:31 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

This would impact proxies as well potentially...

From: beikov (@beikov) @ 2026-02-11 22:58 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I'd be interested in implementing support for pipelining when using virtual threads.

The idea is that when a virtual thread executes a query, that thread blocks waiting for the result, while another virtual thread is allowed to execute another query in pipeline mode on the same connection.

Would you also be interested in such an implementation?
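
A toy model of that idea, under the assumption that replies come back in the order the queries were sent. This is not code from pgjdbc or from any prototype; `SharedPipeline` and its methods are hypothetical, and `deliverNextReply` stands in for the backend plus whatever reads the socket.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Toy model of several threads pipelining on one connection; not pgjdbc code.
final class SharedPipeline {
    private final Queue<String> wire = new ArrayDeque<>(); // queries "sent" to the backend
    private final Queue<CompletableFuture<String>> awaitingReply = new ArrayDeque<>();

    // Called by any thread: send the query and register who waits for the answer.
    // Both steps happen under one lock so that send order equals reply order.
    synchronized CompletableFuture<String> send(String sql) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        wire.add(sql);
        awaitingReply.add(reply);
        return reply;
    }

    // Stands in for the backend and the reader: answers the oldest outstanding query.
    synchronized boolean deliverNextReply() {
        String sql = wire.poll();
        if (sql == null) {
            return false;
        }
        awaitingReply.remove().complete("result of " + sql);
        return true;
    }

    // Blocking facade: the calling thread waits here while other threads may still send.
    String execute(String sql) {
        return send(sql).join();
    }
}
```

Nothing in this sketch depends on the kind of thread calling `execute`; on Java 21+ the callers could be tasks on `Executors.newVirtualThreadPerTaskExecutor()`, but platform threads would block in the same way.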

From: davecramer (@davecramer) @ 2026-02-12 10:43 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Yup, have at it. Look forward to seeing this.

From: beikov (@beikov) @ 2026-03-27 10:37 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I would appreciate if I could get some feedback on e.g. https://github.com/pgjdbc/pgjdbc/pull/4010. I also have a [different prototype](https://github.com/pgjdbc/pgjdbc/pull/4009), but there are some locking issues which are avoided by https://github.com/pgjdbc/pgjdbc/pull/4010

From: vlsi (@vlsi) @ 2026-03-27 11:19 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

@beikov , thank you for working on this, however, could you please clarify why you suggest pipeline mode for virtual threads only?
I guess the same approach should work with regular threads as well.

From: beikov (@beikov) @ 2026-03-27 11:25 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I could try to remodel it to also work with regular threads, but sharing a connection between different non-virtual threads didn't seem like a good idea to me, whereas with virtual threads it feels more natural.

Are there any good use cases for doing pipelining this way with platform threads? I would think that for non-virtual thread users it would be much more convenient to use the JDBC batch API for pipelining instead. Wdyt?

From: vlsi (@vlsi) @ 2026-03-27 15:00 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> connection between different non-virtual threads didn't seem like a good idea to me

> whereas with virtual threads it feels more natural

Why so?

I do not understand why sharing a connection between virtual threads should look any better than sharing connections between regular threads.

PS PostgreSQL backend semantics are "any failure kills the transaction". So, if a single SQL statement fails, all subsequent statements fail as well until commit/rollback. That means, if the virtual threads are unrelated, a single problematic query would kill the results for all the others.
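
The effect can be illustrated with a small toy model. It is not a driver or a backend: `AbortedTransactionDemo` is a hypothetical class, a statement containing `boom` stands in for any failing SQL, and the "threads" are just comments on successive calls against the same open transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "any failure aborts the transaction"; not a real driver or backend.
final class AbortedTransactionDemo {
    private boolean aborted;

    // Returns "ok" or an error text; after the first error everything fails until rollback().
    String run(String sql) {
        if (aborted) {
            return "ERROR: current transaction is aborted, commands ignored until end of transaction block";
        }
        if (sql.contains("boom")) { // stand-in for any failing statement
            aborted = true;
            return "ERROR: " + sql;
        }
        return "ok";
    }

    void rollback() {
        aborted = false;
    }

    static List<String> demo() {
        AbortedTransactionDemo tx = new AbortedTransactionDemo();
        List<String> out = new ArrayList<>();
        out.add(tx.run("select 1"));    // from thread A
        out.add(tx.run("select boom")); // from thread B, fails
        out.add(tx.run("select 2"));    // from thread C, fails although its SQL is fine
        tx.rollback();
        out.add(tx.run("select 3"));    // works again after rollback
        return out;
    }
}
```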

From: vlsi (@vlsi) @ 2026-03-27 15:01 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> JDBC batch API

It does not support select pipelining, it does not support pipelining queries of different shapes (e.g. insert / update pipelining).

From: beikov (@beikov) @ 2026-03-27 15:25 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> > connection between different non-virtual threads didn't seem like a good idea to me
> 
> > whereas with virtual threads it feels more natural
> 
> Why so?
> 
> I do not understand why sharing a connection between virtual threads should look any better than sharing connections between regular threads.
> 

In my mind, a connection is always used in a single-threaded manner. Since virtual threads would usually (always?) run on the same carrier thread, the whole thing is more or less single-threaded, which is why it seems more natural to me. Anyway, if you think that people will spawn platform threads or share the connection with them to achieve pipelining, then I'll test to see if that works as well.

> PS PostgreSQL backend semantics are "any failure kills the transaction". So, if a single SQL statement fails, all subsequent statements fail as well until commit/rollback. That means, if the virtual threads are unrelated, a single problematic query would kill the results for all the others.

I don't understand why virtual threads would be the problem here. A connection is shared, so any thread that uses that connection has to understand the implications. Either way, a connection can fail due to communication errors any time, so not sure what you're trying to get at here.

> > JDBC batch API
> 
> It does not support select pipelining, it does not support pipelining queries of different shapes (e.g. insert / update pipelining).

I was thinking about `java.sql.Statement#addBatch(String)`, but that doesn't support parameters, which is unfortunate.

I heard that the Oracle JDBC driver folks toyed around with splitting the SQL string passed to `prepareStatement` on semicolons and then pipelining the resulting statements. That supports parameters, but requires numbering them across the whole string.
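
To show what "numbering them across the whole string" could mean, here is a deliberately naive sketch. It is not how any real driver does it: `SemicolonSplit` is a hypothetical helper that splits on every `;` and counts every `?`, so it would mis-handle semicolons or question marks inside string literals or comments.

```java
import java.util.ArrayList;
import java.util.List;

// Naive illustration only: no SQL parsing, just text splitting and counting.
final class SemicolonSplit {
    // One statement plus the 1-based index of its first parameter in the whole string.
    static final class Part {
        final String sql;
        final int firstParam;
        final int paramCount;

        Part(String sql, int firstParam, int paramCount) {
            this.sql = sql;
            this.firstParam = firstParam;
            this.paramCount = paramCount;
        }
    }

    static List<Part> split(String multiSql) {
        List<Part> parts = new ArrayList<>();
        int next = 1; // parameter numbering continues across statements
        for (String piece : multiSql.split(";")) {
            String sql = piece.trim();
            if (sql.isEmpty()) {
                continue;
            }
            int count = 0;
            for (int i = 0; i < sql.length(); i++) {
                if (sql.charAt(i) == '?') {
                    count++;
                }
            }
            parts.add(new Part(sql, next, count));
            next += count;
        }
        return parts;
    }
}
```

For `insert into t(a, b) values (?, ?); update t set a = ? where b = ?`, the first statement owns parameters 1 and 2 and the second owns parameters 3 and 4, so a caller would bind all four on the single prepared statement.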

Anyway, my focus is on implementing support for pipelining such that it works with virtual threads, because that is the use case I want to work. I'll see if I can easily support platform threads and add that.

From: vlsi (@vlsi) @ 2026-03-27 16:55 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> then I'll test to see if that works as well.

I haven't noticed you using any virtual-thread-specific APIs, so it is unclear why you guard the feature with an `if (virtual)` check.

> Since virtual threads would usually (always?) run on the same carrier thread, the whole thing is more or less single-threaded

Imagine someone switching their Spring Boot app to virtual threads and saying something like "we don't need a connection pool, just pipeline everything through a single connection".

---

Do you know practical ways to test the change?
I wonder what the limitations are and what could theoretically be put in a changelog.

Would it support just `preparedStatement.execute` or all the APIs?

---

Historically, the implementation was not designed for full thread-safety, so mutable fields in `PgConnection` might play against concurrent use of the connection.
