[pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip

pgjdbc/pgjdbc GitHub issues and pull requests (mirror)  
help / color / mirror / Atom feed

[pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
32+ messages / 5 participants
[nested] [flat]

* [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2023-12-07 09:20 "vlsi (@vlsi)" <[email protected]>
  0 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2023-12-07 09:20 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

The original goal was to fix `TypeInfoCache`:

* many `Map`s for navigating between oids, names, and other information — hard to maintain, hard to invalidate
* no support for identifier quoting or `search_path` — name lookups like `= ANY(current_schemas(..)) order by oid limit 1` could pick the wrong oid

`getPgTypeByPgName` now resolves through `::regtype`, and `getPgTypeByOid` serves the well-known scalar OIDs from a static table. Once `PgType` carries the full metadata, Java↔PG conversion no longer needs to live inside
`PgResultSet` — that part grew into a new codec API.

### For users

* **`flushCacheOnDdl`** (new connection property, default `true`).
  The driver invalidates server-prepared statements on `CREATE` / `DROP` / `ALTER`, so the next execute transparently re-prepares — no more `cached plan must not change result type` after `ALTER TABLE` on a referenced relation. Set to `false` for the legacy behaviour. Already split out as #4067 against `master`.
* **Composite types round-trip end-to-end, including in binary.**
  `Connection.createStruct(name, attrs)` is implemented (master throws `notImplemented`); `rs.getObject(i, Struct.class)` works; nested composites and arrays of composites encode/decode in both text and binary wire formats.
* **`SQLData` read and write.**
  Implement `SQLData` on a POJO and pass it to `setObject` or pull it out via `getObject(i, MyClass.class)`; the driver wires it through `CompositeCodec` for both binary and text.
* **`ServiceLoader` for custom codecs.**
 Add a `Codec` implementation to `META-INF/services/org.postgresql.api.codec.Codec` and it's picked up by every `Connection` — useful for shipping codecs from a third-party jar without `Connection.addDataType` calls in every place that opens a connection.
* **Identifier quoting and `search_path` work end-to-end.**
  `getPgTypeByPgName("\"Composites\".\"ComplexCompositeTest\"")` resolves correctly; `ResultSetMetaData.getColumnTypeName` returns `"schema"."typname"` for off-path / shadowed types (was returning the bare typname, which broke legacy callers).
* **No extra round-trips for built-in types.**
 `SELECT 1::int4` is one round-trip; metadata for known OIDs is preloaded. `TypeCacheRoundTripTest` regression-tests this via `CountingSocketFactory`.
* **`getObject(int, Map<String, Class<?>>)` is implemented.**
  Master throws `notImplemented` for a non-empty typemap; here it works, routing composite columns through the `SQLData` class specified in the map and recursively applying the same map to nested composite fields (`PgStruct#getAttributes(map)`). Useful for selecting a column whose type maps to a different `SQLData` class than the connection-level default.

### For pgjdbc developers

A new public API `org.postgresql.api.codec`, marked `@Experimental`:

* `BinaryCodec` / `TextCodec` — pluggable per-OID encoders/decoders.
* `CodecContext` / `CodecRegistry` — connection-scoped lookup; overridable via `PgResultSet#getCodecContext`.
* Built-in codecs cover all PG scalars, `numeric`, the date/time family, `bytea`, `uuid`, JSON, geometric, range, enum, domain, plus `CompositeCodec` and `ArrayCodec` for the structured paths.

`PgResultSet.getObject`, `PgPreparedStatement.setObject` and the metadata queries route through codecs. The legacy `Connection.addDataType(name, PGobject subclass)` contract is preserved.

`TypeInfoCache` itself is Caffeine-backed with epoch-based invalidation (`typeCacheEpoch++` on `CREATE` / `DROP` / `ALTER` and on `SET search_path`).

`PgType` is an immutable value object; composite field lists load lazily.

### TODO

* [x] Invalidate type cache on `search_path` changes, create, drop statements
* [x] Make sure the new approach does not generate new roundtrips for known
  types like `int4`
* [x] Fix association of Java class ↔ oid
* [x] ~Move `PgType` to a top-level, and add subtypes (e.g. `PgArrayType` for use in `PgArray`)~ — the idea was not helpful, and `PgArrayType` was not added
* [x] Add consumer-level JDBC / Spring Framework JDBC tests: `org.postgresql.test.consumer.composite` package, `pgjdbc-spring-jdbc-test` gradle module
* [ ] Review commits since `docs: PGTYPE_SPEC overview`
* [ ] Exclude unused Caffeine classes
* [ ] Double-check `CallableStatements` is covered properly
* [ ] Double-check updateable `ResultSet` is covered properly
* [ ] Prepare "notable changes" section in the changelog
* [ ] Compare performance between the old and new code
* [ ] Enable `SQLData` to create structs/arrays for `SQLOutput.writeStruct(Struct)`. We should probably add `PGSQLOutput` extension interface that would expose `createStruct`/`createArray`
* [ ] Consider an extension api for creating anonymous row: `PGConnection.createStruct(PgType[], Object[])`
* [ ] Add tests (edge cases, property-based, mutation)
* [ ] Check whether the API integrates with `COPY ... FORMAT BINARY` (see [pgingester `main.rs`](https://github.com/jamessewell/pgingester/blob/e323bf2706b09d23f75ae689e29d3f5b8f4017b7/src/main.rs#...))
* [ ] Polish the API surface
* [ ] Add an API to parse and generate the PostgreSQL object notation

### Fixes
* https://github.com/pgjdbc/pgjdbc/issues/381
* https://github.com/pgjdbc/pgjdbc/issues/641
* https://github.com/pgjdbc/pgjdbc/issues/1225
* https://github.com/pgjdbc/pgjdbc/issues/3049
* https://github.com/pgjdbc/pgjdbc/issues/2029
* https://github.com/pgjdbc/pgjdbc/issues/2416
* https://github.com/pgjdbc/pgjdbc/issues/1443
* https://github.com/pgjdbc/pgjdbc/issues/1884

* https://github.com/pgjdbc/pgjdbc/pull/3396
* https://github.com/pgjdbc/pgjdbc/issues/1225

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2023-12-07 13:34 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2023-12-07 13:34 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc/src/main/java/org/postgresql/core/v3/QueryExecutorImpl.java:2504)

ends with ?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2024-04-11 09:13 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2024-04-11 09:13 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc/src/main/java/org/postgresql/core/v3/QueryExecutorImpl.java:2504)

Shouldn't this be `status.startsWith("DROP ")` ?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 06:58 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 06:58 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

@davecramer , @sehrope , @bokken , The PR seems to add full Struct / Array support, proper type resolution (name -> oid), and SQLData support.

It looks like this goes in a pretty good shape modulo remaining cleanups 🎉 🎉 🎉 

I still plan to review the changes after `docs: PGTYPE_SPEC overview` (those changes are "test failure fixes", "`@Nullable` fixes", etc).

The commits up to "docs: PGTYPE_SPEC overview" are coherently structured (I asked claude to restructure WIP commits), and the rest are fixups to be reviewed (and later squashed into the proper commits).

I plan to drop something like "hey, codex, the PR claims to support Struct/ Array / SQLData, so go ahead and generate user-level tests for it using JDBC / Hibernate / jOOQ / Spring Data ...". I'm not sure if it would identify meaningful issues, however, it could surface some.

I need to double-check `CallableStatements` as well (I haven't paid much attention to them).

I haven't paid attention to "structs/composites in updateable resultsets". It is yet another todo.

/cc @lukaseder. It would be nice to hear from you as well. Hopefully this should ease struct/array/binary handling with pgjdbc.

PS. To everybody: this is almost fully machine-generated code. I did review it, and I steered it. Sorry for issuing such a big change, however, I think it is ready for the general review. If you feel "oh, slop code detected", please ping me.

`PGTYPE_SPEC.md` is a part of the initial plan (a lot of iterations between claude and codex), and I intend to remove it.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 09:44 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2026-05-14 09:44 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I noticed all of the new files have a date of 2024?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 09:57 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2026-05-14 09:57 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

why does this require Caffeine

  The PR uses Caffeine as the backing cache for TypeInfoCache (OID→PgType and name→PgType lookups) and CodecRegistry
   (OID→Codec lookups). The rationale from the design spec:

  1. Size-based eviction — The type cache can grow unbounded if a connection encounters many user-defined types
  (composites, enums, domains). Caffeine provides LRU eviction without hand-rolling one.
  2. Concurrent access — Caffeine's Cache is thread-safe without external synchronization, which matters because
  TypeInfoCache is shared per-connection and the codec lookup happens on every getObject()/setObject() call.
  3. Avoids ConcurrentHashMap.computeIfAbsent deadlock — The spec explicitly calls out that computeIfAbsent can
  deadlock if the mapping function accesses the same map (which happens when resolving a composite type triggers
  resolution of its field types). Caffeine's get(key, loader) doesn't have this problem.

  Could it be done without Caffeine? Yes — a plain ConcurrentHashMap with the get() + putIfAbsent() pattern (which the
  spec already uses for fieldsByOid) would work. You'd lose automatic size-based eviction, but for a per-connection
  cache that gets cleared on epoch bump anyway, unbounded growth is unlikely to be a real problem in practice. The
  epoch-based invalidation already clears everything on DDL.

  The tradeoff is ~300KB added to the shaded JAR (Caffeine 2.9.3) for what is essentially a convenience over manual
  ConcurrentHashMap management. Whether that's worth it depends on how much you value the eviction policy vs. JAR
  size. For most server-side applications it's negligible; for Lambda cold starts or Android it's noticeable.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 10:00 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 10:00 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> Could it be done without Caffeine? Yes — a plain ConcurrentHashMap with the get() + putIfAbsent() pattern (which the spec already uses for fieldsByOid) would work

ConcurrentHashMap does not support eviction of unused entries, and we do require it.

> The tradeoff is ~300KB added to the shaded JAR

We can reduce the classes and include only those that are needed for pgjdbc. I'm sure it would be way less than 300KiB overhead.



^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 10:02 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2026-05-14 10:02 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

the rest of the review - some stuff that is obvious

5. ThreadLocal CodecDepth — The spec says "safe with virtual threads due to short lifespan and proper cleanup." This
  is partially true — virtual threads do inherit ThreadLocals, and if a codec decode is interrupted mid-recursion
  (e.g., by a timeout), the ThreadLocal won't be cleaned up. The clear() method exists but needs to be called in a
  finally block at every entry point. Worth verifying this is done consistently.
  6. TOCTOU in epoch-based invalidation. Between checking typeCacheEpoch and using the cached PgType, another thread
  could increment the epoch. The volatile on typeCacheEpoch provides visibility but not atomicity of the
  check-then-use. For a JDBC driver (typically single-threaded per connection), this is fine in practice, but the
  spec's ConcurrentHashMap usage suggests multi-threaded access was considered.

  Medium Priority

  7. PgStruct extends PGobject — This is pragmatic but creates a diamond: PgStruct is both a Struct and a PGobject.
  Code that does if (obj instanceof PGobject) will now match composite columns that previously returned a plain
  PGobject. The ordering in PgPreparedStatement.setObject (Struct before PGobject) is correct, but downstream code
  that pattern-matches on PGobject subtypes may behave differently.
  8. FallbackCodec for unknown types — OID 705 (unknown) gets a special short-circuit to getString() in
  PgResultSet.getObject(int). This is correct for backward compat, but the layering is leaky — the codec layer should
  ideally handle this internally rather than requiring a special case in the ResultSet.
  9. Range types: typelem is zero — The code acknowledges this (pg_range.rngsubtype is not loaded). Range element
  typing is deferred. This means getObject() on a range column returns bounds as unparsed strings. Acceptable for a
  first release but should be tracked.
  10. SET search_path detection relies on the CommandComplete message containing "SET". If a user does SELECT
  set_config('search_path', ...), the cache won't invalidate. This is a known limitation but worth documenting more
  prominently.
  11. Commit structure — The first 10 commits are well-structured and reviewable. Commits 11–26 are fixup commits that
  should be squashed into the appropriate earlier commits before merge (vlsi acknowledges this).

  Low Priority

  12. map.pg_type.boolean property name uses dots, which is unusual for pgjdbc properties (most use camelCase).
  Consider mapPgTypeBoolean for consistency (which the PGProperty enum name already uses).
  13. @Experimental annotation — Good practice, but the Javadoc should clarify what "experimental" means: will the API
  be removed? Will method signatures change? Will the package be relocated?
  14. No multirange support — Explicitly deferred, which is fine for v1.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 10:08 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2026-05-14 10:08 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I'm also curious if there is any performance regression?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 10:09 ` "davecramer (@davecramer)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: davecramer (@davecramer) @ 2026-05-14 10:09 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> > Could it be done without Caffeine? Yes — a plain ConcurrentHashMap with the get() + putIfAbsent() pattern (which the spec already uses for fieldsByOid) would work
> 
> ConcurrentHashMap does not support eviction of unused entries, and we do require it.
> 
> > The tradeoff is ~300KB added to the shaded JAR
> 
> We can reduce the classes and include only those that are needed for pgjdbc. I'm sure it would be way less than 300KiB overhead.

Excellent!

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 12:38 ` "sehrope (@sehrope)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: sehrope (@sehrope) @ 2026-05-14 12:38 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc/src/main/java/org/postgresql/core/Parser.java:1214)

Why is this parser changing as part of this cache stuff?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 12:41 ` "sehrope (@sehrope)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: sehrope (@sehrope) @ 2026-05-14 12:41 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc-spring-jdbc-test/build.gradle.kts:21)

Why are we adding spring-jdbc? We have a lot of the same stuff in our `TestUtil` helpers.

I'm not totally against adding it (*though less is more even for testing deps...*). But if we're going to do that, we should do it in its own PR knowing that it breaks back patching usage in tests for back branches unless we do it there as well.

If you can avoid using it it's preferable.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 12:44 ` "sehrope (@sehrope)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: sehrope (@sehrope) @ 2026-05-14 12:44 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc/src/main/java/org/postgresql/jdbc/PgStruct.java:175)

This is materially different than the old return value.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 12:47 ` "sehrope (@sehrope)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: sehrope (@sehrope) @ 2026-05-14 12:47 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc-spring-jdbc-test/src/test/java/org/postgresql/test/consumer/springjdbc/SpringJdbcCompositeConsumerTest.java:165)

These tests should do this before the test starts too. Because otherwise if it crashes once and some of these object exist, the `CREATE ...` will fail with object already exists and test will not run. It's not an issue with a clean env like CI. But it can be annoying working locally if you ever CTRL-C the testing process (e.g., when testing out that interrupt stuff and it hangs forever running tests).

Not really specific to this either. I just noticed it here. We have lots of other tests that do this and we should clean those up too.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 12:50 ` "sehrope (@sehrope)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: sehrope (@sehrope) @ 2026-05-14 12:50 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc-spring-jdbc-test/src/test/java/org/postgresql/test/consumer/springjdbc/SpringJdbcCompositeConsumerTest.java:134)

Why is everything named "`spring_`"? That's like naming the test objects "`junit_`".

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 13:21 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-14 13:21 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> /cc @lukaseder. It would be nice to hear from you as well. Hopefully this should ease struct/array/binary handling with pgjdbc.

Thanks for the ping. I think that 2 things would be useful for libraries on top of pgjdbc:

- An API to parse / generate the PostgreSQL object notation
- Implementation of `SQLInput`, `SQLOutput` as well as `SQLData` support

I can only speak on behalf of jOOQ, not other libraries / ORMs, but obviously this has been a bit of a pain to implement in jOOQ. It was done mainly because PostgreSQL is so popular with jOOQ users and so the cost / benefit ratio of implementing both parsers and generators supporting the object notation was worth it.

Now that jOOQ already has this, I guess that a pgjdbc implementation won't be of much immediate help as it'll take quite some time until all edge cases are handled correctly, and there might always be an edge case that jOOQ can handle but pgjdbc won't support. Besides, `SQLInput`, `SQLOutput`, and `SQLData` aren't JDBC's best APIs. They leave a lot of open questions for various edge cases, so hacks are necessary, or in case of Oracle, binding to ojdbc specific API that isn't available in JDBC directly. One of the worst flaws of `SQLData` is that neither `SQLInput` nor `SQLOutput` expose a JDBC `Connection` or some other context object allowing for creating / reading JDBC types, such as arrays. With ojdbc, `ThreadLocal` usage was inevitable from within `SQLData` implementations.

> I haven't paid attention to "structs/composites in updateable resultsets". It is yet another todo.

I doubt these are still being used a lot. Given that there's talk about deprecating `RowSet` (https://bugs.openjdk.org/browse/JDK-8382268), I wonder edge cases like structs in updatable resultsets should be a priority. `CallableStatement` might be a higher priority, though jOOQ generally doesn't use that API with PostgreSQL compared to just calling a stored function from a SQL query directly.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 13:36 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 13:36 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc/src/main/java/org/postgresql/core/Parser.java:1214)

I haven't analyzed it yet, however, the change is needed for calling a function that returns a struct (see commit message https://github.com/pgjdbc/pgjdbc/pull/3062/changes/90679b990e0464479f659a113ec5ded873350b54)

As far as I understand, the syntax of `select * from function(...)` destructures the struct and it comes out as independent columns. Frankly, if I execute a function that returns struct, I would expect a single column of type struct rather than a column per struct field.

We can hide this under a feature flag, however, I am leaning towards "single struct column" is a better behaviour.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 13:42 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 13:42 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc-spring-jdbc-test/build.gradle.kts:21)

>Why are we adding spring-jdbc? 

The idea is to test behavior from the users' point of view.
There might be corner-cases related to Spring's JDBC wrappers, and it makes sense testing them.

Spring itself does not have such tests: pgjdbc does not support structs/arrays, so Spring can't test them.

As time passes, we could donate the tests to Spring itself.

> it breaks back patching

I don't think we would ever back-patch the change.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 13:50 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 13:50 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on pgjdbc-spring-jdbc-test/src/test/java/org/postgresql/test/consumer/springjdbc/SpringJdbcCompositeConsumerTest.java:134)

A common prefix helps to identify a leftover table in the db, and it prevents sharing a table across tests.
Do you have a better idea for the prefix?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 14:07 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-14 14:07 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> I guess that a pgjdbc implementation won't be of much immediate help

Do you mean you generate and parse **binary** encodings for composites/arrays as well?

I'm quite sure binary should be significantly faster as it would avoid 2^N (?) quoting degradation when sending nested structs that include text fields with quotes. Well. Not being force to quote contents should be a noticeable win on its own.

> `SQLInput`, `SQLOutput`

They are implemented as well.

> SQLData is that neither SQLInput nor SQLOutput expose a JDBC Connection or some other context object allowing for creating / reading JDBC types, such as arrays

Will add as a TODO. It is indeed sad to have `SQLOutput.writeStruct(Struct)` without being able to instantiate one.

>I wonder edge cases like structs in updatable resultsets should be a priority

10 min of claude fixed the issue and added binary encoding as well (previously, updateable resultsets were text-only) 🤷 

> CallableStatement might be a higher priority

Callable seems to be fixed as well

> it'll take quite some time until all edge cases are handled correctly

Right you are. I wonder if you could try pgjdbc with jOOQ tests to see if there's a regression.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-14 20:19 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-14 20:19 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> Do you mean you generate and parse **binary** encodings for composites/arrays as well?
> 
> I'm quite sure binary should be significantly faster as it would avoid 2^N (?) quoting degradation when sending nested structs that include text fields with quotes. Well. Not being force to quote contents should be a noticeable win on its own

I've noticed the terrible performance characteristics of the text encoding:
https://www.postgresql.org/message-id/19063-c1ba39b11b9d43b7%40postgresql.org

Though it doesn't matter all that much in realistic scenarios. The quoting is annoying but manageable in a library.

How does one access binary encodings? E.g. `SELECT row(1, 2)::udt` will produce the text encoding. How would I tell PostgreSQL to prefer a binary encoding, instead?

> Will add as a TODO. It is indeed sad to have `SQLOutput.writeStruct(Struct)` without being able to instantiate one.

I've not used that method yet. Nested structs can be written as `SQLOutput.writeObject(SQLData)`. The problem I've had is when you need to write arrays

> > it'll take quite some time until all edge cases are handled correctly
> 
> Right you are. I wonder if you could try pgjdbc with jOOQ tests to see if there's a regression.

I could try, though I guess it'll be a bit of work. To do so, I'll just build your branch from github?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-15 08:41 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-15 08:41 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> How does one access binary encodings?

The driver needs to describe the query to know the resultset shape.
When the driver knows the column types, it can request binary encoding for the types it can decode.

We have two parts:
* Currently, the driver defers describe to 5th execution of a query (to be changed in https://github.com/pgjdbc/pgjdbc/issues/2886)
* The current PR adds a lot of codecs (including record, struct, array), so the record row or udt won't scare the driver, and it would request binary encoding from the backend

For the reference, I've added your `select row (row (row (row`  query to `NestedRecordBinaryTransferTest`, and it successfully executes and fetches rows nested to the depth of 32. The test succeeds with various PostgreSQL versions.
The limitation is that we can't use the trick with "simple query mode" (e.g. when running over a replication connection).

The PR limits for the nesting depth with 64 do avoid infinite recursion. It is a hard limit via `CodecDepth.MAX_DEPTH=64`. We could probably add a system property for it as well. It will never be a connection property for security reasons though.

>I could try, though I guess it'll be a bit of work. To do so, I'll just build your branch from github?

Right you are.

For example:
```sh
% ./gradlew -Ppgjdbc.version=42.7.12-pr3062 publishToMavenLocal

% tree ~/.m2/org/postgresql/postgresql/42.7.12-pr3062-SNAPSHOT
/Users/vlsi/.m2/org/postgresql/postgresql/42.7.12-pr3062-SNAPSHOT
├── maven-metadata-local.xml
├── postgresql-42.7.12-pr3062-SNAPSHOT-features.xml
├── postgresql-42.7.12-pr3062-SNAPSHOT-javadoc.jar
├── postgresql-42.7.12-pr3062-SNAPSHOT-jdbc-src.tar.gz
├── postgresql-42.7.12-pr3062-SNAPSHOT-sources.jar
├── postgresql-42.7.12-pr3062-SNAPSHOT.jar
└── postgresql-42.7.12-pr3062-SNAPSHOT.po

% ./gradlew -Ppgjdbc.version=42.7.12-pr3062 publishAllPublicationsToTmp-mavenRepository

% tree ./pgjdbc/build/local-maven-repo
./pgjdbc/build/local-maven-repo
└── org
    └── postgresql
        └── postgresql
            ├── 42.7.12-pr3062-SNAPSHOT
            │   ├── maven-metadata.xml
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1-features.xml
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1-javadoc.jar
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1-jdbc-src.tar.gz
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1-sources.jar
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1.jar
            │   ├── postgresql-42.7.12-pr3062-20260515.083450-1.pom
            ├── maven-metadata.xml
```




^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 07:24 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-19 07:24 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> The driver needs to describe the query to know the resultset shape.
> When the driver knows the column types, it can request binary encoding for the types it can decode.

Ah, I get it now. That's really interesting, then! Might speed up a few queries for jOOQ users, if we switched to the `SQLData` API, when fetching heavily nested UDTs. For dynamic types (`ROW()`, `MULTISET()`), jOOQ defaults to `JSONB` serialisation, so I wouldn't expect massive performance gains, but for UDTs, definitely.

I'll try to work with your branch, later this week. If your branch gets merged, we can have an experimental, opt-in implementation in jOOQ that users could activate with a switch. I'd expect that this could greatly help increase test coverage for you as well.

Just so I understand, `BINARY_TRANSFER_ENABLE` is only affecting binary transfer (i.e. pgjdbc looking up type structures). Even if this flag isn't set, `SQLData` can be used for all types, with more overhead?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 07:56 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-19 07:56 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> I'll try to work with your branch

I just noticed the array support is incomplete (yet it somehow works). Arrays still rely on old ArrayEncoding/ArrayDecoding. I am going to harmonize it.

>Even if this flag isn't set, SQLData can be used for all types, with more overhead?

RECORD is not in the driver's SUPPORTED_BINARY_OIDS default set yet, so the test must opt in via binaryTransferEnable=RECORD to exercise the binary path;

I guess one more TODO for the PR is to enable binary transfer by default for all the types pgjdbc can decode.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 11:02 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-19 11:02 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

What I meant was to ask whether `SQLData` transparently decodes *both* binary and text encodings, or if clients have to be aware of the `binaryTransferEnable` flag as well as `SUPPORTED_BINARY_OIDS` for specific types, when using `SQLData`.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 11:14 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-19 11:14 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> What I meant was to ask whether SQLData transparently decodes both binary and text encodings

SQLData will transparently encode and decode both.
It would prefer binary encodings by default unless user explicitly configures "disable binary encoding for type ABC".
Normally, users should not configure `binaryTransferDisable`. The property is there for workarounds only.

---

For "simple connection protocol" e.g. replication connection, binary transfer is not available, so SQLData would resort to text encodings. That is an edge case though.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 15:19 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-19 15:19 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Should I report findings here? E.g.:

```
org.postgresql.util.PSQLException: readArray() not implemented
	at org.postgresql.jdbc.PgSQLInput.readArray(PgSQLInput.java:322)
```

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 15:33 ` "vlsi (@vlsi)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: vlsi (@vlsi) @ 2026-05-19 15:33 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

>Should I report findings here? E.g.:

Yes, please. Feel free to ping me on Telegram / Twitter (vladimirsitnikv) if that works for you better.

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 15:35 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-19 15:35 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

No, here works better, though I wondered if I should create issues, so things can be tracked. Probably not on here, but perhaps on your own account? https://github.com/vlsi/pgjdbc/issues

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-19 16:55 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-19 16:55 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

So, the issues I've found so far are:

- Lack of `Array` support
- Lack of `CallableStatement` support
- No support for qualified or quoted identifiers in `Map<String, Class<?>>` arguments (e.g. `ResultSet.getObject(int, Map)`). Qualification support is necessary, quoted identifiers probably optional?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-20 09:19 ` "lukaseder (@lukaseder)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: lukaseder (@lukaseder) @ 2026-05-20 09:19 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Regarding the lack of array support, to be clear, both arrays contained in UDTs as well as UDT arrays seem to be unsupported.

Specifically, I also tried `select array[row(1, 2)]` and the binary encoding did not apply, possibly in parts because `_RECORD` (oid = 2287) was not registered for binary transfer?

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip
@ 2026-05-21 12:55 ` "mizhka (@mizhka)" <[email protected]>
  30 siblings, 0 replies; 32+ messages in thread

From: mizhka (@mizhka) @ 2026-05-21 12:55 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

(on build-logic/jvm/src/main/kotlin/build-logic.test-base.gradle.kts:43)

Any reason to keep path "$HOME/Documents" in test files? ;)

^ permalink  raw  reply  [nested|flat] 32+ messages in thread

end of thread, other threads:[~2026-05-21 12:55 UTC | newest]

Thread overview: 32+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2023-12-07 09:20 [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip "vlsi (@vlsi)" <[email protected]>
2023-12-07 13:34 ` "davecramer (@davecramer)" <[email protected]>
2024-04-11 09:13 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 06:58 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 09:44 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 09:57 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 10:00 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 10:02 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 10:08 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 10:09 ` "davecramer (@davecramer)" <[email protected]>
2026-05-14 12:38 ` "sehrope (@sehrope)" <[email protected]>
2026-05-14 12:41 ` "sehrope (@sehrope)" <[email protected]>
2026-05-14 12:44 ` "sehrope (@sehrope)" <[email protected]>
2026-05-14 12:47 ` "sehrope (@sehrope)" <[email protected]>
2026-05-14 12:50 ` "sehrope (@sehrope)" <[email protected]>
2026-05-14 13:21 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-14 13:36 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 13:42 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 13:50 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 14:07 ` "vlsi (@vlsi)" <[email protected]>
2026-05-14 20:19 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-15 08:41 ` "vlsi (@vlsi)" <[email protected]>
2026-05-19 07:24 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-19 07:56 ` "vlsi (@vlsi)" <[email protected]>
2026-05-19 11:02 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-19 11:14 ` "vlsi (@vlsi)" <[email protected]>
2026-05-19 15:19 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-19 15:33 ` "vlsi (@vlsi)" <[email protected]>
2026-05-19 15:35 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-19 16:55 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-20 09:19 ` "lukaseder (@lukaseder)" <[email protected]>
2026-05-21 12:55 ` "mizhka (@mizhka)" <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox