Message-ID: From: "vlsi (@vlsi)" To: "pgjdbc/pgjdbc" Date: Thu, 07 Dec 2023 09:20:43 +0000 Subject: [pgjdbc/pgjdbc] PR #3062: feat: type cache rework, codec API, and composite-type round-trip List-Id: X-GitHub-Additions: 42198 X-GitHub-Author-Id: 213894 X-GitHub-Author-Login: vlsi X-GitHub-Base: master X-GitHub-Changed-Files: 207 X-GitHub-Commits: 37 X-GitHub-Deletions: 2956 X-GitHub-Draft: true X-GitHub-Head-Branch: typecache X-GitHub-Head-SHA: ecb3977fa1259c163d87c5f0a276efdad3f30e99 X-GitHub-Issue: 3062 X-GitHub-Merge-SHA: 3985ca0e32e9968f73ae87b91516e5b5b2b9ddfd X-GitHub-Repo: pgjdbc/pgjdbc X-GitHub-State: open X-GitHub-Type: pull_request X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/pull/3062 Content-Type: text/plain; charset=utf-8 The original goal was to fix `TypeInfoCache`: * many `Map`s for navigating between oids, names, and other information — hard to maintain, hard to invalidate * no support for identifier quoting or `search_path` — name lookups like `= ANY(current_schemas(..)) order by oid limit 1` could pick the wrong oid `getPgTypeByPgName` now resolves through `::regtype`, and `getPgTypeByOid` serves the well-known scalar OIDs from a static table. Once `PgType` carries the full metadata, Java↔PG conversion no longer needs to live inside `PgResultSet` — that part grew into a new codec API. ### For users * **`flushCacheOnDdl`** (new connection property, default `true`). The driver invalidates server-prepared statements on `CREATE` / `DROP` / `ALTER`, so the next execute transparently re-prepares — no more `cached plan must not change result type` after `ALTER TABLE` on a referenced relation. Set to `false` for the legacy behaviour. Already split out as #4067 against `master`. * **Composite types round-trip end-to-end, including in binary.** `Connection.createStruct(name, attrs)` is implemented (master throws `notImplemented`); `rs.getObject(i, Struct.class)` works; nested composites and arrays of composites encode/decode in both text and binary wire formats. * **`SQLData` read and write.** Implement `SQLData` on a POJO and pass it to `setObject` or pull it out via `getObject(i, MyClass.class)`; the driver wires it through `CompositeCodec` for both binary and text. * **`ServiceLoader` for custom codecs.** Add a `Codec` implementation to `META-INF/services/org.postgresql.api.codec.Codec` and it's picked up by every `Connection` — useful for shipping codecs from a third-party jar without `Connection.addDataType` calls in every place that opens a connection. * **Identifier quoting and `search_path` work end-to-end.** `getPgTypeByPgName("\"Composites\".\"ComplexCompositeTest\"")` resolves correctly; `ResultSetMetaData.getColumnTypeName` returns `"schema"."typname"` for off-path / shadowed types (was returning the bare typname, which broke legacy callers). * **No extra round-trips for built-in types.** `SELECT 1::int4` is one round-trip; metadata for known OIDs is preloaded. `TypeCacheRoundTripTest` regression-tests this via `CountingSocketFactory`. * **`getObject(int, Map>)` is implemented.** Master throws `notImplemented` for a non-empty typemap; here it works, routing composite columns through the `SQLData` class specified in the map and recursively applying the same map to nested composite fields (`PgStruct#getAttributes(map)`). Useful for selecting a column whose type maps to a different `SQLData` class than the connection-level default. ### For pgjdbc developers A new public API `org.postgresql.api.codec`, marked `@Experimental`: * `BinaryCodec` / `TextCodec` — pluggable per-OID encoders/decoders. * `CodecContext` / `CodecRegistry` — connection-scoped lookup; overridable via `PgResultSet#getCodecContext`. * Built-in codecs cover all PG scalars, `numeric`, the date/time family, `bytea`, `uuid`, JSON, geometric, range, enum, domain, plus `CompositeCodec` and `ArrayCodec` for the structured paths. `PgResultSet.getObject`, `PgPreparedStatement.setObject` and the metadata queries route through codecs. The legacy `Connection.addDataType(name, PGobject subclass)` contract is preserved. `TypeInfoCache` itself is Caffeine-backed with epoch-based invalidation (`typeCacheEpoch++` on `CREATE` / `DROP` / `ALTER` and on `SET search_path`). `PgType` is an immutable value object; composite field lists load lazily. ### TODO * [x] Invalidate type cache on `search_path` changes, create, drop statements * [x] Make sure the new approach does not generate new roundtrips for known types like `int4` * [x] Fix association of Java class ↔ oid * [x] ~Move `PgType` to a top-level, and add subtypes (e.g. `PgArrayType` for use in `PgArray`)~ — the idea was not helpful, and `PgArrayType` was not added * [x] Add consumer-level JDBC / Spring Framework JDBC tests: `org.postgresql.test.consumer.composite` package, `pgjdbc-spring-jdbc-test` gradle module * [ ] Review commits since `docs: PGTYPE_SPEC overview` * [ ] Exclude unused Caffeine classes * [ ] Double-check `CallableStatements` is covered properly * [ ] Double-check updateable `ResultSet` is covered properly * [ ] Prepare "notable changes" section in the changelog * [ ] Compare performance between the old and new code * [ ] Enable `SQLData` to create structs/arrays for `SQLOutput.writeStruct(Struct)`. We should probably add `PGSQLOutput` extension interface that would expose `createStruct`/`createArray` * [ ] Consider an extension api for creating anonymous row: `PGConnection.createStruct(PgType[], Object[])` * [ ] Add tests (edge cases, property-based, mutation) * [ ] Check whether the API integrates with `COPY ... FORMAT BINARY` (see [pgingester `main.rs`](https://github.com/jamessewell/pgingester/blob/e323bf2706b09d23f75ae689e29d3f5b8f4017b7/src/main.rs#L435-L446)) * [ ] Polish the API surface * [ ] Add an API to parse and generate the PostgreSQL object notation ### Fixes * https://github.com/pgjdbc/pgjdbc/issues/381 * https://github.com/pgjdbc/pgjdbc/issues/641 * https://github.com/pgjdbc/pgjdbc/issues/1225 * https://github.com/pgjdbc/pgjdbc/issues/3049 * https://github.com/pgjdbc/pgjdbc/issues/2029 * https://github.com/pgjdbc/pgjdbc/issues/2416 * https://github.com/pgjdbc/pgjdbc/issues/1443 * https://github.com/pgjdbc/pgjdbc/issues/1884 * https://github.com/pgjdbc/pgjdbc/pull/3396 * https://github.com/pgjdbc/pgjdbc/issues/1225