pgjdbc/pgjdbc GitHub issues and pull requests (mirror)  
help / color / mirror / Atom feed
[pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
8+ messages / 3 participants
[nested] [flat]

* [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-05 03:24  "jgardn3r (@jgardn3r)" <[email protected]>
  0 siblings, 0 replies; 8+ messages in thread

From: jgardn3r (@jgardn3r) @ 2025-08-05 03:24 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Some places where `FileInputStream` was being used didn't have `BufferedInputStream`. In many of the use cases, these files were being read 1 byte at a time. This causes excessive IO calls to read the file. By ensuring the `InputStream` is buffered, the IO calls are minimised, improving the performance of the driver.

This performance issue was noticed when a certificate file was on a network drive. It resulted in considerable network traffic.

### All Submissions:

* [x] Have you followed the guidelines in our [Contributing](https://github.com/pgjdbc/pgjdbc/blob/master/CONTRIBUTING.md) document?
* [x] Have you checked to ensure there aren't other open [Pull Requests](../../pulls) for the same update/change?



^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-05 07:31  "vlsi (@vlsi)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: vlsi (@vlsi) @ 2025-08-05 07:31 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

For reference, it looks like `Files.newInputStream` could be the way to go (it does not seem to require `BufferedInputStream`), however, it does not seem to be good enough for Java 8.

See:
* https://github.com/jetty/jetty.project/issues/3840
* https://bugs.openjdk.org/browse/JDK-8227080 (fs) Files.newInputStream(...).skip(n) is slow. It is fixed in Java 14 and back-ported to Java 11.

At the same time, `Files.newInputStream` says "the stream will not be buffered", so it would require buffering anyways. 

^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-05 10:01  "davecramer (@davecramer)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: davecramer (@davecramer) @ 2025-08-05 10:01 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

Interesting. Thanks for the PR

^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-06 15:37  "vlsi (@vlsi)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: vlsi (@vlsi) @ 2025-08-06 15:37 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

I've an idea how we could prevent similar issues in the future:

1) Create `org.postgresql.util.internal.IoUtil` class with `BufferedInputStream newBufferedInputStream(File)` method
2) Make sure we use the method whenever we need `new FileInputStream(...)`
3) Add `java.io.FileInputStream#FileInputStream(java.io.File)` and `FileInputStream(String name)` constructors to `forbidden-api.txt`:https://github.com/pgjdbc/pgjdbc/blob/de6023d9b90c0f5d66ed4efb9169cda56b352601/config/forbidden-apis...
4) Exclude `org.postgresql.util.internal.IoUtil` from forbidden-apis check: https://github.com/pgjdbc/pgjdbc/blob/de6023d9b90c0f5d66ed4efb9169cda56b352601/build-logic/verificat...

Then the build would prevent `new FileInputStream` entering to the source code.

WDYT?

^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-06 16:26  "vlsi (@vlsi)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: vlsi (@vlsi) @ 2025-08-06 16:26 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

@jgardn3r , would you be open to update the PR as per https://github.com/pgjdbc/pgjdbc/pull/3750#issuecomment-3160666440 ?

^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-07 02:13  "jgardn3r (@jgardn3r)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: jgardn3r (@jgardn3r) @ 2025-08-07 02:13 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

> @jgardn3r , would you be open to update the PR as per [#3750 (comment)](https://github.com/pgjdbc/pgjdbc/pull/3750#issuecomment-3160666440) ?

Happy to make the changes.

A few things that I have thought while implementing these changes:
* I'm not a huge fan of the signature `IoUtils.newBufferedInputStream(File)` (because, of course, naming is hard)
   * I think if I were naming it, I would call it `FileUtils.newInputStream(File)` similar to that of the `Files.newInputStream()`
   The reason being: if you are always going to expect file streams to be buffered, then there is no real point to put it in the name. If you want a file stream, it will be buffered, whether you like it or not.
   * There are many uses where the argument is a `String`, so I would add an overload for that
* The exclusions list references an imaginary `Unsafe.class` file. Should I replace that with the new file?
* Should test code be excluded or updated to use the new util?

What do you guys think?

^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-07 05:38  "vlsi (@vlsi)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: vlsi (@vlsi) @ 2025-08-07 05:38 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

`FileUtils` is indeed a better class name.

> The reason being: if you are always going to expect file streams to be buffered, then there is no real point to put it in the name. If you want a file stream, it will be buffered, whether you like it or not

Well, `Files.newBufferedReader` does have `Buffered` in the name. I just thought having `.newBufferedInputStream` would would convey the intent.

>The exclusions list references an imaginary `Unsafe.class` file. Should I replace that with the new file?

Let's replace it indeed.

> Should test code be excluded or updated to use the new util?

AFAIK there are two instances, and they are worth updating:

> `new StrangeInputStream(seed, new FileInputStream("target/buffer.txt"));`

It is fine to convert to a new util method. It looks like a true case when the necessary buffering was missing.

> `new BufferedReader(new InputStreamReader(new FileInputStream(path)));`

It could rather be `Files.newBufferedReader(...)` instead


^ permalink  raw  reply  [nested|flat] 8+ messages in thread

* Re: [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream`
@ 2025-08-07 05:57  "vlsi (@vlsi)" <[email protected]>
  6 siblings, 0 replies; 8+ messages in thread

From: vlsi (@vlsi) @ 2025-08-07 05:57 UTC (permalink / raw)
  To: pgjdbc/pgjdbc <[email protected]>

For reference, I've filed https://github.com/openrewrite/rewrite-static-analysis/issues/694 so openrewrite could automatically replace `new BufferedReader(new InputStreamReader(new FileInputStream(path)))` with `Files.newBufferedReader` in the future.

^ permalink  raw  reply  [nested|flat] 8+ messages in thread


end of thread, other threads:[~2025-08-07 05:57 UTC | newest]

Thread overview: 8+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-08-05 03:24 [pgjdbc/pgjdbc] PR #3750: Use `BufferedInputStream` with `FileInputStream` "jgardn3r (@jgardn3r)" <[email protected]>
2025-08-05 07:31 ` "vlsi (@vlsi)" <[email protected]>
2025-08-05 10:01 ` "davecramer (@davecramer)" <[email protected]>
2025-08-06 15:37 ` "vlsi (@vlsi)" <[email protected]>
2025-08-06 16:26 ` "vlsi (@vlsi)" <[email protected]>
2025-08-07 02:13 ` "jgardn3r (@jgardn3r)" <[email protected]>
2025-08-07 05:38 ` "vlsi (@vlsi)" <[email protected]>
2025-08-07 05:57 ` "vlsi (@vlsi)" <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox