Message-ID: <gh-pgjdbc-pgjdbc-2963-c1818909689@github.com>
From: "zkorhone (@zkorhone)" <noreply+zkorhone@github.com>
To: "pgjdbc/pgjdbc" <noreply+pgjdbc-pgjdbc@github.com>
Date: Mon, 20 Nov 2023 11:52:19 +0000
Subject: Re: [pgjdbc/pgjdbc] issue #2963: BufferedOutputStream in PGStream should be replaced to something more efficient
In-Reply-To: <gh-pgjdbc-pgjdbc-2963@github.com>
References: <gh-pgjdbc-pgjdbc-2963@github.com>
List-Id: <gh-pgjdbc-pgjdbc.github.com>
X-GitHub-Author-Login: zkorhone
X-GitHub-Comment-Id: 1818909689
X-GitHub-Comment-Type: issue_comment
X-GitHub-Edited-At: 2023-11-20T14:37:20Z
X-GitHub-Issue: 2963
X-GitHub-Repo: pgjdbc/pgjdbc
X-GitHub-Type: comment
X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/issues/2963#issuecomment-1818909689
Content-Type: text/plain; charset=utf-8

I ran some microbenchmarks for you:

[No threads] OutputStream: 150.25ms [ -56 ]
[No threads] BufferedOutputStream(OutputStream): 155.564ms [ -56 ]
[No threads] FilteringOutputStream(BufferedOutputStream(OutputStream)): 3919.915ms [ -56 ]
[Threads] OutputStream: 22.315ms [ -56 ]
[Threads] BufferedOutputStream(OutputStream): 21.27ms [ -56 ]
[Threads] FilteringOutputStream(BufferedOutputStream(OutputStream)): 626.437ms [ -56 ]
[Threads+Pool] OutputStream: 21.532ms [ -56 ]
[Threads+Pool] BufferedOutputStream(OutputStream): 27.802ms [ -56 ]
[Threads+Pool] FilteringOutputStream(BufferedOutputStream(OutputStream)): 613.602ms [ -56 ]

In the results above:
* No threads - the test ran on a single thread, without a thread pool
* Threads - a thread pool sized to half the number of available cores was used
* Threads+Pool - the same thread pool was used, together with resource pooling
* OutputStream - data is written directly to target OutputStream
* BufferedOutputStream(OutputStream) - data is written via BufferedOutputStream to target OutputStream
* FilteringOutputStream(BufferedOutputStream(OutputStream)) - data is written via FilteringOutputStream to BufferedOutputStream and finally to target OutputStream
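
For illustration only, the "Threads" setup described above could be sketched as follows. This is not the attached benchmark: the class name `HalfCorePoolSketch`, the method names, and the in-memory `ByteArrayOutputStream` target are all assumptions made for the example, and it does no timing.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not taken from microbenchmark.java.txt.
class HalfCorePoolSketch {
    // Pool size: half of the available cores, but never less than 1.
    static int poolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
    }

    // Runs `tasks` jobs on the pool. Each job writes bytesPerTask bytes through
    // a BufferedOutputStream into its own in-memory target.
    // Returns the total number of bytes that reached the targets.
    static long runThreaded(int tasks, int bytesPerTask) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize());
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                results.add(pool.submit(() -> {
                    ByteArrayOutputStream target = new ByteArrayOutputStream();
                    try (OutputStream out = new BufferedOutputStream(target)) {
                        for (int i = 0; i < bytesPerTask; i++) {
                            out.write(i);
                        }
                    }
                    // close() has flushed the buffer, so size() is final here.
                    return target.size();
                }));
            }
            long total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

On a 16-core machine `poolSize()` would give 8, which matches the thread count mentioned for the threaded tests.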

Note: Resource pooling is my guess at how the postgres driver behaves when connection pooling is used. I did this because, in theory, resource pooling could affect how HotSpot optimizes locks (lock elision). There's really no way to guarantee that my simulation is correct.

The execution time in the results is the total execution time for the test. I have 16 CPU cores, which means 8 threads are used for the threaded tests. This explains why the threaded tests take roughly 1/6 to 1/7 of the single-threaded time, close to the ideal 1/8.

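Based on these results (FilteringOutputStream makes the writes roughly 22x to 29x slower than BufferedOutputStream alone), I'd suggest replacing FilteringOutputStream with a custom OutputStream.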
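
A minimal sketch of what such a custom stream could look like, assuming the overhead comes from bulk writes being forwarded one byte at a time (which is what `java.io.FilterOutputStream.write(byte[], int, int)` does by default). The names `DirectForwardingOutputStream`, `CountingSink` and `countSinkCalls` are hypothetical and not part of pgjdbc or the attached benchmark.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper: forwards bulk writes in a single call.
class DirectForwardingOutputStream extends OutputStream {
    private final OutputStream dst;

    DirectForwardingOutputStream(OutputStream dst) {
        this.dst = dst;
    }

    @Override
    public void write(int b) throws IOException {
        dst.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Hand the whole range to the target instead of looping over write(int).
        dst.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        dst.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            dst.close();
        }
    }

    // Demo sink that only counts how many write calls reach it.
    static class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Returns {calls via FilterOutputStream, calls via DirectForwardingOutputStream}
    // for one bulk write of payloadSize bytes.
    static int[] countSinkCalls(int payloadSize) {
        byte[] payload = new byte[payloadSize];
        try {
            CountingSink viaFilter = new CountingSink();
            new FilterOutputStream(viaFilter).write(payload, 0, payload.length);
            CountingSink viaDirect = new CountingSink();
            new DirectForwardingOutputStream(viaDirect).write(payload, 0, payload.length);
            return new int[] {viaFilter.calls, viaDirect.calls};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 1024-byte payload, the plain `FilterOutputStream` reaches the sink with 1024 single-byte calls, while the forwarding wrapper reaches it with one bulk call.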

I also tried a custom version of BufferedOutputStream that doesn't use locks. There are some gains (< 1%), but I wouldn't say they are significant enough to warrant a custom implementation.

[microbenchmark.java.txt](https://github.com/pgjdbc/pgjdbc/files/13413623/microbenchmark.java.txt)

```java
    static class NoLockBufferedOutputStream extends OutputStream {
        private final OutputStream dst;
        private final byte[] buffer;
        private int length;

        public NoLockBufferedOutputStream(OutputStream dst) {
            this.dst = dst;
            this.buffer = new byte[8192];
            this.length = 0;
        }

        @Override
        public void write(int b) throws IOException {
            if (length == buffer.length) {
                flushBuffer();
            }
            buffer[length++] = (byte)b;
            if (length >= buffer.length) {
                flushBuffer();
            }
        }

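        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            int capacityAfter = buffer.length - length - len;
            if (capacityAfter < 0) {
                // Doesn't fit: top up the buffer and flush it first.
                int toCopy = buffer.length - length;
                appendToBuffer(b, off, toCopy);
                flushBuffer();

                off += toCopy;
                len -= toCopy;

                if (len >= buffer.length) {
                    // Remainder is at least a full buffer: write it straight through.
                    dst.write(b, off, len);
                } else {
                    // Remainder fits: keep it buffered until the next flush.
                    appendToBuffer(b, off, len);
                }
            } else {
                // Fits entirely (may fill the buffer exactly; write(int) handles that case).
                appendToBuffer(b, off, len);
            }
        }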

        private void appendToBuffer(byte[] src, int off, int toCopy) {
            System.arraycopy(src, off, buffer, length, toCopy);
            length += toCopy;
        }

        private void flushBuffer() throws IOException {
            try {
                dst.write(buffer, 0, length);
            } finally {
                length = 0;
            }
        }

        @Override
        public void flush() throws IOException {
            flushBuffer();
            dst.flush();
        }

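        @Override
        public void close() throws IOException {
            // Flush buffered bytes first so nothing is lost, then always close dst.
            try {
                flush();
            } finally {
                dst.close();
            }
        }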

    }
```