public inbox for [email protected]  
PostgreSQL on S3-backed Block Storage with Near-Local Performance
6+ messages / 5 participants

* PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-17 22:57  Pierre Barre <[email protected]>
  0 siblings, 3 replies; 6+ messages in thread

From: Pierre Barre @ 2025-07-17 22:57 UTC (permalink / raw)
  To: [email protected]

Hi everyone,

I wanted to share a project I've been working on that enables PostgreSQL to run on S3 storage while maintaining performance comparable to local NVMe. The approach uses block-level access rather than trying to map filesystem operations to S3 objects.

ZeroFS: https://github.com/Barre/ZeroFS

# The Architecture

ZeroFS provides NBD (Network Block Device) servers that expose S3 storage as raw block devices. PostgreSQL runs unmodified on ZFS pools built on these block devices:

PostgreSQL -> ZFS -> NBD -> ZeroFS -> S3

By providing block-level access and leveraging ZFS's caching (ARC and L2ARC), cached reads can complete with microsecond-scale latency even though the underlying storage is in S3.

## Performance Results

Here are pgbench results from PostgreSQL running on this setup:

### Read/Write Workload

```
postgres@ubuntu-16gb-fsn1-1:/root$ pgbench -c 50 -j 15 -t 100000 example
pgbench (16.9 (Ubuntu 16.9-0ubuntu0.24.04.1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 50
number of threads: 15
maximum number of tries: 1
number of transactions per client: 100000
number of transactions actually processed: 5000000/5000000
number of failed transactions: 0 (0.000%)
latency average = 0.943 ms
initial connection time = 48.043 ms
tps = 53041.006947 (without initial connection time)
```

### Read-Only Workload

```
postgres@ubuntu-16gb-fsn1-1:/root$ pgbench -c 50 -j 15 -t 100000 -S example
pgbench (16.9 (Ubuntu 16.9-0ubuntu0.24.04.1))
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 50
query mode: simple
number of clients: 50
number of threads: 15
maximum number of tries: 1
number of transactions per client: 100000
number of transactions actually processed: 5000000/5000000
number of failed transactions: 0 (0.000%)
latency average = 0.121 ms
initial connection time = 53.358 ms
tps = 413436.248089 (without initial connection time)
```

These numbers were measured with 50 concurrent clients, with the data itself stored in S3. Hot data is served from ZFS ARC/L2ARC and ZeroFS's memory caches, while cold data comes from S3.
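
A quick cross-check of the two runs above: when every client is kept busy, average latency is roughly the number of clients divided by throughput. The Python sketch below applies that relationship to the reported figures. The function name is illustrative, and pgbench's own calculation may differ in small details such as how connection time is counted.

```python
def avg_latency_ms(clients: int, tps: float) -> float:
    """Average per-transaction latency implied by a given throughput,
    assuming all clients are continuously issuing transactions."""
    return clients / tps * 1000.0


# Figures copied from the pgbench output above.
read_write_ms = avg_latency_ms(50, 53041.006947)
read_only_ms = avg_latency_ms(50, 413436.248089)

print(f"read/write: {read_write_ms:.3f} ms")
print(f"read-only:  {read_only_ms:.3f} ms")
```

Both values agree with the "latency average" lines pgbench printed, so the throughput and latency figures are at least internally consistent.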

## How It Works

1. ZeroFS exposes NBD devices (e.g., /dev/nbd0) that PostgreSQL/ZFS can use like any other block device
2. Multiple cache layers hide S3 latency:
   a. ZFS ARC/L2ARC for frequently accessed blocks
   b. ZeroFS memory cache for metadata and hot data
   c. Optional local disk cache
3. All data is encrypted (ChaCha20-Poly1305) before hitting S3
4. Files are split into 128 KB chunks for insertion into ZeroFS's LSM-tree
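
To illustrate item 4, here is a minimal Python sketch of how a byte range could be mapped onto fixed-size 128 KB chunks. The function name and the tuple layout are assumptions made for illustration, and chunks are assumed to start at offset 0; the post does not describe how ZeroFS actually keys chunks in its LSM-tree.

```python
CHUNK_SIZE = 128 * 1024  # 128 KB, the chunk size stated in the post


def chunks_for_range(offset: int, length: int, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_index, start_within_chunk, byte_count) for each chunk
    touched by the byte range [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    end = offset + length
    pos = offset
    while pos < end:
        index, start = divmod(pos, chunk_size)
        # Stop at the chunk boundary or at the end of the request.
        count = min(chunk_size - start, end - pos)
        yield index, start, count
        pos += count


# An aligned 8 KB read touches a single chunk.
print(list(chunks_for_range(0, 8192)))
# An unaligned 8 KB read that straddles the first chunk boundary touches two.
print(list(chunks_for_range(CHUNK_SIZE - 4096, 8192)))
```

With PostgreSQL's default 8 KB page size, 16 pages fit in one chunk, so under these assumptions an aligned single-page read never crosses a chunk boundary.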

## Geo-Distributed PostgreSQL

Since each region can run its own ZeroFS instance, you can create geographically distributed PostgreSQL setups.

Example architectures:

Architecture 1:


                         PostgreSQL Client
                                   |
                                   | SQL queries
                                   |
                            +--------------+
                            |  PG Proxy    |
                            | (HAProxy/    |
                            |  PgBouncer)  |
                            +--------------+
                               /        \
                              /          \
                   Synchronous            Synchronous
                   Replication            Replication
                            /              \
                           /                \
              +---------------+        +---------------+
              | PostgreSQL 1  |        | PostgreSQL 2  |
              | (Primary)     |◄------►| (Standby)     |
              +---------------+        +---------------+
                      |                        |
                      |  POSIX filesystem ops  |
                      |                        |
              +---------------+        +---------------+
              |   ZFS Pool 1  |        |   ZFS Pool 2  |
              | (3-way mirror)|        | (3-way mirror)|
              +---------------+        +---------------+
               /      |      \          /      |      \
              /       |       \        /       |       \
        NBD:10809 NBD:10810 NBD:10811  NBD:10812 NBD:10813 NBD:10814
             |        |        |           |        |        |
        +--------++--------++--------++--------++--------++--------+
        |ZeroFS 1||ZeroFS 2||ZeroFS 3||ZeroFS 4||ZeroFS 5||ZeroFS 6|
        +--------++--------++--------++--------++--------++--------+
             |         |         |         |         |         |
             |         |         |         |         |         |
        S3-Region1 S3-Region2 S3-Region3 S3-Region4 S3-Region5 S3-Region6
        (us-east) (eu-west) (ap-south) (us-west) (eu-north) (ap-east)
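
Architecture 1 can also be written down as data. The Python sketch below records the port-to-region layout from the diagram and builds a `zpool create ... mirror ...` command for one pool. The pool names, and the assumption that the NBD exports are attached as consecutive /dev/nbdN devices, are illustrative and not taken from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    port: int    # NBD port shown in the diagram
    region: str  # S3 region label shown in the diagram


# Layout copied from the diagram; the keys "pool1"/"pool2" are hypothetical names.
POOLS = {
    "pool1": [Backend(10809, "us-east"), Backend(10810, "eu-west"), Backend(10811, "ap-south")],
    "pool2": [Backend(10812, "us-west"), Backend(10813, "eu-north"), Backend(10814, "ap-east")],
}


def zpool_create_command(pool: str, first_device: int = 0) -> str:
    """Build the command for an N-way mirror over the pool's NBD devices.

    Assumes the pool's exports are attached as /dev/nbd<first_device>, and so on.
    """
    devices = [f"/dev/nbd{first_device + i}" for i in range(len(POOLS[pool]))]
    return f"zpool create {pool} mirror " + " ".join(devices)


print(zpool_create_command("pool1"))
print(zpool_create_command("pool2", first_device=3))
```

Because every member of a ZFS mirror receives each write, a pool laid out this way keeps a full copy of the data in each of its three regions.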

Architecture 2:

PostgreSQL Primary (Region 1) ←→ PostgreSQL Standby (Region 2)
                \                    /
                 \                  /
                  Same ZFS Pool (NBD)
                         |
                  6 Global ZeroFS
                         |
                      S3 Regions


The main advantages I see are:
1. Dramatic cost reduction for large datasets
2. Simplified geo-distribution 
3. Effectively unlimited storage capacity
4. Built-in encryption and compression

Looking forward to your feedback and questions!

Best,
Pierre

P.S. The full project includes a custom NFS filesystem too.







* Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-18 04:40  Laurenz Albe <[email protected]>
  parent: Pierre Barre <[email protected]>
  2 siblings, 1 reply; 6+ messages in thread

From: Laurenz Albe @ 2025-07-18 04:40 UTC (permalink / raw)
  To: Pierre Barre <[email protected]>; [email protected]

On Fri, 2025-07-18 at 00:57 +0200, Pierre Barre wrote:
> Looking forward to your feedback and questions!

I think the biggest hurdle you will have to overcome is to
convince notoriously paranoid DBAs that this tall stack
provides reliable service, honors fsync() etc.
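
To make the fsync() concern concrete: PostgreSQL reports a commit as durable only after its WAL flush (fsync() or an equivalent call) has returned, so every layer below it must not acknowledge a flush until the data is on stable storage. A minimal Python sketch of that contract from the application side follows. It assumes a POSIX system, and the helper name and file name are hypothetical, not taken from PostgreSQL or ZeroFS.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data and return only after the OS reports it flushed to storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
        os.fsync(fd)  # flush file contents; the storage stack must honor this
    finally:
        os.close(fd)
    # Flush the directory entry too, so the new file itself survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "wal-segment.bin")
    durable_write(target, b"commit record")
    print(os.path.getsize(target))
```

If any layer in the stack returns from the flush before the data is actually persistent, this code still appears to succeed, which is exactly why the behavior has to be verified with crash testing rather than benchmarks.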

Performance is great, but it is not everything.  If things
perform surprisingly well, people become suspicious.

> P.S. The full project includes a custom NFS filesystem too.

"NFS" is a key word that does not inspire confidence in
PostgreSQL circles...

Yours,
Laurenz Albe







* Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-18 10:42  Seref Arikan <[email protected]>
  parent: Pierre Barre <[email protected]>
  2 siblings, 0 replies; 6+ messages in thread

From: Seref Arikan @ 2025-07-18 10:42 UTC (permalink / raw)
  To: Pierre Barre <[email protected]>; +Cc: [email protected]

Sorry, this was meant to go to the whole group:

Very interesting! Great work. Can you clarify exactly how you're running
PostgreSQL in your tests? On a specific AWS service? What test
infrastructure sits above the file system?


* Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-24 19:41  Nico Williams <[email protected]>
  parent: Laurenz Albe <[email protected]>
  0 siblings, 0 replies; 6+ messages in thread

From: Nico Williams @ 2025-07-24 19:41 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Pierre Barre <[email protected]>; [email protected]

On Fri, Jul 18, 2025 at 06:40:58AM +0200, Laurenz Albe wrote:
> On Fri, 2025-07-18 at 00:57 +0200, Pierre Barre wrote:
> > Looking forward to your feedback and questions!
> 
> I think the biggest hurdle you will have to overcome is to
> convince notoriously paranoid DBAs that this tall stack
> provides reliable service, honors fsync() etc.

Is there a test suite that can be used to test PG's ACIDity in the face
of simulated power failures?

> Performance is great, but it is not everything.  If things
> perform surprisingly well, people become suspicious.

+1

> > P.S. The full project includes a custom NFS filesystem too.
> 
> "NFS" is a key word that does not inspire confidence in
> PostgreSQL circles...

Certainly NFSv3 shouldn't inspire confidence.  NFSv4 is much safer, but
I have no experience running PG on it, and I assume there will be cases
where recovery from network and/or server failures is slow.







* Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-24 22:21  Marco Torres <[email protected]>
  parent: Pierre Barre <[email protected]>
  2 siblings, 1 reply; 6+ messages in thread

From: Marco Torres @ 2025-07-24 22:21 UTC (permalink / raw)
  To: [email protected]

My humble take on this project: well done! You are opening the door to
a much-needed endeavor, decoupling compute from storage, and potentially
to follow-on projects such as an active/active cluster! I applaud you.


* Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance
@ 2025-07-24 22:31  Pierre Barre <[email protected]>
  parent: Marco Torres <[email protected]>
  0 siblings, 0 replies; 6+ messages in thread

From: Pierre Barre @ 2025-07-24 22:31 UTC (permalink / raw)
  To: Marco Torres <[email protected]>; [email protected]

Hi Marco,

Thanks for the kind words!

> and potentially elaborate on other projects for an active/active cluster! I applaud you.

I wrote up an argument on that here: https://github.com/Barre/ZeroFS?tab=readme-ov-file#cap-theorem

I definitely want to write a proof of concept when I get some time.

Best,
Pierre


