Re: Questions on logical replication

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Justin <[email protected]>
To: Koen De Groote <[email protected]>
Cc: pgsql-generallists.postgresql.org <[email protected]>
Cc: Adrian Klaver <[email protected]>
Subject: Re: Questions on logical replication
Date: Thu, 13 Jun 2024 11:37:20 -0400
Message-ID: <CALL-XeNt1R+mfAtKYOnht1dMwf2_KCbJ8Q-mcJ8OcDnWXHaT1w@mail.gmail.com> (raw)
In-Reply-To: <CAGbX52EGNrR_UODu6sjUThLnLdadGaQ5qKSCnzJB9uJtRgTN1A@mail.gmail.com>
References: <CAGbX52FxL9eR=jmS3ACgRC=tEm=xj_xMmFoOxO+wVA+oURW=kA@mail.gmail.com>
	<[email protected]>
	<CAGbX52HTgb2jYYO5eVDu=xB5-bjAra+34=AUFg4Xk0NpCRjnyg@mail.gmail.com>
	<[email protected]>
	<CAGbX52EMG+d-BR5SzDP+rQsrdxT32=CXD5XCT2YCmkTpr3jesw@mail.gmail.com>
	<[email protected]>
	<CAGbX52GF=n7-vi5VnfnMXKkf=CfbX_nAgmy3=eezLU6XWXqjfw@mail.gmail.com>
	<CALL-XeOZ36D2z5qcxhF0qf-m9+ArioRVXhN9KEUCgMM69UjMLA@mail.gmail.com>
	<CAGbX52F7goP9u+FojqVYFWuB=HxBbOPC-6F_n0WKgkVep2x1nw@mail.gmail.com>
	<CALL-XePtmrXsKCXg6s_ApW3j9E5+QJqi8oOw+yTHeeC_RgjOsA@mail.gmail.com>
	<CAGbX52EGNrR_UODu6sjUThLnLdadGaQ5qKSCnzJB9uJtRgTN1A@mail.gmail.com>

On Thu, Jun 13, 2024 at 6:01 AM Koen De Groote <[email protected]> wrote:

> > Why?  what benefit does this provide you??   Add all the tables when
> creating the publication and be done with it...  I get this when trying to
> understand how this all works on test boxes, but for production NO idea
> what you're trying to accomplish
>
> Adding all tables at once means adding the gigantic tables as well. Disk
> IO and Network traffic are a serious concern, increased CPU usage affecting
> queries of the live system, as well as transaction wraparound.
>
> Initial sync can be a serious concern, depending on the size of the table.
>

The number of initial  sync workers can be controlled
via max_sync_workers_per_subscription
see https://www.postgresql.org/docs/current/logical-replication-config.html


if you want to do one table at a time just set sync workers to 1.

If bandwidth is a problem  either from the disk or network, direct the
network traffic from the subscriber through a proxy or firewall to throttle
the network speed.  Slowing the copy will cause the WAL to build up  on the
publisher

CPU load on the publisher is very low its actually hard to see it doing
anything as its just reading the disk, streaming it to the subscriber..

For large tables with lots of indexes for the copy to complete as fast as
possible to prevent WAL build up, drop indexes.  For me the WAL build up
has only been an issue when dealing with multi-TB sized tables when it
takes several days to copy the data  for one table.

One trick is to remove all the indexes during the initial sync except for
the primary key so the subscriber has less work to do.


> Here's a nice guide where people did a logical replication upgrade,
> explaining why they did it this way:
> https://knock.app/blog/zero-downtime-postgres-upgrades
>

The blog suggests overly complicated things.  only doing 100GB chunks of
data at one time.  Maybe  if the publisher was scarce on resources or the
table is multi-TB in size it requires days to weeks to copy...

If the publisher is so low on resources that Logical Replication is
problematic  one can create a binary replica, promote it and convert it to
logical replication skipping the initial sync.  Then upgrade that server.
There is a minor outage required to convert a binary replica to a logical
replica.  I've done it in under 30 seconds.




>
> On Wed, Jun 12, 2024 at 7:01 PM Justin <[email protected]> wrote:
>
>>
>>
>> On Tue, Jun 11, 2024 at 5:43 PM Koen De Groote <[email protected]> wrote:
>>
>>> > If there are any errors during the replay of WAL such as missing
>>> indexes for Replica Identities during an Update or Delete  this will cause
>>> the main subscriber worker slot on the publisher to start backing up WAL
>>> files
>>>
>>> And also if the connection breaks, from what I understand, is that
>>> correct? Anything that stops the subscription, including disabling the
>>> subscription, is that right?
>>>
>>
>> Yes to all....
>>
>>
>>> > I suggest confirming all tables have replica identities or primary
>>> keys before going any further.
>>>
>>> Yes, I am aware of this. I made me a small script that prints which
>>> tables I have added to the publication and are done syncing, and which are
>>> currently not being replicated.
>>>
>>
>>
>>>
>>> > With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan
>>> on the subscriber for PG 15 and earlier.
>>>
>>> I'm also aware of this. My plan is to create a publication with no
>>> tables, and add them 1 by 1, refreshing the subscriber each time.
>>>
>>
>> Why?  what benefit does this provide you??   Add all the tables when
>> creating the publication and be done with it...  I get this when trying to
>> understand how this all works on test boxes, but for production NO idea
>> what you're trying to accomplish
>>
>>
>>> I'm not planning on using "REPLICA IDENTITY FULL" anywhere.
>>>
>> Good
>>
>

view thread (16+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Questions on logical replication
  In-Reply-To: <CALL-XeNt1R+mfAtKYOnht1dMwf2_KCbJ8Q-mcJ8OcDnWXHaT1w@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox