Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sG2l6-008ksi-MN for pgsql-general@arkaria.postgresql.org; Sat, 08 Jun 2024 20:33:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sG2l4-0017g9-GQ for pgsql-general@arkaria.postgresql.org; Sat, 08 Jun 2024 20:33:51 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sG2l4-0017g1-3x for pgsql-general@lists.postgresql.org; Sat, 08 Jun 2024 20:33:51 +0000 Received: from mail-io1-xd35.google.com ([2607:f8b0:4864:20::d35]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sG2l2-000fXt-2n for pgsql-general@lists.postgresql.org; Sat, 08 Jun 2024 20:33:49 +0000 Received: by mail-io1-xd35.google.com with SMTP id ca18e2360f4ac-7eb12b2bf78so139432539f.0 for ; Sat, 08 Jun 2024 13:33:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1717878826; x=1718483626; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=y37kOyC7NyaCyGzXAzedTDkf1OGU8Li2XqLvKxtGzQ8=; b=Ai2iKh+TWwtTmXEroQbvaur79SrfIoCleHc+oIoxBAd+KY+9WuRJtic4j5KYmdR96R 9YxrkPYLhON2Wzmkjqnm1k6B6OS+3Tyu9RlQ2y45JhSiQgdfncwbo6luHnGzukoEOLQB iZOC+6r3JeH6lCeVK5OZXkYF0O/ZjZJrhht2elUaE2uClL0SG5XQcYUiDfxpv+Xe33hx mb9/SLVvvB0TyqrraPkMfMNSSIDmibsemFWEcUgqUmFGj653lQ24+9k5Gb31ryqVeGQX SItCmSbI+aibOuJRTuq51lhc7AfDfhCYrwwr5jTRnWs5jE4Y8C8rg4ryatxnz3+/Jpwd cPbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717878826; x=1718483626; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=y37kOyC7NyaCyGzXAzedTDkf1OGU8Li2XqLvKxtGzQ8=; b=J9Xg+e2Ji28EZ5Qe8B3J8vRkwjvJWuNYWl8VMJP8ozr7u43xkkkfpTwDunp8dLlRP2 C54mdAewKZQu7OHcheDgHEg9ImiOnSzJN2MgsIkp1O//+J5ysXBkXFsdNhJbHiAe48e6 HJfn/w1Hl1sQdvdI+5PvS3o9cCbCOcjf5dik6yDiMfZ7sgmyiV+hYKv2se4usa+AF97y qvsw5b9Ui8ETss9la4l16c6kC4YbgOs7gA7N7itssUQWxERmDZHhDiIIQJF2bQxmgfI0 3B04ocgKI0eL9TN5byZQXt3CJhJkf5d/U82zdW2hiq5ANkvCTM5cA86qfzjl36qxy+Aq j9Ug== X-Forwarded-Encrypted: i=1; AJvYcCWf3bQL14BoISF8dSbV1e3GYT0WbbnhZPMb8PD9q8ZBCbzYisQYOS7bkztwEk+Bub7FxdNLTSIxPBTVmQUzT4c4XSRc+0/BF9DuZF3x+0roi7y8 X-Gm-Message-State: AOJu0Yyp2xPTbNF0EnPhs4cMOFtDIihopaIzgXveVqr+ZThIH+LGVfFj RSdzaoVOq5EeyePtKLD+cElKWrGpbcRhAKfMhe4tq9wfMi6ESrJTKu7tKM6ekXJtiAh3Vi8Ax5Q AWjLh0/zG1x1s871rlnu2nNOfW0w= X-Google-Smtp-Source: AGHT+IEW9yittkMnw9i3B0EVKMqCWJIwfU+g3o+s9bSBqsS9hmz/SC7fyo8B0niYG31tBxuW4HfL1ZK2L/pIccOd4Mo= X-Received: by 2002:a05:6602:60ce:b0:7eb:708a:3264 with SMTP id ca18e2360f4ac-7eb708a3721mr293452539f.10.1717878825702; Sat, 08 Jun 2024 13:33:45 -0700 (PDT) MIME-Version: 1.0 References: <1628637f-419f-4f6a-9cb6-07af90cd0bc4@aklaver.com> <1c0273f5-a90a-48f6-b51f-fe15c16fa1c6@aklaver.com> <490e8a5c-f1f5-40fe-8243-f1c8c18d03f2@aklaver.com> In-Reply-To: From: Justin Date: Sat, 8 Jun 2024 16:33:34 -0400 Message-ID: Subject: Re: Questions on logical replication To: Koen De Groote Cc: Adrian Klaver , PostgreSQL General Content-Type: multipart/alternative; boundary="0000000000002c6101061a66d538" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000002c6101061a66d538 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sat, Jun 8, 2024 at 1:41=E2=80=AFPM Koen De Groote w= rote: > What I'm trying to do is upgrade a PG11 database to PG16, using logical > replication. > > The PG11 has an active and a standby, there are a handful of databases. O= n > particular one has a few tables just over 100GB, then a few 100 tables ne= ar > 1GB. > > What I'd do is start a publication with no tables and add them 1 at a > time, refreshing subscription each time. > > This might take a long time, so my main questions relate to potential > network issues or various situations where the instance receiving the > logical replication, suddenly stop being able to receive. > > Resyncing, and the effects of WAL buildup, are my main concern. > > Accidentally sent a mail to only your email, sorry for that. > > Regards, > Koen De Groote > >> >> This approach does not prevent WAL build up. The WAL build up occurs during the initial sync worker once that table is synced the WAL is replayed and released. The parent worker then become responsible for replaying the WAL for that table The WAL build up is during the initial sync of the data by table NOT during the entire synce of all the tables that have been published. For 1 gb table the initial sync will be very fast so I doubt any individual table will cause any significant WAL build up to put the publisher at risk of of crashing Once a table becomes synced the main subscriber worker keeps the WAL replayed. If there are any errors during the replay of WAL such as missing indexes for Replica Identities during an Update or Delete this will cause the main subscriber worker slot on the publisher to start backing up WAL files. If there are missing replica identities the affected tables will have to be dropped from the publication and subscription refreshed. The WAL file is already written with incorrect information so the table on the subscriber table is most likely not in recoverable state. I suggest confirming all tables have replica identities or primary keys before going any further. With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan on the subscriber for PG 15 and eariler. PG 16 on the subsciber can use a different unique index that has NOT NULL for all participating columns if the publisher is using Replicate Identity FULL on the published table One must understand the above before deploying logical replication. Hope this helps --0000000000002c6101061a66d538 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Sat, Jun 8, 2024 at 1:41=E2=80=AFPM Koen De Groote <kdg.dev@gmail.com> wrote:
What = I'm trying to do is upgrade a PG11 database to PG16, using logical repl= ication.

The PG11 has an active and a standby, there are a handful of databases. On=20 particular one has a few tables just over 100GB, then a few 100 tables=20 near 1GB.

What I'd do is start a publication w= ith no tables and add them 1 at a time, refreshing subscription each time.<= /div>

This might take a long time, so my main questions relate to potential=20 network issues or various situations where the instance receiving the=20 logical replication, suddenly stop being able to receive.

Resyncing, and the effects of WAL buildup, are my main concern.

Accidentally sent a mail to only your email, sorry fo= r that.

Regards,
Koen De Groote


Th= is approach does not prevent WAL build up.

The WAL build up occurs d= uring the initial sync worker once that table is synced the WAL is replayed= and released.=C2=A0 =C2=A0The parent worker then become responsible for re= playing the WAL for that table

The WAL build up is during the = initial sync of the data by table NOT during the entire synce=C2=A0of all t= he tables that have been published.

For 1 gb table the initial sync = will be very fast so I doubt any individual table will cause any significan= t WAL build up to put the publisher at risk of of crashing

Once a ta= ble becomes synced the main subscriber worker keeps the WAL replayed.=C2=A0= If there are any errors during the replay of WAL such as missing indexes f= or Replica Identities during an Update or Delete=C2=A0 this will cause the = main subscriber worker slot on the publisher to start backing up WAL files.= If there are missing replica identities the affected tables will have to b= e dropped from the publication and subscription refreshed.=C2=A0 The WAL=C2= =A0 file is already written with incorrect information so the table on the = subscriber table is most likely not in recoverable state.=C2=A0
=C2=A0I suggest confirming all tables have replica identities or primary keys b= efore going any further.=C2=A0 =C2=A0 With PG 11 avoid REPLICA IDENTITY FUL= L as this causes full table scan on the subscriber for PG 15 and eariler.= =C2=A0 PG 16 on the subsciber can use a different unique index that has NOT= NULL for all participating columns if the publisher is using=C2=A0 Replica= te Identity FULL on the published table

One must understand the abov= e before deploying logical replication.=C2=A0

Hope this helps
--0000000000002c6101061a66d538--