Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sjzwx-006Wzq-4W for pgsql-general@arkaria.postgresql.org; Fri, 30 Aug 2024 11:37:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sjzwv-002aN7-9U for pgsql-general@arkaria.postgresql.org; Fri, 30 Aug 2024 11:37:53 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sjUQU-00DAGS-Hg for pgsql-general@lists.postgresql.org; Thu, 29 Aug 2024 01:58:19 +0000 Received: from mail-centralusazon11010010.outbound.protection.outlook.com ([52.101.61.10] helo=DM1PR04CU001.outbound.protection.outlook.com) by magus.postgresql.org with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sjUQM-0020jy-LA for pgsql-general@postgresql.org; Thu, 29 Aug 2024 01:58:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=CXHrYkMwEwTo6waLfSleQUWnC8MD5xh2pQm3cR3FH+Mjpe3bvW/eEdGgOl0hGjObDlkzJJaIUjiOXVSCw6j5lsigKzZRXcU5PT/JYLE9L/GAwMpV7SUwffCo8rMY9wicEyd+1NCux9Zt2J+sRRWOFOZlGTenT9TfEG4JKrEhe6ULAS8jEXHvy6RTvhuMXjEg880f0noc4Hmsp9wjpnsKJ4CVSq2wmTaDBCV6QdoazFc4iNRuStly9m7TGvUOeKN8martowbaPFsh2cr02e56JceOVs0Vusyf6VDOkht9yd22IUtq9EgHbgrMcbVUbfeG3UXBoH1OgeHCUiRMaPtWrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=F0WVrl4niaIdxw6tvOmPStmohyYOCk0CAo1U6EH3G4A=; b=PlUJ1rCVASPKsvT7/HfyFSjHtyaTxZvfhXWw5O73gG39PFbHsLHvYyX9xGjQe95jMXeksyIsxnHoS/fjmXujiGMak0dD1tTAcilCD5LXL9D/oNyJyDTzfKtGTIHB9IwUGSbJC/P7qSDZh+y3C7NsdAbo+Iy9XChyn+jH7c+psZBZ8YUYK9DjWD4UigPJ9Y0ZwKtpapc7Aa0H3CIC6PZs6/HdIrk2aboe8mmYKdhERZ1EAIKtldxDvLZKnE9JnonLx45AHA/2cIBsyoncxC9Mae0CCYqzwgQW91t5og+4wW7IjXacgX2wv8vSZZlAjMdv9nbnmBjQMQMkd3D7PVEfkw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=ghx.com; dmarc=pass action=none header.from=ghx.com; dkim=pass header.d=ghx.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ghx.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=F0WVrl4niaIdxw6tvOmPStmohyYOCk0CAo1U6EH3G4A=; b=FF/EO9utEk53YdW4FPynAYmavreyo/tizsOhuFlDuJzFZW7xnsAzZpu0LuBYP44Obqs5prmjCYDLk5Bb3sv3+veFmnVPIvsYmb+xFEoGUD5ms5eMbqD9I/8T0Hki5E3fFfRBEXlQk5HKIGBY58OgcIrDWabKrVJJIkM5Yp9DPkY= Received: from BN9PR03MB5996.namprd03.prod.outlook.com (2603:10b6:408:135::10) by SJ0PR03MB5678.namprd03.prod.outlook.com (2603:10b6:a03:2d4::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7897.28; Thu, 29 Aug 2024 01:58:07 +0000 Received: from BN9PR03MB5996.namprd03.prod.outlook.com ([fe80::6918:d35a:f4cb:fffc]) by BN9PR03MB5996.namprd03.prod.outlook.com ([fe80::6918:d35a:f4cb:fffc%4]) with mapi id 15.20.7897.027; Thu, 29 Aug 2024 01:58:06 +0000 From: Michael Jaskiewicz To: "pgsql-general@postgresql.org" Subject: Postgres Logical Replication - how to see what subscriber is doing with received data? Thread-Topic: Postgres Logical Replication - how to see what subscriber is doing with received data? Thread-Index: AQHa+bZxCLKpzmWpx0KPm1TxQQ8zJg== Date: Thu, 29 Aug 2024 01:58:06 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=ghx.com; x-ms-publictraffictype: Email x-ms-traffictypediagnostic: BN9PR03MB5996:EE_|SJ0PR03MB5678:EE_ x-ms-office365-filtering-correlation-id: 6adc1641-3fec-45dc-17cb-08dcc7ce0663 x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0;ARA:13230040|366016|1800799024|376014|38070700018; x-microsoft-antispam-message-info: =?us-ascii?Q?bhFkT0Gupjne7eo4nkNVO0oGeC3p9FU1stRfKO9gn3E+RVvKgqlF6JbxIx0e?= =?us-ascii?Q?blVC9nAwn8BSYev7a/Y3UJo07esjrPR0Mw9eeEAw8mLylbXYhxlxu++lKp4R?= =?us-ascii?Q?Oln1gxt8wmNEb3briTWIAj+oRfrpfWifG7iiin+vFvgruUpgkFplaFU4TNNJ?= =?us-ascii?Q?hjGhmIjcOD3ElDHclpfzhPc1HcdLzqQsEFGt5B2j4QCkrm4MmGHz5h8n6zav?= =?us-ascii?Q?KR6j6+Y3AQPa4TN4rJFO82DbBoetbWK0Je3O+zcLQIBvHkcXN9TwIOryDufy?= =?us-ascii?Q?yq5p8s1cVLdJSs/ucgDXGkXZmEjbxXWy0XbtcW5FODeTCyIHP/n2zJ+oI9Ae?= =?us-ascii?Q?+TP/jUNFqiCTzNhgJrJgBHcGd5K0zi1VIlaJESW1lCxT6a0CLlygwzbOEu7Q?= =?us-ascii?Q?6B918yC5AmqW2KVOiLiC8DVE8vTBkgbrRFzvWoL2+AWHdscyTN1Uy/q7PKFX?= =?us-ascii?Q?R5MMZB3t6+wQXaICCRl1vKiFqZG1dvBa3XgN9HDN03atl8jVjUMS8g6jgMbW?= =?us-ascii?Q?VXB0qh9FaC50PLQaL2RN4Mu6ic4AmBK1NnjrUHOlVtc/JBzKOi2tovH2J0Cg?= =?us-ascii?Q?tVO6fEzEznrUaa7Yttt5iYqgwTag+GLDWzPA29GXpNxt1nfa3Yt/zGquOie5?= =?us-ascii?Q?ooIcYLJAH7ZRMB8WdWMOpyuKdoihQSPzcUSRbw0MNepIIH/zU19afCIrNCwM?= =?us-ascii?Q?12sljBJUqkLr9tZ6l842UEOZrlOFd7+UPVnIWAgbafFXBt5CIpyRPaQJolPr?= =?us-ascii?Q?1iGhPUogsFk9/vMATstLJYMhHdcjLi9ExavSARN2Ow/P21VgRiC4kdMvc6uz?= =?us-ascii?Q?kokOmwGGMp5VycHgUcpALyUjCE86jSxVqhcW7oM0Z1U4XNh/QD6Obl143Nsr?= =?us-ascii?Q?1uzCW0/DFcB8/Viw7HxC8ZOsyNwkYvGONy6gtY/rUIvmeW+lssCZ0amfkJyD?= =?us-ascii?Q?TSUGesBz7JowD3Az5bXhCF32VclUvbwZ91l0kRapPQMBVyTinwUoHUwOcwj4?= =?us-ascii?Q?Btyp7UwTd6Zzrdat3CPqXCvgaXPEAO+A2kzaHBSVRQaaGzJ86Clmt4AWWu5a?= =?us-ascii?Q?5+t7/hqjlNOSqkDJP0fS099ZZmLvl1nv1Uulm47+eME0B9gX8PSK41vo2o4Y?= =?us-ascii?Q?riCAZKxAFJxyuCufTpi6kUIVS8GqPNMIJlUB0zzu3srA5umKtBl0e8duBDiL?= =?us-ascii?Q?p9oKoB2jgjfcnJQzzR5yUN90sH6AYqGjhPaVPlXiwtpXQYSQ2VS8sEnO7aN9?= =?us-ascii?Q?Lc/eHEGZThvvLGcbfMCuJ+6oEeipktekXfjDSpOrddGeY1wGp2J+Dv0z5F1d?= =?us-ascii?Q?v8K4IU8tUh+b+wyflgljZ62lqlWc5eDK7yDMeMpLolU6alvR9ah7bQtu1uPb?= =?us-ascii?Q?lsMsoM9o6qXeWr1TPgokUvtMKtDrsXH9tveuJ/jXBfAkrsxbng=3D=3D?= x-forefront-antispam-report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BN9PR03MB5996.namprd03.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(366016)(1800799024)(376014)(38070700018);DIR:OUT;SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?LS1gnQ7+6ziVZfP84kWCa6xB3xdirMXXmwM0dOuSANzR6J91XN2JXzNLZ/9+?= =?us-ascii?Q?TE9w039Dehk6LJCN80VWdeQwjZOYIg8ECXHlWL3TxSANYNmHtmQX6kUxhVpS?= =?us-ascii?Q?HGD1Z53wWtLIbHqPpotQkFlEQcCn6Cic2/CGw3i0XdrUi3FbpEBLXCMZH5rH?= =?us-ascii?Q?VFdGZlWtAxIPhJeHiqDlxakLrg9BZs503OWwLmu5Szj8Faw3RdJ+cBSVbmkw?= =?us-ascii?Q?peYRxwZdpnJkBk3q2V54TGqxnHH0QSGpDYz5jpCXCvyrwXt7hqVpNVAfSRmD?= =?us-ascii?Q?dNVLgkiXJp6WlKtjuevGL9XI6xcONTiHq91FWr/ThhZ79QnQwVOER8s1azbr?= =?us-ascii?Q?1LwE8zWDT7iEr2gWrYIEsIGbc7whMP0z0B0h4rdkRTYMxa2BKKZnyBt5kSCF?= =?us-ascii?Q?rwDc0VgILbM/iitWnqrdcjpcZXkMAcnZdn0lnuJqFuMJq0lx3qfIckFh7DPJ?= =?us-ascii?Q?zit2LuvNyZHXWDEQirz1dDoXpz/FQ0jin3VUHJLFYzUIITzGiSnqf1XqpkAI?= =?us-ascii?Q?ToTLEcpf/BYGtQlKZLgDGPjMA57IOUL8BeAPL7lxLEnbgIBi88KHGJrF0E/t?= =?us-ascii?Q?P22D3XDPNCiGCuJM/YYv8fZBI+4TPziC11SZAvOz3/5roKBp69uVDpWCwMex?= =?us-ascii?Q?yicwtId6vUZ4AedX5akKoYrWdX77RfyuFsFzaOnsWhxZEQWVq/yV53+Iky2Q?= =?us-ascii?Q?Da0FN5dSyB2RXAwj9dEUxAGNXIv7GoEm3F/gAGaqo+0XDbLOLKTq8Z6g7hcA?= =?us-ascii?Q?fuqvxQYRVuiRquETxFDrNZ/9QKikhdxtf+B9twWSN/b2X0X07ZlWkznOb3Xu?= =?us-ascii?Q?LuqAdk1E4ECujEkR7lGgqzZtFpOIgRNJHYfQKDoe2wRAmzHQTQez2QVtLwRZ?= =?us-ascii?Q?4OIvwlcsW7Rjp8qGgSXED+PIef6s83hijkvZgPxbkQHhnctjm+vGTjnh1W41?= =?us-ascii?Q?q5ir5QwcBb3irg+/UVaX4gLYFXqj1q/yKYLp6L+JBjdwZ2HVHjalNZw/2Hgo?= =?us-ascii?Q?VzcTWGAvES2tNVyNrIW1NcKonxLN4SqOmxM4+s9AIpS4flOah0kfFvadO6ki?= =?us-ascii?Q?XN4r30BUFgrpWLvtVfqe3OZ7m/jXb1Du7IH81mfvt0cs+jOB+g3JpstjiyWZ?= =?us-ascii?Q?sVG/1I8DsudYbv2caxwH3TrroPoiowIwHp1IgEza4pPAD6CUo5m99VbGNY+e?= =?us-ascii?Q?BAUiY6VTc3qT+MBqBTmIJlanbyLIkfkdGIsPrvG1VkLAXVOdM1897AGTXBF9?= =?us-ascii?Q?esJH8BzjqLjuyeU3DfzR+jXt9JwAAMXPTnFpoJAAHD9mF696fUOHdhtHCuOB?= =?us-ascii?Q?Mz8MQTiUJbTds3RkpTY1intJlL18Dv2ONRVnzh6O0aaPDOe0qyJIzg8nf7L3?= =?us-ascii?Q?os9PBKeBZh0++6KlcYXepNJOqqnp684Bf/eyaW7o+J9DvmpNkKCDv4Sb4vTf?= =?us-ascii?Q?T3VLTRP0uzEb4GBZR+WVjIuMZlTG4lQ01+VenFwLQnqT5shuqgCyrju95irz?= =?us-ascii?Q?W4pex40TqK20w+3hPuLCsSyW4RaUsi1WO2bY4eBtSH4RbF4D5P2IzT6AOK3t?= =?us-ascii?Q?oQ/EBHguk/AWhroM9p72hsbbQDhfQ2ez0x/YK1wXSc97y+bktXapxiAB4plI?= =?us-ascii?Q?q9XmQU+PEOWE68pPdyTZHdBOStxbpMjK9DojzdVNDHOUfIDNlW1dcbKwX4mX?= =?us-ascii?Q?hITFGg=3D=3D?= Content-Type: multipart/alternative; boundary="_000_BN9PR03MB59965B5087688309C0D79DCEB7962BN9PR03MB5996namp_" MIME-Version: 1.0 X-OriginatorOrg: ghx.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BN9PR03MB5996.namprd03.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6adc1641-3fec-45dc-17cb-08dcc7ce0663 X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Aug 2024 01:58:06.3594 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 3c2088fe-3969-4873-ac0d-dc9f122866b9 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: wKEe1d9irW8h0NBb3EuAi8zJM75cKZLOy5X7pO87q3KBy91lfES/8NOxPnC95kZsDMPeaUY7j7ua1nEXKd8EYQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR03MB5678 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --_000_BN9PR03MB59965B5087688309C0D79DCEB7962BN9PR03MB5996namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable I've got two Postgres 13 databases on AWS RDS. * One is a master, the other a slave using logical replication. * Replication has fallen behind by about 350Gb. * The slave was maxed out in terms of CPU for the past four days becaus= e of some jobs that were ongoing so I'm not sure what logical replication w= as able to replicate during that time. * I killed those jobs and now CPU on the master and slave are both low. * I look at the subscriber via `select * from pg_stat_subscription;` an= d see that latest_end_lsn is advancing albeit very slowly. * The publisher says write/flush/replay lags are all 13 minutes behind = but it's been like that for most of the day. * I see no errors in the logs on either the publisher or subscriber out= side of some simple SQL errors that users have been making. * CloudWatch reports low CPU utilization, low I/O, and low network. Is there anything I can do here? Previously I set wal_receiver_timeout time= out to 0 because I had replication issues, and that helped things. I wish I= had some visibility here to get any kind of confidence that it's going to = pull through, but other than these lsn values and database logs, I'm not su= re what to check. Sincerely, mj --_000_BN9PR03MB59965B5087688309C0D79DCEB7962BN9PR03MB5996namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

I've got two Postgres 13 databases on AWS RDS.<= /o:p>

  • One is a master, = the other a slave using logical replication.
  • Replication has fallen behind by= about 350Gb.
  • The slave was maxed out in terms of CPU for the past four days = because of some jobs that were ongoing so I'm not sure what logical replica= tion was able to replicate during that time.
  • I killed those jobs and now CPU = on the master and slave are both low.
  • I look at the subscriber via `select * = from pg_stat_subscription;` and see that latest_end_lsn is advancing albeit= very slowly.
  • The publisher says write/flush/replay lags are all 13 minutes b= ehind but it's been like that for most of the day.
  • I see no errors in the log= s on either the publisher or subscriber outside of some simple SQL errors t= hat users have been making.
  • CloudWatch reports low CPU utilization, low I/O, = and low network.

 

Is there anything I can do here? Previously I set wa= l_receiver_timeout timeout to 0 because I had replication issues, and that = helped things. I wish I had some visibility here to get any kind of confidence that it's going t= o pull through, but other than these lsn values and database logs, I'm not = sure what to check.

 

Sincerely,

mj

--_000_BN9PR03MB59965B5087688309C0D79DCEB7962BN9PR03MB5996namp_--