Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ufIvk-005G9h-Er for pgsql-general@arkaria.postgresql.org; Fri, 25 Jul 2025 13:57:49 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ufIvj-0017kU-Ib for pgsql-general@arkaria.postgresql.org; Fri, 25 Jul 2025 13:57:47 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ufIvj-0017kM-7r for pgsql-general@lists.postgresql.org; Fri, 25 Jul 2025 13:57:47 +0000 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ufIvg-000mnx-0G for pgsql-general@lists.postgresql.org; Fri, 25 Jul 2025 13:57:46 +0000 Received: by mail-pj1-x1035.google.com with SMTP id 98e67ed59e1d1-31223a4cddeso1715493a91.1 for ; Fri, 25 Jul 2025 06:57:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1753451863; x=1754056663; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=2FzP/h7WSuX6UrgW3GtHN2ms+yIIqT0sG8gxl5TjzC4=; b=HLhkUG7BFsADwbUqsFEV4bd/pdmqiBu4wGfInqQp07pkrHDxSL9qh8potoXzpl0ZZy TGqxj1uWlT0DJ5zzN8uDHmHQEqf3ZDrBMByOC1kGHckx/8XPDoGv34wzz+Ly8+4+Wcec g2ol5gA8tndeVYig4yR1FbDUi714s+gjLwT7UJ3K7k9v5aXXu+/EQGjcYHbkpAyskOgo HeVdC/MssrgI+E6KNiK5zm11v7jLNd0VBltz4KKzRu7RtTA0VnjbbrgcNXhyTRYQtY2b LaNlwxMcSjFHy2eNaSUF4FDpWey78ntoUOpIL/ZAk9pxetkZSYIMclKZZXfM6z+xE+Vy xX+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753451863; x=1754056663; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2FzP/h7WSuX6UrgW3GtHN2ms+yIIqT0sG8gxl5TjzC4=; b=k4/DCiGw6Ed1+yahmEqChMR80h5Ka7u+DGrN9RIw6fLcompZ2eBMUnBSZ4tP4N2ymb U/kSSA+iKUEaxZ2Rq5H7dRl899/x+1jdT6YHy1Clx9qBCryPeu/WGqxI6QDEDbklnzPD JzS53uko2ggJ+toWlMWzSz3a/LG0TBq9z1nvsRyHbynfd7R7wh9qGy/L+YQ9SKGP4wDy Noosow5qkuK5f9xrARQBZbsZG0t02h3yV8HrVb7FzHupZUVbpXw1sOl3Qs99blJiRk9A sXU2SI6HMfWV2pTG/1tTkpU30dH/EchhS82OqtO6e6HbY/1RjqwfilGEF+a21UQqKlVH AyRQ== X-Gm-Message-State: AOJu0YzXuQN17XSg6Zuhw+xRqe3/Ep4dmbUyYlhHIEfJ+tYzQmmyHxLh Rke5vhZyG2d0aVMntktDfxI3wjmp8TwKvv2EqG3e1UxenWWPPA2Jrv/vyeYTNXSdvX2Pr6JYyCF tzI8wka4Gxfbw0/hshysZaRxIm6JHADvgQw== X-Gm-Gg: ASbGncszGSSJEL7GnSGBQBR8AHj64zPgWpeG9lE/1oufpfnIKDWvYi/FxbK+eSphH0X 4j4xDmcFY7bQ2R0Zyzw3s78gHiW3XlC0HqJu3VRjye8kcDS2ZQN4JIHfVZ9M9Z1gRl9SNtJh3va UJnQkOwX6Caj3JJLCuNkugmp5ZRLkELzImTQ6brI+LbSTs4eIEQdfEXnp8N+EHT7co7pZrtWQc1 DTuSxOsPCakuMonjmmgRDFbGyFz7DCzCBx4buyz X-Google-Smtp-Source: AGHT+IE1KIxn0N4raWz5A16psQlDOGkqnS9GM2G7X2p1qvTiuKbqC1lC0LAT3oN9TTNbZ7vD1KUmNa/arOR0cLn7ii4= X-Received: by 2002:a17:90b:2b4d:b0:313:287c:74bd with SMTP id 98e67ed59e1d1-31e77a4b0f3mr3253208a91.33.1753451863087; Fri, 25 Jul 2025 06:57:43 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Jon Zeppieri Date: Fri, 25 Jul 2025 09:57:30 -0400 X-Gm-Features: Ac12FXx9JhczBdHqdAL6PkfrXRKmnDFG4W5Jinitxbed_NehFyXFnuO4eS6p-4Y Message-ID: Subject: Re: Possible causes of high_replay lag, given replication settings? To: Nick Cleaton Cc: pgsql-general@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Jul 23, 2025 at 4:27=E2=80=AFPM Nick Cleaton wro= te: > > On Fri, 18 Jul 2025 at 21:29, Jon Zeppieri wrote: > > > > I just had a situation where physical replication fell far behind > > (hours). The write and flush lag times were 0, but replay_lag was > > high. The replica has hot_standby_feedback on, and both > > max_standby_streaming_delay and max_standby_archive_delay are set to > > 30s. > > > > What could cause a situation like this? If the network were a problem, > > I'd expect the other _lag times to be high. So it appears that the > > replica was getting the WAL but was unable to apply it. Are there > > situations where the replica cannot apply WAL other than the kinds of > > conflicts that would be addressed by the _delay settings? > > > > I checked pg_stat_database_conflicts, but there was nothing in it -- al= l zeros. > > This can happen when there are several busy writing processes on the > primary. The single replay process on the replica can't keep up with > the writes. Thanks for the response, Nick. I'm curious why the situation you describe wouldn't also lead to the write_lag and flush_lag also being high. If the problem is simply keeping up with the primary, wouldn't you expect all three lag times to be elevated? - Jon