From: Xuneng Zhou
Date: Thu, 19 Mar 2026 10:33:44 +0800
Subject: Re: [WIP] Pipelined Recovery
To: Imran Zaheer
Cc: Zsolt Parragi, Jakub Wartak, "Hayato Kuroda (Fujitsu)", pgsql-hackers

Hi Imran,

On Wed, Mar 18, 2026 at 8:06 PM Imran Zaheer wrote:
>
> (resending the mail, previous mail was held for moderation for some reason.)
>
> Hi
>
> I am attaching a new rebased version of the patch. Following are some major
> changes in the new patch set.
>
> * Streaming replication is now working. The prefetcher was not fully
>   decoupled from the startup process; that's why there were inconsistencies
>   in some scenarios and most of the recovery tap tests were failing.
>
> * Patch is now split into consumer and producer patches. This will make
>   review easier.
>
> * Pipeline shutdown flow is also improved.
>   Now the producer will always check for the shutdown flag (set by the
>   consumer).
>
> * Pipeline msg queue size is now configurable via `wal_pipeline_queue_size`.
>
> * TAP tests now pass with PG_TEST_INITDB_EXTRA_OPTS="-c wal_pipeline=on".
>
> * New TAP test `recovery/t/053_walpipeline.pl`. This covers some basic
>   functionality of the pipeline.
>
> * The filename is changed to xlogpipeline.{h|c}
>
> Thanks to all for the valuable feedback.
> Looking forward to your reviews, comments, etc.
>
> --
> Regards,
> Imran Zaheer
>
>
> On Wed, Mar 18, 2026 at 3:15 PM Imran Zaheer wrote:
>>
>> (resending the mail, previous mail was held for moderation for some reason.
>> Now pdf is moved to the tar.gz)
>>
>> Hi
>>
>> I am attaching a new rebased version of the patch. Following are some major
>> changes in the new patch set.
>>
>> * Streaming replication is now working. The prefetcher was not fully
>>   decoupled from the startup process; that's why there were inconsistencies
>>   in some scenarios and most of the recovery tap tests were failing.
>>
>> * Patch is now split into consumer and producer patches. This will make
>>   review easier.
>>
>> * Pipeline shutdown flow is also improved. Now the producer will always
>>   check for the shutdown flag (set by the consumer).
>>
>> * Pipeline msg queue size is now configurable via `wal_pipeline_queue_size`.
>>
>> * TAP tests now pass with PG_TEST_INITDB_EXTRA_OPTS="-c wal_pipeline=on".
>>
>> * New TAP test `recovery/t/053_walpipeline.pl`. This covers some basic
>>   functionality of the pipeline.
>>
>> * The filename is changed to xlogpipeline.{h|c}
>>
>> Thanks to all for the valuable feedback.
>> Looking forward to your reviews, comments, etc.
>>
>> --
>> Regards,
>> Imran Zaheer
>>
>>
>> On Wed, Mar 18, 2026 at 12:43 PM Imran Zaheer wrote:
>>>
>>> Hi Zsolt.
>>>
>>> Thanks a lot for the review and for pointing out the bugs. I have fixed
>>> the bugs you mentioned in my new patch set.
>>> But the patch set mail is held for moderation for some reason.
>>>
>>> >
>>> > if (reachedRecoveryTarget)
>>> > {
>>> > +     if (wal_pipeline_enabled)
>>> > +         WalPipeline_Stop();
>>> >
>>> > What if we didn't reach the recovery target, shouldn't we stop the
>>> > pipelines then?
>>> >
>>>
>>> I have fixed the bugs in the shutdown logic.
>>>
>>> As we already know, we will exit the recovery redo loop in
>>> `PerformWalRecovery()` in only two cases:
>>>
>>> 1: Recovery target reached:
>>> In this case consumers will call to stop the pipeline.
>>>
>>> @@ -1807,6 +1931,9 @@ PerformWalRecovery(void)
>>>
>>>  if (reachedRecoveryTarget)
>>>  {
>>> +     if (wal_pipeline_enabled)
>>> +         WalPipeline_Stop();
>>> +
>>>
>>> 2: Available pg_wal is consumed and there is no more WAL to read.
>>> In this case pipeline producers will send the shutdown msg to the
>>> consumer. The consumer will detect this message and then call
>>> `WalPipeline_Stop`. This is the case where we cannot read more records
>>> and the while loop will break here.
>>>
>>> + if (decoded_record)
>>> + {
>>> +     record = &decoded_record->header;
>>> +     return record;
>>> + }
>>> + else
>>> + {
>>> +     /*
>>> +      * We will end up here only when pipeline couldn't read more
>>> +      * records and have sent a shutdown msg. We will acknowledge this
>>> +      * and will trigger request to stop the pipeline workers.
>>> +      */
>>> +     WalPipeline_Stop();
>>> +     return NULL;
>>> + }
>>>
>>>
>>> Hope this makes sense.
>>>
>>> Once again, thanks for reporting the bugs. You will receive the new
>>> patch set mail soon, once it is cleared from moderation.
>>>
>>> Looking forward to your reviews, comments, etc.
>>>
>>> Regards,
>>> Imran Zaheer

Thanks for this patch; it's quite interesting. To my knowledge, there
have been prior attempts to introduce parallelism into recovery, as you
mentioned in your earlier email.

I'm curious how this approach differs from those previous efforts, and
why those attempts ultimately did not land.
I imagine there were constraints or complexities involved. It would be
valuable to understand what lessons can be drawn from them.

It also raises an implicit question: what makes the current approach
more promising, whether due to a simpler design or improved performance?

While these may not be directly related to your current proposal, the
insights and experience from earlier work could help guide the
development and shape the direction of this patch.

Of course, some of this context can be pieced together from mailing
list discussions and past talks, but doing so raises the bar for future
reviewers. Any additional background you can share would be very
helpful.

-- 
Best,
Xuneng