From: Thomas Munro
Date: Sat, 6 Dec 2025 17:47:23 +1300
Subject: Re: wdavdaemon / Microsoft Defender for Endpoint on Linux and slow Postgres recovery?
To: "Colin 't Hart"
Cc: PostgreSQL General

On Wed, Dec 3, 2025 at 3:48 AM Colin 't Hart wrote:
> One of my clients has Microsoft Defender for Endpoint on Linux installed on their Postgres servers.
>
> I was testing a database restore from pgBackRest. The restore itself seemed to complete in a reasonable amount of time, but then the Postgres recovery started and it was extremely slow to retrieve and apply the WAL files.
>
> I noticed wdavdaemon taking most of the CPU, and Postgres getting very little.

These days, tools like that work by monitoring every read, write, etc. via kernel event queues (fanotify on Linux, ESF on macOS, IDK on Windows; it might still be using something more efficient but less isolated with tentacles inside the kernel). Those queues usually have a fixed size and when they overflow because the event consumer isn't keeping up, the monitored process can be blocked. That's probably true even if running in a mode where it doesn't have to reply to allow the operation to proceed. Presumably the consumer is running some kind of rolling fingerprint check over the data looking for things from its database of malware, which you'd hope would be very well optimised...

My pet theory is that PostgreSQL suffers from these systems more than anything else not because of the total bandwidth but because of the per-operation overheads and our historical 8KB-at-a-time disk and network I/O. Your report about pgBackRest supports that idea: it probably copies a larger total size in big chunks, while recovery reads the WAL 8KB at a time (and evicts data 8KB at a time if your buffer pool is small), and then finally the checkpointer writes back 8KB at a time.
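
To put rough numbers on the per-operation theory above, here is a back-of-envelope sketch. The per-event cost and the bandwidth figure are made-up assumptions for illustration, not measurements of any real monitoring tool; the only solid part is the call count, which is 131072 calls for 1 GiB in 8 KiB blocks versus 1024 calls in 1 MiB blocks.

```python
# Back-of-envelope model: cost of moving the same bytes at different I/O sizes
# when a monitor adds a fixed cost to every system call.
# PER_EVENT_SECONDS and BANDWIDTH are hypothetical numbers, not measurements.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PER_EVENT_SECONDS = 50e-6   # hypothetical monitoring overhead per read/write
BANDWIDTH = 1 * GIB         # hypothetical bytes per second the storage sustains


def op_count(total_bytes: int, block_size: int) -> int:
    """Number of read/write calls needed to move total_bytes (rounded up)."""
    return -(-total_bytes // block_size)


def estimated_seconds(total_bytes: int, block_size: int) -> float:
    """Raw transfer time plus the fixed per-call overhead for every call."""
    transfer = total_bytes / BANDWIDTH
    overhead = op_count(total_bytes, block_size) * PER_EVENT_SECONDS
    return transfer + overhead


if __name__ == "__main__":
    for block_size in (8 * KIB, 128 * KIB, 1 * MIB):
        ops = op_count(GIB, block_size)
        secs = estimated_seconds(GIB, block_size)
        print(f"{block_size // KIB:5d} KiB blocks: {ops:7d} calls, ~{secs:.2f} s")
```

Under these assumed numbers the fixed per-call cost dominates at 8 KiB and nearly vanishes at 1 MiB, even though the bytes moved are identical. The model ignores queue overflow entirely, so if the monitored process really does get blocked, the small-block case would look worse than this.
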
Another factor is that it might be using only one fanotify queue for each process, or worse, but IDK if that matters; it sounds like the CPU might be saturated anyway?

Future releases should improve all of that with bigger I/Os for WAL (currently read through an 8KB drinking straw; dunno if it's spying on reads too?) and data (I/O combining, various strategies, various prototypes[1][2], watch this space). It's also been proposed a few times that we should have an option to skip the end-of-recovery checkpoint, so then you'd get a regular "spread" checkpoint that the spyware could keep up with (assuming that it normally keeps up, just not in crash recovery).

Another thing that probably makes this worse in this strange environment, if we assume it is due to small writes and reads are not affected, is that crash recovery currently dirties all pages that the WAL touches, forgetting progress that already made it to disk: it overwrites the LSN with an FPW and then replays all changes on top. It could instead read the page in and skip a lot of work if the LSN is high enough, thereby often avoiding dirtying and re-writing the page, whenever checksums are on (as they are now by default). The checksum could be used as proof that the page wasn't torn by a non-atomic write interrupted by a power outage.

I doubt anyone is really that interested in optimising for such setups per se when anyone will tell you to just turn it off, but the reason I've thought about it enough to take a guess is that my corporate-managed Mac was running the PostgreSQL test suite so slowly it would time out. I was sufficiently nerd-sniped to figure out that it could keep up with bursts of I/O pretty well, but everything turned to custard under sustained workloads, notably in the recovery tests which deliberately run with a tiny buffer pool.
As someone working on bits of our I/O plumbing, I couldn't help speculating that something that is objectively terrible about PostgreSQL is really just being magnified by strange new overheads that mess with the economics. It may not be a goal but I will still be happy if it copes with this stuff as a by-product of general improvements like generalised I/O combining. (Funnily enough I've actually got a bunch of unpublished tooling to simulate, detect and manage invisible I/O queuing.)

> I wonder if anyone here has any experience with configuring exclusions so that the WAL files can be processed faster?

Yep, doing that entirely fixed the cliff and vastly reduced the CPU usage on my corporate Mac. There is still a small measurable slowdown, whereas while monitored the recovery test suite couldn't even complete without timing out. I expect exactly the same on Linux but haven't tried it.

> Any advice on what to communicate with their IT department about using this on their database servers? I've never encountered it on Linux before...

There is lots of writing on the internet about excluding pgdata from these types of tools. Much of it is concerned with Windows-specific problems: opening files and directories or mapping files at bad times can cause various PostgreSQL file operations to fail on that OS. I don't know of any reason why periodic scans of pgdata should interfere with PostgreSQL on Linux other than consuming I/O bandwidth; it seems to be just the per-syscall stuff that is unworkable.

You might be able to show "meson test" failing as some kind of evidence that PostgreSQL is allergic to it. Or if you want to try to find a one-liner demonstration independent of PostgreSQL, you could test the can't-keep-up-with-stream-of-tiny-writes theory by experimenting with "dd" at different block sizes. I expect you'll find a size below which the fanotify queue quickly overflows and performance falls off a cliff.
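
If "dd" isn't convenient, the same block-size experiment can be sketched in a few lines of Python. This is only a rough stand-in, not a calibrated benchmark: the total size, the list of block sizes and the use of the system temp directory are all arbitrary assumptions, and the scratch file needs to live on a filesystem the monitoring tool is actually watching for the result to mean anything.

```python
import os
import tempfile
import time


def timed_write(directory: str, total_bytes: int, block_size: int) -> float:
    """Write total_bytes to a scratch file in block_size chunks; return seconds taken."""
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        remaining = total_bytes
        while remaining > 0:
            chunk = block if remaining >= block_size else block[:remaining]
            # One write() system call per iteration; buffered, no fsync.
            remaining -= os.write(fd, chunk)
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Assumption: replace this with a directory on the monitored filesystem.
    target_dir = tempfile.gettempdir()
    total = 256 * 1024 * 1024
    for block_size in (512, 4096, 8192, 65536, 1024 * 1024):
        secs = timed_write(target_dir, total, block_size)
        print(f"bs={block_size:8d}  {total / secs / 1e6:10.1f} MB/s")
```

Running it once with the monitor active and once with the directory excluded should show whether throughput collapses below some block size only in the monitored case.
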
Current versions of PostgreSQL assume fast and consistent buffered writes and pretend the system calls are free. These monitoring tools make them expensive and also non-linear by sending messages around with carrier pigeons.

[1] https://www.postgresql.org/message-id/flat/CAAKRu_bcWRvRwZUop_d9vzF9nHAiT%2B-uPzkJ%3DS3ShZ1GqeAYOw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CA%2BhUKGK1in4FiWtisXZ%2BJo-cNSbWjmBcPww3w3DBM%2BwhJTABXA%40mail.gmail.com