Date: Sat, 25 Apr 2026 16:08:02 -0400
From: "Greg Burd"
To: "PostgreSQL Hackers"
Cc: "Andres Freund", "Tomas Vondra", "Nathan Bossart"
Message-Id: 
<79629577-3ad8-4b1c-a469-ebc2cb4c5104@app.fastmail.com>
Subject: [PATCH] Batched clock sweep to reduce cross-socket atomic contention

Hello hackers,

A colleague of mine, Jim Mlodgenski, has been poking at NUMA behavior on
some of the newer AWS bare-metal instance types (r8i in particular,
which exposes 6 NUMA nodes via SNC3 on a 2-socket box), and in the
process landed on a very small change to freelist.c that I think is
worth showing around. His patch is attached with some tweaks of my own.

Full disclosure: the exploration that led Jim to this patch idea was
done with help from an AI assistant (Kiro); the idea, the benchmarking,
and the final shape of the patch are human-driven, but I wanted to be up
front about how his investigation started. Happy to discuss that
separately if people want to.

The one-line summary: instead of advancing nextVictimBuffer one buffer
at a time via pg_atomic_fetch_add_u32, each backend claims a batch of 64
consecutive buffer IDs from the shared hand and then iterates them
privately. Global sweep order is preserved -- every buffer is still
visited exactly once per complete pass -- but the number of atomic
operations on that one cache line drops by roughly a factor of the batch
size.

Why this matters
----------------

On multi-socket boxes under eviction pressure, every backend that needs
a victim buffer ends up doing an atomic fetch-add on the same cache
line. On a single socket, a locked RMW on that cache line stays warm in
L1/L2 and completes in ~20ns. On 2+ sockets, the line bounces over
QPI/UPI at ~100-200ns per op, and with hundreds of backends running
StrategyGetBuffer() concurrently, the line ping-pongs constantly.
It's a textbook NUMA scalability bottleneck, and once shared_buffers is
smaller than the working set and the sweep is running continuously, that
single atomic is what you hit in a perf profile (elevated bus-cycles,
cache-misses on the cache line holding nextVictimBuffer).

Andres pointed at the same spot in his pgconf.eu 2024 talk, and Tomas
called it out in the "Adding basic NUMA awareness" thread [1] -- so this
isn't news to anyone who's been looking at this area. What I think is
new is a fix that targets just this, without any of the surrounding
architectural change.

The framing (credit to Jim): the clock hand is doing two jobs. It
*coordinates* backends so they don't redundantly decrement usage_count
on the same buffers and so they eventually visit every buffer in the
pool exactly once per pass. It also *serializes* access to the counter.
Coordination is the part we want. Serialization is the part that's
killing us on bigger NUMA boxes. Batching keeps the coordination and
thins out the serialization.

How it works
------------

There are two per-backend statics, MyBatchPos and MyBatchEnd. When a
backend calls ClockSweepTick() and its local batch is exhausted, it does
a single fetch-add of CLOCK_SWEEP_BATCH_SIZE (64) against
nextVictimBuffer and now owns that range. Subsequent ticks just bump the
local counter.

Wraparound got a small rewrite. The original code had the backend that
crossed NBuffers drive completePasses++ under the spinlock via a CAS
loop. With batching, multiple backends can each land a fetch-add that
returns a value >= NBuffers in the same pass, so the logic now is:
whoever sees a start >= NBuffers takes the spinlock, re-reads the
counter, and if it's still out of range does a single CAS to wrap it and
bumps completePasses. If somebody else already wrapped, we just release
and move on. StrategySyncStart() still sees a consistent
(nextVictimBuffer, completePasses) pair.

The batch size is gated on whether we actually have multiple NUMA nodes.
On a single-socket box the atomic is already socket-local and batching
just makes backends skip further ahead than they need to, so we fall
back to batch size 1 -- which is bit-for-bit the original behavior. The
guard:

    if (pg_numa_init() != -1 && pg_numa_get_max_node() >= 1)
        ClockSweepBatchSize = Min(CLOCK_SWEEP_BATCH_SIZE, (uint32) NBuffers);
    else
        ClockSweepBatchSize = 1;

Min() against NBuffers covers the small-shared_buffers corner so a batch
never wraps the pool multiple times in one claim.

Does batching mess up the meaning of usage_count?
-------------------------------------------------

Short answer: no. I want to walk through this because it was my first
concern too, and I think it's the question that will come up most on
review.

The clock sweep's usage_count is an access-frequency approximation
measured in units of *complete passes*. A buffer with usage_count = N
survives N passes without a re-pin. The semantic meaning lives at pass
granularity, not at individual-buffer granularity.

What batching changes: intra-pass temporal ordering. Without batching,
with N backends sweeping, decrements are interleaved -- backend A hits
B[0], backend B hits B[1], backend C hits B[2]. With batching, backend A
hits B[0..63] in a tight local burst, then backend B hits B[64..127],
etc. The 64-buffer chunks are decremented in bursts rather than
individually.

Why it doesn't matter:

1. Every buffer still gets decremented exactly once per complete pass.
   The invariant the algorithm actually depends on is untouched.

2. A buffer's survival window is the time between consecutive passes.
   That's milliseconds to seconds under load. Whether B[0] gets
   decremented 50us before or 50us after B[63] within the same pass is
   below the resolution of anything usage_count is trying to measure.

3. The bgwriter's feedback loop reads (nextVictimBuffer, completePasses,
   numBufferAllocs) via StrategySyncStart() every ~200ms.
   nextVictimBuffer still advances at the same *total* rate (64 per
   atomic op, but atomic ops happen 1/64 as often). The position it
   reports can jitter by up to 64 buffers relative to the one-at-a-time
   case, but BgBufferSync()'s smoothed estimates operate over thousands
   of buffers per cycle, so the jitter disappears into the averaging.
   numBufferAllocs still increments once per allocation.
   strategy_delta, smoothed_alloc, smoothed_density,
   reusable_buffers_est -- all unaffected in any way I can see.

Table form, because it's easier to argue with:

Property                          | Unpatched      | Batched
----------------------------------+----------------+----------------
Buffers visited per pass          | NBuffers       | NBuffers
Decrements per buffer per pass    | 1              | 1
Eviction threshold                | usage_count==0 | usage_count==0
Max survival (passes)             | 6              | 6
Decrement ordering within a pass  | interleaved    | chunked
bgwriter allocation rate signal   | accurate       | accurate
Cross-socket atomic traffic       | 1 per buffer   | 1 per 64

There is one subtle difference worth naming. When a backend finds a
victim at B[5] of its batch, it returns with B[6..63] still unconsumed
in its private batch (MyBatchEnd is one past B[63]). The next time that
backend needs a victim it resumes at B[6], not at wherever the global
hand now points. So the backend drains its batch over multiple
StrategyGetBuffer() calls rather than all at once. Under heavy load,
where batches are consumed in microseconds, this is invisible. Under
light load, the implication is that some buffers can sit with slightly
stale usage_count for longer than they would have before. But "light
load" means "the sweep is barely moving and nothing wants to evict
anyway" -- so the effect doesn't show up where it would hurt.

There's also a small positive side-effect: cache locality. The backend
that just touched BufferDescriptor[B[0]] has the adjacent descriptors
warm in L1/L2. Walking B[0..63] locally is cheaper than walking a
striped interleaving where each descriptor was last touched by a
different core.
I haven't tried to isolate this in perf, but it falls out naturally.

Benchmarks
----------

Jim ran these; I'm still working on reproducing them locally and will
post independent numbers in a follow-up. All bare metal, Linux, huge
pages enabled throughout (more on that below), postmaster pinned to node
0 with `numactl --cpunodebind=0` because otherwise stock TPS varied from
31K to 40K depending on which node the postmaster happened to land on at
launch -- worth flagging for anyone trying to reproduce. Workload is
pgbench scale 3000 (~45GB) with shared_buffers=32GB, so the working set
always spills and the sweep is hot.

r8i.metal-96xl (384 vCPUs, 2 sockets, 6 NUMA nodes via SNC3):

  pgbench RO (TPS):

    Clients     Stock   Patched   Delta
         64    31,457    36,353    +16%
        128    31,678    37,864    +20%
        256    31,510    37,558    +19%
        384    31,431    37,464    +19%
        512    31,329    37,040    +18%

  pgbench RW (TPS):

    Clients     Stock   Patched   Delta
         64     7,685     7,713      0%
        128    10,420    10,541     +1%
        256    12,393    12,463     +1%
        384    15,317    15,197     -1%
        512    17,930    17,978      0%

m6i.metal (128 vCPUs, 2 sockets, Ice Lake):

  RO +19-20%, RW within noise.

c8i.metal-48xl (192 vCPUs, 1 socket):

  Single-socket -> batch_size=1 -> original code path. No behavioral
  change. (I double-checked this one specifically because it's the
  sanity test for the gate.)

HammerDB TPC-C on m6i.metal (1000 warehouses):

    VUs       Stock     Patched   Delta
    128     358,518     349,787     -2%
    256     332,098     330,272     -1%
    384     365,782     377,519     +3%
    512     370,663     386,526     +4%

No TPC-C regression, which was the thing we were most worried about. An
earlier attempt (per-socket partitioned sweep, see below) was -13% on
this same workload.

The general shape is: the scaling curve flattens later. Unpatched, TPS
tops out around 128 clients and stays flat up to 512 because backends
are spending cycles waiting on the cache line rather than doing work.
Patched, the curve keeps rising past the point where unpatched plateaus.
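
For anyone rechecking the tables: the Delta columns are consistent with
a plain percent change relative to stock, rounded to the nearest whole
percent. That rounding rule is an assumption inferred from the numbers,
not something stated with them, and sketch_delta_pct is a hypothetical
helper name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): percent change from the
 * stock result to the patched result, rounded half away from zero.
 * Integer arithmetic only, so it behaves the same on every platform.
 */
int
sketch_delta_pct(int64_t stock, int64_t patched)
{
	int64_t		scaled;
	int64_t		half;

	assert(stock > 0);

	scaled = 100 * (patched - stock);	/* percent, scaled by stock */
	half = stock / 2;					/* rounding offset */

	if (scaled >= 0)
		return (int) ((scaled + half) / stock);
	return (int) -((-scaled + half) / stock);
}
```

For example, the 64-client read-only row is sketch_delta_pct(31457,
36353), and the 384-client read-write row is sketch_delta_pct(15317,
15197).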

Huge pages caveat: all of the above was run with huge pages on, on
large-memory instances (the r8i.96xl has 3TB, so Jim never considered
running without them). We have not characterized the non-huge-pages
case. That's on my list; I don't expect it to change the conclusion, but
I shouldn't speak for data I haven't collected.

Relationship to Tomas's NUMA series
-----------------------------------

Tomas posted a multi-patch NUMA-awareness series in [1] covering buffer
interleaving across nodes, partitioned freelists, partitioned clock
sweep, PGPROC interleaving, and related pieces. I want to be careful
here because I don't think we should frame this patch as competing with
that work.

One thing I found striking as I re-read the thread: in the benchmarks
Tomas posted later in the series, *most of the benefit comes from
partitioning the clock sweep*, and the NUMA memory-placement layer on
top sometimes runs slower than partitioning alone. His own conclusion,
quoted roughly: the benefit mostly comes from just partitioning the
clock sweep, and it's largely independent of the NUMA stuff; the NUMA
partitioning is often slower.

That observation is the thing that makes me think batching is worth
considering on its own. It's going after the same bottleneck Tomas's
partitioning addresses, but:

- without splitting global eviction visibility (which is where
  cross-partition stealing gets complicated),
- without requiring NUMA-aware buffer placement (which has huge page
  alignment, descriptor-partition-mid-page, and resize complications
  that are still being worked out in that thread),
- without touching PGPROC or bgwriter.

What this patch does *not* do:

- place buffers on specific NUMA nodes
- partition the freelist
- touch PGPROC
- add new GUCs
- change bgwriter

What this patch *does* do:

- target exactly the clock-sweep contention that Tomas's partitioning
  targets, and reduce it by ~64x, in ~30 lines.
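
For readers who want to poke at the claim mechanics outside the server,
here is a minimal standalone C11 sketch of the batched tick described
under "How it works". It is not the attached patch: every sketch_ name,
the pool size of 1000, the four simulated backends, and the round-robin
driver are assumptions made for illustration. The simulation is
single-threaded, so the spinlock the patch takes around the wrap is only
marked by a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NBUFFERS  1000u	/* hypothetical pool size, not a multiple of 64 */
#define SKETCH_BATCH     64u	/* mirrors CLOCK_SWEEP_BATCH_SIZE */
#define SKETCH_NBACKENDS 4u		/* simulated backends */

/* Stand-in for the per-backend MyBatchPos / MyBatchEnd pair. */
typedef struct SketchBackend
{
	uint32_t	pos;			/* next position within the claimed batch */
	uint32_t	end;			/* one past the last claimed position */
} SketchBackend;

static _Atomic uint32_t sketch_hand;	/* stand-in for nextVictimBuffer */
static uint32_t sketch_complete_passes;
static uint32_t sketch_visits[SKETCH_NBUFFERS];

/* Return the next buffer id for this backend, claiming a batch if needed. */
uint32_t
sketch_tick(SketchBackend *b, uint32_t *claims)
{
	if (b->pos >= b->end)
	{
		/* The only atomic RMW on the shared hand per batch. */
		uint32_t	start = atomic_fetch_add(&sketch_hand, SKETCH_BATCH);

		(*claims)++;
		if (start >= SKETCH_NBUFFERS)
		{
			/* The real code would hold buffer_strategy_lock here. */
			uint32_t	current = atomic_load(&sketch_hand);
			uint32_t	wrapped = current % SKETCH_NBUFFERS;

			start %= SKETCH_NBUFFERS;
			if (current >= SKETCH_NBUFFERS &&
				atomic_compare_exchange_strong(&sketch_hand, &current, wrapped))
				sketch_complete_passes++;
		}
		b->pos = start;
		b->end = start + SKETCH_BATCH;
	}
	return b->pos++ % SKETCH_NBUFFERS;
}

/* Drive total_ticks ticks round-robin; return the number of batch claims. */
uint32_t
sketch_run(uint32_t total_ticks)
{
	SketchBackend backends[SKETCH_NBACKENDS];
	uint32_t	claims = 0;

	memset(backends, 0, sizeof(backends));
	memset(sketch_visits, 0, sizeof(sketch_visits));
	atomic_store(&sketch_hand, 0);
	sketch_complete_passes = 0;

	for (uint32_t t = 0; t < total_ticks; t++)
		sketch_visits[sketch_tick(&backends[t % SKETCH_NBACKENDS], &claims)]++;
	return claims;
}

/* 1 if every buffer was visited exactly `expected` times, else 0. */
int
sketch_all_visited(uint32_t expected)
{
	for (uint32_t i = 0; i < SKETCH_NBUFFERS; i++)
		if (sketch_visits[i] != expected)
			return 0;
	return 1;
}

/* Number of wraps recorded so far. */
uint32_t
sketch_passes(void)
{
	return sketch_complete_passes;
}
```

Running sketch_run(32000) drains a whole number of batches on every
simulated backend and covers the pool a whole number of times, which
makes it a convenient point to compare per-buffer visit counts against
the number of shared-counter operations.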

If Tomas's series lands in full, this patch becomes redundant for its
primary use case (though even within a partitioned sweep, the
per-partition atomic still benefits from batching, so it's arguably a
useful primitive either way). If Tomas's series lands incrementally over
several cycles -- which the open items in that thread suggest is the
realistic path -- this gets us a real chunk of the multi-socket win now.

This patch is also orthogonal to my earlier thread about removing the
freelist entirely [2], but given the proximity to that code Jim agreed
that I could propose/steward it here on the list for consideration.

Open questions / things I'd like feedback on
--------------------------------------------

- Batch size. 64 is a round number that worked well in testing, but
  Nathan raised the reasonable point that on small shared_buffers with
  high concurrency, a fixed 64 could be unfortunate. Options: scale with
  shared_buffers (Min(64, NBuffers / N) for some N), scale with
  max_connections, keep it fixed but let operators tune it, or make it a
  function of NUMA node count. I don't have a strong opinion yet; the
  Min(batch, NBuffers) cap covers the "obviously wrong" corner but
  doesn't speak to the "several hundred backends on a few-MB
  shared_buffers" shape. Numbers/ideas/proposals welcome.

- NUMA detection. The gate uses pg_numa_init() /
  pg_numa_get_max_node(). On systems where libnuma isn't available, or
  where get_mempolicy is blocked (some container configurations), we
  fall back to batch size 1. That's safe but it misses the "single
  socket, many cores, still benefits from fewer atomics" case. Might be
  worth a way to force-enable, or batching on all systems with a smaller
  batch size when single-socket. I'd like to measure before deciding.

- Eviction pattern on reads.
  Nathan also flagged that with batching, the buffers a backend ends up
  pinning in one StrategyGetBuffer() call will tend to be contiguous in
  buffer-id space rather than scattered, which is a different allocation
  pattern than today. The usage_count analysis above says this is
  benign, but if anyone has an intuition for a workload where this would
  be observable (e.g., something that cares about the mapping between
  buffer-id and relation locality), I'd like to hear it.

- nextVictimBuffer wraparound. The current code has a mild overflow
  concern papered over with "highly unlikely and wouldn't be
  particularly harmful". With batching this is no worse than before, but
  if we're already touching this function, it might be worth thinking
  about whether to tighten it up in the same patch or a follow-up.

- Should the non-NUMA value for this be derived from core counts that
  imply L1/L2 cache layouts or simply default to 8 rather than 1 to
  realize some benefit?

- Should there be a postgresql.conf setting for this that takes
  precedence?

I'll run the non-huge-pages variant, reproduce the r8i numbers, poke at
the small-shared_buffers corner, and post perf stat output showing the
atomic/cache-miss deltas over the next few days. In the meantime,
eyeballs and skepticism welcome -- I would especially welcome comments
from Andres, who's been in this code recently, and from Tomas, whose
series has the most overlap.

I realize that we're past feature freeze and working on release notes
for v19, so the chances of merging this are slim to none. I think this
could be considered a "performance bug fix for NUMA systems" in this
release, but that is stretching it a bit. It is a big ask at this stage
to land a change like this.

best.

-greg

[1] https://www.postgresql.org/message-id/099b9433-2855-4f1b-b421-d078a5d82017@vondra.me
[2] https://www.postgresql.org/message-id/f0e3c02e-e217-4f04-8dab-1e7e80a228c0@burd.me

[Attachment: v1-0001-Reduce-clock-sweep-atomic-contention-by-claiming-.patch]