Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vi4sw-00EIla-1j for pgsql-hackers@arkaria.postgresql.org; Tue, 20 Jan 2026 06:06:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vi4sv-00G9zB-1w for pgsql-hackers@arkaria.postgresql.org; Tue, 20 Jan 2026 06:06:37 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vi4sv-00G9z2-0T for pgsql-hackers@lists.postgresql.org; Tue, 20 Jan 2026 06:06:37 +0000 Received: from mail-dy1-x1336.google.com ([2607:f8b0:4864:20::1336]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vi4ss-001MpG-24 for pgsql-hackers@lists.postgresql.org; Tue, 20 Jan 2026 06:06:36 +0000 Received: by mail-dy1-x1336.google.com with SMTP id 5a478bee46e88-2b6f5a9cecaso1013420eec.0 for ; Mon, 19 Jan 2026 22:06:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1768889194; x=1769493994; darn=lists.postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=X2h7VczYTik142wZzO/4H/Yef8Bvl5F4EroylDK2jrE=; b=fyUuZtRyp3yFMDo2bVhosLFEP/nI4ilOjzBNUUz5XSBKf4Fl75VciEtZq+853O8ssv eTbaHk0cL6SsnxfsVNkrlltKaW55HHNZZYWvXGmH0MYUR7JK0x1xcGHtSpjSUfutZGMk mUtSNDyMjx/AeFQtK11ieLBfy0q+QAD8sfbHHCrCEuDIUMtXlzO9aK0aB/yDINGMN5C0 pCoen65AzrXjUoF9MIxsKTyz41OfaUOCB3fP3IoXfvwqNV1AxMcFwxLs/boSo055ruBw N44oSqpqQKVp4Iabuq4W8cdw/+ZOnuvI+NLyjHup075m/tU1qsYVUCD2Xa8C+Yelkp2A kGOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1768889194; x=1769493994; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=X2h7VczYTik142wZzO/4H/Yef8Bvl5F4EroylDK2jrE=; b=Vw5xl0HlPPasoGViVGkn0m3fgYT5R6YPT/2lV1iwVoVwBSZ93yijn+k9d8IQOsXwxE fi24rmaKEplTB3uOUvx5WShyK77HYNfzIZrErjv7pfjDYrkKDazQha30lVx0ZdMHHkdb Xy6vU0W4ZWrOMjQtkdFu94d08sJl0zqde5h6tXpevA8U/+CmKyntOB0JbSMa270TtnuY 9+O4tLs60Sb7Hqi5A6eiTNymWQzC6UjtBeY5ZdDn98EbIXy0NMuztpnacl8OScBLQsqG PkO7dOmMzitzihEs7eHVvn+tUKf5eGlo9i15It5U9gd3G+hcfVQ4UrfyVFdIon2fpIN8 uuPQ== X-Gm-Message-State: AOJu0YxOykFDCu4ZkWYJ9YCbK7WnH9PULjrhGFg1p3RCtFu//6nPpWZ9 EPmlwDbZf+pRTSs69OTZ+Pr1/dcrOubtNWiy9XYMAqp7cLcTDFpjyV54 X-Gm-Gg: AZuq6aLns53cLpkgA1eufF8+T3m1N80e2dOTovpsPpurtqqmOrmhXM6WngXYhgZU5b2 mkH7ELIFSPkptQPP/6HBXZGhn3VpkRCG649VAj/dvnonUpEGKBxCTi4SMWXx22bL3IZiWOsJBW3 rfIgCQG3z+2rpKaOlYaXinPHR0hgtJhE8u3lZNcmpUa5q38SZgJDpnUGtGxuTVjuoI3xXx9cxAv z++4IjV3U5/zJMt4muqIZFXfHOj29UAd9cnC9fXAMF3J8ke2Rs7nAZZQP9Zuv/SOF1ZdWYwplZK V70Y4+RLXf/sL7JYUxBkTe9j2AoKAbq0v0jd7manYOeYwMm7/Kmfti9f/+FDzB0iByU0sD6kbXZ AGyOzhybHJXoLXoFCpiLy2hFeU8TKZk71d5F0j85YzBx8QV7BseMhtgyB+JbEvVuq8LAPszyaCa 4lj5DyzxReEYYaIN+0d+Ns X-Received: by 2002:a05:7301:7c0e:b0:2b0:4c12:d74d with SMTP id 5a478bee46e88-2b6fd746d53mr538773eec.15.1768889192736; Mon, 19 Jan 2026 22:06:32 -0800 (PST) Received: from smtpclient.apple ([142.171.105.12]) by smtp.gmail.com with ESMTPSA id 5a478bee46e88-2b6b3651f39sm15870371eec.24.2026.01.19.22.06.31 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 19 Jan 2026 22:06:32 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81.1.4\)) Subject: Re: More speedups for tuple deformation From: Chao Li In-Reply-To: <0663AA4F-74FB-48A5-B77B-3C150445FF2B@gmail.com> Date: Tue, 20 Jan 2026 14:05:58 +0800 Cc: PostgreSQL Developers Content-Transfer-Encoding: quoted-printable Message-Id: References: <9A17C43D-7A28-4885-8974-555A40C9523E@gmail.com> <0663AA4F-74FB-48A5-B77B-3C150445FF2B@gmail.com> To: David Rowley X-Mailer: Apple Mail (2.3826.700.81.1.4) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On Jan 20, 2026, at 12:32, Chao Li wrote: >=20 >=20 >=20 >> On Jan 20, 2026, at 08:11, David Rowley wrote: >>=20 >> On Mon, 19 Jan 2026 at 18:48, Chao Li wrote: >>> I reviewed the patch and traced some basic workflows. But I = haven=E2=80=99t done a load test to compare performance differences with = and without this patch, I will do that if I get some bandwidth later. = Here comes some review comments: >>>=20 >>> 1 - tupmacs.h >>> ``` >>> + /* Create a mask with all bits beyond natts's bit set to off = */ >>> + mask =3D 0xFF & ((((uint8) 1) << (natts & 7)) - 1); >>> + byte =3D (~bits[lastByte]) & mask; >>> ``` >>>=20 >>> When I read the code, I got an impression bits[lastByte] might = overflow when natts % 8 =3D=3D 0, so I traced the code, then I realized = that, this function is only called when a row has null values, so that, = when reaching here, natts % 8 !=3D 0, otherwise it should return earlier = within the for loop. >>=20 >> It certainly is possible to get to that part of the code when natts = is >> a multiple of 8 and the tuple contains NULLs after that (we may not = be >> deforming the entire tuple). The code you quoted that's setting = "mask" >> in that case will produce a zero mask, resulting in not finding any >> NULLs. I don't quite see any risk of overflowing any of the types >> here. If natts is 16 then effectively the code does 0xFF & ((1 << 0) >> - 1); so no overflow. Just left shift by 0 bits and bitwise AND with >> zero, resulting in the mask becoming zero. >>=20 >> How about if I write the comment as follows? >>=20 >> /* >> * Create a mask with all bits beyond natts's bit set to off. The = code >> * below will generate a zero mask when natts & 7 =3D=3D 0. When that = happens >> * all bytes that need to be checked were done so in the loop above. = The >> * code below will create an empty mask and end up returning natts. = This >> * has been done to avoid having to write a special case to check if = we've >> * covered all bytes already. >> */ >>=20 >=20 > I=E2=80=99m sorry I didn=E2=80=99t express myself clearly, maybe I = should have used =E2=80=9COOB=E2=80=9D rather than =E2=80=9Coverflow". = My real concern is about out-of-boundary read of bits[lastByte] when = natts&7=3D=3D0. >=20 > Say, natts is 16, then bits is 2 bytes long; lastByte =3D 16>>3 =3D 2, = so bits[2] is a OOB read. >=20 > If first_null_attr() is only called when hasnulls=3D=3Dtrue, then it = will never hit the OOB point, because it will return early from the = =E2=80=9Cfor=E2=80=9D loop. In the current patch, which is true, so the = OOB should never happen. >=20 > However, I don=E2=80=99t see any comment mentions something like = =E2=80=9Cfirst_null_attr() should only be called when hasnulls is true. = If in future one calls first_null_attr() in a situation where hasnulls = =3D=3D false, then the OOB will be triggered. >=20 > The comment you added explains that even if OOB happens, no matter = what value is hold by bits[lastByte], because mask is 0, the final = result is still correct, which is true, but OOB is still a concern. If = the bits array happens to end exactly at the edge of a memory page, the = OOB read bits[lastByte] may trigger a segment fault; and valgrind may = detect the OOB and complain about it. >=20 > So, my original comment was that, we should at least add something to = the header comment to mention =E2=80=9Cfirst_null_attr() should only be = called when hasnulls is true. If we can add an Assert to ensure hasnulls = is true, that would be even better. >=20 > But if we want first_null_attr() to be safe no matter hasnulls is true = or false, I think we should avoid the OOB. >=20 I also noticed one thing that, with running an arbitrary SQL statement, = first_null_attr() might be called with natts=3D0, so maybe it can have a = fast path to return 0 directly if natts=3D=3D0. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/