Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5QEZ-003Ehb-08 for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Mar 2026 15:33:27 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w5QEX-00Ekze-1x for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Mar 2026 15:33:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5QEX-00EkzW-0w for pgsql-hackers@lists.postgresql.org; Wed, 25 Mar 2026 15:33:25 +0000 Received: from mail-ej1-x630.google.com ([2a00:1450:4864:20::630]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w5QEU-000000015bW-3d2V for pgsql-hackers@postgresql.org; Wed, 25 Mar 2026 15:33:25 +0000 Received: by mail-ej1-x630.google.com with SMTP id a640c23a62f3a-b8f9568e074so449345166b.0 for ; Wed, 25 Mar 2026 08:33:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774452801; cv=none; d=google.com; s=arc-20240605; b=Yze85BdsUAHUzXbgSTZP3ahZhpQpXWM60ajNfY5Xtl6QyJDSNAZK5H61slk3vt3PU3 kE6W46+I12PlejZDHeinfTidzYPJOXTO7TXWJ3AdL14qV1Yv1p9nqwnPXkKZdJ0xh6KK GracxrPOnIrOk3y6EFy84pMgsOOuaLxA4RWbAN2SGxY9AjAV2ETyh26tnIlu7omRzjpK S4NU4uT3YDWM01F9Q0xI0KDQITgSWlSgB2zj0QBNr4ibgmlQz6aOknGGam9oeXUCC1kv BaPOtduIEXF9snZf6a89xVQTuBdVanqVf0I/Mjnk/fZQdTesaaxYBaiPvu80AQutN92C RDPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=wW6prEmsZl3D9Db3AKF/3cas2c2RmXnA7mBw7yJwYnk=; fh=CPU7oJ2aNVEyc0Iy1cUNSfw/7JRkTXJuk0IGjFf/H4c=; b=c3/7haVvE4uU0b9swR/syrd7/Y5n/bcqi48MZtec9zCnoiqRsYUWG0PcOvBojK0PyJ Tpfqt3qKsnOlhzb3zPmhVV6fY1DeWhqi24P3+iqGHX1hwyHmtZuh9/seFDPJif0Y4Sj9 PKOAfaoTqqt4+/KSiZVh0GZT9818lOD+qgk8cidU/uJ50CQ3RpPaJIR6cL+WC0EH+NkW BmVKMZVttn6XHz1zlpXkZr8LTweScLnvH0HqcrPg4Z2DrHKDz1tXwdfJSQYX3qEN+PrO 2bXzm+ZLLBYu4kSl3ctUOFPM3RJs9Q1vookeevCvIneUSP/dgWpKihtnGgT2R/A4qnWu 07Bw==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774452801; x=1775057601; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=wW6prEmsZl3D9Db3AKF/3cas2c2RmXnA7mBw7yJwYnk=; b=qBVpZARvjUxkyo1LarhE8j/8icJTeYz/WcNo2Mp3cKb9mnUVOWkJhoU0NGLcfQvIFX wZafEXFfJnwuQCv5kd+dtbo959gNBWmElVAdRTwBjhnqiL4w1O7KitFySjDj6isuCcOe gfWyo20Cumd6UicyuJkvkyYzpabVnPXE737J1a5uuPwKYPDgpQHHYBttV8M92b3Apwso rFs4AnIXtg+4pxOuIGXQl6WP8c/VYhfOyBOtIdlgMAX4q3+JhVoE+Io/bnygIaAEGXHn HUHsj86oP5Crcnxu5ZP4ue0tVS5DQI5PklVCdHUIFbTIcIYPYWMU0HwOkzDt4Ye8e2xN tjpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774452801; x=1775057601; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=wW6prEmsZl3D9Db3AKF/3cas2c2RmXnA7mBw7yJwYnk=; b=aWddh/SDsbKXQYi3fy/8zIz2I5Ky0U7l91sXLG3vdgq30dubJCvTZeOOybh16r1Jw5 OfjOjbeTmkwMiLVC1FUi7LgcoDlG6TXYUbPqghZNVitDf73knUsSpOjDVWBNYPVJzZwn ePtaKR4LdDEfjyakm8xJcY9htTivch3TQ3NZWQ07NhJSU8W+X3NZMY2cZ5H0JVhoQSDJ X6R543QbuNzOvJBTWCjDZ6CnBPcRRvk1aVHNW1IOQevsfS6ekn7WF6QJjb5KflJlAWPP YPe265fJMHg4UCWUq1NjloS18A7IqCOoSWx74UFHJaQL0ZEY9Xy0n8q2QeWJUGp4imUc 94Pw== X-Forwarded-Encrypted: i=1; AJvYcCULyZbVFCPc7eTD1PHtWI3GMRrauvW6wqOmJexdUpuvSUM5PRyRkoZmo7OHFY7k2WfdJr/v4jQHlzl2OvOR@postgresql.org X-Gm-Message-State: AOJu0YzVXfylb16YOtahGlqhB6IYIjB2FQoI75RtBJBxQqZQVtBX5hsZ EU6RFK8nRaVvKYx6rEsrQ4tEXMcelXPhOA003dipU0TiS++9E+kkw+XuMOXd95kXetfHwQ0ENxm OmX5h5zDTGMYib8JocHl7cpmz7M8wwj0= X-Gm-Gg: ATEYQzyEWbRqvI+PtjSPuIu3K6hRozstYrjDjntyGNlFKaVZLI1LDJlaPzzo/LBJb8U IBD2lEw3B3qLUz9FFgQKHjDrd+r/DIu0brsWRl3YqzKTidZ7wcM7opv25DG6ETU8lbrHZkO/YMz AFpNztqIlLh7nPI7j5X2ikmYDACsSMkpPnOiEF2a+87Uyceug+6V4wWeIVnT5O1bRhpT3XuyPkZ jsphhnM8fgyvOqSHQs5hxedVeBs029JoURXVn3v5e9DkshAbAAjv/Hjz1/o+2K3UHknB3APLWDf RxKXSMnwWa6uTrs9DYX1+zb28BAv/gianyncx3biLsRRJNe05/qmBWV0RPvYQ7bxLwgd9q9FZX1 bsLjTgJuZ X-Received: by 2002:a17:907:61a3:b0:b99:1074:74f with SMTP id a640c23a62f3a-b9a54238966mr205187666b.34.1774452800090; Wed, 25 Mar 2026 08:33:20 -0700 (PDT) MIME-Version: 1.0 References: <42rdu4q44kvsq53fz5qgzuawqpaytvnemsnquynlfch5mqhc2m@6ytnlgivtzro> In-Reply-To: From: Melanie Plageman Date: Wed, 25 Mar 2026 11:33:08 -0400 X-Gm-Features: AQROBzCDTl4LNsoD2gEJJF8xTjgGh3VHRPiyaKWUQb8ARjsP_gAR1iXX8hPaf_E Message-ID: Subject: Re: Don't synchronously wait for already-in-progress IO in read stream To: Andres Freund Cc: Nazir Bilal Yavuz , Thomas Munro , pgsql-hackers@postgresql.org, Peter Geoghegan , Tomas Vondra Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Mar 25, 2026 at 11:15=E2=80=AFAM Andres Freund = wrote: > > > > In the attached prototype I replaced your patch introducing > > > PrepareHeadBufferReadIO()/ PrepareAdditionalBufferReadIO() with one t= hat > > > instead revises StartBufferIO() to have the enum return value you int= roduced > > > and a PgAioWaitRef * argument that callers that would like to asynchr= onously > > > wait for the IO to complete (others pass NULL). There are some other = cleanups > > > in it too, see the commit message for more details. > > > > I've come around to this. The aspect I like least is that io_wref is > > used both as an output parameter _and_ a decision input parameter. > > Do you have an alternative suggestion? We could add a dedicated parameter= for > that, but then it just opens up different ways of calling the function wi= th > the wrong arguments. Yea, I don't know. The cleanest way to handle callers with different intentions and tolerances is to have different functions that allow different behavior. But that's the opposite of what we're currently trying to do :) I tried to come up with some intention-related enum input argument, but that seems like a bit of over-engineering. > > Some callers never want to wait on in-progress IO (and don't have to), > > while others must eventually wait but can defer that waiting as long > > as they have a wait reference. If they can't get a wait reference, > > they have no way to wait later, so they must wait now. The presence of > > io_wref indicates this difference. > > > > I think it's important to express that less mechanically than in your > > current header comment and comment in the else block of > > StartSharedBufferIO() where we do the waiting. Explaining first=E2=80= =94before > > detailing argument combinations=E2=80=94why a caller would want to pass > > io_wref might help. > > I'm not entirely sure what you'd like here. Would the following comment > do the trick? > > * In several scenarios the buffer may already be undergoing I/O in this = or > * another backend. How to best handle that depends on the caller's > * situation. It might be appropriate to wait synchronously (e.g., becaus= e the > * buffer is about to be invalidated); wait asynchronously, using the buf= fer's > * IO wait reference (e.g., because the caller is doing readahead and doe= sn't > * need the buffer to be ready immediately); or to not wait at all (e.g., > * because the caller is trying to combine IO for this buffer with anothe= r > * buffer). > * > * How and whether to wait is controlled by the wait in io_wref parameter= s. In > * detail: > * > * Sounds good to me. > > However, I do think you need to enumerate the different combinations > > of wait and io_wref (as you've done) to make it clear what they are. > > > > I, for example, find it very confusing what wait =3D=3D false and io_wr= ef > > not NULL would mean. If IO is in progress on the buffer and the > > io_wref is not valid yet, the caller would get the expected > > BUFFER_IO_IN_PROGRESS return value but io_wref won't be set. I could > > see callers easily misinterpreting the API and passing this > > combination when what they want is wait =3D=3D true and io_wref not NUL= L > > -- because they don't want to synchronously wait. > > Hm. I started out proposing that we should just add an assert documenting= this > is a nonsensical combination. But when writing the comment for that I rea= lized > that it theoretically could make sense to pass in wait =3D=3D false and i= o_wref !=3D > NULL, if you wanted to get a wait reference, but would not want to do a > WaitIO() if there's no wait reference set. > > I don't think that's something we need right now, but ... We should somehow express this in the comment. - Melanie