Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tP7P7-00AL3S-Iy for psycopg@arkaria.postgresql.org; Sat, 21 Dec 2024 21:52:58 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tP7P6-00CgU5-WB for psycopg@arkaria.postgresql.org; Sat, 21 Dec 2024 21:52:56 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tP7P6-00CgTx-Ix for psycopg@lists.postgresql.org; Sat, 21 Dec 2024 21:52:56 +0000 Received: from mail-il1-x12a.google.com ([2607:f8b0:4864:20::12a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tP7P3-000igM-8S for psycopg@postgresql.org; Sat, 21 Dec 2024 21:52:54 +0000 Received: by mail-il1-x12a.google.com with SMTP id e9e14a558f8ab-3a9cb8460f7so22380205ab.2 for ; Sat, 21 Dec 2024 13:52:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734817972; x=1735422772; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=uLY8UbgecGH3fTjR6BGUszAyHgAUtnE2rFWoLsaisRs=; b=HsOJB0o8/Nq3H7q3w6lfPpIRugJmo33PWXmNslFKNmGi3XaKy9liosK5Wr7GXMfiqm IsRsV+UNtdCeaV0T6sc/sCGcjgr3rTsvk83QLSJCs0zkcQPZfK5rbhdNNk4EbTepw7xe RApZUK3V8yvGPUi59/8mgMGGpxLUselaqXT2lNsDH7V4AYIdn6H6GivB/xcLpfveLViy YYtEWjolYthwb/OWcRyYJ2ivLZ1XpJ+NKxrpP6Qa+XH6RhHhKKglFqHlWt6V4iZU1UQu iQeH5TsJx0nC3ONfORuf0jekzJm/FuDZ76B30bP8CpfOit8AWB6rJXdzZiBJf3jj1gbK 038Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734817972; x=1735422772; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=uLY8UbgecGH3fTjR6BGUszAyHgAUtnE2rFWoLsaisRs=; b=J0O/5aIG7j2vlr8wH4B0A6nc6O0lWn8JfPJEX9i3poieJLf30T6XUh8yFuhd0I8v7G V2OSphw15/szwGAu0NY8vGU6SXNl5lu2thiOCUc+Yu/Ho715zI0Hl5T7M3nHg/+hpGnP t5AOIAx2/GWffd4+DKY0NWIKo/30P/wvUQXtsXu6DXC+rdvE1psItyb/LFUNqlSMngyG J1bmDIDS9uv5g7Mk6ttcrZFhdLiZows9lgO22tvBza7Lcj1jHpFYxR6wYuq8HXqd4sfk HY7ZIsDtl+slBB1nZ7KJ47NW/JGWudEi/ubxuhMlgKsFpQW8qPjJvVvJgL82jy2hESd1 L9Tw== X-Gm-Message-State: AOJu0YwjKRP1L+t4BmeSpzAChO7nUKEbPLiOvKc3NjNbkhGuWhAipUpi htpp+tF2zUFnwkeeMAPuIvCKP7KcG7kQiLGmVb5gN0Tz0PVcP+R43UzW8ZivCx8WUYF7zoR10oS rDTd11yDdhwcObi1XAM/u+8kjZVmZcA== X-Gm-Gg: ASbGncumLjalm6JYFXlP7TrVi0QO3gMfPXvrmmWuCtssOiTYFDio1fNPRJChngJlrOR XSX55K13GGdisrlbJY+qYC1xKki5uH3g+LFVFBc1v X-Google-Smtp-Source: AGHT+IEUcKXUT93Np8YFcWr3SDk3yqsgOMq5EEMR4nmM8b2wmLz/YfPIFPZt2IiVAgdsMhb2lmTTIFEzwWZsuZI81ek= X-Received: by 2002:a92:c24c:0:b0:3a7:e800:7d37 with SMTP id e9e14a558f8ab-3c2d278282bmr69006615ab.10.1734817972194; Sat, 21 Dec 2024 13:52:52 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Vladimir Ryabtsev Date: Sat, 21 Dec 2024 13:52:40 -0800 Message-ID: Subject: Re: Memory To: Adrian Klaver Cc: psycopg@postgresql.org Content-Type: multipart/alternative; boundary="000000000000fb93140629cec8a7" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fb93140629cec8a7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Now I see, the doc is already great. Thanks for pointing out Adrian. On Sat, Dec 21, 2024 at 8:34=E2=80=AFAM Adrian Klaver wrote: > On 12/21/24 02:45, Vladimir Ryabtsev wrote: > > Hi community, > > > > I am reading a big dataset using code similar to this: > > > > query =3D ''' > > SELECT timestamp, data_source, tag, agg_value > > FROM my_table > > '''I > > batch_size =3D 10_000_000 > > > > with psycopg.connect(cs, cursor_factory=3Dpsycopg.ClientrCursor) as con= n: > > FYI, ClientCursor. > > > > > > I looked the documentation, but did not find specifics related to > > performance differences between Server and Client cursors. > > > > I am fine with ServerCursor, but I need to ask, is it by design that > > with ClientCursor the result set is copied into memory despite > > fetchmany() limit? ClientCursor is the default class, so may be worth > > documenting the difference (sorry, if I missed that). > > Client side cursor > > > https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#client-side-c= ursors > > "In such querying pattern, after a cursor sends a query to the server > (usually calling execute()), the server replies transferring to the > client the whole set of results requested, which is stored in the state > of the same cursor and from where it can be read from Python code (using > methods such as fetchone() and siblings)." > > > https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.Cursor.fet= chmany > > "fetchmany(size: int =3D 0) =E2=86=92 list[+Row] > > Return the next size records from the current recordset. > > size default to self.arraysize if not specified. > > Return type: > > Sequence[Row], with Row defined by row_factory > > " > > > Server side cursor > > > https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#server-side-c= ursors > > "PostgreSQL has its own concept of cursor too (sometimes also called > portal). When a database cursor is created, the query is not necessarily > completely processed: the server might be able to produce results only > as they are needed. Only the results requested are transmitted to the > client: if the query result is very large but the client only needs the > first few records it is possible to transmit only them. > > The downside is that the server needs to keep track of the partially > processed results, so it uses more memory and resources on the server." > > > > > Thank you. > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > > --000000000000fb93140629cec8a7 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Now I see, the doc is already great.
Thanks=C2=A0for p= ointing out Adrian.


On Sat, Dec 2= 1, 2024 at 8:34=E2=80=AFAM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/21/24 02:45, Vladimir Ryabtse= v wrote:
> Hi community,
>
> I am reading a big dataset using code similar to this:
>
> query =3D '''
> SELECT timestamp, data_source, tag, agg_value
> FROM my_table
> '''I
> batch_size =3D 10_000_000
>
> with psycopg.connect(cs, cursor_factory=3Dpsycopg.ClientrCursor) as co= nn:

FYI, ClientCursor.


>
> I looked the documentation, but did not find specifics related to
> performance differences between=C2=A0Server and Client cursors.
>
> I am fine with ServerCursor, but I need to ask, is it by design that <= br> > with ClientCursor the result set is copied into memory despite
> fetchmany() limit? ClientCursor is the default class, so may be worth =
> documenting the difference (sorry, if I missed that).

Client side cursor

https://www.psycopg.o= rg/psycopg3/docs/advanced/cursors.html#client-side-cursors

"In such querying pattern, after a cursor sends a query to the server =
(usually calling execute()), the server replies transferring to the
client the whole set of results requested, which is stored in the state of the same cursor and from where it can be read from Python code (using methods such as fetchone() and siblings)."

https://www.psycopg.o= rg/psycopg3/docs/api/cursors.html#psycopg.Cursor.fetchmany

"fetchmany(size: int =3D 0) =E2=86=92 list[+Row]

=C2=A0 =C2=A0 =C2=A0Return the next size records from the current recordset= .

=C2=A0 =C2=A0 =C2=A0size default to self.arraysize if not specified.

=C2=A0 =C2=A0 =C2=A0Return type:

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Sequence[Row], with Row defined by row_fa= ctory

"


Server side cursor

https://www.psycopg.o= rg/psycopg3/docs/advanced/cursors.html#server-side-cursors

"PostgreSQL has its own concept of cursor too (sometimes also called <= br> portal). When a database cursor is created, the query is not necessarily completely processed: the server might be able to produce results only
as they are needed. Only the results requested are transmitted to the
client: if the query result is very large but the client only needs the first few records it is possible to transmit only them.

The downside is that the server needs to keep track of the partially
processed results, so it uses more memory and resources on the server."= ;

>
> Thank you.
>

--
Adrian Klaver
adrian.klave= r@aklaver.com

--000000000000fb93140629cec8a7--