public inbox for [email protected]
help / color / mirror / Atom feedFrom: Leon Winter <[email protected]>
To: [email protected]
Subject: Cursor With_Hold Performance Workarounds/Optimization
Date: Tue, 10 Oct 2017 14:20:39 +0200
Message-ID: <[email protected]> (raw)
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-performance>
Hi,
during implementation of a runtime environment and the adjoining database
abstraction layer I noticed (like many before me [0] and as correctly mentioned
in the documentation) the sizeable performance impact of declaring a cursor
"with hold" for queries with large result sets.
Our use case very often looks like this:
open cursor for select from table1
loop
{ fetch some entries from cursor
update table2
commit
}
During iteration of the result set we commit changes to the database so we
must make sure to keep the cursor alive. One option is to use "with hold".
Unfortunately the resultset is then instantly materialzed which is a huge
performance burden.
In our use case the "commit" of changes often does not affect the iteration set.
Also the loop might be aborted before the resultset was fully read so we never
needed the whole materialzed set anyway.
To workaround these problems, we already employ some static analysis to avoid
"with hold" in all situations where there are no commits during the lifetime of
cursor or portal. For other cursors we choose to use a different database
connection inside the same application to protect the cursors from commit
operations and avoiding costly copy operations (if they would be used "with
hold" on the main database connection).
In an attempt to further minimize the performance impact I am thinking about
employing a lazy "with hold" where I would fetch all the remaining result rows
from a cursor or portal before a commit statement. This way I could at least
have great performance in all conflict-free situations until one would arrive at
an impass. Naturally I am now wondering why the postgres cursor/portal is not
also employing the same trick (at least as an option): Postpone materialization
of "with hold" cursors until it is required (like a commit operation is
dispatched).
Probably I am also missing many (internal) aspects but at that point it might be
possible to optimize further. When, for instance, no changes were made to result
set of the "with hold" cursor, it must not be materialized. From a previous
discussions [1] I heard that one can in fact accomplish that by using a different
database connection which is one workaround we are using.
I am not sure whether this kind of workaround/optimization work should be done
in the database abstraction/interface layer or the database itself. Since a lot
of people seem to run into the peformance issue many might profit from some
optimization magic in the database for such use cases. We are very invested in
this performance issue and are happy to resolve it on either level.
Regards,
Leon
[0] https://trac.osgeo.org/qgis/ticket/1175
https://stackoverflow.com/questions/33635405/postgres-cursor-with-hold
https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/
[1] https://bytes.com/topic/postgresql/answers/420717-cursors-transactions-why
http://www.postgresql-archive.org/setFetchSize-td4935215.html
--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected]
Subject: Re: Cursor With_Hold Performance Workarounds/Optimization
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox