public inbox for [email protected]help / color / mirror / Atom feed
Re: slow SELECT expr INTO var in plpgsql 4+ messages / 1 participants [nested] [flat]
* Re: slow SELECT expr INTO var in plpgsql @ 2026-02-01 05:09 Pavel Stehule <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: Pavel Stehule @ 2026-02-01 05:09 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> so 31. 1. 2026 v 21:58 odesílatel Tom Lane <[email protected]> napsal: > Pavel Stehule <[email protected]> writes: > > I remember the old discussion about this issue, and I thought that the > > performance of SELECT INTO and assignment should be almost the same. I > > repeated these tests on pg 9.4, 11 and master (asserts are disabled) with > > interesting results > > > release, assign time, select into time > > 9.4, 2900 ms, 20800 ms > > 11, 2041 ms, 16243 ms > > master, 534ms, 15438 ms > > Yeah, we've sweated a good deal about optimizing plpgsql assignment, > but SELECT INTO is always done the hard way. > > I experimented a little bit with converting simple-expression > SELECT INTO into an assignment, as attached. It does reclaim > nearly all of the performance difference: for me, these two > test cases now take about 276 vs 337 ms. However, I'm concerned > about the side-effects of substituting this other code path; > there's a lot of potential minor differences in behavior. > Two that you can see in the regression test changes are: > > * SELECT INTO is tracked by pg_stat_statements, assignments aren't. > > * The context report for an error can be different, because > _SPI_error_callback() doesn't get used. > > We could probably eliminate the context-report difference by setting > up a custom error context callback in this new code path, but the > difference in pg_stat_statements output would be hard to mask. > There may be other discrepancies as well, such as variations in > error message wording. > > Probably no one would notice such details if it had been like that > all along, but would they complain about a change? I dunno. > This patch looks well. I can confirm massive speedup. I don't remember any report related to change of implementation of assign statement before, and I think it can be similar with this patch. In this specific case, I think so users suppose SELECT INTO is translated to assignment by default. And there are a lot of documents on the net that describe the transformation of the assignment statement to SELECT - so I think there is some grey zone where optimization can do some magic. More - the statistics for function execution can be covered by track_functions. Regards Pavel > > regards, tom lane > > ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: slow SELECT expr INTO var in plpgsql @ 2026-03-10 07:58 Pavel Stehule <[email protected]> parent: Pavel Stehule <[email protected]> 0 siblings, 2 replies; 4+ messages in thread From: Pavel Stehule @ 2026-03-10 07:58 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hi ne 1. 2. 2026 v 6:09 odesílatel Pavel Stehule <[email protected]> napsal: > > > so 31. 1. 2026 v 21:58 odesílatel Tom Lane <[email protected]> napsal: > >> Pavel Stehule <[email protected]> writes: >> > I remember the old discussion about this issue, and I thought that the >> > performance of SELECT INTO and assignment should be almost the same. I >> > repeated these tests on pg 9.4, 11 and master (asserts are disabled) >> with >> > interesting results >> >> > release, assign time, select into time >> > 9.4, 2900 ms, 20800 ms >> > 11, 2041 ms, 16243 ms >> > master, 534ms, 15438 ms >> >> Yeah, we've sweated a good deal about optimizing plpgsql assignment, >> but SELECT INTO is always done the hard way. >> >> I experimented a little bit with converting simple-expression >> SELECT INTO into an assignment, as attached. It does reclaim >> nearly all of the performance difference: for me, these two >> test cases now take about 276 vs 337 ms. However, I'm concerned >> about the side-effects of substituting this other code path; >> there's a lot of potential minor differences in behavior. >> Two that you can see in the regression test changes are: >> >> * SELECT INTO is tracked by pg_stat_statements, assignments aren't. >> >> * The context report for an error can be different, because >> _SPI_error_callback() doesn't get used. >> >> We could probably eliminate the context-report difference by setting >> up a custom error context callback in this new code path, but the >> difference in pg_stat_statements output would be hard to mask. >> There may be other discrepancies as well, such as variations in >> error message wording. >> >> Probably no one would notice such details if it had been like that >> all along, but would they complain about a change? I dunno. >> > > This patch looks well. I can confirm massive speedup. > > I don't remember any report related to change of implementation of assign > statement before, and I think it can be similar with this patch. > > In this specific case, I think so users suppose SELECT INTO is translated > to assignment by default. And there are a lot of documents on the net that > describe the transformation of the assignment statement to SELECT - so I > think there is some grey zone where optimization can do some magic. More - > the statistics for function execution can be covered by track_functions. > Do you plan to push this patch? Unfortunately there is not any discussion about side effects. I wrote a version with dedicated error context callback, so there will be differences only in pg_stat_statements. It is true, so this should be hard to mask. Maybe this difference can be just documented - like "`SELECT expr INTO variable` can be optimized and executed by a direct expression executor, and then this query will not be visible in pg_stat_statement." Regards Pavel > Regards > > Pavel > > > >> >> regards, tom lane >> >> Attachments: [text/x-patch] 0001-optimize-select-into-v2.patch (7.6K, 3-0001-optimize-select-into-v2.patch) download | inline diff: From bf5874b9c9bf273935dbc61604c6767572d0e8b1 Mon Sep 17 00:00:00 2001 From: "[email protected]" <[email protected]> Date: Tue, 10 Mar 2026 08:37:38 +0100 Subject: [PATCH] optimze select into --- .../expected/level_tracking.out | 8 +-- .../pg_stat_statements/expected/plancache.out | 6 +- src/pl/plpgsql/src/expected/plpgsql_cache.out | 3 +- src/pl/plpgsql/src/pl_exec.c | 66 +++++++++++++++++++ 4 files changed, 72 insertions(+), 11 deletions(-) diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index a15d897e59b..832d65e97ca 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -1500,12 +1500,11 @@ SELECT PLUS_ONE(1); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; calls | rows | query -------+------+---------------------------------------------------- - 2 | 2 | SELECT (i + $2 + $3)::INTEGER 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 2 | 2 | SELECT PLUS_ONE($1) 2 | 2 | SELECT PLUS_TWO($1) 1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t -(5 rows) +(4 rows) -- immutable SQL function --- can be executed at plan time CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS @@ -1525,15 +1524,14 @@ SELECT PLUS_THREE(10); SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; toplevel | calls | rows | query ----------+-------+------+------------------------------------------------------------------------------ - f | 2 | 2 | SELECT (i + $2 + $3)::INTEGER f | 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 t | 2 | 2 | SELECT PLUS_ONE($1) t | 2 | 2 | SELECT PLUS_THREE($1) t | 2 | 2 | SELECT PLUS_TWO($1) - t | 1 | 5 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + t | 1 | 4 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" f | 2 | 2 | SELECT i + $2 LIMIT $3 t | 1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t -(8 rows) +(7 rows) SELECT pg_stat_statements_reset() IS NOT NULL AS t; t diff --git a/contrib/pg_stat_statements/expected/plancache.out b/contrib/pg_stat_statements/expected/plancache.out index e152de9f551..32bf913b286 100644 --- a/contrib/pg_stat_statements/expected/plancache.out +++ b/contrib/pg_stat_statements/expected/plancache.out @@ -159,11 +159,10 @@ SELECT calls, generic_plan_calls, custom_plan_calls, toplevel, query FROM pg_sta calls | generic_plan_calls | custom_plan_calls | toplevel | query -------+--------------------+-------------------+----------+---------------------------------------------------- 2 | 0 | 0 | t | CALL select_one_proc($1) - 4 | 2 | 2 | f | SELECT $1 1 | 0 | 0 | t | SELECT pg_stat_statements_reset() IS NOT NULL AS t 2 | 0 | 0 | t | SELECT select_one_func($1) 2 | 0 | 0 | t | SET plan_cache_mode TO $1 -(5 rows) +(4 rows) -- -- EXPLAIN [ANALYZE] EXECUTE + functions/procedures @@ -211,10 +210,9 @@ SELECT calls, generic_plan_calls, custom_plan_calls, toplevel, query FROM pg_sta 2 | 0 | 0 | t | EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT select_one_func($1) 4 | 0 | 0 | f | EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT select_one_func($1); 2 | 0 | 0 | t | EXPLAIN (COSTS OFF) SELECT select_one_func($1) - 4 | 2 | 2 | f | SELECT $1 1 | 0 | 0 | t | SELECT pg_stat_statements_reset() IS NOT NULL AS t 2 | 0 | 0 | t | SET plan_cache_mode TO $1 -(7 rows) +(6 rows) RESET pg_stat_statements.track; -- diff --git a/src/pl/plpgsql/src/expected/plpgsql_cache.out b/src/pl/plpgsql/src/expected/plpgsql_cache.out index 9df188ce56b..601e1ff3bc1 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_cache.out +++ b/src/pl/plpgsql/src/expected/plpgsql_cache.out @@ -55,8 +55,7 @@ select show_result_type('select 1 as a'); -- (but if debug_discard_caches is on, it will succeed) select show_result_type('select 2.0 as a'); ERROR: type of parameter 5 (numeric) does not match that when preparing the plan (integer) -CONTEXT: SQL statement "select pg_typeof(r.a)" -PL/pgSQL function show_result_type(text) line 7 at SQL statement +CONTEXT: PL/pgSQL function show_result_type(text) line 7 at SQL statement -- but it's OK if we force plan rebuilding discard plans; select show_result_type('select 2.0 as a'); diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 84552e32c87..b14dbc77147 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -1301,6 +1301,19 @@ plpgsql_exec_error_callback(void *arg) estate->func->fn_signature); } +/* + * This error callback is used when case SELECT expr INTO var + */ +static void +plpgsql_execsql_error_callback(void *arg) +{ + PLpgSQL_expr *expr = (PLpgSQL_expr *) arg; + + Assert(expr); + + errcontext("SQL statement \"%s\"", expr->query); +} + /* ---------- * Support function for initializing local execution variables @@ -4253,6 +4266,59 @@ exec_stmt_execsql(PLpgSQL_execstate *estate, stmt->mod_stmt_set = true; } + /* + * Some users write "SELECT expr INTO var" instead of "var := expr". If + * the expression is simple and the INTO target is a single variable, we + * can bypass SPI and call ExecEvalExpr() directly. (exec_assign_expr + * would actually work for non-simple expressions too, but such an + * expression might return more or less than one row, complicating matters + * greatly. The potential performance win is small if it's non-simple, + * and any errors we might issue would likely look different, so avoid + * using this code path for non-simple cases.) + */ + if (expr->expr_simple_expr && stmt->into) + { + PLpgSQL_datum *target = estate->datums[stmt->target->dno]; + + if (target->dtype == PLPGSQL_DTYPE_ROW) + { + PLpgSQL_row *row = (PLpgSQL_row *) target; + + if (row->nfields == 1) + { + ErrorContextCallback plerrcontext; + + /* + * Setup error traceback support for reporting SQL statement + * in context. + */ + plerrcontext.callback = plpgsql_execsql_error_callback; + plerrcontext.arg = expr; + plerrcontext.previous = error_context_stack; + error_context_stack = &plerrcontext; + + /* Evaluate the expression and assign to the INTO target */ + exec_assign_expr(estate, estate->datums[row->varnos[0]], + expr); + + /* + * We must duplicate the other effects of the code below, as + * well. We know that exactly one row was returned, so it + * doesn't matter whether the INTO was STRICT or not. + */ + exec_set_found(estate, true); + estate->eval_processed = 1; + + /* + * Pop the error context stack + */ + error_context_stack = plerrcontext.previous; + + return PLPGSQL_RC_OK; + } + } + } + /* * Set up ParamListInfo to pass to executor */ -- 2.53.0 ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: slow SELECT expr INTO var in plpgsql @ 2026-03-10 08:02 Pavel Stehule <[email protected]> parent: Pavel Stehule <[email protected]> 1 sibling, 0 replies; 4+ messages in thread From: Pavel Stehule @ 2026-03-10 08:02 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hi I am sorry, wrong patch Regards Pavel Attachments: [text/x-patch] optimize-select-into-v2.patch (6.3K, 3-optimize-select-into-v2.patch) download | inline diff: diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index a15d897e59b..832d65e97ca 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -1500,12 +1500,11 @@ SELECT PLUS_ONE(1); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; calls | rows | query -------+------+---------------------------------------------------- - 2 | 2 | SELECT (i + $2 + $3)::INTEGER 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 2 | 2 | SELECT PLUS_ONE($1) 2 | 2 | SELECT PLUS_TWO($1) 1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t -(5 rows) +(4 rows) -- immutable SQL function --- can be executed at plan time CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS @@ -1525,15 +1524,14 @@ SELECT PLUS_THREE(10); SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; toplevel | calls | rows | query ----------+-------+------+------------------------------------------------------------------------------ - f | 2 | 2 | SELECT (i + $2 + $3)::INTEGER f | 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 t | 2 | 2 | SELECT PLUS_ONE($1) t | 2 | 2 | SELECT PLUS_THREE($1) t | 2 | 2 | SELECT PLUS_TWO($1) - t | 1 | 5 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + t | 1 | 4 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" f | 2 | 2 | SELECT i + $2 LIMIT $3 t | 1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t -(8 rows) +(7 rows) SELECT pg_stat_statements_reset() IS NOT NULL AS t; t diff --git a/contrib/pg_stat_statements/expected/plancache.out b/contrib/pg_stat_statements/expected/plancache.out index e152de9f551..32bf913b286 100644 --- a/contrib/pg_stat_statements/expected/plancache.out +++ b/contrib/pg_stat_statements/expected/plancache.out @@ -159,11 +159,10 @@ SELECT calls, generic_plan_calls, custom_plan_calls, toplevel, query FROM pg_sta calls | generic_plan_calls | custom_plan_calls | toplevel | query -------+--------------------+-------------------+----------+---------------------------------------------------- 2 | 0 | 0 | t | CALL select_one_proc($1) - 4 | 2 | 2 | f | SELECT $1 1 | 0 | 0 | t | SELECT pg_stat_statements_reset() IS NOT NULL AS t 2 | 0 | 0 | t | SELECT select_one_func($1) 2 | 0 | 0 | t | SET plan_cache_mode TO $1 -(5 rows) +(4 rows) -- -- EXPLAIN [ANALYZE] EXECUTE + functions/procedures @@ -211,10 +210,9 @@ SELECT calls, generic_plan_calls, custom_plan_calls, toplevel, query FROM pg_sta 2 | 0 | 0 | t | EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT select_one_func($1) 4 | 0 | 0 | f | EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT select_one_func($1); 2 | 0 | 0 | t | EXPLAIN (COSTS OFF) SELECT select_one_func($1) - 4 | 2 | 2 | f | SELECT $1 1 | 0 | 0 | t | SELECT pg_stat_statements_reset() IS NOT NULL AS t 2 | 0 | 0 | t | SET plan_cache_mode TO $1 -(7 rows) +(6 rows) RESET pg_stat_statements.track; -- diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 84552e32c87..b14dbc77147 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -1301,6 +1301,19 @@ plpgsql_exec_error_callback(void *arg) estate->func->fn_signature); } +/* + * This error callback is used when case SELECT expr INTO var + */ +static void +plpgsql_execsql_error_callback(void *arg) +{ + PLpgSQL_expr *expr = (PLpgSQL_expr *) arg; + + Assert(expr); + + errcontext("SQL statement \"%s\"", expr->query); +} + /* ---------- * Support function for initializing local execution variables @@ -4253,6 +4266,59 @@ exec_stmt_execsql(PLpgSQL_execstate *estate, stmt->mod_stmt_set = true; } + /* + * Some users write "SELECT expr INTO var" instead of "var := expr". If + * the expression is simple and the INTO target is a single variable, we + * can bypass SPI and call ExecEvalExpr() directly. (exec_assign_expr + * would actually work for non-simple expressions too, but such an + * expression might return more or less than one row, complicating matters + * greatly. The potential performance win is small if it's non-simple, + * and any errors we might issue would likely look different, so avoid + * using this code path for non-simple cases.) + */ + if (expr->expr_simple_expr && stmt->into) + { + PLpgSQL_datum *target = estate->datums[stmt->target->dno]; + + if (target->dtype == PLPGSQL_DTYPE_ROW) + { + PLpgSQL_row *row = (PLpgSQL_row *) target; + + if (row->nfields == 1) + { + ErrorContextCallback plerrcontext; + + /* + * Setup error traceback support for reporting SQL statement + * in context. + */ + plerrcontext.callback = plpgsql_execsql_error_callback; + plerrcontext.arg = expr; + plerrcontext.previous = error_context_stack; + error_context_stack = &plerrcontext; + + /* Evaluate the expression and assign to the INTO target */ + exec_assign_expr(estate, estate->datums[row->varnos[0]], + expr); + + /* + * We must duplicate the other effects of the code below, as + * well. We know that exactly one row was returned, so it + * doesn't matter whether the INTO was STRICT or not. + */ + exec_set_found(estate, true); + estate->eval_processed = 1; + + /* + * Pop the error context stack + */ + error_context_stack = plerrcontext.previous; + + return PLPGSQL_RC_OK; + } + } + } + /* * Set up ParamListInfo to pass to executor */ ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: slow SELECT expr INTO var in plpgsql @ 2026-03-21 05:10 Pavel Stehule <[email protected]> parent: Pavel Stehule <[email protected]> 1 sibling, 0 replies; 4+ messages in thread From: Pavel Stehule @ 2026-03-21 05:10 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> pá 20. 3. 2026 v 23:33 odesílatel Tom Lane <[email protected]> napsal: > Pavel Stehule <[email protected]> writes: > >> so 31. 1. 2026 v 21:58 odesílatel Tom Lane <[email protected]> napsal: > >>> I experimented a little bit with converting simple-expression > >>> SELECT INTO into an assignment, as attached. It does reclaim > >>> nearly all of the performance difference: for me, these two > >>> test cases now take about 276 vs 337 ms. However, I'm concerned > >>> about the side-effects of substituting this other code path; > >>> there's a lot of potential minor differences in behavior. > >>> Two that you can see in the regression test changes are: > >>> > >>> * SELECT INTO is tracked by pg_stat_statements, assignments aren't. > >>> > >>> * The context report for an error can be different, because > >>> _SPI_error_callback() doesn't get used. > >>> > >>> We could probably eliminate the context-report difference by setting > >>> up a custom error context callback in this new code path, but the > >>> difference in pg_stat_statements output would be hard to mask. > >>> There may be other discrepancies as well, such as variations in > >>> error message wording. > > > Do you plan to push this patch? Unfortunately there is not any discussion > > about side effects. > > Yeah, general radio silence out there. After thinking about it for > awhile, I've decided to go ahead with the patch. It'll be easy enough > to revert if people are unhappy. > > > I wrote a version with dedicated error context callback, > > Thanks for doing that. I found though that it wasn't quite enough, > because the existing code path applies _SPI_error_callback() during > evaluation of the expression but not during assignment to the target > variable. So for example, errors during type conversion to match > the target variable don't get a context line claiming they happened > during evaluation of the expression, which seems correct to me. > I was able to fix it by not using exec_assign_expr() but instead > copying that code in-line, so that we can pop the error context stack > at the right point. (See added tests in the committed patch, > ce8d5fe0e2802158b65699aeae1551d489948167.) > > > ... so there will be > > differences only in pg_stat_statements. It is true, so this should be > hard > > to mask. Maybe this difference can be just documented - like "`SELECT > expr > > INTO variable` can be optimized and executed by a direct expression > > executor, and then this query will not be visible in pg_stat_statement." > > We don't document that "var := expression" isn't captured, so I don't > think this needs to be either. Possibly Bruce will pick up the change > as a release-note item. > Thank you very much Regards Pavel > > regards, tom lane > ^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2026-03-21 05:10 UTC | newest] Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-02-01 05:09 Re: slow SELECT expr INTO var in plpgsql Pavel Stehule <[email protected]> 2026-03-10 07:58 ` Pavel Stehule <[email protected]> 2026-03-10 08:02 ` Pavel Stehule <[email protected]> 2026-03-21 05:10 ` Pavel Stehule <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox