public inbox for [email protected]
help / color / mirror / Atom feedRe: Upsert error "column reference is ambiguous"
12+ messages / 6 participants
[nested] [flat]
* Re: Upsert error "column reference is ambiguous"
@ 2025-04-28 11:22 Tim Starling <[email protected]>
2025-04-28 13:29 ` Re: Upsert error "column reference is ambiguous" Laurenz Albe <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
0 siblings, 2 replies; 12+ messages in thread
From: Tim Starling @ 2025-04-28 11:22 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
On 28/4/25 20:54, Tom Lane wrote:
> Even if I were on board with arbitrarily adopting one of the two
> possible interpretations, it's far from obvious to me that most people
> would agree that "v" should mean the value from the existing row,
> rather than the new value. Better to make them say which they want.
OK sure, no way to tell, but if every other DBMS does it the same way
then that might be a hint.
Also, I'm just saying, the upsert feature is fully useless to me with
this name resolution policy.
In the single-row case, there's no need for EXCLUDED at all, because
the client knows everything about the excluded row. Recall my example:
INSERT INTO t VALUES (1,1) ON CONFLICT (k) DO UPDATE SET v=v+1;
If I meant SET v=EXCLUDED.v+1 I would have just written v=2. The
default policy (in other DBMSes) follows by analogy from the
single-row case.
-- Tim Starling
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
@ 2025-04-28 13:29 ` Laurenz Albe <[email protected]>
1 sibling, 0 replies; 12+ messages in thread
From: Laurenz Albe @ 2025-04-28 13:29 UTC (permalink / raw)
To: Tim Starling <[email protected]>; Tom Lane <[email protected]>; +Cc: [email protected]
On Mon, 2025-04-28 at 21:22 +1000, Tim Starling wrote:
> On 28/4/25 20:54, Tom Lane wrote:
> > Even if I were on board with arbitrarily adopting one of the two
> > possible interpretations, it's far from obvious to me that most people
> > would agree that "v" should mean the value from the existing row,
> > rather than the new value. Better to make them say which they want.
>
> OK sure, no way to tell, but if every other DBMS does it the same way
> then that might be a hint.
Which DBMS that supports INSERT .. ON CONFLICT do you have in mind?
> Also, I'm just saying, the upsert feature is fully useless to me with
> this name resolution policy.
Because you cannot write EXCLUDED?
> In the single-row case, there's no need for EXCLUDED at all, because
> the client knows everything about the excluded row. Recall my example:
>
> INSERT INTO t VALUES (1,1) ON CONFLICT (k) DO UPDATE SET v=v+1;
>
> If I meant SET v=EXCLUDED.v+1 I would have just written v=2. The
> default policy (in other DBMSes) follows by analogy from the
> single-row case.
Actually, for many people, the DWIM would be the other way around:
INSERT INTO tab (col)
SELECT something FROM othertab
ON CONFLICT (id)
/* "col" should get set to "something" */
DO UPDATE SET col = col;
I can follow your reasoning above, but if the SQL parser tried to
guess the user's intention like that, it is likely to go wrong
sometimes. As Tom said, better force the user to be explicit.
Yours,
Laurenz Albe
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
@ 2025-04-28 13:54 ` Tom Lane <[email protected]>
2025-04-28 15:21 ` Re: Upsert error "column reference is ambiguous" David G. Johnston <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-29 12:43 ` Re: Upsert error "column reference is ambiguous" David Rowley <[email protected]>
1 sibling, 3 replies; 12+ messages in thread
From: Tom Lane @ 2025-04-28 13:54 UTC (permalink / raw)
To: Tim Starling <[email protected]>; +Cc: [email protected]
Tim Starling <[email protected]> writes:
> On 28/4/25 20:54, Tom Lane wrote:
>> Even if I were on board with arbitrarily adopting one of the two
>> possible interpretations, it's far from obvious to me that most people
>> would agree that "v" should mean the value from the existing row,
>> rather than the new value. Better to make them say which they want.
> OK sure, no way to tell, but if every other DBMS does it the same way
> then that might be a hint.
AFAIK, "ON CONFLICT" is a Postgres-ism. Exactly which constructs
in exactly which other databases are you citing as precedent?
> In the single-row case, there's no need for EXCLUDED at all, because
> the client knows everything about the excluded row.
Laurenz already provided the counter-example of an INSERT/SELECT,
but there's also the possibility of the INSERT supplying a computed
default value for a column, e.g., CURRENT_TIMESTAMP. So you won't
get far with that argument.
I do actually have some sympathy for your proposal after thinking
about it a bit more, but the argument I would use is "the behavior
of the ON CONFLICT UPDATE SET list should be as much as possible like
the behavior of an ordinary UPDATE's SET list". Since "v = v+1" would
refer to the existing row's "v" in regular UPDATE, it's sensible to
let that happen here too. Of course the counter-argument is that this
should be compared not to a trivial UPDATE, but an "UPDATE ... FROM
othertable" where the othertable supplies some conflicting column
name(s). In that situation we're going to make you resolve the
conflict by qualifying the column names. The only thing that makes
that not a precise parallel is that EXCLUDED is not something the user
wrote into the query explicitly, so there's no opportunity to
substitute different column aliases, as a FROM clause would allow.
Perhaps that justifies demoting it to second-class citizenship whereby
EXCLUDED has to be qualified but the target table doesn't. (I don't
find this argument hugely compelling, but it's an argument.)
BTW, I did wonder how hard it would be to make such a change.
On first glance it seems to be a one-liner:
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 1f4d6adda52..f11727adbaa 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1306,7 +1306,7 @@ transformOnConflictClause(ParseState *pstate,
* Add the EXCLUDED pseudo relation to the query namespace, making it
* available in the UPDATE subexpressions.
*/
- addNSItemToQuery(pstate, exclNSItem, false, true, true);
+ addNSItemToQuery(pstate, exclNSItem, false, true, false);
/*
* Now transform the UPDATE subexpressions.
So this isn't about implementation difficulty but about whether
we think it's a good idea.
regards, tom lane
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
@ 2025-04-28 15:21 ` David G. Johnston <[email protected]>
2 siblings, 0 replies; 12+ messages in thread
From: David G. Johnston @ 2025-04-28 15:21 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tim Starling <[email protected]>; [email protected] <[email protected]>
On Monday, April 28, 2025, Tom Lane <[email protected]> wrote:
>
> AFAIK, "ON CONFLICT" is a Postgres-ism. Exactly which constructs
> in exactly which other databases are you citing as precedent?
>
I confirmed the SQLite reference from the original email.
“The upsert above inserts the new vocabulary word "jovial" if that word is
not already in the dictionary, or if it is already in the dictionary, it
increments the counter. The "count+1" expression could also be written as
"vocabulary.count". PostgreSQL requires the second form, but SQLite accepts
either.”
https://sqlite.org/lang_upsert.html
David J.
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
@ 2025-04-28 22:36 ` Tim Starling <[email protected]>
2025-04-28 22:58 ` Re: Upsert error "column reference is ambiguous" Christophe Pettus <[email protected]>
2025-04-29 06:36 ` Re: Upsert error "column reference is ambiguous" Laurenz Albe <[email protected]>
2 siblings, 2 replies; 12+ messages in thread
From: Tim Starling @ 2025-04-28 22:36 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
On 28/4/25 23:54, Tom Lane wrote:
> AFAIK, "ON CONFLICT" is a Postgres-ism. Exactly which constructs
> in exactly which other databases are you citing as precedent?
There's a list here:
<https://wiki.postgresql.org/wiki/UPSERT#UPSERT_as_implemented_in_practice;
Since that page was written in 2014, SQLite added upsert support,
consciously following PG's syntax, except that unqualified names
resolve to target rows.
My code would be like
function upsert( $table, $names, $values, $key, $set ) {
if ( $this->type === 'mysql' ) {
$conflict = 'ON DUPLICATE KEY UPDATE';
} else {
$conflict = "ON CONFLICT ($key) DO UPDATE SET";
}
return $this->query( "INSERT INTO $table ($names) " .
"VALUES ($values) $conflict $set" );
}
The parameters are a little bit more structured than that, but that
gives you the idea.
MediaWiki has supported MySQL's ON DUPLICATE KEY UPDATE since 2013,
and we've always had the conflict target parameter $key since then as
a helper for emulation. So it's trivial to produce either MySQL and
SQLite syntax.
-- Tim Starling
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
@ 2025-04-28 22:58 ` Christophe Pettus <[email protected]>
2025-04-28 23:01 ` Re: Upsert error "column reference is ambiguous" Christophe Pettus <[email protected]>
1 sibling, 1 reply; 12+ messages in thread
From: Christophe Pettus @ 2025-04-28 22:58 UTC (permalink / raw)
To: Tim Starling <[email protected]>; +Cc: Tom Lane <[email protected]>; [email protected]
> On Apr 28, 2025, at 15:36, Tim Starling <[email protected]> wrote:
> function upsert( $table, $names, $values, $key, $set ) {
> if ( $this->type === 'mysql' ) {
> $conflict = 'ON DUPLICATE KEY UPDATE';
> } else {
> $conflict = "ON CONFLICT ($key) DO UPDATE SET";
> }
> return $this->query( "INSERT INTO $table ($names) " .
> "VALUES ($values) $conflict $set" );
I'll mention that you can do this without ON CONFLICT in PostgreSQL in a way that, while not nearly as clean as ON CONFLICT, isn't a huge hack, either:
"DO $$ BEGIN INSERT INTO $table($names) VALUES($values); EXCEPTION WHEN integrity_constraint_violation THEN UPDATE $table SET $set WHERE $key=$values[0]; END; $$ LANGUAGE plpgsql;"
It does require knowing which of the VALUES is the key value being inserted (pseudocode syntax above), but if that is stylized to always be the first value, that does not seem insurmountable.
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 22:58 ` Re: Upsert error "column reference is ambiguous" Christophe Pettus <[email protected]>
@ 2025-04-28 23:01 ` Christophe Pettus <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: Christophe Pettus @ 2025-04-28 23:01 UTC (permalink / raw)
To: Tim Starling <[email protected]>; +Cc: Tom Lane <[email protected]>; [email protected]
> On Apr 28, 2025, at 15:58, Christophe Pettus <[email protected]> wrote:
> It does require knowing which of the VALUES is the key value being inserted (pseudocode syntax above) [...]
The instant after I hit send, I realized that information is available to the function by lining up the $names and $values array, since the name of the key column is passed in.
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
@ 2025-04-29 06:36 ` Laurenz Albe <[email protected]>
2025-04-29 11:07 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
1 sibling, 1 reply; 12+ messages in thread
From: Laurenz Albe @ 2025-04-29 06:36 UTC (permalink / raw)
To: Tim Starling <[email protected]>; Tom Lane <[email protected]>; +Cc: [email protected]
On Tue, 2025-04-29 at 08:36 +1000, Tim Starling wrote:
> My code would be like
>
> function upsert( $table, $names, $values, $key, $set ) {
> if ( $this->type === 'mysql' ) {
> $conflict = 'ON DUPLICATE KEY UPDATE';
> } else {
> $conflict = "ON CONFLICT ($key) DO UPDATE SET";
> }
> return $this->query( "INSERT INTO $table ($names) " .
> "VALUES ($values) $conflict $set" );
> }
>
> The parameters are a little bit more structured than that, but that
> gives you the idea.
Another litle "if" to cater for PostgreSQL's "EXCLUDED." would be
such a big problem?
Yours,
Laurenz Albe
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-29 06:36 ` Re: Upsert error "column reference is ambiguous" Laurenz Albe <[email protected]>
@ 2025-04-29 11:07 ` Tim Starling <[email protected]>
2025-04-29 15:29 ` Re: Upsert error "column reference is ambiguous" David G. Johnston <[email protected]>
0 siblings, 1 reply; 12+ messages in thread
From: Tim Starling @ 2025-04-29 11:07 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; Tom Lane <[email protected]>; +Cc: [email protected]
On 29/4/25 16:36, Laurenz Albe wrote:
> On Tue, 2025-04-29 at 08:36 +1000, Tim Starling wrote:
>> My code would be like
>>
>> function upsert( $table, $names, $values, $key, $set ) {
>> if ( $this->type === 'mysql' ) {
>> $conflict = 'ON DUPLICATE KEY UPDATE';
>> } else {
>> $conflict = "ON CONFLICT ($key) DO UPDATE SET";
>> }
>> return $this->query( "INSERT INTO $table ($names) " .
>> "VALUES ($values) $conflict $set" );
>> }
>>
>> The parameters are a little bit more structured than that, but that
>> gives you the idea.
>
> Another litle "if" to cater for PostgreSQL's "EXCLUDED." would be
> such a big problem?
I don't understand what you mean. EXCLUDED is not needed. "$table."
needs to be prefixed to every column reference in the string $set. How
do you find the column references amongst the string literals,
function calls, etc.? You would need to parse the expression.
This is a public interface and there may be callers in code that I
don't have access to.
Part of the reason for wanting to replace the existing emulation with
a native upsert is to simplify the code. Parsing the expression is
definitely not a simplification.
-- Tim Starling
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-28 22:36 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-29 06:36 ` Re: Upsert error "column reference is ambiguous" Laurenz Albe <[email protected]>
2025-04-29 11:07 ` Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
@ 2025-04-29 15:29 ` David G. Johnston <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: David G. Johnston @ 2025-04-29 15:29 UTC (permalink / raw)
To: Tim Starling <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Tom Lane <[email protected]>; [email protected] <[email protected]>
On Tuesday, April 29, 2025, Tim Starling <[email protected]> wrote:
>
>
> This is a public interface and there may be callers in code that I don't
> have access to.
>
You might help your cause by sharing examples of how client code uses your
driver to perform upsert that runs into this limitation.
David J.
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
@ 2025-04-29 12:43 ` David Rowley <[email protected]>
2025-04-29 14:51 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2 siblings, 1 reply; 12+ messages in thread
From: David Rowley @ 2025-04-29 12:43 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tim Starling <[email protected]>; [email protected]
On Tue, 29 Apr 2025 at 01:54, Tom Lane <[email protected]> wrote:
> I do actually have some sympathy for your proposal after thinking
> about it a bit more, but the argument I would use is "the behavior
> of the ON CONFLICT UPDATE SET list should be as much as possible like
> the behavior of an ordinary UPDATE's SET list". Since "v = v+1" would
> refer to the existing row's "v" in regular UPDATE, it's sensible to
> let that happen here too. Of course the counter-argument is that this
> should be compared not to a trivial UPDATE, but an "UPDATE ... FROM
> othertable" where the othertable supplies some conflicting column
> name(s). In that situation we're going to make you resolve the
> conflict by qualifying the column names. The only thing that makes
> that not a precise parallel is that EXCLUDED is not something the user
> wrote into the query explicitly, so there's no opportunity to
> substitute different column aliases, as a FROM clause would allow.
> Perhaps that justifies demoting it to second-class citizenship whereby
> EXCLUDED has to be qualified but the target table doesn't. (I don't
> find this argument hugely compelling, but it's an argument.)
Not arguing for or against, but... I think there are some cases where
it would be more dangerous to relax this. Here's one case where not
qualifying the column can be dangerous:
create table a1 (a int);
insert into a1 values(1),(2);
create table a2 (a int);
insert into a2 values(1);
select * from a1 where a in(select a from a2); -- as expected.
-- application changes, a2.a isn't needed anymore. column gets dropped
but someone forgets to update a query in the app...
alter table a2 drop column a;
select * from a1 where a in(select a from a2); -- silently returns
unexpected results.
If the original author of that query had been thoughtful enough to
qualify the column in the subquery then someone would probably have
gotten an error and fixed it. The moral of that story is that
sometimes forcing the query author to qualify the column is a good
idea. (not that there's much we can do about that one...)
Now the question is, do any similar hazards exist with ON CONFLICT DO
UPDATE? I don't think so as any columns being dropped will disappear
from the insert target table and the EXCLUDED work table at the same
time.
Another thought is that you can have an UPDATE with a RETURNING
clause. An unqualified column defaults to NEW even though you could
argue it's ambiguous due to OLD (as of 80feb727c). Likely we were
forced into making it work that way through not wanting to force
everyone to rewrite their RETURNING statements when upgrading to v18.
The moral of that story is, UPDATE isn't exactly consistent already
about when it requires column qualifications. Maybe it's weird to
insist that users qualify columns with their ON CONFLICT UPDATE SET
when RETURNING is happy to assume you must have meant NEW.
David
^ permalink raw reply [nested|flat] 12+ messages in thread
* Re: Upsert error "column reference is ambiguous"
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:54 ` Re: Upsert error "column reference is ambiguous" Tom Lane <[email protected]>
2025-04-29 12:43 ` Re: Upsert error "column reference is ambiguous" David Rowley <[email protected]>
@ 2025-04-29 14:51 ` Tom Lane <[email protected]>
0 siblings, 0 replies; 12+ messages in thread
From: Tom Lane @ 2025-04-29 14:51 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Tim Starling <[email protected]>; [email protected]
David Rowley <[email protected]> writes:
> Another thought is that you can have an UPDATE with a RETURNING
> clause. An unqualified column defaults to NEW even though you could
> argue it's ambiguous due to OLD (as of 80feb727c). Likely we were
> forced into making it work that way through not wanting to force
> everyone to rewrite their RETURNING statements when upgrading to v18.
> The moral of that story is, UPDATE isn't exactly consistent already
> about when it requires column qualifications. Maybe it's weird to
> insist that users qualify columns with their ON CONFLICT UPDATE SET
> when RETURNING is happy to assume you must have meant NEW.
That's an analogy I hadn't thought of, and it does seem on-point.
You might be right that we would not have done it like that if
we'd invented RETURNING's ability to support "OLD" at the get-go.
Nonetheless, it's there now and is a pretty similar precedent.
regards, tom lane
^ permalink raw reply [nested|flat] 12+ messages in thread
end of thread, other threads:[~2025-04-29 15:29 UTC | newest]
Thread overview: 12+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-04-28 11:22 Re: Upsert error "column reference is ambiguous" Tim Starling <[email protected]>
2025-04-28 13:29 ` Laurenz Albe <[email protected]>
2025-04-28 13:54 ` Tom Lane <[email protected]>
2025-04-28 15:21 ` David G. Johnston <[email protected]>
2025-04-28 22:36 ` Tim Starling <[email protected]>
2025-04-28 22:58 ` Christophe Pettus <[email protected]>
2025-04-28 23:01 ` Christophe Pettus <[email protected]>
2025-04-29 06:36 ` Laurenz Albe <[email protected]>
2025-04-29 11:07 ` Tim Starling <[email protected]>
2025-04-29 15:29 ` David G. Johnston <[email protected]>
2025-04-29 12:43 ` David Rowley <[email protected]>
2025-04-29 14:51 ` Tom Lane <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox