postgresql-interfaces/psqlodbc GitHub issues and pull requests (mirror)  
help / color / mirror / Atom feed
[postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
10+ messages / 2 participants
[nested] [flat]

* [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-01-28 17:03  "vadz (@vadz)" <[email protected]>
  0 siblings, 0 replies; 10+ messages in thread

From: vadz (@vadz) @ 2025-01-28 17:03 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Suppose there is the following table in the database:

```sql
create table t(x integer check (x<100));
```

and consider a program which does something like the following (simplified):

```cpp
constexpr int ARRAY_SIZE = 2;
SQLSetStmtAttr(h, SQL_ATTR_PARAMSET_SIZE, (void *)ARRAY_SIZE, 0);

SQLINTEGER values[ARRAY_SIZE] = { 1, 101 }; // First value is valid, second one is not.
SQLLEN inds[ARRAY_SIZE];
SQLBindParameter(h, 1, SQL_PARAM_INPUT, SQL_C_INTEGER, SQL_INTEGER, 0, 0, values, 0, inds);

SQLUSMALLINT status[ARRAY_SIZE];
SQLSetStmtAttr(h, SQL_ATTR_PARAM_STATUS_PTR, status, 0);

SQLPrepare(h, "insert into t(x) values(?)", SQL_NTS);
SQLRETURN rc = SQLExecute(h);
```

When using SQL Server with either Microsoft ODBC driver or FreeTDS, `rc` is `SQL_SUCCESS_WITH_INFO` and `status` array is filled with `{ SQL_PARAM_SUCCESS, SQL_PARAM_ERROR }` which is nice because it allows the application to determine which row(s) contained values that resulted in an error.

When using PostgreSQL ODBC driver (v13.02, but from examining Git history it doesn't look like there have been any changes here even in the latest version), `rc` is `SQL_ERROR` and `status` contains `SQL_PARAM_ERROR` for both elements, which doesn't provide any useful information.

Is this behaviour intentional and, if not, would it be possible to change it to be more consistent with other ODBC drivers and, also, more useful?

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-01 17:16  "davecramer (@davecramer)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: davecramer (@davecramer) @ 2025-02-01 17:16 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

So if I understand correctly, you would like the insert for the first to succeed and the second to fail.
Since the insert for both is a single transaction when the first fails the second fails as postgres does not commit the transaction. 
This is by design.

see https://www.postgresql.org/docs/current/tutorial-transactions.html for more details

Dave

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-01 17:23  "vadz (@vadz)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: vadz (@vadz) @ 2025-02-01 17:23 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Not quite, I'm perfectly fine with the insert failing and actually think this is more correct than doing a partial insert as some other databases/ODBC drivers do. But it would be great to have some information about which rows resulted in an error and this is something that `SQL_ATTR_PARAM_STATUS_PTR` is supposed to provide, but this driver returns `SQL_PARAM_ERROR` in it even for the first row, which would have succeeded.

I.e. the problem is not (so much) returning `SQL_ERROR` instead of `SQL_SUCCESS_WITH_INFO` (although this is inconsistent with the other ODBC drivers I've tried), but not providing any useful information in the status array.

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-02 05:08  "davecramer (@davecramer)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: davecramer (@davecramer) @ 2025-02-02 05:08 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Unfortunately the server does not provide that information. It doesn't know that the first one succeeded since it was all done in one insert.

```
insert into testtab1 values (1,'blah');
ERROR:  duplicate key value violates unique constraint "testtab1_pkey"
DETAIL:  Key (id)=(1) already exists.
```
While we do get the error message, we don't know if it is the 1st or 2nd value that failed


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-02 16:47  "vadz (@vadz)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: vadz (@vadz) @ 2025-02-02 16:47 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Just trying to understand: how does the driver manage to send multiple parameters to the server? From looking at the code it seems like it does `PQsendQuery()` for each row in a loop, but I must be missing something...

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-02 20:40  "davecramer (@davecramer)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: davecramer (@davecramer) @ 2025-02-02 20:40 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Can you show me the code you are thinking it is using. 
Without looking at the code I would think it does
```
insert into testtab1 (id) values (1),(2);
```

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-02 20:49  "vadz (@vadz)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: vadz (@vadz) @ 2025-02-02 20:49 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

> Can you show me the code you are thinking it is using.

I've been looking at the part of `PGAPI_Execute()` between https://github.com/postgresql-interfaces/psqlodbc/blob/1fbc3b2bd31c9719763cc5d232ac0c0ea5a02f71/exec... and https://github.com/postgresql-interfaces/psqlodbc/blob/1fbc3b2bd31c9719763cc5d232ac0c0ea5a02f71/exec... but I could easily be wrong as I mostly just looked at the label name. I probably should understand how to enable tracing in this code to see what it's really doing during tun-time...

> Without looking at the code I would think it does
>
> `insert into testtab1 (id) values (1),(2);`

Interesting, wouldn't this run into some kind of limitation (number of parameters or total length of the command) for even relatively small amounts of data?



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-04 18:37  "davecramer (@davecramer)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: davecramer (@davecramer) @ 2025-02-04 18:37 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

it does, but not as much as one would think. It will handle quite a few values.

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-05 10:07  "davecramer (@davecramer)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: davecramer (@davecramer) @ 2025-02-05 10:07 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

I would just enable logging on the server to see how it does this. 

^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows
@ 2025-02-05 13:32  "vadz (@vadz)" <[email protected]>
  8 siblings, 0 replies; 10+ messages in thread

From: vadz (@vadz) @ 2025-02-05 13:32 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

> I would just enable logging on the server to see how it does this.

Good point, thanks, this is indeed even simpler. And what I see is that the driver (at least 13.02 version from Debian Bookworm) just sends multiple semicolon separated statements to the server in a single request, e.g. for [my test](https://github.com/SOCI/soci/blob/b7b470e66d81be1346224f68206738e0ff872e59/tests/common/test-common....) I see the following

```
LOG:  00000: statement: insert into soci_test(val) values('1');insert into soci_test(val) values('a')
ERROR:  22P02: invalid input syntax for integer: "a" at character 74
```

If the "at character N" part is always present (but I'm not at all sure about this), it looks like it would be possible to find the row which resulted in the error by remembering the positions of the statement boundaries for each row.

But now I also wonder if this is really the most efficient way to do bulk inserts/updates with Postgres, at the very least repeating the entire query many times like this should consume much more bandwidth than preparing it once/reusing it it for all rows?

^ permalink  raw  reply  [nested|flat] 10+ messages in thread


end of thread, other threads:[~2025-02-05 13:32 UTC | newest]

Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-01-28 17:03 [postgresql-interfaces/psqlodbc] issue #89: Batch update with some invalid values fails for all rows "vadz (@vadz)" <[email protected]>
2025-02-01 17:16 ` "davecramer (@davecramer)" <[email protected]>
2025-02-01 17:23 ` "vadz (@vadz)" <[email protected]>
2025-02-02 05:08 ` "davecramer (@davecramer)" <[email protected]>
2025-02-02 16:47 ` "vadz (@vadz)" <[email protected]>
2025-02-02 20:40 ` "davecramer (@davecramer)" <[email protected]>
2025-02-02 20:49 ` "vadz (@vadz)" <[email protected]>
2025-02-04 18:37 ` "davecramer (@davecramer)" <[email protected]>
2025-02-05 10:07 ` "davecramer (@davecramer)" <[email protected]>
2025-02-05 13:32 ` "vadz (@vadz)" <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox