public inbox for [email protected]help / color / mirror / Atom feed
About the stability of COPY BINARY data 7+ messages / 3 participants [nested] [flat]
* About the stability of COPY BINARY data @ 2024-11-06 16:20 Dominique Devienne <[email protected]> 0 siblings, 2 replies; 7+ messages in thread From: Dominique Devienne @ 2024-11-06 16:20 UTC (permalink / raw) To: pgsql-general From https://www.postgresql.org/docs/current/sql-copy.html: |> binary-format file is less portable across machine architectures and PostgreSQL versions In my experience, the binary encoding of binding/resultset/copy is endian neutral (network byte order), so what is the less portable across machine architectures that warning about? Also, does the code for per-type _send() and _recv() functions really change across versions of PostgreSQL? How common are instances of such changes across versions? Any examples of such backward-incompatible changes, in the past? The binary data contains OIDs, but if sticking to built-in types, which OIDs are unlikely to change across versions? I'm obviously storing COPY BINARY data (we have lots of bytea columns), and I wonder how bad it is long term, and across PostgreSQL versions. Thanks for any insights, --DD ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 16:37 Adrian Klaver <[email protected]> parent: Dominique Devienne <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Adrian Klaver @ 2024-11-07 16:37 UTC (permalink / raw) To: Dominique Devienne <[email protected]>; pgsql-general On 11/6/24 08:20, Dominique Devienne wrote: >>>From https://www.postgresql.org/docs/current/sql-copy.html: > |> binary-format file is less portable across machine architectures > and PostgreSQL versions > > In my experience, the binary encoding of binding/resultset/copy is > endian neutral (network byte order), so what is the less portable > across machine architectures that warning about? > > Also, does the code for per-type _send() and _recv() functions really change > across versions of PostgreSQL? How common are instances of such > changes across versions? Any examples of such backward-incompatible > changes, in the past? > > The binary data contains OIDs, but if sticking to built-in types, > which OIDs are unlikely to change across versions? > > I'm obviously storing COPY BINARY data (we have lots of bytea > columns), and I wonder how bad it is long term, and across PostgreSQL > versions. If I where to hazard a guess this plays a part: https://www.postgresql.org/docs/current/sql-copy.html "To determine the appropriate binary format for the actual tuple data you should consult the PostgreSQL source, in particular the *send and *recv functions for each column's data type (typically these functions are found in the src/backend/utils/adt/ directory of the source distribution)." > > Thanks for any insights, --DD > > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 17:14 Dominique Devienne <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Dominique Devienne @ 2024-11-07 17:14 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: pgsql-general On Thu, Nov 7, 2024 at 5:37 PM Adrian Klaver <[email protected]> wrote: > > On 11/6/24 08:20, Dominique Devienne wrote: > >>From https://www.postgresql.org/docs/current/sql-copy.html: > > |> binary-format file is less portable across machine architectures > > and PostgreSQL versions > > > > In my experience, the binary encoding of binding/resultset/copy is > > endian neutral (network byte order), so what is the less portable > > across machine architectures that warning about? > > > > Also, does the code for per-type _send() and _recv() functions really change > > across versions of PostgreSQL? How common are instances of such > > changes across versions? Any examples of such backward-incompatible > > changes, in the past? > > > > The binary data contains OIDs, but if sticking to built-in types, > > which OIDs are unlikely to change across versions? > > > > I'm obviously storing COPY BINARY data (we have lots of bytea > > columns), and I wonder how bad it is long term, and across PostgreSQL > > versions. > > If I where to hazard a guess this plays a part: > > https://www.postgresql.org/docs/current/sql-copy.html > > "To determine the appropriate binary format for the actual tuple data > you should consult the PostgreSQL source, in particular the *send and > *recv functions for each column's data type (typically these functions > are found in the src/backend/utils/adt/ directory of the source > distribution)." Hi Adrian. Well, sure. The questions above are whether those type-specific formats are: 1) architecture dependent. (that's not my experience). 2) change across PostgreSQL versions. Not what the actual formats are. --DD PS: I'm surprised I didn't get answers. Seems to me to doc is overly "careful" about COPY BINARY's stability, thus my asking for confirmation here. ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 17:39 Daniel Verite <[email protected]> parent: Dominique Devienne <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Daniel Verite @ 2024-11-07 17:39 UTC (permalink / raw) To: Dominique Devienne <[email protected]>; +Cc: pgsql-general Dominique Devienne wrote: > Also, does the code for per-type _send() and _recv() functions > really change across versions of PostgreSQL? How common are > instances of such changes across versions? Any examples of such > backward-incompatible changes, in the past? For the timestamp types, I think these functions were sending/expecting float8 (before version 7.3), and then float8 or int64 depending on the server configuration up until 9.6, and since then int64 only. The same for the "time" field of the interval type. There is still an "integer_datetimes" GUC reflecting this. Best regards, -- Daniel Vérité https://postgresql.verite.pro/ Twitter: @DanielVerite ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 17:55 Dominique Devienne <[email protected]> parent: Daniel Verite <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Dominique Devienne @ 2024-11-07 17:55 UTC (permalink / raw) To: Daniel Verite <[email protected]>; +Cc: pgsql-general On Thu, Nov 7, 2024 at 6:39 PM Daniel Verite <[email protected]> wrote: > Dominique Devienne wrote: > > Also, does the code for per-type _send() and _recv() functions > > really change across versions of PostgreSQL? How common are > > instances of such changes across versions? Any examples of such > > backward-incompatible changes, in the past? > > For the timestamp types, I think these functions were > sending/expecting float8 (before version 7.3), and then float8 or > int64 depending on the server configuration up until 9.6, and since > then int64 only. > The same for the "time" field of the interval type. > There is still an "integer_datetimes" GUC reflecting this. Thanks. So it did happen in a distant past. Anything below 14 is of no concern to me though. So again, it does sound like changes are unlikely. And I haven't seen anything not network-byte-order, as far architecture is concerned. ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 18:04 Adrian Klaver <[email protected]> parent: Dominique Devienne <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Adrian Klaver @ 2024-11-07 18:04 UTC (permalink / raw) To: Dominique Devienne <[email protected]>; Daniel Verite <[email protected]>; +Cc: pgsql-general On 11/7/24 09:55, Dominique Devienne wrote: > On Thu, Nov 7, 2024 at 6:39 PM Daniel Verite <[email protected]> wrote: >> Dominique Devienne wrote: >>> Also, does the code for per-type _send() and _recv() functions >>> really change across versions of PostgreSQL? How common are >>> instances of such changes across versions? Any examples of such >>> backward-incompatible changes, in the past? >> >> For the timestamp types, I think these functions were >> sending/expecting float8 (before version 7.3), and then float8 or >> int64 depending on the server configuration up until 9.6, and since >> then int64 only. >> The same for the "time" field of the interval type. >> There is still an "integer_datetimes" GUC reflecting this. > > Thanks. So it did happen in a distant past. > Anything below 14 is of no concern to me though. > So again, it does sound like changes are unlikely. Yeah that is implied by: https://www.postgresql.org/docs/current/pgupgrade.html "Major PostgreSQL releases regularly add new features that often change the layout of the system tables, but the internal data storage format rarely changes. " The COPY warning is there as heads up that it is a possibility. > > And I haven't seen anything not network-byte-order, > as far architecture is concerned. > > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: About the stability of COPY BINARY data @ 2024-11-07 18:10 Dominique Devienne <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Dominique Devienne @ 2024-11-07 18:10 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: Daniel Verite <[email protected]>; pgsql-general On Thu, Nov 7, 2024 at 7:04 PM Adrian Klaver <[email protected]> wrote: > On 11/7/24 09:55, Dominique Devienne wrote: > > On Thu, Nov 7, 2024 at 6:39 PM Daniel Verite <[email protected]> wrote: > >> Dominique Devienne wrote: > >>> Also, does the code for per-type _send() and _recv() functions > >>> really change across versions of PostgreSQL? How common are > >>> instances of such changes across versions? Any examples of such > >>> backward-incompatible changes, in the past? > >> > >> For the timestamp types, I think these functions were > >> sending/expecting float8 (before version 7.3), and then float8 or > >> int64 depending on the server configuration up until 9.6, and since > >> then int64 only. > >> The same for the "time" field of the interval type. > >> There is still an "integer_datetimes" GUC reflecting this. > > > > Thanks. So it did happen in a distant past. > > Anything below 14 is of no concern to me though. > > So again, it does sound like changes are unlikely. > > Yeah that is implied by: > > https://www.postgresql.org/docs/current/pgupgrade.html > > "Major PostgreSQL releases regularly add new features that often change > the layout of the system tables, but the internal data storage format > rarely changes. " > > The COPY warning is there as heads up that it is a possibility. > > > > And I haven't seen anything not network-byte-order, > > as far architecture is concerned. But the COPY BINARY format and "the internal data storage format" are two separate things Adrian, AFAIK. Using binds and result sets in binary mode is part of the protocol in a way, and not an internal implementation detail, like internal format for tables on disk. I'm sure I'm not the only one that using binary mode for PostgreSQL, and any change in _send and _recv functions across versions, or have their results be architecture dependent, would break many client codes, if it happened. And COPY BINARY's outer format is also public and documented, so can't change either. That's why I'm insisting on that phrase in the documentation, which gives the wrong impressions IMHO. Shouldn't it be removed or amended? Thanks, --DD ^ permalink raw reply [nested|flat] 7+ messages in thread
end of thread, other threads:[~2024-11-07 18:10 UTC | newest] Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-11-06 16:20 About the stability of COPY BINARY data Dominique Devienne <[email protected]> 2024-11-07 16:37 ` Adrian Klaver <[email protected]> 2024-11-07 17:14 ` Dominique Devienne <[email protected]> 2024-11-07 17:39 ` Daniel Verite <[email protected]> 2024-11-07 17:55 ` Dominique Devienne <[email protected]> 2024-11-07 18:04 ` Adrian Klaver <[email protected]> 2024-11-07 18:10 ` Dominique Devienne <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox