public inbox for [email protected]help / color / mirror / Atom feed
Adding New Column with default value. 5+ messages / 4 participants [nested] [flat]
* Adding New Column with default value. @ 2025-04-28 17:24 Gambhir Singh <[email protected]> 2025-04-28 17:46 ` Re: Adding New Column with default value. Ron Johnson <[email protected]> 2025-04-28 18:34 ` Re: Adding New Column with default value. David G. Johnston <[email protected]> 2025-04-29 01:13 ` Re: Adding New Column with default value. Ed Sabol <[email protected]> 0 siblings, 3 replies; 5+ messages in thread From: Gambhir Singh @ 2025-04-28 17:24 UTC (permalink / raw) To: [email protected] I got request from app Team to add new column to a table with following specifications 1. With Not Null Constraint 2. With Default value 0 Row Count - 50 Billion Please help me to understand a. DDL statement (alter table add column) will be followed by the DML statement (Update)....Am I correct ? b. What would be size of the Table (I assuming it will be doubled) c. What is the best approach for this kind of activity. Thanks & Regards Gambhir Singh ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Adding New Column with default value. 2025-04-28 17:24 Adding New Column with default value. Gambhir Singh <[email protected]> @ 2025-04-28 17:46 ` Ron Johnson <[email protected]> 2 siblings, 0 replies; 5+ messages in thread From: Ron Johnson @ 2025-04-28 17:46 UTC (permalink / raw) To: Pgsql-admin <[email protected]> On Mon, Apr 28, 2025 at 1:25 PM Gambhir Singh <[email protected]> wrote: > > I got request from app Team to add new column to a table with following > specifications > > 1. With Not Null Constraint > 2. With Default value 0 > > Row Count - 50 Billion > > Please help me to understand > a. DDL statement (alter table add column) will be followed by the DML > statement (Update)....Am I correct ? > Do you want existing rows to have a value of 0, or just new ones? > b. What would be size of the Table (I assuming it will be doubled) > It will double if the table currently only has one scalar column. > c. What is the best approach for this kind of activity. > The ALTER TABLE documentation is pretty long, but searching it for "update", and reading carefully gives pointers. -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Adding New Column with default value. 2025-04-28 17:24 Adding New Column with default value. Gambhir Singh <[email protected]> @ 2025-04-28 18:34 ` David G. Johnston <[email protected]> 2 siblings, 0 replies; 5+ messages in thread From: David G. Johnston @ 2025-04-28 18:34 UTC (permalink / raw) To: Gambhir Singh <[email protected]>; +Cc: [email protected] On Mon, Apr 28, 2025 at 10:25 AM Gambhir Singh <[email protected]> wrote: > > a. DDL statement (alter table add column) will be followed by the DML > statement (Update)....Am I correct ? > No. An alter command will either rewrite the table or it will not. It will never execute DML. David J. ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Adding New Column with default value. 2025-04-28 17:24 Adding New Column with default value. Gambhir Singh <[email protected]> @ 2025-04-29 01:13 ` Ed Sabol <[email protected]> 2025-04-29 02:06 ` Re: Adding New Column with default value. Ron Johnson <[email protected]> 2 siblings, 1 reply; 5+ messages in thread From: Ed Sabol @ 2025-04-29 01:13 UTC (permalink / raw) To: Gambhir Singh <[email protected]>; +Cc: [email protected] On Apr 28, 2025, at 1:24 PM, Gambhir Singh <[email protected]> wrote: > Row Count - 50 Billion I've never dealt with a table that huge personally, but my concern would be that ALTER TABLE will lock the table for a very long time. Is this in a production environment with active usage of this table? Just SELECTs or are we talking UPDATEs and INSERTs as well? If so, you might need to do something more complicated than just ALTER TABLE. If you have enough disk space in the storage area for this database to have two identical copies of this 50 billion row table (with indexes!), you could make a copy of the table and either ALTER that copy or add the new column at the same time as making the copy and then, in a single transaction, rename the two tables to swap them. If you do it this way, the new table will replace the old table seamlessly without interrupting usage of the table. Somewhere in there, you might need to re-sync the two tables to make sure any rows that got inserted or updated while you were making the copy are incorporated into the new version of the table as well. Just some initial thoughts on how I would accomplish this and things I would consider when deciding how to do it. Good luck, Ed ^ permalink raw reply [nested|flat] 5+ messages in thread
* Re: Adding New Column with default value. 2025-04-28 17:24 Adding New Column with default value. Gambhir Singh <[email protected]> 2025-04-29 01:13 ` Re: Adding New Column with default value. Ed Sabol <[email protected]> @ 2025-04-29 02:06 ` Ron Johnson <[email protected]> 0 siblings, 0 replies; 5+ messages in thread From: Ron Johnson @ 2025-04-29 02:06 UTC (permalink / raw) To: pgsql-admin On Mon, Apr 28, 2025 at 9:13 PM Ed Sabol <[email protected]> wrote: > On Apr 28, 2025, at 1:24 PM, Gambhir Singh <[email protected]> > wrote: > > Row Count - 50 Billion > > I've never dealt with a table that huge personally, but my concern would > be that ALTER TABLE will lock the table for a very long time. Is this in a > production environment with active usage of this table? Just SELECTs or are > we talking UPDATEs and INSERTs as well? If so, you might need to do > something more complicated than just ALTER TABLE. > > If you have enough disk space in the storage area for this database to > have two identical copies of this 50 billion row table (with indexes!), you > could make a copy of the table and either ALTER that copy or add the new > column at the same time as making the copy and then, in a single > transaction, rename the two tables to swap them. If you do it this way, the > new table will replace the old table seamlessly without interrupting usage > of the table. Somewhere in there, you might need to re-sync the two tables > to make sure any rows that got inserted or updated while you were making > the copy are incorporated into the new version of the table as well. > > Just some initial thoughts on how I would accomplish this and things I > would consider when deciding how to do it. > > COPY TO of that table, and then COPY FROM into a new table would let OP experiment. Since it's 50Bn rows, COPY TO of a quarter of the rows is probably adequate. Hopefully this bolding comes through: "When a column is added with *ADD COLUMN and a non-volatile DEFAULT* is specified, the default is evaluated at the time of the statement and the result stored in the table's metadata. That value will be used for the column for all existing rows. If no DEFAULT is specified, NULL is used. *In neither case is a rewrite of the table required.*" According to https://www.postgresql.org/docs/17/sql-altertable.html, "*Adding a *CHECK or *NOT NULL* constraint requires scanning the table to verify that existing rows meet the constraint, but *does not require a table rewrit*e." That's probably pretty fast, even if an exclusive lock is required. Thus, I'd probably try this on the table copy: ALTER TABLE foo ADD COLUMN bar BIGINT NOT NULL DEFAULT 0; UPDATE foo SET bar = 0 WHERE pk between 0*1000+0 AND 0*1000+9999; UPDATE foo SET bar = 0 WHERE pk between 1*1000+0 AND 1*1000+9999; UPDATE foo SET bar = 0 WHERE pk between 2*1000+0 AND 2*1000+9999; etc. The UPDATE statement would be in a bash loop, with the 0, 1, 2, 3... a variable. I'd also stick an occasional VACUUM in the bash script. -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 5+ messages in thread
end of thread, other threads:[~2025-04-29 02:06 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-04-28 17:24 Adding New Column with default value. Gambhir Singh <[email protected]> 2025-04-28 17:46 ` Ron Johnson <[email protected]> 2025-04-28 18:34 ` David G. Johnston <[email protected]> 2025-04-29 01:13 ` Ed Sabol <[email protected]> 2025-04-29 02:06 ` Ron Johnson <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox