From: Ron Johnson
Date: Thu, 11 Jul 2024 08:36:32 -0400
Subject: Re: Dropping column from big table
To: pgsql-general@lists.postgresql.org

On Thu, Jul 11, 2024 at 3:41 AM sud wrote:

> On Thu, 11 Jul, 2024, 12:46 pm Ron Johnson wrote:
>
>> On Wed, Jul 10, 2024 at 11:28 PM sud wrote:
>>
>>> Thank you so much. When you said *"you can execute one of the forms of
>>> ALTER TABLE that performs a rewrite of the whole table."* Does it mean
>>> that post "alter table drop column" the vacuum is going to run longer
>>> as it will try to clean up all the rows and recreate the new rows? But
>>> then how can this be avoidable or made better without impacting the
>>> system performance
>>
>> "Impact" is a non-specific word. "How much impact" depends on how many
>> autovacuum workers you've set it to use, and how many threads you set in
>> vacuumdb.
>>
>>> and blocking others?
>>
>> VACUUM never blocks.
>>
>> Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of
>> time (depending on whether or not you populate the column with a default
>> value).
>>
>> I'd detach all the partitions from the parent table, and then add the
>> new column to the not-children in multiple threads, add the column to
>> the parent and then reattach all of the children. That's the fastest
>> method, though takes some time to set up.
>
> Thank you so much.
>
> Dropping will take its own time for post vacuum however as you rightly
> said, it won't be blocking which should be fine.
>
> In regards to add column, Detaching all partitions then adding column to
> the individual partition in multiple sessions and then reattaching looks
> to be a really awesome idea to make it faster.

Do both the DROP and ADD in the same "set". Possibly in the same statement
(which would be fastest, if it works), or alternatively on the same command
line. Examples:

psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP COLUMN splat, ADD COLUMN barf BIGINT;"

psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP splat;" -c "ALTER TABLE bar_p85 ADD COLUMN barf BIGINT;"

My syntax is probably wrong, but you get the idea.

> However one doubt, Will it create issue if there already exists foreign
> key on this partition table or say it's the parent to other child
> partition/nonpartition tables?

(Note that detached children have FK constraints.)

It'll certainly create an "issue" if the column you're dropping is part of
the foreign key. 😀

It'll also cause a problem if the table you're dropping from or adding to is
the "target" of the FK, since the source can't check the being-altered table
during the ALTER TABLE statement.

Bottom line: you can optimize for:
1. minimized wall time by doing it in multiple transactions (which *might*
   bodge your application; we don't know it, so can't say for sure), OR
2. assured consistency (one transaction where you just ALTER the parent, and
   have it ripple down to the children); it'll take much longer, though.

One other issue: *if* adding the new column requires a rewrite, "ALTER
parent" *might* (but I've never tried it) temporarily use an extra 2TB of
disk space in that single transaction. Doing the ALTERs child by child
minimizes that, since each child's ALTER is its own transaction.

Whatever you do... test test test.
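A rough sketch of the child-by-child route, assuming the partitions have already been detached. Everything named here is a placeholder: the partition names (bar_p84, bar_p85), the columns (splat, barf), the host and database, and the helper functions gen_alter_sql and run_parallel are illustrative only. It generates one combined DROP/ADD statement per partition and hands each to its own psql session, so every partition is altered in its own transaction. Treat it as a starting point to test, not a finished script.

```shell
#!/usr/bin/env bash
# Hypothetical names throughout: bar_p84/bar_p85 (detached partitions),
# splat (column to drop), barf (column to add), foo.example.com / somedb.

# Print one ALTER TABLE statement per partition name passed as an argument.
# DROP and ADD are combined, so each partition is altered only once.
gen_alter_sql() {
    local part
    for part in "$@"; do
        printf 'ALTER TABLE %s DROP COLUMN splat, ADD COLUMN barf BIGINT;\n' "$part"
    done
}

# Run each generated statement in a separate psql session, at most $1 at a
# time. Each session autocommits, so each partition is its own transaction.
run_parallel() {
    local jobs=$1
    shift
    gen_alter_sql "$@" |
        xargs -P "$jobs" -I{} psql --host=foo.example.com --dbname=somedb -c '{}'
}

# Example invocation (commented out so nothing runs by accident):
# run_parallel 4 bar_p84 bar_p85
```

Running gen_alter_sql on its own first lets you eyeball the statements before anything touches the database.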