MIME-Version: 1.0
References: <CAKna9VaJ_qHKBnw4O-VT3xGmzqThCuZ=LFXx-hPdw7E6RoqmeA@mail.gmail.com>
 <CAD=mzVUmXmkdvvMG30G1=D4Kq3WqnzGo=0ov9JnRCs1p=KJiTQ@mail.gmail.com>
 <CAKna9VZRc4+Vzbt6qPGMCauE84isPtz-wE_KX9AOt7WKfhwjiQ@mail.gmail.com> <CAD=mzVUX13ZM16kP4QhY+F5XiLr=ezCXftKOTKA4eUvhphgOJw@mail.gmail.com>
In-Reply-To: <CAD=mzVUX13ZM16kP4QhY+F5XiLr=ezCXftKOTKA4eUvhphgOJw@mail.gmail.com>
From: Lok P <loknath.73@gmail.com>
Date: Fri, 9 Aug 2024 00:08:39 +0530
Message-ID: <CAKna9Vb_mx+dX02XOV6mpr8RFC-5io38kM6=4xRHQj_MUvQ+aQ@mail.gmail.com>
Subject: Re: Column type modification in big tables
To: sud <suds1434@gmail.com>
Cc: pgsql-general <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="000000000000a0a5ba061f30560e"
Archived-At: <https://www.postgresql.org/message-id/CAKna9Vb_mx%2BdX02XOV6mpr8RFC-5io38kM6%3D4xRHQj_MUvQ%2BaQ%40mail.gmail.com>
Precedence: bulk

--000000000000a0a5ba061f30560e
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Aug 8, 2024 at 1:06 AM sud <suds1434@gmail.com> wrote:

>
>
> On Wed, Aug 7, 2024 at 5:00 PM Lok P <loknath.73@gmail.com> wrote:
>
>>
>>
>> On Wed, Aug 7, 2024 at 4:51 PM sud <suds1434@gmail.com> wrote:
>>
>>>
>>>
>>> Others may correct me, but I think that if you don't have an FK defined on
>>> these columns, you can do the following.
>>>
>>>
>>> -- Add the new columns. This is very fast (seconds) because it only
>>> -- updates the system catalog.
>>>
>>> ALTER TABLE tab1 ADD COLUMN new_column1 NUMERIC(3),
>>>   ADD COLUMN new_column2 varchar(3);
>>>
>>>
>>> -- Back-populate the data one partition at a time, committing after each,
>>> -- if it's really needed.
>>>
>>> UPDATE tab1_part1 SET new_column1 = CAST(old_column1 AS NUMERIC(3)),
>>> new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> UPDATE tab1_part2 SET new_column1 = CAST(old_column1 AS NUMERIC(3)),
>>> new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> UPDATE tab1_part3 SET new_column1 = CAST(old_column1 AS NUMERIC(3)),
>>> new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> .....
>>>
>>>
>>> -- Drop the old columns. This is also very fast (seconds) because it only
>>> -- marks them as dropped in the system catalog.
>>> ALTER TABLE tab1 DROP COLUMN old_column1, DROP COLUMN old_column2;
>>>
>>
>>
>>
>> Thank you so much.
>>
>> I understand this will be the fastest possible way to achieve the column
>> modification.
>>
>> But the dropped columns will still be sitting in the table and consuming
>> storage space. Is it fine to leave them as is, or will autovacuum remove the
>> column values behind the scenes? In any case, once those partitions are
>> purged, the old values will be purged along with them. Is this
>> understanding correct?
>>
>> Also, will this have any impact on the partition maintenance that is
>> currently done by pg_partman, given that the template table is now
>> different internally (though not from the outside)? Will those dropped
>> columns on the main table cause a conflict?
>>
>
> I think leaving the table as is after dropping the columns will be fine for
> you, because your regular partition maintenance/drop will slowly purge the
> historical partitions and eventually they will be removed. But if you
> update those new columns with the old column values, then autovacuum should
> also take care of removing the rows with the older column values (which are
> actually dead).
>
> I am not sure whether pg_partman will cause any issue, since the table now
> has the column data type/length changed. Others may confirm.
>

Thank you so much.

Can anybody suggest any other possible way here? We also need to copy the
existing values into the new columns using an UPDATE command (even if it
updates one partition at a time). As far as I can see, almost all the values
in these columns are not null, which means it will update almost 5 billion
rows across all the partitions. So my question is: is there any parameter
(like work_mem, maintenance_work_mem, etc.) that we can set to make this
update faster? Or is there any other way to get these columns altered apart
from this method?
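
For reference, the one-partition-at-a-time update could be scripted roughly
as below. This is only a sketch under assumptions, not a tested procedure: it
assumes PostgreSQL 11 or later (so that COMMIT is allowed inside a DO block),
that it is run outside an explicit transaction block, and that the partitions
are direct children of tab1 in pg_inherits (sub-partitions would need another
level). The table and column names are the placeholders from the example
quoted above.

```sql
-- Sketch only: back-populate tab1 one partition at a time,
-- committing after each partition.
DO $$
DECLARE
    part regclass;  -- the partition currently being processed
BEGIN
    FOR part IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'tab1'::regclass
    LOOP
        -- Only touch rows that are not converted yet, so the block can be
        -- re-run after a failure without rewriting finished partitions.
        EXECUTE format(
            'UPDATE %s
                SET new_column1 = CAST(old_column1 AS numeric(3)),
                    new_column2 = CAST(old_column2 AS varchar(3))
              WHERE new_column1 IS DISTINCT FROM CAST(old_column1 AS numeric(3))
                 OR new_column2 IS DISTINCT FROM CAST(old_column2 AS varchar(3))',
            part);
        COMMIT;  -- each partition's update becomes its own transaction
    END LOOP;
END
$$;
```

Committing per partition keeps each transaction bounded and lets the dead row
versions of a finished partition become eligible for vacuum before the next
one starts. The WHERE clause is my own addition to make a re-run cheap; it is
not part of the statements suggested earlier in the thread.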


--000000000000a0a5ba061f30560e--