Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sc82Q-004Axn-Uk for pgsql-general@arkaria.postgresql.org; Thu, 08 Aug 2024 18:39:02 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sc82P-00G1uy-Hq for pgsql-general@arkaria.postgresql.org; Thu, 08 Aug 2024 18:39:01 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sc82P-00G1up-52 for pgsql-general@lists.postgresql.org; Thu, 08 Aug 2024 18:39:01 +0000 Received: from mail-yw1-x1133.google.com ([2607:f8b0:4864:20::1133]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sc82I-003qDI-KT for pgsql-general@lists.postgresql.org; Thu, 08 Aug 2024 18:39:00 +0000 Received: by mail-yw1-x1133.google.com with SMTP id 00721157ae682-651583964d8so1579027b3.3 for ; Thu, 08 Aug 2024 11:38:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723142333; x=1723747133; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Df0Di7irJwZSbEEMVHyNJBkPUsvRBo9cM77anxHwQko=; b=C0X37VMh4j9n1ryIXDi6WXOZ5cpANUpPshp7GhuqiRwM4/ImKU1kSF+bi2u67rVrxA 3Xq6pGlEKIdSpwK2hFPnVwf0K+X1C0U4CH1i2UUprODK6mvYV6vzv0nTh7PtSgu+vi2e +6JzbPztAkDdQTGgjtHdYz0ow0LAyFvwDQe+QsM6FGTyJJKPBa6u996MxYoHdwdn29Zq 3S53XxyBpeoYuk3dJ5foDYBVe71vQ1wBdPpY/lZ8aatmLp6Qnl7V8bBnnqVyQ77ZcV0Q E7UqYfIpZcx/II5u/K1tja0/utvqZFe3rvcRSrXsD9U3VaGnLFHe3LMRFzko+FPurm1p +YMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723142333; x=1723747133; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Df0Di7irJwZSbEEMVHyNJBkPUsvRBo9cM77anxHwQko=; b=qoMs9CKQKt4TidQTnX/lMvQSFUPcnbuDmEsGkRhjtq8V6EUzHO6hkZ368KeWZbW8JG dyPKYlYQfVGfQ9phoBejoqDiZML3wOI7pijl+t/gaMzAibSanwzUBL8wbB8viSm9nU3e tDGE6KIR6Np1Z+GToFPMoMgJOJuVbaGDUsdetZGRPULeO61lzdAqhlydSC/wcqcJBgLw ZecZA49JNQwM6FlHKKJlUd3dBLHxUYCtWZ2umlWwfQQagFp6WHbQuhndax1Oud98qixq Owu4dPf8rVh6A/C/CZNcv1zIaX9PHxblv3jhmqq2VC8HdslO8KucH+/YwKSjAJzpZn3h kXag== X-Gm-Message-State: AOJu0YwHTOICMaHwpLXZiNxUBWdaUw+TCxDp9tZDKz33LRwhCcEjOtmf nKyhgecUr/DX0+ejqIFxKBw80e+gZlJaeP4s8vqUWq4XYFHo+yTu0GloBUaBfBT9zN29qcWIf7h tY2OtTHxiVxUDZ7GGWZLrJAQsxNo= X-Google-Smtp-Source: AGHT+IEPjbqJYGJ8f9mDDTnSC5XpkB1rhVQ9ja5619CR3gCnyB6C0efhkheQcTwivkRCC7cabqWtn3zeepPnzaqiocY= X-Received: by 2002:a05:690c:ed0:b0:695:d608:e7e9 with SMTP id 00721157ae682-69bf198db6cmr16883717b3.0.1723142332520; Thu, 08 Aug 2024 11:38:52 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Lok P Date: Fri, 9 Aug 2024 00:08:39 +0530 Message-ID: Subject: Re: Column type modification in big tables To: sud Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000a0a5ba061f30560e" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000a0a5ba061f30560e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Aug 8, 2024 at 1:06=E2=80=AFAM sud wrote: > > > On Wed, Aug 7, 2024 at 5:00=E2=80=AFPM Lok P wrote= : > >> >> >> On Wed, Aug 7, 2024 at 4:51=E2=80=AFPM sud wrote: >> >>> >>> >>> Others may correct but i think, If you don't have the FK defined on >>> these columns you can do below. >>> >>> >>> --Alter table add column which will be very fast within seconds as it >>> will just add it to the data dictionary. >>> >>> ALTER TABLE tab1 ADD COLUMN new_column1 NUMERIC(3), new_column2 >>> varchar2(3); >>> >>> >>> *-- Back populate the data partition wise and commit, if it's really >>> needed* >>> >>> UPDATE tab1_part1 SET new_column1 =3D CAST(old_column1 AS NUMERIC(3)), >>> new_column2 =3D CAST(old_column2 AS varchar2(3)) ; >>> commit; >>> UPDATE tab1_part2 SET new_column1 =3D CAST(old_column1 AS NUMERIC(3)), >>> new_column2 =3D CAST(old_column2 AS varchar2(3)) ; >>> commit; >>> UPDATE tab1_part3 SET new_column1 =3D CAST(old_column1 AS NUMERIC(3)), >>> new_column2 =3D CAST(old_column2 AS varchar2(3)) ; >>> commit; >>> ..... >>> >>> >>> *--Alter table drop old columns which will be very fast within seconds >>> as it will just drop it from the data dictionary.* >>> ALTER TABLE your_table DROP COLUMN old_column1, DROP COLUMN old_column2= ; >>> >> >> >> >> Thank you so much. >> >> I understand this will be the fastest possible way to achieve the column >> modification. >> >> But talking about the dropped column which will be sitting in the table >> and consuming storage space, Is it fine to leave as is or auto vacuum wi= ll >> remove the column values behind the scene and also anyway , once those >> partitions will be purged they will be by default purged. Is this >> understanding correct? >> >> And also will this have any impact on the partition maintenance which i= s >> currently done by pg_partman as because the template table is now differ= ent >> internally(not from outside though). Will it cause conflict because of >> those dropped columns from the main table? >> > > I think leaving the table as is after the dropping column will be fine fo= r > you because your regular partition maintenance/drop will slowly purge the > historical partitions and eventually they will be removed. But if you > update those new columns with the old column values, then autovacuum shou= ld > also take care of removing the rows with older column values (which are > dead actually) . > > Not sure if pg_partman will cause any issue ,as because the table now has > the column data type/length changed. Others may confirm. > Thank you so much. Can anybody suggest any other possible way here. As, we also need to have the existing values be updated to the new column value here using update command (even if it will update one partition at a time). And as I see we have almost all the values in the column not null, which means it will update almost ~5billion rows across all the partitions. So my question is , is there any parameter(like work_mem,maintenance_work_mem etc) which we can set to make this update faster? or any other way to get this column altered apart from this method? > --000000000000a0a5ba061f30560e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Thu, Aug 8, 2024 at 1:06=E2=80=AFAM su= d <suds1434@gmail.com> wrot= e:


On Wed, Aug 7, 2024 at 5:00=E2=80=AFPM Lok P <<= a href=3D"mailto:loknath.73@gmail.com" target=3D"_blank">loknath.73@gmail.c= om> wrote:


On Wed, Aug 7, 2024 at 4:51=E2=80= =AFPM sud <suds1= 434@gmail.com> wrote:


Others may correct= but i think, If you don't have the FK defined on these columns you can= do below.


--Alter table add column which will be very fast with= in seconds as it will just add it to the data dictionary.

ALTER TABL= E tab1 ADD COLUMN new_column1 NUMERIC(3),=C2=A0 =C2=A0new_column2 varchar2(= 3);


-- Back populate the data partition wise an= d commit, if it's really needed

UPDATE tab1_part1 SET new_co= lumn1 =3D CAST(old_column1 AS NUMERIC(3)), new_column2 =3D CAST(old_column2= AS varchar2(3)) ;
commit;
UPDATE tab1_part2 SET new_column1 =3D CAST= (old_column1 AS NUMERIC(3)), new_column2 =3D CAST(old_column2 AS varchar2(3= )) ;
commit;
UPDATE tab1_part3 SET new_column1 =3D CAST(old_column1 A= S NUMERIC(3)), new_column2 =3D CAST(old_column2 AS varchar2(3)) ;
commit= ;
.....

--Alter table drop old columns which will be very fast= within seconds as it will just drop it from the data dictionary.
ALTER TABLE your_table DROP COLUMN old_column1, DROP COLUMN old_column2;= =C2=A0



Thank you = so much.

I understand this will be the fastest possible way to achi= eve the column modification.

But talking about the dropped col= umn which will be sitting in the table and consuming=C2=A0storage space, Is= it fine to leave as is or auto vacuum will remove the column values behind= the scene and also anyway , once those partitions will be purged they will= be by default purged. Is this understanding correct?

<= div>=C2=A0And also will this have any impact on the partition=C2=A0maintena= nce which is currently done by pg_partman as because the template table is = now different internally(not from outside though). Will it cause conflict b= ecause=C2=A0of those dropped columns from the main table?
=

I think leaving the table as is after the = dropping column will be fine for you because your regular partition mainten= ance/drop will slowly purge the historical partitions and eventually=C2=A0t= hey will be removed. But if you update those new columns with the old colum= n values, then autovacuum=C2=A0should also take care of removing the rows w= ith older column values (which are dead actually) .

Not sure if pg_partman will cause any issue ,as because the table now has= the column data type/length changed. Others may confirm.
=

Thank you so much.=C2=A0

Can anybody suggest any other possible way here. As, we also need t= o have the existing values be updated to the new column value here using up= date command (even if it will update one partition at a time). And as I see= we have almost all the values in the column not null, which means it will = update almost ~5billion rows across all the partitions. So my question is ,= is there any parameter(like work_mem,maintenance_work_mem etc) which we ca= n set to make this update faster?
=C2=A0or any other way to get t= his column altered apart from this method?
--000000000000a0a5ba061f30560e--