Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rt6dH-00CEKO-E1 for pgsql-general@arkaria.postgresql.org; Sat, 06 Apr 2024 14:02:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rt6dG-00Cd9o-9L for pgsql-general@arkaria.postgresql.org; Sat, 06 Apr 2024 14:02:58 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rt6dF-00Cd9f-VR for pgsql-general@lists.postgresql.org; Sat, 06 Apr 2024 14:02:57 +0000 Received: from mail-qv1-xf2f.google.com ([2607:f8b0:4864:20::f2f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rt6dD-001AJI-5M for pgsql-general@lists.postgresql.org; Sat, 06 Apr 2024 14:02:56 +0000 Received: by mail-qv1-xf2f.google.com with SMTP id 6a1803df08f44-6994bcf7e80so4929056d6.0 for ; Sat, 06 Apr 2024 07:02:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712412174; x=1713016974; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=rEDMCCVMzBwhqVo/gPuBjk0XXF8jhN5HXRpPr5xB4Ek=; b=PciN65bMYrhR1RxOne85udq6LKX7+a7VqtrTsLr+3zuNKIDcfAwJdvFWbAi/advMtf I2XN6MeBahZ23Jxk2AgWmYX25cKf7Vu/FQ5JdsR46xCEvcVZ6KqM/vAkheNProoOJIXC 33Bp6Enq+CIFdTj/kN2SQf1f7tQTvTaaKfkde3N5iZz5/mYsb5osfbSC8MC2L7bwcCEi PXsMxWk629h3sSQgYuC2o3xvr2AiXL49Wi0/px1/Gysdd/Cvr7FGSoA5OR3D8Fs9ZpmD b95LJxAwdvsRDob9E1SLWg3OaXXt3Kq+dAZCykXDjAwria0Fq9D/AmqPz1XJjRKymcrQ G3Fw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712412174; x=1713016974; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=rEDMCCVMzBwhqVo/gPuBjk0XXF8jhN5HXRpPr5xB4Ek=; b=MxRX7tcKW4dbv0LnLhhXyqx37N3w0KdYGh/xlNabhjWrrGsjwm2tDCk3bjnD1PTnmI TMYe3HhPimHa9DcGmEmBJwyIqegHW4g78WXjYssWUnLEzBGEpDIik0TCOhCnswisRQre 2k9pQ60hcmeArDIjgD6szhBe2Jbri7GSm7ObOEvqOAt1zDxlEaU6xLLNnH2ZVD/fE5/2 Nyv5+kcY76YlAX4ruXH+FsujrIM3iW3iR2uG9Pj1J0bB5ll1Tgmcd4ssLoIvKGh23toe yVRvqBvCaXKuPUK8GKBL6ihY+b1O5aD/SP28HoOUJYELnjJ99DLYm4lor4llwCvJ3JJC mfaw== X-Forwarded-Encrypted: i=1; AJvYcCWEFW355nN55ekFKsoNBdLi08P7/Lp9tk/f1kFZEc9vTPJdnCSA9SkFJHktTg/P+P6/EhVRPC6oDc9Br0fK+TZFLvefKq6DXRaAFlDBrnbnYurT X-Gm-Message-State: AOJu0YxcaaG4bWn7WDCce2wAjMukWKtb6JgQ8mwAXENcNFeg3SCH+ZZ5 CaRU7VgL0k2Pc+DS6K6TRMQTBpsWzARF1c9RPTv7nO76IxOwxcCS+/JK+Dk9ExtZWAnnSTyRR02 ahExAiyHZgvdYeVcfzFTcn+4o8vc= X-Google-Smtp-Source: AGHT+IFyB+vrdsJuIUhTZeZ3/61luPEotfDnl01GH7x5bYCn1YZxYiFDYKbXOFnUQ5TsiPG0BwOqRdbXSNrDSnrpOQM= X-Received: by 2002:a0c:e2d4:0:b0:699:3138:54ad with SMTP id t20-20020a0ce2d4000000b00699313854admr8393730qvl.21.1712412174430; Sat, 06 Apr 2024 07:02:54 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: veem v Date: Sat, 6 Apr 2024 19:32:43 +0530 Message-ID: Subject: Re: Moving delta data faster To: Adrian Klaver , Greg Sabino Mullane Cc: yudhi s , pgsql-general Content-Type: multipart/alternative; boundary="0000000000005db88f06156e0785" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005db88f06156e0785 Content-Type: text/plain; charset="UTF-8" On Fri, 5 Apr 2024 at 06:10, Adrian Klaver wrote: > > > S3 is not a database. You will need to be more specific about '... > then > > from the S3 it will be picked and gets merged to the target postgres > > database.' > > > > > > The data from S3 will be dumped into the stage table and then the > > upsert/merge from that table to the actual table. > > The S3 --> staging table would be helped by having the data as CSV and > then using COPY. The staging --> final table step could be done as > either ON CONFLICT or MERGE, you would need to test in your situation to > verify which works better. > Just a thought , in case the delta record changes are really higher(say >30-40% of the total number of rows in the table) can OP also evaluate the "truncate target table +load target table" strategy here considering DDL/Trunc is transactional in postgres so can be done online without impacting the ongoing read queries and also performance wise, it would be faster as compared to the traditional Update/Insert/Upsert/Merge? --0000000000005db88f06156e0785 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Fri, 5 Apr 2024 at 06:10, Adrian Klave= r <adrian.klaver@aklaver.co= m> wrote:

>=C2=A0 =C2=A0 =C2=A0S3 is not a database. You will need to be more spec= ific about '... then
>=C2=A0 =C2=A0 =C2=A0from the S3 it will be picked and gets merged to th= e target postgres
>=C2=A0 =C2=A0 =C2=A0database.'
>
>
> The data from S3 will be dumped into the stage table and then the
> upsert/merge from that table to the actual table.

The S3 --> staging table would be helped by having the data as CSV and <= br> then using COPY. The staging --> final table step could be done as
either ON CONFLICT or MERGE, you would need to test in your situation to verify which works better.
=C2=A0
Just a thought = , in case the delta record changes are really higher(say >30-40% of the = total number of rows in the table) can OP also evaluate the "truncate = target table +load target table" strategy here considering DDL/Trunc i= s transactional in postgres so can be done online without impacting the ong= oing read queries and also performance wise, it would be faster as compared= to the traditional=C2=A0Update/Insert/Upsert/Merge?
=C2=A0
--0000000000005db88f06156e0785--