Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1srAEW-00FoRt-Kj for pgsql-general@arkaria.postgresql.org; Thu, 19 Sep 2024 06:01:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1srAEU-004mm3-Oz for pgsql-general@arkaria.postgresql.org; Thu, 19 Sep 2024 06:01:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1srAEU-004mlt-Cm for pgsql-general@lists.postgresql.org; Thu, 19 Sep 2024 06:01:38 +0000 Received: from mail-oo1-xc33.google.com ([2607:f8b0:4864:20::c33]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1srAES-001weX-AS for pgsql-general@lists.postgresql.org; Thu, 19 Sep 2024 06:01:37 +0000 Received: by mail-oo1-xc33.google.com with SMTP id 006d021491bc7-5e1baf0f764so142804eaf.1 for ; Wed, 18 Sep 2024 23:01:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726725694; x=1727330494; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ENmpt2RiYCu+sjWPWLbF8mzfXie+urGbGIpwmLzaUG0=; b=X1nN/O23HIRnrB3DSfUxyBsRpaFHLA4HqrFBdYuKD1nP9yh1MyTF7z2CKc0fEIeacz qnylTD+VwoyTXkV3DN3vF4h3SKsgEuTfVoW89/ufDmtL9/E7Ev3iODV0u+OLYTFmQwy1 Ndlu5WUiojfpy06PXDeCJ6wQ8glNaO1YWT03p6TR6lm/SfUW4jBnmX3KZH2NUFDW5G4Q 2QPufl2ZEQ8uvO+N3jwXFi1Esj59roNuMyN4FYeMkeA4Dt9QuhMsYazvEk0LtwFBEbXg QPN0qNPd7vQH3QfOi9CWtfBqHVYaMJx+1DhwOJGg17tI2qGRdi0PbFbH0CLVuf8jqbS9 UOWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726725694; x=1727330494; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ENmpt2RiYCu+sjWPWLbF8mzfXie+urGbGIpwmLzaUG0=; b=txAzyu8s9tQuwyv8xuaYMtkj4u+lD8nxZcs77MGX6W3dQE1sQATuHQlgGGNAZsNHAg UkoNLPybr3Siad3is5MIZHgX28gjN0uCfv+5OFsfxUWvLMQu3Tpogu+QsAuSo7vm0/j5 +UvWVvppjcbIMvKBZDqiTi/4Vdh0E0DGXa8iMjbliSHer9Bxtm8mKaltgbpPpc3fIeb/ SEhE71fFDd2XopBNlGw3WBrIQ4VUpa5HMic1UvOskuCyXePZqoiCM95Uujbj5IafJKv8 a//cYomnmwsG67oz90QK6gfuzegnzJ6SiIVIM9RYTlqy31n0jsOrgdJZAv7AqZPU9AwN iDFQ== X-Gm-Message-State: AOJu0YxefqULLoz+P406O4FrhJKA5oQQngYqc1fUeZTuWDSFaRA0EVpA x+R9048h8cTBxr3JBTDt9a2qQbXts81MjvtTzQYOShvir+gx48cGKtAPjcsrCLlRi+6igaChY4l ZvhhRutCidYbzZ3XeMB1dTJPtsww= X-Google-Smtp-Source: AGHT+IG5eclh2uoa6vdSsSsJ0IHxpwIgOhxwLAOWjNi1du6lc3jpqIC9xUgPLIHhMSzl9gQt12MOCyj+TL1ZcIV5Igs= X-Received: by 2002:a05:6820:50b:b0:5e1:efda:3991 with SMTP id 006d021491bc7-5e20bfd4706mr10141288eaf.0.1726725694185; Wed, 18 Sep 2024 23:01:34 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ron Johnson Date: Thu, 19 Sep 2024 02:01:23 -0400 Message-ID: Subject: Re: How batch processing works To: Lok P Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000a06a62062272a7c7" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000a06a62062272a7c7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Sep 19, 2024 at 1:31=E2=80=AFAM Lok P wrote: > Hello, > Saw multiple threads around the same , so I want some clarification. As w= e > know row by row is slow by slow processing , so in heavy write systems(sa= y > the client app is in Java) , people asked to do DMLS in batches rather in= a > row by row fashion to minimize the chatting or context switches between > database and client which is resource intensive. What I understand is tha= t > , a true batch processing means the client has to collect all the input > bind values and prepare the insert statement and submit to the database = at > oneshot and then commit. > > What it means actually and if we divide the option as below, which method > truly does batch processing or there exists some other method for doing t= he > batch processing considering postgres as backend database? > > I understand, the first method below is truly a row by row processing in > which context switches happen between client and database with each row, > whereas the second method is just batching the commits but not a true bat= ch > processing as it will do the same amount of context switching between the > database and client. But regarding the third and fourth method, will both > execute in a similar fashion in the database with the same number of > context switches? If any other better method exists to do these inserts i= n > batches? Appreciate your guidance. > > > CREATE TABLE parent_table ( > id SERIAL PRIMARY KEY, > name TEXT > ); > > CREATE TABLE child_table ( > id SERIAL PRIMARY KEY, > parent_id INT REFERENCES parent_table(id), > value TEXT > ); > > > Method-1 > > insert into parent_table values(1,'a'); > commit; > insert into parent_table values(2,'a'); > commit; > insert into child_table values(1,1,'a'); > Commit; > insert into child_table values(1,2,'a'); > commit; > > VS > > Method-2 > > insert into parent_table values(1,'a'); > insert into parent_table values(2,'a'); > insert into child_table values(1,1,'a'); > insert into child_table values(1,2,'a'); > Commit; > As a former "DP" programmer, from an application point of view, this is absolutely batch programming. My experience was with COBOL and C, though, which were low overhead. From what I've seen in PG log files, JDBC is astoundingly chatty. [snip] > > Method-4 > > INSERT INTO parent_table VALUES (1, 'a'), (2, 'a'); > INSERT INTO child_table VALUES (1,1, 'a'), (1,2, 'a'); > commit; > If I knew that I had to load a structured input data file (even if it had parent and child records), this is how I'd do it (but probably first try and see if "in-memory COPY INTO" is such a thing). --=20 Death to , and butter sauce. Don't boil me, I'm still alive. crustacean! --000000000000a06a62062272a7c7 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Thu, Sep 19, 2024 at 1:31=E2=80=AFAM L= ok P <loknath.73@gmail.com&g= t; wrote:
Hello,
Saw multiple threads aroun= d the same , so I want some clarification. As we know row by row is slow by= slow processing , so in heavy write systems(say the client app is in Java)= , people asked to do DMLS in batches rather in a row by row fashion to min= imize the chatting or context switches between database and client which is= resource intensive. What I understand is that , a true batch processing me= ans the client has to collect all the input bind values and =C2=A0prepare t= he insert statement and submit to the database at oneshot and then commit. =

What it means actually and if we divide the option as below, which = method truly does batch processing or there exists some other method for do= ing the batch processing considering postgres as backend database?

I= understand, the first method below is truly a row by row processing in whi= ch context switches happen between client and database with each row, where= as the second method is just batching the commits but not a true batch proc= essing as it will do the same amount of context switching between the datab= ase and client. But regarding the third and fourth method, will both execut= e in a similar fashion in the database with the same number of context swit= ches? If any other better method exists to do these inserts in batches? App= reciate your guidance.


CREATE TABLE parent_table (
=C2=A0 =C2= =A0 id SERIAL PRIMARY KEY,
=C2=A0 =C2=A0 name TEXT
);

CREATE T= ABLE child_table (
=C2=A0 =C2=A0 id SERIAL PRIMARY KEY,
=C2=A0 =C2=A0= parent_id INT REFERENCES parent_table(id),
=C2=A0 =C2=A0 value TEXT
= );


Method-1

insert into parent_table values(1,'a'= );
commit;
insert into parent_table values(2,'a');
commi= t;
insert into child_table values(1,1,'a');
Commit;
inser= t into child_table values(1,2,'a');
commit;

VS

M= ethod-2

insert into parent_table values(1,'a');
insert i= nto parent_table values(2,'a');
insert into child_table values(= 1,1,'a');
insert into child_table values(1,2,'a'); =C2= =A0
Commit;

As a former = "DP" programmer, from an application point of view, this is absol= utely=C2=A0batch programming.

My experience was wi= th COBOL and C, though, which were low overhead.=C2=A0 From what I've s= een in PG log files, JDBC is astoundingly chatty.
=C2=A0
[snip]=C2=A0

Method-4

INSERT INTO parent_table VALUES =C2=A0= (1, 'a'), (2, 'a');
INSERT INTO child_table VALUES =C2= =A0 (1,1, 'a'), (1,2, 'a');
commit;

If I knew that I had to load a structured inp= ut data file (even if it had parent=C2=A0and child records), this is how I&= #39;d do it (but probably first try and see if "in-memory COPY INTO&qu= ot; is such a thing).

--
Death to <Redacted>, and butter sauce.
Don't boil me, I&= #39;m still alive.
<Redacted> crustacean!
--000000000000a06a62062272a7c7--