public inbox for [email protected]
6+ messages / 3 participants
* Difference between Bulk Load (Multiple inserts or single inserts) and COPY
@ 2019-11-19 18:55 PG Doc comments form <[email protected]>
2019-11-19 22:55 ` Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY Laurenz Albe <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: PG Doc comments form @ 2019-11-19 18:55 UTC (permalink / raw)
To: [email protected]; +Cc: [email protected]
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:
Hello,
My name is Mayank, and I am a Ph.D. student.
I experimented with bulk load (multiple-row or single-row INSERTs) and COPY.
Loading with COPY was very fast.
However, after COPYing data from a CSV file into a PostgreSQL table, one of
the first four queries took a long time to execute.
This one slow query took so long that even a normal bulk load would have
been faster in total.
After that, all other queries took about as long as they did when querying a
table populated by the bulk load method.
So I would like to know exactly what causes this behavior with COPY.
How exactly do the two methods differ? The only difference I could identify
in the documentation was row security.
The documentation does not mention anything about indexing. For example, in
a bulk load, are indexes built (or constraints checked) while the data is
being loaded, whereas with COPY this is done afterwards, so that a query run
while the indexes are being built slows down?
Please reply soon with more details, or send a link where I can read about
it in depth.
Thanks.
Mayank.
[email protected]
* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
2019-11-19 18:55 Difference between Bulk Load (Multiple inserts or single inserts) and COPY PG Doc comments form <[email protected]>
@ 2019-11-19 22:55 ` Laurenz Albe <[email protected]>
2019-12-01 00:55 ` Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY Bruce Momjian <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Laurenz Albe @ 2019-11-19 22:55 UTC (permalink / raw)
To: [email protected]; [email protected]
On Tue, 2019-11-19 at 18:55 +0000, PG Doc comments form wrote:
> I experimented with Bulk load and COPY.
> Loading in COPY was very fast.
> However, after COPYing data from a CSV file to PostgreSQL Table. The query
> execution took lot of time for 1 of the first 4 queries.
> Only this slow query was taking so much time, that even if I had used normal
> bulk load, it would have been faster in total.
> Then all other Query executions took equal time as it took while querying a
> table after the Bulk data load method.
>
> So, I want to know the exact reason what's the issue with COPY.
> How exactly they differ? The only thing from the document I could identify
> was row security.
> But it did not mention anything about indexing. Like, in Bulk load, do
> indices(or constraint checks) are created with data loading?
> & in COPY it's done after? so when indices are being created that query
> slows down??
>
> Please reply soon with more details or send a link where I can read it in
> depth.
That cannot be answered without knowing the exact statements and the
table definitions.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
2019-11-19 18:55 Difference between Bulk Load (Multiple inserts or single inserts) and COPY PG Doc comments form <[email protected]>
2019-11-19 22:55 ` Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY Laurenz Albe <[email protected]>
@ 2019-12-01 00:55 ` Bruce Momjian <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Bruce Momjian @ 2019-12-01 00:55 UTC (permalink / raw)
To: Laurenz Albe <[email protected]>; +Cc: [email protected]; [email protected]
On Tue, Nov 19, 2019 at 11:55:44PM +0100, Laurenz Albe wrote:
> On Tue, 2019-11-19 at 18:55 +0000, PG Doc comments form wrote:
> > I experimented with Bulk load and COPY.
> > Loading in COPY was very fast.
> > However, after COPYing data from a CSV file to PostgreSQL Table. The query
> > execution took lot of time for 1 of the first 4 queries.
> > Only this slow query was taking so much time, that even if I had used normal
> > bulk load, it would have been faster in total.
> > Then all other Query executions took equal time as it took while querying a
> > table after the Bulk data load method.
> >
> > So, I want to know the exact reason what's the issue with COPY.
> > How exactly they differ? The only thing from the document I could identify
> > was row security.
> > But it did not mention anything about indexing. Like, in Bulk load, do
> > indices(or constraint checks) are created with data loading?
> > & in COPY it's done after? so when indices are being created that query
> > slows down??
> >
> > Please reply soon with more details or send a link where I can read it in
> > depth.
>
> That cannot be answered without knowing the exact statements and the
> table definitions.
I wonder if it is the overhead of rewriting all the rows to set the
per-row HEAP_XMIN_COMMITTED bit. Unfortunately, I don't know a way to
test this hypothesis.
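
One way such a check might be approached, sketched here under assumptions rather than taken from the thread: if per-tuple `t_infomask` values are sampled from the table before and after the first slow query (for instance with the `heap_page_items` function of the `pageinspect` extension), the xmin hint bits can be decoded and counted. The flag values below are those defined in PostgreSQL's `htup_details.h`; the function names and the sample values are hypothetical.

```python
# Flag values as defined in PostgreSQL's src/include/access/htup_details.h.
HEAP_XMIN_COMMITTED = 0x0100
HEAP_XMIN_INVALID = 0x0200


def xmin_hint_state(t_infomask):
    """Classify the xmin hint bits carried in a tuple's t_infomask."""
    committed = bool(t_infomask & HEAP_XMIN_COMMITTED)
    invalid = bool(t_infomask & HEAP_XMIN_INVALID)
    if committed and invalid:
        return "frozen"  # both bits set together mean HEAP_XMIN_FROZEN
    if committed:
        return "committed"
    if invalid:
        return "invalid"
    return "unset"  # no hint yet; a reader must still look up the xmin status


def fraction_unhinted(infomasks):
    """Return the share of sampled tuples whose xmin hint bits are unset."""
    masks = list(infomasks)
    if not masks:
        return 0.0
    unset = sum(1 for m in masks if xmin_hint_state(m) == "unset")
    return unset / len(masks)


# Hypothetical sample values, not measurements from this thread.
before_first_scan = [0x0802, 0x0802, 0x0802, 0x0802]
after_first_scan = [0x0902, 0x0902, 0x0902, 0x0802]

print(fraction_unhinted(before_first_scan))  # 1.0
print(fraction_unhinted(after_first_scan))   # 0.25
```

If nearly all sampled tuples were unhinted right after the COPY and hinted after the first slow query, that would be consistent with the hypothesis, although it would not by itself show that the hint-bit writes account for the extra time.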
--
Bruce Momjian <[email protected]> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
* Difference between Bulk Load (Multiple inserts or single inserts) and COPY
@ 2019-11-22 09:33 PG Doc comments form <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: PG Doc comments form @ 2019-11-22 09:33 UTC (permalink / raw)
To: [email protected]; +Cc: [email protected]
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:
Hello,
> I experimented with Bulk load and COPY.
> Loading in COPY was very fast.
> However, after COPYing data from a CSV file to PostgreSQL Table. The query
> execution took lot of time for 1 of the first 4 queries.
> Only this slow query was taking so much time, that even if I had used normal
> bulk load, it would have been faster in total.
> Then all other Query executions took equal time as it took while querying a
> table after the Bulk data load method.
>
> So, I want to know the exact reason what's the issue with COPY.
> How exactly they differ? The only thing from the document I could identify
> was row security.
> But it did not mention anything about indexing. Like, in Bulk load, do
> indices(or constraint checks) are created with data loading?
> & in COPY it's done after? so when indices are being created that query
> slows down??
*Added details*
"Table & Query details"
I have one table with three attributes:
TableName (Column1 varchar(300), Column2 varchar(300), Column3 varchar(300));
I haven't created any primary keys or foreign keys, and there are no other
constraints.
Data set size: 150MB / 1M records
Queries:
Select count(*) from Table;
Select count(distinct( Column1, Column2 , Column3 )) from Table;
Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
where T1.Column1=T2.Column3 and T1.Column1='xyz';
Please let me know how bulk load and COPY differ in the following respects:
1) Does the internal representation of the data differ after it is loaded
using bulk load vs. COPY?
2) If I had added keys and constraints, would they be checked later? That is,
would the load be reported as complete while indexes are still being created
and constraints checked in the background?
3) Could the reason be that some other process (which one?) is running in the
background during query execution, since I query the data as soon as the
COPY load completes?
* Difference between Bulk Load (Multiple inserts or single inserts) and COPY
@ 2019-12-05 15:39 PG Doc comments form <[email protected]>
2019-12-21 18:24 ` Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY Bruce Momjian <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: PG Doc comments form @ 2019-12-05 15:39 UTC (permalink / raw)
To: [email protected]; +Cc: [email protected]
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:
Hello,
> I experimented with Bulk load and COPY.
> Loading in COPY was very fast.
> However, after COPYing data from a CSV file to PostgreSQL Table. The query
> execution took lot of time for 1 of the first 4 queries.
> Only this slow query was taking so much time, that even if I had used normal
> bulk load, it would have been faster in total.
> Then all other Query executions took equal time as it took while querying a
> table after the Bulk data load method.
>
> So, I want to know the exact reason what's the issue with COPY.
> How exactly they differ? The only thing from the document I could identify
> was row security.
> But it did not mention anything about indexing. Like, in Bulk load, do
> indices(or constraint checks) are created with data loading?
> & in COPY it's done after? so when indices are being created that query
> slows down??
*Added details*
"Table & Query details"
I have one table with three attributes:
TableName (Column1 varchar(300), Column2 varchar(300), Column3 varchar(300));
I haven't created any primary keys or foreign keys, and there are no other
constraints.
Data set size: 150MB / 1M records
Queries:
Select count(*) from Table;
Select count(distinct( Column1, Column2 , Column3 )) from Table;
Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
where T1.Column1=T2.Column3 and T1.Column1='xyz';
Please let me know how bulk load and COPY differ in the following respects:
1) Does the internal representation of the data differ after it is loaded
using bulk load vs. COPY?
2) If I had added keys and constraints, would they be checked later? That is,
would the load be reported as complete while indexes are still being created
and constraints checked in the background?
3) Could the reason be that some other process (which one?) is running in the
background during query execution, since I query the data as soon as the
COPY load completes?
* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
2019-12-05 15:39 Difference between Bulk Load (Multiple inserts or single inserts) and COPY PG Doc comments form <[email protected]>
@ 2019-12-21 18:24 ` Bruce Momjian <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Bruce Momjian @ 2019-12-21 18:24 UTC (permalink / raw)
To: [email protected]; [email protected]
This is not a documentation question. For assistance, please join the
appropriate mailing list and post your question:
http://www.postgresql.org/community
You can also try the #postgresql IRC channel on irc.freenode.net. See
the PostgreSQL FAQ for more information.
---------------------------------------------------------------------------
On Thu, Dec 5, 2019 at 03:39:24PM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/10/sql-copy.html
> Description:
>
> Hello,
>
> > I experimented with Bulk load and COPY.
> > Loading in COPY was very fast.
> > However, after COPYing data from a CSV file to PostgreSQL Table. The query
> > execution took lot of time for 1 of the first 4 queries.
> > Only this slow query was taking so much time, that even if I had used normal
> > bulk load, it would have been faster in total.
> > Then all other Query executions took equal time as it took while querying a
> > table after the Bulk data load method.
> >
> > So, I want to know the exact reason what's the issue with COPY.
> > How exactly they differ? The only thing from the document I could identify
> > was row security.
> > But it did not mention anything about indexing. Like, in Bulk load, do
> > indices(or constraint checks) are created with data loading?
> > & in COPY it's done after? so when indices are being created that query
> > slows down??
>
> *Added details*
>
> "Table & Query details"
> I have 1 Table is there having 3 attributes:
> TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};
> I haven't created any primary keys or FKs. No other constraints.
>
> Data set size: 150MB / 1M records
>
> Queries:
> Select count(*) from Table;
> Select count(distinct( Column1, Column2 , Column3 )) from Table;
> Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
> where T1. Column1=T2.Column3 and T1. Column1="xyz";
>
> Please let me know, how Bulk load vs. COPY different in both situations
> 1) Do the internal representation differs after data is loaded using Bulk
> vs. COPY?
> 2) what if I have added Keys and Constraints, are they checked later? Means
> loading is shown completed but in background it's creating indices/checking
> constraints.
> 3) Can it be the reason that some other process(which?) is running in
> background during query execution ? as I query the data as soon as the load
> after COPY is complete.
--
Bruce Momjian <[email protected]> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +