public inbox for [email protected]
Difference between Bulk Load (Multiple inserts or single inserts) and COPY
6+ messages / 3 participants
* Difference between Bulk Load (Multiple inserts or single inserts) and COPY
From: PG Doc comments form @ 2019-11-19 18:55 UTC
To: [email protected]; +Cc: [email protected]

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:

Hello,

Myself Mayank. I am a Ph.D. student.

I experimented with Bulk load and COPY.
Loading in COPY was very fast.
However, after COPYing data from a CSV file to PostgreSQL Table. The query
execution took lot of time for 1 of the first 4 queries.
Only this slow query was taking so much time, that even if I had used normal
bulk load, it would have been faster in total.
Then all other Query executions took equal time as it took while querying a
table after the Bulk data load method.

So, I want to know the exact reason what's the issue with COPY.
How exactly they differ? The only thing from the document I could identify
was row security.
But it did not mention anything about indexing. Like, in Bulk load, do
indices(or constraint checks) are created with data loading?
& in COPY it's done after? so when indices are being created that query
slows down??

Please reply soon with more details or send a link where I can read it in
depth.

Thanks.
Mayank.
[email protected]

* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
From: Laurenz Albe @ 2019-11-19 22:55 UTC
To: [email protected]; [email protected]

On Tue, 2019-11-19 at 18:55 +0000, PG Doc comments form wrote:
> I experimented with Bulk load and COPY.
> Loading in COPY was very fast.
> However, after COPYing data from a CSV file to PostgreSQL Table. The query
> execution took lot of time for 1 of the first 4 queries.
> Only this slow query was taking so much time, that even if I had used normal
> bulk load, it would have been faster in total.
> Then all other Query executions took equal time as it took while querying a
> table after the Bulk data load method.
>
> So, I want to know the exact reason what's the issue with COPY.
> How exactly they differ? The only thing from the document I could identify
> was row security.
> But it did not mention anything about indexing. Like, in Bulk load, do
> indices(or constraint checks) are created with data loading?
> & in COPY it's done after? so when indices are being created that query
> slows down??
>
> Please reply soon with more details or send a link where I can read it in
> depth.

That cannot be answered without knowing the exact statements and the
table definitions.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

* Difference between Bulk Load (Multiple inserts or single inserts) and COPY
From: PG Doc comments form @ 2019-11-22 09:33 UTC
To: [email protected]; +Cc: [email protected]

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/10/sql-copy.html
Description:

Hello,

> I experimented with Bulk load and COPY.
> Loading in COPY was very fast.
> However, after COPYing data from a CSV file to PostgreSQL Table. The query
> execution took lot of time for 1 of the first 4 queries.
> Only this slow query was taking so much time, that even if I had used normal
> bulk load, it would have been faster in total.
> Then all other Query executions took equal time as it took while querying a
> table after the Bulk data load method.
>
> So, I want to know the exact reason what's the issue with COPY.
> How exactly they differ? The only thing from the document I could identify
> was row security.
> But it did not mention anything about indexing. Like, in Bulk load, do
> indices(or constraint checks) are created with data loading?
> & in COPY it's done after? so when indices are being created that query
> slows down??

*Added details*

"Table & Query details"
I have 1 Table is there having 3 attributes:

    TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};

I haven't created any primary keys or FKs. No other constraints.

Data set size: 150MB / 1M records

Queries:

    Select count(*) from Table;
    Select count(distinct( Column1, Column2 , Column3 )) from Table;
    Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
    where T1. Column1=T2.Column3 and T1. Column1="xyz";

Please let me know, how Bulk load vs. COPY different in both situations

1) Do the internal representation differs after data is loaded using Bulk
   vs. COPY?
2) what if I have added Keys and Constraints, are they checked later? Means
   loading is shown completed but in background it's creating indices/checking
   constraints.
3) Can it be the reason that some other process(which?) is running in
   background during query execution ? as I query the data as soon as the load
   after COPY is complete.

* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
From: Bruce Momjian @ 2019-12-01 00:55 UTC
To: Laurenz Albe <[email protected]>; +Cc: [email protected]; [email protected]

On Tue, Nov 19, 2019 at 11:55:44PM +0100, Laurenz Albe wrote:
> On Tue, 2019-11-19 at 18:55 +0000, PG Doc comments form wrote:
> > I experimented with Bulk load and COPY.
> > Loading in COPY was very fast.
> > However, after COPYing data from a CSV file to PostgreSQL Table. The query
> > execution took lot of time for 1 of the first 4 queries.
> > Only this slow query was taking so much time, that even if I had used normal
> > bulk load, it would have been faster in total.
> > Then all other Query executions took equal time as it took while querying a
> > table after the Bulk data load method.
> >
> > So, I want to know the exact reason what's the issue with COPY.
> > How exactly they differ? The only thing from the document I could identify
> > was row security.
> > But it did not mention anything about indexing. Like, in Bulk load, do
> > indices(or constraint checks) are created with data loading?
> > & in COPY it's done after? so when indices are being created that query
> > slows down??
> >
> > Please reply soon with more details or send a link where I can read it in
> > depth.
>
> That cannot be answered without knowing the exact statements and the
> table definitions.

I wonder if it is the overhead of rewriting all the rows to set the
per-row HEAP_XMIN_COMMITTED bit. Unfortunately, I don't know a way to
test this hypothesis.

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
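
One possible way to look at the hint-bit hypothesis above, offered as a sketch and not as a confirmed diagnosis of the reported slowdown: the `pageinspect` extension's `heap_page_items()` function reports each tuple's `t_infomask`, for example via `SELECT t_infomask FROM heap_page_items(get_raw_page('tablename', 0))` (the table name is a placeholder). The Python below only decodes such values; the bit constants are the ones defined in PostgreSQL's `htup_details.h`, while the helper names `xmin_status` and `count_unhinted` and the sample values are illustrative assumptions, not anything taken from the thread.

```python
# Bit values as defined in PostgreSQL's src/include/access/htup_details.h.
HEAP_XMIN_COMMITTED = 0x0100  # inserting transaction is known to have committed
HEAP_XMIN_INVALID = 0x0200    # inserting transaction is known to have aborted
HEAP_XMIN_FROZEN = HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID  # both bits set


def xmin_status(t_infomask: int) -> str:
    """Classify the xmin hint bits of a single tuple's t_infomask value."""
    bits = t_infomask & HEAP_XMIN_FROZEN
    if bits == HEAP_XMIN_FROZEN:
        return "frozen"
    if bits == HEAP_XMIN_COMMITTED:
        return "committed"
    if bits == HEAP_XMIN_INVALID:
        return "invalid"
    return "unset"


def count_unhinted(infomasks) -> int:
    """Count tuples whose xmin hint bits have not been set yet."""
    return sum(1 for mask in infomasks if xmin_status(mask) == "unset")


if __name__ == "__main__":
    # Hypothetical t_infomask values, chosen only to exercise each branch.
    sample = [0x0802, 0x0902, 0x0B02]
    for mask in sample:
        print(hex(mask), xmin_status(mask))
    print("unhinted tuples:", count_unhinted(sample))
```

If the hypothesis holds, one would expect mostly "unset" results for a page read immediately after the load commits and before any query has scanned the table, and "committed" results for the same page after the first full scan. Comparing the two loading methods this way is only a suggestion; nobody in the thread reports having tried it.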

* Re: Difference between Bulk Load (Multiple inserts or single inserts) and COPY
From: Bruce Momjian @ 2019-12-21 18:24 UTC
To: [email protected]; [email protected]

This is not a documentation question. For assistance, please join the
appropriate mailing list and post your question:

    http://www.postgresql.org/community

You can also try the #postgresql IRC channel on irc.freenode.net. See
the PostgreSQL FAQ for more information.

---------------------------------------------------------------------------

On Thu, Dec 5, 2019 at 03:39:24PM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/10/sql-copy.html
> Description:
>
> Hello,
>
> > I experimented with Bulk load and COPY.
> > Loading in COPY was very fast.
> > However, after COPYing data from a CSV file to PostgreSQL Table. The query
> > execution took lot of time for 1 of the first 4 queries.
> > Only this slow query was taking so much time, that even if I had used normal
> > bulk load, it would have been faster in total.
> > Then all other Query executions took equal time as it took while querying a
> > table after the Bulk data load method.
> >
> > So, I want to know the exact reason what's the issue with COPY.
> > How exactly they differ? The only thing from the document I could identify
> > was row security.
> > But it did not mention anything about indexing. Like, in Bulk load, do
> > indices(or constraint checks) are created with data loading?
> > & in COPY it's done after? so when indices are being created that query
> > slows down??
>
> *Added details*
>
> "Table & Query details"
> I have 1 Table is there having 3 attributes:
> TableName{ Column1 Varchar300, Column2 Varchar300, Column3 Varchar300};
> I haven't created any primary keys or FKs. No other constraints.
>
> Data set size: 150MB / 1M records
>
> Queries:
> Select count(*) from Table;
> Select count(distinct( Column1, Column2 , Column3 )) from Table;
> Select Column1, Column2, Column3 from Table as T1, Table as T2, Table as T3
> where T1. Column1=T2.Column3 and T1. Column1="xyz";
>
> Please let me know, how Bulk load vs. COPY different in both situations
> 1) Do the internal representation differs after data is loaded using Bulk
> vs. COPY?
> 2) what if I have added Keys and Constraints, are they checked later? Means
> loading is shown completed but in background it's creating indices/checking
> constraints.
> 3) Can it be the reason that some other process(which?) is running in
> background during query execution ? as I query the data as soon as the load
> after COPY is complete.

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +