Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1djlH2-0001vt-F7 for pgsql-performance@arkaria.postgresql.org; Mon, 21 Aug 2017 11:53:40 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1djlH2-0004QU-0e for pgsql-performance@arkaria.postgresql.org; Mon, 21 Aug 2017 11:53:40 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1djlH1-0004QK-C1 for pgsql-performance@postgresql.org; Mon, 21 Aug 2017 11:53:39 +0000 Received: from mail-wr0-x22a.google.com ([2a00:1450:400c:c0c::22a]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1djlGw-0005xZ-RV for pgsql-performance@postgresql.org; Mon, 21 Aug 2017 11:53:38 +0000 Received: by mail-wr0-x22a.google.com with SMTP id p14so28177881wrg.1 for ; Mon, 21 Aug 2017 04:53:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=eeHtPrW5imE3/UM9vr5Kv9fxPcFC4esU17BkQ3X2exU=; b=XGmQLEZx8xReq+F8+Um9eoyvZ/v+P1mhbkyE337jOFCOzzVCyhczf7UiJ0fgVWd9NY rWRsKBoYNd0HyZqnACnHxjmUNbbvVtUTuZsvb8TB7/WOrPe0T5ZDNkw38I1Ox63scaKj tHm409iZWBOEDRTdvIMeX6HmVEK9stUdKPP6WCr3/t++51DqgMs9i18AU1ujlUUX/Nz9 ZZHLSn9H9ygyrN3015pPO/bwflIKPh1tkN9HLBlFZfyTO2oDTugQw5rv8gV8IE6wBdFi 26Snr395kUPk3jcitQM3zg8B4lC8Ob5LWqcsVYQtimmb8mpLtoX87gWAtuzVh6zRhqxD 425Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=eeHtPrW5imE3/UM9vr5Kv9fxPcFC4esU17BkQ3X2exU=; b=BoeI4J5YlTQ4Xd/bBl1Lp5NxNJlZuxXB/QrhtT4DE4RSimTugexjsCA/45/Loa2HBz vbEjrnfJx51luCdzeCSpf/dAKGRUDQ0fhn0COOPM6OM/Mr9JFGUxYCfdzVOytB4SZbPm c2phYIYIUnIvZDgDR4XjAO8yK+bH6qHM3eHTmPN/I/ySZbkCNgLN4C1totu3DsXtGbXL xS6p2Grrqn6q6c5aEKX2WnexOGiR8q+WVK9J5BX6aBEW+q70XkFoqmWPzqyuwlwfKDPR CIzMeeqfmIDI/JB4JrfL21li9qbMY0eQCAx8VnSRzTqj25Dg1L+5Yg8md7/+Lr2E+w7J eJqA== X-Gm-Message-State: AHYfb5jmT2RsqiMTNCMnRowr1KJDDnGTvcTC7GsGEKfbwcyhCbWUvDrs wR+02enPxwGI1g== X-Received: by 10.28.180.8 with SMTP id d8mr6793272wmf.161.1503316413500; Mon, 21 Aug 2017 04:53:33 -0700 (PDT) Received: from [192.168.1.33] (213.red-81-40-123.staticip.rima-tde.net. [81.40.123.213]) by smtp.gmail.com with ESMTPSA id 12sm15521714wmy.44.2017.08.21.04.53.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 Aug 2017 04:53:32 -0700 (PDT) From: Daniel Blanch Bataller Message-Id: <9BC9216C-5081-403D-A08C-77B3AB154F65@gmail.com> Content-Type: multipart/alternative; boundary="Apple-Mail=_AE29AF20-C7FF-42F0-AB62-575BC39E65FF" Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: performance problem on big tables Date: Mon, 21 Aug 2017 13:53:30 +0200 In-Reply-To: Cc: "pgsql-performance@postgresql.org" To: Mariel Cherkassky References: <88350892-C4AB-4191-9F8E-DB7CFA81F6C6@gmail.com> <00990FBB-D288-4878-8B8E-36E24F808F2C@gmail.com> <236952AA-596F-4526-BB84-2EDC4906454F@gmail.com> X-Mailer: Apple Mail (2.3273) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --Apple-Mail=_AE29AF20-C7FF-42F0-AB62-575BC39E65FF Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > El 21 ago 2017, a las 13:27, Mariel Cherkassky = escribi=C3=B3: >=20 > All this operation runs as part of a big transaction that I run. > How can I create a dump in the oracle server and copy it to the = postgresql server from a postgresql transaction ? I guess you could create a user defined function in any of the available = languages (perl, python, java, =E2=80=A6). Functions run inside = transactions too=E2=80=A6this is not simple, though.=20 > Chopping the table is optional when I use copy, but when I use copy to = remote oracle table it takes longer to create the dump.=20 It may take longer depending on how the oracle machine, table and = database are configured. In my experience oracle is not very fast = dumping whole tables, not to mention tables with BLOB data, which can be = as slow as hundreds of records per second (which is probably not your = case). If this transaction is to synchronize data between transactional servers = and data analysis servers you may consider using some type of = replication where only changes are sent. EnterpriseDB has tools to do = such things, I=E2=80=99m not aware of any other tool that can do this = between oracle and postgres. Regards, Daniel. >=20 > 2017-08-21 11:37 GMT+03:00 Daniel Blanch Bataller = >: >=20 >> El 21 ago 2017, a las 10:00, Mariel Cherkassky = > = escribi=C3=B3: >>=20 >> To summarize, I still have performance problems. My current situation = :=20 >> I'm trying to copy the data of many tables in the oracle database = into my postgresql tables. I'm doing so by running insert into = local_postgresql_temp select * from remote_oracle_table. The performance = of this operation are very slow and I tried to check the reason for that = and mybe choose a different alternative. >>=20 >> 1)First method - Insert into local_postgresql_table select * from = remote_oracle_table this generated total disk write of 7 M/s and actual = disk write of 4 M/s(iotop). For 32G table it took me 2 hours and 30 = minutes. >>=20 >> 2)second method - copy (select * from oracle_remote_table) to = /tmp/dump generates total disk write of 4 M/s and actuval disk write of = 100 K/s. The copy utility suppose to be very fast but it seems very = slow. >>=20 >>=20 >=20 >=20 > Are you using a FDW to access oracle server and then dump it using = copy? This is going to be slow, FDW isn't fast. >=20 >=20 >> -When I run copy from the local dump, the reading is very fast 300 = M/s. >>=20 >>=20 >=20 > You reported it was slow before. What has changed? How much does it = take to load the 32G table then? >=20 >=20 >> -I created a 32G file on the oracle server and used scp to copy it = and it took me a few minutes. >>=20 >>=20 >> -The wals directory is located on a different file system. The = parameters I assigned : >>=20 >> min_parallel_relation_size =3D 200MB >> max_parallel_workers_per_gather =3D 5=20 >> max_worker_processes =3D 8=20 >> effective_cache_size =3D 12GB >> work_mem =3D 128MB >> maintenance_work_mem =3D 4GB >> shared_buffers =3D 2000MB >> RAM : 16G >> CPU CORES : 8 >> HOW can I increase the writes ? How can I get the data faster from = the oracle database to my postgresql database? >>=20 >>=20 >=20 >=20 > Extract the table to a file in the oracle server in a format that the = COPY utility can read, then copy it to postgres server and load it. You = can even pipe commands and do it in a single step. >=20 > This is what I meant when I said that COPY is much faster than any = thing else. To make it even faster, if I/O is not your bottleneck, you = can chop the table in chunks and load it in parallel as I told you = before, I have done this many times when migrating data from oracle to = postgres. ora2pg uses this method to migrate data from oracle to = postgres too.=20 >=20 >>=20 >> 2017-08-20 14:00 GMT+03:00 Mariel Cherkassky = >: >> I realized something weird. When I`m preforming the copy utility of = postgresql in order to create dump from a local table in my postgresql = db it takes for 32G table 20 minutes. When I try to use copy for a = foregin table (on oracle database) It takes more than 2 hours.. During = the copy operation from the foreign table I dont see alot of write = operations, with iotop i see that its writes 3 M/s. What else I can = check ?=20 >>=20 >> 2017-08-20 9:39 GMT+03:00 Mariel Cherkassky = >: >> This server is dedicated to be a postgresql production database, = therefore postgresql is the only thing the runs on the server. The fs = that I`m using is xfs. I`ll add two different disks - one for the wals = and one for the temp tablespace. Regarding the disk, what size should = they be considering that the database size is about 250G. Does 16G of = ram considered little ? I installed iotop and I see that postgresql = writer is writing most of the time and above all. >>=20 >> I mentioned that I perform alot of insert into table select * from = table. Before that I remove indexes,constraints and truncate the table. = Should I run vacuum before or after the operation ?=20 >>=20 >> 2017-08-17 19:37 GMT+03:00 Claudio Freire >: >> On Thu, Aug 17, 2017 at 6:00 AM, Mariel Cherkassky >> > = wrote: >> > I checked with the storage team in the company and they saw that I = have alot >> > of io on the server. How should I reduce the io that the postgresql = uses ? >>=20 >> Do you have concurrent activity on that server? >>=20 >> What filesystem are you using wherever the data is sitting? >>=20 >> If you've got concurrent fsyncs happening, some filesystems handle >> that poorly. When you've got WAL and data mixed in a single disk, or >> worse, filesystem, it happens often that the filesystem won't handle >> the write barriers for the WAL efficiently. I/O gets intermingled = with >> bulk operations, and even small fsyncs will have to flush writes from >> bulk operations, which makes a mess of things. >>=20 >> It is a very good idea, and in fact a recommended practice, to put = WAL >> on its own disk for that reason mainly. >>=20 >> With that little RAM, you'll also probably cause a lot of I/O in temp >> files, so I'd also recommend setting aside another disk for a temp >> tablespace so that I/O doesn't block other transactions as well. >>=20 >> This is all assuming you've got concurrent activity on the server. If >> not, install iotop and try to see who's causing that much I/O. >>=20 >=20 >=20 --Apple-Mail=_AE29AF20-C7FF-42F0-AB62-575BC39E65FF Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
El 21 ago 2017, a las 13:27, Mariel Cherkassky <mariel.cherkassky@gmail.com> escribi=C3=B3:

All this operation runs as part = of a big transaction that I = run.
How can I create a dump in the oracle server and copy it to = the postgresql server from a postgresql transaction ? =

I guess = you could create a user defined function in any of the available = languages (perl, python, java, =E2=80=A6). Functions run inside = transactions too=E2=80=A6this is not simple, though. 

Chopping the table is = optional when I use copy, but when I use copy to remote oracle table it = takes longer to create the = dump. 

It may take longer depending on how the oracle = machine, table and database are configured. In my experience oracle is = not very fast dumping whole tables, not to mention tables with BLOB = data, which can be as slow as hundreds of records per second =  (which is probably not your case).

If this transaction is to synchronize data between = transactional servers and data analysis servers you may consider using = some type of replication where only changes are sent. EnterpriseDB has = tools to do such things, I=E2=80=99m not aware of any other tool that = can do this between oracle and postgres.

Regards,

Daniel.


2017-08-21 11:37 GMT+03:00 Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com>:

El 21 ago 2017, a las 10:00, Mariel Cherkassky <mariel.cherkassky@gmail.com> escribi=C3=B3:

To = summarize, I still have performance problems. My current situation = : 

I'm trying to copy the data of many tables in = the oracle database into my postgresql tables. I'm doing so by = running insert into = local_postgresql_temp select * from remote_oracle_table. The = performance of this operation are very slow and I tried to check the = reason for that and mybe choose a different alternative.

1)First method - Insert into = local_postgresql_table select * from = remote_oracle_table this generated total disk write of 7 M/s = and actual disk write of 4 M/s(iotop). For 32G table it took me 2 hours = and 30 minutes.

2)second method - copy (select * from = oracle_remote_table) to /tmp/dump generates total disk write = of 4 M/s and actuval disk write of 100 K/s. The copy utility suppose to = be very fast but it seems very slow.




Are you using a FDW to access oracle server and then dump it = using copy? This is going to be slow, FDW isn't fast.


-When I run copy from the local dump, the = reading is very fast 300 M/s.



You reported it was slow before. = What has changed? How much does it take to load the 32G table = then?


-I created a 32G file on the oracle server and = used scp to copy it and it took me a few minutes.


-The wals directory is located on a different = file system. The parameters I assigned :

min_parallel_relation_size =3D 200MB
max_parallel_workers_per_gather =3D 5=20
max_worker_processes =3D 8=20
effective_cache_size =3D 12GB
work_mem =3D 128MB
maintenance_work_mem =3D 4GB
shared_buffers =3D 2000MB
RAM : 16G
CPU CORES : 8

HOW can I increase the writes ? How can I get = the data faster from the oracle database to my postgresql = database?




Extract the table to a file in the oracle server in a format = that the COPY utility can read, then copy it to postgres server and load = it. You can even pipe commands and do it in a single step.

This is what I meant = when I said that COPY is much faster than any thing else. To make it = even faster, if I/O is not your bottleneck, you can chop the table in = chunks and load it in parallel as I told you before, I have done this = many times when migrating data from oracle to postgres. ora2pg uses this = method to migrate data from oracle to postgres too. 


2017-08-20 14:00 = GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmail.com>:
I realized something weird. When I`m preforming = the copy utility of postgresql in order to create dump from a local = table in my postgresql db it takes for 32G table 20 minutes. When I try = to use copy for a foregin table (on oracle database) It takes more than = 2 hours.. During the copy operation from the foreign table I dont see = alot of write operations, with iotop i see that its writes 3 M/s. What = else I can check ? 

2017-08-20 9:39 GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmail.com>:
This server is dedicated to be a postgresql = production database, therefore postgresql is the only thing the runs on = the server. The fs that I`m using is xfs. I`ll add two different disks - = one for the wals and one for the temp tablespace. Regarding the disk, = what size should they be considering that the database size is about = 250G. Does 16G of ram considered little ? I installed iotop and I see = that postgresql writer is writing most of the time and above = all.

I mentioned that I perform alot of insert into = table select * from table. Before that I remove indexes,constraints and = truncate the table. Should I run vacuum before or after the operation = ? 

2017-08-17 19:37 GMT+03:00 Claudio Freire <klaussfreire@gmail.com>:
On Thu, Aug 17, 2017 at 6:00 = AM, Mariel Cherkassky
<mariel.cherkassky@gmail.com> wrote:
> I checked with the storage team in the company and they saw that I = have alot
> of io on the server. How should I reduce the io that the postgresql = uses ?

Do you have concurrent activity on that server?

What filesystem are you using wherever the data is sitting?
=
If you've got concurrent fsyncs happening, some filesystems handle
that poorly. When you've got WAL and data mixed in a single disk, or
worse, filesystem, it happens often that the filesystem won't handle
the write barriers for the WAL efficiently. I/O gets intermingled = with
bulk operations, and even small fsyncs will have to flush writes from
bulk operations, which makes a mess of things.

It is a very good idea, and in fact a recommended practice, to put = WAL
on its own disk for that reason mainly.

With that little RAM, you'll also probably cause a lot of I/O in temp
files, so I'd also recommend setting aside another disk for a temp
tablespace so that I/O doesn't block other transactions as well.

This is all assuming you've got concurrent activity on the server. If
not, install iotop and try to see who's causing that much I/O.




= --Apple-Mail=_AE29AF20-C7FF-42F0-AB62-575BC39E65FF--