Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1djllV-0003pr-6P for pgsql-performance@arkaria.postgresql.org; Mon, 21 Aug 2017 12:25:09 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1djllU-0005UF-GD for pgsql-performance@arkaria.postgresql.org; Mon, 21 Aug 2017 12:25:08 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1djljU-0002A3-T8 for pgsql-performance@postgresql.org; Mon, 21 Aug 2017 12:23:05 +0000 Received: from mail-wr0-x232.google.com ([2a00:1450:400c:c0c::232]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1djljQ-0006ch-VR for pgsql-performance@postgresql.org; Mon, 21 Aug 2017 12:23:04 +0000 Received: by mail-wr0-x232.google.com with SMTP id p8so33474783wrf.5 for ; Mon, 21 Aug 2017 05:23:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=PP4IrtckA0Xqklw2mlHfbW0CAoLe42nLSZex6C+nivc=; b=bC2esTt/HTKG9NJam6SM/naUJqsfHWtdhyVzF1RqTXbp+wU0ffjMmmWvKHyCgYAri9 1EiDrekM3dPt6lizYypS7kjsmrg2pwhPkadKxAb6oZoJ4GP80ppyrpskMPkEENwj0mEL tEgp9SrwJOoKPbL/7HC58LFuNgDlJzPJiv+uwU09dHjURpvd/AaQCS65zJfjwz83f0N6 8MhF7uPPanhhHtUjBXQglyhLbLaM3yl67/o+/uoAvx9AIzzaO5Q4g8uN/zC2KqC8D6Se 7un3dXmoImfN8d5rfBIoJKyLwuylqbqYvEQJLQpGvDLMOSelDb4DtO4aAVhzMTZpUgii 4lZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=PP4IrtckA0Xqklw2mlHfbW0CAoLe42nLSZex6C+nivc=; b=GDnBIbELpBTF6aJ4ja3s54pWtQjx1B2nVD97a9eEicL0/8DwfJ0TM18S/UgAt/7qHQ zvqzWr7LlJK26wuxEjH8ven0TFhds1dd8uAix6N6zZHo4xNHpUvuzcgfPp+brTu1hZzU zDN1xQEB0QSBcWsdcX7bitLhvckKj3i0bt6PXzxY/YtNG7RTKa0pSwEuhHeF3HMvV+sN UlpjF+JQ5Tr5hGEyKcIYs5w1rchBiChwWo1rpXuvYGyJrR6aFIU1pmD7+FPIBAgrSke5 wMH4LsGhf7teJ/CW4CP2iodFaCQElGxG+cVT/hSdiOcdoeOHu6EPFL6wz6o4PuO2u5tc HCZA== X-Gm-Message-State: AHYfb5i9LZQbLqoHE5KJ0cW0m2XlmgNHVXuG6O7k2ffOtMbH5znx/xFE JLo5Zkctowt2F/2GbWspTuMobH0CnQ== X-Received: by 10.28.236.208 with SMTP id h77mr1772050wmi.162.1503318180003; Mon, 21 Aug 2017 05:23:00 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.184.85 with HTTP; Mon, 21 Aug 2017 05:22:58 -0700 (PDT) In-Reply-To: <9BC9216C-5081-403D-A08C-77B3AB154F65@gmail.com> References: <88350892-C4AB-4191-9F8E-DB7CFA81F6C6@gmail.com> <00990FBB-D288-4878-8B8E-36E24F808F2C@gmail.com> <236952AA-596F-4526-BB84-2EDC4906454F@gmail.com> <9BC9216C-5081-403D-A08C-77B3AB154F65@gmail.com> From: Mariel Cherkassky Date: Mon, 21 Aug 2017 15:22:58 +0300 Message-ID: Subject: Re: performance problem on big tables To: Daniel Blanch Bataller Cc: "pgsql-performance@postgresql.org" Content-Type: multipart/alternative; boundary="001a11476f181a853c0557428abb" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a11476f181a853c0557428abb Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable I`m searching for a way to improve the current performance, I'm not interesting in using a different tool or writing something new because I'm trying to migrate a system on oracle database to a postgresql database. 2017-08-21 14:53 GMT+03:00 Daniel Blanch Bataller < daniel.blanch.bataller@gmail.com>: > > El 21 ago 2017, a las 13:27, Mariel Cherkassky < > mariel.cherkassky@gmail.com> escribi=C3=B3: > > All this operation runs as part of a big transaction that I run. > > How can I create a dump in the oracle server and copy it to the postgresq= l > server from a postgresql transaction ? > > > I guess you could create a user defined function in any of the available > languages (perl, python, java, =E2=80=A6). Functions run inside transacti= ons > too=E2=80=A6this is not simple, though. > > Chopping the table is optional when I use copy, but when I use copy to > remote oracle table it takes longer to create the dump. > > > It may take longer depending on how the oracle machine, table and databas= e > are configured. In my experience oracle is not very fast dumping whole > tables, not to mention tables with BLOB data, which can be as slow as > hundreds of records per second (which is probably not your case). > > If this transaction is to synchronize data between transactional servers > and data analysis servers you may consider using some type of replication > where only changes are sent. EnterpriseDB has tools to do such things, I= =E2=80=99m > not aware of any other tool that can do this between oracle and postgres. > > Regards, > > Daniel. > > > 2017-08-21 11:37 GMT+03:00 Daniel Blanch Bataller < > daniel.blanch.bataller@gmail.com>: > >> >> El 21 ago 2017, a las 10:00, Mariel Cherkassky < >> mariel.cherkassky@gmail.com> escribi=C3=B3: >> >> To summarize, I still have performance problems. My current situation : >> >> I'm trying to copy the data of many tables in the oracle database into m= y >> postgresql tables. I'm doing so by running insert into >> local_postgresql_temp select * from remote_oracle_table. The performance >> of this operation are very slow and I tried to check the reason for that >> and mybe choose a different alternative. >> >> 1)First method - Insert into local_postgresql_table select * from >> remote_oracle_table this generated total disk write of 7 M/s and actual >> disk write of 4 M/s(iotop). For 32G table it took me 2 hours and 30 minu= tes. >> >> 2)second method - copy (select * from oracle_remote_table) to /tmp/dump = generates >> total disk write of 4 M/s and actuval disk write of 100 K/s. The copy >> utility suppose to be very fast but it seems very slow. >> >> >> >> Are you using a FDW to access oracle server and then dump it using copy? >> This is going to be slow, FDW isn't fast. >> >> >> -When I run copy from the local dump, the reading is very fast 300 M/s. >> >> >> You reported it was slow before. What has changed? How much does it take >> to load the 32G table then? >> >> >> -I created a 32G file on the oracle server and used scp to copy it and i= t >> took me a few minutes. >> >> -The wals directory is located on a different file system. The parameter= s >> I assigned : >> >> min_parallel_relation_size =3D 200MB >> max_parallel_workers_per_gather =3D 5 >> max_worker_processes =3D 8 >> effective_cache_size =3D 12GB >> work_mem =3D 128MB >> maintenance_work_mem =3D 4GB >> shared_buffers =3D 2000MB >> RAM : 16G >> CPU CORES : 8 >> >> HOW can I increase the writes ? How can I get the data faster from the >> oracle database to my postgresql database? >> >> >> >> Extract the table to a file in the oracle server in a format that the >> COPY utility can read, then copy it to postgres server and load it. You = can >> even pipe commands and do it in a single step. >> >> This is what I meant when I said that COPY is much faster than any thing >> else. To make it even faster, if I/O is not your bottleneck, you can cho= p >> the table in chunks and load it in parallel as I told you before, I have >> done this many times when migrating data from oracle to postgres. ora2pg >> uses this method to migrate data from oracle to postgres too. >> >> >> 2017-08-20 14:00 GMT+03:00 Mariel Cherkassky > >: >> >>> I realized something weird. When I`m preforming the copy utility of >>> postgresql in order to create dump from a local table in my postgresql = db >>> it takes for 32G table 20 minutes. When I try to use copy for a foregin >>> table (on oracle database) It takes more than 2 hours.. During the copy >>> operation from the foreign table I dont see alot of write operations, w= ith >>> iotop i see that its writes 3 M/s. What else I can check ? >>> >>> 2017-08-20 9:39 GMT+03:00 Mariel Cherkassky >> >: >>> >>>> This server is dedicated to be a postgresql production database, >>>> therefore postgresql is the only thing the runs on the server. The fs = that >>>> I`m using is xfs. I`ll add two different disks - one for the wals and = one >>>> for the temp tablespace. Regarding the disk, what size should they be >>>> considering that the database size is about 250G. Does 16G of ram >>>> considered little ? I installed iotop and I see that postgresql writer= is >>>> writing most of the time and above all. >>>> >>>> I mentioned that I perform alot of insert into table select * from >>>> table. Before that I remove indexes,constraints and truncate the table= . >>>> Should I run vacuum before or after the operation ? >>>> >>>> 2017-08-17 19:37 GMT+03:00 Claudio Freire : >>>> >>>>> On Thu, Aug 17, 2017 at 6:00 AM, Mariel Cherkassky >>>>> wrote: >>>>> > I checked with the storage team in the company and they saw that I >>>>> have alot >>>>> > of io on the server. How should I reduce the io that the postgresql >>>>> uses ? >>>>> >>>>> Do you have concurrent activity on that server? >>>>> >>>>> What filesystem are you using wherever the data is sitting? >>>>> >>>>> If you've got concurrent fsyncs happening, some filesystems handle >>>>> that poorly. When you've got WAL and data mixed in a single disk, or >>>>> worse, filesystem, it happens often that the filesystem won't handle >>>>> the write barriers for the WAL efficiently. I/O gets intermingled wit= h >>>>> bulk operations, and even small fsyncs will have to flush writes from >>>>> bulk operations, which makes a mess of things. >>>>> >>>>> It is a very good idea, and in fact a recommended practice, to put WA= L >>>>> on its own disk for that reason mainly. >>>>> >>>>> With that little RAM, you'll also probably cause a lot of I/O in temp >>>>> files, so I'd also recommend setting aside another disk for a temp >>>>> tablespace so that I/O doesn't block other transactions as well. >>>>> >>>>> This is all assuming you've got concurrent activity on the server. If >>>>> not, install iotop and try to see who's causing that much I/O. >>>>> >>>> >>>> >> > > --001a11476f181a853c0557428abb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I`m searching for a way to improve the cu= rrent performance, I'm not interesting in using a different tool or wri= ting something new because I'm trying to migrate a system on oracle dat= abase to a postgresql database.

<= div class=3D"gmail_quote">
2017-08-21 14:53 GMT+03:00 Danie= l Blanch Bataller <daniel.blanch.bataller@gmail.com>= :

<= blockquote type=3D"cite">
El 21 ago 2017, a las 13:27, Mariel Cherkassk= y <mari= el.cherkassky@gmail.com> escribi=C3=B3:

All this operation runs as part of a big transaction that I run.
How can I create a dump in the oracle server and copy it to th= e postgresql server from a postgresql transaction ?

I guess you could create a user defined = function in any of the available languages (perl, python, java, =E2=80=A6).= Functions run inside transactions too=E2=80=A6this is not simple, though.= =C2=A0

Chopping the table is optional when I use copy, but = when I use copy to remote oracle table it takes longer to create the dump.= =C2=A0

It may take= longer depending on how the oracle machine, table and database are configu= red. In my experience oracle is not very fast dumping whole tables, not to = mention tables with BLOB data, which can be as slow as hundreds of records = per second =C2=A0(which is probably not your case).

If this transaction is to synchronize data between transactional servers = and data analysis servers you may consider using some type of replication w= here only changes are sent. EnterpriseDB has tools to do such things, I=E2= =80=99m not aware of any other tool that can do this between oracle and pos= tgres.

Regards,

Daniel.


2017-08-21 11= :37 GMT+03:00 Daniel Blanch Bataller <daniel.blanch.batalle= r@gmail.com>:
El 21 ago 2017, a las 10:00, Mar= iel Cherkassky <mariel.cherkassky@gmail.com> escribi=C3=B3:

<= div>
To summarize, I still have performanc= e problems. My current situation :=C2=A0

I'm trying to copy the data of many tables in t= he oracle database into my postgresql tables. I'm doing so by running= =C2=A0insert into local_postgresql_temp select * from remote_oracle_table. The performance of this operation are very slow and I tried to check = the reason for that and mybe choose a different alternative.

1)First method -=C2=A0Insert into local_postgresql_= table select * from remote_oracle_table=C2=A0this generated total di= sk write of 7 M/s and actual disk write of 4 M/s(iotop). For 32G table it t= ook me 2 hours and 30 minutes.

2)sec= ond method -=C2=A0copy (select * from oracle_remote_table) to /tmp/dump=C2=A0generates total disk write of 4 M/s and actuval disk write of 100 K= /s. The copy utility suppose to be very fast but it seems very slow.




Are you using a FDW to access oracle server and then dump it usin= g copy? This is going to be slow, FDW isn't fast.

<= /div>
<= p style=3D"margin:0px 0px 1em;padding:0px;border:0px;font-variant-numeric:i= nherit;font-stretch:inherit;font-size:15px;line-height:inherit;font-family:= Arial,"Helvetica Neue",Helvetica,sans-serif;vertical-align:baseli= ne;clear:both;color:rgb(36,39,41)">-When I run copy from the local dump, th= e reading is very fast 300 M/s.



You reported it was slow before. What has = changed? How much does it take to load the 32G table then?
=

-I created a 32G file on the oracle= server and used scp to copy it and it took me a few minutes.


<= /div>

-The wals dir= ectory is located on a different file system. The parameters I assigned :

min_parallel_relation_size =3D <=
span class=3D"m_-1961580192068351663m_2853355547872582334gmail-lit" style=
=3D"margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inher=
it;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:i=
nherit;font-family:inherit;vertical-align:baseline;color:rgb(125,39,39)">20=
0MB
max_parallel_workers_per_gather =3D 5=20
max_worker_processes =3D 8=20
effective_cache_size =3D 12GB
work_mem =3D 128MB
maintenance_work_mem =3D 4GB
shared_buffers =3D 2000MB
RAM : 16G
CPU CORES : 8

HOW can I increase the writes ? How can I get the data faster from the = oracle database to my postgresql database?




Extract the tabl= e to a file in the oracle server in a format that the COPY utility can read= , then copy it to postgres server and load it. You can even pipe commands a= nd do it in a single step.

This is what I meant wh= en I said that COPY is much faster than any thing else. To make it even fas= ter, if I/O is not your bottleneck, you can chop the table in chunks and lo= ad it in parallel as I told you before, I have done this many times when mi= grating data from oracle to postgres. ora2pg uses this method to migrate da= ta from oracle to postgres too.=C2=A0

=

2017-08-20 14:00 GMT+03:00 = Mariel Cherkassky <mariel.cherkassky@gmail.com>:
=
I realized something weird. When I`m pref= orming the copy utility of postgresql in order to create dump from a local = table in my postgresql db it takes for 32G table 20 minutes. When I try to = use copy for a foregin table (on oracle database) It takes more than 2 hour= s.. During the copy operation from the foreign table I dont see alot of wri= te operations, with iotop i see that its writes 3 M/s. What else I can chec= k ?=C2=A0

201= 7-08-20 9:39 GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmai= l.com>:
This server is ded= icated to be a postgresql production database, therefore postgresql is the = only thing the runs on the server. The fs that I`m using is xfs. I`ll add t= wo different disks - one for the wals and one for the temp tablespace. Rega= rding the disk, what size should they be considering that the database size= is about 250G. Does 16G of ram considered little ? I installed iotop and I= see that postgresql writer is writing most of the time and above all.

I mentioned that I perform alo= t of insert into table select * from table. Before that I remove indexes,co= nstraints and truncate the table. Should I run vacuum before or after the o= peration ?=C2=A0
=
2017-08-17 19:37 GMT+03:00 = Claudio Freire <klaussfreire@gmail.com>:
On Thu, Aug 17, 2017 at 6:00 AM, Mariel Cherkas= sky
<mariel= .cherkassky@gmail.com> wrote:
> I checked with the storage team in the company and they saw that I hav= e alot
> of io on the server. How should I reduce the io that the postgresql us= es ?

Do you have concurrent activity on that server?

What filesystem are you using wherever the data is sitting?

If you've got concurrent fsyncs happening, some filesystems handle
that poorly. When you've got WAL and data mixed in a single disk, or worse, filesystem, it happens often that the filesystem won't handle the write barriers for the WAL efficiently. I/O gets intermingled with
bulk operations, and even small fsyncs will have to flush writes from
bulk operations, which makes a mess of things.

It is a very good idea, and in fact a recommended practice, to put WAL
on its own disk for that reason mainly.

With that little RAM, you'll also probably cause a lot of I/O in temp files, so I'd also recommend setting aside another disk for a temp
tablespace so that I/O doesn't block other transactions as well.

This is all assuming you've got concurrent activity on the server. If not, install iotop and try to see who's causing that much I/O.





--001a11476f181a853c0557428abb--