Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1di6Ak-0003r4-KN for pgsql-performance@arkaria.postgresql.org; Wed, 16 Aug 2017 21:48:18 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1di6Ak-0001Pf-6v for pgsql-performance@arkaria.postgresql.org; Wed, 16 Aug 2017 21:48:18 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1di68w-0005Wf-FG for pgsql-performance@postgresql.org; Wed, 16 Aug 2017 21:46:26 +0000 Received: from mail-wr0-x22c.google.com ([2a00:1450:400c:c0c::22c]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1di68t-0003ui-Do for pgsql-performance@postgresql.org; Wed, 16 Aug 2017 21:46:25 +0000 Received: by mail-wr0-x22c.google.com with SMTP id x43so27771132wrb.3 for ; Wed, 16 Aug 2017 14:46:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=/t16NdMbDcaXo2PIC+l47yZIq9eKJ6rcckJwpcjSP+U=; b=ObYbrvD6uhAo4WsZ5lOBviPDev+F4Hqr1kyVABKHfckWbxeQEySKKnK7E1ERYdSxDn eNvZbP2a0UGNdq396xWIDr67SxWV9ZKgdAbXOUGs90GLfGxY/T93E7VfBmZ1kEm6XjVp G8ivo9/IPfQJLSAtFUQ2h3bWlhNijUz7MfDHpm8t+wxT3AXeXCX/mV0vb459ZU7KFeBN /xtL+JXyp9JE8sbkudHUtJDMf1AijQ6/VHGMmvILK7ZyaI6rAgkaTfu57SydJ/pyZSYR xki341SB5TJP3N7WnJkfzpgyLpctE02BuN/MtQoQE7Z8C4aoQt6bBIMCrseCGITzSLzW Z5JA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=/t16NdMbDcaXo2PIC+l47yZIq9eKJ6rcckJwpcjSP+U=; b=DCQLdr05n1cCKoizP6CHGR13ot0d+ljkJ3eR+PEY6zD9oEskybg+PvSWamntPDrMjv U2tl3Bl4Co0OrbCzabtFpEwKiqLzeddVRFn2kTi+4fjaM+iTjh9gl30KtalSViTbj8QH Um4V48Y3U+ZDrolFvcAhsxJS5s5S4FsLahEMIsx80CfBi18EZyAR76qoaWGN5AjQksQZ 0e8OvsYtGjDGlfRXhzaeRhBbfUxV2t4kuu0izPvz5fEhT9kU6wJYLFxeKicLP54NTY4l hxwh736snaUyib45s25LXwQkaglItcsLYahdgeEgyWMGQpRkbNPDBOVwYqltp7PIuPcz 19dQ== X-Gm-Message-State: AHYfb5jXcC8U0obe9NieExIvEd4W2pVhMJCJeqQg0gftP9GE7fvgQMtp rfRagwgmrf9r8A== X-Received: by 10.223.151.135 with SMTP id s7mr2284855wrb.231.1502919980742; Wed, 16 Aug 2017 14:46:20 -0700 (PDT) Received: from [192.168.1.33] (213.red-81-40-123.staticip.rima-tde.net. [81.40.123.213]) by smtp.gmail.com with ESMTPSA id a21sm2306374wrc.64.2017.08.16.14.46.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Aug 2017 14:46:19 -0700 (PDT) From: Daniel Blanch Bataller Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_CFDD147A-2163-4C45-8DED-E49357659C7A" Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: performance problem on big tables Date: Wed, 16 Aug 2017 23:46:18 +0200 In-Reply-To: Cc: Scott Marlowe , "pgsql-performance@postgresql.org" To: Mariel Cherkassky References: <88350892-C4AB-4191-9F8E-DB7CFA81F6C6@gmail.com> <00990FBB-D288-4878-8B8E-36E24F808F2C@gmail.com> X-Mailer: Apple Mail (2.3273) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --Apple-Mail=_CFDD147A-2163-4C45-8DED-E49357659C7A Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Seems your disks are too slow. On my laptop (nothing special, just one = disk) using COPY I can dump 3G in ~ 20 secs, loading takes 120 secs, = bare copying 3G takes 10 secs.=20 Similar proportion you had, but much faster.=20 confirm I/O is your bottleneck, and tell us how you solved your problem Anyway, You can cut import time by half if you set your destination = table to unlogged (postgres will write half the data, it will save the = transaction log writing). Remember to set it to logged when finished!! Regards, Daniel > El 16 ago 2017, a las 16:32, Mariel Cherkassky = escribi=C3=B3: >=20 > My server is virtual and it have virtual hd from a vnx storage = machine. The logs and the data are on the same disk. >=20 > 2017-08-16 17:04 GMT+03:00 Daniel Blanch Bataller = >: > Considering it has to write logs and data at checkpoints I don=E2=80=99t= see it particularly slow compared to the extract phase. What kind of = disks you have SSD or regular disks? Different disks for ltransaction = logs and data? >=20 >=20 >> El 16 ago 2017, a las 15:54, Mariel Cherkassky = > = escribi=C3=B3: >>=20 >> I run the copy command via psql to create a local dump of a 3G table = and it took me 134059.732ms =3D~2 minutes. After that I imported the = data via copy and it took 458648.677ms =3D~7 minutes. So the copy = command works but pretty slow.=20 >>=20 >> 2017-08-16 16:08 GMT+03:00 Daniel Blanch Bataller = >: >> See if the copy command is actually working, copy should be very fast = from your local disk. >>=20 >>=20 >>> El 16 ago 2017, a las 14:26, Mariel Cherkassky = > = escribi=C3=B3: >>>=20 >>>=20 >>> After all the changes of the memory parameters the same = operation(without the copy utility) didnt run much faster - it took one = minute less. I made a test with the copy command (without the 'with = binary') and it took 1.5 hours to create the dumpfile in my local = postgresql server. Then I tried to run the copy from the local dump and = it is already running two hours and it didnt even finish. I looked at = the server log and I saw that I run the copy command at 13:18:05, 3 = minutes later checkpoint started and completed and there are no messages = in the log after that. What can I do ? Improving the memory parameters = and the memory on the server didnt help and for now the copy command = doesnt help either. >>>=20 >>>=20 >>>=20 >>>=20 >>> 2017-08-15 20:14 GMT+03:00 Scott Marlowe >: >>> On Tue, Aug 15, 2017 at 4:06 AM, Mariel Cherkassky >>> > = wrote: >>> > Hi, >>> > So I I run the cheks that jeff mentioned : >>> > \copy (select * from oracle_remote_table) to /tmp/tmp with binary = - 1 hour >>> > and 35 minutes >>>=20 >>> So 26G takes 95 minutes, or 27 MB/minute or 456k/second? Sound about >>> right (it's early, I haven't had enough coffee please check my = math). >>> That's pretty slow unless you're working across pretty big distances >>> with mediocre connections. My home internet downloads about 100MB/s >>> by comparison. >>>=20 >>> > \copy local_postresql_table from /tmp/tmp with binary - Didnt run = because >>> > the remote oracle database is currently under maintenance work. >>>=20 >>> You shouldn't need the remote oracle server if you've already copied >>> it over, you're just copying from local disk into the local pgsql = db. >>> Unless I'm missing something. >>>=20 >>> > So I decided to follow MichaelDBA tips and I set the ram on my = machine to >>> > 16G and I configured the effective_cache memory to = 14G,tshared_buffer to be >>> > 2G and maintenance_work_mem to 4G. >>>=20 >>> Good settings. Maybe set work_mem to 128MB or so while you're at it. >>>=20 >>> > I started running the copy checks again and for now it coppied 5G = in 10 >>> > minutes. I have some questions : >>> > 1)When I run insert into local_postresql_table select * from >>> > remote_oracle_table I insert that data as bulk to the local table = or row by >>> > row ? If the answer as bulk than why copy is a better option for = this case >>> > ? >>>=20 >>> insert into select from oracle remote is one big copy, but it will >>> take at least as long as copying from oracle to the local network >>> took. Compare that to the same thing but use file_fdw on the file >>> locally. >>>=20 >>> > 2)The copy from dump into the postgresql database should take less = time than >>> > the copy to dump ? >>>=20 >>> Yes. The copy from Oracle to your local drive is painfully slow for = a >>> modern network connection. >>>=20 >>> > 3)What do you think about the new memory parameters that I = cofigured ? >>>=20 >>> They should be OK. I'm more worried about the performance of the io >>> subsystem tbh. >>>=20 >>=20 >>=20 >=20 >=20 --Apple-Mail=_CFDD147A-2163-4C45-8DED-E49357659C7A Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
Seems your disks are too slow. On my laptop = (nothing special, just one disk) using COPY I can dump 3G in ~ 20 secs, = loading takes 120 secs, bare copying 3G takes 10 secs. 

Similar proportion you = had, but much faster. 

confirm I/O is your bottleneck, and tell us how you solved = your problem

Anyway, You can cut import time by half if you set your = destination table to unlogged (postgres will write half the data, it = will save the transaction log writing). Remember to set it to logged = when finished!!


Regards,

Daniel

El = 16 ago 2017, a las 16:32, Mariel Cherkassky <mariel.cherkassky@gmail.com> escribi=C3=B3:

My = server is virtual and it have virtual hd from a vnx storage machine. The = logs and the data are on the same disk.

2017-08-16 17:04 GMT+03:00 Daniel Blanch Bataller = <daniel.blanch.bataller@gmail.com>:
Considering it = has to write logs and data at checkpoints I don=E2=80=99t see it = particularly slow compared to the extract phase. What kind of disks you = have SSD or regular disks? Different disks for ltransaction logs and = data?


El 16 ago 2017, a las 15:54, Mariel = Cherkassky <mariel.cherkassky@gmail.com> = escribi=C3=B3:

I run = the copy command via psql to create a local dump of a 3G table and it = took me 134059.732ms =3D~2 minutes. After that I imported the data = via copy and it took 458648.677ms =3D~7 minutes. So the copy command = works but pretty slow. 

2017-08-16 16:08 GMT+03:00 Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com>:
See if the = copy command is actually working, copy should be very fast from your = local disk.

El 16 ago 2017, a las 14:26, Mariel Cherkassky <mariel.cherkassky@gmail.com> escribi=C3=B3:


After all the changes of the memory parameters the same = operation(without the copy utility) didnt run much faster - it =  took one minute less. I made a test with the copy command (without = the 'with binary') and it took 1.5 hours to create the dumpfile in my = local postgresql server. Then I tried to run the copy from the local = dump and it is already running two hours and it didnt even finish. I = looked at the server log and I saw that I run the copy command at = 13:18:05, 3 minutes later checkpoint started and completed and there are = no messages in the log after that. What can I do ? Improving the memory = parameters and the memory on the server didnt help and for now the copy = command doesnt help either.




2017-08-15 20:14 GMT+03:00 Scott Marlowe <scott.marlowe@gmail.com>:
On Tue, Aug 15, 2017 at 4:06 = AM, Mariel Cherkassky
<mariel.cherkassky@gmail.com> wrote:
> Hi,
> So I I run the cheks that jeff mentioned :
> \copy (select * from oracle_remote_table) to /tmp/tmp with binary - = 1 hour
> and 35 minutes

So 26G takes 95 minutes, or 27 MB/minute or 456k/second? Sound = about
right (it's early, I haven't had enough coffee please check my math).
That's pretty slow unless you're working across pretty big distances
with mediocre connections.  My home internet downloads about = 100MB/s
by comparison.

> \copy local_postresql_table from /tmp/tmp with binary - Didnt run = because
> the remote oracle database is currently under maintenance work.

You shouldn't need the remote oracle server if you've already = copied
it over, you're just copying from local disk into the local pgsql db.
Unless I'm missing something.

> So I decided to follow MichaelDBA tips and I set the ram on my = machine to
> 16G and I configured the effective_cache memory to = 14G,tshared_buffer to be
> 2G and maintenance_work_mem to 4G.

Good settings. Maybe set work_mem to 128MB or so while you're at = it.

> I started running the copy checks again and for now it coppied 5G = in 10
> minutes. I have some questions :
> 1)When I run insert into local_postresql_table select * from
> remote_oracle_table I insert that data as bulk to the local table = or row by
> row ?  If the answer as bulk than why copy is a better option = for this case
> ?

insert into select from oracle remote is one big copy, but it = will
take at least as long as copying from oracle to the local network
took. Compare that to the same thing but use file_fdw on the file
locally.

> 2)The copy from dump into the postgresql database should take less = time than
> the copy to dump ?

Yes. The copy from Oracle to your local drive is painfully slow = for a
modern network connection.

> 3)What do you think about the new memory parameters that I = cofigured ?

They should be OK. I'm more worried about the performance of the = io
subsystem tbh.






= --Apple-Mail=_CFDD147A-2163-4C45-8DED-E49357659C7A--