public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jan Bauer Nielsen <[email protected]>
To: [email protected]
Subject: Performance decline maybe caused by multi-column index?
Date: Fri, 18 Mar 2016 14:26:14 +0100
Message-ID: <[email protected]> (raw)
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>

Hi,

While developing a batch processing platform using postgresql as the 
underlying data store we are seeing a performance decline in our 
application.

In this application a job is broken up into chunks where each chunk 
contains a number of items (typically 10).

CREATE TABLE item (
     id                      SMALLINT NOT NULL,
     chunkId            INTEGER NOT NULL,
     jobId                 INTEGER NOT NULL,
     -- other attributes omitted for brewity
     PRIMARY KEY (jobId, chunkId, id)
);

So a job with 600.000 items results in 600.000 rows in the items table 
with a fixed jobId, chunkId ranging from 0-59999 and for each chunkId an 
id ranging from 0-9.

All ten inserts for a particular chunkId are handled in a single 
transaction, and over time we are seeing an increase in transaction 
execution time, <100ms for the first 100.000 items, >300ms when we reach 
the 400.000 mark, and the trend seems to be forever increasing.

No decline is observed if we instead sequentially submit 6 jobs of 
100.000 items each.

Therefore we are beginning to wonder if we are hitting some sort of 
upper limit with regards to the multi column index? Perhaps something 
causing it to sort on disk or something like that?

Any suggestions to the cause of this would be very much appreciated.

jobstore=> SELECT version();
                                            version
----------------------------------------------------------------------------------------------
  PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 
4.7.2-5) 4.7.2, 64-bit

jobstore=> SELECT name, current_setting(name), SOURCE
jobstore->   FROM pg_settings
jobstore->   WHERE SOURCE NOT IN ('default', 'override');
             name            |            current_setting |        source
----------------------------+----------------------------------------+----------------------
  application_name           | psql | client
  client_encoding            | UTF8 | client
  DateStyle                  | ISO, YMD | configuration file
  default_text_search_config | pg_catalog.english | configuration file
  lc_messages                | en_DK.UTF-8 | configuration file
  lc_monetary                | en_DK.UTF-8 | configuration file
  lc_numeric                 | en_DK.UTF-8 | configuration file
  lc_time                    | en_DK.UTF-8 | configuration file
  listen_addresses           | * | configuration file
  log_line_prefix            | %t | configuration file
  log_timezone               | localtime | configuration file
  max_connections            | 100 | configuration file
  max_stack_depth            | 2MB | environment variable
  port                       | 5432 | configuration file
  shared_buffers             | 128MB | configuration file
  ssl                        | on | configuration file
  ssl_cert_file              | /etc/ssl/certs/ssl-cert-snakeoil.pem | 
configuration file
  ssl_key_file               | /etc/ssl/private/ssl-cert-snakeoil.key | 
configuration file
  TimeZone                   | localtime | configuration file

Kind regards,

Jan Bauer Nielsen
Software developer
DBC as
http://www.dbc.dk/english


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected]
  Subject: Re: Performance decline maybe caused by multi-column index?
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox