public inbox for [email protected]
help / color / mirror / Atom feedFrom: Akihiko Odaki <[email protected]>
To: [email protected]
Subject: Inappropriate inner table for nested loop join
Date: Fri, 23 Jun 2017 19:31:40 +0900
Message-ID: <[email protected]> (raw)
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-performance>
Hi all,
I am having a problem with nested loop join.
A database has 2 tables: "posts" and "follows".
Table "posts" have two columns: "timestamp" and "account".
Table "follows" have two columns: "target_account" and "owner_account".
The database also has an index on "posts" ("account", "timestamp"), one
on "posts"("timestamp") and on "follows" ("owner_account",
"target_account").
Table "posts" is so big and have 10 million records.
The number of Records with the same value for "owner_accounts" in table
"follows" is about 100 by average.
I issue the following query:
SELECT "posts".*
FROM "posts"
JOIN "follows" ON "follows"."target_account" = "posts"."account"
WHERE "follows"."owner_account" = $1
ORDER BY "posts"."timestamp"
LIMIT 100
That results in a nested loop join with table "posts" as the inner and
"follows" as the outer, which queried for each loop. EXPlAIN ANALYZE
says the actual number of rows queried from table "posts" is 500,000.
This behavior is problematic.
For performance, it may be better to retrieve 100 records joined with a
record in table "follows", and then to retrieve those whose
"posts"."timestamp" is greater than the one of last record we already
have, or 100 records, joined with another record in table "follows", and
so on. It would end up querying 10,000 records from table "posts" at
most. The number could be even smaller in some cases.
Now I have these tough questions:
* Is the "ideal" operation I suggested possible for PostgreSQL?
* If so, I think that could be achieved by letting PostgreSQL use
"follows" as the inner in the loops. How could I achieve that?
* Is there any other way to improve the performance of the query?
Answers are greatly appreciated.
Regards,
Akihiko Odaki
--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected]
Subject: Re: Inappropriate inner table for nested loop join
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox