Looking for an efficient way to replace efficient NOT IN when handling very large data (3+ messages, 2 participants)
* Looking for an efficient way to replace efficient NOT IN when handling very large data
From: Shaozhong SHI <[email protected]>
Date: 2023-04-11 09:27 UTC
To: pgsql-sql <[email protected]>

Select a.id, a.name, b.id, b.name from a_large_table a, definitive b
where (a.id, b.name) not in (select b.id, b.name from definitive b)

is very slow.

Is there a faster way to do so?

Regards,

David
* Re: Looking for an efficient way to replace efficient NOT IN when handling very large data
From: David Rowley <[email protected]>
Date: 2023-04-11 09:33 UTC
To: Shaozhong SHI <[email protected]>
Cc: pgsql-sql <[email protected]>

On Tue, 11 Apr 2023 at 21:28, Shaozhong SHI <[email protected]> wrote:
>
> Select a.id, a.name, b.id, b.name from a_large_table a, definitive b
> where (a.id, b.name) not in (select b.id, b.name from definitive b)
>
> is very slow.
>
> Is there a faster way to do so?

It depends on what your exact requirements are for the NULL handling
that NOT IN provides. Do you need the query to return 0 rows if b.id
and b.name are null? This question is moot if none of the columns of
either table allow NULLs.

If you don't require that, then you'll give the planner more
flexibility to choose a more efficient plan if you use NOT EXISTS
instead.

David
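To make the NULL-handling point concrete, here is a minimal sketch for PostgreSQL. The tables a_demo and definitive_demo and their rows are hypothetical and are not taken from the thread; they exist only to show how a single all-NULL row in the subquery changes what NOT IN returns.

```sql
-- Hypothetical demo tables (assumed for illustration, not from the thread).
CREATE TEMP TABLE a_demo (id int, name text);
CREATE TEMP TABLE definitive_demo (id int, name text);

INSERT INTO a_demo VALUES (1, 'x'), (2, 'y');
INSERT INTO definitive_demo VALUES (1, 'x'), (NULL, NULL);

-- NOT IN: comparing (2, 'y') with (NULL, NULL) yields NULL, not false,
-- so the NOT IN test is NULL for that row and it is filtered out.
-- This query returns 0 rows.
SELECT a.id, a.name
FROM a_demo a
WHERE (a.id, a.name) NOT IN (SELECT d.id, d.name FROM definitive_demo d);

-- NOT EXISTS: the (NULL, NULL) row simply never matches,
-- so this query returns the single row (2, 'y').
SELECT a.id, a.name
FROM a_demo a
WHERE NOT EXISTS (
    SELECT 1
    FROM definitive_demo d
    WHERE d.id = a.id
      AND d.name = a.name
);
```

If every compared column is declared NOT NULL, the two forms return the same rows and the difference is only in which plans are available to the planner.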
* Re: Looking for an efficient way to replace efficient NOT IN when handling very large data
From: Shaozhong SHI <[email protected]>
Date: 2023-04-11 09:44 UTC
To: David Rowley <[email protected]>
Cc: pgsql-sql <[email protected]>

On Tue, 11 Apr 2023 at 10:33, David Rowley <[email protected]> wrote:
> On Tue, 11 Apr 2023 at 21:28, Shaozhong SHI <[email protected]> wrote:
> >
> > Select a.id, a.name, b.id, b.name from a_large_table a, definitive b
> > where (a.id, b.name) not in (select b.id, b.name from definitive b)
> >
> > is very slow.
> >
> > Is there a faster way to do so?
>
> It depends on what your exact requirements are for the NULL handling
> that NOT IN provides. Do you need the query to return 0 rows if b.id
> and b.name are null? This question is moot if none of the columns of
> either table allow NULLs.
>
> If you don't require that, then you'll give the planner more
> flexibility to choose a more efficient plan if you use NOT EXISTS
> instead.
>
> David

I would like to try out an example of the NOT EXISTS approach and see how
the replacement works.

Regards,

David
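One possible shape for the NOT EXISTS replacement is sketched below. It reuses the table and column names from the original query; the alias d, and the guess in the second query about what the query was meant to do, are assumptions and were not confirmed in the thread. Both rewrites match the NOT IN results only when the compared columns contain no NULLs.

```sql
-- Literal rewrite of the posted query: keeps the cross join of
-- a_large_table and definitive, and tests the pair (a.id, b.name)
-- against definitive under a separate alias (d) to avoid shadowing b.
SELECT a.id, a.name, b.id, b.name
FROM a_large_table a, definitive b
WHERE NOT EXISTS (
    SELECT 1
    FROM definitive d
    WHERE d.id = a.id
      AND d.name = b.name
);

-- Assumption: if the real goal is "rows of a_large_table with no
-- matching (id, name) in definitive", the cross join can be dropped.
SELECT a.id, a.name
FROM a_large_table a
WHERE NOT EXISTS (
    SELECT 1
    FROM definitive d
    WHERE d.id = a.id
      AND d.name = a.name
);
```

Prefixing either query with EXPLAIN should show whether the planner picked an anti join (for example a Hash Anti Join node), which is the extra flexibility that NOT EXISTS gives it. Whether that is faster on the real data still needs to be measured.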