Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1smYxS-006oES-Oj for pgsql-general@arkaria.postgresql.org; Fri, 06 Sep 2024 13:25:03 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1smYxR-00EpYp-AA for pgsql-general@arkaria.postgresql.org; Fri, 06 Sep 2024 13:25:01 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1smYxQ-00EpWK-Tc for pgsql-general@lists.postgresql.org; Fri, 06 Sep 2024 13:25:01 +0000 Received: from mail-lf1-x133.google.com ([2a00:1450:4864:20::133]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1smYxP-000SD4-3I for pgsql-general@postgresql.org; Fri, 06 Sep 2024 13:25:00 +0000 Received: by mail-lf1-x133.google.com with SMTP id 2adb3069b0e04-53654e2ed93so2135475e87.0 for ; Fri, 06 Sep 2024 06:24:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725629098; x=1726233898; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=E6fvB4eaHZYwFA7ZHPuYBCFz33lGuJ8b1carTsIiBM0=; b=Q8JsCqw4XImP5R/yfwWcgPToXOM85S4MuhSZLHuKRlNm9Joqvd7QZ8Ld2coF07csVS +SmIUcPi3KnxfhOfcidI18a46irLY+A5Z7P9GYQDmEka2ELZTWcTAllyhpozr2Ow1pHz uELAR8LgcmDLo+p/ZuuzwMp5Qvxl7werPIiHB1hRFUTbeJ9y+yLF51W8yUulDxcU0FTG IZhnqI2eTSAZdDR0IRcDMH2IUHEFl9656ojTrYj56hZl9FX6O0nCNCce5agjR84pRbA3 B/07tgx3GsvBHsT2FF6x6K15y4pTEfMaXAtLffjCiZ+UTksgn8qfrSnbSxJcO4pILGs2 xPGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725629098; x=1726233898; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=E6fvB4eaHZYwFA7ZHPuYBCFz33lGuJ8b1carTsIiBM0=; b=j4th0cBub0ZV5zsc3bmzK6TLVAAisKYTpyqYSWF1/+mtf1U+pmPwRlT6251EWWXEPo q5eBb8hHd3Q8TpuNpWiYEQXrjhsM7puCTY1+/2hBtJwU6Ml3nrAhS4QJR4nIDLjHjVgh lEXgHtLJnfl3RH1xD3Ii+UEHkDMg2YwyV30N30hmuO/GZkBmGD3x8BI5cwRyTIO+70wO 1eVhdir6sC8If2JbD+cRWSKeF+4yvzql/sy/0s/tdvczGD3AS5MGAzNCySg3pHY9ISKH 3a4xTLl1sBG63J2UPhgDb8SD1z/irkIe0R7v+stBbvSmqloNvWpV4vT5QcqaLCUcJWln 7SDw== X-Gm-Message-State: AOJu0YxWnCGdCm9wG11YJEMzNF6J2HuMceQeFZy4bDrnwJhTfuOEtmp8 Kt61EOAPNA9iyMPmJfjyJ3nYGKNQwuS28/v8sCB0nflBR/BoZPkzOfI86bZVcoYRNXTNx523kGE y5O6vDw2ZRJabi1Rq4jFudBszE7dixM7b X-Google-Smtp-Source: AGHT+IE06wymtfxQXkytQ3JLGR15/0FhB14n0kBJlJ0UF1DFrcgl2++C7sJk9NPyIZdlE8rFdBuJXEzDRR1ix2hYr18= X-Received: by 2002:a05:6512:1256:b0:535:6925:7a8e with SMTP id 2adb3069b0e04-536587da5ffmr1456130e87.36.1725629097634; Fri, 06 Sep 2024 06:24:57 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: David Rowley Date: Sat, 7 Sep 2024 01:24:45 +1200 Message-ID: Subject: Re: Using left joins instead of inner joins as an optimization To: Xavier Solomon Cc: pgsql-general@postgresql.org Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, 6 Sept 2024 at 23:05, Xavier Solomon wrote: > > create table a(a_id int primary key generated always as identity, a_data text); > > create table b(b_id int primary key generated always as identity, a_id int not null references a(a_id), b_data text); > > Then the query > > explain select b_id from b natural left join a; > results in a `Seq Scan on b`. Whereas the query > > explain select b_id from b natural join a; > results in a join with sequential scans on both a and b. > > I believe because b.a_id is not null and references a.a_id a left and an inner join are exactly equivalent. Mostly equivalent, but there are a few corner cases where they're not. > My questions are: > - Am I wrong that in such a situation a left and inner join are equivalent? The foreign key triggers are deferred until at least the end of the statement, so there are cases where the foreign key can be temporarily violated. For example, if the outer query is an UPDATE a SET a_id = ... and you have an AFTER UPDATE ON a trigger that runs your left join query, a_id will be changed but the changes won't have been verified (or cascaded) in the referencing table. > - Why does PostgreSQL not automatically optimize this? We would need to change the way foreign keys work or maybe at least give the planner more context as to where the query it's planning is coming from. I think it might be safe to do this optimisation if it's a top-level query, but not if it's being run from a trigger. Maybe there are other cases which are safe too. > - Is it a bad idea to use left joins to optimize this even if semantically an inner join would be correct? If you only ever do it in top-level statements, then I think it's safe. If not, you might get wrong results. For deferred foreign key constraints, the window where the foreign key could be violated lasts until the end of the transaction, so even top-level queries could see wrong results if you use left join instead of inner. David