From: Tom Lane
To: Steve Midgley
cc: Shane Borden, Rob Sargent, pgsql-sql@lists.postgresql.org
Subject: Re: PARALLEL CTAS
References: <1e0484b6-40a6-15a9-9890-0991c8b8c1da@gmail.com> <7BD04D8E-9307-4107-B4D0-4C281390F996@gmail.com>
Comments: In-reply-to Steve Midgley message dated "Mon, 12 Dec 2022 10:19:00 -0800"
Date: Mon, 12 Dec 2022 13:52:13 -0500
Message-ID: <2033540.1670871133@sss.pgh.pa.us>

>> Today I suspect you're left with something like the following:
>> - CTAS from source where 1=2 (i.e. table definition via select semantics)
>> - copy from stdin (filled with intended CTAS select)

As far as I can tell, all supported versions of Postgres are
perfectly content to parallelize the source-row computation in a
CREATE TABLE AS SELECT, if they would parallelize the underlying
SELECT. Note that this is not the same as INSERT INTO ... SELECT,
which is a harder problem because the target table might already
have indexes, constraints, etc.

If what you are concerned about is parallelization of the physical
insertions of the tuples, I'm pretty sure we don't have anything
that can do that today, including COPY. COPY does have some
batch-insertion optimizations, but that's not parallelism.

Can you parallelize your problem at a higher level, ie do several
table loads at once?

			regards, tom lane
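
The closing suggestion, running several table loads at once, could be sketched roughly as below. This is only an illustration and not something taken from the thread: `run_loads_concurrently`, `run_sql`, and the table names are hypothetical. A real `run_sql` would need to open its own database connection for each call, since a single session executes one statement at a time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def run_loads_concurrently(
    statements: List[str],
    run_sql: Callable[[str], None],
    max_workers: int = 4,
) -> Dict[str, Optional[BaseException]]:
    """Run each load statement in its own worker thread.

    `run_sql` is a caller-supplied (hypothetical) function that executes one
    statement, ideally on a dedicated connection. Returns a mapping from each
    statement to the exception it raised, or None if it succeeded.
    """
    results: Dict[str, Optional[BaseException]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every statement first so the loads can overlap.
        futures = {stmt: pool.submit(run_sql, stmt) for stmt in statements}
        for stmt, future in futures.items():
            # exception() waits for the task to finish; None means success.
            results[stmt] = future.exception()
    return results


if __name__ == "__main__":
    # Stand-in for a real executor; it only prints what it would run.
    def print_only(stmt: str) -> None:
        print("would execute:", stmt)

    # Hypothetical table names, one independent CTAS per target table.
    loads = [
        "CREATE TABLE target_a AS SELECT * FROM source_a",
        "CREATE TABLE target_b AS SELECT * FROM source_b",
    ]
    for stmt, err in run_loads_concurrently(loads, print_only).items():
        print(stmt, "->", "ok" if err is None else repr(err))
```

Each statement here is independent, so a failure in one load is reported without cancelling the others. Whether this actually speeds anything up depends on where the bottleneck is (CPU, I/O, or WAL), which the sketch does not address.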