Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRjNv-000cGT-6B for pgsql-general@arkaria.postgresql.org; Thu, 11 Jul 2024 02:18:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sRjNt-00AP92-Uu for pgsql-general@arkaria.postgresql.org; Thu, 11 Jul 2024 02:18:13 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRjNP-00ALf4-Jr for pgsql-general@lists.postgresql.org; Thu, 11 Jul 2024 02:17:43 +0000 Received: from mail-lf1-x12c.google.com ([2a00:1450:4864:20::12c]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sRjNM-001U7j-6L for pgsql-general@lists.postgresql.org; Thu, 11 Jul 2024 02:17:43 +0000 Received: by mail-lf1-x12c.google.com with SMTP id 2adb3069b0e04-52e9b9fb3dcso456483e87.1 for ; Wed, 10 Jul 2024 19:17:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720664258; x=1721269058; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=zTl+5yjKLcfN68qwzb0Qu9Y7s9M3cQllf3un0Dvxens=; b=jX4D8e4riDfq22hbzIkfPb/gMr2x5jmc7Lr5yJbXjUR6h1QnTcdN/1TEAFALuuyxVK BSL8yhiE5wccHQVo7RB89FSYBCdg4l1lkPAda18DO8ifwz6ydliuavOpDfrvdmDyCcg1 0tshkUNqM0gymOTtZ+YVUEIfa251/sL3npKesHpAZ5XXFXLFDIpg0NH+gl2JUAMAOia2 naK8+q91Hc5/DlmddO1YK2R25IBfsABlHYFOlXx0MZdt6JpEF0T1xEHOvD8t+H47Q0qb mbl6pS9+Zx8Th4XcUgKcxe5wYRzqwGs/+kd1R5AIRgYG9hTXc9LoY0E1d5M705zLRaYU wOqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720664258; x=1721269058; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=zTl+5yjKLcfN68qwzb0Qu9Y7s9M3cQllf3un0Dvxens=; b=HghdmpxRmbNVfX7RAyHfMjRHycf2ITe4GMgK2nfUJ2uIyBQc+7ulhE11BDn3wh/PGC AwEGspZBIhU0F75uOdy8VxNo43k8AkbXjcqXjUgQB0KvNI9IGr35BcOaPCey9yYXec+N mdS4eVoGh4+RznMD9xvLO40ycImlaSFMP2Vu7SjeHKwMalk2053ndZHMJVnhrr+ldLlC Cdsjo3RRwbi5ukzJXDcEoxoinzqFp9MAsSnhkaNO3Z7Kouu51aeGJtv4VqlhyO+U2sKL o874V511dKNXkg4VJ9Yo3IPJ3vtdgR3NL+8HViNk+8VdsF+SbCZ1TmCM+qT0B6SbUfeV uBUQ== X-Gm-Message-State: AOJu0YzxDtGQDbx1ENaykOAYWcktAXDPFwwFzT0mMhpjyCfdSvLyCvCl fqFmudVz0vi6DMGYOU9jzS/sB0uJFFdw35Td2YvnaEJgFvu0WTIJqL5rogpOUnbA+kRGvkhGUMs +ScY4bjwiKIGrbZ+F0jNE6baK6hA= X-Google-Smtp-Source: AGHT+IGq6SimyZqfjI/mdJRBH2MGVHrA2D4w8R3wXfZ7a3zapE4SGhRDqe6BLIVD6xYTvywHWTJ0eGvPm4sVdwTDN7M= X-Received: by 2002:a05:6512:34cb:b0:52c:dbe7:cfd5 with SMTP id 2adb3069b0e04-52eb99a3121mr3625029e87.32.1720664258132; Wed, 10 Jul 2024 19:17:38 -0700 (PDT) MIME-Version: 1.0 References: <3603c380-d094-136e-e333-610914fb3e80@gmx.net> In-Reply-To: <3603c380-d094-136e-e333-610914fb3e80@gmx.net> From: David Rowley Date: Thu, 11 Jul 2024 14:17:26 +1200 Message-ID: Subject: Re: Query on partitioned table needs memory n_partitions * work_mem To: Dimitrios Apostolou Cc: pgsql-general@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, 11 Jul 2024 at 13:19, Dimitrios Apostolou wrote: > I have a table with 1000 partitions on PostgreSQL 16. > I notice that a fairly complicated query of the form: > > SELECT ... GROUP BY ... LIMIT ... > > causes the postgres backend process to grow insanely very fast, and the > kernel OOM killer to kill it rather soon. > It seems it tries to allocate at least 1000 * work_mem. > -> Append > -> HashAggregate > -> Seq Scan > -> ... 1000 more hashagg+seqscans > > > Is this allocation pattern (workmem * n_partitions) expected under any > scenario? I can't find it documented. AFAIU the backend should allocate > up to (depth_of_execution_plan * work_mem) (putting aside the > hash_mem_multiplier and the parallel workers). Not depth of execution plan. It relates to the number of nodes in the plan which allocate work_mem or work_mem * hash_mem_multiplier. There is some documentation in [1]: "Note that a complex query might perform several sort and hash operations at the same time" Also, see the warning about execution time memory in [2]. > NOTE: after having written the above message, it occured to me that I have > enable_partitionwise_aggregate=on. And Turning it off fixes the issue and > makes the query faster too! Expected behaviour or bug? enable_partitionwise_aggregate=on causes this query to perform an aggregate per partition. If your GROUP BY clause values are distributed evenly throughout all partitions then you might find it's not much slower to execute the query with enable_partitionwise_aggregate=off. It's understandable that how PostgreSQL uses work_mem isn't ideal here, but unfortunately, that's the state of affairs, currently. You might want to reconsider your enable_partitionwise_aggregate setting and/or how many partitions you have. David [1] https://www.postgresql.org/docs/current/runtime-config-resource.html [2] https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES