Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bB03b-0005tL-BO for pgsql-performance@arkaria.postgresql.org; Thu, 09 Jun 2016 13:31:35 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1bB03Z-00027C-Uu for pgsql-performance@arkaria.postgresql.org; Thu, 09 Jun 2016 13:31:33 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1bB03Y-00026h-NK for pgsql-performance@postgresql.org; Thu, 09 Jun 2016 13:31:32 +0000 Received: from mail-vk0-x22f.google.com ([2607:f8b0:400c:c05::22f]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1bB03R-0006bL-OI for pgsql-performance@postgresql.org; Thu, 09 Jun 2016 13:31:31 +0000 Received: by mail-vk0-x22f.google.com with SMTP id e4so54833816vkb.1 for ; Thu, 09 Jun 2016 06:31:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=7o3HGjNrE2WtkA8Qa26PIgKNp7QfzN4WgfrC0Q1kSrs=; b=X0wcToDuwvoc/aMMNt0c1xuac75PHhEGkGtKkm1fgLovGqfetKBs44kTemT0t0szbz XlcDfBDkkvevMWWZRjIEjhZH6ZsKlx4pNDqJTjfSYRmm9HpBs0jXFncza+8YLom/Hhzn 8/+JG309S0suZNT+HTxc2yq01m70rJ4Qw1D/vGouYIGZ7cLmzRxt1fYSZOIG1IA4+Ze0 0pC/KoOUAdg32/lb+Et+lIVT4F3oEwN2UNFwGtBkZLXZ6H0LxWqXm7pEoLTBEvR1yQoN oYjVV2JqFKvHOiJRXw6dp54g+w/2YzuFRlRabUUR5aeb5Pt3yMiQluSy8Z/+19rqwnSu WIJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=7o3HGjNrE2WtkA8Qa26PIgKNp7QfzN4WgfrC0Q1kSrs=; b=J3642LQiycQzY7/QNAfJSVMaHkbm9u8lfkLKcLNq8fj7jkM+tKrdbBIqMGaQbv9Gnp i7kBPz3WiHxIuNrBwLWvsbcgZwvWLkkrkTjg7dCRqhGtJiibkEXhPopGKeoAhf+pYcHU s8TVPeu+kmFU5VXmJVyk7sQGANU30VMpS84ukvqcgqGka0IWSjpzN4/bCCOOR6zWJaIU gJ7vM1htcGJFmXTLrhYBvJgshEC3DEcBtCic5gIl8JgQnk1tZ95IiPU/fIUh06oMoPNd X//V1OOjiYIz/PQkoLvHPmPymFVIkH9FZzAMGXqUrZMW7BpqafBS72DqUT1zI8MGPYsP /kxA== X-Gm-Message-State: ALyK8tLMiSTO5loSt/qPDIajbAqvUWEMrLcRrQ7+aJaiuD6tUWZ0/IpB+mTZ8EgDn+e501WedEw3CzvmCH9YqA== X-Received: by 10.159.38.228 with SMTP id 91mr222296uay.36.1465479084232; Thu, 09 Jun 2016 06:31:24 -0700 (PDT) MIME-Version: 1.0 Received: by 10.176.68.103 with HTTP; Thu, 9 Jun 2016 06:31:23 -0700 (PDT) In-Reply-To: <717.1465365859@sss.pgh.pa.us> References: <5756C618.1070802@agliodbs.com> <717.1465365859@sss.pgh.pa.us> From: Merlin Moncure Date: Thu, 9 Jun 2016 08:31:23 -0500 Message-ID: Subject: Re: array size exceeds the maximum allowed (1073741823) when building a json To: Tom Lane Cc: Michael Paquier , Josh Berkus , Nicolas Paris , "David G. Johnston" , pgsql-performance Content-Type: text/plain; charset=UTF-8 X-Pg-Spam-Score: -2.7 (--) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org On Wed, Jun 8, 2016 at 1:04 AM, Tom Lane wrote: > Michael Paquier writes: >> On Tue, Jun 7, 2016 at 10:03 PM, Josh Berkus wrote: >>> On 06/07/2016 08:42 AM, Nicolas Paris wrote: >>>> Will this 1GO restriction is supposed to increase in a near future ? > >>> Not planned, no. Thing is, that's the limit for a field in general, not >>> just JSON; changing it would be a fairly large patch. It's desireable, >>> but AFAIK nobody is working on it. > >> And there are other things to consider on top of that, like the >> maximum allocation size for palloc, the maximum query string size, >> COPY, etc. This is no small project, and the potential side-effects >> should not be underestimated. > > It's also fair to doubt that client-side code would "just work" with > no functionality or performance problems for such large values. > > I await with interest the OP's results on other JSON processors that > have no issues with GB-sized JSON strings. Yup. Most json libraries and tools are going to be disgusting memory hogs or have exponential behaviors especially when you consider you are doing the transformation as well. Just prettifying json documents over 1GB can be a real challenge. Fortunately the workaround here is pretty easy. Keep your query exactly as is but remove the final aggregation step so that it returns a set. Next, make a small application that runs this query and does the array bits around each row (basically prepending the final result with [ appending the final result with ] and putting , between rows). It's essential that you use a client library that does not buffer the entire result in memory before emitting results. This can be done in psql (FETCH mode), java, libpq (single row mode), etc. I suspect node.js pg module can do this as well, and there certainty will be others. The basic objective is you want the rows to be streamed out of the database without being buffered. If you do that, you should be able to stream arbitrarily large datasets out of the database to a json document assuming the server can produce the query. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance