Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rB4Zy-009Kgq-Mu for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Dec 2023 02:57:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rB4Yw-00ESfG-QE for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Dec 2023 02:56:30 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rB4Yw-00ESf6-Dx for pgsql-hackers@lists.postgresql.org; Thu, 07 Dec 2023 02:56:30 +0000 Received: from mail-io1-xd2e.google.com ([2607:f8b0:4864:20::d2e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rB4Yr-009BBT-52 for pgsql-hackers@postgresql.org; Thu, 07 Dec 2023 02:56:26 +0000 Received: by mail-io1-xd2e.google.com with SMTP id ca18e2360f4ac-7b6f4ee4f7fso3454239f.0 for ; Wed, 06 Dec 2023 18:56:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701917784; x=1702522584; darn=postgresql.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=V98S3UBwbx0mgphFrww8oE16C+uIQWvTe0CLaViDf6c=; b=fzxCGEert+z9VYuTZdmXqqzbmkZnQPGJv7CJ/LG6/vOjRtqXU+IuVdJMWGjq3fizYs LvJRFDS7q7uc/bg3ZDHexqz2Tsv6akpkQcxQvlwWOQunE+Jjthi/uG9gOe/aMbFjZHEa ZERbWhfNzc5FyIgWGxd/VGH8PgjkB+/Jpzyi14Y458zSITup9dPY6FfJPZKivouhh7hg GsueteDniVOe4+jrM2FQAmcJH8AwqyIq6PYycjhmwtCBd0dVLq7XTARUgStPKa7g0wMH Ov2wKOdWIXKNZ72cL/urGecxVO90lUlVLEDQ4204Id7etF4D+eFstRv1nZaz8L95OMJc 0bGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701917784; x=1702522584; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=V98S3UBwbx0mgphFrww8oE16C+uIQWvTe0CLaViDf6c=; b=JjWZlVHhibIsBlxpmHnKdpDtHQ+JpWpdVDrYwuzTgdkfAx2sd7sKYBrvb3nnKji6QG Yoe8tmk6VEJfuRQV88CiMIdfhN5H9kVxJ6Nf1voM4iJFGTzzPvZhn4r1BhjxZbrolF9v Sr65G1058tK0lxMKcoPklYxFx+PI+RJ8HeqTfTjZaQkuyl1HdqHq6dxAmXIDpO7Dj7AX SOEUuDh2JOcg/wdNkLcIBn2bWFh2qt8GWKGA6cxmSMyc+Kp8y5iTDOSZGVffGo2FdgJX gT3RJbAeLsOA4FVBH6aDKLq8OkC2o+4K30e30agkuUeKPiPoMwyDUxays+cDV56gYeal ZKww== X-Gm-Message-State: AOJu0YzhbUME9nKhZdNSOy/e0Mhlv7hes13G3d1rcCv5neERVsv2Exqc ZpECqB9/QVTTgfqkUO2zSjo= X-Google-Smtp-Source: AGHT+IFOfznac1f/ZXTlFjTmyUFBdHZirxMGD8AMsoyi8xs7Ik8Rjuk+nyraXHA5KgfX2RK16ZgABg== X-Received: by 2002:a05:6e02:1d0e:b0:35d:7487:f6c1 with SMTP id i14-20020a056e021d0e00b0035d7487f6c1mr4240192ila.29.1701917784236; Wed, 06 Dec 2023 18:56:24 -0800 (PST) Received: from nathanxps13 (162-195-168-172.lightspeed.stlsmo.sbcglobal.net. [162.195.168.172]) by smtp.gmail.com with ESMTPSA id c9-20020a02a409000000b00468ec5fbec7sm83179jal.33.2023.12.06.18.56.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 18:56:23 -0800 (PST) Date: Wed, 6 Dec 2023 20:56:22 -0600 From: Nathan Bossart To: Tom Lane Cc: Joe Conway , Andrew Dunstan , Davin Shearer , PostgreSQL-development Subject: Re: Emitting JSON to file using COPY TO Message-ID: <20231207025622.GA3011676@nathanxps13> References: <7a60faf6-e7f1-419d-aee6-10a78ea2fe81@joeconway.com> <2e7ff718-895d-83fc-46f7-be25e23b23b4@dunslane.net> <1104915.1701877459@sss.pgh.pa.us> <19a5f9d8-bd1f-9!e51-0f5b-510c1189a8a7@dunslane.net> <1110690.1701880139@sss.pgh.pa.us> <20231206163349.GA2873889@nathanxps13> <20231206164439.GC2873889@nathanxps13> <77312d02-9325-4f14-9fe0-2746c2ee12bc@joeconway.com> <1136975.1701894046@sss.pgh.pa.us> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1136975.1701894046@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Dec 06, 2023 at 03:20:46PM -0500, Tom Lane wrote: > If Nathan's perf results hold up elsewhere, it seems like some > micro-optimization around the text-pushing (appendStringInfoString) > might be more useful than caching. The 7% spent in cache lookups > could be worth going after later, but it's not the top of the list. Hah, it turns out my benchmark of 110M integers really stresses the JSONTYPE_NUMERIC path in datum_to_json_internal(). That particular path calls strlen() twice: once for IsValidJsonNumber(), and once in appendStringInfoString(). If I save the result from IsValidJsonNumber() and give it to appendBinaryStringInfo() instead, the COPY goes ~8% faster. It's probably worth giving datum_to_json_internal() a closer look in a new thread. diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c index 71ae53ff97..1951e93d9d 100644 --- a/src/backend/utils/adt/json.c +++ b/src/backend/utils/adt/json.c @@ -180,6 +180,7 @@ datum_to_json_internal(Datum val, bool is_null, StringInfo result, { char *outputstr; text *jsontext; + int len; check_stack_depth(); @@ -223,8 +224,8 @@ datum_to_json_internal(Datum val, bool is_null, StringInfo result, * Don't call escape_json for a non-key if it's a valid JSON * number. */ - if (!key_scalar && IsValidJsonNumber(outputstr, strlen(outputstr))) - appendStringInfoString(result, outputstr); + if (!key_scalar && IsValidJsonNumber(outputstr, (len = strlen(outputstr)))) + appendBinaryStringInfo(result, outputstr, len); else escape_json(result, outputstr); pfree(outputstr); -- Nathan Bossart Amazon Web Services: https://aws.amazon.com