Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rBDI3-009tPz-4q for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Dec 2023 12:15:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rBDI1-00HQSd-PR for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Dec 2023 12:15:37 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rBDI1-00HQSH-30 for pgsql-hackers@lists.postgresql.org; Thu, 07 Dec 2023 12:15:37 +0000 Received: from mail-yb1-xb2e.google.com ([2607:f8b0:4864:20::b2e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rBDHw-00AYdV-55 for pgsql-hackers@postgresql.org; Thu, 07 Dec 2023 12:15:36 +0000 Received: by mail-yb1-xb2e.google.com with SMTP id 3f1490d57ef6-d9caf5cc948so940069276.0 for ; Thu, 07 Dec 2023 04:15:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joeconway.com; s=google; t=1701951330; x=1702556130; darn=postgresql.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=jHLuDnHf53K+078Nz/64Q23flWWoZMyMO5EpU2FQIKM=; b=h/ew4/O7tOL7722rlT+CSqgpVs4fiMHVx8ZKLQ08izyOjpT7ysz61QRViCEa3OyJD4 XmzN9XTYX9GuF8RJn4HK1mzSpxL0kb02w0cFmX0zaPvNdPjQl6Xx6FaalgVCB6fywnXg vrs/Rz8NTDWiyK0c5NkSbbcJLHGTV3lP4LABkMyKZH2PY/rnsuIgphs58CCFoBsiZztS 5OBfjqwp3lzJiLZ40vfDyI+1i7bdWh11HnrnxL1dhB34EsCLP8Rr3JLtALq3/sD8pYF+ 5rU34DrMmWOlxWpSb6j0sZah3NUFCx7LVSdGwM8LYkrdcdos7+6yqM5YEoVQZtTDB/Lh +1OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701951330; x=1702556130; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=jHLuDnHf53K+078Nz/64Q23flWWoZMyMO5EpU2FQIKM=; b=BaG9a2w/qQoUupMT5MzJz8uoBAvrinT2C/yaEn6JhpwHqeEXNHcPKlFDeT2ycGzx6T xsszVmj9plGt2Y8adtopp5DfwmJ6+2eMORRrkPGAu95pnzH+f4i21jW34KiufbCXiMkE ndZYS77iixTG6EBz5EPpZjS2ddFMrgLlR/ULjhz4p7vHeJgBoOW+ekzXY3bsAu/HBGk0 TSGQ5ptn+g6CTSdZxDVgH4ZdiNCQCE5qjj8AvtfNpLZ02B3ODWlDfQCGcFYsgYJu6PyE CdS5x65dBXWs2+iP2Malgh1SnS/Q9AB+tXKXp4yPfY55eZ3HzHYuHQXupPqUjEb9K0Zv DIpQ== X-Gm-Message-State: AOJu0YwLrL0twscghuQtZ7qFJAIFk+vBWrfGJCA4kF++2tN5w3+LQ7lp 7348GX5cEv8cAV7eatb8+n5NKQ== X-Google-Smtp-Source: AGHT+IE5ENUCc5ls7TR+08semKDYSgzGC5J0KuCZ+UehJq163+MhQNihnIxPfiNeXOd6bRyZyV0/hg== X-Received: by 2002:a05:690c:e09:b0:5d7:1940:7d65 with SMTP id cp9-20020a05690c0e0900b005d719407d65mr2390003ywb.60.1701951329854; Thu, 07 Dec 2023 04:15:29 -0800 (PST) Received: from [192.168.4.41] (162-239-31-113.lightspeed.dybhfl.sbcglobal.net. [162.239.31.113]) by smtp.gmail.com with ESMTPSA id l20-20020a81d554000000b005d395bf0ad5sm344126ywj.64.2023.12.07.04.15.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 07 Dec 2023 04:15:29 -0800 (PST) Message-ID: <685e4c62-b584-4a0b-aee1-6ff77fa71aba@joeconway.com> Date: Thu, 7 Dec 2023 07:15:28 -0500 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Emitting JSON to file using COPY TO Content-Language: en-US To: Nathan Bossart , Tom Lane Cc: Andrew Dunstan , Davin Shearer , PostgreSQL-development References: <7a60faf6-e7f1-419d-aee6-10a78ea2fe81@joeconway.com> <2e7ff718-895d-83fc-46f7-be25e23b23b4@dunslane.net> <1104915.1701877459@sss.pgh.pa.us> <19a5f9d8-bd1f-9!e51-0f5b-510c1189a8a7@dunslane.net> <1110690.1701880139@sss.pgh.pa.us> <20231206163349.GA2873889@nathanxps13> <20231206164439.GC2873889@nathanxps13> <77312d02-9325-4f14-9fe0-2746c2ee12bc@joeconway.com> <1136975.1701894046@sss.pgh.pa.us> <20231207025622.GA3011676@nathanxps13> From: Joe Conway Autocrypt: addr=mail@joeconway.com; keydata= xsFNBEpXMCsBEADDnXUQzjlyi/cX02Gtdy2CLcroE5CsC7DJKdOBDbfgn0kfiIYoV5JniG4l VyzZUodY8yUAagqLYolh0UkBzs9N+qkm7erde4ypw3jzVQ37BuzIvk3nMUbuDZDgxWqX+nVS sKc+BQ5BpzgCHg48leoRO2ohjvYnUhgH3j2rFZCzaj6qQ7mv+XoxOJmUlVQtG06Jwkk7Vu14 7U9nMMM6hyUKzVnmCphnlcMNo26UyVU70MwFfFJgcI0c5fpp8byN56eD6VJVnufO5WAuEhzE qcrSJR2FAlmM90GBY+6vP29twLDCHuSFvrnujNCx/BvCC/a3/gPvyAFp4JtMm9eXAmq3m/Kw 94nTJXVdcbQeQQDp3KIG7MmWS4lnGvPn8v0CjgNaLvZXFLo1FgmUVsyEq1Lww4iRLa6sbpXJ ESx15UEue1k1YZM9C+4F/o3aeKNsAienjw2EXFzcaxIg/C4P493VMi3Qa8ycVxR5iYhUbYdo DFIUQhbFNsYfrtW/qZAELT3FCYFpZYG01e9Hj+cBrXXgyDDkQ5Lq4mlvmkRvuxn61V6Au4HA 0sJiCox5pM1FvzT+aI8HY1BYaiB9Pl4fhpKgmhhlSuglk9v39S4jmlUIb45iLAUVpeNM6Qjm 69pf5da9sm4aGFa7YlDSKf/WcU7z9ITZxsilOi2n7YJiwG7kTQARAQABzSRKb3NlcGggRSBD b253YXkgPG1haWxAam9lY29ud2F5LmNvbT7CwXoEEwEIACQCGwMCHgECF4AFCwkIBwMFFQoJ CAsFFgIDAQAFAlWTVvUCGQEACgkQMyt+aLaZQ0oPCQ/9HyRewMyvAIJRmoXoLAr8AoFLId6R qBJnNX0Lll0RLZui65aQ0+exwX7aH7TxWR16B2gWX3OmLfGT8XITOoG+zt9zsEpLvNkHchkF T/jyAcbuRj5WX9hamZgMbjXAJeCdlhW+fRA9Upb0w4dgBjqK5OgsqMikASL7t2vogHl9H08j vSoQLW+8wTnSBXBeBTBwB7xLIin5WVivzFHUCrnD2UsjeBIW3fmGdpTAjSxRzG+UPYVwXQ8F FLt7DpEytvLWapmZWMRdj0WZ/Q3SOO/Ed0yFqbzuwKaWcFrQBNeS2Sig+FefBNS98f9Hx7ku H3DW34qX/zSSdDh0jLs7X3PkIgF6BZR2TxaCwHPP9ERDiDaUInC9U7We1iZE1DjW8rLMEVJB hY0ClrrF67pnUKTbcU+uajpPn+2Jl74T0Set/XxpHZ4cezcJuqg31R8vHZgd5cf1WKP0D0pc qiuS02BBFkNCs1jQ+raTWcDuE6F1mUO2nvjUBN9r4y5DUbCNSqLKeAe/aA6JaSDkBpoXKdNS +c4rbzbktWkfUW8EhVlCGzNpy4ezEoVsqV2Ex7fNoxsE2vnSylLT9hycAmYf8ryMvniRZqnD T4JgLenIcQlkhB896T7wApOXfD8OJj1/XFxAfPi6vdlsr81uoxuB4euLp8IyduwLORRUogO9 zmAXG5jOwU0ESlcyJwEQAOkTBb9yDhJbMUgvhM11rZwT5tm4Y9TqtEHn0Zy3t9g7bdFFpMva v/KENd3oAtLFpMDf+H3AggFk4ftUwJwiVgJ88ilvCynJUGXiuYIaexY4DLgn4xpnuiEpYEFV dWnlw7dWVTc62exfqIz9bSWRzwfBCY9ruYGEb4RDPDSNSAVyI7sxHzef2asiYxIcxrTrw5Vu gWNlPZcV5/EJ6PUvATjBF2TBkXV7KOciQng2tsQGrGMkY5mduNqwpuh6zfPcVF8LeObe96wv 5ZhPRpO79nef7hnK2lJogp3JIo558Jlbz9WHtQEMZR85+bUhtI825QyNAFz3Jrn7NMgvDikc 2OrWo7YMgMC5hDSWVFqA6/EQCNnDWGABWgeYHZFpnPwsvUWIYdhSilUuj/Tuzvz9ZmucFNbQ bauDQw6VQ38ofGnoYDZFJsGncprB8dBi4tDrIQ+1RlIh6C2Z/eMipqJOT26+spluTjouvnKT 0S5yOgyX0PjbsysgwQdCGNJLHOjhHbSpSmOLaduV3CQo/0+DHT/TBjYfIXjTWouY9TkGxG4e NrxU0u2xAy5bMqOPmsFdjLTWlQUlF/fTMhB54XwI3FHWgnSnXZzStDTmTebLNdT/ftgliAzA 81uMj49j0exv731/v+7udLA1bV8gnZ01zQCASDpWiRQR3fgwcugSUqgRABEBAAHCwV8EGAEI AAkFAkpXMicCGwwACgkQMyt+aLaZQ0pwAQ//bjcWnZg/jjRQ9gbZUGMqniItZYRglBMKIqt4 Fia379JmHwTvavnFkJ8XMZ56UB0FIrgS+sUkRH6cPRQR+7Qi392LD021DXgSsz9CwFHjFyBG HwLEOTRcfYQbtJy0shHDJB4aQTOX3ERDH1PsvJNuevmQMzS0DWFav9+xMz9rKP4N+HffoBIZ E0C1xIE43nD4eLsbycte9sVIrmlNuUti3qUxJAQw8HwfJ6ZbBInHxquApR16uD1u99o6Xlnd FrDlY22tRmHCM0bR81GfGNdcU3Uo+rG/R/k4qa7s9/dgKvMbyH3fHhp/ceKag80Xo8IFurRl 0ZJP3sHJ2QDHCVLat7jRZ+43hi1WlIhFbrgn6IyI0i7XR/W8JjrC5MsKq4TUwGH077sU/kcH YebVJZRbUUst2hAGHDFVBcG12qoKf+ltL9qXJc1y7BGeCoUW6QjOpljpq6ZL4FQUsM0RSRjs 5egE3szPcIf5SyPK6WDOApoAq6M7BBFMGDZwEylYMtr0YekA1u86UA9D2xwLHEbBBp/uiby1 c9JbPJ1Pn8zJP8WZNeRw4Q9TtqVK09+oLirMUSpIDd6KdZ1VgRxOK2re7tjDvkVuYsSrsiJ+ 1iJNEnp9iK0ok0DlJpSCe6KhkxpaTdeoWMXdKuJWec0NIqoAd54ZgBPnr+UPxTixgPq/p6Q= In-Reply-To: <20231207025622.GA3011676@nathanxps13> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 12/6/23 21:56, Nathan Bossart wrote: > On Wed, Dec 06, 2023 at 03:20:46PM -0500, Tom Lane wrote: >> If Nathan's perf results hold up elsewhere, it seems like some >> micro-optimization around the text-pushing (appendStringInfoString) >> might be more useful than caching. The 7% spent in cache lookups >> could be worth going after later, but it's not the top of the list. > > Hah, it turns out my benchmark of 110M integers really stresses the > JSONTYPE_NUMERIC path in datum_to_json_internal(). That particular path > calls strlen() twice: once for IsValidJsonNumber(), and once in > appendStringInfoString(). If I save the result from IsValidJsonNumber() > and give it to appendBinaryStringInfo() instead, the COPY goes ~8% faster. > It's probably worth giving datum_to_json_internal() a closer look in a new > thread. Yep, after looking through that code I was going to make the point that your 11 integer test was over indexing on that one type. I am sure there are other micro-optimizations to be made here, but I also think that it is outside the scope of the COPY TO JSON patch. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com