Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ue22A-004D4Y-Is for pgsql-hackers@arkaria.postgresql.org; Tue, 22 Jul 2025 01:43:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ue229-004BIR-36 for pgsql-hackers@arkaria.postgresql.org; Tue, 22 Jul 2025 01:43:09 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ue228-004BII-B3 for pgsql-hackers@lists.postgresql.org; Tue, 22 Jul 2025 01:43:08 +0000 Received: from mail-qk1-x72e.google.com ([2607:f8b0:4864:20::72e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ue225-0007Sm-2M for pgsql-hackers@lists.postgresql.org; Tue, 22 Jul 2025 01:43:07 +0000 Received: by mail-qk1-x72e.google.com with SMTP id af79cd13be357-7e278d8345aso488817385a.0 for ; Mon, 21 Jul 2025 18:43:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dunslane-net.20230601.gappssmtp.com; s=20230601; t=1753148583; x=1753753383; darn=lists.postgresql.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :from:references:cc:to:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=Si+iwBoDwEyV3M0mKfyHk9tpdjXgCvyJMCYrPrEoKP4=; b=dhwcvuxpyJrjVIJ/SyzJI2JRnerIzr0Ocw3YCuwB8AszfVjbRFEyAGRmasMRiCJ+4J sHo6xqJZjkWe3ID7pHRVIMxIBKT5u7I4kxhVh0jFngTEgfwL7kiQqXZysR/bj7HmfY4l nLhvoBMgBcYPpQm4jTC6p/co48JWgdBj2zCRus1xbxJzp8HO36BDDzKR9J4Nh4Vc6qrt MB/huPJxObRzzcOJ5XIWlUWx5ckfbwvLDb9A9f76f8U4mljxB+Uf+afGv0fT6G0CLFoe FicC2dMhaNFDiWvLy3HUmGXw18Uda2rff9tnxL193jg6Q/ZHiOrmn/Dc0op698Wa2IlO Ajwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753148583; x=1753753383; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :from:references:cc:to:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Si+iwBoDwEyV3M0mKfyHk9tpdjXgCvyJMCYrPrEoKP4=; b=abzCT8eI1Xtuh3WBWE5EWcHheEwSBwYot4MsCBh+FcpdCGUVFKewyjVsU7Qo0EZNlS zMx6nT0LSK9t/iVh6alrcYbvIJR2Bn8VXolaH7sd/7QB3uJ+reaOMUFLP8QnxulrQVjA yYQ64HOFV+i40AOehvO6SQtUBY1Hq3xWheU5Qo0krLsPFXzJ6nyE5OS0QOa5V1k8c7IW ia2I2mHUtLO4Gspn06n0LXk7Z/diqMVEZRSL6Pkf39tVmX68UDEJ5stPe+DUoqlm9+5I Oc/kIhBdOoHA6l3X6PTQSUWaxcGVAYziM9Fv710PY/bfNYU0tAwQDBkeieLPetBnL5nn /rSg== X-Forwarded-Encrypted: i=1; AJvYcCUnCD8J6liW8pZCAiyVdL/UuliCmXNU2BciDJ1YfzNZ+3AlZ8intsvZrc5tc1k0YE5nTZ9GKHlzDwXQWZ1c@lists.postgresql.org X-Gm-Message-State: AOJu0YzCcQGEv9c3x828Ku/vTu/PnfTz7tGiOUGBc1c0CYKTbttveyrq 8gZj3BxSkAjSdGYMDn1Z7T14ULwF00zbHLj5m1B/5a7cfDB34x/yt8TbsHLqrQYfIeI= X-Gm-Gg: ASbGncs/8NzhTiG1d0x3+g7pS51U99t8OvdGQA0qMm581N1PLIb+bqdBQYSxIBaSRaC 6AJbYo6xdsMplb5wfE3Nq5xHoIQfuwtzy7bnrn0bGyTmHJwTFLcGCNVljT2FnNo4RQFUaPO8059 JnwaCp+fy3qp2PuoFSjrocLxhv5KuG2UHFZzNDbp7ezMxaeHmpNbo4OHDRF91bqLZvv9mRjDJR0 xdVHi6wwBLvwTSXSq0fjsUwLdF7FKs/nlNRhb6Zz5KYyvebQFmzOkwCwylfCt/Sr/8UuAYmgXd3 w4AqLOFNV60HVFK4/Y4q1Khxic8rBti0KI70BkuYC6eOuIz3pQ3d3IXLEwuE/3MRsMPZwkImY34 knxVroSxXE9TXe2i7cFLKotcaux2VRA== X-Google-Smtp-Source: AGHT+IF2RyTumMDAzoolBwd1vb8i5c2FjMDzOLIl5ir3hCMWMAbr28NsdeYQwHSras4S5uPjwMdgTA== X-Received: by 2002:a05:620a:8807:b0:7e2:315d:d7af with SMTP id af79cd13be357-7e343355808mr2487368485a.12.1753148582984; Mon, 21 Jul 2025 18:43:02 -0700 (PDT) Received: from ?IPV6:2605:a601:a6b0:500::1cb? ([2605:a601:a6b0:500::1cb]) by smtp.googlemail.com with ESMTPSA id af79cd13be357-7e356b458c2sm478975185a.23.2025.07.21.18.43.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 21 Jul 2025 18:43:02 -0700 (PDT) Message-ID: Date: Mon, 21 Jul 2025 21:43:01 -0400 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Non-text mode for pg_dumpall To: Noah Misch Cc: Mahendra Singh Thalor , =?UTF-8?Q?=C3=81lvaro_Herrera?= , jian he , Srinath Reddy , pgsql-hackers@lists.postgresql.org References: <616efe2c-3986-43cf-b88c-4435849acf9e@dunslane.net> <948154fe-0278-4f4c-8f5a-085e12f03163@dunslane.net> <20250708212819.09.nmisch@google.com> <20250716001957.c6.nmisch@google.com> <3f22a8bb-29e8-40cc-97a1-309181da2c13@dunslane.net> <20250722005339.ca.nmisch@google.com> From: Andrew Dunstan Content-Language: en-US Autocrypt: addr=andrew@dunslane.net; keydata= xsBNBE7KWFkBCAClridxur2AIc7eW2AR7izbfp3EnNefie2HbLF0izW5Ik5UjX2HBXBx4syI gY6b0ugohXrr274+baoAlvSbq6cAoQuEVrk5IZFzt20b1Xkx65FwGSEj526yiKLocqkJceSq Xr9xcA5SGY+FZv441chh5SU92v4q6z+6LPpoHOh97ptAVXZYNTtU0LevyvD5lja0TzbvJm6C eFXitJfnm1pLEr0DGJCR/iUOl/N62Kh4855zZC7NHIjQHPOvV5Stz/l5ilDhvGVk+xkXFPys SjZoUr1rXhYLpiyi5sR0X9FHXT0KnGuz1F5ERO7ZTLSSQ6fJwPj6gOk9K+vvoKvoeql5ABEB AAHNJEFuZHJldyBEdW5zdGFuIDxhbmRyZXdAZHVuc2xhbmUubmV0PsLAlwQTAQgAQQIbAwIX gAIZAQULCQgHAwUVCgkICwUWAgMBAAIeBRYhBOQ+WEYd/Hy/RGkVpZn6f8tZ/DuBBQJoGNGd BQkdEO8nAAoJEJn6f8tZ/DuBq74H/jkTR4Zi3stbw+xC7v2u3QozssK7MYPL2AsVfh7OealS h182fiWXpfvmmAB7WUHbhk9GC2RAOnHI/2d2jgKaMLAHsGYOT0YopTVIwRY43fCw/mK67yxc wmDcX+zyKfLaivNbf5A7QPLNwda98bEAMSJ8Sn652Uc6cA8t3uKGsVzbRBQOoYzjgvBCfSrE 9ql3PDNg0l4BfAqabd2f70ZUm9VAMEPrgv/v2xI7M2XiL4g5BVmqLCOwxLM8RMCotCuoweUr VO43DeBCIDwLxotMJKvGWDjBzQYlU1NPUAtNcz/gN9ITUe1VUGjyvGj4u1lxBOcQQUw7l1+T 5moZ4iZxXzvOwE0ETspYWQEIANGc4zQULOxhbqO2dyD51YhqCNRmm9oKWaqf+wmW4tpDe/VV cxAnNizd4LWCHfzpb5cHAtGkOPePMfzWVf6nvdF7d3eglbtf59+zG7O7llV0xSSoFiieQBsr GvqDInXYX/4mRRXMtyhM353/tixC9RWLs1oofyYmCPPXXY7h9R7en3B8BoVrRFcdzlIY/NFN hFGW/9dkEiGjgna2Rk6e15kln4ZvFBWUg23p93w/pqXcxY6+k/8TEk+C4R+M6w7o2PLGOjdZ +kPiUcw5H85zf/yZJwQXzisXaNduwWB6Vads9YC9dj6kPR1c4VGRqAaYL++LAEOqrlvm2Tvq QqZRtnEAEQEAAcLAfAQYAQgAJgIbDBYhBOQ+WEYd/Hy/RGkVpZn6f8tZ/DuBBQJoGNI2BQkd EODdAAoJEJn6f8tZ/DuBfw0IAKTsfD40teP/pp+bsLLMSxPXUYrrprTj7WFB5v61p6dkpSr/ qXmMlyahdxQFaPmfVgVirB1Vk/kHiWNnnGjfUV9nB2Zg9LI0Xb9/ts3LsUiRWXzG3tkMY6XL vsVOxW4XFRND9l2q+WW93aZ1DZl+fqWfYgMvsusFRhmGFOKTRfKPta2Pkv+AhA24N4+PrR5p bU4k2MO8PAGiK8eaYKGFG1bHKuAvoDoF7WXJ3FHxuWqLnKEt4dfOLm5pAe3zq1Lt6q8azT9i QWGpSAK5vQUWQHBHpiDjdPeqKZ6HiAXIIKfSmb+jrvXBqoP+D6/K7rUjG2aXiRtTIAXms9sm VRu7cmw= In-Reply-To: <20250722005339.ca.nmisch@google.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 2025-07-21 Mo 8:53 PM, Noah Misch wrote: > On Mon, Jul 21, 2025 at 04:41:03PM -0400, Andrew Dunstan wrote: >> On 2025-07-17 Th 6:18 AM, Mahendra Singh Thalor wrote >>>>>>>> --- a/src/bin/pg_dump/pg_restore.c >>>>>>>> +++ b/src/bin/pg_dump/pg_restore.c >>>>>>>> +/* >>>>>>>> + * read_one_statement >>>>>>>> + * >>>>>>>> + * This will start reading from passed file pointer using fgetc and read till >>>>>>>> + * semicolon(sql statement terminator for global.dat file) >>>>>>>> + * >>>>>>>> + * EOF is returned if end-of-file input is seen; time to shut down. >>>>>>> What makes it okay to use this particular subset of SQL lexing? >>>>>> To support complex syntax, we used this code from another file. >>>>> I'm hearing that you copied this code from somewhere. Running >>>>> "git grep 'time to shut down'" suggests you copied it from >>>>> InteractiveBackend(). Is that right? I do see other similarities between >>>>> read_one_statement() and InteractiveBackend(). >>>>> >>>>> Copying InteractiveBackend() provides negligible assurance that this is the >>>>> right subset of SQL lexing. Only single-user mode uses InteractiveBackend(). >>>>> Single-user mode survives mostly as a last resort for recovering from having >>>>> reached xidStopLimit, is rarely used, and only superusers write queries to it. >>>> Yes, we copied this from InteractiveBackend to read statements from >>>> global.dat file. >> Maybe we should ensure that identifiers with CR or LF are turned into >> Unicode quoted identifiers, so each SQL statement would always only occupy >> one line. > Interesting. That might work. > >> Or just reject role and tablespace names with CR or LF altogether, >> just as we do for database names. > There are other ways to get multi-line statements. Non-exhaustive list: > > - pg_db_role_setting.setconfig > - pg_shdescription.description > - pg_shseclabel.label > - pg_tablespace.spcoptions (if we add a text option in the future) > > I think this decision about lexing also ties to other unfinished open item > work of aligning "pg_dumpall -Fd;pg_restore [options]" behavior with "pg_dump > -Fd;pg_restore [options]". "pg_restore --no-privileges" should not restore > pg_tablespace.spcacl, and "pg_restore --no-comments" should not emit COMMENT > statements. > > I suspect this is going to end with a structured dump like we use on the > pg_dump (per-database) side. It's not an accident that v17 pg_restore doesn't > lex text files to do its job. pg_dumpall deals with a more-limited set of > statements than pg_dump deals with, but they're not _that much_ more limited. > I won't veto a lexing-based approach if it gets the behaviors right, but I > don't have high hopes for it getting the behaviors right and staying that way. Yeah, that was my original idea. But maybe instead of extending the archive mechanism, we could do something more lightweight, e.g. output the statements as a JSON array. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com