Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1mGKcG-0008VU-M1 for pgsql-docs@arkaria.postgresql.org; Wed, 18 Aug 2021 12:24:20 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1mGKcF-0006wF-KP for pgsql-docs@arkaria.postgresql.org; Wed, 18 Aug 2021 12:24:19 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1mGKcF-0006w2-9T for pgsql-docs@lists.postgresql.org; Wed, 18 Aug 2021 12:24:19 +0000 Received: from mail-wr1-x42d.google.com ([2a00:1450:4864:20::42d]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1mGKcC-0007ES-MV for pgsql-docs@lists.postgresql.org; Wed, 18 Aug 2021 12:24:18 +0000 Received: by mail-wr1-x42d.google.com with SMTP id q10so3259892wro.2 for ; Wed, 18 Aug 2021 05:24:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cybertec-at.20150623.gappssmtp.com; s=20150623; h=message-id:subject:from:to:cc:date:in-reply-to:references :user-agent:mime-version:content-transfer-encoding; bh=C5o853FvMWDg+nV8xwYCAB4Pq+7vRKgfWZMryBmQT40=; b=lmtoM+Ns++AeL1EIt7RpvqQmFnTPg6i3/QUfN5H93YNzCo5liofjjS4z51LNRtsNlx NAqu0LNKto7UQvZIXLWxYo+Gf84prTV2aIF6ZrQ/F1IQ4DyQrMbGXeXG7htySkqIhOSu T3uyUHVkGdxKQUCrZh3CTNh9fhm0PJOmX+mSd5oHRkpOVStJEKWlNasAjnOmIIudUBNA nPLaebHD52SYGJjrHM+afB+esXDjeVp1dtFRPGDyiLyv4rXYSkY0A7oHVJtO+vae5Gv6 ZZR66VsoNMAaFhvdO7A6k/gRpwBOHHYu7OhshI3D/UIY90oGPZKTaZbiiS9nZNE6ZZKL 8z9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:user-agent:mime-version:content-transfer-encoding; bh=C5o853FvMWDg+nV8xwYCAB4Pq+7vRKgfWZMryBmQT40=; b=acMYgxiIpZeufcWlYxrghj6KJZzWtg99Dw1eYupssvta7MMN9w9Z2ipK162SjHZwop wj96gw3sv5RNDdyHBmZJinWKKsutUhwKGZzfHZRmG2boSxnRygU03q+rtlroYwx8CILN pWhMr3Ev7nHvOQr+X9BarxPdod33SPl9xc9XNitco0+cVqX20H6jNJlNFTpLyir/oBSt y31earb/DwnVAijSQ8JC9m5EMfWRhOysyZ/bnPKVG+xAWZkcbPUJXc2aoLlWoqSSGlig 3UtYQr/tcprwQKJoEHv+4aN0uom4b8a+BHSpSv6S/DW0IUDIdAHL7a2fSeC32uy7x7gW yyqA== X-Gm-Message-State: AOAM5313leqbgb7j9p7oUqrPv8gztu9207B4svxMbE+TQWyr9Y40KYcN 0H5My2ujqfKMXSbBH9y4aR+Gxg== X-Google-Smtp-Source: ABdhPJwt8/92K5+x+CnthvHrhoRLk+GFcllISCwVfvJNZOAXguTS5otV0b1Dy0M1GE1/un3TCj8/tg== X-Received: by 2002:a5d:4dc3:: with SMTP id f3mr10532877wru.302.1629289454863; Wed, 18 Aug 2021 05:24:14 -0700 (PDT) Received: from localhost.localdomain ([88.116.133.170]) by smtp.gmail.com with ESMTPSA id e10sm5615766wrt.82.2021.08.18.05.24.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Aug 2021 05:24:14 -0700 (PDT) Message-ID: <81f71541075c376ce137ee6aec42c61fb9a60a3d.camel@cybertec.at> Subject: Re: Improve documentation for pg_upgrade, standbys and rsync From: Laurenz Albe To: Stephen Frost Cc: pgsql-docs@lists.postgresql.org Date: Wed, 18 Aug 2021 14:24:13 +0200 In-Reply-To: <20210726191126.GW20766@tamriel.snowman.net> References: <22f129004bb66cd91e1dfd3345a9787f5039f3ae.camel@cybertec.at> <20210519143135.GI20766@tamriel.snowman.net> <20210716131744.GA20766@tamriel.snowman.net> <20210726191126.GW20766@tamriel.snowman.net> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.5 (3.36.5-2.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, 2021-07-26 at 15:11 -0400, Stephen Frost wrote: > > > > > An additional thing that we should really be mentioning is to tell > > > > > people to go in and TRUNCATE all of their UNLOGGED tables before going > > > > > through this process, otherwise the rsync will end up spending a bunch > > > > > of time copying the files for UNLOGGED relations which you really don't > > > > > want. > > > > Ok, done. > > Great, thanks, it's not quite this simple, unfortunately, more below.. > > > + > > + If you are upgrading standby servers using methods outlined in section > + linkend="pgupgrade-step-replicas"/>, you should consider dropping temporary > > + tables and truncating unlogged tables on the primary, since that will speed up > > + rsync and keep the down time short. > > + You could run the following psql commands > > + in all databases: > > + > > + > > +SELECT format('DROP TABLE %s', oid::regclass) FROM pg_class WHERE relpersistence = 't' \gexec > > +SELECT format('TRUNCATE %s', oid::regclass) FROM pg_class WHERE relpersistence = 'u' \gexec > > + > > Temporary tables aren't actually visible across different backends, nor > should they exist once the system has been shut down, but sometimes they > do get left around due to a crash, so the above won't actually work and > isn't the way to deal with those. The same can also happen with > temporary files that we create which end up in pgsql_tmp. > > We could possibly exclude pgsql_tmp in the rsync command, but cleaning > up the temporary table files would involve something more complicated > like using 'find' to search for any '^t[0-9]+_[0-9]+.*$' files or > something along those lines. > > Though, for that matter we should really be looking through all of the > directories and files that pg_basebackup excludes and considering if > they should somehow be excluded. There's no easy way to exclude > everything that pg_basebackup would with just an rsync because the logic > is a bit complicated (which is why I was saying we really need a proper > tool...) but we could probably provide a somewhat better rsync command > by going through that list and excluding what makes sense to exclude. > We could also provide another explicit before-rsync step to review all > the temp table files and move them or remove them, depending on how > comfortable one is with hacking around in the data directory. > > This, of course, all comes back to the original complaint I had about > documenting this approach, which is that these things should only be > done by someone extremely familiar with the PG codebase, until and > unless we write an actual tool to do this. I agree with what you write, but that sounds like you are arguing for a code patch rather than for documentation to enable the user to do that manually, which is what I believe you said initially. My two statements will get rid of temporary tables left behind after a crash and truncate unlogged tables, which should be an improvement. Of course it would be good to get rid of orphaned files left behind after a crash, but, as you say, that is not so easy. I'd say that writing tools to do better than my two SQL statements is nice to have, but beyond the scope of this documentation patch. > > > > Recommend using the --relative option of rsync for clarity > > > > and adapt the code samples accordingly. > > > > Using relative paths makes clearer what is meant by "current > > > > directory" and "remote_dir". > > > > I normally prefer absolute paths as well. > > But that is the only way I got it to run, and I think that in this > > case it adds clarity to have the data directories relative to your > > current working directory. > > I'm pretty curious that you weren't able to get it to run with absolute > paths.. I tried a couple of times with a test cluster and failed. Part of the confustion for me is that you are supposed to run the rsync from a certain directory, which seems weird if paths are absolute. Run from *any* directory above the old and the new cluster? "Relative to my current directory" makes more sense to me here. > > + (There will be a mismatch if old standby servers were shut down > > + before the old primary or if the old standby servers are still running.) > > Would probably be good to note that if the standby's were shut down > before the primary then this method can *not* be used safely... The > above leaves it unclear about if the mismatch is an issue or not. I get > that this was in the original docs, but still would be good to improve > it. Agreed. Yours, Laurenz Albe