Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ux52z-00AZbP-EH for pgsql-admin@arkaria.postgresql.org; Fri, 12 Sep 2025 14:46:45 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ux52x-001rrm-9y for pgsql-admin@arkaria.postgresql.org; Fri, 12 Sep 2025 14:46:44 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ux52w-001rrb-UK for pgsql-admin@lists.postgresql.org; Fri, 12 Sep 2025 14:46:43 +0000 Received: from mail-oo1-xc2f.google.com ([2607:f8b0:4864:20::c2f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ux52v-000390-2N for pgsql-admin@postgresql.org; Fri, 12 Sep 2025 14:46:42 +0000 Received: by mail-oo1-xc2f.google.com with SMTP id 006d021491bc7-62189987b47so951544eaf.1 for ; Fri, 12 Sep 2025 07:46:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1757688401; x=1758293201; darn=postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=/p9S2QIXexBhnbZPOp/IJYYpk3HcxJFuVMpC/PuEpV4=; b=Ccz9lpOuHJHen7+uOMWroQLLOPRAhcSQlYBBzxyQG12BVXi3kINcqloxpin9kA9gWw MP3EjIS5s1ZjlgOV9Hre3bucNAKLYzzkD4ZOXYdxtjgyCa8dCZGv6+qbti2coTNEjfV5 hkeyZF/hg/D32p6C7T7MajInpEUtt3THGbl92FZIKM2tTDqSUY9F2igo++UFsrVS7x98 EZsaro4ufnlAzwPJEZOOTTlvjKLX7Bs5yKDQbdMDS1TZaNYxlrH8RdT9lPaCp+MIKKe/ CJ0OpBqmxYxoTuOLlXbxNNbGajsjt0japY8yZv0IoYZfRy/u8GpQMRaiRvBKUXLL/V/x Ut4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757688401; x=1758293201; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=/p9S2QIXexBhnbZPOp/IJYYpk3HcxJFuVMpC/PuEpV4=; b=q9SZRC+T0PlhwCIUKt82gpBBc7JqF6FlccLZ3i9fCEjJ/7NE0CAlDfAyUgr6fVzr+X qTJjy/qrTUvjfHdyBj9MLyYkZyIrpN7+5xAt22LTJ8U5Rvh2fPvqEM52akZjvC0VQiuC gb8iqWAN7fe9WCWwJcWb2aI/j6au5NgZcUHuBsDFEn5d8SvbNAKJ9lHyBLkFAcna3D9q 7PpmuTsNpkb219AH+n6AB+nH7ZotaC9LeN7M4jUYcohziEZ2BGfMOTuVGiK/lh6BZwfA 6cAH8/sfbGp0L8dhcns5bwGSvD106bfWU0YbcglDq4ygWsoygF9e1yaycaRnpgC6HV5m uSlg== X-Gm-Message-State: AOJu0Yzx9NfhJGUXQ+o2Ztwb2TqlM+gjTexz0coZiRvQsnS9vqcyZKce 377+HpOvrJcEK/TYhjD/uJScsll4/jwHj/mLMUDN+jcfm8Qcgp7yweOZWoCPSX+lgqTyURYtaxF KT13u/KQnfPyJRMmCN82PMnHppy2qGZU5YI5q X-Gm-Gg: ASbGncuIHin+uzpTgsZ01iHa0MICLy1p+XgAVN6YTmp0Z6WlGCB44wvI5gq+pyBKvdB 6QDp6EvnAOMluBu+jXhAFS+RV0OEUoYCoXXvuYswizpk6cO22+v2DE0pQAs8r1oM4QwiXC9OoV6 9p08VaGk1JOxh51ZxU52G8JS5ttp9XCWIxt9MTxhrL23LXQpVdoxCX37tiwBisqP5v1uDitaKP5 a3n4z5i X-Google-Smtp-Source: AGHT+IGNwxUIlyKeahrxb6CF79yheAJkh89NGVDeItO9lXFCnet4OztlGBJFpPya6umGxmjER15yQQHuMj8DFmF3Mg4= X-Received: by 2002:a05:6870:ac86:b0:314:9684:fe11 with SMTP id 586e51a60fabf-32e58f733ddmr1843169fac.44.1757688401338; Fri, 12 Sep 2025 07:46:41 -0700 (PDT) MIME-Version: 1.0 From: Ron Johnson Date: Fri, 12 Sep 2025 10:46:30 -0400 X-Gm-Features: Ac12FXwjjzBB1-psJGwdGsAS44FpfBGskVkrEu0d6Agime6CQ8QeX4I9jSMnvFY Message-ID: Subject: pg_stats.correlation rule of thumb for re-clustering a table? To: pgsql-admin Content-Type: multipart/alternative; boundary="000000000000c9a4fd063e9bb841" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000c9a4fd063e9bb841 Content-Type: text/plain; charset="UTF-8" (By re-cluster, I of course mean pg_repack.) Purely OLTP tables (that are only accessed randomly) can of course live with 0% correlation, but lots of tables are mixed-use, and so benefit from physical ordering on a carefully chosen field.. SELECT abs(correlation)::numeric(3,2) as correlation FROM pg_stats WHERE schemaname = 'foo' AND tablename = 'bar' AND attname = 'blarge'; correlation ------------- 0.84 (1 row) Obviously 84% is no need to worry, but what about 60% or 40%? Currently, I use 60%, but would like to do better. -- Death to , and butter sauce. Don't boil me, I'm still alive. lobster! --000000000000c9a4fd063e9bb841 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
(By re-cluster, I of course mean pg_repack.)

Purely OLTP tables (th= at are only accessed randomly) can of course live with 0% correlation, but = lots of tables are mixed-use, and so benefit from physical ordering on=C2= =A0a carefully=C2=A0chosen field..

SELECT abs(correlation)::numeric(3,2) as correlation
FRO= M pg_stats
WHERE schemaname =3D 'foo' AND tablename =3D 'ba= r'
=C2=A0 AND attname =3D 'blarge';
=C2=A0correlation
-------------
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 0.84
(1 row)

Obviously 84= % is no need to worry, but what about 60% or 40%? Currently, I use 60%, but= would like to do better.

--
Death to <Redacted>, and = butter sauce.
Don't boil me, I'm still alive.
<= Redacted> lobster!
--000000000000c9a4fd063e9bb841--