Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZpmN-008ub6-90 for pgsql-general@arkaria.postgresql.org; Thu, 10 Jul 2025 11:49:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uZplN-00BgAU-AC for pgsql-general@arkaria.postgresql.org; Thu, 10 Jul 2025 11:48:29 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uZplM-00BgAM-Vl for pgsql-general@lists.postgresql.org; Thu, 10 Jul 2025 11:48:29 +0000 Received: from mail-oi1-x22a.google.com ([2607:f8b0:4864:20::22a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uZplM-006Z9I-0G for pgsql-general@lists.postgresql.org; Thu, 10 Jul 2025 11:48:28 +0000 Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-40b99490728so951466b6e.0 for ; Thu, 10 Jul 2025 04:48:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1752148107; x=1752752907; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=EXYCsQjBH6M0CFXRSLsF/dFqxHK/HrsCuzZha/4FWIk=; b=l1HUjfyFt4DBw23qaJKOB6tvRYR2O960AZTiZDS3KBs28+Tqj0NCi7IpS57Fen/FmT 4NJD6snzEeMjY6KYy2xKg+ftw8IYpYLdMLzErGcqvTy/OVLkvSam4MIxIYIwMTjvYxhK rUQL4/ZJsua0pN/vW9EytjUGWAhGKucKIH9Ol5EVyvaW65pMHosIfcmx3MKf2kVbS6nj U50nbmJgMgkwUozRU5V03+Wi9EzqxXPgVTPAP//1OnFc1XgCF8v3u14lYBgPPv9qO3+i gsI53tIRFhMeSSjVn++4c2hOHdKMnO0dOj3cD6jZKijarkdziB7w2hyu2gI2wcF1kNsQ By7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752148107; x=1752752907; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=EXYCsQjBH6M0CFXRSLsF/dFqxHK/HrsCuzZha/4FWIk=; b=pL5HtcGk5sd1CqgIAwUyYoRa/U1iolk6ozcmbR/MZpEm3Fih1OJW0fa5WTnlZhzl1e zU2quxvHBZ2Rq7ixBz80zHnynpfII7qOsTKH0Btl/NElBBY5z0uFK4lC6Mm2vlGLdSgp uSqvZc44PS7uKO3yoMBX+5Ls+hate/yVHMB9aBhTB3VTXTnBfR1DUDZtEz9p/wcv+cIB +eFi5gHdN+L2+qVqqV2bRyj1ewVMf4rlq4MgGbLVBU3p1O/I/CJXXEs6HznW7F4t2cr9 wPZIAlgq8KhHZ+/ycM1SGCgplJRyDz26cjI1DVCDdl/Uqw80JHYEdiTmizojiV0aelre +R5w== X-Gm-Message-State: AOJu0Yy69z6aM4ZOW4b0f0c/UJkJGR4VdKg/bK2tZUJ5W9zExpL2DA6n VXEE+5BDD1L9aRF6WDK2+3ZvSpXXhMKmmVFUNUYstktzwxZHNZoael4dStXaSaCZJjbAebH5KIe fArvVR1q00GYrPzzETkuOpvf3tV43Mhmtag== X-Gm-Gg: ASbGncuQ74h/Fe0vhvnItGQaKRM9TQSchPDjiDGIQeqU9MwBZ0/g3Nbec1JEGepS2nH ZgdJIECkBVXDNk76voFY5xcg21QKBEw2oripa5Oa9uhEgKJE/kD0oARkF/yXrFGhlrie6+hwnEd L8riZWY3CEIuWn7HNxL7xINHEih92ApOW2rNg/smchFjriSQ== X-Google-Smtp-Source: AGHT+IGq/mAjLq7F5787bnN4sThK8fRQnO7RjWZMcDQe3qOf6/QeDv6+T2j6LQvS+GljNLaS1BVkZsBufpJ91+dgVwE= X-Received: by 2002:a05:6808:690a:b0:40c:fe45:2ab5 with SMTP id 5614622812f47-413c2bf929fmr2369976b6e.1.1752148106875; Thu, 10 Jul 2025 04:48:26 -0700 (PDT) MIME-Version: 1.0 From: Dominique Devienne Date: Thu, 10 Jul 2025 13:48:15 +0200 X-Gm-Features: Ac12FXzRKtYCzgO3CGXKDuYOXnRKjfmQU7Rmo59BHvAyOMJGgtpTTf0v10mG2qs Message-ID: Subject: Aggregate versions of hashing functions (md5, sha1, etc...) To: pgsql-general@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk We store scientific information in PostgreSQL, and some of that is bytea and large, thus we must "chunk it" both for performance, and not be limited to 1GB (we do exceed that, in rare occasions). Recently I added md5/sha1 hashing support for such values (for various reasons, to track corruptions in our ETLs, now fixed, but also in the future for custom smart sync and such), and was surprised to discover there are no aggregate versions of those functions, neither the built-in md5 one (now with bytea overload), nor for pgcrypto. Did I miss something? Any chance this might be added in the future? Seems so logical to me, that these hashing functions were available are aggregates, I can't be the first one to think of that, can it? Thanks, --DD PS: The holly-grail IMHO, would be built-in support for hashing, with intelligent lazy-compute and persistence correctly reset on changes. Probably opt-in. PPS: Built-in Support for xxh64, or at least in official pgcrypto, would also be nice. With aggregate! PPPS: I remember Oracle LOBs supporting a built-in implicit mime-type attribute on them, which was used as the content-type with served over the built-in WebDAV extension for the DB. I wish PostgreSQL had something like that too. PPPPS: the lo extension is not viable for us, FWIW. All the above should be opt-in on bytea columns IMHO.