Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ua9cz-00H59u-Le for pgsql-general@arkaria.postgresql.org; Fri, 11 Jul 2025 09:01:09 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ua9cw-002ztB-Dh for pgsql-general@arkaria.postgresql.org; Fri, 11 Jul 2025 09:01:07 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ua9cw-002zt3-2u for pgsql-general@lists.postgresql.org; Fri, 11 Jul 2025 09:01:06 +0000 Received: from mail-oi1-x22d.google.com ([2607:f8b0:4864:20::22d]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ua9cu-006zaC-1e for pgsql-general@postgresql.org; Fri, 11 Jul 2025 09:01:06 +0000 Received: by mail-oi1-x22d.google.com with SMTP id 5614622812f47-40ba2ad38ddso1007832b6e.0 for ; Fri, 11 Jul 2025 02:01:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1752224462; x=1752829262; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=w1JyjF/dbNVa2+eMbUP4Q3UEXCLtiaG1ahlp1DsQwvw=; b=fmAsnpNVATKVpnr7XJwv5Bg7+7TTvG6/848tGbg2teauWl77InGn3lChf80dlWSwH4 krNoBnIyZuFP5SYnywJOA4rYwcwEHVT6zbMvG54hBCH+Oqa4H0hpTK/BibE5SPX69OZf aU4oA3iJqqC4fjpoRjiQtI49QoJA8Olsq23RgR5/AbJ4tQiHm5/l6nzA3nWUYfRS1zyk GnFbEvQ0zvQJPK2TirjwdxfRr7YZ/b2nTwHabGRtxbZ66CsBZJxxMy+X8X6FPLSRNIvv QgqALfx+VHkcJd2iye19grvFV05J6yA5STOR/h/++7D+bpqa49oBNm0EL+XujtVD47b8 iOLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752224462; x=1752829262; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=w1JyjF/dbNVa2+eMbUP4Q3UEXCLtiaG1ahlp1DsQwvw=; b=ZEo5D9jgtG1Io1TwnoNeqBPSrUnQM/3YwSYVjVGxBkAYQa4DQ9rBAs98i44xfpFFAF NHZBw7jgLJNSAgzzH7xF+E+lzQqoDr/VqtgDUKyB+j8L5zPLkts5f4RN1Bq9gc34HT2I jY832Q3Wwb/Sds31CpIXqeJund7gAnyUxHcTmNicNKoFOy5Fhwty0HOspQPXKOSLTMw2 hJUebpCQ24fpUDOm0GtJ2GqYkEzfvVeI/2XH2dS6egIEmltfWsOSQ5yFxrwjBBVmwVuw mHq74ncQnD3veLHsEY2OaxDRW1w2OyEc+YL3E9QiEQvwL4AIYlt12e8JzB+adMz/MLi5 7bfw== X-Gm-Message-State: AOJu0Yz8dQ874LhOQZUrWVXUR8ziuqZM3fIb1TugHDGZtBmJPYu66nNr YrpQFC2zhtpMwoq3AVWf6XnjKwu7bmOezxPq7QydOZwDXtP5CZ72XzXAvbYTyzJxR2JONvN2irV 16AoriMa7k3LCEYYKfBAjUqzjXBVsWls= X-Gm-Gg: ASbGncsQ06outVK0G+8p5nBCdZfmd6yv/6Ae0gyj+LEpG7eVrIGKYIOa+DQKCt101d0 ASXIb8e9Jj8iFn5XRndbFk+ntcgB4or9Y7r8kVLeOAmQ1ddbcYEAjBue1587T7PJo17nFQyhjKH okllc26AqK+Y+UicvU+q2vHMqgODj5pS7V4STa50M1F4rmQu8HTtkTPFYLghgF1DtTWaGSKX6ow Jhk7dPYYQ== X-Google-Smtp-Source: AGHT+IHwvUgp3xph4nX4/3vn0h7VMNcAfTb1FuHU6okypdKXoSL0nWseaLVgw9q2xQOLD+vkBf2/i8sRwkzk7ts5gFI= X-Received: by 2002:a05:6808:5246:b0:3f8:3489:d93d with SMTP id 5614622812f47-415397b8e8bmr1319918b6e.25.1752224462345; Fri, 11 Jul 2025 02:01:02 -0700 (PDT) MIME-Version: 1.0 References: <7f90e1f3-7e0b-4b87-8cb6-f2862755fd3c@aklaver.com> In-Reply-To: From: Dominique Devienne Date: Fri, 11 Jul 2025 11:00:49 +0200 X-Gm-Features: Ac12FXw8_OrAo9VGWOi225teEGgvx407PGpToQ0PEIqdhkJP7f2LJ-4lHjZiXls Message-ID: Subject: Re: Aggregate versions of hashing functions (md5, sha1, etc...) To: Ron Johnson Cc: pgsql-general Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Jul 10, 2025 at 7:11=E2=80=AFPM Ron Johnson wrote: > On Thu, Jul 10, 2025 at 12:26=E2=80=AFPM Adrian Klaver wrote: >> On 7/10/25 04:48, Dominique Devienne wrote: >> > Seems so logical to me, that these hashing functions were available >> > are aggregates, I can't be the first one to think of that, can it? >> >> I've been on this list since late 2002 and I don't recall this ever >> being brought up. Now it is entirely possible that age has dimmed my >> recall abilities:) Though a quick search seems to confirm my memory. > > What even is an aggregate hash function? (I can imagine a few possibilit= ies, but don't want to assume.) Well, it's so obvious to me, I wonder if you're baiting me :) Any hasher/digest inits some internal state, processes bytes, typically in "streaming-fashion" via successive byte spans (equivalent to PostgreSQL's bytea), and yields a digest of various length at the end. The current md5() and pgcrypto.digest() functions roll the x1 init, xN process, and x1 finish into a single call, processing a single bytea (or perhaps more intelligently for TOAST'ed values, the 2K "rows" of those in streaming-fashion, hopefully. Can a dev confirm?). As an aggregate, the processing is extended to all values aggregated. That's it. Obviously order-sensitive, so an explicit ORDER BY inside the aggregate call is DE RIGEUR, but that's normal. As I mentioned already, SQLite supports sha3_agg() for almost a year, and had sha(1|3)_query, which is conceptually similar (although hashes value types too, since multi-column and dynamically typed), for years (8+ for sha3, probably decades for sha1). Basically anyone who knows hashing/digests and has ever written an aggregate UDF (in SQLite or elsewhere), understands what I'm talking about. --DD