Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uaCbh-000xPl-6G for pgsql-general@arkaria.postgresql.org; Fri, 11 Jul 2025 12:12:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uaCbe-004TCn-E1 for pgsql-general@arkaria.postgresql.org; Fri, 11 Jul 2025 12:11:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uaCbe-004TCf-2w for pgsql-general@lists.postgresql.org; Fri, 11 Jul 2025 12:11:58 +0000 Received: from mail-oi1-x236.google.com ([2607:f8b0:4864:20::236]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uaCbc-006kow-39 for pgsql-general@postgresql.org; Fri, 11 Jul 2025 12:11:57 +0000 Received: by mail-oi1-x236.google.com with SMTP id 5614622812f47-407a6c6a6d4so553659b6e.1 for ; Fri, 11 Jul 2025 05:11:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1752235916; x=1752840716; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=t4ylPOp4/1IjOVYagNOFJ8ACFAk+mnfFp94hNaFELH8=; b=NBT99wgN/UbVP0Ysg/ck8db/j5wE9PC2P2wg4vW9pqyhIRf9TMDI0/7FmMnl0SX/z5 ITXE/X9CjcYAZsr3jC0i/JPTegfx6aoPvzoiwzltujI9r7383CDxwALjlMmk61TXZDUB nmVQgr2TaWAmJOA1WHLmEVjUzur4QCZ7QsZxdX5eQCjbBak5Esdcjm38bVGg6vsju9o0 Up6HxZEKtS/BHuT2kcq+GoL7YkrQ15ZvrhG1dqxDNQksx06eZnZEQvlT9n41dhB4eJru R76JXCT8PAorS4TCE1Kc46xF/4mk+9xR8yXkCldHBn/dwlCcqTgsiT7lE2CpeAzQXR1z sN7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752235916; x=1752840716; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=t4ylPOp4/1IjOVYagNOFJ8ACFAk+mnfFp94hNaFELH8=; b=bNxAyR2EjGaM/m4eMjUymcMGSFKIxcnvJ7/OUgR7W4FbFdIDfxUFHQH76hueyipyo2 k9MYI/WNV3S1x6q2ODzYLplCnLiyHM9UBbRGZzNSWeEI3RGdcwDIVj91hEJa3a6bMUSK Rnnj+AzD/zOVyz0O9gzjk/EOZ6T/7errR74+UtdjwKs0WF5ZKsATbnf06YeiRblDMDm7 UNCtiIQRLHe018TBcWAZbFE0ztzU5CK+0WkXV5A92dXMcgCuQFvVzHgpn7yqsUONAtXE ok5es3nojsONyBxdu+XsY7Y1qvhKLrpsi0YmuKpp9HSWOJ/3aSQOfmKjtXXpnQwm0U4G 1iwQ== X-Gm-Message-State: AOJu0YxJujNuTzwGiuNdZbauTwxUNxAm78YJOGUiSQqb60W3F2ybUop5 iE6bLorMyYcvpW1qIIrFLDp1wkvFxOzrU4pdNXDiEWIsOXwajduZF6+3dd6mSaF+B/rW9jP/Bn8 luJVzAClJCr8nuher33jBDDqnqSssm8cAu7hO X-Gm-Gg: ASbGncuG4mObb5wkCruNaUenVcT7IsgBg3pnGaUr5RmV8LizUcEhAQHywuQhl0LCkXW E2Y+lCVdEowL8E5tiA4dSzALq1n53/fIGqHiWfk700r1n7rl9xwkd405Dpoqpl7fTUPho2KpZbf AbL7VSqCvV8LIAsBhUqLogINPQ1MalcoFSFWtC5d+gg6ctLVaan0L8QxT1iPjgROQLaTfJB3dDv Eg/KGZt9A== X-Google-Smtp-Source: AGHT+IEyVMMb9BLKNex5Jqg3wITyZJWos0+e/H/WazKe75Xz8WDxHxSh/NWXx5xlMsuRZuzN6kTjOkrchwxL4zHK7LA= X-Received: by 2002:a05:6808:2516:b0:3f6:6d8f:1365 with SMTP id 5614622812f47-4150d74982bmr2065616b6e.3.1752235915486; Fri, 11 Jul 2025 05:11:55 -0700 (PDT) MIME-Version: 1.0 References: <7f90e1f3-7e0b-4b87-8cb6-f2862755fd3c@aklaver.com> In-Reply-To: From: Dominique Devienne Date: Fri, 11 Jul 2025 14:11:44 +0200 X-Gm-Features: Ac12FXysBIeRKPVbCClPwDxWZj4DF5w7qaerK0YMucqc551cJdiuVSFYBguQFqk Message-ID: Subject: Re: Aggregate versions of hashing functions (md5, sha1, etc...) To: Ron Johnson Cc: pgsql-general Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Jul 11, 2025 at 11:00=E2=80=AFAM Dominique Devienne wrote: > The current md5() and pgcrypto.digest() functions roll the x1 > init, xN process, and x1 finish into a single call, processing a > single bytea (or perhaps more intelligently for TOAST'ed values, the > 2K "rows" of those in streaming-fashion, hopefully. Can a dev confirm?) FWIW, I've [asked ChatGPT about that][1], and assuming it's right (md5 and pgcrypto.digest not leveraging the "substring-optimization" on TOASTED bytea), that's an unfortunate lost opportunity, especially for byteas reaching close to the 1GB limit. And again (sorry to lay it on thick...), when required to manually chunk for sizes > 1GB, the lack of aggregate is a bit crippling, I'm afraid. So again, can a dev confirm what ChatGPT blurted out? And if true, any interest in improving that for better TOAST support for true streaming hashing for current scalar digests? And of course, the main point of this thread, add (true streaming) aggregate support in a future version? Thanks, --DD [1]: https://chatgpt.com/share/6870fe03-416c-800e-8633-a76e478a794a