Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFYuD-005J1O-0X for pgsql-bugs@arkaria.postgresql.org; Wed, 22 Apr 2026 14:50:21 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wFYuC-00E3Vc-1I for pgsql-bugs@arkaria.postgresql.org; Wed, 22 Apr 2026 14:50:20 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFYuC-00E3VT-0W for pgsql-bugs@lists.postgresql.org; Wed, 22 Apr 2026 14:50:20 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wFYu9-00000002UHo-2HwX for pgsql-bugs@lists.postgresql.org; Wed, 22 Apr 2026 14:50:19 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 63MEo9V01024191; Wed, 22 Apr 2026 10:50:10 -0400 From: Tom Lane To: Andrey Borodin cc: PostgreSQL mailing lists Subject: Re: Potential buffer overrun in spell.c's CheckAffix() In-reply-to: <959933.1776865480@sss.pgh.pa.us> References: <641711.1776792744@sss.pgh.pa.us> <959933.1776865480@sss.pgh.pa.us> Comments: In-reply-to Tom Lane message dated "Wed, 22 Apr 2026 09:44:40 -0400" MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0" Content-ID: <1024125.1776869377.0@sss.pgh.pa.us> Date: Wed, 22 Apr 2026 10:50:09 -0400 Message-ID: <1024190.1776869409@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk ------- =_aaaaaaaaaa0 Content-Type: text/plain; charset="us-ascii" Content-ID: <1024125.1776869377.1@sss.pgh.pa.us> I wrote: > I suppose we could replace the strcpy with > memcpy(newword, word, len - Affix->replen); > and then we would not need the "len >= 2 * MAXNORMLEN" test > and both paths could share the same check. There's something > to be said for that, though it would be changing the logic to > a greater extent than just "add some safety checks". Concretely, about like this, where I also tried to make the actual byte-copying steps a bit more uniform. regards, tom lane ------- =_aaaaaaaaaa0 Content-Type: text/x-diff; name*0="v2-0001-Prevent-buffer-overrun-in-spell.c-s-CheckAffix.patc"; name*1="h"; charset="us-ascii" Content-ID: <1024125.1776869377.2@sss.pgh.pa.us> Content-Description: v2-0001-Prevent-buffer-overrun-in-spell.c-s-CheckAffix.patch Content-Transfer-Encoding: quoted-printable =46rom 844bb90d49f78c44c6ed395d245ff8a500b16395 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 22 Apr 2026 10:47:56 -0400 Subject: [PATCH v2] Prevent buffer overrun in spell.c's CheckAffix(). This function writes into a caller-supplied buffer of length 2 * MAXNORMLEN, which should be plenty in real-world cases. However a malicious affix file could supply an affix long enough to overrun that. Defend by just rejecting the match if it would overrun the buffer. I also inserted a check of the input word length against Affix->replen, just to be sure we won't index off the buffer, though it would be caller error for that not to be true. Also make the actual copying steps a bit more readable, and remove an unnecessary requirement for the whole input word to fit into the output buffer (even though it always will with the current caller). The lack of documentation in this code makes my head hurt, so I also reverse-engineered a basic header comment for CheckAffix. Reported-by: Xint Code Author: Tom Lane Reviewed-by: Andrey Borodin Discussion: https://postgr.es/m/641711.1776792744@sss.pgh.pa.us Backpatch-through: 14 --- src/backend/tsearch/spell.c | 47 ++++++++++++++++++++++++++++++++----- 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c index a1bfd2a9f9b..abaa18c21c4 100644 --- a/src/backend/tsearch/spell.c +++ b/src/backend/tsearch/spell.c @@ -2065,9 +2065,32 @@ FindAffixes(AffixNode *node, const char *word, int = wrdlen, int *level, int type) return NULL; } = +/* + * Checks to see if affix applies to word, transforms word if so. + * The transformation consists of replacing Affix->replen leading or + * trailing bytes with the Affix->find string. + * + * word: input word + * len: length of input word + * Affix: affix to consider + * flagflags: context flags showing whether we are handling a compound wo= rd + * newword: output buffer (MUST be of length 2 * MAXNORMLEN) + * baselen: input/output argument + * + * If baselen isn't NULL, then *baselen is used to return the length of + * the non-changed part of the word when applying a suffix, and is used + * to detect whether the input contained only a prefix and suffix when + * later applying a prefix. + * + * Returns newword on success, or NULL if the affix can't be applied. + * On success, the modified word is stored into newword. + */ static char * CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, cha= r *newword, int *baselen) { + size_t keeplen, + findlen; + /* * Check compound allow flags */ @@ -2100,15 +2123,27 @@ CheckAffix(const char *word, size_t len, AFFIX *Af= fix, int flagflags, char *neww return NULL; } = + /* + * Protect against output buffer overrun (len < Affix->replen would be + * caller error, but check anyway) + */ + Assert(len =3D=3D strlen(word)); + if (len < Affix->replen) + return NULL; + keeplen =3D len - Affix->replen; /* how much of word we will keep */ + findlen =3D strlen(Affix->find); + if (keeplen + findlen >=3D 2 * MAXNORMLEN) + return NULL; + /* * make replace pattern of affix */ if (Affix->type =3D=3D FF_SUFFIX) { - strcpy(newword, word); - strcpy(newword + len - Affix->replen, Affix->find); + memcpy(newword, word, keeplen); + strcpy(newword + keeplen, Affix->find); if (baselen) /* store length of non-changed part of word */ - *baselen =3D len - Affix->replen; + *baselen =3D keeplen; } else { @@ -2116,10 +2151,10 @@ CheckAffix(const char *word, size_t len, AFFIX *Af= fix, int flagflags, char *neww * if prefix is an all non-changed part's length then all word * contains only prefix and suffix, so out */ - if (baselen && *baselen + strlen(Affix->find) <=3D Affix->replen) + if (baselen && *baselen + findlen <=3D Affix->replen) return NULL; - strcpy(newword, Affix->find); - strcat(newword, word + Affix->replen); + memcpy(newword, Affix->find, findlen); + strcpy(newword + findlen, word + Affix->replen); } = /* -- = 2.43.7 ------- =_aaaaaaaaaa0--