From: Aleksander Alekseev <[email protected]>
To: [email protected]
Cc: Tom Lane <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Subject: Re: Review - Patch for pg_bsd_indent: improve formatting of multiline comments
Date: Tue, 12 May 2026 15:15:31 +0300
Message-ID: <CAJ7c6TO0xvunpeOv89i1eKQBhKF9=GEETkTz+yAGs1xGYH25MQ@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
Hi,
> /*-----
> * limit (sum(1/i^2),i=1,inf) = pi^2/6
> * resj = sum(wi/i^2),i=1,noccurrence,
> * wi - should be sorted desc,
> * don't sort for now, just choose maximum weight.
> * This should be corrected
> * Oleg Bartunov
> */
> res = res + (wjm + resj - wjm / ((jm + 1) * (jm + 1))) / 1.64493406685;
>
> > (Not that I understand what this is trying to tell me, mind)
>
> Me either :-(
I *think* I was able to decipher it. Please find the patch attached.
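
To sanity-check that reading, here is a minimal standalone sketch of the
arithmetic described in the patch's comment. It is not PostgreSQL code: the
names raw_sum, adjusted_sum and sorted_sum and the MAX_OCC cap are made up
for illustration, and it uses double where tsrank.c uses float. It compares
the shortcut the code takes (re-weighting only the first maximum) with the
fully sorted sum the old comment seems to have had in mind.

```c
#include <assert.h>
#include <stddef.h>

/* pi^2/6, the limit of sum(1/i^2); the same constant appears in tsrank.c */
#define BASEL_LIMIT 1.64493406685
/* Hypothetical cap so the sketch can sort a copy on the stack */
#define MAX_OCC 16

/* Raw sum in stored order: sum(w[i] / (i+1)^2), i = 0..n-1 */
static double
raw_sum(const double *w, size_t n)
{
	double		resj = 0.0;

	for (size_t i = 0; i < n; i++)
		resj += w[i] / ((double) (i + 1) * (double) (i + 1));
	return resj;
}

/* The shortcut: re-weight only the first maximum as if it sat at i=1 */
static double
adjusted_sum(const double *w, size_t n)
{
	double		wjm = -1.0;
	size_t		jm = 0;

	assert(n > 0);
	for (size_t j = 0; j < n; j++)
	{
		if (w[j] > wjm)
		{
			wjm = w[j];
			jm = j;
		}
	}
	return wjm + raw_sum(w, n) - wjm / ((double) (jm + 1) * (double) (jm + 1));
}

/* The ideal: insertion-sort a copy in descending order, then sum it */
static double
sorted_sum(const double *w, size_t n)
{
	double		tmp[MAX_OCC];

	assert(n > 0 && n <= MAX_OCC);
	for (size_t i = 0; i < n; i++)
	{
		size_t		k = i;

		while (k > 0 && tmp[k - 1] < w[i])
		{
			tmp[k] = tmp[k - 1];
			k--;
		}
		tmp[k] = w[i];
	}
	return raw_sum(tmp, n);
}
```

For weights {0.1, 1.0, 0.4} the raw sum is about 0.3944, the adjusted sum
about 1.1444 and the sorted sum about 1.1111; dividing any of these by
BASEL_LIMIT gives the normalized value. The shortcut comes out a little
high in this case because the first occurrence keeps its divisor of 1 while
the maximum is also counted with divisor 1. When the weights are already in
descending order the two sums agree.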
--
Best regards,
Aleksander Alekseev
Attachments:
[text/x-patch] v1-0001-Decipher-the-comment-in-tsrank.c.patch (2.0K, 2-v1-0001-Decipher-the-comment-in-tsrank.c.patch)
From 0e8345863eabb9ada61fbe3c9c03f393682a2615 Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <[email protected]>
Date: Tue, 12 May 2026 15:09:36 +0300
Subject: [PATCH v1] Decipher the comment in tsrank.c
---
src/backend/utils/adt/tsrank.c | 34 +++++++++++++++++++++++++++-------
1 file changed, 27 insertions(+), 7 deletions(-)
diff --git a/src/backend/utils/adt/tsrank.c b/src/backend/utils/adt/tsrank.c
index d35e5528d0a..383ad393971 100644
--- a/src/backend/utils/adt/tsrank.c
+++ b/src/backend/utils/adt/tsrank.c
@@ -336,13 +336,33 @@ calc_rank_or(const float *w, TSVector t, TSQuery q)
jm = j;
}
}
-/*
- limit (sum(1/i^2),i=1,inf) = pi^2/6
- resj = sum(wi/i^2),i=1,noccurrence,
- wi - should be sorted desc,
- don't sort for now, just choose maximum weight. This should be corrected
- Oleg Bartunov
-*/
+
+ /*
+ * The ideal score for a term is a weighted sum of inverse squares:
+ *
+ * resj = sum(wi / i^2, i = 1..noccurrences)
+ *
+ * where wi is the weight of the i-th occurrence and weights are
+ * sorted in descending order so that the highest-weight
+ * occurrence gets the smallest divisor (i=1) and thus contributes
+ * the most.
+ *
+ * The result is divided by pi^2/6 ~= 1.64493406685, which is the
+ * limit of sum(1/i^2, i=1..inf). For weights in [0, 1] this keeps
+ * the ideal (sorted) score within the [0, 1] range.
+ *
+ * As an approximation for efficiency, we skip the sort and
+ * instead only promote the single highest-weight occurrence to
+ * position i=1. This is done by taking the raw (unsorted) sum
+ * resj, subtracting the maximum weight's actual contribution
+ * wjm/(jm+1)^2, and adding back its corrected contribution
+ * wjm/1^2 = wjm:
+ *
+ * adjusted = resj - wjm/(jm+1)^2 + wjm = wjm + resj -
+ * wjm/(jm+1)^2
+ *
+ * The remaining occurrences keep their original positions and divisors.
+ */
res = res + (wjm + resj - wjm / ((jm + 1) * (jm + 1))) / 1.64493406685;
entry++;
--
2.43.0