public inbox for [email protected]  
From: Jacqui Caren <[email protected]>
To: Bryan Sayer <[email protected]>
Cc: [email protected]
Subject: Re: Selecting all variations of job title in a list
Date: Fri, 28 Nov 2025 10:19:43 +0000
Message-ID: <CABpmC2+wz6HALK8QzzkfZ+Z80bYdbL889=4KKMWd2ew_b4e4FQ@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAKFQuwY2W1WB2qnMuhJ7=GD_rgNmm+N=a7_ZHEtuudw=qajJwg@mail.gmail.com>
	<[email protected]>
	<[email protected]>

Many years ago I used a weak precedence engine (WPE) to map this form of
job title text to a job code.

The regexp did not work because we had titles such as "asst to gen mgr".
The WPE tokenized the words, then replaced token patterns with other token
codes, accumulating context as it went.
The final token reduction resulted in either a final code or, as in my
example, a modifier applied to a role code (asst to xxxx role).
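
The tokenize-then-reduce idea above can be sketched roughly as follows.
This is a simplified shift-reduce illustration in Python, not the original
Oracle engine and not a full weak precedence parser. The lexicon, the token
codes, the rules, and the output code format (for example "ASST_TO:GEN_MGR")
are all hypothetical and chosen only to show why "asst to gen mgr" needs
context that a single regexp handles poorly.

```python
import re

# Hypothetical lexicon mapping raw words to token codes (illustrative only).
LEXICON = {
    "asst": "ASST", "assistant": "ASST",
    "to": "TO",
    "gen": "GEN", "general": "GEN",
    "mgr": "MGR", "manager": "MGR",
}

# Hypothetical reduction rules: (pattern of item kinds, replacement builder).
# Stack items are (kind, value) pairs. Rules are tried in order, so the
# two-token "GEN MGR" rule is listed before the bare "MGR" rule.
RULES = [
    (("ASST", "TO", "ROLE"), lambda m: ("JOB", "ASST_TO:" + m[2][1])),
    (("GEN", "MGR"), lambda m: ("ROLE", "GEN_MGR")),
    (("ASST", "ROLE"), lambda m: ("ROLE", "ASST_" + m[1][1])),
    (("MGR",), lambda m: ("ROLE", "MGR")),
]


def tokenize(title):
    """Split a raw title into (kind, word) items; unknown words get kind UNK."""
    words = re.findall(r"[a-z]+", title.lower())
    return [(LEXICON.get(w, "UNK"), w) for w in words]


def reduce_title(title):
    """Shift tokens onto a stack, reducing the top whenever a rule matches."""
    stack = []
    for item in tokenize(title):
        stack.append(item)
        reduced = True
        while reduced:
            reduced = False
            for pattern, build in RULES:
                n = len(pattern)
                tail = stack[-n:]
                kinds = tuple(kind for kind, _ in tail)
                if len(stack) >= n and kinds == pattern:
                    # Replace the matched tail with a single reduced item.
                    stack[-n:] = [build(tail)]
                    reduced = True
                    break
    # Classified only if everything collapsed into one role or job code.
    if len(stack) == 1 and stack[0][0] in ("ROLE", "JOB"):
        return stack[0][1]
    return None
```

With these assumed rules, reduce_title("asst to gen mgr") gives
"ASST_TO:GEN_MGR" (a modifier on a role code), reduce_title("asst gen mgr")
gives "ASST_GEN_MGR", and a title made of unknown words returns None so it
can be reviewed by hand.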

The entire engine was created in Oracle but would be easy to implement in
pgsql. Back then neural nets were only just appearing in finance and LLMs
were nonexistent.

Old 1960s tech saved the day.

On Wed, Nov 26, 2025, 17:02 Bryan Sayer <[email protected]> wrote:

> I am not very skilled at Postgresql specifically, but when I was doing SQL
> in another environment I would just do
>
> select distinct (or unique) jobtitle
>
> usually getting a count of how many times each title occurred. Then I
> would create a mapping to standardize the job titles.
> *Bryan Sayer*
> Retired Demographer/Statistician
> *In a world in which you can be anything, be kind*
> On 11/26/2025 11:10 AM, Rich Shepard wrote:
>
> On Wed, 26 Nov 2025, David G. Johnston wrote:
>
> I was using this tool a while back when I was doing heavy regex work.
>
> https://www.regexbuddy.com/
>
> Keep in mind the native flavor of regex in PostgreSQL is Tcl, not Perl.
>
> But I’d still say regexp is not the best solution here - unless you
> encapsulate the logic in a function.  I suspect you’ll want to use this
> logic in more than just a single query and with a literal regexp you have
> to rely on manual synchronization.  Note, you could combine the lookup
> table with regexes.  Though be careful to ensure you don’t produce duplicate
> matches if you go that route.
>
>
> David,
>
> Thanks,
>
> Rich
>
>
>

