Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vOIu7-001nyD-0n for pgsql-general@arkaria.postgresql.org; Wed, 26 Nov 2025 17:02:07 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vOIu4-000hY9-2D for pgsql-general@arkaria.postgresql.org; Wed, 26 Nov 2025 17:02:05 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vOIu4-000hY1-0u for pgsql-general@lists.postgresql.org; Wed, 26 Nov 2025 17:02:04 +0000 Received: from mail-vs1-xe2f.google.com ([2607:f8b0:4864:20::e2f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vOIu2-001bb1-12 for pgsql-general@lists.postgresql.org; Wed, 26 Nov 2025 17:02:03 +0000 Received: by mail-vs1-xe2f.google.com with SMTP id ada2fe7eead31-5dfb3297151so335880137.0 for ; Wed, 26 Nov 2025 09:02:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1764176522; x=1764781322; darn=lists.postgresql.org; h=in-reply-to:content-language:from:references:to:subject:user-agent :mime-version:date:message-id:from:to:cc:subject:date:message-id :reply-to; bh=a5anRlYfJanhrqzzXV76MF4J22q5K5nL5NAhQKXRwJ8=; b=XDLMBpRS1XU7QvKjA7m1EMW4e2uiEBzlwiQgaDCS/X2U1s+ZKQPxxtJ0DDwCUT92Ku 5H6dfat4OlIOvMtxhfJB5ysPeHfwPh3VAP8s5wSD5BJzG8MmwR/3oKERW2f7btu0UzT1 6aBA7M/CwPG8UvOkLzrGZiqc8BXKTszxNZYFUJphd8hyfgF+jA2/yLdYT2p/Sq9FYsud E31bhwWOtBu/0HgMiVKWpsLrPNlVg9AomQluqkXWvm67W6YquKBXh/6vU34IijcVC8R2 XyeycQ54cEO8wYjxJpPuSo4kQ/6/9v7SVpP78ShHtSQGLFjpVhE4rmeV4sSHZl6sxomb y0gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764176522; x=1764781322; h=in-reply-to:content-language:from:references:to:subject:user-agent :mime-version:date:message-id:x-gm-gg:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a5anRlYfJanhrqzzXV76MF4J22q5K5nL5NAhQKXRwJ8=; b=QDXefPiUA+TbdvosvCig6q0nzEZK0z9oNsvp2abB1pvJ2nbmHaJYDknpvrdaRgrZBi KUtLbsrYp19HAuwlWnMOmGX4sVJZvAwOMOTY00mgxfBp4dGiKYs5kZ6BGWZiQFie4+h4 wp6REX4sXZ2fxLGzfqYcUnE/8hH6wthKOiFehwxVE9Xm3HWyUGxwD8rDulpfwlBq0pCf N/jC0Tbdk09W4fn9Z2+CRsJ+31Ufp4GDWUoSTI//ndVmziphBotAZqkP6zChC4GvcAVp bvStpVyE8/3vfEw1zyRqA2tU4c8QRgydPMtHzAcs61nFtB1yEAdP9pwfWjUskMNcFGzL cSgg== X-Gm-Message-State: AOJu0YwIPwuP/+6d7gPJb0ux+Cwvx/DTnQT7qRdvVo7M5lFAJ30z/k5+ 3KSF9BYtbTMUikkwRcHbb2HwrkJa/+LKW20+TqrOCA5115GJoxBzjZqvF7OrfQ== X-Gm-Gg: ASbGncv59WSKWdG9hcNPy5Wsg2i6HvUaFRjPmVHJjcPEjF0aGPoTmORdc6wTkOh3Rwk JUb8vNBuSb4lWP7ZGKEs1EHcR477D4srIfQj3+6OWGkpRdirxsADfGqzOMytKqLQL0xkKF8kfd0 b3tVuLfZo5e2T5tKFLQpeiMhCH7C7adVhRCOFg0EPi36vbMtSbbnFgJM8XTRpT71uc12FKQ5lSu VxFlCsI0DveZ8mEbNRI9GdxWmeBGBMMsK/i1fQHu7bKav7qQrNsg38ZUrnhJaKt0MLXdFoQ19QS n5lO1eR/XvDodrGvUHG9aecDDM6fMd5KtF33ZTcLwDASle3ETx6euuOjsjjguSoltBLFR+CZXxl iLRVVNpbnd1EBBZQ6VhHTmTAi1ipLEDTjWR2gotewob6Wuq/1jfoFeGulTFsFdjW7D1URO1SUE1 ZMw0hC5dfW5vpoMK7VgNQJ8qREzRBI X-Google-Smtp-Source: AGHT+IFCADaPrv6v0v9llVU4IBAHlKHezRrSwYBJsbegt1nDeYfUyUeOajAQRJ30wHxEHx1HXCX65A== X-Received: by 2002:a05:6102:f96:b0:5db:cc92:26ee with SMTP id ada2fe7eead31-5e1de35cf2bmr4173720137.4.1764176509074; Wed, 26 Nov 2025 09:01:49 -0800 (PST) Received: from [100.64.100.6] ([173.239.211.154]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-8846e58ded0sm147551266d6.48.2025.11.26.09.01.48 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 26 Nov 2025 09:01:48 -0800 (PST) Content-Type: multipart/alternative; boundary="------------v4e4pDcqoHHB0dwfd55mSbrG" Message-ID: Date: Wed, 26 Nov 2025 12:01:46 -0500 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Selecting all variations of job title in a list To: pgsql-general@lists.postgresql.org References: <20251126154844.e6e3d5c7f476cc966fc201f4@wanadoo.fr> <29ffee7e-40a8-f911-4419-24ba8558f5db@appl-ecosys.com> <20251126164356.9a3b7fc25d5118e28175f9a4@wanadoo.fr> <989b6fd-2979-8ac6-13b-3916d2e47de8@appl-ecosys.com> From: Bryan Sayer Content-Language: en-US In-Reply-To: <989b6fd-2979-8ac6-13b-3916d2e47de8@appl-ecosys.com> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------v4e4pDcqoHHB0dwfd55mSbrG Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit I am not very skilled at Postgresql specifically, but when I was doing SQL in another environment I would just do select distinct (or unique) jobtitle usually getting a count of how many times each title occurred. Then I would create a mapping to standardize the the job titles. *Bryan Sayer* Retired Demographer/Statistician /In a world in which you can be anything, *be kind*/ On 11/26/2025 11:10 AM, Rich Shepard wrote: > On Wed, 26 Nov 2025, David G. Johnston wrote: > >> I was using this tool a while back when I was doing heavy regex work. >> >> https://www.regexbuddy.com/ >> >> Keep in mind the native flavor of regex in PostgreSQL is TCL, not Perl. >> >> But I’d still say regexp is not the best solution here - unless you >> encapsulate the logic in a function.  I suspect you’ll want to use this >> logic in more than just a single query and with a literal regexp you >> have >> to rely on manual synchronization.  Note, you could combine the lookup >> table with regexes.  Though beware of ensure you don’t produce duplicate >> matches if you go that route. > > David, > > Thanks, > > Rich > > --------------v4e4pDcqoHHB0dwfd55mSbrG Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

I am not very skilled at Postgresql specifically, but when I was doing SQL in another environment I would just do

select distinct (or unique) jobtitle

usually getting a count of how many times each title occurred. Then I would create a mapping to standardize the the job titles.

Bryan Sayer
Retired Demographer/Statistician
In a world in which you can be anything, be kind
On 11/26/2025 11:10 AM, Rich Shepard wrote:
On Wed, 26 Nov 2025, David G. Johnston wrote:

I was using this tool a while back when I was doing heavy regex work.

https://www.regexbuddy.com/

Keep in mind the native flavor of regex in PostgreSQL is TCL, not Perl.

But I’d still say regexp is not the best solution here - unless you
encapsulate the logic in a function.  I suspect you’ll want to use this
logic in more than just a single query and with a literal regexp you have
to rely on manual synchronization.  Note, you could combine the lookup
table with regexes.  Though beware of ensure you don’t produce duplicate
matches if you go that route.

David,

Thanks,

Rich


--------------v4e4pDcqoHHB0dwfd55mSbrG--