Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rw5Tf-00GgF9-5H for pgsql-general@arkaria.postgresql.org; Sun, 14 Apr 2024 19:25:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rw5Td-007uuE-Mw for pgsql-general@arkaria.postgresql.org; Sun, 14 Apr 2024 19:25:21 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rw5Td-007uu6-89 for pgsql-general@lists.postgresql.org; Sun, 14 Apr 2024 19:25:21 +0000 Received: from mail-ed1-x542.google.com ([2a00:1450:4864:20::542]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rw5Ta-002qgS-JT for pgsql-general@lists.postgresql.org; Sun, 14 Apr 2024 19:25:20 +0000 Received: by mail-ed1-x542.google.com with SMTP id 4fb4d7f45d1cf-56e1f3462caso2829616a12.3 for ; Sun, 14 Apr 2024 12:25:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1713122717; x=1713727517; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=/vny2Vsc2BRpGfMhImnpNGvML6oPXB/kiNh9QSCQuNo=; b=mSNA0R3euUCA3sYw/XWGjxb0+6xQFpoHVWOzpDqDXhxwjexH57f7Y78wrFzVoOQBfB O/jJK8yMa9EsYZKcvls3EEpgF2TmbNOPmTyyUIiayH2H9qs3UWZj/ClE8ke+y2cicHP4 Z8KFytVmk/AJt/ZyqBvAx7YIy96B7XahR7AZZlIm+GEQvo1KItrXBVj8sMIq2HA9+MCM 8PnKSO3kmrU3vgmLfh4UvFO5NTpcIuKwuRx61m7HwDApjvUFg0C7XHbJVAdgpWpIs2Ij h+s15D7QMACrHjCXI+1ZgGlakWLFAygfaYkr6mt5bAS5ele+mRW4i8S5OqKajGixqC+K gUXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713122717; x=1713727517; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=/vny2Vsc2BRpGfMhImnpNGvML6oPXB/kiNh9QSCQuNo=; b=XnIVzeq8UwgzROmSXSbDh47r72ANAPGR+O4TwzqRZfR2S0dmxyBHrhTAGdg4oEN0x8 nqqmZDrqaW3iaOZArH2XJjzjRAaeqatCeUCwKx7kDOBkGxZmMh0CiolWIU4J8J1iyerd x5S6bnyceUBxeMZawirsbUmLIfe3w2pycrJyzj3q1lYNPqUacrlpUiyAdn2BO2Hop9wg CIC72MfzGzinxVAXVB7f6H0kXLhGEr5GGvHCRlrC1hQqY4ffgJjPEddLRz0HvVnIYtxc FD6jNDHJ8XajvvrWDwG/TmCNXc3Jd0O5oCo4NLK6fh7pUSWiF4hwsRk6rbr+r2EoJfVo BJ/A== X-Gm-Message-State: AOJu0YxhvqQjO+4WL8IyW0ugmYc1exf7vKr2NGH2ZXLLe9PRVvR1PMUK UD8KTOws9IOOyyaq2ZaSq4NjEI188W0B9idWj4/0FDhOX8aY91aotJZ8Rv5XbxFqa9Q1l7vU97f y3XmnU9mSXRVX8x0z6EBcmNeE3MqNun7R X-Google-Smtp-Source: AGHT+IFcYcO/a0u/bTNYhIE+MxUFWhizUg5OK/6owa/U1WofAY1qdIRYlxrDmCSIij2h9EdM9UyCUDiv5p3UZEN4Dns= X-Received: by 2002:a50:8e52:0:b0:56e:2d8f:ff80 with SMTP id 18-20020a508e52000000b0056e2d8fff80mr6517750edx.5.1713122716840; Sun, 14 Apr 2024 12:25:16 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Vijaykumar Jain Date: Mon, 15 Apr 2024 00:55:06 +0530 Message-ID: Subject: Re: constant crashing To: jack Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000fe7cb5061613764f" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fe7cb5061613764f Content-Type: text/plain; charset="UTF-8" Ignore my thread, I guess there might be a bug given it segfaulted. On Mon, Apr 15, 2024, 12:48 AM Vijaykumar Jain < vijaykumarjain.github@gmail.com> wrote: > > > On Sun, 14 Apr 2024 at 21:50, jack wrote: > >> The full error reads: >> server closed the connection expectantly >> This probably means the server terminated abnormally >> before or while processing the request. >> error: connection to server was lost >> >> PostgreSQL 16.2 >> >> I also believe it is a resource issue which can be rectified with a >> setting, but which setting? >> If you were updating 100 million records what settings would you adjust? >> >> Here are the updates I am performing on the 100 million records: >> UPDATE table SET category_modified = UPPER(category); >> UPDATE table SET category_modified = >> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'), >> '\s{2,}', ' ', 'g') WHERE AND LENGTH(category_modified)>1 AND >> POSITION('--' IN category_modified)>0; >> UPDATE table SET category_modified = REPLACE(category_modified,' ','-'); >> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT >> NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END; >> UPDATE table SET category_modified = regexp_replace(category_modified, >> '-{2,}', '-', 'g'); >> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 >> FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND >> category_modified LIKE '%-'; >> >> > independent of best practices, i just want to check if there is a leak. > I created a sample table with text data and ran updates like yours and I > could not see mem growth, but I have a small vm and ofc your > category_modified field might be more complex than simple text fields for > 30-40 chars. > > can you grab the pid of your psql backend and (if you have pidstat > installed) monitor resource usage for that pid > > postgres@pg:~/udemy/16$ psql > psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1)) > Type "help" for help. > > postgres=# select pg_backend_pid(); > pg_backend_pid > ---------------- > 1214 > (1 row) > > # pidstat 2 100 -rud -h -p 1214 > (get all stats for that pid) that might help to figure out if there is a > leak or the server has other things competing for memory and your updates > were picked by the killer. > > Linux 5.15.0-101-generic (pg) 04/15/24 _x86_64_ (1 CPU) > > # Time UID PID %usr %system %guest %wait %CPU CPU > minflt/s majflt/s VSZ RSS %MEM kB_rd/s kB_wr/s kB_ccwr/s > iodelay Command > 00:40:25 113 1214 0.00 0.00 0.00 0.00 0.00 0 > 0.00 0.00 354112 220940 24.18 0.00 0.00 0.00 > 0 postgres > > # Time UID PID %usr %system %guest %wait %CPU CPU > minflt/s majflt/s VSZ RSS %MEM kB_rd/s kB_wr/s kB_ccwr/s > iodelay Command > 00:40:27 113 1214 0.00 0.00 0.00 0.00 0.00 0 > 0.00 0.00 354112 220940 24.18 0.00 0.00 0.00 > 0 postgres > .... > > ofc, if there is a genuine leak , then there might be more digging needed Finding > memory leaks in Postgres C code (enterprisedb.com) > > just kill the process requesting more mem than available Memory context: > how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com) > > > > -- > Thanks, > Vijay > LinkedIn - Vijaykumar Jain > --000000000000fe7cb5061613764f Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Ignore my thread, I guess there might be a bug given it s= egfaulted.=C2=A0

On Mon, Apr 15, 2024, 12:48 AM Vijaykumar Jain <vijaykumarjain.github@gmail.com= > wrote:


On Sun, 14 Apr 2024 at 21:50, jack <jack4pg@a= 7q.com> wrote:
The = full error reads:
server closed the connection=C2=A0expectantly
This probably means the server terminated abnorm= ally
before or while processing the request.
error: connection to serve= r was lost

PostgreSQL 16.2

I also bel= ieve it is a resource issue which can be rectified with a setting, but whic= h setting?
If you were updating 100 million records what settings would you adj= ust?

Here are the updates I am performing on the 100 million r= ecords:
UPDATE table SET category_modified =3D UPPER(categ= ory);
UPDATE table SET category_modified =3D REGEXP_= REPLACE(REPLACE(REPLACE(category_modified, '''','-'= ), '`', '-'), '\s{2,}', ' ', 'g') W= HERE =C2=A0AND LENGTH(category_modified)>1 AND POSITION('--' IN = category_modified)>0;
UPDATE table SET category_m= odified =3D REPLACE(category_modified,' ','-');
UPDATE table SET category_modified =3D CASE WHEN category_modif= ied IS NOT NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$'= , '') ELSE NULL END;
UPDATE table SET catego= ry_modified =3D regexp_replace(category_modified, '-{2,}', '-&#= 39;, 'g');
UPDATE table SET category_modifie= d =3D SUBSTRING(category_modified FROM 1 FOR LENGTH(category_modified) - 1)= WHERE LENGTH(category_modified)>1 AND category_modified LIKE '%-= 9;;


independent o= f best practices, i just want to check if there is a leak.
I crea= ted a sample table with text data and ran updates like yours and I could no= t see mem growth, but I have a small vm and ofc your=C2=A0 category_modifie= d field might be more complex than simple text fields for 30-40 chars.

can you grab the pid of your psql backend and (if you = have pidstat installed) monitor resource usage for that pid

<= /div>
postgres@pg:~/udemy/16$ psql
psql (16.2 (Ubuntu 16= .2-1.pgdg22.04+1))
Type "help" for help.

=
postgres=3D# select pg_backend_pid();=C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 pg_backe= nd_pid
----------------
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A01214
(1 row)

# pids= tat 2 100 -rud -h -p 1214=C2=A0
(get all stats for that pid) that= might help to figure out if there is a leak or the server has other things= competing for memory and your updates were picked by the killer.

Linux 5.15.0-101-generic (pg)=C2=A0 =C2=A004/15/24=C2=A0 = =C2=A0 =C2=A0 =C2=A0 _x86_64_=C2=A0 =C2=A0 =C2=A0 =C2=A0 (1 CPU)
=
# Time=C2=A0 =C2=A0 =C2=A0 =C2=A0 UID=C2=A0 =C2=A0 =C2=A0 = =C2=A0PID=C2=A0 =C2=A0 %usr %system=C2=A0 %guest=C2=A0 =C2=A0%wait=C2=A0 = =C2=A0 %CPU=C2=A0 =C2=A0CPU=C2=A0 minflt/s=C2=A0 majflt/s=C2=A0 =C2=A0 =C2= =A0VSZ=C2=A0 =C2=A0 =C2=A0RSS=C2=A0 =C2=A0%MEM=C2=A0 =C2=A0kB_rd/s=C2=A0 = =C2=A0kB_wr/s kB_ccwr/s iodelay=C2=A0 Command
00:40:25=C2=A0 =C2= =A0 =C2=A0 113=C2=A0 =C2=A0 =C2=A0 1214=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 0.00= =C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A00= =C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 354112=C2=A0 22094= 0=C2=A0 24.18=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 =C2= =A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 =C2=A00=C2=A0 postgres

# Time=C2=A0 =C2=A0 =C2=A0 =C2=A0 UID=C2=A0 =C2=A0 =C2=A0 =C2=A0PI= D=C2=A0 =C2=A0 %usr %system=C2=A0 %guest=C2=A0 =C2=A0%wait=C2=A0 =C2=A0 %CP= U=C2=A0 =C2=A0CPU=C2=A0 minflt/s=C2=A0 majflt/s=C2=A0 =C2=A0 =C2=A0VSZ=C2= =A0 =C2=A0 =C2=A0RSS=C2=A0 =C2=A0%MEM=C2=A0 =C2=A0kB_rd/s=C2=A0 =C2=A0kB_wr= /s kB_ccwr/s iodelay=C2=A0 Command
00:40:27=C2=A0 =C2=A0 =C2=A0 1= 13=C2=A0 =C2=A0 =C2=A0 1214=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 0.00=C2=A0 =C2= =A0 0.00=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A00=C2=A0 =C2= =A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 354112=C2=A0 220940=C2=A0 24= .18=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 0.00=C2=A0 =C2=A0 =C2=A0 0= .00=C2=A0 =C2=A0 =C2=A0 =C2=A00=C2=A0 postgres
....

ofc, if there is a genuine leak , then there mig= ht be more digging needed=C2=A0Finding memory leaks in Postgres C code (enterprisedb.com)=C2=A0
=
just kill the process requesting more mem than available=C2=A0=C2=A0Memory context:= how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com)


--=
Tha= nks,
Vijay
--000000000000fe7cb5061613764f--