public inbox for [email protected]
* CLOSE_WAIT pileup and Application Timeout
@ 2024-10-04 04:29 KK CHN <[email protected]>
2024-10-04 11:41 ` Re: CLOSE_WAIT pileup and Application Timeout Francesco Benetton <[email protected]>
2024-10-06 18:37 ` Re: CLOSE_WAIT pileup and Application Timeout Alvaro Herrera <[email protected]>
0 siblings, 2 replies; 5+ messages in thread
From: KK CHN @ 2024-10-04 04:29 UTC (permalink / raw)
To: pgsql-general
List,
I am facing a network issue with TCP/IP connection closing.
We run an Android tablet application that updates the status of a fleet
of around 1000 vehicles. The app is installed on a tablet in each vehicle
and reports to a vehicle tracking server application based on Java and
Wildfly, with a PostgreSQL 16 backend.
The tablet fitted inside each vehicle runs the Android-based tracking app,
which sends its location (lat/long coordinates) every 30 seconds to the
PostgreSQL DB through the Java backend application. This gives the latest
location of the vehicle and its movement, which is rendered in a
map-based front end.
The vehicles in the field communicate via port 443 to port 8080 of Wildfly
(version 27), where the vehicle tracking application, developed in
Java (version 17), is deployed.
*The tablets communicate with the backend application over mobile
data (4G/5G SIMs).*
Moving vehicles may disconnect, or be unable to send location data for a
while, in areas where mobile data coverage is absent or the signal is
weak.
After about a week of running, the server hosting the backend application
starts showing connection timeouts and can no longer serve vehicle
tracking.
When we restart the Wildfly server the application returns to normal, but
the issue repeats after a week or two.
When this bottleneck occurs and the backend becomes unresponsive, I see a
lot of TCP connections in the CLOSE_WAIT state (3000 to 5000) on the
server machine.
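To see how the CLOSE_WAIT count grows between restarts, a small monitor could read the kernel's socket table. The sketch below is an illustration only, assuming a Linux server where /proc/net/tcp and /proc/net/tcp6 are readable; the class name CloseWaitCounter is hypothetical. It relies on the fourth column of those files holding the TCP state in hex, where 08 means CLOSE_WAIT.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class CloseWaitCounter {
    // In /proc/net/tcp the fourth column is the socket state in hex; 08 is CLOSE_WAIT.
    static final String CLOSE_WAIT_HEX = "08";

    /** Counts the lines (in /proc/net/tcp format) whose state column is CLOSE_WAIT. */
    static long countCloseWait(List<String> procNetTcpLines) {
        return procNetTcpLines.stream()
                .map(String::trim)
                .map(line -> line.split("\\s+"))
                // The header row has "st" in this column, so it is skipped naturally.
                .filter(f -> f.length > 3 && f[3].equalsIgnoreCase(CLOSE_WAIT_HEX))
                .count();
    }

    public static void main(String[] args) throws IOException {
        long total = 0;
        // Assumption: both IPv4 and IPv6 tables are of interest.
        for (String file : new String[] {"/proc/net/tcp", "/proc/net/tcp6"}) {
            Path p = Path.of(file);
            if (Files.isReadable(p)) {
                total += countCloseWait(Files.readAllLines(p));
            }
        }
        System.out.println("CLOSE_WAIT sockets: " + total);
    }
}
```

Running this periodically (for example from cron) and logging the number would show whether the count climbs steadily or jumps when coverage drops.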
What is the root cause of this issue? Is it because the Android
application is unable to send the CLOSE_WAIT ACK due to poor mobile data
connectivity?
If so, how do people address this issue, and what may be a fix?
Any directions or reference material are most welcome.
Thank you,
Krishane
* Re: CLOSE_WAIT pileup and Application Timeout
From: Francesco Benetton @ 2024-10-04 11:41 UTC (permalink / raw)
To: KK CHN <[email protected]>; +Cc: pgsql-general
If I understand correctly, PostgreSQL is used as the data server for the
backend, so the Android app does not connect directly to PostgreSQL.
My first guess is that the backend is not closing or recycling connections
after executing a request. Maybe the client connection pooling settings
are wrong?
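For context, a socket sits in CLOSE_WAIT when the remote side has closed its end but the local application has not yet called close() on its own socket. A minimal sketch of the "close once the peer is done" pattern with plain JDK sockets follows. It is not the Wildfly or connection pool code from this thread, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseOnEof {
    /**
     * Accepts one connection, reads until the peer closes, then closes the
     * local socket. Returns the number of bytes read before end of stream.
     */
    static int drainAndClose(ServerSocket server) throws IOException {
        int total = 0;
        // try-with-resources guarantees close() even if read() throws,
        // so the socket cannot linger in CLOSE_WAIT.
        try (Socket peer = server.accept();
             InputStream in = peer.getInputStream()) {
            byte[] buf = new byte[1024];
            int n;
            // read() returns -1 once the peer's FIN has been received.
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Hypothetical client: send 4 bytes, then close (which sends FIN).
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.getOutputStream().write(new byte[] {1, 2, 3, 4});
            }
            System.out.println("bytes read before EOF: " + drainAndClose(server));
        }
    }
}
```

If the server side skipped the close (for instance by leaking the Socket from a handler), each finished client would leave one CLOSE_WAIT entry behind until the process exits.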
* Re: CLOSE_WAIT pileup and Application Timeout
From: Alvaro Herrera @ 2024-10-06 18:37 UTC (permalink / raw)
To: KK CHN <[email protected]>; +Cc: pgsql-general
On 2024-Oct-04, KK CHN wrote:
> The mobile tablets are installed with the android based vehicle
> tracking app which updated every 30 seconds its location fitted inside the
> vehicle ( lat long coordinates) to the PostgreSQL DB through the java
> backend application to know the latest location of the vehicle and its
> movement which will be rendered in a map based front end.
>
> The vehicles on the field communicate via 443 to 8080 of the Wildfly
> (version 27 ) deployed with the vehicle tracking application developed with
> Java(version 17).
It sounds like setting TCP keepalives in the connections between the
Wildfly and the vehicles might help get the number of dead connections
down to a reasonable level. Then it's up to Wildfly to close the
connections to Postgres in a timely fashion. (It's not clear from your
description how do vehicle connections to Wildfly relate to Postgres
connections.)
I wonder if the connections from Wildfly to Postgres use SSL? Because
there are reported cases where TCP connections are kept and accumulate,
causing problems -- but apparently SSL is a necessary piece for that to
happen.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)
* Re: CLOSE_WAIT pileup and Application Timeout
From: KK CHN @ 2024-10-07 11:30 UTC (permalink / raw)
To: Alvaro Herrera <[email protected]>; +Cc: pgsql-general
On Mon, Oct 7, 2024 at 12:07 AM Alvaro Herrera <[email protected]>
wrote:
> On 2024-Oct-04, KK CHN wrote:
>
> > The mobile tablets are installed with the android based vehicle
> > tracking app which updated every 30 seconds its location fitted inside the
> > vehicle ( lat long coordinates) to the PostgreSQL DB through the java
> > backend application to know the latest location of the vehicle and its
> > movement which will be rendered in a map based front end.
> >
> > The vehicles on the field communicate via 443 to 8080 of the Wildfly
> > (version 27 ) deployed with the vehicle tracking application developed with
> > Java(version 17).
>
> It sounds like setting TCP keepalives in the connections between the
> Wildfly and the vehicles might help get the number of dead connections
> down to a reasonable level. Then it's up to Wildfly to close the
> connections to Postgres in a timely fashion. (It's not clear from your
> description how do vehicle connections to Wildfly relate to Postgres
> connections.)
>
>
Where do I have to introduce the TCP keepalives: at the OS level or in
the application code?
[root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
[root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
[root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
These are the default values at the OS level. Do I need to reduce all
three values, to say 600, 20 and 5? Or does this need to be handled in
the application backend code?
Any hints much appreciated.
>
> I wonder if the connections from Wildfly to Postgres use SSL? Because
> there are reported cases where TCP connections are kept and accumulate,
> causing problems -- but apparently SSL is a necessary piece for that to
> happen.
>
There is no SSL between Wildfly (8080) and PostgreSQL (5432). Both
machines are VMs on the same internal LAN. Only the devices in the field
(fitted in the vehicles) communicate with the application backend via a
public URL on port 443, which then connects to port 8080 of Wildfly; the
Java code then connects to the database server running on port 5432 on
the internal LAN.
* Re: CLOSE_WAIT pileup and Application Timeout
From: Alvaro Herrera @ 2024-10-07 14:31 UTC (permalink / raw)
To: KK CHN <[email protected]>; +Cc: pgsql-general
On 2024-Oct-07, KK CHN wrote:
> On Mon, Oct 7, 2024 at 12:07 AM Alvaro Herrera <[email protected]>
> wrote:
> Where do I have to introduce the TCP keepalives ? in the OS level or
> application code level ?
>
> [root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_time
> 7200
> [root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
> 75
> [root@dbch wildfly-27.0.0.Final]# cat /proc/sys/net/ipv4/tcp_keepalive_probes
> 9
>
> These are the default values in the OS level. Do I need to reduce all the
> above three values to say 600, 20, 5 ? Or need to be handled in the
> application backend code ?
My understanding is that these values have no effect unless the socket gets
    setsockopt( ... , SO_KEEPALIVE, ...)
So that's definitely something that the app needs to do -- it's not
enabled automatically.
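As an illustration of what "the app needs to do", here is a minimal Java sketch that turns on SO_KEEPALIVE for one socket and, where the platform supports it, overrides the three timers per socket instead of system-wide. The values 600, 20 and 5 are simply the ones proposed earlier in the thread, not a recommendation. The class and method names are hypothetical, and this assumes code that owns the Socket object directly; in a Wildfly deployment the vehicle-facing sockets are managed by the server, so the equivalent setting would have to be made in the server's listener configuration rather than like this.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketOption;
import java.net.StandardSocketOptions;
import jdk.net.ExtendedSocketOptions;

public class KeepaliveConfig {
    /** Enables keepalive and, if supported, sets per-socket idle/interval/probe values. */
    static void enableKeepalive(Socket s, int idleSec, int intervalSec, int probes)
            throws IOException {
        // Equivalent of setsockopt(..., SO_KEEPALIVE, ...): without this the
        // kernel's tcp_keepalive_* settings are not applied to the socket.
        s.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
        // Per-socket overrides of tcp_keepalive_time / _intvl / _probes.
        setIfSupported(s, ExtendedSocketOptions.TCP_KEEPIDLE, idleSec);
        setIfSupported(s, ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSec);
        setIfSupported(s, ExtendedSocketOptions.TCP_KEEPCOUNT, probes);
    }

    // The extended options are platform-specific, so check before setting them.
    private static <T> void setIfSupported(Socket s, SocketOption<T> opt, T value)
            throws IOException {
        if (s.supportedOptions().contains(opt)) {
            s.setOption(opt, value);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = new Socket(server.getInetAddress(), server.getLocalPort())) {
            enableKeepalive(s, 600, 20, 5);
            System.out.println("SO_KEEPALIVE enabled: " + s.getKeepAlive());
        }
    }
}
```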
With these default settings, the connection would be closed about
2 hours 11 minutes after going quiet (7200 s idle, then 9 probes 75 s
apart), so if your problem manifests only a week later, you would have
enough time for these to be cleaned up. But of course you should monitor
what happens.
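The time estimate above can be reproduced with a little arithmetic: the idle time before the first probe, plus one probe interval for each unanswered probe. A small sketch, with hypothetical class and method names, that applies this to both the OS defaults and the 600/20/5 values proposed earlier in the thread:

```java
import java.time.Duration;

public class KeepaliveMath {
    /** Idle time before the first probe plus one interval per unanswered probe. */
    static Duration deadPeerDetection(long idleSec, long intervalSec, long probes) {
        return Duration.ofSeconds(idleSec + intervalSec * probes);
    }

    static String format(Duration d) {
        return d.toHours() + " h " + d.toMinutesPart() + " min " + d.toSecondsPart() + " s";
    }

    public static void main(String[] args) {
        // tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes
        System.out.println("defaults (7200, 75, 9): " + format(deadPeerDetection(7200, 75, 9)));
        System.out.println("proposed (600, 20, 5):  " + format(deadPeerDetection(600, 20, 5)));
    }
}
```

The defaults come to 7875 s (2 h 11 min 15 s); the proposed values would come to 700 s (11 min 40 s).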
> > I wonder if the connections from Wildfly to Postgres use SSL? Because
> > there are reported cases where TCP connections are kept and accumulate,
> > causing problems -- but apparently SSL is a necessary piece for that to
> > happen.
> >
> No SSL in between Wildfly (8080 ) to PGSQL(5432).
Okay, that's unlikely to be relevant then.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Linux transformed my computer from a 'machine for doing things' into a
truly entertaining device, about which I learn something new every day"
(Jaime Salinas)