Message-ID:
From: "bataroland (@bataroland)"
To: "pgjdbc/pgjdbc"
Date: Wed, 29 Jan 2025 15:20:26 +0000
Subject: Re: [pgjdbc/pgjdbc] issue #3089: Metaspace Memory leak: Thread.inheritedAccessControlContext
In-Reply-To:
References:
List-Id:
X-GitHub-Author-Login: bataroland
X-GitHub-Comment-Id: 2621950441
X-GitHub-Comment-Type: issue_comment
X-GitHub-Edited-At: 2025-01-29T15:20:57Z
X-GitHub-Issue: 3089
X-GitHub-Repo: pgjdbc/pgjdbc
X-GitHub-Type: comment
X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/issues/3089#issuecomment-2621950441
Content-Type: text/plain; charset=utf-8

### Analysis of the `PostgreSQL-JDBC-Cleaner` Thread Issue in PostgreSQL JDBC on Application Servers

After extensive analysis, we identified a classloader leak related to the `PostgreSQL-JDBC-Cleaner` thread in PostgreSQL JDBC when the driver is used with application servers. This issue is similar to the one described in [this StackOverflow answer](https://stackoverflow.com/a/78171021/9004863), but in our case it occurs on a different application server (JBoss).

#### **Root Cause of the Issue**

The core problem lies in how the `PostgreSQL-JDBC-Cleaner` thread inherits its execution context from the thread that creates it. Specifically, when a new thread is created in Java, it inherits an `AccessControlContext` from its parent thread. This means that the `PostgreSQL-JDBC-Cleaner` thread may retain a reference to the `ProtectionDomain` of the web application that initially triggered its creation. Because a `ProtectionDomain` references the classloader of the code it describes, the web application’s classloader remains reachable, preventing garbage collection (GC) from reclaiming it.

In PostgreSQL JDBC, the `org.postgresql.util.LazyCleaner` class manages a background cleaner thread named **"PostgreSQL-JDBC-Cleaner"**.
When this thread is created, the following method is invoked:

```java
// JDK 8
java.lang.Thread#init(java.lang.ThreadGroup, java.lang.Runnable, java.lang.String, long, java.security.AccessControlContext, boolean)
```

This method sets an internal field called `inheritedAccessControlContext`, which holds an instance of `java.security.AccessControlContext` inherited from the creator thread. If the parent thread belongs to a web application, that application’s classloader is retained in memory through this field. Consequently, the classloader cannot be garbage collected until all `org.postgresql.util.LazyCleaner.Node` instances have been removed and cleaned.

#### **How the Issue Manifests in Application Servers**

The impact of this issue depends on when the `PostgreSQL-JDBC-Cleaner` thread is initialized:

- **Scenario 1 (Worst Case): The Cleaner Thread is created after a Web Application gets a connection**
  - If a web application is deployed and establishes a database connection via JNDI, the `PostgreSQL-JDBC-Cleaner` thread is initialized within the application’s context.
  - Since the `PostgreSQL-JDBC-Cleaner` thread holds a reference to the web application’s classloader, undeploying the application leaves a dangling reference, causing a classloader leak.
  - Repeated (re)deployments accumulate these leaked classloaders, increasing memory usage until the server eventually runs out of memory (OOM).
- **Scenario 2 (Better Case): The Cleaner Thread is created before any Web Application is deployed**
  - If a test connection is made **immediately after the application server starts** (before any web application is deployed), the `PostgreSQL-JDBC-Cleaner` thread is initialized within the application server’s context instead of a specific web application’s context.
  - In this case, subsequent web applications are not referenced by the cleaner thread, and their classloaders can be garbage collected.
  - However, connection pools automatically remove idle connections after **30 minutes**.
    If this happens and the cleaner thread exits, the next connection the application obtains from the pool may start a new `PostgreSQL-JDBC-Cleaner` thread from the web application’s context, reintroducing the classloader leak.

#### **Why the Problem Persists**

- The PostgreSQL JDBC driver is registered **globally** in the application server.
- Connection pooling exacerbates the problem:
  - If the application server manages database connections through a connection pool, the `PostgreSQL-JDBC-Cleaner` thread may outlive the applications using it.
  - If the application itself manages the connections instead of relying on the server’s connection pool, the problem does not occur.
- In a production application server, it is unlikely that **all** pooled connections would be closed. As a result, the leaked classloader remains in memory indefinitely unless the server is restarted.

#### **Steps to Reproduce the Issue**

1. Configure a **PostgreSQL JDBC datasource** in the application server.
2. Deploy a web application that retrieves the datasource via **JNDI**.
3. Execute a simple SQL statement within the application (e.g., `SELECT version();`).
4. Undeploy the web application.
5. Create a **heap dump** of the application server’s memory.
   1. If the `PostgreSQL-JDBC-Cleaner` thread was created from a thread owned by the web application’s classloader, the web application’s classloader will remain in memory and **cannot be garbage collected** unless:
      - The server is completely shut down.
      - All database connections are explicitly closed and cleaned.
   2. If the `PostgreSQL-JDBC-Cleaner` thread was created **before any web application deployment**, then, until `pgjdbc.config.cleanup.thread.ttl` expires, `inheritedAccessControlContext` references only the application server’s classloader, so no web application classloader is leaked.
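
The two outcomes in step 5 come down to which thread happens to create the cleaner first. The sketch below illustrates that dependence without a database or application server. It is not pgjdbc code: `InheritedContextDemo`, `LazyWorker`, and the thread names are hypothetical, and because `inheritedAccessControlContext` is a private field, the sketch observes the context class loader instead, which a new thread also takes from its creator at construction time. It is meant only as an analogy for the mechanism, under those assumptions.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.atomic.AtomicReference;

public class InheritedContextDemo {

    /** Hypothetical stand-in for a lazily started background cleaner (not pgjdbc code). */
    static final class LazyWorker {
        private Thread thread;

        /** Creates the worker on first use, on whichever thread happens to call first. */
        synchronized Thread ensureStarted() {
            if (thread == null) {
                // The new thread copies state from the calling thread here, at construction.
                thread = new Thread(LazyWorker::idle, "demo-cleaner");
                thread.setDaemon(true);
                thread.start();
            }
            return thread;
        }

        private static void idle() {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // Exit quietly when the demo interrupts the worker.
            }
        }
    }

    /** Returns the context class loader the worker ends up with when first used from a given context. */
    public static ClassLoader loaderInheritedWhenFirstUsedFrom(ClassLoader callerLoader) {
        LazyWorker worker = new LazyWorker();
        AtomicReference<Thread> started = new AtomicReference<>();

        // Simulates the thread that triggers the first use (web application or server).
        Thread caller = new Thread(() -> started.set(worker.ensureStarted()), "demo-caller");
        caller.setContextClassLoader(callerLoader);
        caller.start();
        try {
            caller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        Thread workerThread = started.get();
        ClassLoader inherited = workerThread.getContextClassLoader();
        workerThread.interrupt(); // stop the demo worker
        return inherited;
    }

    public static void main(String[] args) {
        // Empty loaders standing in for the server and a deployed web application.
        ClassLoader serverLoader = new URLClassLoader(new URL[0], null);
        ClassLoader webAppLoader = new URLClassLoader(new URL[0], serverLoader);

        // Case 5.1: first use comes from a web application thread.
        System.out.println(loaderInheritedWhenFirstUsedFrom(webAppLoader) == webAppLoader); // prints true

        // Case 5.2: first use comes from a server thread before any deployment.
        System.out.println(loaderInheritedWhenFirstUsedFrom(serverLoader) == serverLoader); // prints true
    }
}
```

In the real driver the retained reference is only visible in a heap dump, as described in step 5; the sketch just shows why the outcome depends on the creator thread rather than on the worker itself.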

#### **Conclusion**

The `PostgreSQL-JDBC-Cleaner` thread in PostgreSQL JDBC introduces a potential **classloader leak** when used with application servers that manage connection pooling. The issue arises because the cleaner thread inherits the `AccessControlContext` of the thread that creates it.

- If the cleaner thread is initialized **after a web application gets a connection**, the web application’s classloader remains referenced indefinitely, leading to **memory leaks**.
- If the cleaner thread is initialized **before any web application deploys**, the issue can be avoided, but idle connection removal can reintroduce it over time.

To mitigate this issue, possible workarounds include:

- Ensuring that the `PostgreSQL-JDBC-Cleaner` thread is **initialized only once, at driver initialization**, before any application is deployed.

#### Used and tested JDBC drivers

- Postgres JDBC 42.7.5
- EDB JDBC 42.7.3.2

Collab: @bodzso