
Possible Memory Leak in StatusLogger’s ConcurrentLinkedQueue #3511

Open

ByteExceptionM opened this issue Mar 2, 2025 · 6 comments

Labels: api (Affects the public API), waiting-for-user (More information is needed from the user)

@ByteExceptionM

Description

Log4j 2's StatusLogger$BoundedQueue is causing a memory leak: millions of ConcurrentLinkedQueue$Node instances are being retained, leading to extreme memory usage.

A heap dump analysis reveals:

  • 210,815,632 ConcurrentLinkedQueue$Node instances consuming 4.7 GB of RAM.
  • 105,409,426 ConcurrentLinkedQueue instances consuming 2.4 GB of RAM.

[Screenshots: heap dump histograms showing the ConcurrentLinkedQueue$Node and ConcurrentLinkedQueue instance counts listed above]

This issue occurs on a single node running Log4j 2.22.1 and has only started happening recently, without any noticeable log warnings or errors.

Configuration

  • Version: Log4j 2.22.1
  • Operating system: Private Docker image, based on itzg/minecraft-server
  • JDK: JDK 21.0.6+7
  • Container environment: Running within a Docker container

Logs

There are no noticeable errors or stack traces. The issue was identified through heap dump analysis, which shows that StatusLogger$BoundedQueue is retaining a massive number of ConcurrentLinkedQueue$Node instances, leading to a memory leak of over 4.9 GB.

Reproduction

The exact trigger is unknown, but the issue was observed on a high-load Minecraft server using Log4j 2.22.1. To reproduce:

  1. Start a Minecraft server using Log4j 2.22.1.
  2. Run the server under normal operational load (many background log events).
  3. Leave it running for several hours or days.
  4. Generate a heap dump and analyze memory usage (a sketch for triggering a dump from inside the JVM follows this list).
  5. If affected, StatusLogger$BoundedQueue will hold millions of ConcurrentLinkedQueue$Node instances, consuming several gigabytes of RAM.
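
For step 4, the dump can be captured with external tooling (e.g. jcmd <pid> GC.heap_dump <file> or jmap). As a rough, HotSpot-specific sketch, it can also be triggered from inside the JVM via the platform diagnostic MXBean; the output path below is just an example:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Sketch: trigger a heap dump programmatically (HotSpot JVMs only).
// The target file must not already exist, otherwise dumpHeap throws IOException.
public final class HeapDumper {
    public static void main(String[] args) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // "true" restricts the dump to live (reachable) objects.
        diagnostics.dumpHeap("/data/heap-" + System.currentTimeMillis() + ".hprof", true);
    }
}
```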

I found that Log4j2 uses a ConcurrentLinkedQueue in the StatusLogger implementation: StatusLogger.java, line 521

This could be the reason for the uncontrolled memory growth, as the queue might not be properly cleared or bounded under certain conditions.
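
To make the hypothesis concrete, a size-bounded buffer built on ConcurrentLinkedQueue typically looks roughly like the following (a simplified sketch for illustration only, not Log4j's actual code); if the trimming never runs or the bound is effectively unlimited, the node chain can grow without limit:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified illustration of a size-bounded queue; not Log4j's actual implementation.
final class BoundedBuffer<E> {
    private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    void add(E element) {
        queue.offer(element);
        // Trim the oldest entries once the bound is exceeded. Note that
        // ConcurrentLinkedQueue.size() is O(n); if this trimming step is
        // never reached, Node instances accumulate indefinitely.
        while (queue.size() > capacity) {
            queue.poll();
        }
    }
}
```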

@vy
Member

vy commented Mar 2, 2025

@ByteExceptionM, thanks so much for the report. StatusLogger has gone through a significant overhaul since 2.22.1; would you mind trying out the most recent version, i.e., 2.24.3, please?

If this doesn't fix your problem, would you mind providing the following details, please?

  1. The Log4j configuration (e.g., log4j2.xml)
  2. The arguments passed to the associated java process

The size of StatusLogger::buffer is determined by the log4j2.status.entries system property. This defaulted to 200 in 2.22.1 (hence the 200 elements you see in the screenshot you shared) and switched to 0 after the overhaul carried out in 2.23.1. It is still puzzling how 200 StatusData instances can add up to 4.9 GB. Nevertheless, in the worst case, you can set the log4j2.status.entries system property to 0 as a workaround.
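
For example, the property can be passed as -Dlog4j2.status.entries=0 on the java command line. If you control the entry point, setting it programmatically before Log4j initializes works too (a sketch; the class name is just for illustration):

```java
// Sketch: apply the workaround before any Log4j class is initialized,
// otherwise StatusLogger may already have read the property. Passing
// -Dlog4j2.status.entries=0 to the java process achieves the same effect.
public final class Bootstrap {
    public static void main(String[] args) {
        System.setProperty("log4j2.status.entries", "0");
        // ... then delegate to the real entry point of the application
    }
}
```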

@vy added the bug (Incorrect, unexpected, or unintended behavior of existing code), api (Affects the public API), and waiting-for-user (More information is needed from the user) labels and removed the waiting-for-maintainer and bug labels on Mar 2, 2025
@ByteExceptionM
Author

ByteExceptionM commented Mar 3, 2025

Hi, @vy!

As you can see in the screenshots I provided, the issue isn't limited to just the 200 StatusData instances. The memory usage is caused by an enormous number of ConcurrentLinkedQueue$Node objects, with hundreds of millions of entries.

[Screenshot: heap histogram dominated by ConcurrentLinkedQueue$Node instances]

I understand that 2.23.1+ introduced major changes, but I can't simply upgrade to Log4j 2.24.3, because the Minecraft server (or one of its plugins) depends on Log4j 2.22.1. Unfortunately, I don't have a way to fully track which component is responsible for this dependency, so updating is difficult at the moment.

As a workaround, I have now set the environment variable: LOG4J_STATUS_ENTRIES=0
I will monitor whether this improves the situation.

The second screenshot shows how severe the issue became yesterday, with memory usage skyrocketing to 13.9 GB due to ConcurrentLinkedQueue$Node instances. I'll report back with findings after running with this new setting.

[Screenshot: retained memory from ConcurrentLinkedQueue$Node reaching 13.9 GB]

Thanks again for your help!

@github-actions bot added the waiting-for-maintainer label and removed the waiting-for-user (More information is needed from the user) label on Mar 3, 2025
@ByteExceptionM
Author

ByteExceptionM commented Mar 3, 2025

Hey, @vy!

I applied the workaround by setting: LOG4J_STATUS_ENTRIES=0.

At first, it seemed to work fine. The server had been running for about 8 hours without any issues, and I didn't see any ConcurrentLinkedQueue in the heap summary. However, around 30 minutes ago, the instance count suddenly started growing again. As you can see in the new screenshots, ConcurrentLinkedQueue$Node is once again the top memory consumer, and the count keeps increasing.

[Screenshots: heap summary with ConcurrentLinkedQueue$Node again the top memory consumer]

  1. Could there be another part of Log4j still using ConcurrentLinkedQueue?
  2. Is it possible that LOG4J_STATUS_ENTRIES=0 isn't taking effect properly? (Is there a way to verify at runtime that Log4j actually applied this setting? See the sketch after this list for what I plan to try.)
  3. Are there any other Log4j components that may be responsible for this behavior?
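
For point 2, something I plan to try as a sanity check (a rough sketch; PropertiesUtil is an internal Log4j utility, so treat this as an assumption on my part rather than a supported API, and if I understand Log4j's environment property source correctly, LOG4J_STATUS_ENTRIES should be resolved as log4j2.status.entries):

```java
import org.apache.logging.log4j.util.PropertiesUtil;

// Rough sanity check: ask Log4j which value it actually resolved for the
// status-buffer size. PropertiesUtil is internal, so this is only a
// diagnostic sketch, not a supported API.
public final class StatusEntriesCheck {
    public static void main(String[] args) {
        int entries = PropertiesUtil.getProperties()
                .getIntegerProperty("log4j2.status.entries", -1); // -1 = not resolved
        System.out.println("Resolved log4j2.status.entries = " + entries);
    }
}
```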

The second screenshot I uploaded earlier shows how bad this issue got yesterday, reaching 13.9 GB of retained memory just from these objects. I'll keep monitoring the system, but any further ideas or suggestions would be greatly appreciated!

Thanks again!

@ppkarwasz
Contributor

> I understand that 2.23.1+ introduced major changes, but I can't simply upgrade to Log4j 2.24.3, because the Minecraft server (or one of its plugins) depends on Log4j 2.22.1. Unfortunately, I don't have a way to fully track which component is responsible for this dependency, so updating is difficult at the moment.

Log4j follows semantic versioning, so you can safely upgrade to 2.24.3. In version 2.23.0 a big refactoring of StatusLogger took place, but it did not affect binary compatibility.

We only maintain the last minor version of each major branch, so there will never be a 2.22.2 release. If this bug does not affect 2.24.3, we might add some note to 2.22.x, but we will not fix it.

@vy
Member

vy commented Mar 5, 2025

> I applied the workaround by setting: LOG4J_STATUS_ENTRIES=0

@ByteExceptionM, can you explain what exactly you mean by applying this setting? Do you pass it as an environment variable to the associated java process?

Would it be possible to share a heap dump with us? If so, you can

  1. [OPTIONAL] encrypt¹ the heap dump,
  2. upload it somewhere publicly reachable, and
  3. send an email to [email protected] on where to download and how to unpack.

¹ You can encrypt it either using a password-protected ZIP or via GPG. In the latter case, you need the public keys of the Log4j maintainers.

@ByteExceptionM
Author

>> I understand that 2.23.1+ introduced major changes, but I can't simply upgrade to Log4j 2.24.3, because the Minecraft server (or one of its plugins) depends on Log4j 2.22.1. Unfortunately, I don't have a way to fully track which component is responsible for this dependency, so updating is difficult at the moment.
>
> Log4j follows semantic versioning, so you can safely upgrade to 2.24.3. In version 2.23.0 a big refactoring of StatusLogger took place, but it did not affect binary compatibility.
>
> We only maintain the last minor version of each major branch, so there will never be a 2.22.2 release. If this bug does not affect 2.24.3, we might add some note to 2.22.x, but we will not fix it.

My concern is not about compatibility, but about the practical feasibility of upgrading.

In my environment, there are multiple software components relying on Log4j 2.22.1, including the Minecraft server itself and various plugins. The issue is that I do not have a clear way to track which components depend on this specific version, and I cannot simply replace or modify the software running my server without thorough testing and validation.

This makes upgrading much more complex than just switching the Log4j version in a standalone application.

>> I applied the workaround by setting: LOG4J_STATUS_ENTRIES=0
>
> @ByteExceptionM, can you explain what exactly you mean by applying this setting? Do you pass it as an environment variable to the associated java process?
>
> Would it be possible to share a heap dump with us? If so, you can
>
>   1. [OPTIONAL] encrypt¹ the heap dump,
>   2. upload it somewhere publicly reachable, and
>   3. send an email to [email protected] on where to download and how to unpack.
>
> ¹ You can encrypt it either using a password-protected ZIP or via GPG. In the latter case, you need the public keys of the Log4j maintainers.

Yes, I applied the setting by passing it as an environment variable to the Java process: LOG4J_STATUS_ENTRIES=0.

I can prepare a heap dump and will send it over soon. I'll follow the encryption and upload instructions and notify you via email when it's ready.

Thanks again!

@vy added the waiting-for-user (More information is needed from the user) label and removed the waiting-for-maintainer label on Mar 14, 2025
@vy self-assigned this on Mar 14, 2025