Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop allocating an Exception on Channel shutdown #10524

Closed
slandelle opened this issue Aug 28, 2023 · 4 comments · Fixed by #11955
Closed

Stop allocating an Exception on Channel shutdown #10524

slandelle opened this issue Aug 28, 2023 · 4 comments · Fixed by #11955
Milestone

Comments

@slandelle
Copy link
Contributor

What version of gRPC-Java are you using?

The latest

What is your environment?

Not relevant

What did you expect to see?

Not an Exception being allocated every time a Channel gets closed.

What did you see instead?

ClientTransportLifecycleManager#notifyShutdown calls Status#asException that allocates an Exception for the sake of being able to track where notifyShutdown was called from. To my understanding, this is a debugging feature, nothing in gRPC's logic requires it.

This generates a lot of allocations when allocation lots of channels for load testing purpose.

Could it be possible to deactivate this feature with a System property? And possibly deactivate it by default if it's indeed only used for debugging and changing it is not considered a breaking change.

Steps to reproduce the bug

Not relevant

@sanjaypujare
Copy link
Contributor

CC @ejona86, he may have some comments about why this was added.

@ejona86
Copy link
Member

ejona86 commented Aug 28, 2023

Oh, this is a different one than I thought. I thought you had discovered ManagedChannelOrphanWrapper, which is on the channel creation flow.

allocationSite = new SoftReference<>(
ENABLE_ALLOCATION_TRACKING
? new RuntimeException("ManagedChannel allocation site")
: missingCallSite);

I'm surprised you noticed this one and not the ManagedChannelOrphanWrapper exception. Why would this one be more noticable?

@ejona86
Copy link
Member

ejona86 commented Aug 29, 2023

Regarding ClientTransportLifecycleManager, I think it'd probably be better to just get rid of the getShutdownThrowable() API and have each callsite convert the status to an exception when necessary. I am surprised by your earlier PR though, because I feel like NettyClientHandler.channelInactive() would have always been called, and it calls getShutdownThrowable() unconditionally. So it seems making the throwable lazily initialized wouldn't have changed the frequency it was created.

We should make cancelPing(Throwable) become cancelPing(Status) to avoid that throwable creation. That leaves just createStream() as being a concern. But because the RPC is failing, it we'll probably be creating exceptions anyway. Not that great, but probably not that bad.

@slandelle
Copy link
Contributor Author

slandelle commented Aug 29, 2023

I'm surprised you noticed this one and not the ManagedChannelOrphanWrapper exception.

We did, but we disabled ENABLE_ALLOCATION_TRACKING. Looks like a debugging feature, not something that should be turned on in production.

I feel like NettyClientHandler.channelInactive() would have always been called, and it calls getShutdownThrowable() unconditionally

I'll only be able to check on Thursday, sorry.

I've been busy tracking other Exceptions allocations when closing a channel. It turns out there's also a bug in Netty, netty/netty#13569, whose consequences are even worse in grpc-netty when using shutdownNow because it causes a double channel closing (one from the graceful shutdown and one from the forceful one), hence 2 SslClosedEngineExceptions every time. See #10527 that's one of the causes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants