Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2.26.2 breaks grpc-websocket-proxy #5326

Open
jnis23 opened this issue Mar 5, 2025 · 10 comments · May be fixed by #5331
Open

v2.26.2 breaks grpc-websocket-proxy #5326

jnis23 opened this issue Mar 5, 2025 · 10 comments · May be fixed by #5331

Comments

@jnis23
Copy link

jnis23 commented Mar 5, 2025

I realize https://github.com/tmc/grpc-websocket-proxy is not officially supported by this project, but it's mentioned a lot in issues here. Just flagging that the change in #5240 breaks the websocket-proxy, causing requests to hang indefinitely.

@johanbrandhorst
Copy link
Collaborator

Huh, that was certainly unintentional. Is it happening for all attempts to use the websocket proxy? I've used it in the past and I don't recall if it requires there to be no body on the message. Could you share how you think this breaks it? I would be keen to keep supporting the proxy.

@jnis23
Copy link
Author

jnis23 commented Mar 5, 2025

I tried using it with both GET and POST methods to no avail. But I think the hangup will come from how the proxy itself handles the request body: https://github.com/tmc/grpc-websocket-proxy/blob/673ab2c3ae75cc01952b84b88590e30e75dcf395/wsproxy/websocket_proxy.go#L186

It opens a pipe to read/write the body for the lifetime of the websocket.

@jnis23
Copy link
Author

jnis23 commented Mar 5, 2025

I can dig deeper tomorrow, but if #5240 was for cleaning up connections, would we be able to just defer that io.Copy? @paskozdilar would that still solve your issues?

@paskozdilar
Copy link
Contributor

paskozdilar commented Mar 6, 2025

#5240 was related to gRPC connections hanging indefinitely when client sent a body (e.g. "{}") on methods where google.api.http.body annotation (e.g. body: "*") is not defined. I am not sure if deferred io.Copy would help - but there is an integration test. Try adding defer and seeing if it passes.

I have since deleted the repository with bug reproduction code, but I'll upload it again until we figure this one out.

Interestingly, the issue also happened in the ad-hoc websocket proxy I used for streaming methods.
I'll upload that code too so we have a working reference.

@paskozdilar
Copy link
Contributor

I've re-uploaded the bug report repository: https://github.com/paskozdilar/bug-report-grpc-gateway

You can find my implementation of websocket proxy here: https://github.com/paskozdilar/bug-report-grpc-gateway/blob/master/ws.go

I've managed to reproduce the issue with WebSocket breaking.


Interestingly, when you define option (google.api.http) = { body: "*"; } annotation in the .proto file and send empty object to websocket, the method works fine. It is only when body is not defined that the issues start.

Maybe go io.Copy(io.Discard, r.Body) could work?

@paskozdilar
Copy link
Contributor

paskozdilar commented Mar 6, 2025

Indeed, changing io.Copy(io.Discard, r.Body) to go io.Copy(io.Discard, r.Body) makes things work.

I'm worried because I don't know what's going on under the good, why was io.Copy necessary at all, but I think it will work fine. My testing show no regressions when using go io.Copy as opposed to just io.Copy.

Here's the PR: #5331


@jnis23 could you provide a minimal reproducible example of the bug you're having, so I can confirm this change fixes it?

If not, you can use replace directive in go.mod to test my fork of grpc-gateway with your code. - This might be hard, since the change is in the code generator, not the library itself. If you're using buf, you could clone my branch and define the local plugin command as ["go", "run", ".../path/to/grpc-gateway"]...

@jnis23
Copy link
Author

jnis23 commented Mar 6, 2025

Thanks for looking into this so quickly! I'll get a repo up today and start testing

@jnis23
Copy link
Author

jnis23 commented Mar 8, 2025

Sorry for the delay here. I got caught up trying to dig into your question of why the io.Copy was needed in the first place. I've spun up a repo here with some sample streams and a UI for testing. The Stream rpc is how I'm using this in my actual project ({get: '/stream/{id}).

I can confirm that go io.Copy works both in my test project and the project where I saw this first pop up. So I think we can get your PR in as a stop gap until we figure out what is really going on.

@paskozdilar
Copy link
Contributor

paskozdilar commented Mar 19, 2025

@jnis23
I've noticed that the "Working" example from your repo doesn't close connections properly.
Interestingly, that issue is also caused by the req.Body never being read after the initial json.Unmarshal.

This commit fixes it:
paskozdilar/grpc-gateway-proxy-example@a1731aa

I'll try to fix this too in the PR.

@paskozdilar
Copy link
Contributor

question of why the io.Copy was needed in the first place

My best guess so far is that gRPC detects connection close via some equivalent to req.Body.Read([]byte{}) which doesn't actually read from connection, just returns an error when connection is closed. But as long as there is unread data in the connection, it will never return error. That may be why draining req.Body makes things work.

This is wild speculation, obviously. I'm yet to dive in the gRPC source code and see what's what.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants