TCP Socket hanging - both sides stuck in sendto()

2023-04-05 14:38 问答作者：

We have an linux application (we don't have the source) that seems to be hanging. The socket between the two processes is reported as ESTABLISHED, and there is some data in the kernel socket buffer (although nowhere near the configured 16M via wmem/rmem). Both ends of the socket seem to 开发者_如何学Gobe stuck on a sendto().

Below is some investigation using netstat/lsof and strace:

HOST A (10.152.20.28)

[root@hosta ~]# lsof -n -u df01 | grep 12959 | grep 12u
q         12959 df01   12u  IPv4            4398449                TCP 10.152.20.28:38521->10.152.20.29:gsigatekeeper (ESTABLISHED)

[root@hosta ~]# netstat -anp | grep 38521
tcp   268754  90712 10.152.20.28:38521          10.152.20.29:2119           ESTABLISHED 12959/q

[root@hosta ~]# strace -p 12959
Process 12959 attached - interrupt to quit
sendto(12, "sometext\0somecode\0More\0exJKsss"..., 542, 0, NULL, 0 <unfinished ...>
Process 12959 detached
[root@hosta~]#

HOST B (10.152.20.29)

[root@hostb ~]# netstat -anp | grep 38521
tcp    72858 110472 10.152.20.29:2119           10.152.20.28:38521          ESTABLISHED 25512/q

[root@hostb ~]# lsof -n -u df01 | grep 38521
q         25512 df01   14u  IPv4            6456715                 TCP 10.152.20.29:gsigatekeeper->10.152.20.28:38521 (ESTABLISHED)

[root@hostb ~]# strace -p 25512
Process 25512 attached - interrupt to quit
sendto(14, "\0\10\0\0\0Owner\0sym\0Type\0Ctpy\0Time\0Lo"..., 207, 0, NULL, 0 <unfinished ...>
Process 25512 detached
[root@hostb~]#

We have upgraded the NIC driver to the latest and greatest. The systems are running RHEL 5.6 x64 (2.6.18-238.el5), I have checked the eratta for RHEL 5.7 and 5.8 but I can see no mention of bugs with the bnx2 driver or the kernel.

Does anyone have any ideas of how to debug this further?

Is either side actually reading? If not, it could be that both sides' receive buffers are full, leading to not sending data (due to the receive window being filled), leading to both send buffers being filled, which will cause sendto to block. (It's possible that this could happen despite your setting of wmem/rmem if the application is setting the SO_RCVBUF and SO_SNDBUF socket options.)

To debug this, I'd synchronize both machine's clocks, then run both applications under strace with the -e trace=network and -tt options, so you can compare the logs and see if the application isn't reading.

You could also use a network analyzer (such as Wireshark) to determine if the TCP receive window gets stuck on 0.

If this is the case, you could probably work around this by creating a small caching proxy, which would recv/send from both sides, buffering whatever can't be sent at the time.

继续阅读：freeze sendto sockets

TCP Socket hanging - both sides stuck in sendto()

HOST A (10.152.20.28)

HOST B (10.152.20.29)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

HOST A (10.152.20.28)

HOST B (10.152.20.29)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生 新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？