How can I tell if a socket buffer is full?
How can I tell if a read socket buffer is full or a write socket buffer is empty?
Is there a way I can get the status of a socket buffer without a system call?
UPDATE: How about this: I'd like to get a callback or signal when either the read socket buffer is full or the write socket buffer is empty. This way I can stop processing to allow more I/O to occur on the wire, since being I/O bound is always an issue when sending data on the wire.
The select() call is how you check if the read buffer has something in it. Not when it is full (I think).
Poll the file descriptor with select and a zero timeout - if select says it's writable, the send buffer isn't full.
(Oh... without a system call. No, there isn't.)
Addendum:
In response to your updated question, you can use two ioctls on the TCP socket: SIOCINQ returns the amount of unread data in the receive buffer, and SIOCOUTQ returns the amount of unsent data in the send queue. I don't believe there's any asynchronous event notification for these though, which will leave you having to poll.
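A minimal sketch of those two ioctls, assuming Linux (the constants come from linux/sockios.h). The helper name socket_queue_levels is invented for this example.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>   /* SIOCINQ, SIOCOUTQ: Linux-specific */

/* Hypothetical helper: stores the number of unread bytes in the receive
 * buffer in *unread and the number of unsent bytes in the send queue in
 * *unsent. Returns 0 on success, -1 on error (errno set by ioctl). */
int socket_queue_levels(int fd, int *unread, int *unsent)
{
    if (ioctl(fd, SIOCINQ, unread) < 0)
        return -1;
    if (ioctl(fd, SIOCOUTQ, unsent) < 0)
        return -1;
    return 0;
}
```

You would call this from a timer or once per pass through your I/O loop, since nothing notifies you when the values change.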
I know this is an old thread, but for the benefit of those who stumble onto this via search engine, I will answer the question, as it hasn't really been answered above.
Before I start, get over the system call hangup - you cannot interact with kernel-based (*nix) network stacks without switching in and out of kernel space. Your goal should be to understand the stack features, so you can get the best out of your system.
How can I tell if a read socket buffer is full
This part has been answered - you don't because it's not how you should be thinking.
If the sender is (badly) fragmenting its TCP frames (usually due to not buffering marshaled data on output, and having the Nagle algorithm turned off with TCP_NODELAY), your idea of reducing the number of system calls you make is a good one. The approach you should be using involves setting a "low watermark" for reading. First, you establish what you think is a reasonable receive buffer size by setting SO_RCVBUF using setsockopt(). Then read back the actual receive buffer size using getsockopt(), as you might not get what you ask for. :) Unfortunately, not all implementations allow you to read SO_RCVBUF back again, so your mileage may vary. Next, decide how much data you want to be present for reading before you want to read it. Set SO_RCVLOWAT with this size, using setsockopt(). Now, the socket's file descriptor will only select as readable when there is at least that amount of data ready to read.
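The sequence described above might be sketched like this. The function name and parameters are illustrative only, and clamping the watermark to the granted buffer size is my own assumption rather than part of the recipe.

```c
#include <sys/socket.h>

/* Hypothetical helper: requests a receive buffer of want_buf bytes, reads
 * back what the kernel actually granted into *got_buf, then sets the
 * receive low watermark to lowat bytes.
 * Returns 0 on success, -1 on error (errno set). */
int set_receive_low_watermark(int fd, int want_buf, int lowat, int *got_buf)
{
    socklen_t len = sizeof(*got_buf);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want_buf, sizeof(want_buf)) < 0)
        return -1;
    /* The kernel may round or clamp the request; Linux doubles it. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, got_buf, &len) < 0)
        return -1;
    /* Assumption: a watermark above the buffer size could never be met. */
    if (lowat > *got_buf)
        lowat = *got_buf;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat)) < 0)
        return -1;
    return 0;
}
```

For example, set_receive_low_watermark(fd, 65536, 1024, &got) asks for a 64 KiB buffer and a 1 KiB watermark; whether select() then honours the watermark depends on the platform.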
or a write socket buffer is empty?
This is an interesting one, as I needed to do this recently to ensure that my MODBUS/TCP ADUs each occupied their own TCP frames, which the MODBUS specification requires (@steve: controlling fragmentation is one time you do need to know when the send buffer is empty!). As far as the original poster is concerned, I doubt very much that he really wants this, and believe he would be much better served knowing the send buffer size before he starts, and checking the amount of data in the send buffer periodically during sending, using techniques already described. That would provide finer-grained information about the proportion of the send buffer used, which could be used to throttle production more smoothly.
For those still interested in how to detect (asynchronously) when the send buffer is empty (once you're sure it's really what you want), the answer is simple - you set the send low-watermark (SO_SNDLOWAT) equal to the send buffer size. That way the socket's file descriptor will only select as writable when the send buffer is empty.
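A rough sketch of that idea follows; the function name is invented. One caveat worth stating loudly: the Linux socket(7) man page documents SO_SNDLOWAT as not changeable there (setsockopt fails with ENOPROTOOPT), so this should be treated as something to try on your platform, with the error path checked.

```c
#include <sys/socket.h>

/* Hypothetical helper: tries to make fd select as writable only when the
 * send buffer is empty, by raising SO_SNDLOWAT to the send buffer size.
 * Returns 0 on success, -1 if the platform refuses (errno set). */
int select_writable_only_when_empty(int fd)
{
    int sndbuf = 0;
    socklen_t len = sizeof(sndbuf);

    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) < 0)
        return -1;
    /* Linux documents this option as read-only (ENOPROTOOPT). */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDLOWAT, &sndbuf, sizeof(sndbuf)) < 0)
        return -1;
    return 0;
}
```

If the call fails, polling the send queue depth is the fallback.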
It's no coincidence that my answers to your questions revolve around the use of select(). In almost all cases (and I realize I'm heading into religious territory now!) apps that need to move a lot of data around (intra- and inter-host) are best structured as single-threaded state machines, using selection masks and a processing loop based around pselect(). These days some OS's (Linux to name one) even allow you to manage your signal handling using file descriptor selections. What luxury - when I was a boy... :)
Peter
You can try ioctl. FIONREAD tells you how many bytes are immediately readable. If this is the same as the buffer size (which you might be able to retrieve and/or set with another ioctl call), then the buffer is full. Likewise, if you can write as many bytes as the size of the output buffer, then the output buffer is empty.
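As a hedged illustration, querying FIONREAD could look like this. The helper name bytes_readable_now is made up, and the sketch assumes a platform where sys/ioctl.h defines FIONREAD.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the number of bytes that can be read from
 * fd immediately, or -1 on error (errno set by ioctl). */
int bytes_readable_now(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) < 0)
        return -1;
    return n;
}
```

Comparing the result against the value of SO_RCVBUF is only approximate, since the kernel's buffer accounting may include its own overhead.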
I don't know how widely supported FIONREAD, FIONWRITE, and SIOCGIFBUFS (or equivalents) are. I'm not sure I've ever used any of them, although I've a sneaky feeling I've used similar functionality on Symbian for some reason or other.
Whether the call needs kernel mode to compute this is platform-specific. Vaguely trying to avoid system calls is not a valid optimisation technique.
A basic BSD-style sockets interface doesn't say anything much about read and write buffers. When does it matter whether the send buffer is empty? It certainly doesn't mean that all the data has been received at the other endpoint of the socket - it could be sitting in some router somewhere. Likewise, "your" read buffer being full doesn't guarantee that a write at the other end will block.
Generally speaking, you just read/write as much as you can and let the sockets layer handle the complexity. If you're seeing a lot of I/O completed with tiny sizes then maybe there's some performance problem. But remember that a stream socket will send/receive a packet at a time, containing a block of data. Unless TCP_NODELAY is set, it's not as though bytes are arriving by ones at the NIC, and you might end up making one read call per byte. They're arriving in packets, so most likely will become readable all at once, perhaps 1k-ish at a time. You're unlikely to be able to speed things up by holding off reading until there's a lot to read. In fact you might make it worse, because by the time your endpoint's read buffer is full, there's a risk that incoming data is being discarded because there's nowhere to store it, resulting in delays and re-sends.
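To make "read as much as you can" concrete, here is a minimal sketch of draining a non-blocking socket. Both function names and the 4096-byte scratch buffer are arbitrary choices for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switches fd to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: reads everything currently available on a
 * non-blocking fd. Returns the total bytes consumed, or -1 on a real
 * error. Stops when the buffer is drained (EAGAIN) or at end of stream. */
long drain_socket(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;        /* a real program would process buf here */
        } else if (n == 0) {
            break;             /* peer closed the connection */
        } else if (errno == EINTR) {
            continue;          /* interrupted by a signal: retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;             /* nothing left to read right now */
        } else {
            return -1;
        }
    }
    return total;
}
```

Called whenever the descriptor becomes readable, this keeps the receive buffer from filling without ever needing to know how full it is.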
Taking into account that the kernel buffer for sockets lives in kernel space, I doubt there is any way of asking for the size without a syscall.
With syscalls you can try recv with MSG_PEEK.
ret = recv(fd, buf, len, MSG_PEEK);
This performs the recv without emptying the buffer.
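A small sketch of the peek, wrapped in a function whose name is invented here. Adding MSG_DONTWAIT is my own assumption, so that the call returns instead of blocking when nothing is queued; the one-liner above does not include it.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: copies up to len bytes into buf without removing
 * them from the socket's receive buffer. MSG_DONTWAIT (an addition for
 * this sketch) stops the call from blocking when the buffer is empty.
 * Returns the bytes copied, 0 at end of stream, or -1 on error
 * (errno is EAGAIN or EWOULDBLOCK if nothing is queued). */
ssize_t peek_socket(int fd, void *buf, size_t len)
{
    return recv(fd, buf, len, MSG_PEEK | MSG_DONTWAIT);
}
```

The return value is capped at len, so it only tells you that at least that many bytes are waiting, not how full the buffer is.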
That is not possible without a syscall. But what's the problem with syscalls?
I think there's a fundamental reason why your approach is flawed/doomed. The system doesn't want to tell you when the read buffer is full / write buffer is empty because those events indicate a breakdown in the contract between you and the system. If things get to that point (particularly in the read direction) it is too late for you to ensure smooth operation of the protocol stack. Some more data might arrive while you are finally deciding to read the buffer. You should be reading the buffer before it gets full; that's the whole point of buffered I/O.
If you do read()s in a separate thread, SO_RCVLOWAT can help block the read until there is enough data in the buffer. Unfortunately, poll() and select() ignore this socket option, at least on Linux, and always check for a single available byte.
@blaze,
Linux and SO_RCVLOWAT
With respect, my experience differs from yours. I have been using receive buffer low watermarks in Linux since FC5, in products which distribute video over IP (both UDP & TCP), so I understand how important it is to make the most of your network stack features. In fact, Linux was one of the first implementations to allow you to read back the low watermark (and some still don't allow that). :)
You mention poll() and select() as failing to respect the SO_RCVLOWAT. I have been using pselect() for as long as I can remember, so maybe the problem is with select() and poll(). In any case, you should always use pselect() or ppoll(), where available, in preference to the older calls, because they can atomically alter the program's signal mask as you enter/leave the call. If you understand what that means, then you will appreciate why this is critical in commercial software. If not, such a discussion would warrant its own thread. :)
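For readers who have not used it, a minimal sketch of a pselect() wait is below. The function name wait_readable and its parameters are illustrative, and it assumes fd is smaller than FD_SETSIZE.

```c
#include <signal.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: waits until fd is readable or timeout_sec seconds
 * pass. For the duration of the wait the signal mask is atomically
 * replaced by *wait_mask, and the previous mask is restored on return.
 * Returns 1 if readable, 0 on timeout, -1 on error or when interrupted
 * by a signal (errno == EINTR). */
int wait_readable(int fd, const sigset_t *wait_mask, long timeout_sec)
{
    fd_set rfds;
    struct timespec ts = { timeout_sec, 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    int n = pselect(fd + 1, &rfds, NULL, NULL, &ts, wait_mask);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

The usual pattern is to block the signals you care about with sigprocmask() before entering the loop and pass the original, unblocked mask as wait_mask, so a signal can only be delivered while the process is parked inside pselect().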
Peter