Why isn't my server socket producing activity in select() when the client end of the connection is closed?
This has been perplexing me for a couple of days, so I decided to stop struggling with it and throw it open to a wider audience.
I have a server written in C that creates a worker thread for each client (the number of clients is expected to be very small). Each thread has only two file descriptors associated with it, one of which is the socket, so I decided to use select() for simplicity. The clients never send data to the server, but I set the activity bit for the socket file descriptor in the readfds argument before calling select() as a means of detecting when the client has closed the connection.
The clients are all instances of a Java program that appears to be opening a stream socket connection appropriately (my low-level Java networking experience is not that strong and I didn't write the client). In my test environment, I run the Java program from within the Eclipse editor and kill it using the stop button to simulate shutdown.
Server and client are both running under Linux.
Communication behaves according to my expectation except when the client shuts down. In this circumstance, I would expect the call to select() in the server's worker thread to return with the ready bit associated with the socket file descriptor set in the readfds argument, at which point a call to read() should return 0 bytes, indicating that the peer has closed the connection.
What I see is that the expected behavior occurs randomly, while in other cases, the call to select() does not return and the server eventually gets (errno == EPIPE) after write() returns -1 when it has data that it decides to send to the dead client. In particular, the first connection to the server always behaves correctly, while the second one always fails. This is not really blocking my progress because the server simply logs an error and cleans up the connection when it detects the condition, but this is annoying me and making me wonder if there is some subtle point that I'm forgetting here because it's been a long time since I've programmed at this level.
EDIT: The code is kind of scattered in various small chunks across several translation units, so I'll try to cook it down to something that is remotely readable (note that lots of error checking code has also been removed):
/* A chunk of code related to the handling of a listen socket */
int clientSockFd = accept(listenerFd, &addr, &addrlen);
int optVal = 1;
setsockopt(clientSockFd, IPPROTO_TCP, TCP_NODELAY, &optVal, sizeof(optVal));
ThreadDataStruct *useful = malloc(sizeof(*useful)); /* sizeof(*useful): the size of the struct, not of the pointer */
/* Add in some useful stuff */
useful->fd = clientSockFd;
pthread_create(&newThreadId, NULL, workerThread, (void *)useful);
/* End of listener handling code */
/* ... */
static void *workerThread(void *arg) {
    int clientSockFd = ((ThreadDataStruct *)arg)->fd;
    /* Unpack some other useful stuff from arg */
    int maxFd = LargestFdUsedByThisThread + 1;
    while (1) {
        fd_set readableReady;
        FD_ZERO(&readableReady);
        FD_SET(clientSockFd, &readableReady);
        int readyFdCount = select(maxFd, &readableReady, NULL, NULL, NULL);
        if (FD_ISSET(clientSockFd, &readableReady)) {
            /* Clean up various data structures associated with the thread */
            return NULL;
        }
        /* Do something else useful but irrelevant to this problem */
    }
}
The three most likely causes of this problem are:
Something is amiss in the select call. For example, if LargestFdUsedByThisThread is less than clientSockFd, the select call will not unblock.
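To make the first point concrete, here is a minimal sketch of a wait that derives the first argument of select() from the descriptors it actually watches. The helper names waitForClose and demoPeerClose are mine, not from the question, and the demo assumes a socketpair can stand in for the real TCP connection and a pipe for the thread's second descriptor.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: wait until sockFd or otherFd is readable.
 * Returns 1 if sockFd is at end-of-file (peer closed), 0 if not, -1 on error. */
static int waitForClose(int sockFd, int otherFd)
{
    /* FD_SET is only defined for descriptors in [0, FD_SETSIZE). */
    assert(sockFd >= 0 && sockFd < FD_SETSIZE);
    assert(otherFd >= 0 && otherFd < FD_SETSIZE);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(sockFd, &readable);
    FD_SET(otherFd, &readable);

    /* nfds must be one more than the largest descriptor in any set. */
    int maxFd = sockFd > otherFd ? sockFd : otherFd;
    if (select(maxFd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    if (!FD_ISSET(sockFd, &readable))
        return 0;               /* only otherFd was ready */

    /* Peek so that any real data stays queued; 0 bytes means EOF. */
    char byte;
    ssize_t n = recv(sockFd, &byte, 1, MSG_PEEK);
    if (n < 0)
        return -1;
    return n == 0;
}

/* Demo under the assumptions above: close the "client" end, then wait. */
static int demoPeerClose(void)
{
    int sock[2], other[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sock) < 0)
        return -1;
    if (pipe(other) < 0) {
        close(sock[0]);
        close(sock[1]);
        return -1;
    }
    close(sock[1]);             /* the "client" goes away */
    int result = waitForClose(sock[0], other[0]);
    close(sock[0]);
    close(other[0]);
    close(other[1]);
    return result;
}
```

Computing maxFd right next to the FD_SET calls means the value cannot go stale when a later connection gets a higher descriptor number.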
The connection never closed. For example, if another process also had a reference to the underlying TCP connection, killing the client process will not close the connection so long as that other process still has a reference to it. This happens commonly if another process forked off the client process after it accepted the connection and didn't close its handle to the connection.
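The effect of a lingering reference can be reproduced in a few lines. This is only a sketch: dup() stands in for the descriptor a forked child would inherit, a socketpair stands in for the TCP connection, and eofVisible and demoLingeringReference are names I made up for illustration. It uses poll() with a short timeout so the "no EOF yet" case does not block forever.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd reports end-of-file within
 * timeoutMs milliseconds, 0 if nothing happened, -1 on error. */
static int eofVisible(int fd, int timeoutMs)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int ready = poll(&p, 1, timeoutMs);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;               /* timed out: the peer still looks alive */

    char byte;
    return recv(fd, &byte, 1, MSG_PEEK) == 0;
}

/* Demo: the peer end is closed once, but a duplicate keeps it open. */
static int demoLingeringReference(int *before, int *after)
{
    assert(before != NULL && after != NULL);

    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int extra = dup(sv[1]);     /* stand-in for a forked child's copy */
    if (extra < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[1]);                       /* the "client" exits ... */
    *before = eofVisible(sv[0], 100);   /* ... but no EOF arrives */

    close(extra);                       /* last reference goes away */
    *after = eofVisible(sv[0], 100);    /* now the EOF is delivered */

    close(sv[0]);
    return 0;
}
```

If the real client behaves like the first check, something on the client side is probably still holding the socket open after the kill.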
You have a threading problem. I noticed that you pass a pointer to the newly-created thread. If you then modify the data in the main thread, the newly-created thread may read the new data rather than the data at the time you constructed it. This can cause the thread to see the wrong value for clientSockFd. A good pattern to avoid this error is:
1) Accept a connection and get ready to create a thread.
2) Allocate an object to hold the parameters the thread needs.
3) Create the thread passing it a pointer to that object.
4) In the new thread, free the parameter object when done with it.
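The four steps above could look roughly like this. It is a sketch, not your actual code: WorkerArgs, spawnWorker and demoSpawn are hypothetical names, and the worker just writes one byte to its descriptor as placeholder work so the hand-off can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical parameter object; one is allocated per thread. */
typedef struct {
    int fd;
} WorkerArgs;

static void *worker(void *arg)
{
    assert(arg != NULL);
    WorkerArgs *args = arg;
    int fd = args->fd;          /* copy out what the thread needs */
    free(args);                 /* step 4: the thread owns and frees it */

    const char ok = 'k';        /* placeholder for the real work */
    ssize_t n = write(fd, &ok, 1);
    (void)n;
    return NULL;
}

/* Steps 2 and 3: allocate a fresh object, then hand it to the thread. */
static int spawnWorker(int fd, pthread_t *tid)
{
    WorkerArgs *args = malloc(sizeof *args);
    if (args == NULL)
        return -1;
    args->fd = fd;
    if (pthread_create(tid, NULL, worker, args) != 0) {
        free(args);             /* the thread never started, so free here */
        return -1;
    }
    return 0;                   /* do not touch args after this point */
}

/* Demo: a pipe stands in for the accepted client socket. */
static char demoSpawn(void)
{
    int p[2];
    pthread_t tid;
    char got = 0;
    if (pipe(p) < 0)
        return 0;
    if (spawnWorker(p[1], &tid) == 0) {
        pthread_join(tid, NULL);
        if (read(p[0], &got, 1) != 1)
            got = 0;
    }
    close(p[0]);
    close(p[1]);
    return got;
}
```

Because every thread gets its own heap object and the creating thread never touches it again, a second accept() cannot overwrite the descriptor the first worker is about to read.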