Network error on PASE on iSeries machine
I am running a server program, written in C, on PASE on an iSeries machine. PASE (Portable Application Solutions Environment) is an AIX-compatible runtime environment on IBM iSeries machines.
The server is a connection-oriented, iterative TCP server.
The server logic calls accept(), which returns a socket descriptor. It then calls ioctl() with FIONBIO to set the socket non-blocking.
This call to ioctl() fails intermittently: it returns -1 with errno = 9 (EBADF, bad file descriptor) for approximately 0.8% of the calls. Once it fails for a particular socket descriptor, subsequent failures are always for that same descriptor, with the same errno.
When this happens, the client side fails with errno = 73, i.e. connection reset by peer.
The server is a daemon process, so stdin is closed on initialization and descriptor 0 becomes available for accept() to return. Initially I observed that ioctl() failed for socket descriptor 0, but not always. Hence, I tried to prevent reuse of descriptor 0 by redirecting stdin to '/dev/null', in case that was the issue. I am not sure this was the main issue, and I have yet to get the test results after this change.
The issue has been observed only on some machines, usually when the machine is under load, so it looks like some sort of race condition. The server logic is well tested and seems to be stable.
Have any socket-related issues been observed on the PASE or AIX platforms? Could this be OS related?
Any help/pointers with this issue would be appreciated.
thanks in advance,
avg
Is there any chance you are running up against the default maximum of 200 file descriptors per job?
If so you can use the DosSetRelMaxFH()--Change Maximum Number of File Descriptors API to increase the limit.
If that's not the issue, I suggest collecting and examining an SST communications trace of the error. See the TCP/IP Communications Trace Instructions for more information.
Next, I would check the group PTF levels, especially SF99315 (TCP/IP Group PTF).
IBM support is really helpful in tracking down issues like these.