Java Solaris NIO OP_CONNECT problem
I have a Java client that connects to a C++ server using TCP Sockets using Java NIO. This works under Linux, AIX and HP/UX but under Solaris the OP_CONNECT
event never fires.
Further details:
Selector.select()
is returning 0, and the 'selected key set' is empty.- The issue only occu开发者_如何学Crs when connecting to the local machine (via loopback or ethernet interface), but works when connecting to a remote machine.
- I have confirmed the issue under two different Solaris 10 machines; a physical SPARC and virtual x64 (VMWare) using both JDK versions 1.6.0_21 and _26.
Here is some test code which demonstrates the issue:
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
public class NioTest3
{
public static void main(String[] args)
{
int i, tcount = 1, open = 0;
String[] addr = args[0].split(":");
int port = Integer.parseInt(addr[1]);
if (args.length == 2)
tcount = Integer.parseInt(args[1]);
InetSocketAddress inetaddr = new InetSocketAddress(addr[0], port);
try
{
Selector selector = Selector.open();
SocketChannel channel;
for (i = 0; i < tcount; i++)
{
channel = SocketChannel.open();
channel.configureBlocking(false);
channel.register(selector, SelectionKey.OP_CONNECT);
channel.connect(inetaddr);
}
open = tcount;
while (open > 0)
{
int selected = selector.select();
System.out.println("Selected=" + selected);
Iterator<SelectionKey> it = selector.selectedKeys().iterator();
while (it.hasNext())
{
SelectionKey key = it.next();
it.remove();
channel = (SocketChannel)key.channel();
if (key.isConnectable())
{
System.out.println("isConnectable");
if (channel.finishConnect())
{
System.out.println(formatAddr(channel) + " connected");
key.interestOps(SelectionKey.OP_WRITE);
}
}
else if (key.isWritable())
{
System.out.println(formatAddr(channel) + " isWritable");
String message = formatAddr(channel) + " the quick brown fox jumps over the lazy dog";
ByteBuffer buffer = ByteBuffer.wrap(message.getBytes());
channel.write(buffer);
key.interestOps(SelectionKey.OP_READ);
}
else if (key.isReadable())
{
System.out.println(formatAddr(channel) + " isReadable");
ByteBuffer buffer = ByteBuffer.allocate(1024);
channel.read(buffer);
buffer.flip();
byte[] bytes = new byte[buffer.remaining()];
buffer.get(bytes);
String message = new String(bytes);
System.out.println(formatAddr(channel) + " read: '" + message + "'");
channel.close();
open--;
}
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
static String formatAddr(SocketChannel channel)
{
return Integer.toString(channel.socket().getLocalPort());
}
}
You can run this using the command line:
java -cp . NioTest3 <ipaddr>:<port> <num-connections>
Where port should be 7 if you are running against a real echo service; i.e.:
java -cp . NioTest3 127.0.0.1:7 5
If you cannot get a real echo service running then the source to one is here. Compile the echo server under Solaris with:
$ cc -o echoserver echoserver.c -lsocket -lnsl
and run it like this:
$ ./echoserver 8007 > out 2>&1 &
This has been reported to Sun as a bug.
Your bug report has been closed as 'not a bug', with an explanation. You are ignoring the result of connect()
, which if true means that OP_CONNECT
will never fire, because the channel is already connected. You only need the whole OP_CONNECT/finishConnect()
megillah if it returns false. So you shouldn't even register OP_CONNECT
unless connect()
returns false, let alone register it before you've even called connect().
Further remarks:
Under the hood, OP_CONNECT
and OP_WRITE
are the same thing, which explains part of it.
As you have a single thread for this, the workaround would be to do the connect in blocking mode, then switch to non-blocking for the I/O.
Are you doing the select() after registering the channel with the Selector
?
The correct way of handling non-blocking connect is as follows:
channel.configureBlocking(false);
if (!channel.connect(...))
{
channel.register(sel, SelectionKey.OP_CONNECT, ...); // ... is the attachment, or absent
}
// else channel is connected, maybe register for OP_READ ...
// select() loop runs ...
// Process the ready keys ...
if (key.isConnectable())
{
if (channel.finishConnect())
{
key.interestOps(0); // or SelectionKey.OP_READ or OP_WRITE, whatever is appropriate
}
}
A few non-exhaustive comments after reviewing your extended code:
Closing a channel cancels the key. You don't need to do both.
The non-static
removeInterest()
method is incorrectly implemented.TYPE_DEREGISTER_OBJECT
also closes the channel. Not sure if that is what you really intended. I would have thought it should just cancel the key, and there should be a separate operation for closing the channel.You have gone way overboard on the small methods and exception handling.
addInterest()
andremoveInterest()
are good examples. They catch exceptions, log them, then proceed as though the exception hadn't happened, when all they actually do is set or clear a bit: one line of code. And on top of that many of them have both static and non-static versions. Same goes for all the little methods that callkey.cancel()
,channel.close()
, etc. There is no point to all this, it is just clocking up lines of code. It only adds obscurity and makes your code harder to understand. Just do the operation required inline and have a single catcher at the bottom of the select loop.If
finishConnect()
returns false it isn't a connection failure, it just hasn't completed yet. If it throws anException
, that is a connection failure.You are registering for
OP_CONNECT
andOP_READ
at the same time. This doesn't make sense, and it may cause problems. There is nothing to read untilOP_CONNECT
has fired. Just register forOP_CONNECT
at first.You are allocating a
ByteBuffer
per read. This is very wasteful. Use the same one for the life of the connection.You are ignoring the result of
read()
. It can be zero. It can be -1, indicating EOS, on which you must close the channel. You are also assuming you will get an entire application message in a single read. You can't assume that. That's another reason why you should use a singleByteBuffer
for the life of the connection.You are ignoring the result of
write()
. It could be less thanbuffer.remaining(
) was when you called it. It can be zero.You could simplify this a lot by making the
NetSelectable
the key attachment. Then you could do away with several things, including for example the channel map, and the assert, because the channel of the key must always be equal to the channel of the attachment of the key.I would also definitely move the
finishConnect()
code to theNetSelector
, and haveconnectEvent()
just be a success/failure notification. You don't want to spread this kind of stuff around. Do the same withreadEvent()
, i.e. do the read itself inNetSelector
, with a buffer supplied by theNetSelectable
, and just notify theNetSelectable
of the read result: count or -1 or exception. Ditto on write: if the channel is writable, get something to write from theNetSelectable
, write it in the NetSelector, and notify the result. You could have the notification callbacks return something to indicate what to do next, e.g. close the channel.
But really this is all five times as complex as it needs to be, and the fact that you have this bug proves it. Simplify your head.
I have worked-around this bug using the following:
If Selector.select()
returns 0 (and didn't timeout, if the timeout version was used) then:
- Iterate over the keys registered with the selector via
selector.keys().iterator()
(remembering not to calliterator.remove()
). - If
OP_CONNECT
interest has been set with the key then callchannel.finishConnect()
and do whatever would have been done ifisConnectable()
has returnedtrue
.
For example:
if (selected == 0 && elapsed < timeout)
{
keyIter = selector.keys().iterator();
while (keyIter.hasNext())
{
key = keyIter.next();
if (key.isValid())
{
channel = (SocketChannel)key.channel();
if (channel != null)
{
if ((key.interestOps() & SelectionKey.OP_CONNECT) != 0)
{
if (channel.finishConnect())
{
key.interestOps(0);
}
}
}
}
}
}
This has been reported to Sun as a bug.
精彩评论