Membase node fault handling in Java
I am looking for a Java example that shows how to avoid exceptions with Membase when one of its nodes goes down.
I have a small cluster of two nodes with one 'default' bucket, which is replicated on both servers. I wrote a small Java test app for stress loading. I use spymemcached 2.7. When I run it, both servers get busy. When I shut down one Membase instance, my Java app crashes.
Here is the exception log:
2011-06-15 17:32:33.405 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=/192.168.1.9:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2011-06-15 17:32:33.407 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=/192.168.1.10:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2011-06-15 17:32:33.412 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@63238bd2
2011-06-15 17:32:33.413 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@37bd2664
2011-06-15 18:20:21.896 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to exception on {QA sa=/192.168.1.9:11211, #Rops=2, #Wops=0, #iq=0, topRop=net.spy.memcached.protocol.binary.StoreOperationImpl@5f4275d4, topWop=null, toWrite=0, interested=1}
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:237)
at sun.nio.ch.IOUtil.read(IOUtil.java:210)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:487)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:427)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:280)
at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:2063)
2011-06-15 18:20:21.897 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=/192.168.1.9:11211, #Rops=2, #Wops=0, #iq=0, topRop=net.spy.memcached.protocol.binary.StoreOperationImpl@5f4275d4, topWop=null, toWrite=0, interested=1}, attempt 0.
2011-06-15 18:20:21.898 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: net.spy.memcached.protocol.binary.StoreOperationImpl@5f4275d4
2011-06-15 18:20:21.899 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: net.spy.memcached.protocol.binary.GetOperationImpl@802b249
Exception in thread "main" java.lang.RuntimeException: Exception waiting for value
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1146)
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1163)
I took node 192.168.1.9 down, but the client did not detect that and crashed.
Any ideas on how to handle this properly?
- Does the exception go away if you fail over the server?
- Which server's URI are you pointing the client at? Does it make a difference if you point it at the "other" one and/or both of them?
Perry
I believe this is not caused by the memcached cluster, because I got the same error message while running a single memcached server locally with a test case that caches and retrieves data. The environment was Mac OS X Snow Leopard, spymemcached 2.7, and memcached 1.4.6.
I was running memcached in daemon mode, and the error went away after I restarted my local memcached server.
I am sorry that I cannot say exactly what caused this, but restarting the server fixed the problem.
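Whatever the root cause, the crash in the log comes from the RuntimeException ("Exception waiting for value") that the blocking MemcachedClient.get throws on the main thread. One way to keep the app alive is to treat a failed read as a cache miss. Below is a minimal sketch of that idea, not a confirmed fix: it assumes Java 8 or later, and the class name SafeCacheRead and the method getOrFallback are hypothetical names made up for illustration. It uses only java.util.concurrent, so it runs without a cluster. With spymemcached you would presumably pass something like () -> client.asyncGet(key), since asyncGet returns a Future.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper: treats any failed or slow cache read as a cache miss.
public class SafeCacheRead {

    // Starts the request, waits up to timeoutMs, and returns the fallback
    // instead of throwing if the node is down, slow, or the op was cancelled.
    public static <T> T getOrFallback(Supplier<Future<T>> request, long timeoutMs, T fallback) {
        Future<T> future = null;
        try {
            future = request.get();
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException | RuntimeException e) {
            // RuntimeException also covers CancellationException.
            if (future != null) {
                future.cancel(false);
            }
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            if (future != null) {
                future.cancel(false);
            }
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for cache operations; no real server is contacted here.
        Future<String> healthy = CompletableFuture.completedFuture("cached");

        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IOException("Connection reset by peer"));

        Future<String> hung = new CompletableFuture<>(); // never completes

        System.out.println(getOrFallback(() -> healthy, 100, "fallback")); // prints "cached"
        System.out.println(getOrFallback(() -> failed, 100, "fallback"));  // prints "fallback"
        System.out.println(getOrFallback(() -> hung, 50, "fallback"));     // prints "fallback"
    }
}
```

This only stops the exception from killing the app. It does not make the client aware that the node has failed over, so reads for keys owned by the dead node will keep returning the fallback until the cluster side is sorted out.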
So, we didn't find an answer to the question, and we don't use this software anymore. We now use Erlang, which completely eliminates the need for this kind of in-memory data store.