Why do I get "Error: EADDRINUSE, Address already in use" when stress-testing Node.js with cradle and CouchDB?

I am trying to measure the throughput of a simple Node.js program with a CouchDB backend using cradle as the DB driver. When I put load against the program I get the following error within 30 seconds:

EADDRINUSE, Address already in use

Here is my program:

var http = require ('http'),
    url = require('url'),
    cradle = require('cradle'),
    c = new(cradle.Connection)('127.0.0.1',5984,{cache: false, raw: false}),
    db = c.database('testdb'),
    port=8081;

http.createServer(function(req,res) {
    var id = url.parse(req.url).pathname.substring(1);  
    db.get(id,function(err, doc) {
      res.writeHead(200,{'Content-Type': 'application/json'});
      res.write(JSON.stringify(doc));
      res.end();
    });
}).listen(port);

console.log("Server listening on port "+port);

I am using a JMeter script with 50 concurrent users. The average response time is 120ms, average size of the document returned 3KB.

As you can see, I set Cradle's cache option to false. To investigate, I looked at the number of waiting sockets (netstat | grep WAIT | wc -l): it increases to about 4000, at which point the program crashes.

To test other options I set the caching to true. In this case the program doesn't crash, but the number of waiting sockets increases to almost 10000 over time.

I also wrote the same program (sans the asynchronous part) as a Java Servlet, and it runs fine without the number of waiting sockets increasing much beyond 20.

My question is: Why do I get the 'EADDRINUSE, Address already in use' error? Why is the number of waiting sockets so high?

P.S.: This is a snippet from the output of netstat|grep WAIT:

tcp4       0      0  localhost.5984         localhost.58926        TIME_WAIT
tcp4       0      0  localhost.5984         localhost.58925        TIME_WAIT
tcp4       0      0  localhost.58924        localhost.5984         TIME_WAIT
tcp4       0      0  localhost.58922        localhost.5984         TIME_WAIT
tcp4       0      0  localhost.5984         localhost.58923        TIME_WAIT


Are you sure you don't have a zombie process on 8081?

    ps aux | grep node

might help

I also wrote an article to help people get started with Node and CouchDB. If you are interested, you can check it out at http://writings.nunojob.com/2011/09/getting-started-with-nodejs-and-couchdb.html


Upgrade to Cradle 0.5.6. It does not have the problem.

Speculation about the problem

The waiting sockets are probably in the CLOSE_WAIT state. (There are other states that would match your grep, such as TIME_WAIT. Can you confirm that it is CLOSE_WAIT and not anything else?)
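One way to answer that is to tally the sockets by state instead of grepping for WAIT. The following is only a minimal sketch: it assumes the state is the last column of each tcp line, as in the netstat snippet from the question, and the function name countTcpStates is made up for illustration.

```javascript
// Tally TCP sockets by connection state from `netstat` text output.
// Assumes the state is the last whitespace-separated field of each tcp line,
// as in the snippet posted in the question.
function countTcpStates(netstatOutput) {
  var counts = {};
  netstatOutput.split('\n').forEach(function (line) {
    var fields = line.trim().split(/\s+/);
    // Skip headers, blank lines, and anything that is not a TCP socket line.
    if (fields.length < 6 || fields[0].indexOf('tcp') !== 0) return;
    var state = fields[fields.length - 1];
    counts[state] = (counts[state] || 0) + 1;
  });
  return counts;
}

// The five lines from the question's netstat output.
var sample = [
  'tcp4       0      0  localhost.5984         localhost.58926        TIME_WAIT',
  'tcp4       0      0  localhost.5984         localhost.58925        TIME_WAIT',
  'tcp4       0      0  localhost.58924        localhost.5984         TIME_WAIT',
  'tcp4       0      0  localhost.58922        localhost.5984         TIME_WAIT',
  'tcp4       0      0  localhost.5984         localhost.58923        TIME_WAIT'
].join('\n');

console.log(countTcpStates(sample)); // { TIME_WAIT: 5 }
```

To run it against live data, the output of netstat -an could be captured with child_process.exec and passed to the same function.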

The linked post has a helpful quote:

RFC 793 says CLOSE_WAIT is the TCP/IP stack waiting for the local application to release the socket. So, it hangs because it has received the information that the remote host has initiated a disconnection and is closing its socket, upon what the local application did not close its own side.

So maybe the solution consists in finding a bug fix for your application...

Indeed. In your case, there are two connections per query: one from JMeter to Node, and another from Node to CouchDB. Either JMeter (older, more mature software) is not closing the connection properly, or Cradle (newer, less mature software) is not. Cradle is the more likely of the two to have the bug. (Perhaps it is Node.js's HTTP library itself, but Cradle seems like the first place to check.)

I do not have a complete answer, but hopefully these will be helpful clues. I think the address-in-use error occurs because there are no more free source ports left to make an "outgoing" connection (even one to 127.0.0.1). But I am so far unsure why the CLOSE_WAIT count is different in each trial. (Perhaps it is fluctuating heavily as entire connection pools are closed.)
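A back-of-the-envelope check of that theory, as a sketch only: if every request opens a fresh connection from Node to CouchDB and each closed connection lingers in a waiting state for a while, the number of tied-up ports is roughly the request rate times the linger time. The 50 users and 120 ms come from the question. The 30-second linger time and the 16384-port ephemeral range are assumptions that vary by operating system and configuration, and the function name is hypothetical.

```javascript
// Rough steady-state estimate: lingering sockets = request rate x linger time.
function estimateLingeringSockets(concurrentUsers, avgResponseSeconds, lingerSeconds) {
  // Assumes each user fires the next request immediately (no think time),
  // so this is an upper bound on the request rate.
  var requestsPerSecond = concurrentUsers / avgResponseSeconds;
  return Math.round(requestsPerSecond * lingerSeconds);
}

var lingering = estimateLingeringSockets(50, 0.120, 30); // 30 s linger time is an assumption
var ephemeralPorts = 16384;                              // assumed range 49152-65535; OS-dependent

console.log('estimated lingering sockets:', lingering);
console.log('fraction of assumed port range:', lingering / ephemeralPorts);
```

Under these assumptions the estimate lands in the thousands, which is at least consistent with the socket counts reported in the question, although it does not by itself show which component is at fault.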

To gain more information, perhaps try an alternative CouchDB client library such as request or nano and compare the results.
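Another baseline for comparison, sketched here with only Node's core http module rather than either of the libraries named above, is to fetch the document over a keep-alive agent so that connections to CouchDB are reused instead of opened per request. This assumes a Node.js version whose http.Agent accepts the keepAlive option. The host, port, and database name are copied from the question; getDoc, couchAgent, and the maxSockets value are illustrative choices, not anything Cradle provides.

```javascript
var http = require('http');

// Reuse a bounded pool of sockets to CouchDB instead of one new socket per request.
// The pool size of 10 is an arbitrary starting point, not a recommendation.
var couchAgent = new http.Agent({ keepAlive: true, maxSockets: 10 });

// Fetch one document from testdb and pass the parsed JSON to callback(err, doc).
function getDoc(id, callback) {
  var req = http.get({
    host: '127.0.0.1',
    port: 5984,
    path: '/testdb/' + encodeURIComponent(id),
    agent: couchAgent
  }, function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      var doc;
      try {
        doc = JSON.parse(body);
      } catch (e) {
        return callback(e);
      }
      if (res.statusCode !== 200) {
        return callback(new Error('CouchDB responded with status ' + res.statusCode));
      }
      callback(null, doc);
    });
  });
  // Connection-level failures (including EADDRINUSE) surface here.
  req.on('error', callback);
}
```

In the server from the question, db.get(id, ...) would become getDoc(id, ...). Watching netstat during the same JMeter run should then indicate whether the waiting sockets come from the client library or from connection churn in general.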

Please let us know what you find, because it would be great to identify and close this potential Cradle bug (or bug somewhere at least!). Thanks.
