Ensuring data is being read with async_read

I am currently testing my network application in very low bandwidth environments. I currently have code that attempts to ensure that the connection is good by making sure I am still receiving information.

Traditionally I have done this by recording the timestamp in my ReadHandler function so that each time it gets called I know I have received data on the socket. With very low bandwidths this isn't sufficient because my ReadHandler is not getting called frequently enough.

I was toying around with the idea of writing my own completion condition function (right now I am using transfer_at_least(1)), thinking it would get called more frequently and I could record my timestamp there, but I was wondering whether there is some other, more standard way to go about this.


We had a similar issue in production: some of our connections may be idle for days, but we must detect if the remote is dead ASAP.

We solved it by enabling the SO_KEEPALIVE socket option (TCP keep-alive):

boost::asio::socket_base::keep_alive option(true);
mSocketTCP.set_option(option);

This had to be accompanied by a new startup script that writes sensible values to /proc/sys/net/ipv4/tcp_keepalive_*, because the defaults on Linux are very long (the first probe is sent only after two hours of idle time).
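As an alternative to changing the system-wide values, Linux also lets the keep-alive timing be set per socket. The following is a minimal sketch under that assumption; enable_keepalive and its parameter names are hypothetical, and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper (not part of Boost.Asio): turn on keep-alive for one
// socket and override the system-wide timing for that socket only.
// Linux-specific. Returns true if every setsockopt call succeeded.
bool enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0
        // Idle time before the first probe is sent.
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) == 0
        // Time between unanswered probes.
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) == 0
        // Unanswered probes before the connection is reported dead.
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof(probe_count)) == 0;
}
```

With Boost.Asio the descriptor would presumably come from mSocketTCP.native_handle() (native() in older Boost releases), for example enable_keepalive(mSocketTCP.native_handle(), 60, 10, 3). The numbers are only illustrative.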


You can use the read_some method to get partial reads and do the bookkeeping yourself. This is more efficient than transfer_at_least(1), but you still have to keep track of what is going on.
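The bookkeeping itself can be quite small. Here is a rough sketch, assuming all that is needed is the time of the last non-empty read; ActivityTracker and its members are hypothetical names, and it deliberately has no dependency on Asio so it can be called from whichever read handler you use.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical bookkeeping helper: remembers when bytes last arrived.
class ActivityTracker {
public:
    using Clock = std::chrono::steady_clock;

    explicit ActivityTracker(Clock::time_point start = Clock::now())
        : last_activity_(start), total_bytes_(0) {}

    // Call from every read handler with the number of bytes it received.
    // Zero-byte completions do not count as activity.
    void record(std::size_t bytes, Clock::time_point now = Clock::now()) {
        if (bytes == 0) return;
        total_bytes_ += bytes;
        last_activity_ = now;
    }

    // True if nothing has arrived for longer than `limit`.
    bool stalled(Clock::duration limit, Clock::time_point now = Clock::now()) const {
        return now - last_activity_ > limit;
    }

    std::size_t total_bytes() const { return total_bytes_; }

private:
    Clock::time_point last_activity_;
    std::size_t total_bytes_;
};
```

The read handler would call record(bytes_transferred), and a periodic check would call stalled(...) with whatever limit suits the expected bandwidth.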

However, a cleaner approach is to run a deadline_timer concurrently with the operation. If the timer goes off before you are finished, the operation is taking too long, so cancel whatever is going on. If not, just stop the timer and continue. Something like:

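// Assumes io_service is the application's boost::asio::io_service;
// a deadline_timer cannot be constructed without one.
boost::asio::deadline_timer t(io_service);
t.expires_from_now(boost::posix_time::seconds(20));
t.async_wait(boost::bind(&Class::timed_out, this, boost::asio::placeholders::error));

// Do stuff.

// cancel() returns the number of pending waits it cancelled.
if (!t.cancel()) {
   // Nothing was pending, so the timer already went off: abort.
}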


// And the timeout method

void Class::timed_out(boost::system::error_code const& error)
{
    // The timer was cancelled before it expired: nothing to do.
    if (error == boost::asio::error::operation_aborted) return;
    // Deal with the timeout, close the socket, etc.
}


I don't know how to handle high network latency from within the application. Can you be sure that it is network latency, and not the peer server or peer application being busy and reacting slowly? Does it matter whether the network, the server, or the application is at fault?

Even if you can detect that the network latency is high, what are you going to do about it? You cannot improve the situation.

Consider another critical case that is a subset of what you're trying to handle: the network is down (e.g. you unplug the cable from your machine). Since it is a subset of your problem, you want to handle it too.

Let's examine the effect of a network outage on an active TCP connection. How can you discover whether your TCP connection is still alive? Calling send() will succeed, but that only means the message has been queued in the kernel's outgoing TCP queue. The TCP stack will try to send it, but since no TCP ACK comes back, the stack on your side will retransmit it again and again. You can see your message in the netstat output (Send-Q column).

I'm aware of the following ways to deal with it:

  • One standard way is TCP keep-alive, as proposed by @Cubby.

  • Another way is to implement your own Keep Alive mechanism at the application level: you send a Keep Alive req message, and the peer is obligated to send back a Keep Alive ack message.

    If you don't receive the ack message within a predefined timeout, try to send the Keep Alive req N more times (e.g. N=2). If there is still no success, close the socket and open it again. If the peer server is not available you will not be able to open the connection, since the TCP three-way handshake requires the peer to respond.
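The retry logic of such an application-level Keep Alive can be sketched as a small state machine. Everything here is hypothetical (KeepAliveMonitor, Action and the method names); it only decides what to do next and leaves the actual sending, closing and reconnecting to the caller.

```cpp
#include <chrono>

// Hypothetical sketch of the Keep Alive req/ack bookkeeping.
class KeepAliveMonitor {
public:
    using Clock = std::chrono::steady_clock;
    enum class Action { None, SendRequest, Reconnect };

    KeepAliveMonitor(Clock::duration ack_timeout, int max_retries)
        : ack_timeout_(ack_timeout), max_retries_(max_retries),
          waiting_(false), retries_(0), sent_at_() {}

    // Call right after the first Keep Alive req has been written.
    void request_sent(Clock::time_point now) {
        waiting_ = true;
        retries_ = 0;
        sent_at_ = now;
    }

    // Call when a Keep Alive ack arrives from the peer.
    void ack_received() {
        waiting_ = false;
        retries_ = 0;
    }

    // Call periodically, e.g. from a timer. Tells the caller what to do.
    Action poll(Clock::time_point now) {
        if (!waiting_ || now - sent_at_ < ack_timeout_) return Action::None;
        if (retries_ < max_retries_) {
            ++retries_;
            sent_at_ = now;  // the caller is expected to resend the req now
            return Action::SendRequest;
        }
        waiting_ = false;    // retries exhausted: give up on this socket
        return Action::Reconnect;
    }

private:
    Clock::duration ack_timeout_;
    int max_retries_;
    bool waiting_;
    int retries_;
    Clock::time_point sent_at_;
};
```

With max_retries set to 2 this gives the behaviour described above: one initial req, two retries, then close and reopen the socket.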
