(How) Can I reduce socket latency?
I have written an HTTP proxy that does some stuff that's not relevant here, but it is increasing the client's time-to-serve by a huge amount (600us without proxy vs 60000us with it). I think I have found where the bulk of that time is coming from - between my proxy finishing sending back to the client and the client finishing receiving it. For now, server, proxy and client are running on the same host, using localhost as the addresses.
Once the proxy has finished sending (once it has returned from send() at least), I print the result of gettimeofday which gives an absolute time. When my client has received, it prints the result of gettimeofday. Since they're both on the same host, this should be accurate. All send() calls are with no flags, so they are blocking. The difference between the two is about 40000us.
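The timestamp comparison described above can be sketched as follows. This is only an illustration of the measurement, not the actual proxy or client code; the helper names `timeval_to_us` and `now_us` are made up for this example. Each process would print `now_us()` at the point of interest, and the difference between the two printed values is the delay in microseconds.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: convert a struct timeval to absolute microseconds. */
long long timeval_to_us(struct timeval tv)
{
    return (long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec;
}

/* Hypothetical helper: current wall-clock time in microseconds since the
 * epoch, or -1 if gettimeofday() fails. */
long long now_us(void)
{
    struct timeval tv;
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return timeval_to_us(tv);
}
```

Because both processes read the same wall clock on the same host, subtracting the proxy's value from the client's value gives the gap directly.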
The proxy's socket on which it listens for client connections is set up with the hints AF_UNSPEC, SOCK_STREAM and AI_PASSIVE. Presumably a socket from accept()ing on that will have the same parameters?
If I'm understanding all this correctly, Apache manages to do everything in 600us (including the equivalent of whatever is causing this 40000us delay). Can anybody suggest what might be causing this? I have tried setting the TCP_NODELAY option (I know I shouldn't, it's just to see if it made a difference) and the delay between finishing sending and finishing receiving went right down, I forget the number but <1000us.
This is all on Ubuntu Linux 2.6.31-19. Thanks for any help
40ms is the TCP ACK delay on Linux, which indicates that you are likely encountering a bad interaction between delayed ACKs and the Nagle algorithm. The best way to address this is to send all of your data using a single call to send() or sendmsg(), before waiting for a response. If that is not possible then certain TCP socket options including TCP_QUICKACK (on the receiving side), TCP_CORK (sending side), and TCP_NODELAY (sending side) can help, but can also hurt if used improperly. TCP_NODELAY simply disables the Nagle algorithm and is a one-time setting on the socket, whereas the other two must be set at the appropriate times during the life of the connection and can therefore be trickier to use.
You can't really do meaningful performance measurements on a proxy with the client, proxy and origin server on the same host.
Place them all on different hosts on a network. Use real hardware machines for them all, or specialised hardware test systems (e.g. Spirent).
Your methodology makes no sense. Nobody has 600us of latency to their origin server in practice anyway. Running all the tasks on the same host creates contention and a wholly unrealistic network environment.
INTRODUCTION:
I already praised mark4o for the truly correct answer to the general question of lowering latency. I would like to translate the answer in terms of how it helped solve my latency issue because I think it's going to be the answer most people come here looking for.
ANSWER:
In a real-time network app (such as a multiplayer game) where getting short messages between nodes as quickly as possible is critical, TURN NAGLE OFF. In most cases this means setting the "no-delay" flag to true.
DISCLAIMER:
While this may not solve the OP's specific problem, most people who come here will probably be looking for this answer to the general question of their latency issues.
ANECDOTAL BACK-STORY:
My game was doing fine until I added code to send two messages separately, but they were very close to each other in execution time. Suddenly, I was getting 250ms extra latency. As this was a part of a larger code change, I spent two days trying to figure out what my problem was. When I combined the two messages into one, the problem went away. Logic led me to mark4o's post and so I set the .Net socket member "NoDelay" to true, and I can send as many messages in a row as I want.
From e.g. the RedHat documentation:
Applications that require lower latency on every packet sent should be run on sockets with TCP_NODELAY enabled. It can be enabled through the setsockopt command with the sockets API:
int one = 1;
setsockopt(descriptor, SOL_TCP, TCP_NODELAY, &one, sizeof(one));
For this to be used effectively, applications must avoid doing small, logically related buffer writes. Because TCP_NODELAY is enabled, these small writes will make TCP send these multiple buffers as individual packets, which can result in poor overall performance.
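A slightly fuller sketch of the same call, with error checking and a read-back through getsockopt() to confirm the option took effect. The helper name `enable_nodelay` is hypothetical. It uses IPPROTO_TCP rather than the SOL_TCP shown in the quote; on Linux the two have the same value, and IPPROTO_TCP is the more portable spelling.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket and
 * read the option back.
 * Returns 1 if TCP_NODELAY is now on, 0 if it is off, -1 on error. */
int enable_nodelay(int fd)
{
    int one = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) != 0)
        return -1;

    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;

    return value != 0;
}
```

In a proxy this would typically be called on each descriptor returned by accept(), before the first write.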
In your case, that 40ms is probably just a scheduler time quantum. In other words, that's how long it takes your system to get back round to the other tasks. Try it on a real network, you'll get a completely different picture. If you have a multi-core machine, using virtual OS instances in Virtualbox or some other VM would give you a much better idea of what is really going to happen.
For a TCP proxy it would seem prudent on the LAN side to increase the TCP initial window size as discussed on linux-netdev and /. recently.
http://www.amailbox.org/mailarchive/linux-netdev/2010/5/26/6278007
http://developers.slashdot.org/story/10/11/26/1729218/Google-Microsoft-Cheat-On-Slow-Start-mdash-Should-You
Including paper on the topic by Google,
http://www.google.com/research/pubs/pub36640.html
And an IETF draft also by Google,
http://zinfandel.levkowetz.com/html/draft-ietf-tcpm-initcwnd-00
For Windows, I'm not sure if setting TCP_NODELAY helps. I tried that, but latency was still bad. One person suggested I try UDP, and that did the trick.
A few complicated examples of UDP did not work for me, but I ran across a simple one and it did the trick...
#include <Winsock2.h>
#include <WS2tcpip.h>
#include <system_error>
#include <string>
#include <iostream>

// RAII wrapper: initialises Winsock on construction, cleans up on destruction.
class WSASession
{
public:
    WSASession()
    {
        // WSAStartup returns the error code directly; WSAGetLastError is not valid here.
        int ret = WSAStartup(MAKEWORD(2, 2), &data);
        if (ret != 0)
            throw std::system_error(ret, std::system_category(), "WSAStartup Failed");
    }
    ~WSASession()
    {
        WSACleanup();
    }
private:
    WSAData data;
};

// Thin wrapper around a blocking IPv4 UDP socket.
class UDPSocket
{
public:
    UDPSocket()
    {
        sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (sock == INVALID_SOCKET)
            throw std::system_error(WSAGetLastError(), std::system_category(), "Error opening socket");
    }
    ~UDPSocket()
    {
        closesocket(sock);
    }
    // Send to a dotted-quad IPv4 address and port.
    void SendTo(const std::string& address, unsigned short port, const char* buffer, int len, int flags = 0)
    {
        sockaddr_in add = {};
        add.sin_family = AF_INET;
        add.sin_addr.s_addr = inet_addr(address.c_str());
        add.sin_port = htons(port);
        int ret = sendto(sock, buffer, len, flags, reinterpret_cast<SOCKADDR *>(&add), sizeof(add));
        if (ret < 0)
            throw std::system_error(WSAGetLastError(), std::system_category(), "sendto failed");
    }
    // Send to an address obtained earlier, e.g. from RecvFrom.
    void SendTo(sockaddr_in& address, const char* buffer, int len, int flags = 0)
    {
        int ret = sendto(sock, buffer, len, flags, reinterpret_cast<SOCKADDR *>(&address), sizeof(address));
        if (ret < 0)
            throw std::system_error(WSAGetLastError(), std::system_category(), "sendto failed");
    }
    // Receive one datagram and return the sender's address.
    sockaddr_in RecvFrom(char* buffer, int len, int flags = 0)
    {
        sockaddr_in from;
        int size = sizeof(from);
        int ret = recvfrom(sock, buffer, len, flags, reinterpret_cast<SOCKADDR *>(&from), &size);
        if (ret < 0)
            throw std::system_error(WSAGetLastError(), std::system_category(), "recvfrom failed");
        // make the buffer zero terminated, but only if there is room:
        // a datagram that fills the buffer would otherwise write past its end
        if (ret < len)
            buffer[ret] = 0;
        return from;
    }
    void Bind(unsigned short port)
    {
        sockaddr_in add = {};
        add.sin_family = AF_INET;
        add.sin_addr.s_addr = htonl(INADDR_ANY);
        add.sin_port = htons(port);
        int ret = bind(sock, reinterpret_cast<SOCKADDR *>(&add), sizeof(add));
        if (ret < 0)
            throw std::system_error(WSAGetLastError(), std::system_category(), "Bind failed");
    }
private:
    SOCKET sock;
};
Server
#define TRANSACTION_SIZE 8
static void startService(int portNumber)
{
    try
    {
        WSASession Session;
        UDPSocket Socket;
        char tmpBuffer[TRANSACTION_SIZE];
        INPUT input;
        input.type = INPUT_MOUSE;
        input.mi.mouseData = 0;
        input.mi.dwFlags = MOUSEEVENTF_MOVE;
        Socket.Bind(portNumber);
        while (1)
        {
            // Block until a datagram arrives; add holds the sender's address.
            sockaddr_in add = Socket.RecvFrom(tmpBuffer, sizeof(tmpBuffer));
            // ...do something with tmpBuffer...
            // data and len stand for the reply prepared in the step above.
            Socket.SendTo(add, data, len);
        }
    }
    catch (std::system_error& e)
    {
        std::cout << e.what();
    }
}
Client
// Same session and socket setup as the server, without the Bind call.
WSASession Session;
UDPSocket Socket;
const char *targetIP = "192.168.1.xxx";
Socket.SendTo(targetIP, targetPort, data, len);