Computing network distance between two hosts
I want to compute some metrics relative to the "distance" between two hosts in a network app. I came up with the following naïve solution, inspired by ping (rough sketch below):
- Send UDP packets of varying size.
- Wait for a response from the other node.
- Compute the time between send and receive.
- Normalize this data and compute my metrics over it.
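Roughly, I mean something like this (just a sketch, assuming the other node runs a plain UDP echo service on a made-up port 9999 and a placeholder address):

    import socket
    import time

    def measure_rtt(host, port=9999, size=512, timeout=2.0):
        """Send one UDP packet of `size` bytes and time the echoed response.
        Assumes the other node simply echoes back whatever it receives."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        try:
            start = time.monotonic()
            sock.sendto(b"x" * size, (host, port))
            sock.recvfrom(65535)                 # wait for the echo
            return time.monotonic() - start      # RTT in seconds
        except socket.timeout:
            return None                          # count as a lost packet
        finally:
            sock.close()

    # Packets of varying size; the results would then be normalized.
    for size in (64, 256, 1024):
        print(size, measure_rtt("192.0.2.1", size=size))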
I'd like to avoid managing raw sockets, but if that's a better option, please tell me.
Would you recommend another solution?
EDIT:
I think I was not clear on this. I know what TTL and traceroute are, and that's not what I am looking for.
What I am looking for is a better metric that combines latency, bandwidth and, yes, the traditional distance between hosts (because I think traceroute alone is not that useful for managing a protocol). That's the motivation for using ping-like measures.
The question becomes: can you instead modify the existing protocol, or be more industrious and capture RTT details from existing request-reply messages?
If you modify the existing protocol, by, say, adding a transmission timestamp, you can perform additional analytics server-side. You might still be able to infer times if there is a request-reply exchange from the server to the client.
The main idea is that adding extra messages purely for path-latency measurement is often redundant and only serves to increase network chatter and complexity.
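For instance, piggybacking a send timestamp onto an existing request-reply exchange might look roughly like this (a sketch only; the JSON framing and field names are made up):

    import json
    import time

    def make_request(payload):
        # Attach the client's send time to a message it was sending anyway.
        return json.dumps({"sent_at": time.monotonic(), "payload": payload}).encode()

    def handle_reply(reply_bytes):
        # If the server echoes the timestamp back in its reply, the client can
        # compute the RTT from its own clock, with no extra probe messages.
        reply = json.loads(reply_bytes)
        rtt = time.monotonic() - reply["sent_at"]
        return rtt, reply["payload"]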
The definition of the metric you are looking for depends on its purpose - you can define it in many ways, and which way is best always depends on what you want it for.
In general, you are looking for some function distance(A, B), which would typically be a function of the bandwidth and latency between A and B:
distance(A, B) = f(bandwidth(A, B), latency(A, B))
the shape of the function f() would depend on the purpose, on the application - on what you really need to optimize. The simplest choice would be a linear function:
distance(A, B) = alpha * bandwidth + beta * latency
and again, the coefficients alpha and beta would depend on what you are trying to optimize. If you have measured some variable that reflects your system's performance, you can do statistical analysis (regression) to find the optimal parameters:
performance(A, B) ~ alpha * bandwidth(A, B) + beta * latency(A, B)
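As a sketch (the numbers below are made-up samples), a plain least-squares fit would give you alpha and beta:

    import numpy as np

    # Made-up measurements for a few host pairs: bandwidth in Mbit/s,
    # latency in ms, and some observed performance score.
    bandwidth   = np.array([90.0, 45.0, 10.0, 70.0])
    latency     = np.array([ 5.0, 20.0, 80.0, 12.0])
    performance = np.array([0.95, 0.70, 0.20, 0.85])

    # Fit performance ~ alpha * bandwidth + beta * latency.
    X = np.column_stack([bandwidth, latency])
    (alpha, beta), *_ = np.linalg.lstsq(X, performance, rcond=None)

    def distance(bw, lat):
        # The resulting linear "distance" metric.
        return alpha * bw + beta * lat

    print(alpha, beta, distance(50.0, 30.0))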
Also be careful when you speak about metrics. Each metric must fulfil the following condition (the triangle inequality):
distance(A, B) + distance(B, C) >= distance(A, C)
This is not always true in computer networks, since it depends on routing decisions.
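If you do need a true metric, it is easy to check your measured values for violations (a sketch, with made-up numbers):

    from itertools import permutations

    # Made-up measured "distances" (e.g. smoothed RTTs in ms), symmetric here.
    dist = {
        ("A", "B"): 12.0, ("B", "A"): 12.0,
        ("B", "C"): 30.0, ("C", "B"): 30.0,
        ("A", "C"): 55.0, ("C", "A"): 55.0,
    }
    hosts = {h for pair in dist for h in pair}

    # Flag every triple that breaks distance(A, B) + distance(B, C) >= distance(A, C).
    for a, b, c in permutations(hosts, 3):
        if dist[(a, b)] + dist[(b, c)] < dist[(a, c)]:
            print("triangle inequality violated:", a, b, c)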
IMHO it highly depends on the specific details of your application.
- Some apps are more delay-sensitive, while losing some packets is acceptable - like VoIP. Then you need to measure response time, ignoring lost packets.
- Others need fast responses (so they are also delay-sensitive), and lost packets need to be retransmitted, but the amount of data is low - then you need to measure the way you described.
- Depending on how retransmission impacts your app, you have to calculate either the average, the variance or some other statistic - let's call it d.
- Depending on whether redundancy is acceptable, you can send each packet n times to reduce retransmissions. Then make d a function d = f(n) and compare the functions.
- Some applications mostly need throughput, while the delay may be very long (even hours). Then you might be interested in keeping a "sliding window" statistic for a given period of time, to know how much data was transferred during the "last t minutes", and updating that value constantly (see the sketch after this list).
There could be many other metrics, including ones that use redundant connections between hosts when reliability is the priority. So it highly depends on the application.
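For the sliding-window case mentioned above, a minimal sketch could look like this (the window length and units are arbitrary):

    import time
    from collections import deque

    class SlidingWindowThroughput:
        """Track how many bytes were transferred during the last `window` seconds."""

        def __init__(self, window=60.0):
            self.window = window
            self.samples = deque()          # (timestamp, nbytes) pairs

        def record(self, nbytes):
            self.samples.append((time.monotonic(), nbytes))
            self._evict()

        def bytes_in_window(self):
            self._evict()
            return sum(n for _, n in self.samples)

        def _evict(self):
            cutoff = time.monotonic() - self.window
            while self.samples and self.samples[0][0] < cutoff:
                self.samples.popleft()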
In networks "distance" usualy is measured in terms of hops. Time does not really represent distance accurately because it is prone to short-term congestion and other network issues. Take a look at traceroute to see how to measure distance in terms of hops by sending packets with increasing TTLs.
Edit: Now that your question has additional details - Latency and bandwidth can never be meaningfully combined together into a generic metric. You may want to device a weightage depending on what your application prefers (latency vs bandwidth).
It seems to me like a smoothed RTT is going to serve you better. Something like what TCP maintains, a long time average of RTTs with a smoothing factor to account for anomalies. There is no one good way of doing this, so you may want to search for "RTT smoothing" and experiment with a few of them.
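A minimal sketch of one common smoothing scheme, an exponentially weighted moving average similar in spirit to TCP's SRTT (RFC 6298 uses a gain of 1/8):

    class SmoothedRtt:
        def __init__(self, alpha=0.125):
            self.alpha = alpha      # smoothing factor: lower = smoother
            self.srtt = None

        def update(self, rtt_sample):
            if self.srtt is None:
                self.srtt = rtt_sample   # first sample seeds the estimate
            else:
                self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_sample
            return self.srtt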
I think what you want is to use the packet's time-to-live field:
Time To Live (TTL)
An eight-bit time to live field helps prevent datagrams from persisting (e.g. going in circles) on an internet. This field limits a datagram's lifetime. It is specified in seconds, but time intervals less than 1 second are rounded up to 1. In latencies typical in practice, it has come to be a hop count field. Each router that a datagram crosses decrements the TTL field by one. When the TTL field hits zero, the packet is no longer forwarded by a packet switch and is discarded. Typically, an ICMP message (specifically the time exceeded) is sent back to the sender to inform it that the packet has been discarded. The reception of these ICMP messages is at the heart of how traceroute works.
In a nutshell, you can send successive IP packets, decrementing the time-to-live for each one you send. Once the destination stops answering (because the packet expires in transit instead), you know roughly how many hops exist between the source and destination hosts.
If you don't want to work with the sockets yourself, you can simply use the ping command, which provides an option that lets you specify the time-to-live value to use for the ping packets.
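For example, wrapping the ping command could look like this (a sketch assuming Linux iputils ping, where -c is the packet count, -t the TTL and -W the timeout in seconds; the flags differ on other platforms, e.g. macOS uses -m for the TTL):

    import subprocess

    def hop_count(host, max_hops=30):
        """Estimate the hop count by pinging with increasing TTL values."""
        for ttl in range(1, max_hops + 1):
            result = subprocess.run(
                ["ping", "-c", "1", "-t", str(ttl), "-W", "1", host],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode == 0:   # an echo reply came back at this TTL
                return ttl
        return None                      # no reply within max_hops

    print(hop_count("example.com"))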