
Looking for solutions to limit HTTP requests to multiple hosts to what the connection can afford while maximizing throughput

In an application that downloads many documents over HTTP in parallel, I would like to make optimal use of the network connection without pushing it beyond its limits and getting timeouts.

I am thinking this has to do with congestion control: gradually increase the request frequency until the network connection appears to be overburdened, drop the frequency slightly, and then keep monitoring so the rate can be adjusted continuously.
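As a rough illustration of that idea, here is a minimal AIMD-style (additive increase, multiplicative decrease) sketch. The class name and the step/backoff constants are hypothetical choices, not anything from a library; the congestion signal it consumes is discussed next.

```csharp
using System;

// Hypothetical sketch: additively increase the request rate while the network
// looks healthy, back off multiplicatively when it appears overburdened.
class RequestRateController
{
    private double _requestsPerSecond = 1.0;   // current issue rate
    private const double Increase = 0.5;       // additive step (requests/sec, assumed)
    private const double Backoff = 0.7;        // multiplicative decrease factor (assumed)
    private const double MaxRate = 100.0;      // upper bound (assumed)

    public double RequestsPerSecond
    {
        get { return _requestsPerSecond; }
    }

    // Feed in the latest congestion verdict; the dispatcher paces new requests
    // according to RequestsPerSecond.
    public void Update(bool overburdened)
    {
        _requestsPerSecond = overburdened
            ? Math.Max(0.1, _requestsPerSecond * Backoff)
            : Math.Min(MaxRate, _requestsPerSecond + Increase);
    }
}
```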

The bit I'm having trouble with is how best to detect the overburdened network condition. If I were to measure the time between issuing a request and the beginning of the response, that would effectively give me a round trip time. If the average of this time increases significantly then we have an overburdened network. I wonder what 'significantly' should mean in this case.
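One possible reading of "significantly" is sketched below: keep an exponentially weighted moving average of the request-to-first-byte time and flag the network as overburdened when a new sample exceeds that average by some factor. The smoothing factor and the 50% threshold here are assumptions for illustration, not established values.

```csharp
// Hypothetical congestion detector based on a moving average of round-trip times.
class CongestionDetector
{
    private double _avgRttMs = -1;       // exponential moving average of observed RTTs
    private const double Alpha = 0.2;    // EMA smoothing factor (assumed)
    private const double Threshold = 1.5; // "significant" = 50% above average (assumed)

    // Record one measured request-to-first-byte time; returns true if this
    // sample suggests the network is overburdened.
    public bool RecordSample(double rttMs)
    {
        if (_avgRttMs < 0)
        {
            _avgRttMs = rttMs;   // seed the average with the first sample
            return false;
        }

        bool overburdened = rttMs > _avgRttMs * Threshold;
        _avgRttMs = Alpha * rttMs + (1 - Alpha) * _avgRttMs;
        return overburdened;
    }
}
```

For what it's worth, TCP's own retransmission timer (RFC 6298) uses a similar smoothed RTT but also tracks RTT variance, so comparing a sample against "mean plus a few times the variance" is one way to make the threshold less arbitrary than a fixed percentage.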

Does this sound about right? Can you shed any more light on this problem? Anyone out there coded this scenario?

I have tagged this question .net because that is the framework I'm using, and if there is framework support for this scenario, then I'd like to know.

EDIT: To clarify, I am talking about many hosts here, and only one instance of the application. I already have in place a system to avoid simultaneous connections to the same server (requests are delivered end to end), so the question is not so much how to saturate the pipe (I know how to do this), but how best to limit requests so as to avoid timeout errors.


Unless you are coding this just for personal use, you should also consider what happens if multiple clients are hitting the same server at the same time using your algorithm.

Traditionally, Web browsers limited themselves to two simultaneous connections per Web server. IE8 increased this to six, pissing off lots of Web server admins. See here for more discussion of this issue.

Note that TCP already has congestion control algorithms that attempt to saturate the pipe even for one (1) connection. If the documents you are downloading are not tiny (10s of kilobytes or more), you will probably find that opening tons of connections to the same server will not speed things up, and may slow them down.

The only way lots of connections to the same server will help is if (a) it is heavily loaded and your goal is just to suck up more than your "fair share" of the server's bandwidth; or (b) you are downloading lots of tiny files over distinct HTTP connections, so the TCP algorithm does not have enough time to adapt itself to the available bandwidth of the link.

My suggestion, which I doubt you will like, is to open a fixed number of connections per server (e.g., two) and just let TCP do its job.
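Since the question is tagged .net: the classic HTTP stack already exposes a per-host connection cap through ServicePointManager, whose default for client applications is two connections per host. A minimal sketch of configuring it (the URL is just a placeholder):

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Applies to ServicePoints created after this point; the classic
        // client-app default is 2 simultaneous connections per host.
        ServicePointManager.DefaultConnectionLimit = 2;

        // Or adjust the limit for one particular host only.
        ServicePoint sp = ServicePointManager.FindServicePoint(new Uri("http://example.com/"));
        sp.ConnectionLimit = 2;

        // HttpWebRequest / WebClient calls to that host will queue internally
        // once the limit is reached, rather than opening more sockets.
    }
}
```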


Thanks Nemo. I implemented your suggestion of monitoring bandwidth with a moving average. I use this value to adjust a value representing the target number of outstanding requests, and I orchestrate the issuing of new requests so that the outstanding count tends towards this moving target.
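Roughly, that approach could look like the sketch below: a crude hill-climb that raises the target while the throughput average keeps improving and eases it back otherwise. All names and constants here are hypothetical, and the details (how throughput is sampled, how the dispatcher consults the governor) are left out.

```csharp
using System;

// Hypothetical sketch of the approach described above: a moving average of
// throughput drives a floating target for the number of outstanding requests.
class OutstandingRequestGovernor
{
    private double _avgBytesPerSec;    // moving average of observed throughput
    private double _bestBytesPerSec;   // best average seen so far
    private double _target = 4;        // target number of outstanding requests (assumed start)

    public int Outstanding;            // requests currently in flight

    // Call with each new throughput measurement.
    public void RecordThroughput(double bytesPerSec)
    {
        _avgBytesPerSec = 0.2 * bytesPerSec + 0.8 * _avgBytesPerSec;

        // While the average keeps improving, drift the target up; otherwise ease it down.
        if (_avgBytesPerSec >= _bestBytesPerSec)
        {
            _bestBytesPerSec = _avgBytesPerSec;
            _target += 0.25;
        }
        else
        {
            _target = Math.Max(1, _target - 0.25);
        }
    }

    // The dispatcher issues a new request only when this returns true.
    public bool MayIssueRequest()
    {
        return Outstanding < (int)Math.Round(_target);
    }
}
```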

Someone also suggested using a bandwidth-limiting proxy.
