开发者

Why is downloading from Azure blobs taking so long?

In my Azure web role OnStart() I need to deploy a huge unmanaged program the role depends on. The program is previously compressed into a 400-megabytes .zip archive, 开发者_JS百科splitted to files 20 megabytes each and uploaded to a blob storage container. That program doesn't change - once uploaded it can stay that way for ages.

My code does the following:

CloudBlobContainer container = ... ;
String localPath = ...;
using( FileStream writeStream = new FileStream(
             localPath, FileMode.OpenOrCreate, FileAccess.Write ) )
{
     for( int i = 0; i < blobNames.Size(); i++ ) {
         String blobName = blobNames[i];
         container.GetBlobReference( blobName ).DownloadToStream( writeStream );
     }
     writeStream.Close();
}

It just opens a file, then writes parts into it one by one. Works great, except it takes about 4 minutes when run from a single core (extra small) instance. Which means the average download speed about 1,7 megabytes per second.

This worries me - it seems too slow. Should it be so slow? What am I doing wrong? What could I do instead to solve my problem with deployment?


Adding to what Richard Astbury said: An Extra Small instance has a very small fraction of bandwidth that even a Small gives you. You'll see approx. 5Mbps on an Extra Small, and approx. 100Mbps on a Small (for Small through Extra Large, you'll get approx. 100Mbps per core).


The extra small instance has limited IO performance. Have you tried going for a medium sized instance for comparison?


In some ad-hoc testing I have done in the past I found that there is no discernable difference between downloading 1 large file and trying to download in parallel N smaller files. It turns out that the bandwidth on the NIC is usually the limiting factor no matter what and that a large file will just as easily saturate it as many smaller ones. The reverse is not true, btw. You do benefit by uploading in parallel as opposed to one at a time.

The reason I mention this is that it seems like you should be using 1 large zip file here and something like Bootstrapper. That would be 1 line of code for you to download, unzip, and possibly run. Even better, it won't do it more than once on reboot unless you force it to.

As others have already aptly mentioned, the NIC bandwidth on the XS instances is vastly smaller than even a S instance. You will see much faster downloads by bumping up the VM size slightly.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜