How to get 174 GB of data into my Azure Table Storage
I have 174 GB of XML files that I need to get into Azure Table Storage. What is the best way of doing this? After upload, the XML files should be parsed and their content put into different tables in my Azure Table Storage by an Azure compute instance.
That's a lot of data. Today, the only way to get things into Windows Azure is to upload them over HTTP, and 174 GB is going to take a very long time over most network connections.
That said, I would suggest uploading the XML into blob storage and then running code (in a worker role) that pulls the XML from blob storage, parses it, and writes it to tables. In other words, upload the raw XML, and do the translation into tables in the cloud, where latency is low and bandwidth is high. A minimal sketch of that pipeline follows.
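Here is one way the pull-parse-write loop could look, assuming the current Python SDKs (`azure-storage-blob`, `azure-data-tables`) rather than the .NET worker-role APIs; the connection string, the `raw-xml` container, the `Records` table, and the `<record>` element layout are all illustrative assumptions:

```python
# Minimal sketch: pull an XML blob, parse it as a stream, and write rows
# to table storage. The container/table names and the <record> element
# layout are assumptions for illustration.
import io
import xml.etree.ElementTree as ET

from azure.data.tables import TableServiceClient
from azure.storage.blob import BlobServiceClient

CONN = "<your-storage-connection-string>"  # assumed to be configured elsewhere

blobs = BlobServiceClient.from_connection_string(CONN)
tables = TableServiceClient.from_connection_string(CONN)
table = tables.create_table_if_not_exists("Records")

blob = blobs.get_blob_client(container="raw-xml", blob="part-0001.xml")
stream = io.BytesIO(blob.download_blob().readall())

# iterparse keeps memory use flat even for very large documents.
for _, elem in ET.iterparse(stream, events=("end",)):
    if elem.tag == "record":  # hypothetical element and attribute names
        table.upsert_entity({
            "PartitionKey": elem.get("category", "default"),
            "RowKey": elem.get("id"),
            "Payload": (elem.text or "").strip(),
        })
        elem.clear()  # free the parsed subtree as we go
```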
I would compress the files and store them in blob storage. From there, I would pull them into one or more worker roles and do the actual inserts. Things to keep in mind:
- Bandwidth into Windows Azure is free, so uploading the blobs costs nothing but time.
- Storage transactions are not free, so use batch inserts where possible (same table, same partition key); a million single-entity inserts costs about $1.
- You will get the fastest performance from inside Windows Azure itself. Download the files in parallel across instances (use blob leases to track which instance owns which file) and do the inserts there. The sketch after this list shows the lease-and-batch pattern.
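A sketch of that parallel-worker pattern, under the same SDK assumptions as above: each instance claims a compressed blob with a lease, decompresses it, and submits entities in batches of up to 100 per partition key (the limit for an entity-group transaction). `parse_rows` stands in for whatever XML-to-entity parsing you use, and the 60-second lease would need renewing for files that take longer to process.

```python
# Sketch: claim a blob with a lease so other workers skip it, then insert
# its rows in batches grouped by partition key. Names, the parse_rows
# helper, and the row shape are assumptions for illustration.
import gzip
from collections import defaultdict

from azure.core.exceptions import HttpResponseError
from azure.data.tables import TableServiceClient
from azure.storage.blob import BlobServiceClient

CONN = "<your-storage-connection-string>"

blobs = BlobServiceClient.from_connection_string(CONN)
table = TableServiceClient.from_connection_string(CONN).get_table_client("Records")
container = blobs.get_container_client("raw-xml")

for props in container.list_blobs():
    blob = container.get_blob_client(props.name)
    try:
        # The lease marks this blob as "claimed"; other instances move on.
        lease = blob.acquire_lease(lease_duration=60)
    except HttpResponseError:
        continue  # another instance already owns this file

    # parse_rows is a hypothetical XML-to-entity parser returning dicts
    # with PartitionKey/RowKey fields.
    rows = parse_rows(gzip.decompress(blob.download_blob().readall()))

    # A single batch may only touch one partition, so group first.
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[row["PartitionKey"]].append(("upsert", row))
    for ops in by_partition.values():
        for i in range(0, len(ops), 100):  # 100 entities max per batch
            table.submit_transaction(ops[i:i + 100])

    lease.release()
```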
It might also be possible to use an Azure Drive on your VM instance: upload the files to the VM, then copy them onto the mounted Azure Drive.