Performance - Table Service, SQL Azure - insert. Query speed on large amount of data

2023-01-19 02:09 问答作者：

I'd read many posts and articles about comparing SQL Azure and Table Service and most of them told that Table Service is more scalable than SQL Azure.

http://www.silverlight-travel.com/blog/2010/03/31/azure-table-storage-sql-azure/
http://www.intertech.com/Blog/post/Windows-Azure-Table-Storage-vs-Windows-SQL-Azure.aspx
Microsoft Azure Storage vs. Azure SQL Database
https://social.msdn.microsoft.com/Forums/en-US/windowsazure/thread/2fd79cf3-ebbb-48a2-be66-542e21c2bb4d
https://blogs.msdn.com/b/windowsazurestorage/archive/2010/05/10/windows-azure-storage-abstractions-and-their-scalability-targets.aspx
https://stackoverflow.com/questions/2711868/azure-performance
http://vermorel.com/journal/2009/9/17/table-storage-or-the-100x-cost-factor.html
Azure Tables or SQL Azure?
http://www.brentozar.com/archive/2010/01/sql-azure-frequently-asked-questions/
https://code.google.com/p/lokad-cloud/wiki/FatEntities

Sorry for http, I'm new user >_< But http://azurescope.cloudapp.net/BenchmarkTestCases/ benchmark shows different picture.

My case. Using SQL Azure: one table with many inserts, about 172,000,000 per day(2000 per second). Can I expect good perfomance for inserts and selects when I have 2 million records or 9999....9 billion records in one table?

Using Table Service: one table with some number of partitions. Number of partitions can be large, very large.

Question #1: is Table service has some limitations or best practice for creating many, many, many partitions in one table?

Question #2: in a single partition I have a large amount of small entities, like in SQL Azure example above. Can I expect good p开发者_运维技巧erfomance for inserts and selects when I have 2 million records or 9999 billion entities in one partition?

I know about sharding or partition solutions, but it is a cloud service, is cloud not powerfull and do all work without my code skills?

Question #3: Can anybody show me benchmarks for quering on large amount of datas for SQL Azure and Table Service?

Question #4: May be you could suggest a better solution for my case.

Short Answer

I haven't seen lots of partitions cause Azure Tables (AZT) problems, but I don't have this volume of data.
The more items in a partition, the slower queries in that partition
Sorry no, I don't have the benchmarks
See below

Long Answer

In your case I suspect that SQL Azure is not going work for you, simply because of the limits on the size of a SQL Azure database. If each of those rows you're inserting are 1K with indexes you will hit the 50GB limit in about 300 days. It's true that Microsoft are talking about databases bigger than 50GB, but they've given no time frames on that. SQL Azure also has a throughput limit that I'm unable to find at this point (I pretty sure it's less than what you need though). You might be able to get around this by partitioning your data across more than one SQL Azure database.

The advantage SQL Azure does have though is the ability to run aggregate queries. In AZT you can't even write a select count(*) from customer without loading each customer.

AZT also has a limit of 500 transactions per second per partition, and a limit of "several thousand" per second per account.

I've found that choosing what to use for your partition key (PK) and row key depends (RK) on how you're going to query the data. If you want to access each of these items individually, simply give each row it's own partition key and a constant row key. This will mean that you have lots of partition.

For the sake of example, if these rows you were inserting were orders and the orders belong to a customer. If it was more common for you to list orders by customer you would have PK = CustomerId, RK = OrderId. This would mean to find orders for a customer you simply have to query on the partition key. To get a specific order you'd need to know the CustomerId and the OrderId. The more orders a customer had, the slower finding any particular order would be.

If you just needed to access orders just by OrderId, then you would use PK = OrderId, RK = string.Empty and put the CustomerId in another property. While you can still write a query that brings back all orders for a customer, because AZT doesn't support indexes other than on PartitionKey and RowKey if your query doesn't use a PartitionKey (and sometimes even if it does depending on how you write them) will cause a table scan. With the number of records you're talking about that would be very bad.

In all of the scenarios I've encountered, having lots of partitions doesn't seem to worry AZT too much.

Another way you can partition your data in AZT that is not often mentioned is to put the data in different tables. For example, you might want to create one table for each day. If you want to run a query for last week, run the same query against the 7 different tables. If you're prepared to do a bit of work on the client end you can even run them in parallel.

Azure SQL can easily ingest that much data an more. Here's a video I recorded months ago that show a sample (available on GitHub) that shows one way you can do that.

https://www.youtube.com/watch?v=vVrqa0H_rQA

here's the full repo

https://github.com/Azure-Samples/streaming-at-scale/tree/master/eventhubs-streamanalytics-azuresql

继续阅读：azure-sql-database azure-storage

Performance - Table Service, SQL Azure - insert. Query speed on large amount of data

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？