Which is faster: Working from memory or lots of queries?
You need to pull a large amount of data (thousands of entries) from a local SQL server to be used in generating a report. You have two options:
1) Use a single query to get the data from the server, load it into a .NET DataTable, and use a series of loops to get the information you need
2) Perform an SQL query each time you need a record
I know there are several factors affecting speed, but with only the information given, which do you think would be faster?
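For concreteness, here is roughly what the two options look like in classic ADO.NET. This is a sketch only; the connection string, the Orders table, and its columns are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

class ReportData
{
    // Option 1: one query, then loop over the DataTable in memory.
    static DataTable LoadAllOrders(string connectionString)
    {
        var orders = new DataTable();
        using (var conn = new SqlConnection(connectionString))
        using (var adapter = new SqlDataAdapter(
            "SELECT Id, CustomerId, Total FROM Orders", conn))
        {
            adapter.Fill(orders); // single round trip to the server
        }
        return orders;
    }

    // Option 2: one round trip per record you need.
    static void LoadOneAtATime(string connectionString, IEnumerable<int> ids)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            foreach (int id in ids)
            {
                using (var cmd = new SqlCommand(
                    "SELECT CustomerId, Total FROM Orders WHERE Id = @id", conn))
                {
                    cmd.Parameters.AddWithValue("@id", id);
                    using (var reader = cmd.ExecuteReader())
                    {
                        if (reader.Read()) { /* use the row */ }
                    }
                }
            }
        }
    }
}
```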
Without any more information, I would say option 1. You can perform LOTS of CPU operations in the time it takes to make a round trip to the database and download the result set. Also, I assume the C# logic would be easier to code than the T-SQL logic. However, if you are talking about set-based queries, then maybe the SQL would be easier to write. In the end, I would do whatever is easier to write, then change it later if it turns out to be too slow.
BTW, how many queries are we talking about here? 5, 10, 100? If it is 5 queries, the time won't be bad, but if you are doing a query for each record, and you can have lots of records, you never want to do that.
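If in doubt, it's easy enough to time both approaches against your real data before committing to either. A minimal sketch using Stopwatch, where LoadAllOrders and LoadOneAtATime stand in for your two implementations (such as the hypothetical ones sketched above), and connectionString and neededIds are whatever you actually use:

```csharp
using System;
using System.Diagnostics;

var sw = Stopwatch.StartNew();
LoadAllOrders(connectionString);             // option 1
Console.WriteLine("Single query + loops: {0} ms", sw.ElapsedMilliseconds);

sw.Restart();
LoadOneAtATime(connectionString, neededIds); // option 2
Console.WriteLine("Query per record:     {0} ms", sw.ElapsedMilliseconds);
```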
1 - Batching requests is almost always better. The network and connection overhead of opening a thousand separate queries will add up.
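If you need a specific set of records rather than the whole table, you can still batch them into a single round trip. One common approach is a parameterized IN list; this is a sketch with an invented schema, and note that SQL Server caps a command at roughly 2,100 parameters, so very large id sets need chunking or a table-valued parameter:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

static SqlCommand BuildBatchedSelect(SqlConnection conn, IList<int> ids)
{
    // Build @p0, @p1, ... -- one parameter per id, all in one query.
    string[] names = ids.Select((_, i) => "@p" + i).ToArray();
    var cmd = new SqlCommand(
        "SELECT Id, CustomerId, Total FROM Orders WHERE Id IN (" +
        string.Join(", ", names) + ")", conn);
    for (int i = 0; i < ids.Count; i++)
        cmd.Parameters.AddWithValue(names[i], ids[i]);
    return cmd;
}
```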
If there are only a few thousand entries, scanning the table linearly will probably not be slower than the overhead of getting the data from the server. However, that approach scales badly: as the number of rows grows, your solution will quickly become very slow, whereas the solution using a query is more 'predictable'. In general, let the database do what it is meant to do: process data.
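Concretely, "let the database process data" often means pushing the aggregation into the query itself instead of looping over raw rows in C#. For example, if the report needs per-customer totals (same made-up schema as before):

```csharp
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "SELECT CustomerId, SUM(Total) AS GrandTotal " +
    "FROM Orders GROUP BY CustomerId", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            int customerId = reader.GetInt32(0);
            decimal grandTotal = reader.GetDecimal(1);
            // one row per customer, already aggregated by the server
        }
    }
}
```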
I would expect option 1 to be faster, provided there's little enough data that it fits in RAM, and provided the 'series of loops' over whatever your data structures are is no less efficient than whatever SQL Server might achieve using its indexes.
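One caveat on those loops: a linear scan of the DataTable per lookup is O(n) every time, so if you go with option 1 it is worth indexing the rows once, e.g. into a Dictionary. A sketch, reusing the hypothetical orders table from above:

```csharp
using System.Collections.Generic;
using System.Data;

// Index the rows once by primary key...
var byId = new Dictionary<int, DataRow>();
foreach (DataRow row in orders.Rows)
    byId[(int)row["Id"]] = row;

// ...then lookups are O(1) instead of a scan per record.
DataRow match;
if (byId.TryGetValue(someId, out match))
{
    // use match
}
```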
I think other things factor into this decision, such as:
- Do you have caching set up on SQL?
- Is SQL on the same server or a remote machine? (Do you have to think about network overhead/speed issues?)
- Is memory an issue, or do you have free rein on the server? (Does the web server have enough memory to hold lots o' data in a data table?)
- How many people/programs are accessing SQL? (More people might mean slower response times.)
- Is it a mostly read or mostly write database? (Writes tend to lock tables, so you may have to wait to get your data; a one-time bulk read would be faster here than several smaller reads.)
For example, cached data on a local SQL server will probably be just as fast as loading everything into a data table. However, data on a remote machine (cached or uncached) will be orders of magnitude slower given the connection overhead, network latency, and having to read the records from the hard drive.
I was dealing with a similar problem. My solution was to normalize the data for each report into its own table. The bad news is that you need a background process keeping that data up to date, but with this approach I don't need to process a lot of information in memory, and the database connection isn't hit nearly as often.
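A rough sketch of that background-update idea, assuming a pre-computed ReportOrderTotals table that the report then reads directly; all names and the refresh interval here are invented:

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

static void StartReportRefresher(string connectionString)
{
    // Rebuild the summary table every 15 minutes in the background.
    var timer = new Timer(_ =>
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            @"TRUNCATE TABLE ReportOrderTotals;
              INSERT INTO ReportOrderTotals (CustomerId, GrandTotal)
              SELECT CustomerId, SUM(Total) FROM Orders GROUP BY CustomerId;",
            conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }, null, TimeSpan.Zero, TimeSpan.FromMinutes(15));
    // In real code, keep a reference to 'timer' so it isn't collected.
}
```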