Improving speed when loading a list from a web service
This is a continuation of a previous question.
The problem is simple. I need to call methods on a REST web service which controls several tables. One table is a snapshot table containing records with huge XML files. Each XML file is basically a backup of another database. This backup XML is then sent to customers, who use the data as read-only information in another product. Basically, the data in the XML consists of lists of companies, products, business rules and more. And no, those customers work offline most of the time, so they cannot get the data live.

Walking through the list of snapshots is tricky:

XMLData.Snapshots.Skip(N).Take(1).First();

but it works well. That was the answer to the previous question.
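Put together, that paging trick looks roughly like the sketch below. This is only an illustration of the pattern, not the actual code: "XMLData" stands in for the generated service context from the question, and "Snapshot"/"Process" are assumed names. Whether FirstOrDefault() is supported by the service's LINQ provider is also an assumption; if not, First() wrapped in a try/catch serves the same purpose.

// Page through the snapshot list one record at a time, so only a
// single huge snapshot is ever held in memory.
int index = 0;
while (true)
{
    // Each iteration issues one request: skip the records already seen,
    // fetch exactly one.
    Snapshot snapshot = XMLData.Snapshots
        .Skip(index)
        .Take(1)
        .FirstOrDefault();   // assumption: provider supports FirstOrDefault

    if (snapshot == null)
        break;               // no records left

    Process(snapshot);       // hypothetical per-record handler
    index++;
}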
But there are three other lists of data that I need to walk through. These are called Changes, Errors and Messages. They contain (1) changes to the data, (2) errors that occurred while modifying the data and (3) generic messages. All of these records are linked to a snapshot record; thus a single snapshot can have multiple changes, errors and messages.
I still cannot access the server code, but since there's a REST service wrapped around an Entity Framework model exposing most of the functionality, I can still use that service. (That service is only accessible internally, on the intranet.) This is basically what I have to work with. And while the lists of changes, errors and messages are relatively small, the snapshots are still huge.
The problem is that I now want to generate a client-side report of the changes, errors and messages, but without grouping them by snapshot! They need to be grouped by date. But each record also needs to show the title of its snapshot, which is causing me incredible headaches...
When walking through e.g. the Changes with a regular foreach loop, I can load the snapshot data by using

XMLData.LoadProperty(Change, "Snapshots");

but since a snapshot record is generally about 300 MB, this slows the whole thing to a crawl. (There are tens of thousands of these records in total!) So I need a faster solution, without having to modify the server code.
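To make the cost concrete, the report loop currently looks roughly like this (a sketch; "Date", "Description" and "Title" are assumed property names, and "report" is a hypothetical collector). The expensive part is that LoadProperty materializes the entire related snapshot entity, XML payload and all, just to read its title:

// Roughly the current approach. Each LoadProperty call transfers the
// full related Snapshot record (~300 MB) even though only the Title
// field is needed for the report.
foreach (var change in XMLData.Changes)
{
    XMLData.LoadProperty(change, "Snapshots");  // downloads the whole snapshot

    report.Add(
        change.Date,
        change.Snapshots.Title,   // the only field actually used
        change.Description);
}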
Any suggestions?
Yeah, okay. Modifying the server would be the proper way, but that's not possible. It's in production, and this list isn't important enough to justify an upgrade of the server. Basically, I'm not allowed to modify any server code for now. (But they still want this list.)
Some additional complexity... The application I'm working on only needs to run once per week or per month. But with the current number of records, I estimate it would take more than two days to finish. The data itself is updated a few times every office day, and snapshots are created on the server every week or so. Errors can be generated whenever users browse the site that maintains the data, but in general there will be about 50 changes and 4 errors per week, plus a few messages when the server goes down and comes back up, or when snapshots are generated.
One option would be to maintain another client-side data file that contains the same data but ordered by date. This could be generated and updated asynchronously in a separate thread, so as not to impact your main application. Whenever a user wants to generate the report, you will need to check that the data file is available for reading, and perhaps take a lock on it.
This should be able to handle tens of thousands of records. But you will have to put in robust code to keep the two data sets in sync, and if you are ever unsure, treat the data from the server as the "golden copy".
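A sketch of how that cache could be filled without ever downloading a full snapshot, assuming the service stack supports $select projections (available in WCF Data Services 1.5 and later) and that projecting a scalar field through the "Snapshots" navigation property works against this service — both are assumptions to verify, as are the "ChangeRow" type and the property names:

// Fetch only the columns the report needs; the huge XML payload is
// never transferred. The small result set is then ordered by date and
// written to the local cache file.
var rows = XMLData.Changes
    .Select(c => new ChangeRow            // hypothetical local DTO
    {
        Date = c.Date,
        SnapshotTitle = c.Snapshots.Title, // projection: title only
        Description = c.Description
    })
    .ToList()                              // executes the remote query
    .OrderBy(r => r.Date)                  // order client-side; data is small
    .ToList();

File.WriteAllLines(cachePath, rows.Select(r =>
    string.Format("{0:O}\t{1}\t{2}", r.Date, r.SnapshotTitle, r.Description)));

If the provider rejects the navigation-property projection, a fallback is to fetch the (id, title) pairs of all snapshots once in a separate query and join them to the changes client-side; the titles are tiny compared to the snapshots themselves.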
Can you create another web service that caches data from the first, or accesses the same data source?