Data caching techniques / Tips / AppFabric
We have millions and millions of records in a SQL table, and we run really complex analytics on that data to generate reports.
As the table is growing and additional records are being added, the computation time is increasing and the user has to wait a long time before the webpage loads.
We were thinking of using a distributed cache like AppFabric to load the data into memory when the application starts, and then running our reports off that in-memory data. This should improve the response time somewhat, since the data would be read from memory rather than from disk.
Before we take the plunge and implement this, I wanted to find out what others are doing and what the best techniques and practices are for loading data into memory, caching, and so on. Surely you don't just load an entire table with hundreds of millions of records into memory?
I was also looking into OLAP / data warehousing, which might give us better performance than caching would.
The solution to complex reporting is to pre-calculate, so you're on the right path if you're looking at OLAP.
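To make "pre-calculate" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the `daily_sales_summary` name are all made up for illustration; your schema and database engine will differ. The idea is only that a scheduled job rebuilds a small summary table, and the report page reads that instead of scanning the detail rows.

```python
import sqlite3


def build_daily_summary(conn):
    """Recompute a small summary table from the large detail table.

    Hypothetical schema: sales(sale_date, region, amount).
    In practice this would run on a schedule, not per page request.
    """
    conn.execute("DROP TABLE IF EXISTS daily_sales_summary")
    conn.execute(
        """
        CREATE TABLE daily_sales_summary AS
        SELECT sale_date,
               region,
               COUNT(*)    AS order_count,
               SUM(amount) AS total_amount
        FROM sales
        GROUP BY sale_date, region
        """
    )
    conn.commit()


def report_for_region(conn, region):
    """Reports read the pre-aggregated rows, not the detail table."""
    return conn.execute(
        "SELECT sale_date, order_count, total_amount "
        "FROM daily_sales_summary WHERE region = ? ORDER BY sale_date",
        (region,),
    ).fetchall()


def demo():
    """Build a tiny in-memory example and run one report against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2024-01-01", "east", 10.0),
            ("2024-01-01", "east", 5.0),
            ("2024-01-01", "west", 7.5),
            ("2024-01-02", "east", 2.5),
        ],
    )
    build_daily_summary(conn)
    return report_for_region(conn, "east")
```

An OLAP cube does much the same thing at a larger scale: the aggregation cost is paid once when the data is processed, not every time a user opens the report.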
Have you considered partitioning your database? We do this for our largest databases.
Having said that, using the AppFabric cache correctly will greatly increase performance for most applications that are I/O-heavy.
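One common way to use a cache "correctly" is the cache-aside pattern: cache the finished report result under a key, not the raw table. The sketch below is plain Python and is not the AppFabric API; a dict stands in for the distributed cache, and `ReportCache`, `get_or_compute`, and the TTL value are names and choices I made up for illustration.

```python
import time


class ReportCache:
    """Cache-aside with a time-to-live. A dict stands in for the real cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, or compute and store it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit, still fresh
        value = compute()        # cache miss or expired: do the slow work once
        self._entries[key] = (now + self._ttl, value)
        return value
```

Usage would look something like `cache.get_or_compute("monthly-sales:2024-01", run_expensive_query)`, where `run_expensive_query` is whatever function currently builds the report. The first caller pays for the query; later callers within the TTL get the stored result.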
We have millions and millions of records in a SQL table,
Bad policy. Flat files are better.
and we run really complex analytics on that data to generate reports.
In some cases, you'd be happier loading relevant subsets into SQL.
As the table is growing and additional records are being added, the computation time is increasing
That's the consequence of using a database for too much. Use it for less.
We were thinking of using a distributed cache like AppFabric...
Perhaps. Flat files, however, are faster and more scalable than an RDBMS.
was also looking into OLAP / Data warehousing
Good plan. Buy Kimball's book immediately. You don't need more technology. You only need to make better use of flat files as primary storage, with SQL as a place for users to run ad-hoc queries against subsets.
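A rough sketch of what "flat files as primary, SQL for subsets" could look like, using Python's csv and sqlite3 modules. The column names (`sale_date`, `region`, `amount`), the `sales_subset` table, and the filter by year prefix are all assumptions for illustration, not anything from the question.

```python
import csv
import sqlite3


def load_subset(lines, conn, year):
    """Stream a CSV flat file and load only one year's rows into SQL.

    `lines` is any iterable of text lines (an open file, for example).
    Assumes a header row with sale_date, region, amount (hypothetical).
    Returns the number of rows now in the subset table.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_subset "
        "(sale_date TEXT, region TEXT, amount REAL)"
    )
    reader = csv.DictReader(lines)
    # Generator: rows are filtered as they stream, never all held in memory.
    rows = (
        (r["sale_date"], r["region"], float(r["amount"]))
        for r in reader
        if r["sale_date"].startswith(year)
    )
    conn.executemany("INSERT INTO sales_subset VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM sales_subset").fetchone()[0]
```

With a real file this would be called as `load_subset(open("sales.csv", newline=""), conn, "2024")`, where the file name is again made up. Users then run their ad-hoc queries against `sales_subset` while the full history stays in the flat file.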