strategies to retrieve resultset from huge database table
It's my belief that many people run into this problem: from a frontend JSP page, a user set up some criteria based on whi开发者_开发技巧ch a SQL is constructed and used to retrieve results from one or more database table(s). The issue is, this table grows by 1 mill per day and becomes gigantic.
I know there is not a definite answer to this question: how can we do to expedite this procedure? Indexing might be one (which I heard a lot but know little about), and another thing on my mind is to use some customized caching solution such as Gigaspace. Will Hibernate help in this case?
Anyone else wanna add their 2 cents?
Thanks a lot! John
Well erm yes you do need to index your database!
If you're not even indexing your database, then you probably need to start by reading a little bit about how to appropriately index your database.
Then, it shouldn't matter per se about the number of millions of rows in your database tables: the very raison d'etre of a decent database system is to cope with tables with millions of rows. But you will want to make sure is that the specification of which rows are actually retrieved from those millions is sensible and that the queries in question can go off appropriate indexes (e.g. because of parameters entered by the user to narrow them down). "Adding an index" isn't a magical panacea necessarily: you need to make sure that you have indexes added that are appropriate to how your queries end up looking by the time they hit the database.
I wouldn't personally go down the road of adding spurious caching and other layers of complexity until (a) you've actually ascertained that in practice that you need them and (b) you can actually ascertain that the layers you are adding will solve the problem that you want them to solve. If you haven't got round to indexing your database yet, then I would really start by just building a simple, appropriately optimised solution and take it from there.
An index is a must with that amount of data or even a faction of that! As far as any other answers goes it really depends on what you plan on doing with that amount of data because one strategy isn't going to cover all use-cases.
精彩评论