Complexity of Views - Comprehensive or Building-Blocks?
When creating a number of views against a database, is it better to have a handful of comprehensive views or a larger number of 'building开发者_如何学Go blocks' that can be joined together to achieve the desired result - or is a mixture of the two not necessarily a bad thing?
As a SQL Server developer, I've been bitten by the complexity that large views (many tables) can impose on the query optimizer - which is of course true of any query that involves a large number of tables.
Are there any rules-of-thumb that help determine whether a particular view is going to be viable?
SQL development in general does not ply well to the software development paradigm: it is not reusable, does not lead to DRY definitions and is hard to maintain. But the most important thing is that most techniques that improve this status-quo and lead to better quality code result is runtime problems. And a SQL run time problem is nothing like a suboptimal code construct in code, it results in bad plans that give results in tens and hundreds of times slower than an optimal plan. In other words, when a DRY query definition base don reasonable blocks results in a table scan plan that runs 10 seconds and an ugly single-use view has a better plan that runs in 10 ms, you forget everything about DRY and go with the ugly but fast view. The differences in run time between a good plan and a bad plan are just too big.
This is why with SQL development a good projects ends up with a few well tuned queries that are constantly measured and checked for performance. I'm sad to say but in my experience the more 'healthy' the SQL code was from a classic code pov (DRY, reusable, maintainable) the more problems it had in real world production when faced with large data size. I really wish there was an easy way to deploy reusable SQL blocks that could be assembled into complex structures. It just doesn't work that way. I know enough about how SQL query optimization works to understand that the query optimizers look at the resulted complex block as a whole and they cannot leverage the internal blocks as 'units of work', they are tasked to optimize the final, end result. And optimizing such complex queries, considering data access paths, IO costs, size of data, column values distribution probabilities is just very very very complex, orders of magnitude more complex than the task, say, a C# optimizer is asked to do.
My advice would be: keep few complex views that are tested and tuned. Freedom to compose basic building block will quickly be abused and you'll discover it too late.
I just delt with this issue this week. Here is my experience.
Each view should be specific to the task at hand. We were trying to build a view on top of a view (etc) and ran into performance issues very quickly (in production of course). Switching to a single query that returned the specific data instead of building the data from building block views brought the performance back inline with expectations.
The upside to spceific views/queries is that they are specific to the task at hand. They can run faster because they are only doing the work that is necessary.
The downside to specific views is that similar logic must be maintained in each view.
In our case there was no choice. Performance went off the cliff when using building block views.
The extent to which SQL allows you to compose views, or any database objects for that matter, is limited. If you have many 'building block' views, when composing another view from those building blocks, the query optimizer may end up joining the same table multiple times. Also, tables which are not required may also be joined. So while the the building block concept may provide a degree of reusability, it should be utilized with caution. Instead, focus on designing a view for a specific requirement, then you can consider it factoring it into building blocks.
精彩评论