Are there any toolsets/techniques to determine if SQL queries are functionally identical?
From a client I receive SQL queries that contain all sorts of redundant sub-SELECTs, generated by a builder in MS Access. I convert the SQL to Postgres, and it runs - but I know it's hopelessly inefficient. So, I set about optimising the query, and produce iterative improvements - and to determine whether each new query is functionally identical, I run it and ensure it produces the same resultset.
This, of course, is not 100% effective at determining whether one query is the same as another - it relies on the data exploring every different case. A colleague suggests I could run an explain plan over each query, but concedes that the internal db optimiser may employ a different query strategy even though the newer query is functionally identical (and this is a good thing - the new strategy may be much more efficient).
I have at least intermediate SQL skills, but am not a confident query optimiser. So, I'd be interested in any toolset for Postgres that can say with 100% certainty that two queries do the same thing. Even better if it can suggest optimisati开发者_JS百科ons itself! I believe that TOAD can do this, but I don't believe it is available for Postgres. I'm on Mac OS X.
Great question. Tall order.
As you know there's the level of being logically identical. Then there is the level of generating identical plans as far as which table accesses are required and in what order.
Lastly those underlying plans are compared to the distribution of data in the table to determine whether or not to use a given index on a table, and the use of cached data pages comes into play as well. (30,000 foot description, I know)
If your question is limited to 'perform identically on the same dataset', then comparing EXPLAIN plans would be sufficient, and would encompass both logical identicality and some level of I/O identicality.
You could try using TOAD on MySQL to analyze the query. Once you improve the query, port it over to PostgreSQL.
Assuming your are improving a PostgreSQL query that is of course already written for PostgreSQL use pgTap (Unit Testing for PostgreSQL), no question about it. You can test for how long a query runs and the results it returns among other things. http://pgtap.org/
Upgrade to PostgreSQL 9.0 or later.
From 9.0 release notes:
- Multiple performance enhancements for specific types of queries, including elimination of unnecessary joins. This helps optimize some automatically-generated queries, such as those produced by object-relational mappers (ORMs).
I would say no. However queries from access should not perform worse in postgres given proper indexes.
You might percieve the queries as slower as msaccess normally returns partial results and only on browsing the dataset continues to execute the query where postgres would normally run the whole query (depending of course on the client you use).
精彩评论