Finding all rows in a database where a field is different from another field
I have the following SQL query:
SELECT * FROM table WHERE field_1 <> field_2
Which is the best index structure to use, to keep this query efficient: two indexes on field_1 and field_2 or a single index which includes both fields?
EDIT: The database is MySQL
If you have an enormous table, it is better to denormalize it: store the result of field_1 <> field_2 in a separate column, and update it on every insert/update of the corresponding row.
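One way to sketch this in MySQL is with a stored generated column, which is a variation on the idea: the server recomputes the value on every insert/update instead of the application doing it. This assumes MySQL 5.7.6 or later; the names my_table, fields_differ and idx_fields_differ are made up for illustration.

```sql
-- Assumption: MySQL 5.7.6+ (generated columns); my_table is a hypothetical table name.
-- The column holds 1 when the fields differ, 0 when they are equal,
-- and NULL when either field is NULL (same semantics as the original predicate).
ALTER TABLE my_table
  ADD COLUMN fields_differ TINYINT(1)
    GENERATED ALWAYS AS (field_1 <> field_2) STORED,
  ADD INDEX idx_fields_differ (fields_differ);

-- Equivalent to: SELECT * FROM my_table WHERE field_1 <> field_2
SELECT * FROM my_table WHERE fields_differ = 1;
```

The index on the flag is most likely to pay off when only a small share of rows differ; if most rows differ, the optimizer may still prefer a full scan.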
I imagine this may depend on which platform you are using, but on MS SQL Server definitely one index!
Indexes are not going to help you.
The database must do a table scan, as it is comparing two fields in the same row.
It depends on your database engine, but generally it's best to assume that a query will only use one index per table. This would imply that a single index across both columns is likely to be best.
However, the only way to find out is to populate a table with dummy data and try it out. Make sure that the dummy data is representative in terms of how it is distributed as, for example, if 99% of field2 values are identical to each other then it may reduce the value of having an index.
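A minimal way to try this out in MySQL, once the table holds representative data, is to create the candidate index and look at the execution plan. The table name my_table and the index name are placeholders, not something from the question.

```sql
-- Hypothetical names: my_table, idx_field_1_field_2.
CREATE INDEX idx_field_1_field_2 ON my_table (field_1, field_2);

-- Inspect the "type", "key" and "rows" columns of the plan:
-- type = ALL means a full table scan, and key shows which index (if any) was chosen.
EXPLAIN SELECT * FROM my_table WHERE field_1 <> field_2;

-- Selecting only the indexed columns may let the optimizer read the index
-- alone instead of the table rows; compare the two plans.
EXPLAIN SELECT field_1, field_2 FROM my_table WHERE field_1 <> field_2;
```

Repeat the same EXPLAIN with each index layout you are considering and compare the plans and the actual run times.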
To be sure, I'd try all three options, but remember that you are writing to each index with every insert/update, so indexing both fields will have to be more beneficial by a margin to compensate for the negative effect on write performance. Remember, it doesn't have to be perfect; it just has to be good enough to handle the system throughput without creating unacceptable UI latencies.
What I'd try first is a single index on the field that has the most distinct values, i.e. if field_1 has 1000 different values in it and field_2 only has 20, then put the index on field_1.
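To see which field has more distinct values, something like the following should do (my_table is a placeholder for the real table name):

```sql
-- Hypothetical table name my_table.
-- Compare the number of distinct values in each field against the row count.
SELECT COUNT(DISTINCT field_1) AS distinct_field_1,
       COUNT(DISTINCT field_2) AS distinct_field_2,
       COUNT(*)                AS total_rows
FROM my_table;
```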
Here's a nice article about indexes and inequality matches:
http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/
Alternatively, if your data is vast, you might consider using a trigger to mark another column with a bit indicating whether the columns match, and then search on that column. It all depends on your situation, of course.
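A rough sketch of the trigger approach in MySQL follows. The table name my_table, the column is_mismatch, and the trigger and index names are all invented for the example; adjust them to your schema.

```sql
-- Hypothetical names throughout: my_table, is_mismatch, idx_is_mismatch.
ALTER TABLE my_table
  ADD COLUMN is_mismatch TINYINT(1) NULL,
  ADD INDEX idx_is_mismatch (is_mismatch);

-- Backfill the flag once for rows that already exist.
UPDATE my_table SET is_mismatch = (field_1 <> field_2);

-- Keep the flag current on every insert and update.
CREATE TRIGGER my_table_flag_bi BEFORE INSERT ON my_table
FOR EACH ROW SET NEW.is_mismatch = (NEW.field_1 <> NEW.field_2);

CREATE TRIGGER my_table_flag_bu BEFORE UPDATE ON my_table
FOR EACH ROW SET NEW.is_mismatch = (NEW.field_1 <> NEW.field_2);

-- Search on the flag instead of comparing the two fields.
SELECT * FROM my_table WHERE is_mismatch = 1;
```

Note that the flag is NULL when either field is NULL, which matches what field_1 <> field_2 would return in the original query.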