sqlite Indexing Performance Advice
I have an sqlite database in my iPhone app that I access via the Core Data framework. I'm using NSPredicates to query the database.
I am building a search function that needs to search six different varchar fields that contain text. At the moment, it's very slow and I need to improve performance, probably in the sqlite database. Would it be best to create an index on all those columns? Or would it be better to build a custom index table that expands those six columns into multiple rows, each containing a word and the ID it matches? Any other suggestions?
There are things you can do to improve the performance of searching for text in sqlite databases. Although Core Data abstracts you away from the underlying store, it helps to have an appreciation of what is going on when your store is backed by sqlite.
Assuming you're doing a substring search of these fields, Apple recommends using derived properties. This amounts to maintaining a normalised version of your property in your model that is used only for searching. The derived property should be stored in a form that can be indexed. You then express your search in terms of this derived property using binary comparison operators such as >= and <. Note that a range comparison like this matches on a prefix of the value rather than on an arbitrary substring.
I found doing this reduced our search from around 1 second to under 100ms.
To make things clear I would suggest looking at the ADC example http://developer.apple.com/mac/library/samplecode/DerivedProperty/
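As a rough illustration of the derived-property idea outside Core Data, here is a minimal Python sketch using the standard sqlite3 module. The table person, the column normalized_name and the helpers normalize, prefix_bounds and search_prefix are all made up for this example, and the normalisation used (case folding plus stripping diacritics) is an assumption about what "normalised" means here, not a copy of Apple's sample. It shows how a prefix search can be written as a range comparison on an indexed column.

```python
import sqlite3
import unicodedata


def normalize(text):
    """Case-fold and strip diacritics so values compare predictably."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def prefix_bounds(prefix):
    """Return (low, high) such that low <= value < high matches the prefix."""
    low = normalize(prefix)
    if not low:
        raise ValueError("prefix must not be empty")
    # Bump the last character by one code point to get an exclusive upper bound.
    # Assumption: the last character is not at the very end of a code point range.
    high = low[:-1] + chr(ord(low[-1]) + 1)
    return low, high


def search_prefix(conn, prefix):
    """Find names whose normalised form starts with the normalised prefix."""
    low, high = prefix_bounds(prefix)
    rows = conn.execute(
        "SELECT name FROM person "
        "WHERE normalized_name >= ? AND normalized_name < ? "
        "ORDER BY normalized_name",
        (low, high),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, normalized_name TEXT)"
)
# The index is on the derived column, not on the display column.
conn.execute("CREATE INDEX person_normalized_name ON person (normalized_name)")

names = ["Émile Zola", "Emily Brontë", "Emma Goldman", "Zoë Heller"]
conn.executemany(
    "INSERT INTO person (name, normalized_name) VALUES (?, ?)",
    [(name, normalize(name)) for name in names],
)

print(search_prefix(conn, "EMI"))
```

The cost is that the derived column has to be kept in sync whenever the original value changes. In Core Data that would typically be done in the setter of the source property.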
From the Core Data Programming Guide:
How you use predicates can significantly affect the performance of your application. If a fetch request requires a compound predicate, you can make the fetch more efficient by ensuring that the most restrictive predicate is the first, especially if the predicate involves text matching (contains, endsWith, like, and matches) since correct Unicode searching is slow. If the predicate combines textual and non-textual comparisons, then it is likely to be more efficient to specify the non-textual predicates first, for example (salary > 5000000) AND (lastName LIKE 'Quincey') is better than (lastName LIKE 'Quincey') AND (salary > 5000000).
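To see why the ordering in that quote can matter, here is a small Python sketch that models left-to-right, short-circuit evaluation of a compound predicate. It is only an analogy: it does not show what Core Data or sqlite do internally, and the sample data, CountingMatcher, text_first and number_first are invented for the example. It counts how often the expensive text match actually runs.

```python
import re

EMPLOYEES = [
    {"lastName": "Quincey", "salary": 6_000_000},
    {"lastName": "Quincey", "salary": 40_000},
    {"lastName": "Smith", "salary": 7_000_000},
    {"lastName": "Jones", "salary": 30_000},
]


class CountingMatcher:
    """Stand-in for a slow text comparison that records how often it runs."""

    def __init__(self, pattern):
        self.regex = re.compile(pattern, re.IGNORECASE)
        self.calls = 0

    def __call__(self, text):
        self.calls += 1
        return self.regex.fullmatch(text) is not None


def text_first(rows, matches):
    # The text test runs for every row before the cheap numeric test.
    return [r for r in rows if matches(r["lastName"]) and r["salary"] > 5_000_000]


def number_first(rows, matches):
    # The numeric test filters rows first, so the text test runs less often.
    return [r for r in rows if r["salary"] > 5_000_000 and matches(r["lastName"])]


slow = CountingMatcher("quincey")
fast = CountingMatcher("quincey")
slow_result = text_first(EMPLOYEES, slow)
fast_result = number_first(EMPLOYEES, fast)

print(slow.calls, fast.calls)  # prints "4 2"
```

Both orderings return the same rows. Only the number of text comparisons changes.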
If there is a way to reorder your query so that the simplest logic is on the left and the most complex on the right, that can help your search performance. As Lyon suggests, searching Unicode text is extremely expensive, so Apple recommends searching against derived values that strip Unicode characters and common words like "a", "and", and "the".
I assume these columns store text. The question is how much text, and how often this model is accessed. If it is a large amount of text, I would create additional properties that hold the text with common words and Unicode characters stripped out. The only downside is that you end up with extra properties to maintain. You can then index those columns to improve performance.
If what you want is essentially full text indexing of your sqlite db, then you may want to use sqlite's fts3 module, since that's exactly what it provides: http://www.sqlite.org/cvstrac/wiki?p=FtsUsage http://dotnetperls.com/sqlite-fts3
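For a feel of what fts3 gives you, here is a minimal sketch in Python using the standard sqlite3 module. It assumes the sqlite library you link against was compiled with fts3 enabled, which is not guaranteed on every platform. The table notes, its columns, the sample rows and the helpers build_index and search are made up for the example.

```python
import sqlite3

DOCS = [
    (1, "SQLite tips", "Create an index on columns you filter by"),
    (2, "Core Data", "Derived properties make text search faster"),
    (3, "Full text", "The fts3 module builds a word index for you"),
]


def build_index(conn, docs):
    """Create an fts3 virtual table and load (docid, title, body) rows into it."""
    conn.execute("CREATE VIRTUAL TABLE notes USING fts3(title, body)")
    conn.executemany(
        "INSERT INTO notes (docid, title, body) VALUES (?, ?, ?)", docs
    )


def search(conn, query):
    """Return the docids of rows where any column matches the full-text query."""
    rows = conn.execute(
        "SELECT docid FROM notes WHERE notes MATCH ? ORDER BY docid", (query,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_index(conn, DOCS)

print(search(conn, "index"))        # whole-word match
print(search(conn, "text search"))  # both words must appear
print(search(conn, "deriv*"))       # prefix match
```

Matching is by whole token with the default tokenizer, so "index" does not match "indexing" unless you use a prefix query such as "index*". Whether Core Data can be pointed at such a table is a separate question; this only shows the sqlite side.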