Is this a bug I should submit to Apple, or is this expected behavior?

2023-02-13 02:42 Q&A author:

When using CoreData, the following multi-column index predicate is very slow - it takes almost 2 seconds for 26,000 records.

Please note that both columns are indexed, and I am purposely doing the query with >= and <, instead of beginswith, to make it fast:

NSPredicate *predicate = [NSPredicate predicateWithFormat:
  @"(airportNameUppercase >= %@ AND airportNameUppercase < %@) \
        OR (cityUppercase >= %@ AND cityUppercase < %@)",
    upperText, upperTextIncremented,
    upperText, upperTextIncremented];
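
The two-sided comparison in this predicate is equivalent to a prefix match when strings are compared byte by byte. A minimal sketch in plain C (which also compiles inside an Objective-C file) may make that concrete. The function names `prefix_upper_bound` and `in_prefix_range` and the 256-byte buffer are assumptions made for illustration; they are not part of Core Data or of the code in this question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Writes the exclusive upper bound for a prefix search into `out`:
 * the prefix with its last byte incremented ("LON" -> "LOO").
 * Returns false if the prefix is empty, does not fit in `out`,
 * or ends in byte 0xFF (which cannot be incremented). */
bool prefix_upper_bound(const char *prefix, char *out, size_t out_size)
{
    assert(prefix != NULL && out != NULL);
    size_t len = strlen(prefix);
    if (len == 0 || len + 1 > out_size) return false;
    if ((unsigned char)prefix[len - 1] == 0xFF) return false;
    memcpy(out, prefix, len + 1);
    out[len - 1] = (char)((unsigned char)out[len - 1] + 1);
    return true;
}

/* True when `s` starts with `prefix`, decided only by two ordered
 * comparisons: prefix <= s and s < upper bound. */
bool in_prefix_range(const char *s, const char *prefix)
{
    char upper[256]; /* illustrative limit on prefix length */
    if (!prefix_upper_bound(prefix, upper, sizeof upper)) return false;
    return strcmp(s, prefix) >= 0 && strcmp(s, upper) < 0;
}
```

With the prefix "LON", the bound becomes "LOO", so "LONDON CITY" falls inside the range while "LOMBARD" and "LOO" fall outside it. Because the test is just two ordered comparisons, a sorted index can answer it as a range lookup.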

However, if I run two separate fetchRequests, one for each column, and then I merge the results, then each fetchRequest takes just 1-2 hundredths of a second, and merging the lists (which are sorted) takes about 1/10th of a second.

Is this a bug in how CoreData handles multiple indices, or is this expected behavior? The following is my full, optimized code, which works very fast:

NSFetchRequest *fetchRequest = [[[NSFetchRequest alloc] init]autorelease];
[fetchRequest setFetchBatchSize:15]; 

// looking up a list of Airports
NSEntityDescription *entity = [NSEntityDescription entityForName:@"Airport" 
                                          inManagedObjectContext:context];
[fetchRequest setEntity:entity];    

// sort by uppercase name
NSSortDescriptor *nameSortDescriptor = [[[NSSortDescriptor alloc] 
           initWithKey:@"airportNameUppercase" 
             ascending:YES 
              selector:@selector(compare:)] autorelease];
NSArray *sortDescriptors = [[[NSArray alloc] initWithObjects:nameSortDescriptor, nil]autorelease];
[fetchRequest setSortDescriptors:sortDescriptors];

// use >= and < to do a prefix search that ignores locale and unicode,
// because it's very fast (assumes text is not empty)
NSString *upperText = [text uppercaseString];
// index into upperText itself: uppercasing is not guaranteed to preserve length
NSUInteger lastIndex = [upperText length] - 1;
unichar c = [upperText characterAtIndex:lastIndex];
c++;
NSString *modName = [[upperText substringToIndex:lastIndex]
                         stringByAppendingString:[NSString stringWithCharacters:&c length:1]];

// for the first fetch, we look up names and codes
// we'll merge these results with the next fetch for city name
// because looking up by name and city at the same time is slow
NSPredicate *predicate = [NSPredicate predicateWithFormat:
   @"(airportNameUppercase >= %@ AND airportNameUppercase < %@) \
                        OR iata == %@ \
                        OR icao == %@",
     upperText, modName,
     upperText,
     upperText];
[fetchRequest setPredicate:predicate];

NSArray *nameArray = [context executeFetchRequest:fetchRequest error:nil];

// now that we looked up all airports with names beginning with the prefix
// look up airports with cities beginning with the prefix, so we can merge the lists
predicate = [NSPredicate predicateWithFormat:
  @"cityUppercase >= %@ AND cityUppercase < %@",
    upperText, modName];
[fetchRequest setPredicate:predicate];
NSArray *cityArray = [context executeFetchRequest:fetchRequest error:nil];

// now we merge the arrays
NSMutableArray *combinedArray = [NSMutableArray arrayWithCapacity:[cityArray count]+[nameArray count]];
NSUInteger cityIndex = 0;
NSUInteger nameIndex = 0;
while (   cityIndex < [cityArray count]
       || nameIndex < [nameArray count]) {

  if (cityIndex >= [cityArray count]) {
    [combinedArray addObject:[nameArray objectAtIndex:nameIndex]];
    nameIndex++;
  } else if (nameIndex >= [nameArray count]) {
    [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
    cityIndex++;
  } else {
    // compare the string contents with compare:, the same selector the fetch sorts by;
    // < and > on NSString pointers would only compare memory addresses
    NSComparisonResult order =
        [[[cityArray objectAtIndex:cityIndex] airportNameUppercase]
            compare:[[nameArray objectAtIndex:nameIndex] airportNameUppercase]];
    if (order == NSOrderedSame) {
      // same name at the head of both lists: keep one copy
      [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
      cityIndex++;
      nameIndex++;
    } else if (order == NSOrderedAscending) {
      [combinedArray addObject:[cityArray objectAtIndex:cityIndex]];
      cityIndex++;
    } else {
      [combinedArray addObject:[nameArray objectAtIndex:nameIndex]];
      nameIndex++;
    }
  }

}

self.airportList = combinedArray;

CoreData has no affordance for the creation or use of multi-column indices. This means that when you execute the query corresponding to your multi-property predicate, CoreData can only use one index to make the selection. Subsequently it uses the index for one of the property tests, but then SQLite can't use an index to gather matches for the second property, and therefore has to do it all in memory instead of using its on-disk index structure.

That second phase of the select ends up being slow because it has to gather all the results into memory from the disk, then make the comparison and drop results in-memory. So you end up doing potentially more I/O than if you could use a multi-column index.

This is why, if you will be disqualifying a lot of potential results in each column of your predicate, you'll see much faster results by doing what you're doing and making two separate fetches and merging in-memory than you would if you made one fetch.
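
As a rough illustration of that argument, here is a toy cost model in plain C (which also compiles inside an Objective-C file). It is not how SQLite or Core Data is implemented. The `Airport` struct, `scan_matches`, `range_matches`, `lower_bound`, and the counters are hypothetical names introduced only to contrast examining every row with answering each column from its own sorted index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row type for illustration only. */
typedef struct {
    const char *name;
    const char *city;
} Airport;

/* Strategy 1: the OR across two columns is evaluated row by row,
 * so every row is examined regardless of how few match. */
size_t scan_matches(const Airport *rows, size_t n,
                    const char *lo, const char *hi, size_t *examined)
{
    assert(rows != NULL && examined != NULL);
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        (*examined)++;
        int by_name = strcmp(rows[i].name, lo) >= 0 && strcmp(rows[i].name, hi) < 0;
        int by_city = strcmp(rows[i].city, lo) >= 0 && strcmp(rows[i].city, hi) < 0;
        if (by_name || by_city) matches++;
    }
    return matches;
}

/* Position of the first key in the sorted array that is >= bound. */
size_t lower_bound(const char *const *keys, size_t n,
                   const char *bound, size_t *probes)
{
    assert(keys != NULL && probes != NULL);
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*probes)++;
        if (strcmp(keys[mid], bound) < 0) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

/* Strategy 2: one column at a time, answered from a sorted index
 * with two binary searches for the range [lo, hi). */
size_t range_matches(const char *const *sorted_keys, size_t n,
                     const char *lo, const char *hi, size_t *probes)
{
    size_t first = lower_bound(sorted_keys, n, lo, probes);
    size_t last = lower_bound(sorted_keys, n, hi, probes);
    return last - first;
}
```

In this model the scan touches all n rows, while each indexed range lookup needs on the order of log n probes plus the matching rows. The two per-column results can overlap, which is why the question's code removes duplicates while merging.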

To answer your question, this behavior isn't unexpected by Apple; it's just an effect of a design decision to not support multi-column indices in CoreData. But you should file a bug at https://feedbackassistant.apple.com/ requesting support for multi-column indices if you'd like to see that feature in the future.

In the meantime, if you really want to get max database performance on iOS, you could consider using SQLite directly instead of CoreData.

When in doubt, you should file a bug.

There isn't currently any API to instruct Core Data to create a compound index. If a compound index were to exist, it would be used without issue.

Non-indexed columns are not processed entirely in memory. They result in a table scan, which isn't the same thing as loading the entire file (well, unless your file only has 1 table). Table scans on strings tend to be very slow.

SQLite itself is limited in the number of indices it will use per query. Basically just 1, give or take some circumstances.

You should use the [n] flag for this query to do a binary search against normalized text. There is a sample project on ADC called 'DerivedProperty'. It will show how to normalize text so you can use binary collations as opposed to the default ICU integration for fancy localized Unicode aware text comparisons.
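
The general idea of normalizing once and comparing bytes afterwards can be sketched in plain C. This is only an ASCII-level illustration under stated assumptions: `make_search_key` and `keys_equal` are hypothetical names, and the sketch does not reproduce what the 'DerivedProperty' sample or Core Data's [n] flag actually do with Unicode text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Builds the stored search key: an ASCII-uppercased copy of `text`.
 * Bytes outside a-z (including UTF-8 multibyte sequences) are copied
 * unchanged. Returns false if `out` is too small for the key. */
bool make_search_key(const char *text, char *out, size_t out_size)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    if (len + 1 > out_size) return false;
    for (size_t i = 0; i < len; i++) {
        char ch = text[i];
        out[i] = (ch >= 'a' && ch <= 'z') ? (char)(ch - 'a' + 'A') : ch;
    }
    out[len] = '\0';
    return true;
}

/* Query-time test: a plain byte comparison of two already-normalized
 * keys, with no locale or case handling left to do. */
bool keys_equal(const char *stored_key, const char *query_key)
{
    assert(stored_key != NULL && query_key != NULL);
    return strcmp(stored_key, query_key) == 0;
}
```

The cost of case folding is paid once when the key is written, which mirrors the question's use of separate airportNameUppercase and cityUppercase attributes.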

There's a much longer discussion about fast string searching in Core Data at https://devforums.apple.com/message/363871

