Get String Between Two Other Strings in ObjC

I am trying to parse a string and get another string in the middle.

For example:

Hello world this is a string

I need to find the string between "world" and "is" (that is, "this"). I have looked around but haven't been able to figure it out yet, mainly because I am new to Objective-C. Does anyone have an idea of how to do this, with or without regular expressions?


The regular expression solution that Jacques gives works, and the caveat that it requires iOS 4.0 or later is true. Regular expressions are also comparatively slow, and overkill if the search expressions are known string constants.

You can solve the problem using methods on NSString, or with a class named NSScanner. Both have been available since iPhone OS 2.0 and long before that, in fact since before Mac OS X 10.0 :).

So what you want is a new method on NSString like this?

@interface NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end;
@end

No problem. We assume it should return nil if no such strings can be found.

The implementation using only NSString is quite straightforward:

@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location != NSNotFound) {
        // Search for end only in the part of the string that follows start.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location != NSNotFound) {
            // Shrink the range so it stops just before end.
            targetRange.length = endRange.location - targetRange.location;
            return [self substringWithRange:targetRange];
        }
    }
    return nil;
}
@end

Or you could do the implementation using the NSScanner class:

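@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSScanner* scanner = [NSScanner scannerWithString:self];
    [scanner setCharactersToBeSkipped:nil];
    // NSScanner ignores case by default; rangeOfString: with options:0 does not.
    [scanner setCaseSensitive:YES];
    // Skip everything before start, then consume start itself.
    [scanner scanUpToString:start intoString:NULL];
    if ([scanner scanString:start intoString:NULL]) {
        NSString* result = nil;
        // scanUpToString: also succeeds when end is missing (it then scans to
        // the end of the string), so confirm that end really follows.
        [scanner scanUpToString:end intoString:&result];
        if ([scanner scanString:end intoString:NULL]) {
            // result is still nil when end directly follows start.
            return result ? result : @"";
        }
    }
    return nil;
}
@end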
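
A short usage sketch may help here. It repeats the NSString-only approach in compact form so that it compiles on its own; the category name CWAddition and the sample markers are only illustrative. It also shows one thing to watch for with the question's example: a plain substring search knows nothing about word boundaries, so searching for "is" first hits the "is" inside "this".

```objc
#import <Foundation/Foundation.h>

// Illustrative category; the name CWAddition is an assumption.
@interface NSString (CWAddition)
- (NSString *)stringBetweenString:(NSString *)start andString:(NSString *)end;
@end

@implementation NSString (CWAddition)
- (NSString *)stringBetweenString:(NSString *)start andString:(NSString *)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location == NSNotFound) return nil;
    // Look for end only after the start marker.
    NSUInteger from = NSMaxRange(startRange);
    NSRange rest = NSMakeRange(from, [self length] - from);
    NSRange endRange = [self rangeOfString:end options:0 range:rest];
    if (endRange.location == NSNotFound) return nil;
    return [self substringWithRange:NSMakeRange(from, endRange.location - from)];
}
@end

int main(void) {
    @autoreleasepool {
        NSString *input = @"Hello world this is a string";

        // Bare markers: the first "is" after "world" is the one inside "this".
        NSString *naive = [input stringBetweenString:@"world" andString:@"is"];
        NSLog(@"[%@]", naive);    // logs [ th]

        // Including the surrounding spaces in the markers gives the intended word.
        NSString *word = [input stringBetweenString:@"world " andString:@" is"];
        NSLog(@"[%@]", word);     // logs [this]

        // A missing marker yields nil.
        NSLog(@"%@", [input stringBetweenString:@"foo" andString:@"is"]);  // logs (null)
    }
    return 0;
}
```

If the markers have to be whole words, either include the surrounding whitespace as above or use the regular expression approach, which can express word boundaries directly.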


Just a simple modification to PeyloW's answer that returns all strings found between the start and end strings:

- (NSMutableArray*) stringsBetweenString:(NSString*)start andString:(NSString*)end {
    NSMutableArray* strings = [NSMutableArray arrayWithCapacity:0];
    NSRange startRange = [self rangeOfString:start];
    while (startRange.location != NSNotFound) {
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location == NSNotFound) {
            break;
        }
        targetRange.length = endRange.location - targetRange.location;
        [strings addObject:[self substringWithRange:targetRange]];

        // Continue searching after the end marker that was just consumed.
        NSRange restOfString;
        restOfString.location = endRange.location + endRange.length;
        restOfString.length = [self length] - restOfString.location;
        startRange = [self rangeOfString:start options:0 range:restOfString];
    }
    return strings;
}


See the ICU user guide on regular expressions.

If you know there'll just be one result:

// Backslashes are doubled because the pattern is written as a string literal.
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];

NSTextCheckingResult *result = [regex firstMatchInString:string
    options:0 range:NSMakeRange(0, [string length])];

// Gets the string inside the first set of parentheses in the regex,
// or nil if the pattern did not match
NSString *inside = result ? [string substringWithRange:[result rangeAtIndex:1]] : nil;

The \b makes sure there's a word boundary before world and after is (so "hello world this isn't a string" wouldn't match). The \s+ gobbles up any whitespace after world and before is. The .+? finds what you're looking for, with the ? making it non-greedy so that "hello world this is a string hello world this is a string" doesn't give you "this is a string hello world this".

I'll leave it up to you to figure out how to handle multiple matches. The NSRegularExpression documentation should help you out.
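
For what it's worth, one way to handle multiple matches is matchesInString:options:range:, which returns every NSTextCheckingResult at once. The sketch below assumes the same pattern as above; the helper name AllCaptures and the sample string are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects capture group 1 of every match in string.
static NSArray *AllCaptures(NSString *string) {
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b"
                             options:0
                               error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }
    NSMutableArray *found = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:string
                                      options:0
                                        range:NSMakeRange(0, [string length])];
    for (NSTextCheckingResult *match in matches) {
        // Index 0 is the whole match; index 1 is the first parenthesised group.
        [found addObject:[string substringWithRange:[match rangeAtIndex:1]]];
    }
    return found;
}

int main(void) {
    @autoreleasepool {
        NSString *string = @"hello world this is a string hello world that is a string";
        NSArray *found = AllCaptures(string);
        NSLog(@"%@", found);   // the array holds "this" and "that"
    }
    return 0;
}
```

For long inputs, enumerateMatchesInString:options:range:usingBlock: does the same job without building the whole array of results first.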

If you want to make sure the match doesn't cross sentence boundaries, you could do ([^.]+?) instead of (.+?), or you could use enumerateSubstringsInRange:options:usingBlock: on your string and pass NSStringEnumerationBySentences in the options.
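
To illustrate the ([^.]+?) variant, here is a small sketch comparing the two patterns on a string where "world" and "is" sit in different sentences. The helper FirstCapture and the sample text are assumptions made for the example, and treating "." as the only sentence terminator is a simplification.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns capture group 1 of the first match, or nil.
static NSString *FirstCapture(NSString *pattern, NSString *string) {
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:pattern options:0 error:NULL];
    NSTextCheckingResult *result = [regex firstMatchInString:string
        options:0 range:NSMakeRange(0, [string length])];
    if (result == nil) return nil;
    return [string substringWithRange:[result rangeAtIndex:1]];
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"A small world after all. This is a string.";

        // (.+?) may run across the full stop into the next sentence.
        NSString *loose = FirstCapture(@"\\bworld\\s+(.+?)\\s+is\\b", text);
        NSLog(@"%@", loose);    // logs: after all. This

        // ([^.]+?) cannot consume a full stop, so there is no match here.
        NSString *strict = FirstCapture(@"\\bworld\\s+([^.]+?)\\s+is\\b", text);
        NSLog(@"%@", strict);   // logs: (null)
    }
    return 0;
}
```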

All of this requires iOS 4.0+. If you want to support iOS 3.0+, look into RegexKitLite.


If it happens to be just strings separated by whitespace, you can use the following code. Either

[string componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]

OR

NSMutableArray *parts = [NSMutableArray arrayWithCapacity:1];

NSScanner *scanner = [NSScanner scannerWithString:string];
NSString *token;
while ([scanner scanUpToCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString:&token]) {
    [parts addObject:token];
}
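
The two variants are not quite equivalent when the input contains runs of whitespace, which the following sketch tries to show. The helper name ScanTokens and the sample string are made up; the scanner is left with its default settings, under which it skips whitespace before each token.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper wrapping the NSScanner loop from above.
static NSArray *ScanTokens(NSString *string) {
    NSMutableArray *parts = [NSMutableArray array];
    NSScanner *scanner = [NSScanner scannerWithString:string];
    NSCharacterSet *space = [NSCharacterSet whitespaceCharacterSet];
    NSString *token = nil;
    while ([scanner scanUpToCharactersFromSet:space intoString:&token]) {
        [parts addObject:token];
    }
    return parts;
}

int main(void) {
    @autoreleasepool {
        NSString *string = @"Hello  world this";   // two spaces after Hello

        // Every separator splits, so adjacent spaces yield an empty string:
        // "Hello", "", "world", "this"
        NSArray *split = [string componentsSeparatedByCharactersInSet:
                             [NSCharacterSet whitespaceCharacterSet]];

        // The scanner skips whitespace runs: "Hello", "world", "this"
        NSArray *parts = ScanTokens(string);

        NSLog(@"%lu vs %lu", (unsigned long)[split count],
                             (unsigned long)[parts count]);   // logs 4 vs 3
    }
    return 0;
}
```

Which behaviour is preferable depends on whether empty fields carry meaning in your input.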