CamelCase to underscores and back in Objective-C
I'm looking for a simple, efficient way to convert strings in CamelCase to underscore notation (e.g., MyClassName -> my_class_name) and back again in Objective-C.
My current solution involves lots of rangeOfString, characterAtIndex, and replaceCharactersInRange operations on NSMutableStrings, and is just plain ugly as hell :) It seems that there must be a better solution, but I'm not sure what it is.
I'd rather not import a regex library just for this one use case, though that is an option if all else fails.
Chris's suggestion of RegexKitLite is good. It's an excellent toolkit, but this could be done pretty easily with NSScanner. Use -scanCharactersFromSet:intoString: alternating between +uppercaseLetterCharacterSet and +lowercaseLetterCharacterSet. For going back, you'd use -scanUpToCharactersFromSet:intoString: instead, using a character set with just an underscore in it.
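A minimal sketch of the "going back" direction with NSScanner, assuming ASCII identifiers. The function name ScannerUnderscoresToCamelCase is hypothetical and not part of the answer above; the first word is left as-is, so my_class_name becomes myClassName.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): underscores -> camelCase via NSScanner.
static NSString *ScannerUnderscoresToCamelCase(NSString *input) {
    NSScanner *scanner = [NSScanner scannerWithString:input];
    // NSScanner skips whitespace by default; disable that so nothing is dropped silently.
    scanner.charactersToBeSkipped = nil;
    NSCharacterSet *underscore = [NSCharacterSet characterSetWithCharactersInString:@"_"];
    NSMutableString *output = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *word = nil;
        // Read everything up to the next underscore (or the end of the string).
        if ([scanner scanUpToCharactersFromSet:underscore intoString:&word]) {
            if (output.length == 0) {
                // First word keeps its original case.
                [output appendString:word];
            } else {
                // Later words get their first character uppercased.
                [output appendString:[[word substringToIndex:1] uppercaseString]];
                [output appendString:[word substringFromIndex:1]];
            }
        }
        // Consume the run of underscores that stopped the previous scan, if any.
        [scanner scanCharactersFromSet:underscore intoString:NULL];
    }
    return output;
}
```

Each pass through the loop consumes either a word or a run of underscores, so the scanner always advances.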
How about these:
NSString *MyCamelCaseToUnderscores(NSString *input) {
NSMutableString *output = [NSMutableString string];
NSCharacterSet *uppercase = [NSCharacterSet uppercaseLetterCharacterSet];
for (NSInteger idx = 0; idx < [input length]; idx += 1) {
unichar c = [input characterAtIndex:idx];
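if ([uppercase characterIsMember:c]) {
// No separator before a leading capital, so MyClassName becomes my_class_name.
if (idx > 0) [output appendString:@"_"];
[output appendString:[[NSString stringWithCharacters:&c length:1] lowercaseString]];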
} else {
[output appendFormat:@"%C", c];
}
}
return output;
}
NSString *MyUnderscoresToCamelCase(NSString *underscores) {
NSMutableString *output = [NSMutableString string];
BOOL makeNextCharacterUpperCase = NO;
for (NSInteger idx = 0; idx < [underscores length]; idx += 1) {
unichar c = [underscores characterAtIndex:idx];
if (c == '_') {
makeNextCharacterUpperCase = YES;
} else if (makeNextCharacterUpperCase) {
[output appendString:[[NSString stringWithCharacters:&c length:1] uppercaseString]];
makeNextCharacterUpperCase = NO;
} else {
[output appendFormat:@"%C", c];
}
}
return output;
}
Some drawbacks are that they do use temporary strings to convert between upper and lower case, and they don't have any logic for acronyms, so myURL will result in my_u_r_l.
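If the temporary strings are the concern, one possible variation is to work on a raw unichar buffer. This is only a sketch under the assumption that the input is ASCII (it lowercases with plain arithmetic on A-Z); the name BufferedCamelCaseToUnderscores is hypothetical, and it still has no acronym handling.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper (name is an assumption): camelCase -> underscores
// without creating a temporary NSString per character. ASCII letters only.
static NSString *BufferedCamelCaseToUnderscores(NSString *input) {
    NSUInteger length = input.length;
    if (length == 0) {
        return @"";
    }
    unichar *source = malloc(length * sizeof(unichar));
    // Worst case: every character gains an underscore, so twice the length.
    unichar *dest = malloc(2 * length * sizeof(unichar));
    if (source == NULL || dest == NULL) {
        free(source);
        free(dest);
        return nil;
    }
    [input getCharacters:source range:NSMakeRange(0, length)];

    NSUInteger out = 0;
    for (NSUInteger i = 0; i < length; i++) {
        unichar c = source[i];
        if (c >= 'A' && c <= 'Z') {
            // No separator before a leading capital.
            if (i > 0) {
                dest[out++] = '_';
            }
            dest[out++] = (unichar)(c + ('a' - 'A'));
        } else {
            dest[out++] = c;
        }
    }
    NSString *result = [NSString stringWithCharacters:dest length:out];
    free(source);
    free(dest);
    return result;
}
```

Whether this is measurably faster depends on the input sizes involved, so it is worth profiling before adopting it.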
Try this magic:
NSString* camelCaseString = @"myBundleVersion";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(?<=[a-z])([A-Z])|([A-Z])(?=[a-z])" options:0 error:nil];
NSString *underscoreString = [[regex stringByReplacingMatchesInString:camelCaseString options:0 range:NSMakeRange(0, camelCaseString.length) withTemplate:@"_$1$2"] lowercaseString];
NSLog(@"%@", underscoreString);
Output: my_bundle_version
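The same pattern can be wrapped in a function to see how it treats acronyms and a leading capital. This is a sketch, not part of the answer above: the name RegexCamelCaseToUnderscores is hypothetical, and stripping the leading underscore is an added step, needed because the second alternative also matches a capital at the very start of the string.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper (name is an assumption) around the pattern shown above.
static NSString *RegexCamelCaseToUnderscores(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<=[a-z])([A-Z])|([A-Z])(?=[a-z])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    // Only one of the two groups takes part in each match; the other expands to nothing.
    NSString *result = [[regex stringByReplacingMatchesInString:input
                                                        options:0
                                                          range:NSMakeRange(0, input.length)
                                                   withTemplate:@"_$1$2"] lowercaseString];
    // Added step: "CamelCase" would otherwise come out as "_camel_case".
    if ([result hasPrefix:@"_"] && ![input hasPrefix:@"_"]) {
        result = [result substringFromIndex:1];
    }
    return result;
}
```

With this pattern a run of capitals stays together, so camelURLCase comes out as camel_url_case rather than camel_u_r_l_case.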
If your concern is just the visibility of your code, you could make a category for NSString using the methods you've designed already. That way, you only see the ugly mess once. ;)
For instance:
@interface NSString (Conversions)
- (NSString *)asCamelCase;
- (NSString *)asUnderscored;
@end
@implementation NSString (Conversions)
- (NSString *)asCamelCase {
// whatever you came up with
}
- (NSString *)asUnderscored {
// whatever you came up with
}
@end
EDIT: After a quick Google search, I couldn't find any way of doing this, even in plain C. However, I did find a framework that could be useful. It's called RegexKitLite. It uses the built-in ICU library, so it only adds about 20K to the final binary.
Here's my implementation of Rob's answer:
@implementation NSString (CamelCaseConversion)
// Convert a camel case string into a dashed, word-separated string.
// In case of scanning error, return nil.
// Camel case string must not start with a capital.
- (NSString *)fromCamelCaseToDashed {
NSScanner *scanner = [NSScanner scannerWithString:self];
scanner.caseSensitive = YES;
NSString *builder = [NSString string];
NSString *buffer = nil;
NSUInteger lastScanLocation = 0;
while ([scanner isAtEnd] == NO) {
if ([scanner scanCharactersFromSet:[NSCharacterSet lowercaseLetterCharacterSet] intoString:&buffer]) {
builder = [builder stringByAppendingString:buffer];
if ([scanner scanCharactersFromSet:[NSCharacterSet uppercaseLetterCharacterSet] intoString:&buffer]) {
builder = [builder stringByAppendingString:@"-"];
builder = [builder stringByAppendingString:[buffer lowercaseString]];
}
}
// If the scanner location has not moved, there's a problem somewhere.
if (lastScanLocation == scanner.scanLocation) return nil;
lastScanLocation = scanner.scanLocation;
}
return builder;
}
@end
Here's yet another version based on all the above. This version handles additional forms. In particular, tested with the following:
camelCase => camel_case
camelCaseWord => camel_case_word
camelURL => camel_url
camelURLCase => camel_url_case
CamelCase => camel_case
Here goes
- (NSString *)fromCamelCaseToDashed3 {
NSMutableString *output = [NSMutableString string];
NSCharacterSet *uppercase = [NSCharacterSet uppercaseLetterCharacterSet];
BOOL previousCharacterWasUppercase = FALSE;
BOOL currentCharacterIsUppercase = FALSE;
unichar currentChar = 0;
unichar previousChar = 0;
for (NSInteger idx = 0; idx < [self length]; idx += 1) {
previousChar = currentChar;
currentChar = [self characterAtIndex:idx];
previousCharacterWasUppercase = currentCharacterIsUppercase;
currentCharacterIsUppercase = [uppercase characterIsMember:currentChar];
if (!previousCharacterWasUppercase && currentCharacterIsUppercase && idx > 0) {
// insert an _ between the characters
[output appendString:@"_"];
} else if (previousCharacterWasUppercase && !currentCharacterIsUppercase) {
// insert an _ before the previous character
// insert an _ before the last character in the string
if ([output length] > 1) {
unichar charTwoBack = [output characterAtIndex:[output length]-2];
if (charTwoBack != '_') {
[output insertString:@"_" atIndex:[output length]-1];
}
}
}
// Append the current character lowercase
[output appendString:[[NSString stringWithCharacters:&currentChar length:1] lowercaseString]];
}
return output;
}
If you are concerned with speed, you probably want a more performant version. This one also splits at the boundaries between letters and digits:
- (nonnull NSString *)camelCaseToSnakeCaseString {
if ([self length] == 0) {
return @"";
}
NSMutableString *output = [NSMutableString string];
NSCharacterSet *digitSet = [NSCharacterSet decimalDigitCharacterSet];
NSCharacterSet *uppercaseSet = [NSCharacterSet uppercaseLetterCharacterSet];
NSCharacterSet *lowercaseSet = [NSCharacterSet lowercaseLetterCharacterSet];
for (NSInteger idx = 0; idx < [self length]; idx += 1) {
unichar c = [self characterAtIndex:idx];
// if it's the last one then just append lowercase of character
if (idx == [self length] - 1) {
if ([uppercaseSet characterIsMember:c]) {
[output appendFormat:@"%@", [[NSString stringWithCharacters:&c length:1] lowercaseString]];
}
else {
[output appendFormat:@"%C", c];
}
continue;
}
unichar nextC = [self characterAtIndex:(idx+1)];
// this logic finds the boundaries between lowercase/uppercase/digits and lets the string be split accordingly.
if ([lowercaseSet characterIsMember:c] && [uppercaseSet characterIsMember:nextC]) {
[output appendFormat:@"%@_", [[NSString stringWithCharacters:&c length:1] lowercaseString]];
}
else if ([lowercaseSet characterIsMember:c] && [digitSet characterIsMember:nextC]) {
[output appendFormat:@"%@_", [[NSString stringWithCharacters:&c length:1] lowercaseString]];
}
else if ([digitSet characterIsMember:c] && [uppercaseSet characterIsMember:nextC]) {
[output appendFormat:@"%@_", [[NSString stringWithCharacters:&c length:1] lowercaseString]];
}
else {
// Append lowercase of character
if ([uppercaseSet characterIsMember:c]) {
[output appendFormat:@"%@", [[NSString stringWithCharacters:&c length:1] lowercaseString]];
}
else {
[output appendFormat:@"%C", c];
}
}
}
return output;
}
I have combined the answers found here into my refactoring library, es_ios_utils. See NSCategories.h:
@property(nonatomic, readonly) NSString *asCamelCaseFromUnderscores;
@property(nonatomic, readonly) NSString *asUnderscoresFromCamelCase;
Usage:
@"my_string".asCamelCaseFromUnderscores
yields @"myString"
Please push improvements!
I happened upon this question looking for a way to convert camel case to a spaced, user-displayable string. Here is my solution, which worked better than replacing @"_" with @" ":
- (NSString *)fromCamelCaseToSpaced:(NSString*)input {
NSCharacterSet* lower = [NSCharacterSet lowercaseLetterCharacterSet];
NSCharacterSet* upper = [NSCharacterSet uppercaseLetterCharacterSet];
for (int i = 1; i < input.length; i++) {
if ([upper characterIsMember:[input characterAtIndex:i]] &&
[lower characterIsMember:[input characterAtIndex:i-1]])
{
NSString* soFar = [input substringToIndex:i];
NSString* left = [input substringFromIndex:i];
return [NSString stringWithFormat:@"%@ %@", soFar, [self fromCamelCaseToSpaced:left]];
}
}
return input;
}
OK guys. Here is an all regex answer, which I consider the only true way:
Given:
NSError *error = nil;
NSString *MYSTRING = @"foo_bar";
NSRegularExpression *_toCamelCase = [NSRegularExpression
regularExpressionWithPattern:@"(_)([a-z])"
options:NSRegularExpressionCaseInsensitive error:&error];
NSString *camelCaseAttribute = [_toCamelCase
stringByReplacingMatchesInString:MYSTRING options:0
range:NSMakeRange(0, MYSTRING.length)
withTemplate:@"\\U$2"];
Yields fooBar.
Conversely:
NSError *error = nil;
NSString *MYSTRING = @"fooBar";
NSRegularExpression *camelCaseTo_ = [NSRegularExpression
regularExpressionWithPattern:@"([A-Z])"
options:0 error:&error];
NSString *underscoreParsedAttribute = [camelCaseTo_
stringByReplacingMatchesInString:MYSTRING
options:0 range:NSMakeRange(0, MYSTRING.length)
withTemplate:@"_$1"];
underscoreParsedAttribute = [underscoreParsedAttribute lowercaseString];
Yields: foo_bar.
\U$2 replaces second capture group with upper-case version of itself :D
\L$1 however, oddly, does not replace the first capture group with a lower-case version of itself :( Not sure why, it should work. :/
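Since template case operators such as \U and \L may not behave the same way in every environment, a more conservative option is to let the regex find the matches and do the case conversion in code. A sketch under that assumption; the name RegexUnderscoresToCamelCase is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): underscores -> camelCase,
// using the regex only to locate "_x" pairs and uppercasing in code.
static NSString *RegexUnderscoresToCamelCase(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"_([a-z])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    NSMutableString *output = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    // Walk the matches back to front so the ranges of earlier matches stay valid
    // while the string shrinks.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *letter = [input substringWithRange:[match rangeAtIndex:1]];
        [output replaceCharactersInRange:match.range
                              withString:[letter uppercaseString]];
    }
    return output;
}
```

This keeps the lowercasing and uppercasing in ordinary NSString calls, the same way the camelCase to underscore snippet above falls back to lowercaseString.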