Most efficient way to make the first character of a String lower case?

2023-01-23 04:59 Q&A author:

What is the most efficient way to make the first character of a String lower case?

I can think of a number of ways to do this:

Using charAt() with substring()

String input   = "SomeInputString";
String output  = Character.toLowerCase(input.charAt(0)) +
                   (input.length() > 1 ? input.substring(1) : "");

Or using a char array

 String input  = "SomeInputString";
 char c[]      = input.toCharArray();
 c[0]          = Character.toLowerCase(c[0]);
 String output = new String(c);

I am sure there are many other great ways to achieve this. What do you recommend?

I tested the promising approaches using JMH. Full benchmark code.

Assumption during the tests (to avoid checking the corner cases every time): the input String length is always greater than 1.

Results

Benchmark           Mode  Cnt         Score        Error  Units
MyBenchmark.test1  thrpt   20  10463220.493 ± 288805.068  ops/s
MyBenchmark.test2  thrpt   20  14730158.709 ± 530444.444  ops/s
MyBenchmark.test3  thrpt   20  16079551.751 ±  56884.357  ops/s
MyBenchmark.test4  thrpt   20   9762578.446 ± 584316.582  ops/s
MyBenchmark.test5  thrpt   20   6093216.066 ± 180062.872  ops/s
MyBenchmark.test6  thrpt   20   2104102.578 ±  18705.805  ops/s

The scores are operations per second; higher is better.

Tests

test1 was Andy's first approach, which is also Hllink's:

string = Character.toLowerCase(string.charAt(0)) + string.substring(1);

test2 was Andy's second approach. It is also what Introspector.decapitalize(), suggested by Daniel, does, but without its two if statements. The first if (the null/empty check) was removed because of the testing assumption. The second was removed because it breaks correctness here: it returns the string unchanged when its first two characters are both upper case, so the input "HI" would return "HI". This was almost the fastest.
```
char c[] = string.toCharArray();
c[0] = Character.toLowerCase(c[0]);
string = new String(c);
```
test3 was a modification of test2, but instead of calling Character.toLowerCase() I added 32 to the first character, which gives the correct result only if that character is an upper-case ASCII letter (A-Z). This was the fastest. c[0] |= ' ' from Mike's comment gave the same performance.
```
char c[] = string.toCharArray();
c[0] += 32;
string = new String(c);
```
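To make the ASCII restriction concrete, here is a small sketch comparing the two variants on a few inputs. The class and method names are made up for illustration; only the two one-line tricks come from the benchmark above.

```java
// Illustration only: AsciiLowerDemo and its method names are hypothetical.
class AsciiLowerDemo {

    // test3 as written: blindly adds 32 to the first char.
    static String addThirtyTwo(String s) {
        char[] c = s.toCharArray();
        c[0] += 32;
        return new String(c);
    }

    // Variant from Mike's comment: sets bit 0x20 instead of adding.
    static String orSpace(String s) {
        char[] c = s.toCharArray();
        c[0] |= ' ';
        return new String(c);
    }

    public static void main(String[] args) {
        System.out.println(addThirtyTwo("Some")); // some ('S' is 83, 's' is 115)
        System.out.println(orSpace("Some"));      // some
        System.out.println(orSpace("some"));      // some (bit already set, unchanged)

        // Adding 32 to an already lower-case letter leaves the letter range:
        System.out.println((int) addThirtyTwo("some").charAt(0)); // 147

        // Neither trick is safe for non-letters: '@' (0x40) becomes '`' (0x60).
        System.out.println(orSpace("@home"));     // `home
    }
}
```

The |= variant is at least idempotent for ASCII letters, which makes it the more forgiving of the two if the input might already be lower case.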

test4 used StringBuilder.

StringBuilder sb = new StringBuilder(string);
sb.setCharAt(0, Character.toLowerCase(sb.charAt(0)));
string = sb.toString();

test5 used two substring() calls.

string = string.substring(0, 1).toLowerCase() + string.substring(1);

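test6 used reflection to change char value[] directly in String. This was the slowest. It also mutates the original String in place, and on Java 9 and later the value field is a byte[] rather than a char[], so this code no longer works there.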

try {
    Field field = String.class.getDeclaredField("value");
    field.setAccessible(true);
    char[] value = (char[]) field.get(string);
    value[0] = Character.toLowerCase(value[0]);
} catch (IllegalAccessException e) {
    e.printStackTrace();
} catch (NoSuchFieldException e) {
    e.printStackTrace();
}

Conclusions

If the String length is always greater than 0, use test2.

If not, we have to check the corner cases:

public static String decapitalize(String string) {
    if (string == null || string.length() == 0) {
        return string;
    }

    char c[] = string.toCharArray();
    c[0] = Character.toLowerCase(c[0]);

    return new String(c);
}
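A quick check of the corner cases, as a sketch. The wrapper class name is hypothetical, and the method body is repeated only so the example compiles on its own.

```java
// Illustration only: DecapitalizeCheck is a hypothetical wrapper class.
class DecapitalizeCheck {

    // Same logic as the decapitalize method above.
    static String decapitalize(String string) {
        if (string == null || string.length() == 0) {
            return string;
        }
        char[] c = string.toCharArray();
        c[0] = Character.toLowerCase(c[0]);
        return new String(c);
    }

    public static void main(String[] args) {
        System.out.println(decapitalize(null));              // null
        System.out.println(decapitalize("").isEmpty());      // true
        System.out.println(decapitalize("X"));               // x
        System.out.println(decapitalize("HI"));              // hI
        System.out.println(decapitalize("SomeInputString")); // someInputString
    }
}
```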

If you are sure that your text will always be ASCII with an upper-case first letter, and you need extreme performance because you found this code to be a bottleneck, use test3.

I came across a nice alternative if you don't want to use a third-party library:

import java.beans.Introspector;

Assert.assertEquals("someInputString", Introspector.decapitalize("SomeInputString"));
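One thing to be aware of: Introspector.decapitalize follows the JavaBeans naming rules, so it leaves a string alone when its first two characters are both upper case. A minimal sketch of that behaviour (the demo class name is made up):

```java
import java.beans.Introspector;

// Illustration only: IntrospectorDemo is a hypothetical class name.
class IntrospectorDemo {
    public static void main(String[] args) {
        System.out.println(Introspector.decapitalize("SomeInputString")); // someInputString
        System.out.println(Introspector.decapitalize("Hi"));              // hi

        // First two characters upper case: returned unchanged.
        System.out.println(Introspector.decapitalize("HI"));              // HI
        System.out.println(Introspector.decapitalize("URLParser"));       // URLParser
    }
}
```

If you need "URLParser" to become "uRLParser", the plain char-array version is the one to use.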

When it comes to string manipulation, take a look at StringUtils in Apache Commons Lang (formerly Jakarta Commons Lang).

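If you want to use Apache Commons you can do the following (note that WordUtils.uncapitalize lower-cases the first letter of every whitespace-separated word, not only the first one):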

import org.apache.commons.lang3.text.WordUtils;
[...] 
String s = "SomeString"; 
String firstLower = WordUtils.uncapitalize(s);

Result: someString

Instead of a char-oriented approach I would suggest a String-oriented solution. String.toLowerCase is locale-specific, so I would take this into account. The Javadoc of Character.toLowerCase itself says that String.toLowerCase should generally be preferred for lower-casing. Also, a char-oriented solution is not fully Unicode-compatible, because Character.toLowerCase(char) cannot handle supplementary characters.

public static final String uncapitalize(final String originalStr,
        final Locale locale) {
    final int splitIndex = 1;
    final String result;
    if (originalStr.isEmpty()) {
        result = originalStr;
    } else {
        final String first = originalStr.substring(0, splitIndex).toLowerCase(
                locale);
        final String rest = originalStr.substring(splitIndex);
        final StringBuilder uncapStr = new StringBuilder(first).append(rest);
        result = uncapStr.toString();
    }
    return result;
}

UPDATE: As an example of how important the locale setting is, let us lower-case I in Turkish and German:

System.out.println(uncapitalize("I", new Locale("tr", "TR")));
System.out.println(uncapitalize("I", new Locale("de", "DE")));

will output two different results:

ı

i
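Since supplementary characters were mentioned: splitting at index 1 would cut a surrogate pair in half. Below is a sketch of a variant that splits after the first code point instead. This is not the method above but a modification of it, and the class name is hypothetical.

```java
import java.util.Locale;

// Illustration only: a code-point-aware variant, not the original method.
class CodePointUncapitalize {

    static String uncapitalize(String s, Locale locale) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        // Index just past the first code point: 1, or 2 for a surrogate pair.
        int split = s.offsetByCodePoints(0, 1);
        return s.substring(0, split).toLowerCase(locale) + s.substring(split);
    }

    public static void main(String[] args) {
        // Locale-sensitive: Turkish maps 'I' to dotless 'ı' (U+0131).
        System.out.println(uncapitalize("I", Locale.forLanguageTag("tr")));
        System.out.println(uncapitalize("I", Locale.GERMAN));

        // U+10400 (a Deseret capital letter) lies outside the BMP and takes two chars.
        String deseret = new String(Character.toChars(0x10400)) + "x";
        String lowered = uncapitalize(deseret, Locale.ROOT);
        System.out.println(Integer.toHexString(lowered.codePointAt(0))); // 10428
    }
}
```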

Strings in Java are immutable, so either way a new string will be created.

Your first example will probably be slightly more efficient because it only needs to create a new string and not a temporary character array.

A very short and simple static method to achieve what you want:

public static String decapitalizeString(String string) {
    return string == null || string.isEmpty() ? "" : Character.toLowerCase(string.charAt(0)) + string.substring(1);
}

In Scala:

val str = "Hello"
s"${str.head.toLower}${str.tail}"

Result:

res4: String = hello

If what you need is very simple (e.g. Java class names, no locales), you can also use the CaseFormat class in the Google Guava library.

String converted = CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_CAMEL, "FooBar");
assertEquals("fooBar", converted);

Or you can prepare and reuse a converter object, which could be more efficient.

Converter<String, String> converter=
    CaseFormat.UPPER_CAMEL.converterTo(CaseFormat.LOWER_CAMEL);

assertEquals("fooBar", converter.convert("FooBar"));

To better understand the philosophy behind Google Guava's string manipulation, check out this wiki page.

String testString = "SomeInputString";
String firstLetter = testString.substring(0,1).toLowerCase();
String restLetters = testString.substring(1);
String resultString = firstLetter + restLetters;

I came across this only today and tried to do it myself in the most pedestrian way. It took one line, though a longish one. Here goes:

String str = "TaxoRank"; 

System.out.println(" Before str = " + str); 

str = str.replaceFirst(str.substring(0,1), str.substring(0,1).toLowerCase());

System.out.println(" After str = " + str);

Gives:

Before str = TaxoRank

After str = taxoRank
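One caveat with this approach: replaceFirst treats its first argument as a regular expression and its second as a replacement pattern, so a leading character such as ( makes it throw. A sketch of a safer variant, assuming you want to keep the replaceFirst idea (the class and method names are made up):

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustration only: ReplaceFirstDemo and lowerFirstViaReplace are hypothetical names.
class ReplaceFirstDemo {

    static String lowerFirstViaReplace(String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        String first = str.substring(0, 1);
        // Quote both sides so regex metacharacters and '$' are taken literally.
        return str.replaceFirst(Pattern.quote(first),
                Matcher.quoteReplacement(first.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(lowerFirstViaReplace("TaxoRank")); // taxoRank
        System.out.println(lowerFirstViaReplace("(Draft)"));  // (Draft)

        // The unquoted version fails on the same input:
        try {
            "(Draft)".replaceFirst("(", "(");
        } catch (PatternSyntaxException e) {
            System.out.println("unquoted version threw PatternSyntaxException");
        }
    }
}
```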
