Get the first letter of each word in a string using regex

I'm trying to get the first letter of each word in a string using regex. Here is what I have tried:

public class Test
{
    public static void main(String[] args)
    {
        String name = "First Middle Last";
        for(String s : name.split("(?<=[\\S])[\\S]+")) System.out.println(s);
    }
}

The output is as follows:

F
 M
 L

How can I fix the regex to get the correct output?


Edit: I took some suggestions from the comments, but kept \S because \w only matches letters, digits and the underscore, and might break unexpectedly on any other symbols.

Fixing the regex and still using split:

name.split("(?<=[\\S])[\\S]*\\s*")


Why not simply:

public static void main(String[] args)
{
    String name = "First Middle Last";
    for(String s : name.split("\\s+")) System.out.println(s.charAt(0));
}   
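
One thing to watch with this approach: if the input starts with whitespace (or is empty), split produces an empty first token and charAt(0) throws StringIndexOutOfBoundsException. A minimal sketch of a guarded version follows; the class and method names (WordInitials, initials) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WordInitials {
    // Returns the first character of every whitespace-separated word.
    static List<Character> initials(String name) {
        List<Character> result = new ArrayList<>();
        // trim() first: leading whitespace would otherwise yield an empty first token.
        for (String word : name.trim().split("\\s+")) {
            // An empty or all-blank input still yields one empty token, so skip it.
            if (!word.isEmpty()) {
                result.add(word.charAt(0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (char c : initials("  First   Middle Last ")) {
            System.out.println(c);
        }
    }
}
```

Collecting into a list instead of printing directly is just a choice to make the result easier to check.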


(Disclaimer: I have no experience with Java, so if it handles regexes in ways that render this unhelpful, I apologize.)

If you mean getting rid of the spaces preceding the M and L, try adding optional whitespace at the end of the pattern:

(?<=[\\S])[\\S]+\\s*

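However, this may misbehave on single-letter words, because [\\S]+ needs at least one more character after the first letter, so such a word stays glued to the space and letter that follow it. Using * instead may fix that: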

(?<=[\\S])[\\S]*\\s*
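
To see the difference between the two patterns, here is a small sketch that runs both against an input containing a single-letter word. The class and method names (PlusVersusStar, withPlus, withStar) are hypothetical, and the character classes are written as plain \\S, which is equivalent to [\\S].

```java
import java.util.Arrays;

public class PlusVersusStar {
    // "+" variant: needs at least one character after the first letter.
    static String[] withPlus(String s) {
        return s.split("(?<=\\S)\\S+\\s*");
    }

    // "*" variant: also matches when only whitespace follows the first letter.
    static String[] withStar(String s) {
        return s.split("(?<=\\S)\\S*\\s*");
    }

    public static void main(String[] args) {
        String name = "A Bee Sea";
        System.out.println(Arrays.toString(withPlus(name))); // [A B, S]
        System.out.println(Arrays.toString(withStar(name))); // [A, B, S]
    }
}
```

Neither variant strips whitespace at the very start of the input, so trimming the string first may still be worthwhile.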


Sometimes it is easier to use a different technique. In particular, there's no convenient method for “get all matching regions” (you could build your own I suppose, but that feels like a lot of effort). So we transform to something we can handle:

String name = "First Middle Last";
for (String s : name.replaceAll("\\W*(\\w)\\w*\\W*","$1").split("\\B"))
    System.out.println(s);

We could simplify somewhat if we were allowed to assume there were no leading or trailing non-word characters:

String name = "First Middle Last";
for (String s : name.replaceAll("(\\w)\\w*","$1").split("\\W+"))
    System.out.println(s);
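
For what it's worth, the "build your own" helper mentioned above can be fairly short if it loops over Matcher.find(). This is only a sketch, not the approach used in the snippets above: findAll is a hypothetical name, and the \b\w pattern (a word character right after a word boundary) is my own substitution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllMatches {
    // Hypothetical helper: collects every region of input that regex matches.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // \b\w matches a word character that begins a word.
        for (String s : findAll("\\b\\w", "First Middle Last")) {
            System.out.println(s);
        }
    }
}
```

Unlike the replaceAll-then-split versions, this does not care about leading or trailing non-word characters.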


I recently had this question in an interview and came up with this solution after looking here.

String input = "First Middle Last";
Pattern p = Pattern.compile("(?<=\\s+|^)\\w");
Matcher m = p.matcher(input);

while (m.find()) {
    System.out.println(m.group());
}

This regex won't pick up non-word characters at the start of a word. So if someone enters "Mike !sis Strawberry", the result will be M, S. This is not the case with the selected answer, which returns M, !, S.

The regex works by searching for single word characters (\w) that are preceded by one or more whitespace characters (\s+) or are at the start of the input (^).

To modify what is being searched for, replace \w with any other valid regex construct.

To modify what precedes the search character, modify (\s+|^). In this example \s+ looks for one or more whitespace characters and ^ checks whether the character is at the start of the string being searched. To add further criteria, add a pipe character followed by another valid regex alternative.
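
As an illustration of adding a criterion, here is a sketch that also treats a hyphen as a valid preceding character. Two assumptions to flag: the class and method names (LookbehindInitials, initials) are invented, and the lookbehind uses \s rather than \s+. A lookbehind only has to see the one character directly before the match, so the result should be the same, and older Java versions may reject an unbounded quantifier inside a lookbehind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindInitials {
    // A word character preceded by whitespace, a hyphen, or the start of the input.
    private static final Pattern INITIAL = Pattern.compile("(?<=\\s|-|^)\\w");

    static List<String> initials(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = INITIAL.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(initials("Mary-Jane !sis Smith")); // [M, J, S]
    }
}
```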


This doesn't fix the regex, but calling .trim() on each output string also works:

String name = "First Middle Last";
for(String s : name.split("(?<=[\\S])[\\S]+")) System.out.println(s.trim());

output:

F
M
L