开发者

Replacing all non-alphanumeric characters with empty strings

I tried using this but didn't work-

return value.repl开发者_开发知识库aceAll("/[^A-Za-z0-9 ]/", "");


Use [^A-Za-z0-9].

Note: removed the space since that is not typically considered alphanumeric.


Try

return value.replaceAll("[^A-Za-z0-9]", "");

or

return value.replaceAll("[\\W]|_", "");


You should be aware that [^a-zA-Z] will replace characters not being itself in the character range A-Z/a-z. That means special characters like é, ß etc. or cyrillic characters and such will be removed.

If the replacement of these characters is not wanted use pre-defined character classes instead:

 str.replaceAll("[^\\p{IsAlphabetic}\\p{IsDigit}]", "");

PS: \p{Alnum} does not achieve this effect, it acts the same as [A-Za-z0-9].


return value.replaceAll("[^A-Za-z0-9 ]", "");

This will leave spaces intact. I assume that's what you want. Otherwise, remove the space from the regex.


You could also try this simpler regex:

 str = str.replaceAll("\\P{Alnum}", "");


Java's regular expressions don't require you to put a forward-slash (/) or any other delimiter around the regex, as opposed to other languages like Perl, for example.


Solution:

value.replaceAll("[^A-Za-z0-9]", "")

Explanation:

[^abc] When a caret ^ appears as the first character inside square brackets, it negates the pattern. This pattern matches any character except a or b or c.

Looking at the keyword as two function:

  • [(Pattern)] = match(Pattern)
  • [^(Pattern)] = notMatch(Pattern)

Moreover regarding a pattern:

  • A-Z = all characters included from A to Z

  • a-z = all characters included from a to z

  • 0=9 = all characters included from 0 to 9

Therefore it will substitute all the char NOT included in the pattern


I made this method for creating filenames:

public static String safeChar(String input)
{
    char[] allowed = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_".toCharArray();
    char[] charArray = input.toString().toCharArray();
    StringBuilder result = new StringBuilder();
    for (char c : charArray)
    {
        for (char a : allowed)
        {
            if(c==a) result.append(a);
        }
    }
    return result.toString();
}


If you want to also allow alphanumeric characters which don't belong to the ascii characters set, like for instance german umlaut's, you can consider using the following solution:

 String value = "your value";

 // this could be placed as a static final constant, so the compiling is only done once
 Pattern pattern = Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

 value = pattern.matcher(value).replaceAll("");

Please note that the usage of the UNICODE_CHARACTER_CLASS flag could have an impose on performance penalty (see javadoc of this flag)


Using Guava you can easily combine different type of criteria. For your specific solution you can use:

value = CharMatcher.inRange('0', '9')
        .or(CharMatcher.inRange('a', 'z')
        .or(CharMatcher.inRange('A', 'Z'))).retainFrom(value)


Simple method:

public boolean isBlank(String value) {
    return (value == null || value.equals("") || value.equals("null") || value.trim().equals(""));
}

public String normalizeOnlyLettersNumbers(String str) {
    if (!isBlank(str)) {
        return str.replaceAll("[^\\p{L}\\p{Nd}]+", "");
    } else {
        return "";
    }
}


public static void main(String[] args) {
    String value = " Chlamydia_spp. IgG, IgM & IgA Abs (8006) ";

    System.out.println(value.replaceAll("[^A-Za-z0-9]", ""));

}

output: ChlamydiasppIgGIgMIgAAbs8006

Github: https://github.com/AlbinViju/Learning/blob/master/StripNonAlphaNumericFromString.java


Guava's CharMatcher provides a concise solution:

output = CharMatcher.javaLetterOrDigit().retainFrom(input);


Dart

If you tried this and it didn't work..

value.replaceAll("[^A-Za-z0-9]", "");

Just use RegExp like this:

value.replaceAll(RegExp("[^A-Za-z0-9]"), "");

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜