How do I remove all non alphanumeric characters from a string except dash?
How do I remo开发者_开发技巧ve all non alphanumeric characters from a string except dash and space characters?
Replace [^a-zA-Z0-9 -]
with an empty string.
Regex rgx = new Regex("[^a-zA-Z0-9 -]");
str = rgx.Replace(str, "");
I could have used RegEx, they can provide elegant solution but they can cause performane issues. Here is one solution
char[] arr = str.ToCharArray();
arr = Array.FindAll<char>(arr, (c => (char.IsLetterOrDigit(c)
|| char.IsWhiteSpace(c)
|| c == '-')));
str = new string(arr);
When using the compact framework (which doesn't have FindAll)
Replace FindAll with1
char[] arr = str.Where(c => (char.IsLetterOrDigit(c) ||
char.IsWhiteSpace(c) ||
c == '-')).ToArray();
str = new string(arr);
1 Comment by ShawnFeatherly
You can try:
string s1 = Regex.Replace(s, "[^A-Za-z0-9 -]", "");
Where s
is your string.
Using System.Linq
string withOutSpecialCharacters = new string(stringWithSpecialCharacters.Where(c =>char.IsLetterOrDigit(c) || char.IsWhiteSpace(c) || c == '-').ToArray());
The regex is [^\w\s\-]*
:
\s
is better to use instead of space (), because there might be a tab in the text.
Based on the answer for this question, I created a static class and added these. Thought it might be useful for some people.
public static class RegexConvert
{
public static string ToAlphaNumericOnly(this string input)
{
Regex rgx = new Regex("[^a-zA-Z0-9]");
return rgx.Replace(input, "");
}
public static string ToAlphaOnly(this string input)
{
Regex rgx = new Regex("[^a-zA-Z]");
return rgx.Replace(input, "");
}
public static string ToNumericOnly(this string input)
{
Regex rgx = new Regex("[^0-9]");
return rgx.Replace(input, "");
}
}
Then the methods can be used as:
string example = "asdf1234!@#$";
string alphanumeric = example.ToAlphaNumericOnly();
string alpha = example.ToAlphaOnly();
string numeric = example.ToNumericOnly();
Want something quick?
public static class StringExtensions
{
public static string ToAlphaNumeric(this string self,
params char[] allowedCharacters)
{
return new string(Array.FindAll(self.ToCharArray(),
c => char.IsLetterOrDigit(c) ||
allowedCharacters.Contains(c)));
}
}
This will allow you to specify which characters you want to allow as well.
Here is a non-regex heap allocation friendly fast solution which was what I was looking for.
Unsafe edition.
public static unsafe void ToAlphaNumeric(ref string input)
{
fixed (char* p = input)
{
int offset = 0;
for (int i = 0; i < input.Length; i++)
{
if (char.IsLetterOrDigit(p[i]))
{
p[offset] = input[i];
offset++;
}
}
((int*)p)[-1] = offset; // Changes the length of the string
p[offset] = '\0';
}
}
And for those who don't want to use unsafe or don't trust the string length hack.
public static string ToAlphaNumeric(string input)
{
int j = 0;
char[] newCharArr = new char[input.Length];
for (int i = 0; i < input.Length; i++)
{
if (char.IsLetterOrDigit(input[i]))
{
newCharArr[j] = input[i];
j++;
}
}
Array.Resize(ref newCharArr, j);
return new string(newCharArr);
}
I´ve made a different solution, by eliminating the Control characters, which was my original problem.
It is better than putting in a list all the "special but good" chars
char[] arr = str.Where(c => !char.IsControl(c)).ToArray();
str = new string(arr);
it´s simpler, so I think it´s better !
Here's an extension method using @ata answer as inspiration.
"hello-world123, 456".MakeAlphaNumeric(new char[]{'-'});// yields "hello-world123456"
or if you require additional characters other than hyphen...
"hello-world123, 456!?".MakeAlphaNumeric(new char[]{'-','!'});// yields "hello-world123456!"
public static class StringExtensions
{
public static string MakeAlphaNumeric(this string input, params char[] exceptions)
{
var charArray = input.ToCharArray();
var alphaNumeric = Array.FindAll<char>(charArray, (c => char.IsLetterOrDigit(c)|| exceptions?.Contains(c) == true));
return new string(alphaNumeric);
}
}
If you are working in JS, here is a very terse version
myString = myString.replace(/[^A-Za-z0-9 -]/g, "");
I use a variation of one of the answers here. I want to replace spaces with "-" so its SEO friendly and also make lower case. Also not reference system.web from my services layer.
private string MakeUrlString(string input)
{
var array = input.ToCharArray();
array = Array.FindAll<char>(array, c => char.IsLetterOrDigit(c) || char.IsWhiteSpace(c) || c == '-');
var newString = new string(array).Replace(" ", "-").ToLower();
return newString;
}
There is a much easier way with Regex.
private string FixString(string str)
{
return string.IsNullOrEmpty(str) ? str : Regex.Replace(str, "[\\D]", "");
}
精彩评论