C# Regex: remove words of 3 letters or fewer?
Any ideas on the regex needed to remove words of 3 letters or fewer? It should find "ii it was bbb cat rat hat" etc. but not "four, three, twos".
A regex that matches words of length 1 to 3 is \b\w{1,3}\b; replace these matches with an empty string.
Regex re = new Regex(@"\b\w{1,3}\b");
var result = re.Replace(input, "");
To also remove double spaces you could use:
Regex re = new Regex(@"\s*\b\w{1,3}\b\s*");
var result = re.Replace(input, " ");
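(Although it will leave a space at the beginning/end of the string, and a run of consecutive short words still produces consecutive spaces, because each match is replaced by its own space.)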
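One way around both leftovers is to match a whole run of short words at once, replace the run with a single space, and trim the result. This is only a sketch: the class name `ShortWordRemover` and the method name `RemoveShortWords` are made up for illustration, and the grouped pattern is a variation on the one above, not the original answer's code. Note that `\w` also matches digits and underscores, so "42" counts as a short word here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShortWordRemover
{
    // Matches a run of one or more short words (1 to 3 word characters each)
    // together with the whitespace before, between and after them.
    private static readonly Regex ShortWordRun =
        new Regex(@"\s*(?:\b\w{1,3}\b\s*)+");

    // Hypothetical helper name, not part of any library.
    public static string RemoveShortWords(string input)
    {
        // Replace each whole run with one space, then trim the ends.
        return ShortWordRun.Replace(input, " ").Trim();
    }
}
```

With the sample from the question, `ShortWordRemover.RemoveShortWords("ii it was bbb cat rat hat four, three, twos")` gives "four, three, twos".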
You don't necessarily need a regex for this; it can be done with a simple LINQ selection.
string[] words = inputString.Split(' ');
var longWords = words.Where(x => x.Length > 3);
string outputString = String.Join(" ", longWords.ToArray());
Hell, you could even do it in one line of code:
outputString = String.Join(" ", inputString.Split(' ').Where(x => x.Length > 3).ToArray());
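Splitting on a single space treats tabs and newlines as part of a word. A slightly more forgiving sketch of the same idea is below; the names `WordFilter`, `KeepLongWords` and `maxShortLength` are assumptions for illustration. It relies on two documented behaviours: a null separator array makes `Split` break on any whitespace, and `string.Join` accepts an `IEnumerable<string>` (since .NET 4), so the `ToArray()` call is not needed.

```csharp
using System;
using System.Linq;

public static class WordFilter
{
    // Hypothetical helper: keeps only words longer than maxShortLength.
    public static string KeepLongWords(string input, int maxShortLength = 3)
    {
        // A null separator splits on any whitespace; RemoveEmptyEntries
        // drops the empty strings that repeated separators would produce.
        string[] words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words.Where(w => w.Length > maxShortLength));
    }
}
```

Unlike the regex, this counts punctuation as part of the word, so "cat," has length 4 and would be kept.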
I'm going to go out on a limb here and throw a non-regex solution at you:
public static string StripWordsWithLessThanXLetters(string input, int x)
{
    var inputElements = input.Split(' ');
    var resultBuilder = new StringBuilder();
    foreach (var element in inputElements)
    {
        if (element.Length >= x)
        {
            resultBuilder.Append(element + " ");
        }
    }
    return resultBuilder.ToString().Trim();
}
This is more verbose than the other solutions, but I think the performance cost of the LINQ solution might outweigh its benefit, and a regex incurs similar costs (potentially with more maintenance complexity).
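string qText = "Long or not long sample text";
var inputWords = qText.Split(' ').ToList();
// Keep only the words longer than three characters.
var rem = (from c in inputWords
           where c.Length > 3
           select c).ToList();
// Join the remaining words back into a single string.
string result = String.Join(" ", rem);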
If performance is an issue, here's another implementation that uses neither regex nor LINQ.
It's about 20% to 25% faster than the other solutions.
// Extension method: it has to be declared inside a static class.
public static string StripWordsByLength(this string str, int minLength)
{
    bool addSpace = false;
    StringBuilder resultBuilder = new StringBuilder();
    foreach (string word in str.Split(' '))
    {
        if (word.Length >= minLength)
        {
            // Put a space before every kept word except the first.
            if (addSpace)
                resultBuilder.Append(' ');
            else
                addSpace = true;
            resultBuilder.Append(word);
        }
    }
    return resultBuilder.ToString();
}
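The performance claims in this thread are easiest to settle by measuring on your own input. Below is a rough sketch using `Stopwatch`; the class `RoughTiming` and its members are hypothetical names, the two candidates are condensed versions of the LINQ and `StringBuilder` approaches above, and a single warm-up call plus a timed loop is not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;

public static class RoughTiming
{
    // LINQ candidate, in the style of the one-line answer.
    public static string WithLinq(string s)
    {
        return string.Join(" ", s.Split(' ').Where(w => w.Length > 3));
    }

    // Loop candidate, in the style of the StringBuilder answers.
    public static string WithLoop(string s)
    {
        var sb = new StringBuilder();
        foreach (string word in s.Split(' '))
        {
            if (word.Length > 3)
            {
                // Put a space before every kept word except the first.
                if (sb.Length > 0) sb.Append(' ');
                sb.Append(word);
            }
        }
        return sb.ToString();
    }

    // Calls f once to warm up, then times `iterations` further calls.
    public static TimeSpan Time(Func<string, string> f, string input, int iterations)
    {
        f(input);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            f(input);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

For example, `Console.WriteLine(RoughTiming.Time(RoughTiming.WithLinq, text, 100000));` and the same call with `RoughTiming.WithLoop` print the two elapsed times for comparison. The numbers depend on the input, the runtime and the machine, so treat any fixed percentage with caution.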