Searching String for specific Word. C#

2023-02-09 05:36

I would like to search a string for a specific word that the user types in, and then output the percentage of the text that this word makes up. What would be the best method for this? Any help would be appreciated.

I suggest using the String.Equals overload that takes a StringComparison. It performs better than lowercasing both strings, because it does not allocate a converted copy of each word.

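var separators = new [] { ' ', ',', '.', '?', '!', ';', ':', '\"' };
// RemoveEmptyEntries drops the empty strings produced by adjacent separators such as ", "
var words = sentence.Split (separators, StringSplitOptions.RemoveEmptyEntries);
var matches = words.Count (w =>
    w.Equals (searchedWord, StringComparison.OrdinalIgnoreCase));
// words is an array, so use Length; the cast to float avoids integer division
var percentage = matches / (float) words.Length;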

Note that percentage will be float, e.g. 0.5 for 50%.
You can format it for display using ToString overload:

var formatted = percentage.ToString ("P0"); // 0.1234 => 12 %

You can also change format specifier to show decimal places:

var formatted = percentage.ToString ("P2"); // 0.1234 => 12.34 %

Please keep in mind that this method is inefficient for large strings because it creates a string instance for each of the words found. You might want to use a StringReader and read word by word manually.
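
The StringReader idea could be sketched roughly as follows. This is only one possible reading of the suggestion, not the answerer's code: the WordStats class and its method names are made up for illustration, a "word" is assumed to be any run of letters or digits, and case is ignored by comparing characters through char.ToUpperInvariant.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class WordStats
{
    // Returns the fraction (0.0 to 1.0) of words equal to searchedWord,
    // reading the input one character at a time instead of splitting it.
    public static double Fraction(TextReader reader, string searchedWord)
    {
        var current = new StringBuilder();   // reused buffer for the word being read
        int total = 0, matches = 0;
        int c;
        do
        {
            c = reader.Read();               // -1 signals the end of the input
            if (c != -1 && char.IsLetterOrDigit((char)c))
            {
                current.Append((char)c);
            }
            else if (current.Length > 0)     // a separator or the end closes a word
            {
                total++;
                if (SameWord(current, searchedWord)) matches++;
                current.Clear();
            }
        } while (c != -1);

        return total == 0 ? 0.0 : (double)matches / total;
    }

    // Convenience overload that wraps a string in a StringReader.
    public static double Fraction(string text, string searchedWord)
    {
        using (var reader = new StringReader(text))
        {
            return Fraction(reader, searchedWord);
        }
    }

    // Compares the buffer with the word without allocating a new string.
    private static bool SameWord(StringBuilder buffer, string word)
    {
        if (buffer.Length != word.Length) return false;
        for (int i = 0; i < word.Length; i++)
        {
            if (char.ToUpperInvariant(buffer[i]) != char.ToUpperInvariant(word[i]))
                return false;
        }
        return true;
    }
}
```

Calling WordStats.Fraction(sentence, searchedWord) returns a fraction that can be formatted with ToString("P0") like the float above. Because the first overload accepts any TextReader, the same code could also read from a StreamReader over a large file.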

The easiest way is to use LINQ:

char[] separators = new char[] {' ', ',', '.', '?', '!', ':', ';'};
var count =
    (from word in sentence.Split(separators)       // get all the words
    where word.ToLower() == searchedWord.ToLower() // find the words that match
    select word).Count();                          // count them

This only counts the number of times the word appears in the text. You could also count how many words there are in the text:

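// RemoveEmptyEntries keeps empty strings between adjacent separators out of the total
var totalWords = sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;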

and then just get the percentage as:

var result = count * 100.0 / totalWords; // multiply by a double first to avoid integer division

My suggestion is a complete class.

using System;
using System.Text;

class WordCount {
    const string Symbols = ",;.:-()\t!¡¿?\"[]{}&<>+-*/=#'";

    public static string normalize(string str)
    {
        var toret = new StringBuilder();

        for(int i = 0; i < str.Length; ++i) {
            if ( Symbols.IndexOf( str[ i ] ) > -1 ) {
                toret.Append( ' ' );
            } else {
                toret.Append( char.ToLower( str[ i ] ) );
            }
        }

        return toret.ToString();
    }

    private string word;
    public string Word {
        get { return this.word; }
        set { this.word = value; }
    }

    private string str;
    public string Str {
        get { return this.str; }
    }

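    private string[] words = null;
    public string[] Words {
        get {
            if ( this.words == null ) {
                // Skip the empty entries produced by consecutive spaces
                this.words = this.Str.Split(
                    new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries );
            }

            return this.words;
        }
    }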

    public WordCount(string str, string w)
    {
         this.str = ' ' + normalize( str ) + ' ';
         this.word = w;
    }

    public int Times()
    {
        return this.Times( this.Word );
    }

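    public int Times(string word)
    {
        int times = 0;

        // Str is lowercased by normalize(), so lowercase the word as well
        word = ' ' + word.ToLower() + ' ';

        int pos = this.Str.IndexOf( word );

        while( pos > -1 ) {
            ++times;

            // Resume at the trailing space of this match, because it can also
            // be the leading space of the next one (as in "is is")
            pos = this.Str.IndexOf( word, pos + word.Length - 1 );
        }

        return times;
    }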

    public double Percentage()
    {
        return this.Percentage( this.Word );
    }

    public double Percentage(string word)
    {
        // The cast avoids integer division; the result is a fraction (0.5 for 50%)
        return ( (double) this.Times( word ) / this.Words.Length );
    }
}

Advantages: the string splitting is cached, so there is no danger of performing it more than once. Everything is packaged in one class, so it can be easily reused. LINQ is not required. Hope this helps.

// The words you want to search for
var words = new string[] { "this", "is" };

// Build a regular expression query
var wordRegexQuery = new System.Text.StringBuilder();
wordRegexQuery.Append("\\b(");
for (var wordIndex = 0; wordIndex < words.Length; wordIndex++)
{
  wordRegexQuery.Append(words[wordIndex]);
  if (wordIndex < words.Length - 1)
  {
    wordRegexQuery.Append('|');
  }
}
wordRegexQuery.Append(")\\b");

// Find matches and return them as a string[]
// Requires: using System.Linq; using System.Text.RegularExpressions;
var regex = new System.Text.RegularExpressions.Regex(wordRegexQuery.ToString(), RegexOptions.IgnoreCase);
var someText = "This is some text which is quite a good test of which word is used most often. Thisis isthis athisisa.";
var matches = (from Match m in regex.Matches(someText) select m.Value).ToArray();

// Display results
foreach (var word in words)
{
    var wordCount = matches.Count(w => w.Equals(word, StringComparison.InvariantCultureIgnoreCase));
    // Note: the percentage is relative to the matched words only, not to every word in the text
    Console.WriteLine("{0}: {1} ({2:f2}%)", word, wordCount, wordCount * 100f / matches.Length);
}
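
A variation on the regular-expression approach, as a sketch under a few assumptions: the searched word is passed through Regex.Escape so that characters such as '.' or '+' are not read as regex syntax, and the percentage is taken against every word in the text (every \w+ run) instead of against the matched words only. The RegexWordStats class and its method names are hypothetical, and the word is assumed to start and end with a letter or digit so that \b behaves as expected.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class RegexWordStats
{
    // Counts whole-word, case-insensitive occurrences of word in text.
    public static int Count(string text, string word)
    {
        // \b anchors the match at word boundaries, so "Thisis" does not count for "is"
        var pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    // Returns the share of all words in text that equal word, as a value from 0 to 100.
    public static double Percentage(string text, string word)
    {
        int total = Regex.Matches(text, @"\w+").Count;   // every run of word characters
        return total == 0 ? 0.0 : Count(text, word) * 100.0 / total;
    }
}
```

For example, RegexWordStats.Percentage(someText, "is") relates the three whole-word occurrences of "is" to all words in the sample text, while the loop above relates them to the four matches of "this" and "is" combined.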
