Count how often each word occurs in a text in PHP

In PHP I need to load a file, get all of its words, and echo each word along with the number of times it shows up in the text. I also need them in descending order, with the most used words on top.


Here's an example:

$text = "A very nice únÌcÕdë text. Something nice to think about if you're into Unicode.";

// $words = str_word_count($text, 1); // use this function if you only want ASCII
$words = utf8_str_word_count($text, 1); // use this function if you care about i18n

$frequency = array_count_values($words);

arsort($frequency);

echo '<pre>';
print_r($frequency);
echo '</pre>';

The output:

Array
(
    [nice] => 2
    [if] => 1
    [about] => 1
    [you're] => 1
    [into] => 1
    [Unicode] => 1
    [think] => 1
    [to] => 1
    [very] => 1
    [únÌcÕdë] => 1
    [text] => 1
    [Something] => 1
    [A] => 1
)

And the utf8_str_word_count() function, if you need it:

function utf8_str_word_count($string, $format = 0, $charlist = null)
{
    $result = array();

    // Cast $charlist to string: passing null to preg_quote() is deprecated as of PHP 8.1
    if (preg_match_all('~[\p{L}\p{Mn}\p{Pd}\'\x{2019}' . preg_quote((string) $charlist, '~') . ']+~u', $string, $result) > 0)
    {
        if (array_key_exists(0, $result) === true)
        {
            $result = $result[0];
        }
    }

    if ($format == 0)
    {
        $result = count($result);
    }

    return $result;
}


$words = str_word_count($text, 1);
$word_frequencies = array_count_values($words);
arsort($word_frequencies);
print_r($word_frequencies);
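
Since the question also asks about loading the words from a file, here is a minimal sketch that wraps the snippet above in a function. The name word_frequencies_from_file and the file name words.txt are made up for illustration. It assumes the file is small enough to read into memory at once and that the ASCII word splitting done by str_word_count() is good enough.

```php
<?php
// Hypothetical helper (not from the original answer): reads a text file and
// returns an array of word => count, most frequent words first.
function word_frequencies_from_file(string $path): array
{
    $text = file_get_contents($path);
    if ($text === false) {
        throw new RuntimeException("Could not read file: $path");
    }

    $words = str_word_count($text, 1);          // list of words (ASCII only)
    $frequencies = array_count_values($words);  // word => number of occurrences
    arsort($frequencies);                       // sort by count, descending, keeping keys

    return $frequencies;
}

// Usage (words.txt is a placeholder file name):
// foreach (word_frequencies_from_file('words.txt') as $word => $count) {
//     echo $word, ': ', $count, "\n";
// }
```

Swapping str_word_count() for utf8_str_word_count() from the first answer should make the same sketch work for non-ASCII text.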


This function uses a regex to find words (you might want to change the pattern, depending on what you define as a word):

function count_words($text)
{
    $output = $words = array();
    preg_match_all("/[A-Za-z'-]+/", $text, $words); // Find words in the text

    foreach ($words[0] as $word)
    {
        if (!array_key_exists($word, $output))
            $output[$word] = 0;

        $output[$word]++; // Every time we find this word, we add 1 to the count
    }

    return $output;
}

This iterates over each word, constructing an associative array (with the word as the key) whose values are the number of occurrences of each word (e.g. $output['hello'] = 3 means hello occurred 3 times in the text).

You might want to change the function to handle case insensitivity: as written, 'hello' and 'Hello' are counted as two different words.
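
One possible way to fold case, sketched under the assumption that the text is ASCII, is to lowercase the input before matching. count_words_ci is a hypothetical name. For non-ASCII text you would likely need mb_strtolower() and a Unicode-aware pattern instead.

```php
<?php
// Hypothetical case-insensitive variant of count_words() (ASCII text assumed).
function count_words_ci($text)
{
    $words = array();
    // Lowercase first so that 'Hello' and 'hello' end up under the same key.
    preg_match_all("/[a-z'-]+/", strtolower($text), $words);

    $output = array_count_values($words[0]); // word => number of occurrences
    arsort($output);                         // most frequent words first

    return $output;
}

print_r(count_words_ci("Hello hello HELLO world"));
```

Here array_count_values() replaces the manual loop, and arsort() covers the descending order the question asks for.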


// explode() returns one more piece than there are matches, hence the - 1
echo count(explode('your_word', $your_text)) - 1;
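
Note that splitting on the search string counts substring matches, so 'cat' is also found inside 'concatenate'. If whole words are what you want, a regex with word boundaries may be closer. This is only a sketch: count_word is a hypothetical helper name, and \b is assumed to be adequate, which holds for ASCII text.

```php
<?php
// Hypothetical helper: counts whole-word, case-insensitive matches of $word in $text.
function count_word($word, $text)
{
    // preg_quote() escapes any regex metacharacters in the search word.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';

    // preg_match_all() returns the number of full pattern matches.
    return preg_match_all($pattern, $text);
}

echo count_word('cat', 'cat concatenate cat'), "\n";
```

For comparison, substr_count('cat concatenate cat', 'cat') also counts the 'cat' inside 'concatenate'.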