Word association counting

I am new to Java. I need to count how often words are associated with each other in a sentence. For example, for the sentence "Dog is a Dog and Cat is a Cat", the first row of the final association count would be: Dog-Dog(0), Dog-is(2), Dog-a(2), Dog-and(1), Dog-Cat(2)

and so on.

This amounts to building an association matrix. Any suggestions on how to do that?


Thanks, Roman. I can split the sentences into words:

    import java.text.BreakIterator;
    import java.util.Locale;

    String sentence = null;
    String target = "Dog is a Dog and Cat is a Cat";
    int index = 0; // start offset of the current sentence within target
    Locale currentLocale = new Locale("en", "US");
    BreakIterator wordIterator = BreakIterator.getWordInstance(currentLocale);
    // Creating the sentence iterator
    BreakIterator bi = BreakIterator.getSentenceInstance();
    bi.setText(target);

    while (bi.next() != BreakIterator.DONE) {
        sentence = target.substring(index, bi.current());
        System.out.println(sentence);

        // Walk the word boundaries inside this sentence
        wordIterator.setText(sentence);
        int start = wordIterator.first();
        int end = wordIterator.next();

        while (end != BreakIterator.DONE) {
            String word = sentence.substring(start, end);
            // Skip whitespace and punctuation tokens
            if (Character.isLetterOrDigit(word.charAt(0))) {
                System.out.println(word);
            }
            start = end;
            end = wordIterator.next();
        }
        index = bi.current();
    }

But I did not understand your other two points. Thanks.


  1. Split the sentence into separate words.
  2. Generate pairs.
  3. Merge the same pairs.

It's as simple as:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// sentence is the input String
String[] words = sentence.split("\\s+"); //first step
// n words give n * (n - 1) / 2 pairs, so size the list up front
List<List<String>> pairs =
    new ArrayList<List<String>>((int)(((words.length) / 2.0) * (words.length - 1)));
for (int i = 0; i < words.length - 1; i++) {
    for (int j = i + 1; j < words.length; j++) {
         List<String> pair = Arrays.asList(words[i], words[j]);
         // sort so that (a, b) and (b, a) become the same key
         Collections.sort(pair);
         pairs.add(pair);
    }
} //second step
Map<List<String>, Integer> pair2count = new LinkedHashMap<List<String>, Integer>();
for (List<String> pair : pairs) {
    if (pair2count.containsKey(pair)) {
        pair2count.put(pair, pair2count.get(pair) + 1);
    } else {
        pair2count.put(pair, 1);
    }
} //third step

//output
System.out.println(pair2count);
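
For anyone who wants to run this directly, here is a minimal self-contained sketch of the same three steps. The class name `PairCounter` and the method `countPairs` are made up for illustration, and it assumes words are separated by whitespace only. Note that it counts every pair of word positions, so for the example sentence it gives Dog-is(4) and Dog-Dog(1) rather than the Dog-is(2) and Dog-Dog(0) from the question. The question does not spell out its counting rule, so you may need to adjust this.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PairCounter {

    // Hypothetical helper: counts each unordered pair of word positions (i < j).
    public static Map<List<String>, Integer> countPairs(String sentence) {
        // Step 1: split on runs of whitespace (assumption: no punctuation handling)
        String[] words = sentence.trim().split("\\s+");
        Map<List<String>, Integer> pairToCount = new LinkedHashMap<>();
        for (int i = 0; i < words.length - 1; i++) {
            for (int j = i + 1; j < words.length; j++) {
                // Step 2: build the pair and sort it so the word order does not matter
                List<String> pair = Arrays.asList(words[i], words[j]);
                Collections.sort(pair);
                // Step 3: merge equal pairs by incrementing their count
                pairToCount.merge(pair, 1, Integer::sum);
            }
        }
        return pairToCount;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> counts = countPairs("Dog is a Dog and Cat is a Cat");
        for (Map.Entry<List<String>, Integer> e : counts.entrySet()) {
            List<String> pair = e.getKey();
            System.out.println(pair.get(0) + "-" + pair.get(1) + "(" + e.getValue() + ")");
        }
    }
}
```

To read one cell of the "matrix", look up the sorted pair, e.g. `counts.get(Arrays.asList("Cat", "Dog"))`. A pair that never occurs is simply absent from the map, so treat `null` as 0.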