Map of Map - word pairs in Java - stuck
I am using an MS-DOS (Windows) prompt to pipe in a file. It's a regular text file with words (not like abc, def, ghi, etc.).
I am trying to write a program that counts how many times each word pair appears in a text file. A word pair consists of two consecutive words (i.e. a word and the word that directly follows it). In the first sentence of this paragraph, the words “counts” and “how” are a word pair.
What I want the program to do is take this input:
abc def abc ghi abc def ghi jkl abc xyz abc abc abc ---
Should produce this output:
abc:
abc, 2
def, 2
ghi, 1
xyz, 1
def:
abc, 1
ghi, 1
ghi:
abc, 1
jkl, 1
jkl:
abc, 1
xyz:
abc, 1
My real input will not look like that, though. It will be more like "seattle amazoncom is expected to report", so would I even need to test for "abc"?
My biggest issue is adding entries to the map. I think I need to use a map of a map, but I am not sure how to do this:
Map<String, Map<String, Integer>> uniqueWords = new HashMap<String, Map<String, Integer>>();
I think the map would produce this output for me, which is exactly what I want:
Key | Value         | Number of times
-------------------------------------
abc | def, ghi, jkl | 3
def | jkl, mno      | 2
If that map is correct, how would I add to it from the file in my situation? I have tried:
if(words.contain("abc")) // would i even need to test for abc?????
{
uniqueWords.put("abc", words, ?) // not sure what to do about this?
}
This is what I have so far:
import java.util.Scanner;
import java.util.ArrayList;
import java.util.TreeSet;
import java.util.Iterator;
import java.util.HashSet;
import java.util.Map;
import java.util.HashMap;
public class Project1
{
public static void main(String[] args)
{
Scanner sc = new Scanner(System.in);
String word;
String grab;
int number;
// ArrayList<String> a = new ArrayList<String>();
// TreeSet<String> words = new TreeSet<String>();
Map<String, Map<String, Integer>> uniqueWords = new HashMap<String, Map<String, Integer>>();
System.out.println("project 1\n");
while (sc.hasNext())
{
word = sc.next();
word = word.toLowerCase();
if (word.matches("abc")) // would i even need to test for abc?????
{
uniqueWords.put("abc", word); // syntax incorrect i still need an int!
}
if (word.equals("---"))
{
break;
}
}
System.out.println("size");
System.out.println(uniqueWords.size());
System.out.println("unique words");
System.out.println(uniqueWords.size());
System.out.println("\nbye...");
}
}
I hope someone can help me, because I have been banging my head against this and not learning anything for weeks now. Thank you.
I came up with this solution. I think your idea with the Map may be more elegant, but run this and let's see if we can refine it:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
public class Main {
private static List<String> inputWords = new ArrayList<String>();
private static Map<String, List<String>> result = new HashMap<String, List<String>>();
public static void main(String[] args) {
collectInput();
process();
generateOutput();
}
/*
* Modify this method to collect the input
* however you require it
*/
private static void collectInput(){
// test code
inputWords.add("abc");
inputWords.add("def");
inputWords.add("abc");
inputWords.add("ghi");
inputWords.add("abc");
inputWords.add("def");
inputWords.add("abc");
}
private static void process(){
// Iterate through every word in our input list
for(int i = 0; i < inputWords.size() - 1; i++){
// Create references to this word and next word:
String thisWord = inputWords.get(i);
String nextWord = inputWords.get(i+1);
// If this word is not in the result Map yet,
// then add it and create a new empty list for it.
if(!result.containsKey(thisWord)){
result.put(thisWord, new ArrayList<String>());
}
// Add nextWord to the list of adjacent words to thisWord:
result.get(thisWord).add(nextWord);
}
}
/*
* Rework this method to output results as you need them:
*/
private static void generateOutput(){
for(Entry<String, List<String>> e : result.entrySet()){
System.out.println("Symbol: " + e.getKey());
// Count how many times each following word occurs in the list:
Map<String, Integer> count = new HashMap<String, Integer>();
List<String> words = e.getValue();
for(String s : words){
if(!count.containsKey(s)){
count.put(s, 1);
}
else{
count.put(s, count.get(s) + 1);
}
}
// Print the occurrences of following symbols:
for(Entry<String, Integer> f : count.entrySet()){
System.out.println("\t following symbol: " + f.getKey() + " : " + f.getValue());
}
}
System.out.println();
}
}
In your table, you have Key | Value | Number of times. Is the "number of times" specific to each of the second words? This may work.
My suggestion in your last question was to use a map of Lists. Each unique word would have an associated List (empty to begin with). At the end of processing you would count up all identical words in the list to get a total:
Key | List of following words
abc | def def ghi mno ghi
Now you could count identical items in your list to find that:
abc --> def = 2
abc --> ghi = 2
abc --> mno = 1
I think either this approach or yours would work well. I'll put some code together and update this post if nobody else responds.
You have initialized uniqueWords as a Map of Maps, not a Map of Strings as you are trying to populate it. For your design to work, you need to put a Map<String, Integer> as the value for the "abc" key.
....
Map<String, Map<String, Integer>> uniqueWords = new HashMap<String, Map<String, Integer>>();
System.out.println("project 1\n");
while (sc.hasNext())
{
word = sc.next();
word = word.toLowerCase();
if (word.matches("abc")) // would i even need to test for abc?????
// no, just use the word
{
uniqueWords.put("abc", word); // <-- here you are putting a String value, instead of a Map<String, Integer>
}
if (word.equals("---"))
{
break;
}
}
Instead, you could do something akin to the following brute-force approach:
Map<String, Integer> followingWordsAndCnts = uniqueWords.get(word);
if (followingWordsAndCnts == null) {
followingWordsAndCnts = new HashMap<String,Integer>();
uniqueWords.put(word, followingWordsAndCnts);
}
if (sc.hasNext()) {
word = sc.next().toLowerCase();
Integer cnt = followingWordsAndCnts.get(word);
followingWordsAndCnts.put(word, cnt == null? 1 : cnt + 1);
}
You could make this a recursive method to ensure that each word gets its turn as the following word and the word that is being followed.
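An alternative to recursion is to remember the previous word while looping, so that every word is counted once as the follower and then becomes the word being followed. The following is only a minimal sketch, assuming Java 8 or later (for computeIfAbsent and merge). The class name WordPairCounter and the method countPairs are made up for illustration, and "---" is treated as the end marker as in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class WordPairCounter {

    // Hypothetical helper: reads whitespace-separated words until "---" or end of input
    // and counts how often each word is directly followed by each other word.
    public static Map<String, Map<String, Integer>> countPairs(Scanner sc) {
        Map<String, Map<String, Integer>> uniqueWords = new HashMap<>();
        String previous = null; // there is no previous word before the first token
        while (sc.hasNext()) {
            String word = sc.next().toLowerCase();
            if (word.equals("---")) {
                break;
            }
            if (previous != null) {
                // Fetch (or create) the inner map for the previous word,
                // then increase the count for the current word by one.
                uniqueWords.computeIfAbsent(previous, k -> new HashMap<>())
                           .merge(word, 1, Integer::sum);
            }
            previous = word; // the current word is followed by whatever comes next
        }
        return uniqueWords;
    }

    public static void main(String[] args) {
        // Sample text for demonstration; use new Scanner(System.in) to read a piped file.
        Scanner sc = new Scanner("abc def abc ghi abc def ghi jkl abc xyz abc abc abc ---");
        Map<String, Map<String, Integer>> pairs = countPairs(sc);
        System.out.println(pairs.get("abc").get("def")); // prints 2
    }
}
```

Because the previous word is carried over from one iteration to the next, no word is skipped and there is no need to test for any particular word such as "abc".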
For each key (e.g. "abc") you want to store another string (e.g. "def", "abc") paired with an integer (1, 2).
I would download Google Collections (now part of Guava) and use a Map<String, Multiset<String>>:
Map<String, Multiset<String>> myMap = new HashMap<String, Multiset<String>>();
...
void addPair(String word1, String word2) {
Multiset<String> set = myMap.get(word1);
if(set==null) {
set = HashMultiset.create();
myMap.put(word1,set);
}
set.add(word2);
}
int getOccurs(String word1, String word2) {
if(myMap.containsKey(word1))
return myMap.get(word1).count(word2);
return 0;
}
If you don't want to use a Multiset, you can create the logical equivalents (for your purposes, not general purpose):
Multiset<String> === Map<String,Integer>
Map<String, Multiset<String>> === Map<String, Map<String,Integer>>
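For illustration, here is roughly how addPair and getOccurs might look when written against plain maps instead of a Multiset. This is a sketch under the equivalence stated above, not a drop-in replacement for the library type. The wrapper class name PairCounts is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class PairCounts {

    // Outer key: first word. Inner map: following word -> number of occurrences.
    private final Map<String, Map<String, Integer>> myMap = new HashMap<>();

    public void addPair(String word1, String word2) {
        Map<String, Integer> counts = myMap.get(word1);
        if (counts == null) {
            // First time word1 is seen: give it an empty inner map.
            counts = new HashMap<>();
            myMap.put(word1, counts);
        }
        Integer old = counts.get(word2);
        counts.put(word2, old == null ? 1 : old + 1);
    }

    public int getOccurs(String word1, String word2) {
        Map<String, Integer> counts = myMap.get(word1);
        if (counts == null) {
            return 0; // word1 was never seen as a first word
        }
        Integer n = counts.get(word2);
        return n == null ? 0 : n;
    }

    public static void main(String[] args) {
        PairCounts pc = new PairCounts();
        pc.addPair("abc", "def");
        pc.addPair("abc", "def");
        pc.addPair("abc", "ghi");
        System.out.println(pc.getOccurs("abc", "def")); // prints 2
        System.out.println(pc.getOccurs("abc", "xyz")); // prints 0
    }
}
```

The null checks do by hand what Multiset.add and Multiset.count handle internally, which is the main thing the library saves you.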
To print your answer in alphabetical order, simply change every HashMap into a TreeMap. For example, change
new HashMap<String, Map<String, Integer>>(); into new TreeMap<String, Map<String, Integer>>();
and don't forget to add import java.util.TreeMap;
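As a rough sketch of the TreeMap suggestion, the code below uses a TreeMap for both the outer and the inner map, so the keys come back in alphabetical order when printed. It assumes Java 8 or later. The class name SortedWordPairs and the method report are invented for this example, and the output layout is copied from the question.

```java
import java.util.Map;
import java.util.TreeMap;

public class SortedWordPairs {

    // Hypothetical helper: builds the pair counts from a line of text and
    // returns the report in the "word:" / "follower, count" layout.
    public static String report(String text) {
        // TreeMap keeps its keys sorted, so no separate sorting step is needed.
        Map<String, Map<String, Integer>> pairs = new TreeMap<>();
        String[] words = text.trim().toLowerCase().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            pairs.computeIfAbsent(words[i], k -> new TreeMap<>())
                 .merge(words[i + 1], 1, Integer::sum);
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Map<String, Integer>> first : pairs.entrySet()) {
            out.append(first.getKey()).append(":\n");
            for (Map.Entry<String, Integer> second : first.getValue().entrySet()) {
                out.append(second.getKey()).append(", ")
                   .append(second.getValue()).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("abc def abc ghi abc def ghi jkl abc xyz abc abc abc"));
    }
}
```

With the sample input from the question, this should list abc, def, ghi, jkl and xyz in that order, each followed by its alphabetically sorted followers.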