Constructing frequency table of Objects in ArrayList<Object[]>

2023-02-03 22:07 问答作者：

I try to implement Incognito k-anonymization algorithm in Java. A part of this algorithm is the frequency set construction for a given table. The columns of the table vary each time so I decided to represent the table as an ArrayList of Object[] where Object[] size is the number of the columns. In this Object I store each row's values for each columns.

I try to construct the frequency table using the following method:

ArrayList<Object[]> table = new ArrayList<Object[]>();
....// table filling//.....
ArrayList<Object[]> frequencySet = new ArrayList<Object[]>();
for(int i=0;i<table.size();i++)
     {
         Integer count = 1;
         int j = 0;
         for(j=i+1;j<table.size();j++)
         {
             if(Arrays.equals(table.get(i), table.get(j)))
             {
                 //System.out.println(i+" equals to "+j);
                 count++;
                 table.remove(j);
                 j = j-1;
             }
         }
         int size = arguments.size()+1;
         Object[] anObject = new Object[size];
   开发者_JAVA百科      System.arraycopy(table.get(i), 0, anObject, 0, arguments.size());
         anObject[size-1] = count;
         frequencySet.add(anObject);
     }

The problem is that the algorithm is very slow, and I figured out that the most of the time is consumed in this method. (For 100.000 data it needs 13min to run- I don't know if this is normal). Is there a faster way to construct frequency table?

Never use remove on an ArrayList, it's O(size()). Also, your count variable gets wrapped and unwrapped each time you increment it. Make its type int and wrap it into Integer only at the very end.

Without knowing anything about the type of Objects you store, I assume the methods equals and hashCode are redefined for them. Then the best thing that comes into mind is wrap the array of Object's into a class Row (it's a good thing to do anyway), redefine equals and hashCode for Row (using Arrays.equals and Arrays.hashCode) and count the occurrences of each Row in one pass using a

HashMap<Row, Integer> count;


for (Row row : table) {
    if (count.containsKey(row)) {
        count.put(row, count.get(row) + 1);
    } else {
        count.put(row, 1);
    }
}

Sort them, and then count reoccurances with a loop after that. That will bring it down to O(n log n)

or use a hashtable to do your counting instead. That should be a linear time calculation.

继续阅读：algorithm frequency

Constructing frequency table of Objects in ArrayList<Object[]>

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？