Remove duplicates from a sorted ArrayList while keeping some elements from the duplicates

2022-12-22 19:30 问答作者：

Okay at first I thought this would be pretty straightforward. But I can't think of an efficient way to solve this. I figured a brute force way to solve this but that's not very elegant. I have an ArrayList. Contacts is a VO class that has multiple members - name, regions, id. There are duplicates in ArrayList because different regions appear multiple times. The list is sorted by ID. Here is an example:

Entry 0 - Name: John Smith; Region: N; ID: 1

Entry 1 - Name: John Smith; Region: MW; ID: 1

Entry 2 - Name: John Smith; Region: S; ID: 1

Entry 3 - Name: Jane Doe; Region: NULL; ID: 2

Entry 4 - Name: Jack Black; Region: N; ID: 3

Entry 6 - Name: Jack Black; Region: MW; ID: 3

Entry 7 - Name: Joe Don; Region: NE; ID: 4

I want to transform the list to below by combining duplicate regions together for the same ID. Therefore, the final list should have only 4 distinct elements with the regions combined.

So the output should look like th开发者_如何学Cis:-

Entry 0 - Name: John Smith; Region: N,MW,S; ID: 1

Entry 1 - Name: Jane Doe; Region: NULL; ID: 2

Entry 2 - Name: Jack Black; Region: N,MW; ID: 3

Entry 3 - Name: Joe Don; Region: NE; ID: 4

What are your thoughts on the optimal way to solve this? I am not looking for actual code but ideas or tips to go about the best way to get it done.

Thanks for your time!!!

You can iterate them while dumping them (and merging duplicates) into a TreeMap. Then create a list from the sorted view of the TreeMap's values.

In the sample code I'm assuming you have an Entry class with id, name and regions fields this last one being a List of Region instances. This could easily be changed to a Set, and Region to Strings or whatever you're using. The sample copies the entries before inserting them into the map since they will be modified when merged to other entries.

SortedMap<Integer, Entry> mergedEntriesMap = new TreeMap<Integer, Entry>();
for (Entry e : entries) {
  if (mergedEntriesMap.contains(e.id)) {
    Entry m = mergedEntriesMap.get(e);
    m.regions.addAll(e.regions);
  } else {
    Entry m = new Entry();
    // copy the entry to keep the original array clean
    m.id = e.id;
    m.name = e.name;
    m.regions = new ArrayList<Region>(e.regions);
    mergedEntriesMap.put(m.id, m);
  }
}

List<Entry> mergedEntries = new ArrayList<Entry>(mergedEntriesMap.values());

Is the initial data stuck in this format? If not, you might want to look into changing the query you're using to retrieve your data by grouping all the ids together and forming a comma separated list column. Here's an example in sql

SELECT      Id, [Name], Regions = replace
            ((SELECT Region AS [data()]
            FROM RegionTable
            WHERE  Id = u.Id
            ORDER BY Region FOR xml path('')), ' ', ', ')
FROM        [User] u
WHERE       Id IS NOT NULL
GROUP BY Id, [Name]

This is a pseudocode to accomplish what you want. At the abstract level, you have a list of Pair<K,V> (first, second), sorted by K, and no two pair are truly equal (i.e. you can have (k1,v1) and (k1,v2), but you can't have two (k1,v1) in the list.

You want to merge consecutive pairs (k,v1),(k,v2),(k,v3) to one group (k,[v1,v2,v3]).

List<Pair<K,V>> in;
List<Pair<K,List<V>>> out = [ ];

Pair<K,V> lastP = SENTINEL_PAIR; // lastP.first matches nothing
Pair<K,List<V>> lastGroup;

for (Pair<K,V> p : in) {
  if (p.first == lastP.first) {  // same group as last
    lastGroup.second.add(p.second);
  } else {                       // start a new group
    lastGroup = (p.first, [ p.second ]);
    out.add(lastGroup);
  }
  lastP = p;
}

In your case, K is the ID, and V is the region. This is O(N).

Have you taken a look at google's Multimap? It is pretty much created for this type of data structure in which there is a key that maps to a Collection of items. So in this case a String name would map to a Collection of Region objects.

Multimap<String, Region> names = HashMultimap.create();
for (Entry entry : entries) {
    names.put(entry.getName(), entry.getRegion());
}
// Now u can get the collection of regions by name
Collection<Region> johnsRegions = names.get("John Smith");

继续阅读：arraylist

Remove duplicates from a sorted ArrayList while keeping some elements from the duplicates

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？