Algorithm to find added/removed elements in an array
I am looking for the most efficent way of solving the following
Problem:
given an array Before = { 8, 7, 2, 1} and an array After ={1, 3, 8, 8}
find the added and the removed elements
the solution is:
added = 3, 8
removed = 7, 2
My idea so far is:
for i = 0 .. B.Lenghtt-1
{
for j= 0 .. A.Lengh开发者_开发百科t-1
{
if A[j] == B[i]
A[j] = 0;
B[i] = 0;
break;
}
}
// B elemnts different from 0 are the Removed elements
// A elemnts different from 0 are the Added elemnts
Does anyone know a better solution perhaps more efficent and that doesn't overwrite the original arrays
Sorting is your friend.
Sort the two arrays (a and b), and then walk them (using x and y as counters). Move down both 1 at a time. You can derive all your tests from there:
- if a[x] < b[y], then a[x] was removed (and only increment x)
- if a[x] > b[y], then b[y] was added (and only increment y)
(I may have missed an edge case, but you get the general idea.)
(edit: the primary edge case that isn't covered here is handling when you reach the end of one of the arrays before the other, but it's not hard to figure out. :)
You could also use a Dictionary<int, int>
and a algorithm similar to this:
foreach i in source_list: dictionary[i]++;
foreach i in dest_list: dictionary[i]--;
The final dictionary tells you which elements were inserted/removed (and how often). This solution should be quite fast even for bigger lists - faster than sorting.
if your language as multiset available (set with count of elements) your question is a standard operation.
B = multiset(Before)
A = multiset(After)
result is A.symdiff(B) (symdiff is union minus intersection and that is exactly what you are looking for to have removed and added).
Obviously you can also get removed only or added only using classical difference between sets.
It can trivially be implemented using hashes and it's O(n) (using sort is slightly less efficient as it is O(n.log(n)) because of the sort itself).
In some sort of C++ pseudo code:
Before.sort();
After.sort();
int i = 0;
int j = 0;
for (; i < Before.size() && j < After.size(); ) {
if (Before[i] < After[j]) {
Removed.add(Before[i]);
++i;
continue;
}
if (Before[i] > After[j]) {
Added.add(After[j]);
++j;
continue;
}
++i;
++j;
}
for (; i < Before.size(); ++i) {
Removed.add(Before[i]);
}
for (; j < After.size(); ++j) {
Added.add(After[j]);
}
This can be solved in linear time.
Create a map for calculating the number of repetitions of each element.
Go through the before
array and fill the map.
Go through the after
array and decrease the value in the map for each element.
At the end, go through the map and if you find a negative value, that element was added - if positive, that element was removed.
Here is some Java code for this (not tested, just written right now):
Map<Integer, Integer> repetitionMap = new HashMap<Integer, Integer>();
for (int i = 0; i < before.length; i++) {
Integer number = repetitionMap.get(before[i]);
if (number == null) {
repetitionMap.put(before[i], 1);
} else {
repetitionMap.put(before[i], number + 1);
}
}
for (int i = 0; i < after.length; i++) {
Integer number = repetitionMap.get(after[i]);
if (number == null) {
repetitionMap.put(after[i], -1);
} else {
repetitionMap.put(after[i], number - 1);
}
}
Set<Integer> keySet = repetitionMap.keySet();
for (Integer i : keySet) {
Integer number = repetitionMap.get(i);
if (number > 0) {
System.out.println("removed " + number + "times value " + i);
}
if (number < 0) {
System.out.println("added " + number + "times value " + i);
}
}
perl:
@a = ( 8, 7, 2, 2, 1 ); @b = ( 1, 3, 8, 8, 8 ); $d{$_}++ for(@a); $d{$_}-- for(@b); print"added = "; for(keys %d){print "$_ " x (-$d{$_}) if($d{$_}<0)} print"\n"; print"removed = "; for(keys %d){print "$_ " x ($d{$_}) if($d{$_}>0)} print"\n";
result:
$ ./inout.pl added = 8 8 3 removed = 7 2 2
In Groovy:
def before = [8, 7, 2, 1, 1, 1], after = [1, 3, 8, 8, 8]
def added = before.countBy{it}
def result = after.inject(added){map, a -> map[a] ? map << [(a):map[a] - 1]: map << [(a):-1]}
.inject([:]){m, k, v -> v == 0 ? (m << [:]) : (v < 0 ? m << [(k):"added ${v.abs()} times"] : m << [(k):"removed ${v.abs()} times"])}
println "before $before\nafter $after"
println "result: $result"
Result:
before [8, 7, 2, 1, 1, 1]
after [1, 3, 8, 8, 8]
result: [8:added 2 times, 7:removed 1 times, 2:removed 1 times, 1:removed 2 times, 3:added 1 times]
For countBy
I got insipred from Some groovy magic post
In groovy inject
is like reduce
in other functional languages.
I also refer Groovy collection api slides from Trygve Amundsen with really good table with functional methods
Second solution:
def before = [8, 7, 2, 1, 1, 1], after = [1, 3, 8, 8, 8]
def sb = before.countBy{it}
def sa = after.countBy{it}
def result = sa.inject(sb){m, k, v -> m[k] ? m << [(k): m[k] - v] : m << [(k): -v]}
.inject([:]){m, k, v -> v == 0 ? (m << [:]) : (v < 0 ? m << [(k):"added ${v.abs()} times"] : m << [(k):"removed ${v.abs()} times"])}
精彩评论