开发者

Algorithm to find added/removed elements in an array

I am looking for the most efficent way of solving the following

Problem:

given an array Before = { 8, 7, 2, 1} and an array After ={1, 3, 8, 8}
find the added and the removed elements

the solution is:
        added = 3, 8 
        removed = 7, 2

My idea so far is:

for i = 0 .. B.Lenghtt-1
{
    for j= 0 .. A.Lengh开发者_开发百科t-1
    {
        if A[j] == B[i]

            A[j] = 0;
            B[i] = 0;

            break;
    }
}

// B elemnts different from 0 are the Removed elements
// A elemnts different from 0 are the Added elemnts

Does anyone know a better solution perhaps more efficent and that doesn't overwrite the original arrays


Sorting is your friend.

Sort the two arrays (a and b), and then walk them (using x and y as counters). Move down both 1 at a time. You can derive all your tests from there:

  • if a[x] < b[y], then a[x] was removed (and only increment x)
  • if a[x] > b[y], then b[y] was added (and only increment y)

(I may have missed an edge case, but you get the general idea.)

(edit: the primary edge case that isn't covered here is handling when you reach the end of one of the arrays before the other, but it's not hard to figure out. :)


You could also use a Dictionary<int, int> and a algorithm similar to this:

foreach i in source_list: dictionary[i]++;
foreach i in dest_list: dictionary[i]--;

The final dictionary tells you which elements were inserted/removed (and how often). This solution should be quite fast even for bigger lists - faster than sorting.


if your language as multiset available (set with count of elements) your question is a standard operation.

B = multiset(Before)
A = multiset(After)

result is A.symdiff(B) (symdiff is union minus intersection and that is exactly what you are looking for to have removed and added).

Obviously you can also get removed only or added only using classical difference between sets.

It can trivially be implemented using hashes and it's O(n) (using sort is slightly less efficient as it is O(n.log(n)) because of the sort itself).


In some sort of C++ pseudo code:

Before.sort();
After.sort();
int i = 0;
int j = 0;
for (; i < Before.size() && j < After.size(); ) {
    if (Before[i] < After[j]) {
        Removed.add(Before[i]);
        ++i;
        continue;
    }
    if (Before[i] > After[j]) {
        Added.add(After[j]);
        ++j;
        continue;
    }
    ++i;
    ++j;
}
for (; i < Before.size(); ++i) {
     Removed.add(Before[i]);
}
for (; j < After.size(); ++j) {
     Added.add(After[j]);
}


This can be solved in linear time. Create a map for calculating the number of repetitions of each element. Go through the before array and fill the map. Go through the after array and decrease the value in the map for each element. At the end, go through the map and if you find a negative value, that element was added - if positive, that element was removed.

Here is some Java code for this (not tested, just written right now):

Map<Integer, Integer> repetitionMap = new HashMap<Integer, Integer>();

for (int i = 0; i < before.length; i++) {
    Integer number = repetitionMap.get(before[i]);
    if (number == null) {
        repetitionMap.put(before[i], 1);
    } else {
        repetitionMap.put(before[i], number + 1);
    }
}

for (int i = 0; i < after.length; i++) {
    Integer number = repetitionMap.get(after[i]);
    if (number == null) {
        repetitionMap.put(after[i], -1);
    } else {
        repetitionMap.put(after[i], number - 1);
    }
}

Set<Integer> keySet = repetitionMap.keySet();
for (Integer i : keySet) {
    Integer number = repetitionMap.get(i);
    if (number > 0) {
        System.out.println("removed " + number + "times value " + i);
    }

    if (number < 0) {
        System.out.println("added " + number + "times value " + i);
    }
}


perl:

@a = ( 8, 7, 2, 2, 1 );
@b = ( 1, 3, 8, 8, 8 );

$d{$_}++ for(@a);
$d{$_}-- for(@b);

print"added = "; 
for(keys %d){print "$_ " x (-$d{$_}) if($d{$_}<0)}
print"\n";
print"removed = "; 
for(keys %d){print "$_ " x ($d{$_}) if($d{$_}>0)}
print"\n";

result:

$ ./inout.pl 
added = 8 8 3 
removed = 7 2 2 


In Groovy:

def before = [8, 7, 2, 1, 1, 1], after = [1, 3, 8, 8, 8]
def added = before.countBy{it}
def result = after.inject(added){map, a -> map[a] ? map << [(a):map[a] - 1]: map << [(a):-1]}
        .inject([:]){m, k, v -> v == 0 ? (m << [:]) : (v < 0 ? m << [(k):"added ${v.abs()} times"] : m << [(k):"removed ${v.abs()} times"])}
println "before $before\nafter  $after"
println "result: $result"

Result:

before [8, 7, 2, 1, 1, 1]
after  [1, 3, 8, 8, 8]
result: [8:added 2 times, 7:removed 1 times, 2:removed 1 times, 1:removed 2 times, 3:added 1 times]

For countBy I got insipred from Some groovy magic post

In groovy inject is like reduce in other functional languages.

I also refer Groovy collection api slides from Trygve Amundsen with really good table with functional methods

Second solution:

def before = [8, 7, 2, 1, 1, 1], after = [1, 3, 8, 8, 8]
def sb = before.countBy{it}
def sa = after.countBy{it}
def result = sa.inject(sb){m, k, v -> m[k] ? m << [(k): m[k] - v] : m << [(k): -v]}
   .inject([:]){m, k, v -> v == 0 ? (m << [:]) : (v < 0 ? m << [(k):"added ${v.abs()} times"] : m << [(k):"removed ${v.abs()} times"])}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜