开发者

Compare arrays of int in high performance

I cant remember from my days in college, the way to compare two unsorted arrays of int and find the number of matches ? Each value is unique 开发者_开发知识库in it own array, and both arrays are the same size.

for example

int[5] a1 = new []{1,2,4,5,0}
int[5] a2 = new []{2,4,11,-6,7}

int numOfMatches = FindMatchesInPerformanceOfNLogN(a1,a2);

any one does remember ?


If you can store the contents of one of the arrays in a HashMap, then you can check for the existence of the elements in the other array by seeing if they exist in the HashMap. This is O(n).


One array must be sorted so that you can compare in n*log(n). That is for every item in the unsorted array (n) you perform a binary search on the sorted array (log(n)). If both are unsorted, I don't see a way to compare in n*log(n).


how about this:

  1. concatenate the two arrays
  2. quicksort the result
  3. step through from array[1] to array[array.length - 1] and check array[i] against array[i-1]

if they are the same, you had a duplicate. This should also be O(n*log(n)) and will not require a binary search for each check.


You could use LINQ:

var a1 = new int[5] {1, 2, 4, 5, 0};
var a2 = new int[5] {2, 4, 11, -6, 7};
var matches = a1.Intersect(a2).Count();

I'm not sure if you're just asking for a straight-forward way or the fastest/best way possible...


You have two methods that I am aware of (ref: http://www2.cs.siu.edu/~mengxia/Courses%20PPT/220/carrano_ppt08.ppt):

Recursive (pseudocode)

Algorithm to search a[first] through a[last] for desiredItem
if (there are no elements to search)
    return false
else if (desiredItem equals a[first])
    return true
else    
    return the result of searching a[first+1] through a[last]

Efficiency

May be O(log n) though I have not tried it.

Sequential Search (pseudocode)

public boolean contains(Object anEntry)
{   
    boolean found = false;
    for (int index = 0; !found && (index < length); index++) {
    if (anEntry.equals(entry[index]))
            found = true;
    } 
    return found;
} 

Efficiency of a Sequential Search

Best case  O(1)
    Locate desired item first
Worst case  O(n)
    Must look at all the items
Average case O(n)
    Must look at half the items 
    O(n/2) is still O(n)

I am not aware of an O(log n) search algorithm unless it is sorted.


I don't know if it is the fastest way but you can do

int[] a1 = new []{1,2,4,5,0};
int[] a2 = new []{2,4,11,-6,7};
var result = a1.Intersect(a2).Count();

It is worth comparing this with other ways that are optimised for int as Intersect() operates on IEnumerable.


This problem is also amenable to parallelization: spawn n1 threads and have each one compare an element of a1 with n2 elements of a2, then sum values. Probably slower, but interesting to consider, is spawning n1 * n2 threads to do all comparisons simultaneously, then reducing. If P >> max(n1, n2) in the first case, P >> n1 * n2 in the second, you could do the whole thing in O(n) in the first case, O(log n) in the second.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜