开发者

How can I make this vector enumeration code faster?

I have three large sets of vectors: A, B1 and B2. These sets are stored in files on disk. For each vector a from A I need to check whether it may be presented as a = b1 + b2, where b1 is from B1 and b2 is from B2. Vectors have 20 compo开发者_JS百科nents, and all components are non-negative numbers.

How I'm solving this problem now (pseudocode):

foreach a in A  
  foreach b1 in B1
    for i = 1 to 20
      bt[i] = a[i] - b1[i]
      if bt[i] < 0 then try next b1
    next i

    foreach b2 in B2
      for i = 1 to 20
        if bt[i] != b2[i] then try next b2
      next i

      num_of_expansions++
    next b2
  next b1
next a

My questions:

1. Any ideas on how to make if faster?

2. How to make it in parallel?

3. Questions 1, 2 for the case when I have B1, B2, ..., Bk, k > 2?


You can sort B1 and B2 by norm. If a = b1 + b2, then ||a|| = ||b1 + b2|| <= ||b1|| + ||b2||, so for any a and b1, you can efficiently eliminate all elements of B2 that have norm < ||a|| - ||b1||. There may also be some way to use the distribution of norms in B1 and B2 to decide whether to switch the roles of the two sets in this. (I don't see off-hand how to do it, but it seems to me that something like this should hold if the distributions of norms in B1 and B2 are significantly different.)

As for making it parallel, it seems that each loop can be turned into a parallel computation, since all computations of one inner iteration are independent of all other iterations.

EDIT

Continuing the analysis: since b2 = a - b1, we also have ||b2|| <= ||a|| + ||b1||. So for any given a and b1, you can restrict the search in B2 to those elements with norms in the range ||a|| ± ||b1||. This suggests that for B1 you should select the set with the smallest average norm.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜