Speeding up self-similarity in an image

2022-12-12 19:46 问答作者：

I'm writing a program that will generate images. One measurement that I want is the amount of "self-similarity" in the image. I wrote the following code that looks for the countBest-th best match开发者_Python百科es for each sizeWindow * sizeWindow window in the picture:

double Pattern::selfSimilar(int sizeWindow, int countBest) {
    std::vector<int> *pvecount;

    double similarity;
    int match;
    int x1;
    int x2;
    int xWindow;
    int y1;
    int y2;
    int yWindow;

    similarity = 0.0;

    // (x1, y1) is the original that's looking for matches.

    for (x1 = 0; x1 < k_maxX - sizeWindow; x1++) {
        for (y1 = 0; y1 < k_maxY - sizeWindow; y1++) {
             pvecount = new std::vector<int>();

             // (x2, y2) is the possible match.
             for (x2 = 0; x2 < k_maxX - sizeWindow; x2++) {
                 for (y2 = 0; y2 < k_maxY - sizeWindow; y2++) {
                     // Testing... 
                     match = 0;

                     for (xWindow = 0; xWindow < sizeWindow; xWindow++) {
                         for (yWindow = 0; yWindow < sizeWindow; yWindow++) {
                             if (m_color[x1 + xWindow][y1 + yWindow] == m_color[x2 + xWindow][y2 + yWindow]) {
                                 match++;
                             }
                         }
                     }

                     pvecount->push_back(match);
                 }
             }

             nth_element(pvecount->begin(), pvecount->end()-countBest, pvecount->end());

             similarity += (1.0 / ((k_maxX - sizeWindow) * (k_maxY - sizeWindow))) *
                 (*(pvecount->end()-countBest) / (double) (sizeWindow * sizeWindow));

             delete pvecount;
        }
    }

    return similarity;
}

The good news is that the algorithm does what I want it to: it will return a value from 0.0 to 1.0 about how 'self-similar' a picture is.

The bad news -- as I'm sure that you've already noted -- is that the algorithm is extremely slow. It takes (k_maxX - sizeWindow) * (k_maxY - sizeWindow) * (k_maxX - sizeWindow) * (k_maxY - sizeWindow) * sizeWindow * sizeWindow steps for a run.

Some typical values for the variables:

k_maxX = 1280
k_maxY = 1024
sizeWindow = between 5 and 25
countBest = 3, 4, or 5
m_color[x][y] is defined as short m_color[k_maxX][k_maxY] with values between 0 and 3 (but may increase in the future.)

Right now, I'm not worried about the memory footprint taken by pvecount. Later, I can use a sorted data set that doesn't add another element when it's smaller than countBest. I am only worried about algorithm speed.

How can I speed this up?

Ok, first, this approach is not stable at all. If you add random noise to your image, it will greatly decrease the similarity between the two images. More importantly, from an image processing standpoint, it's not efficient or particularly good. I suggest another approach; for example, using a wavelet-based approach. If you performed a 2d DWT on your image for a few levels and compared the scaling coefficients, you would probably get better results. Plus, the discrete wavelet transform is O(n).

The downside is that wavelets are an advanced mathematical topic. There are some good OpenCourseWare notes on wavelets and filterbanks here.

Your problem strongly reminds me of the calculations that have to be done for motion compensation in video compression. Maybe you should take a closer look what's done in that area.

As rlbond already pointed out, counting the number of points in a window where the colors exactly match isn't what's normally done in comparing pictures. A conceptually simpler method than using discrete cosine or wavelet transformations is to add the squares of the differences

diff = (m_color[x1 + xWindow][y1 + yWindow] - m_color[x2 + xWindow][y2 + yWindow]);
sum += diff*diff;

and use sum instead of match as criterion for similarity (now smaller means better).

Back to what you really asked: I think it is possible to cut down the running time by the factor 2/sizeWindow (maybe squared?), but it is a little bit messy. It's based on the fact that certain pairs of squares you compare stay almost the same when incrementing y1 by 1. If the offsets xOff = x2-x1 and yOff = y2-y1 are the same, only the top (rsp. bottom) vertical stripes of the squares are no longer (rsp. now, but not before) matched. If you keep the values you calculate for match in a two-dimensional array indexed by the offsets xOff = x2-x1 and yOff = y2-y1, then can calculate the new value for match[xOff][yOff] for y1 increased by 1 and x1 staying the same by 2*sizeWindow comparisons:

for (int x = x1; x < x1 + sizeWindow; x++) {
    if (m_color[x][y1] == m_color[x + xOff][y1 + yOff]) {
        match[xOff][yOff]--; // top stripes no longer compared
    }

    if (m_color[x][y1+sizeWindow] == m_color[x + xOff][y1 + sizeWindow + yOff]) {
        match[xOff][yOff]++; // bottom stripe compared not, but wasn't before
    }
}

(as the possible values for yOff changed - by incrementing y1 - from the interval [y2 - y1, k_maxY - sizeWindow - y1 - 1] to the interval [y2 - y1 - 1, k_maxY - sizeWindow - y1 - 2] you can discard the matches with second index yOff = k_maxY - sizeWindow - y1 - 1 and have to calculate the matches with second index yOff = y2 - y1 - 1 differently). Maybe you can also keep the values by how much you increase/decrease match[][] during the loop in an array to get another 2/sizeWindow speed-up.

继续阅读：algorithm image-processing optimization similarity

Speeding up self-similarity in an image

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？