开发者

Code understanding Question

I was going through one code on net.

I did not understand following logic. This code works and works really fast.

for (int i = 0; i < typo_word_vec.size(); i++)
{
        float each_typo_word_len = (float)typo_word_vec[i].size();
        int start_range = each开发者_运维问答_typo_word_len - floor((each_typo_word_len / lower_bound_word_size) * each_typo_word_len) - 1;
        if (start_range < 1)
                start_range = 1;
        int end_range = each_typo_word_len + ceil((each_typo_word_len / upper_bound_word_size) * each_typo_word_len) + 1;
        if (end_range > src_word_max_len)
                end_range = src_word_max_len - 1;

        call_get_dist(i, start_range, end_range);
}  

But I do not understand what is the logic behind using start_range and end_range What underlying algorithm or theory is used here.


You really should have posted a few more lines - we definitely need to check the whole code to understand something.

As I understand it, the 'source' words are ordered by size. The 'candidate' words may be shorter or longer than their potential match. That is what start_range and end_range are used for.

Though I have a hard time figuring why the author doesn't use

start_range = 0;
end_range = src_word_max_len;

EDIT: ok, this is just an optimization from his part (quoting readme.txt):

I solved this using pythonand php first, however, my solution were continuously rejected since it takes too much time to solve it (my guess). In the "cpp" directory, I uploaded my solution with c++ using STL, and finally accepted (The basic idea of algorithm is almost the same: prunning the scan range of source files) Currently, I plan to try this problem using other language such as java next time. The statement of the problem can be found here: http://www.facebook.com/careers/puzzles.php?puzzle_id=17

He just arbitrarily defines a range large enough to have a high enough probability of finding the right matching word in it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜