开发者

recursion & returning boolean values

bool binsearch(string phrase, vector<string> words, int from, int to, int &test)
{
    while (tf == "y") //tf is a global variable
    {
        int mid = (to+from)/2;
        if (words[mid] == phrase) {tf = "t"; return true;}
        if (mid == test) {tf = "f"; return false;}
        if (words[mid] > phrase) {return binsearch(phrase, words, mid-1, to, mid);}
        else {return binsearch(phrase, words, from, mid+1, mid);}
     }
}

i'm working on getting this binary search working. i need the overall function to return either "true" or "false". i understand how the recursion works up until either line 6 or 7 execut开发者_如何学编程es and the return command is invoked. i've done research, and it seems like there's no way to exit the function right there and it has to "unwind" itself. the tf global variable nonsense is so it won't execute that body again when it's unwinding...but i'm still not getting the results i want.

essentially, i just want to get out of the function ASAP after either the return true or return false command is invoked, and return the value to the main function accordingly

Thanks


I don't understand your binary search either, and using global variables in addition to recursion leads to programs which are very hard to understand. It's better to go back the call stack again and "unwind" it properly. Look at the following example (untested):

bool binsearch(const string& phrase, const vector<string> &words, int from, int to)
{
    if (from > to)
        return false; // word not found
    int mid = from + (to - from) / 2; // i think your calculation is wrong here
    int cmp = phrase.compare(words[mid]);
    if (cmp < 0)
        return binsearch(phrase, words, from, mid - 1);
    else if (cmp > 0)
        return binsearch(phrase, words, mid + 1, to);
    else
        return true; // word found
}


You can use the STL's built-in binary_search as follows:

binary_search(words.begin(),words.end(), phrase)

If you're doing this to learn; there are a few things...

  • You don't need a while loop. There are three cases to consider: the word comes before mid, at mid, or after mid. Each of these three cases returns - so it's impossible to even reach the end of the loop body.
  • You use test when exactly, and do you need this variable?
  • You should consider carefully exactly which range of indexes still needs to be searched. Are from and to inclusive or exclusive? You need to be precise and consistent.
  • Consider that division of positive integers rounds down. No matter what values they have, ensure that the recursive call calls a smaller range - to avoid infinite loops. This will help avoid the need for your test variable (see David's comment below).
  • It's not good practice to use global variables; certainly not in otherwise pure functions - I'm assuming you're doing this for debugging purposes?
  • How large can to and from be? In some cases, note that to+from may exceed 2^31-1.
  • It's typical in C++ to express these notions with iterators. You don't have to, of course.
  • It's typical in C++ to pass large objects by const & where possible - that way, the recursive call doesn't need to copy the entire vector. This is not important for correctness, but practically very important for efficient code.


Pass vector<string> words as reference in your binsearch() function. Presently it keeps creating copies of vector<string> whenever the function is called which is not needed. Moreover in future if you want to update that vector<>, then passing by reference is the best way.

There should be return statement outside the while loop. That will be the final 'return`.


One of the classical way to get rid of this : rewrite it without recursion.

For example, use a while loop, then as soon as you find the result, use a break to go out. You can have a look at following code (not compiled, just written quickly from your own code)

bool binsearch(const string& phrase, const vector<string> &words, int from, int to)
{
    bool retVal = false;
    int start = from;
    int end = to;


    while(start<end) {
       int mid = from + (to - from) / 2; // i think your calculation is wrong here
       int cmp = phrase.compare(words[mid]);
       if (cmp < 0) {
           end=mid-1;
       } else if (cmp > 0) {
           start=mid+1;
       } else {
           retVal = true;
           break;
       }
    }
    return retVal;
}

There is no elegant or portable way to jump out of a full call stack, it's at best fairly risky. Moreover, the derecursified function will be much quicker : it does not need to push stuff on stack and do a a function call

Edit

  • added missing return
  • concerning performances : just benchmark it. In this particular case, code complexity (for the human reader) is almost the same, but depending on the algo, it can be much more complex (or even impossible).


There are several things wrong with your code. For starters, it's not clear what to and from mean: are they inclusive, or exclusive. And if they're both inclusive (which your arguments to the recursive calls seems to suggest), how do you detect the end. And what does test mean? You seem to be using it as an end criterion when you don't find the word, but I don't see how.

If I were writing this, I'd use a simple helper class to hold the target and the word list (but you can just propagate them down explicitly), and a wrapper function so that the client code doesn't have to specify the to and from arguments. There's no need for a global variable, or any additional test. And I'd use the half open intervals that are idiomatic in C++: lower bound inclusive, upper bound exclusive (so top == bottom specifies an empty range, so I've finished without finding the element):

bool 
binSearch( std::string const& target,
           std::vector<std::string> const& words );

namespace {

    class BinSearcher
    {
        std::vector<std::string> const& myWords;
        std::string const& myTarget;

        bool doSearch( int bottom, int top ) const
        {
            int mid = (top - bottom) / 2;
            return mid != top
                && (myTarget == myWords[mid]
                    || (myTarget > myWords[mid] && doSearch( mid + 1, top ))
                    || (myTarget < myWords[mid] && doSearch( bottom, mid ));
        }
        BinSearcher( std::string const& target,
                     std::vector<std::string> const& words )
            : myWords( words )
            , myTarget( target )
        {
        }
        friend bool binSearch( std::string const&,
                               std::vector<std::string> const& );
    };
}

bool 
binSearch( std::string const& target,
           std::vector<std::string> const& words )
{
    BinSearcher searcher( target, words );
    return searcher.doSearch( 0, words.size() );
}

Note that you can't do the comparison before testing that the range isn't equal; it will cause an out of bounds access if all of the elements are less than the target.

Also: I presume that you're doing this for pedagogical reasons. Otherwise, you should just use the function in the standard library. And I wouldn't normally use recursion here: there's a straightforward iterative solution:

namespace {

    bool
    doBinSearch( std::string const& target,
                 std::vector<std::string> const& words,
                 int bottom,
                 int top )
    {
        bool found = false;
        while ( bottom != top && ! found ) {
            int mid = (top - bottom) / 2;
            int cmp = target.compare( words[mid] );
            if ( cmp < 0 ) {
                top = mid ;
            } else if ( 0 < cmp ) {
                bottom = mid + 1;
            } else {
                found = true;
            }
        }
        return found;
    }
}

bool
binSearch( std::string const& target,
           std::vector<std::string> const& words )
{
    return doBinSearch( target, words, 0, words.size() );
}

(Finally, you'll notice that I've converted your parameters to references. It doesn't change anything in the logic of the code, but if words is relatively large, it will make a very significant impact on the performance.)


You can use a longjmp, aka a "non-local goto", to exit the inner recursion immediately, but the question is whether this micro-optimization is worth the trouble.

A better option is to change the recursion into a loop. Since all the recursive calls are "in tail position" (are the argument to return), you can replace them with code that resets the parameter variables. Unfortunately, I don't understand your code, so I can't give you an example.


This is a classic exercise in using recursion - sure, one can also do things nonrecursively, but it's very elegant to "let the recursion manage one's bookkeeping." For those whose knee-jerk reaction is "do it iteratively", I suggest doing the analogous exercise on a merge-sort or quicksort. Very similar recursive structure, but the bookkeeping there is greatly eased in a recursive context. And on modern CPUs the recursive code - surprise! - often runs as fast or faster, to boot.

Here is my recursive implementation, using the OP's problem context. Note there is no need to separately test the midpoint element: Within the context of the C++ "less than" paradigm for comparison predicates (where given <, one can infer equality via .not.(a.lt.b) .and. .not.(b.lt.a) ), an extra test for equality of the midpoint makes little sense, although in the special case of the string class with its many-valued compare result, it may yield a modest speedup to add special handling for the 0-return result. My sample version assumes only < (in the sense that if one really only had <, one would slightly rearrange the divide-and-conquer conditional to use that), which generalizes more easily to numeric and user-defined data types:

bool binsearch(const string&phrase, vector<string>&words, int from, int to)
{
    if(from > to) return false;    // range sanity check
    if(from == to) return (phrase.compare(words[to]) == 0);    // bottom of the recursion
    int mid = from + (to-from)/2;        // Range still has >= 2 elements; set up to recurse
    if(phrase.compare(words[mid]) <= 0)  // Recurse into the subinterval bracketing the target
        return binsearch(phrase,words, from,mid);
    else
        return binsearch(phrase,words, mid+1,to);
}

And here is a non-recursive version of the above, which corrects several problems with the sample code posted by user 'Bruce' and again uses no separate midpoint-value test:

bool binsearch_nr(const string& phrase, const vector<string> &words, int from, int to)
{
    if(from > to) return false;    // range sanity check
    while(from < to) {
        int mid = from + (to-from)/2;
        int cmp = phrase.compare(words[mid]);
        if (cmp <= 0)
            to = mid;
        else
            from = mid+1;
    }
    return (phrase.compare(words[to]) == 0);
}

I did a comparative timing of the above 2 implementations using a vector of 1 million quasi-random text snippets, code built using gcc 4.2 on MacOS ... the recursive version runs ~20% slower. For my hand-rolled merge-sort code, though, recursive is faster. YMMV.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜