开发者

Improve a word search game worst case

Consider:

a c p r c 
x s o p c 
v o v n i 
w g f m n 
q a t i t

An alphabet i_index is adjacent to another alphabet j_index in the tile if i_index is next to j_index in any of the following positions:

* * *
* x *
* * *

Here all the * indicates the location which are adjacent to x.

The task is to find a given string in the tile. The condition is that all the characters of the given string should be adjacent, and no one character in the tile may be used more than once to construct the given string.

I have came up with a simpl开发者_JS百科y backtracking solution, for which the solutions are pretty fast, but the worst case time is really worse.

For an example: Say the tile has 4x4 filled with all a's , therefore 16 a's, and the string to find is aaaaaaaaaaaaaaab, that is, 15 a's and one b . One what is to eliminate strings with characters which does not appear in the tile. But still worst case can still appear with say the tile have abababababababab and the string to find is abababababababbb .

My attempt is like this:

https://ideone.com/alUPf:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX 5

int sp (char mat[MAX][MAX], char *pat, int c, int i, int j)
{
  int r = 0;
  char temp;


  if (c == strlen (pat))
    return 1;
  if (((i<0) || (j<0)) || (i>=MAX) || (j>=MAX))
        return 0;
  if (mat[i][j] != pat[c])
    return 0;
  if (isupper (mat[i][j]))
    return 0;


  /* Save character and mark location to indicate
   * DFS has visited this node, to stop other branches
   * to enter here and cross over path
   */
  temp = mat[i][j];
  mat[i][j] = 0;

  r |= sp (mat, pat, c+1, i-1, j-1);
  r |= sp (mat, pat, c+1, i-1, j);
  r |= sp (mat, pat, c+1, i-1, j+1);
  r |= sp (mat, pat, c+1, i, j+1);
  r |= sp (mat, pat, c+1, i+1, j+1);
  r |= sp (mat, pat, c+1, i+1, j);
  r |= sp (mat, pat, c+1, i+1, j-1);
  r |= sp (mat, pat, c+1, i, j-1);

  /* restore value */
  mat[i][j] = temp;

  /* mark if success */
  if ((mat[i][j] == pat[c]) && (r == 1))
    mat[i][j] = toupper (mat[i][j]);

  return r;
}

/* Testing the `sp` module */
int main (void)
{
  char mat[MAX][MAX] = {
                     {'a', 'c', 'p', 'r', 'c'},
                     {'x', 's', 'o', 'p', 'c'},
                     {'v', 'o', 'v', 'n', 'i'},
                     {'w', 'g', 'f', 'm', 'n'},
                     {'q', 'a', 't', 'i', 't'}
                   };
  char pat[] = "microsoft";
  int i, j;

  for (i=0; i<5; i++)
  {
    for (j=0; j<5; j++)
      printf ("%c ", mat[i][j]);
    printf ("\n");
  }

  for (i=0; i<5; i++)
    for (j=0; j<5; j++)
      sp (mat, pat, 0, i, j);

  printf ("\n\n\n");
  for (i=0; i<5; i++)
  {
    for (j=0; j<5; j++)
    {
      if (isupper (mat[i][j]))
        printf ("%c ", mat[i][j]);
      else
        printf (". ");
    }
    printf ("\n");
  }
  printf ("\n");
  return 0;
}

which prints:

a c p r c 
x s o p c 
v o v n i 
w g f m n 
q a t i t 



. . . R . 
. S O . C 
. O . . I 
. . F M . 
. . T . . 

The function sp does the work, performs the back tracking.

Is there a better way ? or is it possible to lower the worst case time ?


There is no polynomial algorithm, so I don't think you can get much better...

It is possible to encode any grid graph (a planar graph with nodes with degree <= 4) using a letter matrix. The following grid

0-0-0
| | |
0 0-0
| | 
0-0-0

Can be converted by turning edges into 'a's, vertices into 'b's and empty spaces into 'z's

B a B a B  
a z a z a  
B z B a B  
a z a z z  
B a B a B 

Looking for a hamiltonian path in the original graph is equivalent to searching for the string BaBaBaBaBaBaBaBaB (with all 9 Bs). But the Hamiltonian path problem for grids in NP-complete* so the word searching problem is NP-hard.

Since a word path is clearly a polynomial certificate, the word searching problem on matrices is NP-complete.


*I remember seeing a proof for this a while ago and Wikipedia confirms, but without linking to a reference >:/


I'm pretty sure theres more on this problem out there. I just pulled this proof out of my ass and I'm pretty sure were not the first ones to look at the problem. At the least there is a good chance for nice heuristics on the non-degenerate cases you get in a real magazine puzzle...


One simple improvement is to check the value of r after each call to sp. If r == 1 then stop calling sp. You found your solution. This is unless you need to find all possible solutions.

Something like this (logical OR operator does not calculate second operand if first is true):

r = sp (mat, pat, c+1, i-1, j-1)) ||
  sp (mat, pat, c+1, i-1, j) ||
  sp (mat, pat, c+1, i-1, j+1) ||
  sp (mat, pat, c+1, i, j+1) ||
  sp (mat, pat, c+1, i+1, j+1) ||
  sp (mat, pat, c+1, i+1, j) ||
  sp (mat, pat, c+1, i+1, j-1) ||
  sp (mat, pat, c+1, i, j-1) ? 1 : 0;


I think you might find that focusing on the worst case is counterproductive here, because there is no real improvement to be made. However, there are many useful improvements to be made in "real world" cases. For example, if the words are always drawn from a dictionary, if they may be limited in length (or have a natural distribution of lengths, statistically speaking). For small grids you could imagine searching them in advance to find all words from a dictionary, storing the list in a hashmap, and then performing a simple lookup as words need to be "tested". All the time goes into building your index, but that may be acceptable if the grid rarely changes.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜