开发者

PYTHON- Given a starting word, which sequence of letters will allow you to maximize the number of valid words that can be spelled

Edit: New question. Given that I suspect this question is more difficult than I originally thought-- potentially NP-complex, I have a different question: what is useful algorithm that can get close to maximizing the number of 开发者_如何学编程letters used?

I was inspired to write a program based on the card game Scrabble Slam! I'm new to algorithms, however, and can't think of an efficient way of solving this problem:

You start with a string containing an English-valid 4 letter word. You can place one letter on that word at a time such that by placing the letter you form a new dictionary-valid word. The letters that you have are equal to the letters in the alphabet.

For example if the starting word was "cart", this could be a valid sequence of moves:

sand --> sane --> sine --> line --> lins --> pins --> fins , etc.

The goal is to maximize the number of words in a sequence by using as many letters of the alphabet (without using a letter more than once).

My algorithm can't find the longest sequence, just a guess at what the longest might be.

First, it gets a list of all words that can be formed by changing one letter of the starting word. That list is called 'adjacentList.' It then looks through all the words in adjacentList and finds which of those words have the most adjacent words. Whichever word has the most adjacentWords, it chooses to turn the starting word into.

For example, the word sane can be turned in 28 other words, the word sine can be turned into 27 other words, the word line can be turned into 30 other words-- each one of these choices was made to maximize the likelihood that more and more words could be spelled.

What would be the best way to go about solving this problem? What sort of data structure would be optimal? How can I improve my code to make it more efficient and less verbose?

##Returns a list of all adjacent words.  Lists contain tuples of the adjacent word and the letter that
##makes the difference between those two words.

def adjacentWords(userWord):
    adjacentList, exactMatches, wrongMatches = list(), list(), str()
    for dictWords in allWords:
        for a,b in zip(userWord, dictWords):
            if a==b: exactMatches.append(a)
            else: wrongMatches = b
        if len(exactMatches) == 3:
            adjacentList.append((dictWords, wrongMatches))
        exactMatches, wrongMatches = list(), list()
    return adjacentList
    #return [dictWords for dictWords in allWords if len([0 for a,b in zip(userWord, dictWords) if a==b]) == 3]


def adjacentLength(content):
    return (len(adjacentWords(content[0])), content[0], content[1])

#Find a word that can be turned into the most other words by changing one letter
def maxLength(adjacentList, startingLetters):
    return max(adjacentLength(content) for content in adjacentList if content[1] in startingLetters)

def main():
    startingWord = "sand"
    startingLetters = "a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z".replace(" ", "").split(',')

    existingWords = list()
    while True:
        adjacentList = adjacentWords(startingWord)
        letterChoice = maxLength(adjacentList, startingLetters)
        if letterChoice[1] not in existingWords:
            print "Going to use letter: "+ str(letterChoice[2]) + " to spell the word "+ str(letterChoice[1]) + " with "+ str(letterChoice[0]) + " possibilities."
            existingWords.append(letterChoice[1])
            startingLetters.remove(str(letterChoice[2]))
            startingWord = letterChoice[1]


main()


@Parseltongue You might find out that although, there is an optimal solution for the given task, yet there is no known way to solve it in optimal way. This task is one of the NP-class problems.

Consider this:

sand --> [?]and 

at this point you have to iterate through ALL possible combinations of [?]and to find out words that match the criteria. In worst case, with no heuristics, that is 26 iterations.

sand --> band, hand, land, wand

Now, take every found word and iterate the second letter etc.
You can see, that the amount of iterations you have to make grows exponentially.

This is in some manner very similar to the problem of chess. No matter how powerful your computer is, it can't predict all possible moves, because there are too many combinations.


Me and a friend actually developed a similar program (in Java, though) in a laboration for a data structures and algorithms course. The task was to create the shortest chain of words from one word to another.

We built up a tree of all available words where one node was connected to another iff they only differed by one letter. Using some kind of hashtable, such as a dictionary, is required for speed. We actually removed one letter from the words we used as keys, thus taking advantage of the fact that connected words then had the same hash, but it's hard to explain without having the code.

For finding the shortest chain, we just used a Breadth-first search. However, finding the shortest path in a graph is easier than the longest. See longest path problem for more details.

To solve the main problem, you could just iterate through all words. However, if speed is important is it often easier to try to find a good special heuristic algorithm.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜