Solving Wordle Puzzles with Python

Wordle is a fun little word puzzle that is very similar to the game Mastermind. The player gets 6 guesses to identify a hidden 5-letter word, and after each guess is told whether each letter of the guessed word was:

  1. The correct letter in the correct location
  2. One of the letters in the true word but in the wrong location
  3. Not in the true word
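
The three outcomes above can be sketched as a small scoring function. This is a minimal sketch, not the game's actual code: the name `score_guess` is made up for illustration, and the "g"/"y"/"b" codes and dict keys are borrowed from the narrowing function shown further down. It also assumes the simplest reading of the rules, with no special handling of repeated letters.

```python
def score_guess(guess, answer):
    """Build one hint per letter of guess, using the three outcomes above."""
    hints = []
    for k, letter in enumerate(guess):
        if answer[k] == letter:
            val = "g"  # correct letter in the correct location
        elif letter in answer:
            val = "y"  # in the true word but in the wrong location
        else:
            val = "b"  # not in the true word
        hints.append({"letter": letter, "val": val})
    return hints


hints = score_guess("crane", "cable")
print("".join(h["val"] for h in hints))  # gbybg
```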

Clearly, a computer algorithm given a candidate list of 5-letter words and the hints following each guess could come to a reasonable solution. The simplest version is a narrowing algorithm: on each guess-check turn, it eliminates words from the list of possible words by checking whether each letter in the word is compatible with the hints from the guess.

That is essentially what I have written a little program to do - you can see the code here. The function to narrow the list of words is:

def narrow_words(valid_words, hints):
    """Return the words in valid_words that are compatible with hints.

    hints holds one dict per position of the guess: "letter" is the guessed
    letter and "val" is its hint ("g", "y", or "b").
    """
    allowed_words = []
    for word in valid_words:
        allowed_flag = True
        for k in range(len(word)):
            letter, val = hints[k]["letter"], hints[k]["val"]
            # hint "g" means letter is correct
            if val == "g" and word[k] != letter:
                allowed_flag = False
                break
            # hint "y" means letter is in word but in wrong place
            elif val == "y" and (letter == word[k] or letter not in word):
                allowed_flag = False
                break
            # hint "b" means letter is not in word at all; this assumes a
            # repeated letter is never marked "b" while another copy is present
            elif val == "b" and letter in word:
                allowed_flag = False
                break
        if allowed_flag:
            allowed_words.append(word)
    return allowed_words

One of the decisions the algorithm has to make is which word to select from the list of still-allowed words. My brother suggested selecting the most popular word (from a list where the words are ordered by popularity in common usage), while my initial thought was to select a random word. The question has really morphed from an algorithm question into a statistics question. If the creator of Wordle is selecting more popular words more frequently, my brother’s suggestion is a good one. But if the words are selected randomly, choosing randomly will be better. I’ve implemented both, and a hybrid that chooses randomly from the top 10% of the valid words.

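from random import randint

def get_word(word_list, method):
    # word_list is ordered from most to least popular
    if method == "most_popular":
        index = 0
    elif method == "rand_most_popular":
        # randint needs integer bounds and includes both endpoints
        index = randint(0, round(0.10 * len(word_list)))
    elif method == "random":
        index = randint(0, len(word_list) - 1)
    else:
        # unknown method: fall back to the most popular word
        index = 0
    return word_list[index]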
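
Putting selection and narrowing together, one full game might look like the loop below. This is a hedged sketch rather than the actual program: `score`, `play_game`, the `rng` parameter, and the tiny word list are all made up for illustration, and the hints use the simplified rules with no repeated-letter accounting. Keeping only the words that would have produced the same hint pattern is equivalent to the compatibility check in `narrow_words`.

```python
import random


def score(guess, answer):
    """Simplified hints as a string of "g", "y", "b" (no repeated-letter accounting)."""
    return "".join(
        "g" if a == g else ("y" if g in answer else "b")
        for g, a in zip(guess, answer)
    )


def play_game(answer, word_list, method="random", max_turns=6, rng=random):
    """Return (solved, turns) for one game; word_list is assumed sorted by popularity."""
    candidates = list(word_list)
    for turn in range(1, max_turns + 1):
        if method == "most_popular":
            guess = candidates[0]
        elif method == "rand_most_popular":
            top_count = max(1, round(0.10 * len(candidates)))
            guess = candidates[rng.randrange(top_count)]
        else:
            guess = rng.choice(candidates)
        if guess == answer:
            return True, turn
        # Keep only words that would have produced the same hints for this guess.
        pattern = score(guess, answer)
        candidates = [w for w in candidates if score(guess, w) == pattern]
    return False, max_turns


# Hypothetical mini word list, most popular first.
words = ["about", "other", "which", "their", "there", "first", "would", "these"]
print(play_game("there", words, method="most_popular"))  # (True, 3)
```

Because the true word always matches its own hint pattern, it is never filtered out, so the candidate list cannot become empty before the game ends.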

I found a list of 5-letter words sorted by popularity here that has almost 6000 entries. I tested the algorithm(s) by randomly sampling the true word from the list 10,000 times. The play-game function returns whether the algorithm found the solution and in how many turns. Here are the results:

Results selecting the most popular word at each loop
Not Solved:	789	times (7.89%)
1 Turns:	3	times (0.0%)
2 Turns:	97	times (1.0%)
3 Turns:	1125	times (11.2%)
4 Turns:	3180	times (31.8%)
5 Turns:	3222	times (32.2%)
6 Turns:	1584	times (15.8%)



Results selecting a random word at each loop
Not Solved:	962	times (9.62%)
1 Turns:	2	times (0.0%)
2 Turns:	196	times (2.0%)
3 Turns:	1484	times (14.8%)
4 Turns:	3169	times (31.7%)
5 Turns:	2816	times (28.2%)
6 Turns:	1371	times (13.7%)



Results selecting from the most popular 10% of words at each loop
Not Solved:	858	times (8.58%)
1 Turns:	3	times (0.0%)
2 Turns:	111	times (1.1%)
3 Turns:	1100	times (11.0%)
4 Turns:	3223	times (32.2%)
5 Turns:	3165	times (31.6%)
6 Turns:	1540	times (15.4%)

The more random the selection becomes, the higher the overall did-not-solve rate becomes, while the average number of turns to solve goes down. Overall, the algorithms have solve rates over 90% and take between 4 and 5 turns on average (about 4.4 to 4.5 among solved games) - that’s about as good as I’ve been solving them by hand!
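
The solve rates and average turns can be checked directly from the counts in the tables above. The snippet below is just that arithmetic; the dictionary layout and the `summarize` helper are made up for illustration, and the average counts solved games only.

```python
# Counts copied from the three result tables above.
results = {
    "most popular": {"unsolved": 789, "turns": {1: 3, 2: 97, 3: 1125, 4: 3180, 5: 3222, 6: 1584}},
    "random": {"unsolved": 962, "turns": {1: 2, 2: 196, 3: 1484, 4: 3169, 5: 2816, 6: 1371}},
    "top 10%": {"unsolved": 858, "turns": {1: 3, 2: 111, 3: 1100, 4: 3223, 5: 3165, 6: 1540}},
}


def summarize(unsolved, turns):
    """Return (solve rate, average turns among solved games)."""
    solved = sum(turns.values())
    avg_turns = sum(t * n for t, n in turns.items()) / solved
    return solved / (solved + unsolved), avg_turns


for name, data in results.items():
    rate, avg_turns = summarize(data["unsolved"], data["turns"])
    print(f"{name}: solved {rate:.2%}, average turns {avg_turns:.2f}")
# most popular: solved 92.11%, average turns 4.55
# random: solved 90.38%, average turns 4.41
# top 10%: solved 91.42%, average turns 4.54
```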

Final Thoughts

I think there may be more clever ways to select the word, including trying to maximize the number of different letters used or preferentially selecting words with more vowels, which may eliminate words more quickly. A quick glance at the Wikipedia page for Mastermind suggests using a minimax selection method (from TAOCP author Donald Knuth). I took a glance at the source code for the Wordle site and I do think the word list is included in the minified JavaScript - but I’m not trying to take all the fun out of it!
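
For the curious, here is a rough sketch of what a minimax selection could look like for Wordle. It is an untested-against-the-real-game illustration, not Knuth's published algorithm: `score` and `minimax_guess` are made-up names, the hints use the simplified rules with no repeated-letter accounting, and only the remaining candidates are considered as guesses, which is a simplification. The idea is to pick the guess whose worst-case hint pattern leaves the fewest candidates.

```python
from collections import Counter


def score(guess, answer):
    """Simplified hints as a string of "g", "y", "b" (no repeated-letter accounting)."""
    return "".join(
        "g" if a == g else ("y" if g in answer else "b")
        for g, a in zip(guess, answer)
    )


def minimax_guess(candidates):
    """Pick the candidate whose largest hint bucket is smallest."""
    def worst_case(guess):
        # Group the possible answers by the hint pattern this guess would get.
        buckets = Counter(score(guess, answer) for answer in candidates)
        return max(buckets.values())

    return min(candidates, key=worst_case)


# "chomp" separates all four words, while the others leave two words tied.
print(minimax_guess(["match", "catch", "patch", "chomp"]))  # chomp
```

This costs a comparison for every pair of remaining words on each turn, so on a list of almost 6000 words the first guess is noticeably more expensive than picking at random.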