There are two types of social network users: those who are addicted to Wordle and those who wonder what the little yellow and green squares are that their friends tweet non-stop. The simple game, which consists of guessing a hidden five-letter word, has become one of the great viral phenomena of recent months: the 90 fans who played it in November had become 300,000 by this January. The inventor of the original English version, the Welsh-born software engineer Josh Wardle, created it during the pandemic to entertain his partner, a lover of such puzzles, according to an interview in The New York Times. From there it spread to his whole family and then became popular on Twitter and Facebook thanks to the mysterious colored grids with which players, without revealing the word so as not to spoil it for others, show how many attempts it took them to find the solution.
One of the keys to its success is its simplicity. The player opens a minimalist page, with no registration or advertising, tries five-letter words, and the program indicates which letters match the solution in the same position (green square) or appear in it in a different position (yellow square). It works just like the legendary Mastermind, which used colors instead of words. To win you have to find the correct answer in six attempts or fewer. Wardle publishes only one challenge a day, an idea he took from The New York Times' Spelling Bee puzzle, so fans have to wait 24 hours to play again. Perhaps for that very reason it is all the more engaging, and once players have mastered the mechanics after a few games, they end up asking an inevitable question: is there an optimal method for solving it in the fewest possible attempts?
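The color rule is easy to express in code. What follows is a minimal Python sketch, not Wordle's actual source: the function name feedback and the G/Y/- notation are illustrative choices, and the treatment of repeated letters (a guessed letter earns a yellow only while unmatched copies of it remain in the solution) is an assumption that the rules described above do not spell out.

```python
from collections import Counter


def feedback(guess: str, solution: str) -> str:
    """Return one mark per letter: G = green, Y = yellow, - = no match."""
    # First pass: letters in the right position are green.
    pattern = ["G" if g == s else "-" for g, s in zip(guess, solution)]
    # Solution letters that were not matched in place can still earn a yellow.
    spare = Counter(s for g, s in zip(guess, solution) if g != s)
    # Second pass (assumed rule): hand out yellows while spare copies remain.
    for i, g in enumerate(guess):
        if pattern[i] == "-" and spare[g] > 0:
            pattern[i] = "Y"
            spare[g] -= 1
    return "".join(pattern)


print(feedback("orate", "crate"))  # -GGGG
print(feedback("aeros", "orate"))  # YYYY-
```

In the second call every letter of aeros except the s appears somewhere in orate, but never in the same position, so the sketch returns four yellows and one miss.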
Esteban Moro (Salamanca, 1971), professor, researcher and data scientist at the Carlos III University of Madrid and visiting professor at the Massachusetts Institute of Technology (MIT), has sought a scientific answer to this question. In an article published on his blog, he explains a strategy that would have solved, in six attempts or fewer, 99% of the 206 challenges posed so far by Wardle, although it does not transfer to other versions of the game, such as those that have been circulating for days in Spanish and even in Galician.
I wrote a simple blog post about how to play (and 99% win) Wordle with R https://t.co/stb33cBDDn The simple strategy is to exploit the bias in Wordle answers towards more frequent words in English, combined with a smart choice for the initial guess (“orate”). Enjoy! #RStats
— Esteban Moro (@estebanmoro) January 11, 2022
His method rests on two keys: starting the game with a word he has identified as the best, and choosing the successive guesses according to a simple rule. But how did he find that pattern?
For his calculations Moro used R, a free programming language for statistical analysis, to reproduce the inner workings of Wordle on his computer. With it, he created a game with the same rules that includes the 12,972 five-letter words that exist in the English language. Next, the program simulated successive games, always starting with the term aeros (which contains the five most common letters in that language) and choosing for each following attempt a random word from among all those that could still fit the solution. With these instructions, the program won (it found the solution in six attempts or fewer) 80% of the time when faced with a randomly chosen word, with an average of 5.1 attempts. And it solved almost 90% of the puzzles, with an average of 4.7 attempts, when it was given one of the more than 200 puzzles already posed by Wardle.
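The simulation just described can be sketched in a few lines. The version below is in Python rather than R and is only an illustration under stated assumptions, not Moro's code: the eight-word vocabulary WORDS is a hypothetical stand-in for the 12,972-word list, the function names are invented, the repeated-letter rule is assumed, every solution is assumed to be in the vocabulary, and the average is taken over the games won, which may not be how the published averages were computed.

```python
import random
from collections import Counter


def feedback(guess, solution):
    """Pattern of G (right place), Y (wrong place) and - (absent)."""
    pattern = ["G" if g == s else "-" for g, s in zip(guess, solution)]
    spare = Counter(s for g, s in zip(guess, solution) if g != s)
    for i, g in enumerate(guess):
        if pattern[i] == "-" and spare[g] > 0:
            pattern[i] = "Y"
            spare[g] -= 1
    return "".join(pattern)


def play(solution, words, first_guess, rng, max_attempts=6):
    """Return the number of attempts used, or None if the game is lost."""
    candidates = list(words)  # assumes the solution is one of these words
    for attempt in range(1, max_attempts + 1):
        # Fixed opening word, then a random word that still fits the clues.
        guess = first_guess if attempt == 1 else rng.choice(candidates)
        if guess == solution:
            return attempt
        pattern = feedback(guess, solution)
        # Keep only the words that would have produced the same colors.
        candidates = [w for w in candidates if feedback(guess, w) == pattern]
    return None


def simulate(solutions, words, first_guess="aeros", seed=0):
    """Return (share of games won, mean attempts over the games won)."""
    rng = random.Random(seed)
    results = [play(s, words, first_guess, rng) for s in solutions]
    wins = [r for r in results if r is not None]
    mean = sum(wins) / len(wins) if wins else float("nan")
    return len(wins) / len(results), mean


# Hypothetical toy vocabulary, used here as both dictionary and solutions.
WORDS = ["crate", "grate", "irate", "orate", "solar", "tiger", "belly", "jelly"]
win_rate, mean_attempts = simulate(WORDS, WORDS)
print(win_rate, mean_attempts)
```

With so small a vocabulary the figures printed say nothing about the real game; the point is only the loop itself: guess, read the colors, discard every word that contradicts them, and pick again.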
But there was a way to improve those averages. Other researchers had discovered that the game's solutions are not chosen at random from the more than 12,000 possible words; some are more likely to come up than others. By cross-referencing the answers to previous Wordles with a corpus of the most used terms in English, Moro confirmed that Wardle chooses frequently used words, something the game's inventor had also hinted at in his interview with The New York Times, which mentioned that he avoided overly rare terms. “It makes perfect sense,” explains the Spanish researcher by phone from Boston, where he lives. “For the game to be a success it needs to be simple and playable, and choosing among the most common terms means that in the end we all get it right in just a few attempts.”
Moro then changed the algorithm. He programmed the simulations so that, still starting every game with the word aeros, the program chose at each subsequent step the most used term in English from among all the remaining possibilities, with the help of a corpus that ranks words by frequency of use. The results hardly improved for words chosen at random, but the strategy proved much more effective for those that Wardle had already used in his challenges: the program solved 97% of the puzzles with an average of 3.9 attempts.
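The only change with respect to the random strategy is the selection step, which might look like the Python sketch below. The usage counts in FREQUENCY are invented for illustration and do not come from any real corpus, and the function name pick_most_frequent is likewise hypothetical.

```python
# Hypothetical usage counts, invented for illustration; a real run would
# load them from a word-frequency corpus of English.
FREQUENCY = {"crate": 120, "grate": 95, "irate": 40, "prate": 1}


def pick_most_frequent(candidates, frequency):
    """Return the candidate with the highest count; unknown words count as 0."""
    return max(candidates, key=lambda word: frequency.get(word, 0))


# Words still compatible with the clues so far (also a made-up example).
next_guess = pick_most_frequent(["prate", "irate", "grate", "crate"], FREQUENCY)
print(next_guess)  # crate
```

Treating words missing from the table as having a count of zero is a design choice of this sketch: it pushes obscure terms to the back of the queue, which suits a game whose answers lean toward common words.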
“I am a data scientist and as such I am always looking for these biases and these patterns that help us make algorithms. And that is what I have done, to see that there was a bias in the words that Wardle chose, and to exploit it to make a better strategy”, explains Moro.
Could anything else be done to refine the method? Perhaps changing the starting word. The letters of aeros are the five most frequent in English (as Edgar Allan Poe pointed out in the cryptographic challenge in his famous story The Gold-Bug), but Moro found that in the more than 200 solutions published so far in the original version of the game, the t appeared more often than the s. He then changed the initial word from aeros to orate (to speak) and, keeping the rule of always choosing the most frequently used word afterwards, the algorithm solved 99% of the puzzles posed by Wardle. In any case, the researcher warns that this two-point improvement in the results could be a statistical coincidence, and that more data would be needed to assess whether it is significant.
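The comparison between the t and the s comes down to a letter count over the published solutions. A minimal Python sketch of that count follows; the five words in PAST_SOLUTIONS are an illustrative sample chosen for this example, not the actual list of Wordle answers, and letter_counts is a hypothetical helper name.

```python
from collections import Counter

# Illustrative sample only, not the actual list of published solutions.
PAST_SOLUTIONS = ["tiger", "point", "crate", "solar", "those"]


def letter_counts(words):
    """Count every occurrence of every letter across the given words."""
    return Counter("".join(words))


counts = letter_counts(PAST_SOLUTIONS)
print(counts["t"], counts["s"])  # 4 2
```

Run over the real solutions, a tally like this is the kind of evidence that would favor a starting word containing a t over one containing an s.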
Wardle deliberately chooses common words to make the game friendlier. But a fiendishly difficult Wordle could be programmed, with rarer terms or with words that share several letters with many others. “In English, belly would be quite difficult, for example, because there are many words that end in those three letters,” explains Moro. If rare words were deliberately included, his method would not work: new biases would have to be detected and the algorithm adjusted so that it chose, for example, the least used terms or those most similar to others.
Beyond the question of the best possible strategy, there is another recurring question surrounding Wordle: the reason for its dazzling global popularity. Moro believes, and he is not alone, that it has partly to do with the serenity the game brings to a frantically paced world. “Because Wardle posts only one puzzle a day, that slow pace gives us unhurried, synchronized social interaction. And that is one of the successes of the game,” he explains.