Cryptogram Helper Algorithm

The algorithm used by the applet's "Solve" button is a deterministic, iterative procedure based around a 26x26 matrix of probabilities (how likely the algorithm thinks it is that a given code letter should be replaced by a given text letter). The matrix starts out with all probabilities equal, then (after loading a dictionary of English word frequencies) repeats the following steps:

For each word in the dictionary that matches a word of the cryptogram, the word's frequency is multiplied by the matrix entries for its individual letters, giving an overall probability of seeing that word in that position.
For each pair of a code word and a possible decryption of that word, we build a 26x26 matrix in which the probability of finding that word is as large as possible (with uniform probabilities for unrelated letters). We then replace our original matrix with a new matrix formed as a weighted average of the word matrices, where the weights are the word probabilities computed in the first step.
A minimum-weight perfect matching algorithm computes the most likely decryption for the current matrix.
We try all possible ways of swapping two letters in the decryption, keeping a swap when it improves the quality (number of recognized words and product of frequencies of those words). If we find a good swap we also adjust the matrix to make that decryption more likely in future iterations.
For each word of the decryption, we use the dictionary to find all other words that could replace it without changing anything in the rest of the decryption. If we find any such words, we use the one with the highest frequency.

At each iteration, the iteration number and quality of the current translation are shown in the browser's status line. The decryption shown in the main window is the best one found so far, which is not necessarily the one from the latest iteration.

See Graham Toal's cryptogram source code page for alternative cryptogram-solving algorithms.

Return to Cryptogram Helper

View source