Solving CryptoQuotes with PVM:

Strategy



Basic Strategy

Although people often solve cryptoquotes with the help of general grammar guidelines, this program solves them using only the words themselves. The strategy is patterned around the way PVM computes in parallel. PVM is suited to coarse-grained algorithms, in which each task does a large amount of work relative to the amount of data exchanged between tasks. It is therefore natural to assign each task spawned by PVM one word to solve. The task looks at the length of its assigned word and loads a "dictionary" composed of words of the same length. By scanning the words in this dictionary, the task determines which letters can appear in each position, and so narrows down the possibilities for each letter of the code. During this first scan, the task also discards words that do not fit the letter pattern of the target word. For example, the scrambled word "mqrswqmab" cannot be the word "affiliate": the target requires the first and seventh letters to be the same and the second and sixth letters to be the same, but the second and sixth letters of "affiliate" are "f" and "i." The word could be "iteration," however, because "iteration" fits the pattern of the target word. On reaching the end of such a scan, a task may have narrowed down the list of plaintext letters that a letter in the code can stand for. If so, it shares this newfound knowledge with the other tasks working on the quote.
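The scan described above can be sketched in Python. This is only an illustration under stated assumptions, not the program's actual code: the function names pattern, scan, and share, the tiny word list, and the choice to merge shared knowledge by set intersection are all invented for the example. It also assumes the code is a one-to-one substitution, so that different code letters always stand for different plaintext letters.

```python
from collections import defaultdict


def pattern(word):
    """Map a word to its letter-repetition pattern, e.g. 'noon' -> (0, 1, 1, 0)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word)


def scan(cipher_word, dictionary):
    """One task's scan: keep same-length words that fit the cipher word's
    pattern, and collect the plaintext letters each cipher letter could be."""
    target = pattern(cipher_word)
    candidates = [w for w in dictionary
                  if len(w) == len(cipher_word) and pattern(w) == target]
    possible = defaultdict(set)
    for word in candidates:
        for c, p in zip(cipher_word, word):
            possible[c].add(p)
    return candidates, dict(possible)


def share(known, possible):
    """Merge one task's findings into the shared knowledge (assumed here to
    be done by intersecting the sets of possible letters)."""
    merged = dict(known)
    for c, letters in possible.items():
        merged[c] = merged[c] & letters if c in merged else set(letters)
    return merged


# Hypothetical nine-letter dictionary, for illustration only.
words = ["affiliate", "iteration", "education", "different"]
candidates, possible = scan("mqrswqmab", words)
print(candidates)      # ['iteration']
print(possible["m"])   # {'i'}
```

With this word list, only "iteration" survives the scan, so every letter of the code in "mqrswqmab" is pinned to a single plaintext letter that can then be passed to the other tasks.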

This strategy, bereft of the ability to guess at words based on their order and possible grammatical role, must rely heavily on the word patterns in longer words to reach a decisive solution. Because of this, quotes composed entirely of short words are often unsolvable using this approach. Luckily, the average cryptoquote contains enough long words for this strategy to be effective.
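A small sketch can show why short words narrow the search so little. The helper names pattern and matches and the mini-dictionary are assumptions made for this example, and a one-to-one substitution is assumed; a real dictionary would hold far more words of each length, which makes the contrast stronger.

```python
def pattern(word):
    """Map a word to its letter-repetition pattern, e.g. 'all' -> (0, 1, 1)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word)


def matches(cipher_word, dictionary):
    """Dictionary words with the same length and letter pattern as the cipher word."""
    target = pattern(cipher_word)
    return [w for w in dictionary
            if len(w) == len(cipher_word) and pattern(w) == target]


# Hypothetical mini-dictionary, for illustration only.
words = ["the", "and", "for", "but", "not", "all",
         "iteration", "affiliate", "education"]

short = matches("xyz", words)         # three distinct letters: almost anything fits
long_ = matches("mqrswqmab", words)   # repeated letters pin the word down

print(len(short), short)   # 5 ['the', 'and', 'for', 'but', 'not']
print(len(long_), long_)   # 1 ['iteration']
```

A three-letter word with no repeated letters matches nearly every three-letter entry, so it rules out almost nothing, while the nine-letter word with two repeated letters is matched by a single entry.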

The effectiveness of the strategy also depends on the size of the dictionary. If the words in the quote are required to exist in the dictionary, then it is a great advantage for the dictionary to be as small as possible, partly to reduce scan times but mostly to reduce the number of possibilities for each letter. It also follows that if a word in the quote is not in the dictionary, the problem is unsolvable. Human cryptoquote aficionados know that they can solve puzzles containing words they do not know by delaying the solution of such a word until they have determined enough letters (in fact, one of the attractions of solving cryptoquotes is the way it improves vocabulary, similar to the effect crossword puzzles have). Then, with some or all of the letters in that word filled in, it becomes easier to guess the word's identity. In employing such tactics, however, the human is invariably using the meaning of the words around the unknown word to at least narrow down its part of speech. Humans can also recognize basic spelling patterns common to all words (relationships between the placement of vowels and consonants), and this allows them to fill in small gaps of one or two letters (a vowel between two consonants, for example; three consonants in a row occur only in special combinations, such as those at the beginning of "three" or "spleen"). Since it is exceedingly difficult to imbue the computer with this level of comprehension, quotes with words not in the dictionary are considered unsolvable in the scope of this program.
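Both effects of the dictionary can be illustrated with a short sketch. The function names pattern and possible_letters and the two word lists are assumptions made for this example, not part of the program, and a one-to-one substitution is assumed. Returning None for a word with no match is one hypothetical way a task could signal that the quote is unsolvable.

```python
def pattern(word):
    """Map a word to its letter-repetition pattern, e.g. 'level' -> (0, 1, 2, 1, 0)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word)


def possible_letters(cipher_word, dictionary):
    """Return {cipher letter: set of plaintext letters}, or None when no
    dictionary word fits, in which case the quote is treated as unsolvable."""
    target = pattern(cipher_word)
    fits = [w for w in dictionary
            if len(w) == len(cipher_word) and pattern(w) == target]
    if not fits:
        return None
    possible = {}
    for word in fits:
        for c, p in zip(cipher_word, word):
            possible.setdefault(c, set()).add(p)
    return possible


# Hypothetical five-letter dictionaries, for illustration only.
small = ["level", "hello"]
large = small + ["refer", "kayak", "civic", "jelly"]

print(possible_letters("abcba", small)["a"])          # {'l'}
print(sorted(possible_letters("abcba", large)["a"]))  # ['c', 'k', 'l', 'r']
print(possible_letters("aabcd", large))               # None
```

The same coded word leaves one possibility for its first letter with the small list and four with the large one, and a coded word whose pattern matches no entry leaves the task with nothing to work from.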




  • Written by:

    Ko-Ming Chang
    Computer Science 397
    Parallel Computing
    Dr. Thomas Whaley
    Washington and Lee University
    Lexington, VA 24450

    Questions? Mail me at kchang@liberty.uc.wlu.edu