Pattern Word Dictionary
In solving cryptograms I have often resorted to a little printed pattern word dictionary, but I was frustrated by how small it was and how often the word I needed was missing. I decided to start building my own pattern word dictionary, one that could continue to grow as I scanned more documents for more words to add to it.
The dictionary was built by a computer program that took text files from the Internet as its input. These files included technical documents, scientific papers, many novels from Project Gutenberg, several movie screenplays, and just about any other piece of text I could lay my hands on. To date I have scanned several hundred megabytes of text documents, and the dictionary contains about 52,000 words.
Of the several methods of notating patterns, I have used the convention in which the word is replaced, letter by letter, with consecutive letters of the alphabet, except that a repeated letter reuses the letter already assigned to it. For example, the word "that" has the pattern "ABCA" because the first and last letters are the same. The word "Mississippi" has the pattern "ABCCBCCBDDB".
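The convention can be sketched in a few lines of Python. This is only an illustration, not the program that built the dictionary; the function name word_pattern is made up for this example, and it assumes the word has no more than 26 distinct characters and that upper and lower case should be treated alike.

```python
def word_pattern(word):
    """Return the pattern of a word, e.g. 'that' -> 'ABCA'."""
    assigned = {}   # maps each distinct letter to its pattern letter
    pattern = []
    for ch in word.lower():   # assumption: case does not matter
        if ch not in assigned:
            # First time this letter is seen: give it the next unused
            # pattern letter (A, then B, then C, ...).
            assigned[ch] = chr(ord("A") + len(assigned))
        pattern.append(assigned[ch])
    return "".join(pattern)


print(word_pattern("that"))         # ABCA
print(word_pattern("Mississippi"))  # ABCCBCCBDDB
print(word_pattern("the"))          # ABC
```

Two words with the same pattern, such as "that" and "area", both come out as "ABCA", which is what makes the pattern usable as a lookup key.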
There are some differences between this pattern word dictionary and others I have seen. The most noticeable is that it includes words that are normally considered non-pattern words. For example, the word "the" has the pattern "ABC" and is considered a non-pattern word because no letter repeats. However, patterns that have no repeating letters are still patterns, and so I have included them.
The words are not sorted alphabetically within each pattern, but are sorted according to how frequently that word appears in all the text files scanned. Thus under the pattern "ABCA" the word "that" occurs long before the word "area" because "that" is a much more commonly used word.
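A rough sketch of how such a frequency-ordered listing could be produced is shown here in Python. It is not the actual program used for this dictionary. The names word_pattern and build_pattern_dictionary are hypothetical, and the sketch assumes a word is any run of the letters a to z, so a contraction like "don't" would be split into two pieces.

```python
import re
from collections import Counter, defaultdict


def word_pattern(word):
    """Return the pattern of a word, e.g. 'that' -> 'ABCA'."""
    assigned = {}
    return "".join(
        assigned.setdefault(ch, chr(ord("A") + len(assigned)))
        for ch in word.lower()
    )


def build_pattern_dictionary(texts):
    """Group words by pattern, most frequently seen words first."""
    counts = Counter()
    for text in texts:
        # Assumption: a word is a run of the letters a-z.
        counts.update(re.findall(r"[a-z]+", text.lower()))

    by_pattern = defaultdict(list)
    # most_common() yields words in descending order of count, so each
    # pattern's list is filled from most frequent to least frequent.
    for word, _count in counts.most_common():
        by_pattern[word_pattern(word)].append(word)
    return dict(by_pattern)


sample = ["That was that, and that area is the area that counts."]
entries = build_pattern_dictionary(sample)
print(entries["ABCA"])  # ['that', 'area']
```

Counting over all scanned files before grouping is what lets a common word like "that" land ahead of "area" under the same pattern.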
The dictionary also includes:
This pattern word dictionary is a work in progress. It probably contains some errors and duplicate words, along with some entries that aren't really words because of typos in the original text files.
Recent updates:
March 14, 2004 - Removed duplicate words. Removed bogus words from the letter 'A'. Scanned and added all words from the Old and New Testaments of the King James Bible.
The entire text of this pattern word dictionary can be downloaded here. (612 KB)
Or downloaded in zip form here. (257 KB)