Calculate letter frequencies

Letter frequency

The letter frequency tells you how often a letter occurs in a text. If you use the tool on this page to analyse a text you will, for each letter, see the total number of times it occurs and a percentage that shows how common it is relative to all the letters in the text. By default the most common letters are listed at the top, but it is also possible to use alphabetical order. Another option is to include special characters (e.g. hyphens, spaces, dots) in the calculations.
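The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the tool on this page: the function name `letter_frequencies` and its two options are hypothetical, and it assumes that upper and lower case are counted as the same letter.

```python
from collections import Counter


def letter_frequencies(text, include_special=False, alphabetical=False):
    """Return a list of (character, count, percentage) tuples.

    Assumption: case is ignored, so 'A' and 'a' count as the same letter.
    """
    text = text.lower()
    if include_special:
        # Count every character, including spaces, dots and hyphens.
        chars = list(text)
    else:
        # Count letters only.
        chars = [c for c in text if c.isalpha()]

    counts = Counter(chars)
    total = sum(counts.values())

    if alphabetical:
        items = sorted(counts.items())
    else:
        # Most common first; ties are broken alphabetically.
        items = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    return [(ch, n, 100.0 * n / total) for ch, n in items]


for ch, n, pct in letter_frequencies("Hello, World"):
    print(f"{ch}  {n}  {pct:.1f}%")
```

The percentage is always relative to the characters that were actually counted, so switching on `include_special` lowers the percentage of every letter.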

N-gram

When analysing a text it can be interesting to look at sequences of letters and see how common they are. A group of two letters is called a bigram, and a group of three letters is called a trigram. Larger groups of letters are sometimes used when analysing written languages, but this is less common.
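Counting bigrams and trigrams works the same way as counting single letters, except that a window of n letters is moved along the text. The sketch below is a hypothetical example, not the tool's own code. It assumes that n-grams are counted inside words only, so a sequence never spans a space or punctuation mark, and that case is ignored.

```python
from collections import Counter


def ngram_counts(text, n=2):
    """Count letter n-grams (n=2 gives bigrams, n=3 gives trigrams).

    Assumption: n-grams are taken within words only.
    """
    # Replace everything that is not a letter with a space, then split into words.
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    counts = Counter()
    for word in cleaned.split():
        # Slide a window of length n over the word.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


print(ngram_counts("banana band", n=2).most_common(3))
print(ngram_counts("banana band", n=3).most_common(3))
```

Words shorter than n contribute nothing, which is one reason why longer n-grams need more text before the counts become meaningful.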

Applications

Letters are not equally common in all languages, so by comparing the letter frequencies of a text with those typical of different languages it is possible to detect which language the text is written in. The same idea can be used to help identify the author of a book or document, because each person has their own style of writing, which leads to different letter frequencies. It is often not possible to get a fully certain result, because not all texts match the average. It can also be difficult to know what an average text is, because that depends on the subject and the context in which the text was written. Using bigrams or trigrams instead of individual letters can give a more reliable result, but it requires a lot more storage, and much more text is needed before the frequency distribution stabilises.
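One simple way to make such a comparison is to turn each text into a frequency profile and pick the reference profile that lies closest. The sketch below assumes that the reference profiles are built from sample texts you supply yourself, and it uses the sum of absolute differences as the distance, which is just one of several possible choices. The names `profile`, `distance` and `closest_match` are hypothetical.

```python
from collections import Counter


def profile(text):
    """Relative frequency of each letter in the text (case ignored)."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return {}
    counts = Counter(letters)
    return {ch: n / len(letters) for ch, n in counts.items()}


def distance(p, q):
    """Sum of absolute differences between two profiles (an assumed measure)."""
    return sum(abs(p.get(ch, 0.0) - q.get(ch, 0.0)) for ch in set(p) | set(q))


def closest_match(text, references):
    """Return the name of the reference profile closest to the text.

    `references` maps a name (a language or an author) to a profile
    built with profile() from a sample text.
    """
    target = profile(text)
    return min(references, key=lambda name: distance(target, references[name]))
```

In practice each reference profile would be built from a large sample, for example `references = {"english": profile(english_sample), "swedish": profile(swedish_sample)}`, where the two sample variables are placeholders for texts of your own. With short samples the result is unreliable, for the reasons given above.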

Another application is in cryptography, to decode encrypted messages. If it is known which letters are most common in the language, it can be easier to deduce which letter each symbol in the encrypted message stands for.
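As a small illustration, consider a Caesar cipher, where every letter is shifted a fixed number of steps in the alphabet. The cipher is chosen here only because it is easy to demonstrate. The sketch assumes the message is English, where "e" is usually the most common letter, and that it is long enough for this to hold. The function names are hypothetical.

```python
from collections import Counter
import string


def shift_text(text, shift):
    """Shift every ASCII letter by `shift` positions; leave other characters alone."""
    out = []
    for c in text:
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        elif c in string.ascii_uppercase:
            out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(c)
    return "".join(out)


def guess_caesar_shift(ciphertext, assumed_most_common="e"):
    """Guess the shift by assuming the most common ciphertext letter
    stands for `assumed_most_common` (an assumption about the language)."""
    letters = [c for c in ciphertext.lower() if c in string.ascii_lowercase]
    if not letters:
        return 0
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord(assumed_most_common)) % 26


secret = shift_text("meet me where the trees are", 3)
shift = guess_caesar_shift(secret)
print(shift_text(secret, -shift))
```

For a short message the most common letter may well not be "e", in which case the guess is wrong and the other common letters have to be tried as well.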