We generate bigram, trigram and longer n-gram lists according to the customer’s specifications. For typing prediction, trigrams perform much better than bigrams, and our corpora in many languages are large enough to yield a sufficient number of trigrams for this purpose.
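To illustrate why trigrams are useful for typing prediction, here is a minimal Python sketch that suggests the next word from the two preceding words. It is not our production method: the function names `build_trigram_model` and `predict_next` and the toy corpus are assumptions made up for this example, and a real system would be built from a far larger corpus with smoothing and backoff.

```python
from collections import Counter, defaultdict


def build_trigram_model(tokens):
    """Map each pair of consecutive words to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        model[(w1, w2)][w3] += 1
    return model


def predict_next(model, w1, w2, k=3):
    """Return up to k of the most frequent continuations of the word pair (w1, w2)."""
    followers = model.get((w1, w2))
    if not followers:
        return []
    return [word for word, _ in followers.most_common(k)]


# Toy corpus, invented for illustration only.
toy_tokens = "i want to go home . i want to go out . i want to sleep .".split()
model = build_trigram_model(toy_tokens)
suggestions = predict_next(model, "want", "to")
print(suggestions)
```

A bigram model would condition on only one preceding word, so the suggestions after "to" could not take "want" into account. The extra word of context is what makes trigram suggestions more specific, provided the corpus is large enough to have seen the word pair.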
All bigrams, trigrams and n-grams in a language
Our corpora are large enough to generate lists of all n-grams in use in a language. Such a language database can contain hundreds of millions of n-grams. It can be filtered according to criteria specified by the customer and delivered as a download in many formats.
Typically, an n-gram database includes a frequency count for each n-gram, but we can also meet further requirements specified by the customer.
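As a rough sketch of what an n-gram list with frequencies looks like, the Python example below counts n-grams in a token sequence, filters them by a minimum frequency, and writes them as tab-separated text. The function names, the `min_count` filter criterion and the TSV layout are assumptions chosen for illustration; actual filter criteria and delivery formats are agreed with the customer.

```python
import csv
import io
from collections import Counter


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def filter_ngrams(counts, min_count=1):
    """Keep only n-grams that occur at least min_count times."""
    return {ngram: c for ngram, c in counts.items() if c >= min_count}


def to_tsv(counts):
    """Serialise n-grams as 'n-gram<TAB>frequency' lines, most frequent first."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    for ngram, count in ordered:
        writer.writerow([" ".join(ngram), count])
    return buf.getvalue()


# Toy input, invented for illustration only.
tokens = "the cat sat on the mat and the cat slept".split()
bigrams = count_ngrams(tokens, 2)
frequent = filter_ngrams(bigrams, min_count=2)
print(to_tsv(frequent))
```

The same `count_ngrams` function covers bigrams, trigrams and longer n-grams by changing `n`. At the scale of hundreds of millions of n-grams, the counts would no longer fit comfortably in a single in-memory dictionary, so this sketch only shows the shape of the data, not a scalable pipeline.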