Languages and corpora

Already available

Most of our data, tools and services are based on a large sample of language called a corpus. Word lists, n-grams, lexical databases and any other data we supply are generated from these corpora. We are constantly developing new corpora and increasing our language coverage. These are the languages and corpora currently available.

Language support development

We have ample experience in developing support for new languages and building new text corpora. If your language is not yet supported or you need new data, please ask us to develop support for it.

Languages and corpora already available