Version 9 of InterCorp contains 1,460 million words in foreign languages, including 232 million words in the core part and 1,229 million words in the collections. The Czech texts total 187 million words, including 97 million words in the core and 90 million in the collections.
Romany was added as a new language. The newly tagged and lemmatized languages are Croatian, Serbian and Latvian.
Serbian texts written in Cyrillic were converted into Latin script. A new procedure for selecting newly added texts improved the representation of individual languages.
The names of authors and translators within a language were unified.