InterCorp, a Parallel Corpus of 40 Languages

Publication at Faculty of Arts | 2019

Abstract

This chapter presents the 9th version of InterCorp, a parallel corpus created at the Faculty of Arts, Charles University in Prague. The corpus contains Czech texts aligned with one or more foreign-language versions; it covers 40 languages in total, Czech and 39 others.

The chapter analyses the structure and technical parameters of the corpus and describes some of the tools used with it: Kontext, a corpus query interface, and InterText, a parallel text alignment editor created specifically for the project. The chapter also discusses Treq (Translation Equivalents Database), a collection of bilingual Czech-foreign language dictionaries built automatically from InterCorp.

The last section of the chapter discusses the possibilities for methodological and linguistic exploitation of the corpus.