The JEROME corpus is a monolingual comparable corpus designed specifically for research on translated Czech in comparison with original, non-translated Czech. It comprises 69 million tokens and covers two text types: fiction and professional literature.
Within the JEROME corpus, a subcorpus balanced by source language has been made available for research on so-called translation universals.