InterCorp (release 7)

Publication

Abstract

A new release of the InterCorp parallel corpus features a large collection of film subtitles. The total size of foreign-language texts has reached 173 million tokens in the core and 1.2 billion tokens in collections, and the number of languages has increased to 38.