The SYN concept: towards one-billion corpus of Czech

Publication at the Faculty of Arts (Filozofická fakulta) | 2009

Abstract

The paper describes the SYN corpus, a unification of synchronic written corpora of Czech, consistently re-processed with state-of-the-art versions of the available tools. After the inclusion of the newspaper corpus SYN2009PUB, its size will reach 1.2 billion tokens.