Extracting a Lexicon of Discourse Connectives in Czech from an Annotated Corpus

Publication at Faculty of Mathematics and Physics | 2017

Abstract

We discuss the process of exploiting a large corpus manually annotated with discourse relations, the Prague Discourse Treebank 2.0, to create a lexicon of Czech discourse connectives (CzeDLex). We present theoretical aspects of the project and a technical solution, based on the XML-based Prague Markup Language, that allows the lexicon to be incorporated efficiently into the family of Prague treebanks.