Detecting and Correcting Errors in an English Tectogrammatical Annotation

Publication at Faculty of Mathematics and Physics | 2009

Abstract

We present our first experiments with detecting and correcting errors in a manual annotation of English texts taken from the Penn Treebank. The annotation is at the dependency-based tectogrammatical layer as defined in the Prague Dependency Treebank. The main idea is that annotation errors usually result in an inconsistency, i.e. a state in which the same phenomenon is annotated in different ways at several places in the corpus.

We describe our algorithm for detecting inconsistencies, which received positive feedback from annotators, and we present statistics on the manually corrected data and the results of a tectogrammatical analyzer that uses these data. So far the corrections have improved the data only slightly, but we outline some ways to achieve more significant improvement.
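
The notion of inconsistency used in the abstract can be illustrated with a minimal Python sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes, purely for illustration, that each annotated instance is reduced to a (context, label) pair, where the context is a hypothetical (parent lemma, child lemma) key and the label is the assigned functor. The function name `find_inconsistencies` and the toy data are invented for this example.

```python
from collections import Counter, defaultdict


def find_inconsistencies(instances):
    """Return contexts that carry more than one distinct label.

    `instances` is an iterable of (context, label) pairs. What counts as a
    "context" is an assumption of this sketch, not the paper's definition.
    """
    labels_by_context = defaultdict(Counter)
    for context, label in instances:
        labels_by_context[context][label] += 1
    # A context is flagged when the same phenomenon received different labels.
    return {
        context: dict(counts)
        for context, counts in labels_by_context.items()
        if len(counts) > 1
    }


# Toy data (invented): context = (parent lemma, child lemma), label = functor.
toy = [
    (("give", "book"), "PAT"),
    (("give", "book"), "PAT"),
    (("give", "book"), "ADDR"),
    (("arrive", "yesterday"), "TWHEN"),
]

result = find_inconsistencies(toy)
print(result)
```

In this toy data only the context ("give", "book") is flagged, because it occurs with two different labels. The label counts hint at which variant is the likely error, but the decision would still be left to a human annotator.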