Relation among Segments - Segmentation Schemas of Czech Sentences

Publication at Faculty of Mathematics and Physics | 2008

Abstract

Syntactic analysis of natural language sentences, a basic requirement of many applied tasks, is complex, especially for languages with free word order. A natural way to reduce the complexity of the input sentences is a module that determines the structure of each sentence before full syntactic analysis.

We propose to use the concept of segments: sections of a sentence that are easy to recognize automatically. We introduce 'segmentation schemata' that describe the relations between segments, in particular super/subordination, coordination and apposition, and parenthesis.

In this article we present a framework for developing and testing rules that automatically determine segmentation schemata. We describe two basic experiments: one that obtains segmentation patterns from trees in the Prague Dependency Treebank, and one that applies the segmentation rules to plain text.

Furthermore, we propose measures for evaluating segmentation