Relation among Segments - Segmentation Schemas of Czech Sentences

Publication at Faculty of Mathematics and Physics

2008

Abstract

Syntactic analysis of natural language sentences, a basic requirement of many applied tasks, is complex, especially for languages with free word order. A natural way to reduce the complexity of the input sentences is a module that determines the structure of a sentence before full syntactic analysis.

We propose to use the concept of segments, sections of a sentence that are easy to recognize automatically. We introduce "segmentation schemata", which describe the relationships between segments: in particular, super/subordination, coordination and apposition, and parenthesis.
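As an illustration only, a segmentation schema can be pictured as a list of segments paired with embedding levels. The sketch below is not the method of the article: the boundary inventory (punctuation only), the helper names `split_segments` and `relation`, and the hand-assigned levels are all assumptions made for this example.

```python
import re

# Assumption: segment boundaries are punctuation marks only. A realistic
# boundary inventory would likely include conjunctions as well.
BOUNDARY = re.compile(r"\s*([,;()])\s*")


def split_segments(sentence):
    """Split a sentence into segments at boundary punctuation."""
    parts = BOUNDARY.split(sentence.rstrip(". "))
    # With a capturing group, re.split alternates text and delimiters,
    # so the even positions hold the segment text.
    return [p for p in parts[::2] if p]


def relation(level_a, level_b):
    """Name the relation of a segment at level_b to one at level_a."""
    if level_b > level_a:
        return "subordination"
    if level_b < level_a:
        return "superordination"
    # Equal levels cannot tell coordination from apposition by themselves.
    return "same level"


sentence = "Petr řekl, že přijde, ale nepřišel."
segments = split_segments(sentence)

# Hand-assigned levels for this one sentence (illustrative, not computed):
# the "že" clause is embedded, the "ale" clause returns to the main level.
levels = [0, 1, 0]
schema = list(zip(segments, levels))

for (seg_a, lvl_a), (seg_b, lvl_b) in zip(schema, schema[1:]):
    print(f"{seg_a!r} -> {seg_b!r}: {relation(lvl_a, lvl_b)}")
```

Parenthesis is not modelled here; it would need its own marker in the schema rather than a bare level.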

In this article we present a framework for developing and testing rules that determine segmentation schemata automatically. We describe two basic experiments: one that obtains segmentation schemata from trees in the Prague Dependency Treebank, and one that applies the segmentation rules to plain text.

Furthermore, we propose measures for evaluating segmentation.

Keywords

Relation among Segments, Segmentation Schemas, Czech Sentences