Syntactic analysis of natural language sentences, a basic requirement of many applied tasks, is a complex task, especially for languages with free word order. A natural solution that reduces the complexity of the input sentences is a module that determines the structure of sentences before full syntactic analysis.
We propose to use the concept of segments, sections of sentences that are easily recognizable automatically. We introduce `segmentation schemata' that describe the relationships between segments, in particular super/subordination, coordination and apposition, and parenthesis.
In this article we present a framework for developing and testing rules that automatically determine segmentation schemata. We describe two basic experiments: one that obtains segmentation patterns from trees of the Prague Dependency Treebank, and a segmentation experiment in which the rules are applied to plain text.
Furthermore, we propose measures for evaluating segmentation.