Charles Explorer logo
🇬🇧

Semi-Automatic Construction of Word-Formation Networks

Publication at Faculty of Mathematics and Physics |
2020

Abstract

The article presents a semi-automatic method for the construction of word-formation networks focusing particularly on derivation. The proposed approach applies a sequential pattern mining technique to construct useful morphological features in an unsupervised manner.

The features take the form of regular expressions and later are used to feed a machine-learned ranking model. The network is constructed by applying the learned model to sort the lists of possible base words and selecting the most probable ones.

This approach, besides relatively small training set and a lexicon, does not require any additional language resources such as a list of vowel and consonant alternations, part-of-speech tags etc. The proposed approach is evaluated on lexeme sets of four languages, namely Polish, Spanish, Czech, and French.

The conducted experiments demonstrate the ability of the proposed method to construct linguistically adequate word-formation networks from small training sets. Furthermore, the performed feasibi