Gibbs Sampling Segmentation of Parallel Dependency Trees for Tree-Based Machine Translation

Publication at Faculty of Mathematics and Physics

2016

Abstract

We present work in progress aimed at extracting translation pairs of source and target dependency treelets for use in a dependency-based machine translation system. We introduce a novel unsupervised method for parallel tree segmentation based on Gibbs sampling.

Using data from a Czech-English parallel treebank, we show that the procedure converges to a dictionary of reasonably sized treelets; in some cases, the resulting segmentation appears to have an interesting linguistic interpretation.
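
The abstract does not spell out the sampler itself, so the following is only a minimal sketch of how Gibbs sampling can segment dependency trees into treelets; it should not be read as the authors' procedure. It makes several assumptions that are not in the source: it works on single (monolingual) trees rather than aligned source-target pairs, it scores treelets with a Chinese-restaurant-process style prior whose base distribution favours small treelets, and all names (`gibbs_segment`, `treelet_key`, `alpha`, `p_stop`) are hypothetical. Each tree is a pair of a word list and a parent-index list, with -1 marking the root; every edge carries a binary "cut" flag that is resampled in turn while all other flags are held fixed.

```python
import random
from collections import Counter

def treelet_key(words, parents, cut, top):
    """Bracketed string for the treelet rooted at `top` (uncut edges only)."""
    kids = [c for c in range(len(parents)) if parents[c] == top and not cut[c]]
    inner = " ".join(treelet_key(words, parents, cut, c) for c in kids)
    return f"({words[top]} {inner})" if inner else f"({words[top]})"

def gibbs_segment(trees, iterations=50, alpha=1.0, p_stop=0.5, seed=0):
    """Toy Gibbs sampler over edge cuts. Assumes words contain no '('."""
    rng = random.Random(seed)
    vocab = len({w for words, _ in trees for w in words})

    def base(key):
        # Assumed base distribution: geometric in treelet size, uniform words.
        size = key.count("(")
        return p_stop * (1 - p_stop) ** (size - 1) * (1.0 / vocab) ** size

    # Initial state: every edge is cut, so every node is its own treelet.
    cuts = [[True] * len(parents) for _, parents in trees]
    counts = Counter(f"({w})" for words, _ in trees for w in words)
    total = sum(counts.values())

    for _ in range(iterations):
        for (words, parents), cut in zip(trees, cuts):
            for n in range(len(parents)):
                if parents[n] == -1:
                    continue  # the root has no incoming edge to sample
                was_cut = cut[n]
                # Build the treelets under both hypotheses for edge n -> parent.
                cut[n] = True
                top = parents[n]
                while parents[top] != -1 and not cut[top]:
                    top = parents[top]
                upper = treelet_key(words, parents, cut, top)
                lower = treelet_key(words, parents, cut, n)
                cut[n] = False
                joined = treelet_key(words, parents, cut, top)
                # Remove the current analysis of this edge from the counts.
                if was_cut:
                    counts[upper] -= 1
                    counts[lower] -= 1
                    total -= 2
                else:
                    counts[joined] -= 1
                    total -= 1
                # CRP predictive probabilities of the two hypotheses.
                p_join = (counts[joined] + alpha * base(joined)) / (total + alpha)
                p_cut = ((counts[upper] + alpha * base(upper)) / (total + alpha)
                         * (counts[lower] + (upper == lower) + alpha * base(lower))
                         / (total + 1 + alpha))
                cut[n] = rng.random() < p_cut / (p_cut + p_join)
                # Add the sampled analysis back.
                if cut[n]:
                    counts[upper] += 1
                    counts[lower] += 1
                    total += 2
                else:
                    counts[joined] += 1
                    total += 1
    return cuts, +counts  # unary plus drops treelets whose count fell to zero

if __name__ == "__main__":
    toy = [(["likes", "John", "Mary"], [-1, 0, 0]),
           (["likes", "John", "tea"], [-1, 0, 0])]
    _, dictionary = gibbs_segment(toy, iterations=20, seed=1)
    for key, count in dictionary.most_common():
        print(count, key)
```

The returned counter plays the role of a treelet dictionary. Extending the sketch to the parallel setting described above would require treelet pairs and an alignment between source and target nodes, which this example leaves out.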

Keywords

Gibbs sampling, segmentation, parallel dependency trees, tree-based machine translation