The amount of training data in statistical machine translation is critical for translation quality. In this paper, we demonstrate how to increase translation quality for one language pair by bringing in parallel data from a closely related language.
In particular, we improve English-to-Slovak (en→sk) translation using a large Czech–English parallel corpus and a shallow (rule-based) MT system for Czech-to-Slovak (cs→sk). We explore several setup options to identify the best configuration.