In recent years, machine translation (MT) research has focused on how hybrid MT and MT combination systems can be designed so that the resulting translations improve on the individual translations. As a first step towards this objective, we have developed a parallel corpus containing source data and the output of a number of MT systems, annotated with metadata that captures aspects of the translation process performed by the different MT systems.
As a second step, we have organised a shared task in which participants were asked to build hybrid MT or system combination systems using the annotated corpus as input. The main focus of the shared task is to answer the following question: can hybrid MT algorithms or system combination techniques benefit from the extra information (linguistically motivated, decoding and runtime) provided by the different systems involved? In this paper, we describe the annotated corpus we have created.
We provide an overview on the partici