In this paper, we show that automatically generated questions and answers can be used to evaluate the quality of Machine Translation systems. Building on recent work on the evaluation of abstractive text summarization, we propose a new metric for system-level Machine Translation evaluation, compare it against other state-of-the-art metrics, and demonstrate its robustness through experiments across multiple translation directions.
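To make the core idea concrete, the following is a minimal sketch of QA-based evaluation of a candidate translation: questions and gold answers are derived from the reference, the questions are answered against the candidate, and the answer overlap is averaged. The helper names `generate_qa_pairs` and `answer_question`, and the use of token-level F1 as the overlap measure, are illustrative assumptions and not necessarily the exact components of the proposed metric.

```python
# Illustrative sketch of QA-based MT evaluation (not the paper's exact pipeline).
# `generate_qa_pairs` and `answer_question` are hypothetical wrappers around
# question-generation and question-answering models; only the scoring is spelled out.

from collections import Counter
from typing import Callable, List, Tuple


def token_f1(predicted: str, gold: str) -> float:
    """Token-level F1 overlap between a predicted answer and the gold answer."""
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def qa_mt_score(
    reference: str,
    hypothesis: str,
    generate_qa_pairs: Callable[[str], List[Tuple[str, str]]],
    answer_question: Callable[[str, str], str],
) -> float:
    """Answer reference-derived questions against the hypothesis and
    return the average answer overlap as the segment score."""
    qa_pairs = generate_qa_pairs(reference)  # list of (question, gold answer)
    if not qa_pairs:
        return 0.0
    scores = [
        token_f1(answer_question(question, hypothesis), gold_answer)
        for question, gold_answer in qa_pairs
    ]
    return sum(scores) / len(scores)
```

A system-level score would then be obtained by averaging such segment scores over a test set, which is the granularity at which the metric is compared against other evaluation approaches.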