Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning

Publication

Abstract

In recent years, cross-lingual transfer learning has shown increasingly positive results across NLP tasks. This research aims to develop a dependency parser for Indonesian using cross-lingual transfer learning.

The dependency parser uses a Transformer as the encoder and deep biaffine attention as the decoder. The model is first trained on a source language and then fine-tuned on Indonesian, our target language.
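To make the decoder side concrete, the following is a minimal sketch of biaffine arc scoring, in which the score for token i taking token j as its head is a bilinear term plus a head-only bias term. It is an illustration under stated assumptions, not the authors' implementation: the function names, the tiny hand-picked vectors and weights, and the greedy head selection are all hypothetical, and a real parser would feed Transformer encoder outputs through learned projections and decode a well-formed tree.

```python
def biaffine_arc_scores(dep, head, U, u):
    """Return scores[i][j] = dep[i]^T U head[j] + u^T head[j].

    dep, head: per-token vectors (lists of floats) of dimension d.
    U: d x d weight matrix, u: length-d bias vector (hypothetical values here).
    """
    d = len(u)
    scores = []
    for dep_vec in dep:
        # Row vector dep_vec^T U, reused for every candidate head.
        dep_U = [sum(dep_vec[a] * U[a][b] for a in range(d)) for b in range(d)]
        row = []
        for head_vec in head:
            bilinear = sum(dep_U[b] * head_vec[b] for b in range(d))
            head_bias = sum(u[b] * head_vec[b] for b in range(d))
            row.append(bilinear + head_bias)
        scores.append(row)
    return scores


def greedy_heads(scores):
    """Pick the highest-scoring head per token (assumption: no tree constraint)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy inputs standing in for encoder outputs; all values are illustrative.
toy_dep = [[1, 0], [0, 1]]
toy_head = [[1, 0], [0, 1]]
toy_U = [[0, 1], [1, 0]]
toy_u = [0, 0]

toy_scores = biaffine_arc_scores(toy_dep, toy_head, toy_U, toy_u)
toy_heads = greedy_heads(toy_scores)
```

In this toy setup the weight matrix swaps the two dimensions, so each token prefers the other token as its head. Label prediction would use a second scorer of the same form over dependency relation types.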

We choose four source languages for comparison: French, Italian, Slovenian, and English. Our proposed approach improves the performance of the dependency parser for Indonesian, the target language, in both same-domain and cross-domain testing.

Compared to the baseline model, our best model improves UAS by up to 4.31% and LAS by up to 4.46%. Among the source-language dependency treebanks, French and Italian, which were selected based on LangRank output, perform better than the languages selected based on other criteria.
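For readers unfamiliar with the two metrics, UAS (unlabeled attachment score) is the percentage of tokens whose predicted head is correct, and LAS (labeled attachment score) additionally requires the dependency label to match. The sketch below computes both for a single toy sentence. The function name and the example annotations are hypothetical, and it assumes every token is scored (some evaluation setups exclude punctuation).

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) as percentages.

    gold, pred: lists of (head_index, relation_label) tuples, one per token,
    aligned token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    total = len(gold)
    # UAS: head matches; LAS: head and label both match.
    uas_hits = sum(1 for g, p in zip(gold, pred) if g[0] == p[0])
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * uas_hits / total, 100.0 * las_hits / total


# Illustrative four-token sentence; head index 0 denotes the root.
toy_gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
toy_pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]

toy_uas, toy_las = attachment_scores(toy_gold, toy_pred)
```

Here three of four heads are correct and two of four head-label pairs are correct, so LAS can never exceed UAS for the same prediction.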

French, which has the highest rank from LangRank, performs the best on cross-lingual transfer learning for the dependency parser model.

Keywords

cross-domain, cross-lingual transfer learning, dependency parser, transformer