Introducing YakuToolkit. Yakut Treebank and Morphological Analyzer.

Publication

Abstract

This poster presents the first publicly available treebank of Yakut, a Turkic language spoken in Russia, and a morphological analyzer for this language. The treebank was annotated following the Universal Dependencies (UD) framework, and the morphological analyzer can directly access and use its data.

Yakut is an under-represented language whose prominence can be raised by making reliably annotated data, and NLP tools that can process it, freely accessible. Publishing both the treebank and the analyzer serves this purpose, with the prospect that they will evolve into a benchmark for developing online NLP tools for other languages of the Turkic family.