Charles Explorer logo
🇨🇿

More data and new tools. Advances in parsing the Index Thomisticus Treebank

Publikace

Tento text není v aktuálním jazyce dostupný. Zobrazuje se verze "en".Abstrakt

This paper investigates the recent advances in parsing the Index Thomisticus Treebank, which encompasses Medieval Latin texts by Thomas Aquinas. The research focuses on two types of variables.

On the one hand, it examines the impact that a larger dataset has on the results of parsing; on the other hand, performances of new parsers are analysed with respect to less recent tools. Term of comparison to determine the effective parsing advances are the results in parsing the Index Thomisticus Treebank described in a previous work.

First, the best performing parser among those concerned in that study is tested on a larger dataset than the one originally used. Then, some parser combinations that were developed in the same study are evaluated as well, assessing that more training data result in more accurate performances.

Finally, to examine the impact that newly available tools have on parsing results, we train, test, and evaluate two neural parsers chosen among those best performing in the CoNLL 2018 Shared Task. Our experiments reach the highest accuracy rates achieved so far in automatic syntactic parsing of the Index Thomisticus Treebank and of Latin overall.

Klíčová slova