Robust methods have been proposed for content- and topic-based text classification, as well as for authorship attribution in stylometry. However, the problem of fine-grained literary genre (style) recognition is much less studied.
We present several approaches to the recognition of eight literary genres manually annotated in a large corpus of Polish blogs. Different text representations were combined with neural network classifiers, including deep recurrent neural networks.
Very good results were achieved when blog posts were represented with pre-trained fastText word embeddings and classified with a Bi-GRU deep recurrent neural network. Because the good performance of this classifier could result from topical bias across genres, we also ran experiments on a selected sub-corpus in which the dominance of the most frequent topic was reduced; no significant change was observed.
© 2021 The Authors. Published by Elsevier B.V.
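The topical-bias check described above can be illustrated with a minimal sketch of one possible way to build such a sub-corpus. The abstract does not state how the sub-corpus was actually selected, so everything here is an assumption: the function name `reduce_topic_dominance`, the record keys `"text"`, `"genre"` and `"topic"`, and the rule of downsampling the most frequent topic to the size of the second most frequent one are all hypothetical.

```python
import random
from collections import Counter


def reduce_topic_dominance(posts, seed=0):
    """Downsample the most frequent topic to the size of the runner-up.

    `posts` is a list of dicts with the hypothetical keys
    "text", "genre" and "topic". This is an illustrative rule,
    not the selection procedure used in the paper.
    """
    counts = Counter(p["topic"] for p in posts)
    if len(counts) < 2:
        # Only one topic present: there is nothing to balance against.
        return list(posts)

    # The two largest topics; the second one sets the cap.
    (top_topic, _), (_, cap) = counts.most_common(2)

    dominant = [p for p in posts if p["topic"] == top_topic]
    others = [p for p in posts if p["topic"] != top_topic]

    # Fixed seed so the sampled sub-corpus is reproducible.
    rng = random.Random(seed)
    return others + rng.sample(dominant, cap)
```

Under these assumptions, the same classifier would be trained and evaluated once on the full corpus and once on the output of `reduce_topic_dominance`. Comparable scores on both would suggest that the classifier is not simply exploiting the dominant topic.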