In this paper, we compare linear mixed-effects models (LMMs) that use average information content (IC) and frequency as predictors of the lengths of aspect-marked verbs. IC is the information that target elements convey to their context.
Focusing on typologically diverse languages, we took dependency frames and n-grams as contexts and found that IC estimated from n-grams outperforms IC estimated from dependency frames: models that use IC from n-grams achieve high correlations between predicted and actual verb lengths, while models that use IC from dependency frames perform poorly. We found predictive effects of IC in only a few languages.