This work presents ongoing research on finding illustrative images for “soft news” articles: magazine-style texts for which the appropriateness of an illustrative image is not judged primarily by its descriptiveness. We describe our baselines and experiments using Denoising Autoencoders and present the web-pic multimodal dataset of Czech news articles and their accompanying images.
Finally, we briefly present the Safire library, a Python framework for building and analyzing complex experimental pipelines suitable for multimodal tasks.