A vast number of datasets are available as Open Data on the Web. However, it is challenging for consumers to find the datasets relevant to their goals.
This is because the metadata available in data catalogs is not descriptive enough. Yet datasets exist in various types of contexts that are not expressed in the metadata.
These contexts may include, for example, information about the data publisher or the legislation related to the publication of the dataset. In this paper, we describe the idea of a data model that enables consumers to better understand the data.
We propose to define a formal model for representing datasets and their contexts. We further propose to apply existing similarity techniques, adjust them to fit each identified type of dataset context, and combine them to measure the similarity of datasets in new ways, thereby improving their findability.
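
To make the idea of combining per-context similarities concrete, the following is a minimal illustrative sketch, not the model proposed in the paper. It assumes that each dataset context can be reduced to a set of identifiers, that Jaccard similarity is an acceptable stand-in for a context-specific technique, and that contexts are combined by a weighted average. The names `jaccard`, `combined_similarity`, the context keys, and the example identifiers are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical representation: context type -> set of identifiers
# describing the dataset in that context.
Dataset = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(d1: Dataset, d2: Dataset, weights: Dict[str, float]) -> float:
    """Weighted average of per-context similarities (an assumed combination rule)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = 0.0
    for context, weight in weights.items():
        # A context missing from a dataset is treated as an empty set.
        score += weight * jaccard(d1.get(context, set()), d2.get(context, set()))
    return score / total


# Two made-up datasets sharing a publisher and one of two legal acts.
d1 = {"publisher": {"publisher-A"}, "legislation": {"act-1", "act-2"}}
d2 = {"publisher": {"publisher-A"}, "legislation": {"act-1"}}
weights = {"publisher": 0.5, "legislation": 0.5}

print(combined_similarity(d1, d2, weights))  # 0.75
```

In this sketch the weights express how much each context type contributes to the overall score; in practice, each context type would likely need its own tailored similarity measure rather than a single set-based one.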