The Earth Mover's Distance is a well-known distance-based similarity measure employed in various domains of data management, especially in computer vision and content-based multimedia retrieval. However, because computing the Earth Mover's Distance is expensive, efficient processing of content-based similarity queries in large multimedia databases remains a challenging issue.
In this paper, we propose using nonmetric ground distances within the Earth Mover's Distance in order to speed up its computation, thus improving the efficiency of the entire retrieval process. Moreover, by investigating the inner workings of the Earth Mover's Distance, we show how to balance the trade-off between effectiveness and efficiency in order to adapt the retrieval process to individual user requirements.
Using metric access methods in combination with the Rubner filter, we empirically show an improvement in efficiency of two orders of magnitude over the sequential scan, while keeping the retrieval error below 5%.