Charles Explorer logo
🇬🇧

Clustering Problems on Sliding Windows

Publication at Faculty of Mathematics and Physics |
2016

Abstract

We explore clustering problems in the streaming sliding window model in both general metric spaces and Euclidean space. We present the first polylogarithmic space O(1)-approximation to the metric k-median and metric k-means problems in the sliding window model, answering the main open problem posed by Babcock, Datar, Motwani and O'Callaghan [5], which has remained unanswered for over a decade.

Our algorithm uses O(k3 log6 W) space and poly(k, log W) update time, where W is the window size. This is an exponential improvement on the space required by the technique due to Babcock, et al.

We introduce a data structure that extends smooth histograms as introduced by Braverman and Ostrovsky [11] to operate on a broader class of functions. In particular, we show that using only polylogarithmic space we can maintain a summary of the current window from which we can construct an O(1)-approximate clustering solution.