This paper focuses on combining association measures using corresponding receiver operating characteristic curves. The approach is motivated by a problem of automatic bigram collocation extraction from the field of computational linguistics.
It is based on supervised machine learning techniques and the fact that different association measures discover different collocation types. Clusters of equivalent ROC curves are first determined by a testing procedure.
The paper's major contribution is an investigation of the possibility of combining representatives of the clusters of equivalent association measures into more complex models, thus improving performance of the collocation extraction.