Charles Explorer logo
🇬🇧

The complexity of computation and approximation of the t-ratio over one-dimensional interval data

Publication at Faculty of Mathematics and Physics |
2014

Abstract

The main question is how to compute the upper and lower limits of the range of possible values of a given statistic, when the data range over given intervals. Initially some well-known statistics, such as sample mean, sample variance or F-ratio, are considered in order to illustrate that in some cases the limits can be computed efficiently, while in some cases their computation is NP-hard.

Subsequently, the t-ratio (variation coefficient) is considered. It is investigated when the limits for t-ratio are computable in polynomial time and a new efficient algorithm is designed for this case.

Conversely, complementary NP-hardness results are proved, demonstrating the cases when the computation cannot be done efficiently. Subsequently, the NP-hardness results are strengthened: it is shown that under certain assumptions, even an approximate evaluation with an arbitrary absolute error is NP-hard.

Finally, it is shown that the situation can also be (in some sense) regarded positively: a new pseudopolynomial algorithm is developed. The algorithm is of practical importance, especially when the dataset to be processed is large and does not contain "excessively" large numbers.