Modern data collection techniques often produce data sets with thousands of features. We propose an information-theoretic methodology for identifying informative features and apply it to data sets with predominantly binary and ordinal features.
The proposed technique benefits from the robustness that redundant features provide. We show that, although direct calculation from the definition has cubic complexity, sparse structures can be processed in nearly linear time.