This paper focuses on wrapper-based feature selection for a 1-nearest neighbor classifier. we consider in particular the case of a small sample size with a few hundred instances, which is common in biomedical applications. We propose a technique for calculating the complete bootstrap for a 1-nearest-neighbor classifier (i.e., averaging over all desired test/train partitions of the data).
The complete bootstrap and the complete cross-validation error estimate with lower variance are applied as novel selection criteria and are compared with the standard bootstrap and cross-validation in combination with three optimization techniques sequential forward selection (SFS), binary particle swarm optimization (BPSO) and simplified social impact theory based optimization (SSITO). The experimental comparison based on ten datasets draws the following conclusions: for all three search methods examined here, the complete criteria are a significantly better choice than standard 2-fold cross-validation, 10-fold cross-validation and bootstrap with 50 trials irrespective of the selected output number of iterations.
All the complete criterion-based 1NN wrappers with SFS search performed better than the widely-used FILTER and SIMBA methods. We also demonstrate the benefits and properties of our approaches on an important and novel real-world application of automatic detection of the subthalamic nucleus.