Background: Microarray technologies simultaneously measure the expression of thousands of genes using ribonucleic acid (RNA) obtained from a biological sample. Typical statistical analyses of such data include 1) finding genes that are differentially expressed between two or more experimental groups, 2) finding a small set of genes that allows a sample to be correctly classified into a group, and 3) finding relations among genes.
Objectives: Gene expression data are high dimensional, which complicates their analysis: only a few samples (e.g. peripheral blood from a limited number of patients) are available for a set of thousands of genes. The main purpose of this paper is to present the shrinkage estimator and show its application in different statistical analyses.
Methods: The shrinkage approach shifts the value of a classic estimator towards the value of a specified target estimator. More precisely, the shrinkage estimator is a weighted average of the classic estimator and the target estimator.
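The weighted average described above can be illustrated with a minimal sketch. The function name `shrink`, the weight name `lam`, and the numbers are assumptions made for illustration only; they are not taken from the paper, and the choice of target and weight is left open here.

```python
def shrink(classic, target, lam):
    """Weighted average of a classic estimate and a target estimate.

    lam is the shrinkage weight in [0, 1] (a hypothetical name):
    lam = 0 returns the classic estimate, lam = 1 returns the target.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * t + (1.0 - lam) * c for c, t in zip(classic, target)]


# Illustrative numbers only: three per-gene estimates obtained from a few
# samples, shifted halfway towards a common target value (here their median).
classic_estimates = [0.2, 1.0, 4.0]
target_estimates = [1.0, 1.0, 1.0]
shrunk = shrink(classic_estimates, target_estimates, lam=0.5)
```

With these inputs, the extreme estimates move towards the target while an estimate that already equals the target is unchanged.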
Results: The benefit of the shrinkage estimator is that it achieves a lower mean squared error (MSE) than a classic estimator. The MSE combines a measure of the estimator’s bias, i.e. its systematic deviation from the true unknown value, with a measure of the estimator’s variability.
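For reference, the combination of bias and variability in the MSE is the standard bias-variance decomposition. The symbols are assumptions introduced here: $\theta$ denotes the true unknown value and $\hat{\theta}$ an estimator of it.

```latex
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\!\left[(\hat{\theta} - \theta)^{2}\right]
  = \underbrace{\left(\mathbb{E}[\hat{\theta}] - \theta\right)^{2}}_{\text{squared bias}}
  + \underbrace{\operatorname{Var}(\hat{\theta})}_{\text{variability}}
```

An estimator that accepts some bias can therefore still have a lower MSE, provided its variance is reduced by more than the squared bias it introduces.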
The shrinkage estimator is a biased estimator but has a lower variability.
Conclusions: The shrinkage estimator can be considered as a promising estimator for analyzing high dimensional gene expression data.