Minimum redundancy feature selection

Feature selection is one of the basic problems in pattern recognition and machine learning. It has applications in many areas, such as cancer diagnosis and speaker recognition.

Features can be selected in many different ways. One scheme is to select the features that correlate most strongly with the classification variable. This has been called maximum-relevance selection. Many heuristic search algorithms can be used, such as sequential forward, sequential backward, or sequential floating selection.
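As an illustration, the simplest form of maximum-relevance selection ranks each feature on its own and keeps the top k. The sketch below is only one possible reading of the scheme: it assumes numeric features, uses the absolute Pearson correlation as the relevance score, and the function name `max_relevance` is hypothetical. It does not implement the sequential forward, backward, or floating searches.

```python
import numpy as np

def max_relevance(X, y, k):
    """Return the indices of the k features most correlated with y.

    Assumption: relevance is the absolute Pearson correlation between
    a feature column and the classification variable y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the columns and the target.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Pearson correlation of every column with y, computed in one pass.
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Constant columns (zero variance) are given relevance 0.
        corr = np.where(denom > 0, Xc.T @ yc / denom, 0.0)

    # Sort by decreasing absolute correlation; stable sort keeps ties in index order.
    order = np.argsort(-np.abs(corr), kind="stable")
    return order[:k].tolist()
```

Because each feature is scored independently, two nearly identical features can both be selected, which is the weakness the redundancy-aware scheme addresses.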

On the other hand, features can be selected so that they are mutually dissimilar while each still has a "high" correlation with the classification variable. Features that are strongly correlated with one another carry largely overlapping information, so keeping all of them adds little. This scheme, termed minimum-redundancy-maximum-relevance (mRMR) selection, has been found to be more powerful than maximum-relevance selection.
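A minimal sketch of this idea is a greedy search that, at each step, adds the feature with the best trade-off between relevance and redundancy. The details here are assumptions made for illustration: relevance and redundancy are both measured by absolute Pearson correlation, the two are combined by subtraction, and the function name `mrmr_correlation` is hypothetical.

```python
import numpy as np

def mrmr_correlation(X, y, k):
    """Greedily pick k features with high relevance and low redundancy.

    Assumptions: relevance is |corr(feature, y)|, redundancy is the mean
    |corr(feature, already selected feature)|, and the score is their difference.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    if k <= 0 or n_features == 0:
        return []

    # Absolute correlation matrix over all features plus the target (last column).
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr = np.nan_to_num(corr)  # constant columns yield NaN; treat them as 0

    relevance = corr[:n_features, -1]
    redundancy = corr[:n_features, :n_features]

    # Start from the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        # Score each candidate: relevance minus mean redundancy with the chosen set.
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

The returned list is in selection order, so the first entry is the most relevant feature and each later entry is the best complement to those already chosen.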

As a special case, the "correlation" can be replaced by the statistical dependency between variables, which can be quantified by mutual information. In this case, it has been shown that mRMR approximates maximizing the dependency between the joint distribution of the selected features and the classification variable.
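For reference, the mutual-information form is commonly written as follows for discrete variables. The notation is introduced here for illustration and does not come from the text above: S is the selected feature set, x_i is a feature, c is the classification variable, D is the relevance, and R is the redundancy. The difference D - R is one common way of combining the two terms.

```latex
% Mutual information between two discrete variables x and y
I(x; y) = \sum_{x} \sum_{y} p(x, y) \, \log \frac{p(x, y)}{p(x)\, p(y)}

% Relevance: mean mutual information between selected features and the class
D(S, c) = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c)

% Redundancy: mean mutual information among the selected features
R(S) = \frac{1}{|S|^{2}} \sum_{x_i, x_j \in S} I(x_i; x_j)

% mRMR criterion in difference form
\max_{S} \; \bigl[ D(S, c) - R(S) \bigr]
```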