Classifier Domains of Competence in Data Complexity Space
01 January 2005
We study the domains of competence of a set of popular classifiers using a methodology that relates each classifier's behavior to problem complexity. We find that the simplest classifiers, the nearest neighbor and the linear classifier, show extreme behavior: they tend to be either the best approach for certain types of problems or the worst approach for others. We also find that the domain of competence of the nearest neighbor is almost the opposite of that of the linear classifier. Ensemble methods such as decision forests do not stand out on any particular set of problems but perform more robustly overall. A by-product of this study is the identification of the most relevant features for optimal classifier selection.
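To make the contrast between the two simplest classifiers concrete, the following is a minimal sketch, not the study's actual methodology or data. It assumes a nearest-class-mean rule as a stand-in for the linear classifier (for two classes its decision boundary is a straight line) and uses two tiny hand-made problems, `xor` and `noisy_linear`, which are hypothetical and chosen only for illustration. All function and variable names are likewise assumptions.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_1nn(train, point):
    """Return the label of the single closest training point."""
    _, label = min(train, key=lambda item: sq_dist(item[0], point))
    return label


def predict_nearest_mean(train, point):
    """Return the label whose class mean is closest to the point.

    Assumed stand-in for a linear classifier: with two classes the
    boundary is the perpendicular bisector between the two means.
    """
    groups = defaultdict(list)
    for features, label in train:
        groups[label].append(features)
    means = {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }
    return min(means, key=lambda label: sq_dist(means[label], point))


def accuracy(predict, train, test):
    """Fraction of test points that the rule labels correctly."""
    hits = sum(predict(train, x) == y for x, y in test)
    return hits / len(test)


# Hypothetical toy problems, each a (training set, test set) pair.
problems = {
    # Nonlinear class boundary with clean labels.
    "xor": (
        [((0, 0), 0), ((4, 4), 0), ((1, 1), 0),
         ((0, 4), 1), ((4, 0), 1), ((3, 1), 1)],
        [((0.5, 0.5), 0), ((3.5, 3.5), 0),
         ((0.5, 3.5), 1), ((3.5, 0.5), 1)],
    ),
    # Linear class boundary with two mislabeled training points.
    "noisy_linear": (
        [((0, 0), 0), ((1, 0), 0), ((2, 0), 0), ((3, 0), 0), ((6.5, 0), 0),
         ((6, 0), 1), ((7, 0), 1), ((8, 0), 1), ((9, 0), 1), ((2.5, 0), 1)],
        [((0.5, 0), 0), ((1.5, 0), 0), ((2.4, 0), 0), ((2.6, 0), 0),
         ((6.4, 0), 1), ((6.6, 0), 1), ((7.5, 0), 1), ((8.5, 0), 1)],
    ),
}

results = {
    name: {
        "nn": accuracy(predict_1nn, train, test),
        "linear": accuracy(predict_nearest_mean, train, test),
    }
    for name, (train, test) in problems.items()
}

for name, scores in results.items():
    print(name, scores)
```

On these toy problems the nearest neighbor rule labels every `xor` test point correctly while the linear stand-in is right on only half of them, and the ranking reverses on `noisy_linear`, where the mislabeled training points mislead the nearest neighbor rule. This illustrates the kind of opposition described above on contrived data only and says nothing about the magnitude of the effect on real problems.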