Ensemble classifiers such as bagging, boosting and model averaging are known to have improved accuracy and robustness over a single model. Their potential, however, is limited in applications which have no access to raw data but to the meta-level model output. In this paper, we study ensemble learning with output from multiple supervised and unsupervised models, a topic where little work has been done. Although unsupervised models, such as clustering, do not directly generate label prediction for each individual, they provide useful constraints for the joint prediction of a set of related objects. We propose to consolidate a classification solution by maximizing the consensus among both supervised predictions and unsupervised constraints. We cast this ensemble task as an optimization problem on a bipartite graph, where the objective function favors the smoothness of the prediction over the graph, as well as penalizing deviations from the initial labeling provided by supervised models. We solve this problem through iterative propagation of probability estimates among neighboring nodes. Our method can also be interpreted as conducting a constrained embedding in a transformed space, or a ranking on the graph. Experimental results on three real applications demonstrate the benefits of the proposed method over existing alternatives.
Jing Gao, Feng Liang, Wei Fan, Yizhou Sun, Jiawei Han. Graph-based Consensus Maximization among Multiple Supervised and Unsupervised Models. Advances in Neural Information Processing Systems, 22, 2009, 585-593. [link]