A fast clustering algorithm for data with a few labeled instances.

Journal: Computational Intelligence and Neuroscience

Published Date: Mar 11, 2015

Abstract

The diameter of a cluster is the maximum distance between pairs of instances within the cluster, and the split of a cluster is the minimum distance between an instance inside the cluster and an instance outside it. For data with a few labeled instances, this paper makes two contributions. First, we present a simple and fast clustering algorithm with the following property: if the ratio of the minimum split to the maximum diameter (RSD) of the optimal solution is greater than one, the algorithm returns optimal solutions for three clustering criteria. Second, we study the metric learning problem of learning a distance metric that makes the RSD as large as possible. Compared with existing metric learning algorithms, one of our metric learning algorithms is computationally efficient: it is a linear programming model rather than the semidefinite programming model used by most existing algorithms. We demonstrate empirically that the supervision and the learned metric can improve the clustering quality.
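The definitions of diameter, split, and RSD in the abstract can be made concrete with a short sketch. The code below is not the paper's clustering or metric learning algorithm; it only evaluates the RSD of a given labeling. It assumes Euclidean distance, and the function names (`diameter`, `split`, `rsd`) and the toy data are hypothetical, chosen for illustration.

```python
from itertools import combinations
from math import dist, inf


def diameter(points, labels, cluster):
    """Largest distance between two instances of the same cluster."""
    members = [p for p, l in zip(points, labels) if l == cluster]
    return max((dist(a, b) for a, b in combinations(members, 2)), default=0.0)


def split(points, labels, cluster):
    """Smallest distance from an instance inside the cluster to one outside it."""
    inside = [p for p, l in zip(points, labels) if l == cluster]
    outside = [p for p, l in zip(points, labels) if l != cluster]
    return min((dist(a, b) for a in inside for b in outside), default=inf)


def rsd(points, labels):
    """Ratio of the minimum split to the maximum diameter over all clusters."""
    clusters = set(labels)
    max_diameter = max(diameter(points, labels, c) for c in clusters)
    min_split = min(split(points, labels, c) for c in clusters)
    # Assumption: treat the ratio as infinite when every cluster is a single point.
    return inf if max_diameter == 0 else min_split / max_diameter


# Toy data (made up for illustration): two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
labels = [0, 0, 1, 1]
print(rsd(points, labels))  # prints 4.0
```

Here each cluster has diameter 1 and split 4, so the RSD is 4, which is greater than one. This is the regime in which the abstract states that the proposed algorithm returns optimal solutions.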

Authors

Jinfeng Yang
Electric Power Research Institute of Guangdong Power Grid Corporation, Guangzhou 510080, China.

Yong Xiao
Electric Power Research Institute of Guangdong Power Grid Corporation, Guangzhou 510080, China.

Jiabing Wang
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China.

Qianli Ma
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China.

Yanhua Shen
School of Materials Science and Engineering, South China University of Technology, Guangzhou 510006, China.

Keywords

Algorithms; Artificial Intelligence; Cluster Analysis; Computational Biology; Learning; Models, Theoretical

External Resources

PubMed ID: 25861252
