A classification-based approach to semi-supervised clustering with pairwise constraints.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

Abstract

In this paper, we introduce a neural network framework for semi-supervised clustering with pairwise (must-link or cannot-link) constraints. In contrast to existing approaches, we decompose semi-supervised clustering into two simpler classification tasks: the first stage uses a pair of Siamese neural networks to label the unlabeled pairs of points as must-link or cannot-link; the second stage uses the fully pairwise-labeled dataset produced by the first stage in a supervised neural-network-based clustering method. The proposed approach is motivated by the observation that binary classification (such as assigning pairwise relations) is usually easier than multi-class clustering with partial supervision. Moreover, being classification-based, our method solves only well-defined classification problems rather than less well-specified clustering tasks. Extensive experiments on various datasets demonstrate the high performance of the proposed method.
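
The two-stage decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: as a loudly stated assumption, a distance threshold fitted on the labeled pairs stands in for the Siamese networks of the first stage, and a union-find pass over the predicted must-link pairs stands in for the supervised neural clustering of the second stage. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def fit_pair_threshold(points, constraints):
    """Stage 1 stand-in (assumption: replaces the Siamese networks).

    constraints is a list of (i, j, is_must_link) tuples. The threshold is
    the midpoint between the largest must-link distance and the smallest
    cannot-link distance among the labeled pairs.
    """
    must = [dist(points[i], points[j]) for i, j, ml in constraints if ml]
    cannot = [dist(points[i], points[j]) for i, j, ml in constraints if not ml]
    return (max(must) + min(cannot)) / 2


def label_all_pairs(points, threshold):
    """Label every pair of points: True = must-link, False = cannot-link."""
    return {
        (i, j): dist(points[i], points[j]) < threshold
        for i, j in combinations(range(len(points)), 2)
    }


def cluster_from_pairs(n, pair_labels):
    """Stage 2 stand-in (assumption: replaces the supervised neural method).

    Merges points joined by predicted must-link pairs using union-find and
    returns one integer cluster id per point.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), is_must_link in pair_labels.items():
        if is_must_link:
            parent[find(i)] = find(j)

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy data: two well-separated groups and only three labeled pairs.
points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.2, 4.9), (4.8, 5.1)]
constraints = [(0, 1, True), (3, 4, True), (0, 3, False)]

threshold = fit_pair_threshold(points, constraints)
pair_labels = label_all_pairs(points, threshold)
labels = cluster_from_pairs(len(points), pair_labels)
print(labels)
```

The point of the sketch is only the structure: a handful of pairwise constraints is first expanded into a label for every pair, and the clustering step then works from that fully labeled set of pairs rather than from the partial supervision directly.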

Authors

  • Marek Śmieja
    Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland.
  • Łukasz Struski
    Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland. Electronic address: lukasz.struski@uj.edu.pl.
  • Mário A T Figueiredo
    Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal. Electronic address: mario.figueiredo@tecnico.ulisboa.pt.