Learning with similarity functions on graphs using matchings of geometric embeddings
Paper in proceedings, 2015

We develop and apply the Balcan-Blum-Srebro (BBS) theory of classification via similarity functions (which are not necessarily kernels) to the problem of graph classification. First, we place the BBS theory into the unifying framework of optimal transport theory. This also opens the way to exploiting coupling methods for establishing the properties that the BBS definition requires of a good similarity function. Next, we apply the approach to graph classification via geometric embeddings such as the Laplacian, the pseudo-inverse Laplacian, and Lovász orthogonal labellings. We consider the similarity function given by optimal and near-optimal matchings, with respect to Euclidean distance, of the corresponding embeddings of the graphs in high dimensions. We use optimal couplings to rigorously establish that this yields a "good" similarity measure in the BBS sense for two well-known families of graphs. Further, we show that, on these families, the similarity yields better classification accuracy in practice than matchings of other well-known graph embeddings. Finally, we perform an extensive empirical evaluation on benchmark data sets, where we show that classifying graphs using matchings of geometric embeddings outperforms the previous state-of-the-art methods.
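To illustrate the matching-based similarity described in the abstract, here is a minimal sketch rather than the authors' implementation. It assumes each graph has already been embedded as a list of Euclidean points, one per node, and finds the minimum-cost matching by brute force, which is only feasible for very small graphs. The function names `matching_cost` and `matching_similarity` are hypothetical, and the conversion from cost to similarity, 1 / (1 + mean matched distance), is an assumption that may differ from the paper's definition.

```python
import itertools
import math


def matching_cost(X, Y):
    """Minimum total Euclidean distance over all injective matchings
    of the smaller point set into the larger one (brute force)."""
    if len(X) > len(Y):
        X, Y = Y, X  # always match the smaller set into the larger
    best = math.inf
    # Enumerate every way to assign each point of X a distinct point of Y.
    for assignment in itertools.permutations(range(len(Y)), len(X)):
        cost = sum(math.dist(X[i], Y[j]) for i, j in enumerate(assignment))
        best = min(best, cost)
    return best


def matching_similarity(X, Y):
    """Hypothetical similarity in (0, 1]: 1 / (1 + mean matched distance).
    This normalisation is an assumption made for illustration."""
    k = min(len(X), len(Y))
    if k == 0:
        raise ValueError("both embeddings must contain at least one point")
    return 1.0 / (1.0 + matching_cost(X, Y) / k)


# Toy embeddings: Y is X with its points reordered, Z is X shifted by one unit.
X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(1.0, 0.0), (0.0, 0.0)]
Z = [(0.0, 1.0), (1.0, 1.0)]
```

For graphs of realistic size, the brute-force search would be replaced by a polynomial-time assignment solver such as the Hungarian algorithm (for example, `scipy.optimize.linear_sum_assignment`), or by an approximate solver for the near-optimal matchings the abstract mentions.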

Geometric embeddings

Classification

Similarity functions

Matchings

Graphs

Authors

Fredrik Johansson

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

Devdatt Dubhashi

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Vol. 2015-August, pp. 467-476

Subject Categories

Computer and Information Science

DOI

10.1145/2783258.2783341

ISBN

9781450336642