Learning with similarity functions on graphs using matchings of geometric embeddings
Paper in proceedings, 2015
We develop and apply the Balcan-Blum-Srebro (BBS) theory of classification via similarity functions (which are not necessarily kernels) to the problem of graph classification. First we place the BBS theory into the unifying framework of optimal transport theory. This also opens the way to exploit coupling methods for establishing properties required of a good similarity function as per their definition. Next, we use the approach to the problem of graph classification via geometric embeddings such as the Laplacian, pseudo-inverse Laplacian and the Lovász orthogonal labellings. We consider the similarity function given by optimal and near-optimal matchings with respect to Euclidean distance of the corresponding embeddings of the graphs in high dimensions. We use optimal couplings to rigorously establish that this yields a "good" similarity measure in the BBS sense for two well known families of graphs. Further, we show that the similarity yields better classification accuracy in practice, on these families, than matchings of other well-known graph embeddings. Finally we perform an extensive empirical evaluation on benchmark data sets where we show that classifying graphs using matchings of geometric embeddings outperforms the previous state-of-the-art methods.