What does my GNN really capture? On exploring internal GNN representations

Abstract

GNNs are effective at classifying graphs, but their internal workings are opaque, which limits their field of application. Existing methods for explaining GNNs focus on disclosing the relationships between input graphs and the model's decision. In contrast, the method we propose isolates internal features, hidden in the network layers, that the GNN automatically identifies to classify graphs. We show that this method makes it possible to identify the parts of the input graphs used by the GNN with much less bias than state-of-the-art methods, and therefore to provide confidence in the decision process.