Jacob Springer, Carnegie Mellon University
Zoom - https://washington.zoom.us/j/99239593243
I will present a perspective from a recent line of work on how "adversarial examples" (input perturbations that dramatically change a neural network's output) can help us understand the (dis)similarities between different neural network classifiers. I will show that existing similarity metrics overestimate neural network similarity and will present an invariance-based method that corrects the issues with previous metrics. To conclude, I will describe how we can use the corrected similarity metrics to reveal a surprising, empirically observed asymmetry between non-robust and robust neural networks: non-robust networks exhibit similarities with robust networks, but not with other non-robust networks.
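
As background for the abstract above, the following is a minimal sketch of the fast gradient sign method (FGSM), one standard way to construct the adversarial perturbations mentioned here. The toy model, input, label, and epsilon are illustrative assumptions, not the speaker's setup or code.

# Minimal FGSM sketch (hypothetical model and data).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classifier standing in for any trained network.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 32, requires_grad=True)   # clean input
y = torch.tensor([3])                        # assumed true label
eps = 0.1                                    # perturbation budget

# FGSM: take one signed-gradient step that increases the loss.
loss = loss_fn(model(x), y)
loss.backward()
x_adv = x + eps * x.grad.sign()

# With a trained model and a suitable eps, the prediction often flips.
print("clean prediction:", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())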