Graph Metanetworks for Processing Diverse Neural Architectures

Derek Lim (MIT)
Tuesday, February 6, 2024 - 11:00am to 12:00pm
Zoom: https://washington.zoom.us/j/98292982594

Neural networks efficiently encode learned information within their parameters. Consequently, many tasks can be unified by treating neural networks themselves as input data. In this talk, we discuss new metanetworks: neural networks that take the weights of other neural networks as input. Put simply, we carefully build graphs representing the input neural networks and process those graphs with graph neural networks. Our approach, Graph Metanetworks (GMNs), generalizes to input neural architectures where competing methods struggle, such as multi-head attention layers, normalization layers, convolutional layers, ResNet blocks, and group-equivariant linear layers. We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network's function unchanged. We validate the effectiveness of our method on several metanetwork tasks over diverse neural network architectures.
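To make the core pipeline in the abstract concrete (neurons as graph nodes, weights as edge features, message passing over the result), here is a minimal sketch. This is not the authors' implementation: it assumes a toy two-layer MLP, the helper names (mlp_to_graph, message_pass) are hypothetical, and a single hand-written message-passing step stands in for a learned GNN. It illustrates why pooled GNN outputs on the parameter graph are invariant to hidden-neuron permutations, the symmetry the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP with shape 3 -> 4 -> 2: parameters W1 (4x3), b1 (4), W2 (2x4), b2 (2).
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def mlp_to_graph(W1, b1, W2, b2):
    """Nodes = neurons (inputs, hidden, outputs); node feature = bias
    (zero for input neurons); each directed edge carries its weight."""
    n_in, n_hid, n_out = W1.shape[1], W1.shape[0], W2.shape[0]
    node_feat = np.zeros(n_in + n_hid + n_out)
    node_feat[n_in:n_in + n_hid] = b1
    node_feat[n_in + n_hid:] = b2
    edges = []  # (source node, destination node, weight)
    for j in range(n_hid):
        for i in range(n_in):
            edges.append((i, n_in + j, W1[j, i]))
    for k in range(n_out):
        for j in range(n_hid):
            edges.append((n_in + j, n_in + n_hid + k, W2[k, j]))
    return node_feat, edges

def message_pass(node_feat, edges):
    """One permutation-equivariant step: each node aggregates
    weight-modulated messages from its in-neighbors."""
    out = node_feat.copy()
    for src, dst, w in edges:
        out[dst] += np.tanh(w * node_feat[src])
    return out

# Permuting the hidden neurons (rows of W1/b1, columns of W2) leaves the
# MLP's function unchanged; the parameter graph is identical up to node
# relabeling, so a pooled readout is invariant to the permutation.
perm = rng.permutation(4)
feat_a, edges_a = mlp_to_graph(W1, b1, W2, b2)
feat_b, edges_b = mlp_to_graph(W1[perm], b1[perm], W2[:, perm], b2)
pooled_a = message_pass(feat_a, edges_a).sum()
pooled_b = message_pass(feat_b, edges_b).sum()
assert np.isclose(pooled_a, pooled_b)
```

A learned metanetwork replaces the fixed tanh update with trainable GNN layers, but the invariance argument is the same: permuting hidden neurons only relabels graph nodes, and GNNs are equivariant to node relabeling by construction.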