The Proportional Scaling Limit of Neural Networks 

Mufan Li, Princeton University
CLK 219

Recent advances in deep learning performance have all relied on scaling up the number of parameters in neural networks, making asymptotic scaling limits a compelling approach to theoretical analysis. However, current research predominantly focuses on infinite-width limits, which cannot adequately capture the role of depth in deep networks. In this talk, we explore a different approach by studying the proportional infinite-depth-and-width limit, where depth and width are taken to infinity at a fixed ratio.

Firstly, we show that large-depth networks necessarily require a shaping of the non-linearities to achieve a well-behaved limit. We then characterize the limiting distribution of the shaped network at initialization via a stochastic differential equation (SDE) for the feature covariance matrix. Furthermore, in the linear-network setting, we can characterize the spectrum of the covariance matrix in the large-data limit via a geometric variant of Dyson Brownian motion.
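
As a rough numerical illustration of the shaping idea (a minimal sketch, not the construction from the talk): one way to keep a deep network non-degenerate is to damp the nonlinear part of the activation by a factor of 1/sqrt(depth), so that each layer perturbs the features only slightly and the feature covariance evolves gradually across layers. The width, depth, activation, and constants below are assumptions chosen purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

n = 400   # width
L = 400   # depth, chosen proportional to the width (here n = L for illustration)

def shaped_act(x, depth):
    # Shape the nonlinearity toward the identity: keep the linear part and
    # damp the nonlinear part by 1/sqrt(depth). The specific form and
    # constants are illustrative assumptions, not the talk's exact construction.
    return x + (np.tanh(x) - x) / np.sqrt(depth)

def forward_cov(x1, x2, act):
    # Propagate two inputs through a randomly initialized MLP and record the
    # normalized feature covariance <h1, h2> / n at every layer.
    h1, h2 = x1.copy(), x2.copy()
    covs = []
    for _ in range(L):
        W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
        h1, h2 = act(W @ h1), act(W @ h2)
        covs.append(float(h1 @ h2) / n)
    return np.array(covs)

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

print("shaped activation, final covariance:",
      forward_cov(x1, x2, lambda x: shaped_act(x, L))[-1])
print("plain tanh, final covariance:",
      forward_cov(x1, x2, np.tanh)[-1])

In this toy run the unshaped tanh network's features tend to degenerate with depth, while the shaped network retains a non-trivial covariance; the layer-by-layer covariance trace is the kind of quantity whose proportional limit the covariance SDE describes.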
