Francesco Tudisco

Associate Professor (Reader) in Machine Learning

School of Mathematics, The University of Edinburgh
The Maxwell Institute for Mathematical Sciences
School of Mathematics, Gran Sasso Science Institute
JCMB, King’s Buildings, Edinburgh EH9 3FD, UK
email: f dot tudisco at ed.ac.uk

Neural rank collapse: Weight decay and small within-class variability yield low-rank bias

Emanuele Zangrando, Piero Deidda, Simone Brugiapaglia, Nicola Guglielmi, Francesco Tudisco
preprint (2024)

Abstract

Recent work in deep learning has shown strong empirical and theoretical evidence of an implicit low-rank bias: weight matrices in deep networks tend to be approximately low-rank, and removing relatively small singular values, either during training or from trained models, can significantly reduce model size while maintaining or even improving performance. However, most theoretical investigations of low-rank bias in neural networks deal with oversimplified deep linear networks. In this work, we consider general networks with nonlinear activations trained with weight decay, and we show the presence of an intriguing neural rank collapse phenomenon, connecting the low-rank bias of trained networks with the networks' neural collapse properties: as the weight decay parameter grows, the rank of each layer decreases proportionally to the within-class variability of the hidden-space embeddings of the previous layers. Our theoretical findings are supported by a range of experimental evaluations illustrating the phenomenon.
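
To make the two ingredients of the abstract concrete, the sketch below (a minimal NumPy illustration, not the authors' code; the helper names truncated_svd_compress and within_class_variability are hypothetical) shows how small singular values can be discarded from a weight matrix and how the within-class variability of hidden embeddings can be measured.

import numpy as np

def truncated_svd_compress(W, energy=0.99):
    # Best low-rank approximation of W that retains a fraction `energy`
    # of the squared singular-value mass; returns the approximation and its rank.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    return (U[:, :k] * s[:k]) @ Vt[:k], k

def within_class_variability(H, labels):
    # Average squared distance of the embeddings H (one row per sample)
    # from their respective class means.
    total = 0.0
    for c in np.unique(labels):
        Hc = H[labels == c]
        total += np.sum((Hc - Hc.mean(axis=0)) ** 2)
    return total / H.shape[0]

# Example: a 256 x 128 weight matrix that is approximately (not exactly) rank 8.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 128))
W += 0.01 * rng.standard_normal((256, 128))
W_low, k = truncated_svd_compress(W)
print(k)  # 8: the remaining singular values carry almost no spectral mass
print(np.linalg.norm(W - W_low) / np.linalg.norm(W))  # small relative error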

Please cite this paper as:

@article{zangrando2024neural,
  title={Neural Rank Collapse: Weight Decay and Small Within-Class Variability Yield Low-Rank Bias},
  author={Zangrando, Emanuele and Deidda, Piero and Brugiapaglia, Simone and Guglielmi, Nicola and Tudisco, Francesco},
  journal={arXiv preprint arXiv:2402.03991},
  year={2024}
}

Links: arXiv

Keywords: neural collapse, low-rank bias, deep learning, neural networks, low-rank pruning, compression