Francesco Tudisco

Associate Professor (Reader) in Machine Learning

School of Mathematics, The University of Edinburgh
The Maxwell Institute for Mathematical Sciences
School of Mathematics, Gran Sasso Science Institute
JCMB, King’s Buildings, Edinburgh EH9 3FD, UK
email: f dot tudisco at ed.ac.uk

Robust low-rank training via approximate orthonormal constraints

Dayana Savostianova, Emanuele Zangrando, Gianluca Ceruti, Francesco Tudisco,
Advances in Neural Information Processing Systems (NeurIPS), (2023)

Abstract

With the growth of model and data sizes, a broad effort has been made to design pruning techniques that reduce the resource demand of deep learning pipelines while retaining model performance. In order to reduce both inference and training costs, a prominent line of work uses low-rank matrix factorizations to represent the network weights. Although these methods are able to retain accuracy, we observe that they tend to compromise model robustness against adversarial perturbations. By modeling robustness in terms of the condition number of the neural network, we argue that this loss of robustness is due to the exploding singular values of the low-rank weight matrices. Thus, we introduce a robust low-rank training algorithm that maintains the network’s weights on the low-rank matrix manifold while simultaneously enforcing approximate orthonormal constraints. The resulting model reduces both training and inference costs while ensuring well-conditioning and thus better adversarial robustness, without compromising model accuracy. This is shown by extensive numerical evidence and by our main approximation theorem, which shows that the computed robust low-rank network approximates the ideal full model well, provided a highly performing low-rank sub-network exists.
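
The link between singular values and conditioning described in the abstract can be illustrated with a small NumPy sketch. This is not the training algorithm from the paper: it only factorizes a random matrix by truncated SVD and then clips the retained singular values into a narrow band around their mean, which bounds the condition number of the low-rank factor. The function names, the tolerance `tau`, and the choice of band are hypothetical and used for illustration only.

```python
import numpy as np


def low_rank_factors(W, rank):
    """Truncated SVD: W is approximated by U @ diag(s) @ Vt with orthonormal U and Vt."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]


def clamp_singular_values(s, tau):
    """Clip singular values into [(1 - tau) * mean, (1 + tau) * mean].

    Hypothetical stand-in for an approximate orthonormal constraint:
    for 0 <= tau < 1 the ratio max/min is at most (1 + tau) / (1 - tau).
    """
    center = s.mean()
    return np.clip(s, (1.0 - tau) * center, (1.0 + tau) * center)


def condition_number(s):
    """Ratio of the largest to the smallest retained singular value."""
    return s.max() / s.min()


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))  # stand-in for a dense weight matrix

U, s, Vt = low_rank_factors(W, rank=4)
s_robust = clamp_singular_values(s, tau=0.1)
W_robust = U @ np.diag(s_robust) @ Vt  # rank-4 matrix with controlled conditioning

print("condition number before clamping:", condition_number(s))
print("condition number after clamping: ", condition_number(s_robust))
```

With `tau = 0.1`, the clamped factor has condition number at most 1.1 / 0.9, regardless of how spread out the original singular values were. A smaller `tau` pushes the factor closer to a scaled matrix with orthonormal columns.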

Please cite this paper as:

@article{savostianova2023robust,
  title={Robust low-rank training via approximate orthonormal constraints},
  author={Savostianova, Dayana and Zangrando, Emanuele and Ceruti, Gianluca and Tudisco, Francesco},
  journal={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2023}
}

Links: arxiv doi code

Keywords: deep learning, neural networks, low-rank, pruning, compression