Francesco Tudisco

Associate Professor (Reader) in Machine Learning

School of Mathematics, The University of Edinburgh
The Maxwell Institute for Mathematical Sciences
School of Mathematics, Gran Sasso Science Institute JCMB, King’s Buildings, Edinburgh EH93FD UK
email: f dot tudisco at ed.ac.uk

GeoLoRA: Geometric integration for parameter efficient fine-tuning

Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti, Francesco Tudisco, Jonas Kusch,
preprint, (2022)

Abstract

Low-Rank Adaptation (LoRA) has become a widely used method for parameter-efficient fine-tuning of large-scale, pre-trained neural networks. However, LoRA and its extensions face several challenges, including the need for rank adaptivity, robustness, and computational efficiency during the fine-tuning process. We introduce GeoLoRA, a novel approach that addresses these limitations by leveraging dynamical low-rank approximation theory. GeoLoRA requires only a single backpropagation pass over the small-rank adapters, significantly reducing computational cost as compared to similar dynamical low-rank training methods and making it faster than popular baselines such as AdaLoRA. This allows GeoLoRA to efficiently adapt the allocated parameter budget across the model, achieving smaller low-rank adapters compared to heuristic methods like AdaLoRA and LoRA, while maintaining critical convergence, descent, and error-bound theoretical guarantees. The resulting method is not only more efficient but also more robust to varying hyperparameter settings. We demonstrate the effectiveness of GeoLoRA on several state-of-the-art benchmarks, showing that it outperforms existing methods in both accuracy and computational efficiency.

Please cite this paper as:

@article{schotthofer2024geolora,
  title={GeoLoRA: Geometric integration for parameter efficient fine-tuning},
  author={Schotth{\"o}fer, Steffen and Zangrando, Emanuele and Ceruti, Gianluca and Tudisco, Francesco and Kusch, Jonas},
  journal={arXiv preprint arXiv:2410.18720},
  year={2024}
}

Links: arxiv

Keywords: deep learning neural networks low-rank pruning compression fine-tuning optimization