Francesco Tudisco

Associate Professor (Reader) in Machine Learning

School of Mathematics, The University of Edinburgh
The Maxwell Institute for Mathematical Sciences
School of Mathematics, Gran Sasso Science Institute
JCMB, King’s Buildings, Edinburgh EH9 3FD, UK
email: f dot tudisco at ed.ac.uk

New paper out

The graph $\infty$-Laplacian eigenvalue problem

Abstract: We analyze various formulations of the $\infty$-Laplacian eigenvalue problem on graphs, comparing their properties and highlighting their respective advantages and limitations. First, we investigate the graph $\infty$-eigenpairs arising as limits of $p$-Laplacian eigenpairs, extending key results from the continuous setting to the discrete domain. We prove that every limit of $p$-Laplacian eigenpairs, as $p$ goes to $\infty$, satisfies a limit eigenvalue equation and establish that the corresponding eigenvalue can be bounded from below by the packing radius of the graph, indexed by the number of nodal domains induced by the eigenfunction. ... Read more
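For orientation, the graph $p$-Laplacian eigenvalue problem can be written (in one common notation, which may differ from the paper's) as

$$\sum_{j \sim i} w_{ij}\,|u_i - u_j|^{p-2}(u_i - u_j) \;=\; \lambda\, |u_i|^{p-2} u_i, \qquad i \in V,$$

and the graph $\infty$-eigenpairs studied in the paper arise as limits of such eigenpairs, under a suitable rescaling of $\lambda$ (e.g. $\lambda^{1/p}$), as $p \to \infty$.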

New paper out

GeoLoRA: Geometric integration for parameter efficient fine-tuning

Abstract: Low-Rank Adaptation (LoRA) has become a widely used method for parameter-efficient fine-tuning of large-scale, pre-trained neural networks. However, LoRA and its extensions face several challenges, including the need for rank adaptivity, robustness, and computational efficiency during the fine-tuning process. We introduce GeoLoRA, a novel approach that addresses these limitations by leveraging dynamical low-rank approximation theory. GeoLoRA requires only a single backpropagation pass over the small-rank adapters, significantly reducing computational cost as compared to similar dynamical low-rank training methods and making it faster than popular baselines such as AdaLoRA. ... Read more

--- Performance of different optimizers when tested on the low-rank matrix approximation problem.
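GeoLoRA operates on the standard LoRA parameterization, in which a frozen pre-trained weight $W$ is augmented by a trainable low-rank update $BA$. Here is a minimal sketch of that baseline, with illustrative names and shapes (GeoLoRA's rank-adaptive geometric update of the factors is the paper's contribution and is not shown):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                # freeze pre-trained weights
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # up-projection, zero init
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T
```

Only the small factors are trained, which is what makes the approach parameter-efficient; rank adaptivity means choosing the rank automatically during fine-tuning rather than fixing it in advance.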

New paper out

Low-Rank Adversarial PGD Attack

Abstract: Adversarial attacks on deep neural network models have seen rapid development and are extensively used to study the stability of these networks. Among various adversarial strategies, Projected Gradient Descent (PGD) is a widely adopted method in computer vision due to its effectiveness and quick implementation, making it suitable for adversarial training. In this work, we observe that in many cases, the perturbations computed using PGD predominantly affect only a portion of the singular value spectrum of the original image, suggesting that these perturbations are approximately low-rank. ... Read more

--- Attacks performed with PGD are low-rank. The plot shows the rank of the attack across different models on CIFAR10.
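One simple way to make "approximately low-rank" quantitative is to count how many singular values of the perturbation are needed to capture most of its spectral energy. An illustrative check on a single-channel perturbation (my code, not the paper's):

```python
import numpy as np

def spectral_energy_rank(delta: np.ndarray, tol: float = 0.99) -> int:
    """Smallest k such that the top-k singular values of the (2D)
    perturbation delta capture a fraction tol of its spectral energy."""
    s = np.linalg.svd(delta, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(energy, tol)) + 1
```

A value small relative to the image size indicates the attack is effectively low-rank, which is what motivates restricting PGD to a low-rank perturbation set.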

Solaris, the first foundation model for the Sun

Excited to announce that our paper “Solaris: A Foundation Model for the Sun” has been accepted to the Foundation Models for Science workshop at NeurIPS 2024! 🌞

We’ve developed Solaris, the first foundation model to forecast the Sun’s atmosphere. By leveraging 13 years of full-disk, multi-wavelength solar imagery from the Solar Dynamics Observatory—spanning an entire solar cycle—we’ve pre-trained Solaris to make 12-hour interval forecasts. This is a significant step towards capturing the complex dynamics of the solar atmosphere and transforming solar forecasting. Can’t wait to share more at the conference!

Huge congratulations to my students and main contributors Harris Abdul Majid and Pietro Sittoni!

Paper accepted @ NeurIPS 2024

🚀 Excited to share that our paper Geometry-aware training of factorized layers in tensor Tucker format has been accepted at NeurIPS 2024!

A huge thanks to an amazing team of collaborators and especially to my student Emanuele Zangrando, first author, for their outstanding work. Together, we explored a novel training approach for Tucker-decomposed neural network layers, which dynamically adjusts ranks during training and achieves high compression rates without sacrificing performance.

Looking forward to presenting this and discussing it with the community at NeurIPS!

Bayes fellowship

Thrilled to join the Bayes Innovation Fellows cohort at the University of Edinburgh! 🎉 Excited for the journey ahead as we work to bring cutting-edge machine learning models for language processing and forecasting into real-world applications. A big thank you to the Bayes Centre for this incredible opportunity! Check out the official announcements here and here.


Paper accepted @ Nature Human Behaviour

Very excited that our paper, What we should learn from pandemic publishing, has been published in Nature Human Behaviour! This work is the result of a fantastic collaboration between colleagues from the medical school, social sciences, and applied mathematics. A special shoutout to Sara Venturini and Satyaki Sikdar for their incredible contributions and dedication.

New paper out

What we should learn from pandemic publishing

Abstract: Authors of COVID-19 papers produced during the pandemic were overwhelmingly not subject matter experts. Such a massive inflow of scholars from different expertise areas is both an asset and a potential problem. Domain-informed scientific collaboration is the key to preparing for future crises. Please cite this paper as: @article{sikdar2024what, author = {Sikdar, Satyaki and Venturini, Sara and Charpignon, Marie-Laure and Kumar, Sagar and Rinaldi, Francesco and Tudisco, Francesco and Fortunato, Santo and Majumder, Maimuna S. ... Read more

Recent advancements in uncertainty quantification for scientific machine learning, artificial intelligence and sampling algorithms

Excited to be organizing a two-day workshop in Edinburgh on Recent advancements in uncertainty quantification for scientific machine learning, artificial intelligence and sampling algorithms.

The workshop brings together experts from seemingly disparate scientific domains such as applied probability, multi-scale modeling, interacting particle systems, optimal transport in Data Science, statistical deep learning, numerical simulation of Stochastic Differential Equations, and uncertainty quantification.

Below is the program:

3/9 Tuesday (1PM-6PM)
1PM-2PM: Aretha Teckentrup, Uncertainty-aware surrogate models for inverse problems
2PM-3PM: Eric Hall, Global Sensitivity Analysis for Deep Learning of Complex Systems
3PM-3:30PM: Coffee break
3:30PM-6PM: Poster session

4/9 Wednesday (10AM-12:30PM)
10AM-11AM: Nicos Georgiou, Various models for queuing systems in tandem
11AM-11:30AM: Coffee break
11:30AM-12:30PM: Nikolas Nüsken, Bridging transport and variational inference: a new objective for optimal control.

New postdoc joining our group

Angelo Alberto Casulli has officially joined our Numerical Analysis and Data Science group at GSSI. Welcome Angelo!


Overjoyed to announce I’ve become father for the second time today; feeling incredibly blessed and grateful! 👶❣️

Paper accepted @ SIAM Journal on Matrix Analysis

Delighted that our work Cholesky-like Preconditioner for Hodge Laplacians via Heavy Collapsible Subcomplex has been accepted in the SIAM Journal on Matrix Analysis and Applications. Great work on designing fast and effective topology-driven preconditioners for higher-order Laplacians, led by my former student Anton Savostianov.

New paper out

Testing Quantum and Simulated Annealers on the Drone Delivery Packing Problem

Abstract: Using drones to perform human-related tasks can play a key role in various fields, such as defense, disaster response, agriculture, healthcare, and many others. The drone delivery packing problem (DDPP) arises in the context of logistics in response to an increasing demand in the delivery process along with the necessity of lowering human intervention. The DDPP is usually formulated as a combinatorial optimization problem, aiming to minimize drone usage with specific battery constraints while ensuring timely consistent deliveries with fixed locations and energy budget. ... Read more

Paper accepted @ Proceedings Royal Society A

Very happy that our work A nonlinear spectral core-periphery detection method for multiplex networks has been accepted in Proceedings of the Royal Society A. Exciting collaboration with Kai Bergermann and Martin Stoll!

Talk @ ICMS Big Data Inverse Problems Workshop

I am giving a talk today on our recent work on exploiting low-rank geometry to reduce memory and computational footprints in deep learning pipelines at the ICMS workshop on Big Data Inverse Problems in Edinburgh (UK).

Talk @ SIAM LA 2024

I am giving a talk today on our recent work on model compression algorithms and analysis for deep learning, at the SIAM Conference on Applied Linear Algebra in Paris, France. Thanks Rima Khouja for the kind invitation!

Low-rank Numerical Patterns of Dynamical Systems and Neural Networks

I am excited to be organizing a minisymposium on low-rank patterns in machine learning and numerical analysis at the SIAM Conference on Applied Linear Algebra 2024.

The size of data and model parameters is growing enormously in modern science and engineering. While recent advancements in computational hardware make it tempting to handle large-scale problems by merely allocating increasing resources, a much more efficient approach combines efficient hardware with techniques from model order reduction. Among the successful approaches, those leveraging low-rank formats are particularly interesting due to their ability to combine favorable memory and computational footprints with solid mathematical analysis and interpretation.

As low-rank data naturally emerge in diverse settings, including complex and quantum systems, high-dimensional optimization, and deep learning, the analysis of low-rank structures in modern systems is being pushed forward simultaneously by many different scientific areas, highlighting the fundamental importance of this type of reduced-order structure in understanding and solving complex problems.

This minisymposium brings together contributions from different communities working on this topic, with the goal of sampling recent developments in low-rank techniques with a specific focus on their relevance and application to machine learning.

Invited Speakers

  • Alex Cayco-Gajic, École Normale Supérieure Paris, France
  • Hung-Hsu Chou, Technische Universität München, Germany
  • Romain Cosson, Inria, France
  • Hessam Babaee, University of Pittsburgh, U.S.
  • Mathis Dus, ENPC, France
  • Filippo Vicentini, École Polytechnique, France
  • Jörg Nick, ETH Zurich, Switzerland
  • Emanuele Zangrando, Gran Sasso Science Institute, Italy

The Linear Algebra of Multilayer Networks

I am excited to be organizing a minisymposium on the linear algebra of multilayer networks at the SIAM Conference on Applied Linear Algebra 2024.

Models of complex networks allow insights into application areas ranging from social to transport and engineered networks. Their canonical representation by matrices has made them an important field of study in applied linear algebra. Multilayer networks are a versatile model for networks in which entities are connected by different types of interactions. These can be represented by structured matrices or tensors, calling for novel linear algebra techniques, such as linear systems, linear or nonlinear eigenvalue problems, and matrix functions, to reveal structural network properties. Popular examples include community detection, centrality analysis, and the detection of core–periphery structure. This minisymposium brings together a representative sample of the scientific community presenting recent progress on models and algorithms for the analysis of multilayer networks.
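As a concrete example of the structured matrices involved: a node-aligned multiplex with intra-layer adjacency matrices $A^{(1)},\dots,A^{(L)}$ on $n$ nodes and uniform inter-layer coupling $\omega$ is commonly encoded by the supra-adjacency matrix

$$\mathcal{A} \;=\; \begin{pmatrix} A^{(1)} & & \\ & \ddots & \\ & & A^{(L)} \end{pmatrix} \;+\; \omega\,\big(\mathbf{1}_L\mathbf{1}_L^\top - I_L\big)\otimes I_n,$$

whose spectra, eigenvectors, and matrix functions underpin many of the centrality and community-detection methods mentioned above (this is the textbook construction, not any specific speaker's model).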

Invited Speakers

  • Fabrizio De Vico Fallani, INRIA Paris Brain Institute, France
  • Sara Venturini, Senseable City Lab, MIT, U.S.
  • Kai Bergermann, Technische Universität Chemnitz, Germany

New paper out

Geometry-aware training of factorized layers in tensor Tucker format

Abstract: Reducing parameter redundancies in neural network architectures is crucial for achieving feasible computational and memory requirements during training and inference phases. Given its easy implementation and flexibility, one promising approach is layer factorization, which reshapes weight tensors into a matrix format and parameterizes them as the product of two small rank matrices. However, this approach typically requires an initial full-model warm-up phase, prior knowledge of a feasible rank, and it is sensitive to parameter initialization. ... Read more

--- Comparison of vanilla compression approaches in different tensor formats against the proposed TDLRT method. Mean and standard deviation over 20 weight initializations are displayed. TDLRT achieves higher compression rates at higher accuracy, with lower variance across initializations.
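In the matrix case, layer factorization boils down to parameterizing a weight as the product of two thin factors and never forming the full matrix. A minimal sketch under that simplification (the paper works with tensor-valued layers in Tucker format and, crucially, adapts the ranks during training):

```python
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """A (d_out x d_in) weight stored as two rank-r factors U and V."""
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d_out, rank) / d_out**0.5)
        self.V = nn.Parameter(torch.randn(d_in, rank) / d_in**0.5)

    def forward(self, x):
        # two thin matmuls instead of one dense one: r*(d_in + d_out) parameters
        return (x @ self.V) @ self.U.T
```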

Dr Anton Savostianov

My (now former) student Anton has successfully defended his PhD thesis today, earning his PhD in Mathematics in the Natural and Social Sciences cum laude. Congratulations Anton!


Paper accepted @ ICML 2024

I am very happy that our paper Subhomogeneous deep equilibrium models has been accepted in the proceedings of this year’s ICML conference.
Congrats to my student Pietro Sittoni on such an important achievement, which originated from his MSc thesis work!

Paper accepted @ Science Advances

Very excited that our work Learning the effective order of a hypergraph dynamical system has been accepted in Science Advances. Fantastic and super fun collaboration with Michael Schaub from RWTH Aachen and Leonie Neuhäuser and Michael Scholkemper from his group.

Numerical Analysis in the Era of Data Science

Excited to be organizing a two-day workshop at ICMS that aims to bring together experts in numerical analysis and its interface with data science and machine learning. The meeting is dedicated to the 60th birthday of Desmond J. Higham.

Keynote Speakers

  • Elena Celledoni, Norwegian University of Science and Technology
  • Peter Grindrod, University of Oxford
  • Françoise Tisseur, University of Manchester
  • Ivan Tyukin, King’s College London
  • Jesus-Maria Sanz-Serna, Universidad Carlos III de Madrid
  • Peter Kloeden, University of Tübingen
  • Brynjulf Owren, Norwegian University of Science and Technology
  • Vanni Noferini, Aalto University
  • Alison Ramage, University of Strathclyde
  • Andrew M. Stuart, California Institute of Technology
  • Benedict Leimkuhler, University of Edinburgh
  • Catherine Higham, University of Glasgow
  • Aretha Teckentrup, University of Edinburgh
  • Anders Hansen, University of Cambridge
  • Alexander Bastounis, University of Leicester
  • Konstantinos Zygalakis, University of Edinburgh

New paper out

Subhomogeneous Deep Equilibrium Models

Abstract: Implicit-depth neural networks have grown as powerful alternatives to traditional networks in various applications in recent years. However, these models often lack guarantees of existence and uniqueness, raising stability, performance, and reproducibility issues. In this paper, we present a new analysis of the existence and uniqueness of fixed points for implicit-depth neural networks based on the concept of subhomogeneous operators and the nonlinear Perron-Frobenius theory. Compared to previous similar analyses, our theory allows for weaker assumptions on the parameter matrices, thus yielding a more flexible framework for well-defined implicit networks. ... Read more
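An implicit-depth (deep equilibrium) layer defines its output as a fixed point $z^\star = f(z^\star, x)$ rather than by stacking a finite number of layers; the paper's analysis gives conditions (subhomogeneity of $f$) under which this fixed point exists and is unique. For orientation, a naive forward pass via fixed-point iteration (illustrative only, not the paper's solver):

```python
import torch

def deq_forward(f, x, z0, tol: float = 1e-6, max_iter: int = 100):
    """Picard iteration for the implicit layer z* = f(z*, x)."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if torch.norm(z_next - z) <= tol * (torch.norm(z) + 1e-12):
            return z_next
        z = z_next
    return z  # may not have converged; well-posedness is what the theory certifies
```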

Paper accepted @ AI4DiffEqtnsInSci ICLR Workshop

Our paper Mixture of Neural Operators: Incorporating Historical Information for Longer Rollouts has been accepted at the ICLR 2024 Workshop on AI4DifferentialEquations In Science. Big congratulations to my student Harris on this excellent work.

Turing fellowship

So happy to share that I have been selected as part of this year’s Turing Fellows cohort! I look forward to an exciting time as a Turing Fellow. Check out the posts by the University of Edinburgh, the Bayes Centre, and the Alan Turing Institute, as well as LinkedIn post one and post two.


New paper out

Solaris: A Foundation Model for the Sun

Abstract: Foundation models have demonstrated remarkable success across various scientific domains, motivating our exploration of their potential in solar physics. In this paper, we present Solaris, the first foundation model for forecasting the Sun’s atmosphere. We leverage 13 years of full-disk, multi-wavelength solar imagery from the Solar Dynamics Observatory, spanning a complete solar cycle, to pre-train Solaris for 12-hour interval forecasting. Solaris is built on a large-scale 3D Swin Transformer architecture with 109 million parameters. ... Read more

New paper out

Mixture of Neural Operators: Incorporating Historical Information for Longer Rollouts

Abstract: Traditional numerical solvers for time-dependent partial differential equations (PDEs) notoriously require high computational resources and necessitate recomputation when faced with new problem parameters. In recent years, neural surrogates have shown great potential to overcome these limitations. However, it has been paradoxically observed that incorporating historical information into neural surrogates worsens their rollout performance. Drawing inspiration from multistep methods that use historical information from previous steps to obtain higher-order accuracy, we introduce the Mixture of Neural Operators (MoNO) framework: a collection of neural operators, each dedicated to processing information from a distinct previous step. ... Read more
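A minimal sketch of how such a mixture could be wired, purely for illustration (the interfaces and the additive combination rule below are my assumptions, not the paper's code):

```python
import torch.nn as nn

class MixtureOfOperators(nn.Module):
    """One neural operator per history step; outputs combined by summation."""
    def __init__(self, make_operator, n_history: int):
        super().__init__()
        self.ops = nn.ModuleList([make_operator() for _ in range(n_history)])

    def forward(self, history):
        # history[k] holds the PDE state k steps in the past
        return sum(op(u) for op, u in zip(self.ops, history))
```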

New paper out

Neural rank collapse: Weight decay and small within-class variability yield low-rank bias

Abstract: Recent work in deep learning has shown strong empirical and theoretical evidence of an implicit low-rank bias: weight matrices in deep networks tend to be approximately low-rank and removing relatively small singular values during training or from available trained models may significantly reduce model size while maintaining or even improving model performance. However, the majority of the theoretical investigations around low-rank bias in neural networks deal with oversimplified deep linear networks. ... Read more

--- The rank of the weight matrix $W_{\ell}$ of layer $\ell$ trained with weight decay $\lambda$ decreases with the total class variability $\mathrm{TCV}$ of any latent space $X_k$, with $k<\ell$.

New paper out

Cholesky-like Preconditioner for Hodge Laplacians via Heavy Collapsible Subcomplex

Abstract: Techniques based on $k$-th order Hodge Laplacian operators $L_k$ are widely used to describe the topology as well as the governing dynamics of high-order systems modeled as simplicial complexes. In all of them, it is required to solve a number of least-squares problems with $L_k$ as the coefficient matrix, for example in order to compute some portions of the spectrum or to integrate the dynamical system. In this work, we introduce the notion of an optimal collapsible subcomplex and we present a fast combinatorial algorithm for the computation of a sparse Cholesky-like preconditioner for $L_k$ that exploits the topological structure of the simplicial complex. ... Read more
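For reference, with $B_k$ denoting the $k$-th boundary (incidence) matrix of the simplicial complex, the $k$-th order Hodge Laplacian is, in standard unweighted notation,

$$L_k \;=\; B_k^\top B_k \;+\; B_{k+1} B_{k+1}^\top,$$

so the least-squares problems above involve a sparse, structured, positive semidefinite matrix whose kernel encodes the $k$-dimensional holes of the complex.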

New paper out

Collaboration and topic switches in science

Abstract: Collaboration is a key driver of science and innovation. Mainly motivated by the need to leverage different capacities and expertise to solve a scientific problem, collaboration is also an excellent source of information about the future behavior of scholars. In particular, it allows us to infer the likelihood that scientists choose future research directions via the intertwined mechanisms of selection and social influence. Here we thoroughly investigate the interplay between collaboration and topic switches. ... Read more

Call for PhD applications @ Maxwell Institute’s Graduate School

I am looking for new PhD students on the following three projects within the Maxwell Institute’s Graduate School at The University of Edinburgh:

  • Structured reduced-order deep learning for scientific and industrial applications
  • Modern numerical linear algebra techniques for efficient learning and optimization (co-supervised with John Pearson)
  • Stability of Artificial Intelligence Algorithms (co-supervised with Des Higham)

For more details and to apply: https://www.mac-migs.ac.uk/mac-migs-2024/
Deadline for applications is 22 January 2024. The start date of the PhD is September 2024 and the duration is 4 years. The first year is devoted to training, with several available courses and training activities (also in collaboration with industries). Successful applicants will receive a full-time scholarship for the entire duration of the program.

The Maxwell Institute for Mathematical Sciences brings together research activities in mathematical sciences at Edinburgh and Heriot-Watt Universities. The Institute has a physical home on the top floor of the Bayes Centre which it shares with the International Centre for Mathematical Sciences (ICMS), creating a hub for mathematical sciences research, training, and applications in central Edinburgh.

Paper accepted @ ESAIM: M2AN

Our paper Optimizing network robustness via Krylov subspaces has been accepted in ESAIM: Mathematical Modelling and Numerical Analysis. Fun collaboration with Stefano Massei! We provide efficient and accurate Krylov-based methods for optimizing the robustness of large networks. Matlab code is available here.

Visiting MaLGa Machine Learning Genoa Center

Exciting research days ahead visiting MaLGa Machine Learning Genoa Center! I will also present our recent work on reducing model parameters in deep learning and low-rank bias at the ML seminar. Thanks Lorenzo Rosasco for the kind invitation!


New paper out

A nonlinear spectral core-periphery detection method for multiplex networks

Abstract: Core-periphery detection aims to separate the nodes of a complex network into two subsets: a core that is densely connected to the entire network and a periphery that is densely connected to the core but sparsely connected internally. The definition of core-periphery structure in multiplex networks that record different types of interactions between the same set of nodes but on different layers is nontrivial since a node may belong to the core in some layers and to the periphery in others. ... Read more

--- Our NSM vs. multilayer degree on a 2-layer Internet network with different noise levels.

Paper accepted @ EURO J Computational Optimization

Our paper Laplacian-based Semi-Supervised Learning in Multilayer Hypergraphs by Coordinate Descent has been accepted in the EURO Journal on Computational Optimization. We explore the computational advantages of randomized coordinate gradient methods for semi-supervised learning on higher-order graph models.

Paper accepted @ NeurIPS 2023

Excited that our paper on Robust low-rank training has been accepted at NeurIPS 2023! We propose a method to train networks with low-rank weights while reducing the network’s condition number, thus increasing its robustness to adversarial attacks. Congrats to my two PhD students Dayana Savostianova and Emanuele Zangrando!

New paper out

Learning the effective order of a hypergraph dynamical system

Abstract: Dynamical systems on hypergraphs can display a rich set of behaviours not observable for systems with pairwise interactions. Given a distributed dynamical system with a putative hypergraph structure, an interesting question is thus how much of this hypergraph structure is actually necessary to faithfully replicate the observed dynamical behaviour. To answer this question, we propose a method to determine the minimum order of a hypergraph necessary to approximate the corresponding dynamics accurately. ... Read more

New paper out

Robust low-rank training via approximate orthonormal constraints

Abstract: With the growth of model and data sizes, a broad effort has been made to design pruning techniques that reduce the resource demand of deep learning pipelines, while retaining model performance. In order to reduce both inference and training costs, a prominent line of work uses low-rank matrix factorizations to represent the network weights. Although able to retain accuracy, we observe that low-rank methods tend to compromise model robustness against adversarial perturbations. ... Read more

--- Evolution of loss, accuracy, and condition number for LeNet-5 on the MNIST dataset. The proposed approach (CondLR) converges faster while maintaining a well-conditioned neural network.
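The mechanism is easy to see in isolation: if $W = USV^\top$ with orthonormal factors $U$ and $V$, the nonzero singular values of $W$ are exactly those of the small core $S$, so an ill-conditioned core directly amplifies input perturbations. A small numerical illustration (mine, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
r = 16
U, _ = np.linalg.qr(rng.standard_normal((512, r)))   # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((256, r)))
S = np.diag(np.geomspace(1.0, 1e-3, r))              # ill-conditioned core

sv = np.linalg.svd(U @ S @ V.T, compute_uv=False)[:r]  # nonzero singular values
print(sv[0] / sv[-1])   # ~1e3: equals the condition number of S
```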

DLRA Workshop @ EPFL

Traveling today to EPF Lausanne for the DLRA “New Horizon” workshop. I will present our recent work on spectral pruning of deep learning models. Thanks Gianluca Ceruti and Jonas Kusch for the kind invitation!


SIAM SIGEST paper available online

Our SIGEST paper Nonlinear Perron-Frobenius theorem for nonnegative tensors has been published in SIAM Review, Vol. 65, Issue 2. Thanks to the Editors for their support and for the nice presentation!

Numerical Linear Algebra Days @ GSSI

I am co-organizing the 18th edition of the Numerical Linear Algebra Days (2giorni) workshop at GSSI.

This is the 18th workshop in a series dedicated to Numerical Linear Algebra and Applications, aiming at gathering the (mostly Italian) Numerical Linear Algebra scientific community to discuss recent advances in the area and to promote the exchange of novel ideas and the collaboration among researchers.

Here are the slides of my lecture on Nonlinear Perron-Frobenius Theory


New paper out

Nonlinear Perron-Frobenius theorems for nonnegative tensors

Abstract: We present a unifying Perron–Frobenius theory for nonlinear spectral problems defined in terms of nonnegative tensors. By using the concept of tensor shape partition, our results include, as a special case, a wide variety of particular tensor spectral problems considered in the literature and can be applied to a broad set of problems involving tensors (and matrices), including the computation of operator norms, graph and hypergraph matching in computer vision, hypergraph spectral theory, higher-order network analysis, and multimarginal optimal transport. ... Read more
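To fix ideas, one of the special cases covered is the $H$-eigenvalue problem for a third-order nonnegative tensor $T$ (standard notation; the paper's shape-partition framework is considerably more general):

$$\sum_{j,k=1}^{n} T_{ijk}\, x_j x_k \;=\; \lambda\, x_i^{2}, \qquad i = 1,\dots,n, \quad x \ge 0.$$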

Paper accepted @ ICML 2023

I am very happy that our paper Learning the right layers: a data-driven layer-aggregation strategy for semi-supervised learning on multilayer graphs has been accepted in the proceedings of this year’s ICML conference.

Congrats to Sara Venturini on one more important achievement!

Joining the editorial board of CMM

I have just accepted an invitation to join the editorial board of the Springer journal Computational Mathematics and Modeling, a journal established and run by the Department of Computational Mathematics and Cybernetics of Lomonosov Moscow State University, a place that is very important to me.

New paper out

A nonlinear model of opinion dynamics on networks with friction-inspired stubbornness

Abstract: The modeling of opinion dynamics has seen much study in varying academic disciplines. Understanding the complex ways information can be disseminated is a complicated problem for mathematicians as well as social scientists. Inspired by the Cucker-Smale system of flocking dynamics, we present a nonlinear model of opinion dynamics that utilizes an environmental averaging protocol similar to the DeGroot and Friedkin-Johnsen models. Indeed, the way opinions evolve is complex and nonlinear effects ought to be considered when modeling. ... Read more
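For context, the classical DeGroot protocol mentioned above updates each agent's opinion as a weighted average of its neighbors' opinions,

$$x_i(t+1) \;=\; \sum_{j} a_{ij}\, x_j(t), \qquad \sum_j a_{ij} = 1,\ a_{ij} \ge 0;$$

the paper's model replaces this linear averaging with a nonlinear, friction-inspired stubbornness mechanism (the equation above is the textbook baseline, not the paper's model).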

Plenary talk at IC2S2 2023 conference

Excited that our work on Social Contagion in Science has been selected as one of the 16 plenary talks at the next International Conference on Computational Social Science in Copenhagen, out of 900+ submissions. I’m very proud of the great multidisciplinary team that has worked on this project, in particular Sara and Satyaki.

Additionally, our work on The COVID-19 research outbreak: how the pandemic culminated in a surge of new researchers has been accepted as an oral presentation at the same conference. This is based on a fantastic ongoing collaboration with Maia Majumder’s team at Harvard Medical School.
