Abstract:
Using drones to perform human-related tasks can play a key role in various fields, such as defense, disaster response, agriculture, and healthcare. The drone delivery packing problem (DDPP) arises in logistics in response to the increasing demand for deliveries, along with the need to reduce human intervention. The DDPP is usually formulated as a combinatorial optimization problem that aims to minimize the number of drones used, subject to battery constraints, while ensuring timely and consistent deliveries with fixed locations and a fixed energy budget.
...
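Schematically, such formulations take a bin-packing-like shape; in simplified notation (ours, not the paper's), with binary variables $y_d$ (drone $d$ is used) and $x_{jd}$ (delivery $j$ is assigned to drone $d$):

$$\min_{x,y}\; \sum_{d} y_d \quad \text{s.t.} \quad \sum_{d} x_{jd} = 1 \;\; \forall j, \qquad \sum_{j} e_j\, x_{jd} \le B\, y_d \;\; \forall d, \qquad x_{jd},\, y_d \in \{0,1\},$$

where $e_j$ is the energy required by delivery $j$ and $B$ is the battery budget of a single drone; time-window constraints for timely delivery can be added analogously.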
I am giving a talk today on our recent work on exploiting low-rank geometry to reduce memory and computational footprints in deep learning pipelines at the ICMS workshop on Big Data Inverse Problems in Edinburgh (UK).
Talk @ SIAM LA 2024
I am giving a talk today on our recent work on model compression algorithms and analysis for deep learning, at the SIAM Conference on Applied Linear Algebra in Paris, France. Thanks Rima Khouja for the kind invitation!
Low-rank Numerical Patterns of Dynamical Systems and Neural Networks
I am excited to be organizing a minisymposium on low-rank patterns in machine learning and numerical analysis at the SIAM Conference on Applied Linear Algebra 2024.
The size of data and model parameters is growing enormously in modern science and engineering. While recent advancements in computational hardware make it tempting to handle large-scale problems by merely allocating increasing resources, a much more efficient approach combines efficient hardware with techniques from model order reduction. Among the successful approaches, those leveraging low-rank formats are particularly interesting due to their ability to combine favorable memory and computational footprints with solid mathematical analysis and interpretation.
As low-rank data naturally emerge in diverse settings, including complex and quantum systems, high-dimensional optimization, and deep learning, the analysis of low-rank structures in modern systems is being pushed forward simultaneously by many different scientific areas, highlighting the fundamental importance of this type of reduced-order structure in understanding and solving complex problems.
This minisymposium brings together contributions from different communities working on this topic, with the goal of sampling recent developments in low-rank techniques with a specific focus on their relevance and application to machine learning.
Invited Speakers
Alex Cayco-Gajic, École Normale Supérieure Paris, France
Hung-Hsu Chou, Technische Universität München, Germany
Romain Cosson, Inria, France
Hessam Babaee, University of Pittsburgh, U.S.
Mathis Dus, ENPC, France
Filippo Vicentini, École Polytechnique, France
Jörg Nick, ETH Zurich, Switzerland
Emanuele Zangrando, Gran Sasso Science Institute, Italy
The Linear Algebra of Multilayer Networks
I am excited to be organizing a minisymposium on the linear algebra of multilayer networks at the SIAM Conference on Applied Linear Algebra 2024.
Models of complex networks provide insights into application areas ranging from social to transport and engineered networks. Their canonical representation by matrices has made them an important field of study in applied linear algebra. Multilayer networks are a versatile model for networks in which entities are connected by different types of interactions. These can be represented by structured matrices or tensors, calling for novel linear algebra techniques, such as linear systems, linear or nonlinear eigenvalue problems, and matrix functions, to reveal structural network properties. Popular examples include community detection, centrality analysis, and the detection of core–periphery structure. This minisymposium brings together a representative sample of the scientific community presenting recent progress on models and algorithms for the analysis of multilayer networks.
Invited Speakers
Fabrizio De Vico Fallani, INRIA Paris Brain Institute, France
Sara Venturini, Senseable City Lab, MIT, U.S.
Kai Bergermann, Technische Universität Chemnitz, Germany
Dr Anton Savostianov
My (now former) student Anton has successfully defended his PhD thesis today, obtaining his PhD in Mathematics in the Natural and Social Sciences cum laude. Congratulations, Anton!
Paper accepted @ ICML 2024
I am very happy that our paper Subhomogeneous deep equilibrium models has been accepted in the proceedings of this year’s ICML conference.
Congrats to my student Pietro Sittoni on this important achievement, which originated from his MSc thesis work!
Excited to be organizing a two-day workshop at ICMS aiming at bringing together experts in the field of numerical analysis and its interface with data science and machine learning. The meeting is dedicated to the 60th birthday of Desmond J. Higham.
Keynote Speakers
Elena Celledoni, Norwegian University of Science and Technology
Peter Grindrod, University of Oxford
Françoise Tisseur, University of Manchester
Ivan Tyukin, King’s College London
Jesús-María Sanz-Serna, Universidad Carlos III de Madrid
Peter Kloeden, University of Tübingen
Brynjulf Owren, Norwegian University of Science and Technology
Vanni Noferini, Aalto University
Alison Ramage, University of Strathclyde
Andrew M. Stuart, California Institute of Technology
Abstract:
Implicit-depth neural networks have grown as powerful alternatives to traditional networks in various applications in recent years. However, these models often lack guarantees of existence and uniqueness, raising stability, performance, and reproducibility issues. In this paper, we present a new analysis of the existence and uniqueness of fixed points for implicit-depth neural networks based on the concept of subhomogeneous operators and the nonlinear Perron-Frobenius theory. Compared to previous similar analyses, our theory allows for weaker assumptions on the parameter matrices, thus yielding a more flexible framework for well-defined implicit networks.
...
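Well-posedness here is precisely the question of whether the equation defining the implicit layer has one and only one solution. A minimal numerical sketch (a toy setup of ours, not the paper's subhomogeneous construction: here uniqueness is forced the classical way, by making the layer map contractive):

```python
import numpy as np

def deq_fixed_point(W, U, x, b, tol=1e-10, max_iter=1000):
    """Iterate z <- tanh(W z + U x + b) until a fixed point is reached."""
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_new = np.tanh(W @ z + U @ x + b)
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

rng = np.random.default_rng(0)
n, d = 8, 4
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)   # spectral norm 1/2: the layer map is contractive
U = rng.standard_normal((n, d))
b = rng.standard_normal(n)
x = rng.standard_normal(d)
z = deq_fixed_point(W, U, x, b)
print(np.linalg.norm(z - np.tanh(W @ z + U @ x + b)))  # ~0: z solves z = tanh(Wz + Ux + b)
```

The point of the Perron-Frobenius approach is that contractivity of $W$ is a much stronger requirement than necessary: subhomogeneity yields uniqueness under weaker assumptions on the parameter matrices.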
Abstract:
Traditional numerical solvers for time-dependent partial differential equations (PDEs) notoriously require high computational resources and necessitate recomputation when faced with new problem parameters. In recent years, neural surrogates have shown great potential to overcome these limitations. However, it has been paradoxically observed that incorporating historical information into neural surrogates worsens their rollout performance. Drawing inspiration from multistep methods that use historical information from previous steps to obtain higher-order accuracy, we introduce the Mixture of Neural Operators (MoNO) framework: a collection of neural operators, each dedicated to processing information from a distinct previous step.
...
Abstract:
We propose a novel methodology to solve a key eigenvalue optimization problem which arises in the contractivity analysis of neural ODEs. When looking at contractivity properties of a one-layer weight-tied neural ODE $\dot{u}(t)=\sigma(Au(t)+b)$ (with $u,b \in {\mathbb R}^n$, $A$ a given $n \times n$ matrix, $\sigma: {\mathbb R} \to {\mathbb R}^+$ an activation function, and, for a vector $z \in {\mathbb R}^n$, $\sigma(z) \in {\mathbb R}^n$ interpreted entry-wise), we are led to study the logarithmic norm of a set of products of the form $D A$, where $D$ is a diagonal matrix such that ${\mathrm{diag}}(D) \in \sigma'({\mathbb R}^n)$.
...
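For the 2-norm, the logarithmic norm of a matrix $M$ is simply the largest eigenvalue of its symmetric part $(M+M^\top)/2$, so the quantity being optimized is cheap to evaluate for any single scaling $D$. A naive sampling sketch (for illustration only; the paper's methodology optimizes over the admissible set rather than sampling it):

```python
import numpy as np

def log_norm_2(M):
    """Logarithmic 2-norm: largest eigenvalue of the symmetric part (M + M^T)/2."""
    return np.linalg.eigvalsh((M + M.T) / 2)[-1]

rng = np.random.default_rng(1)
n = 6
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))  # toy weight matrix
# For a sigmoid activation, diag(D) ranges over sigma'(R^n) = (0, 1/4]^n;
# sampling only gives a lower bound on the worst-case logarithmic norm.
worst = max(log_norm_2(np.diag(rng.uniform(0, 0.25, n)) @ A) for _ in range(100))
print(worst)  # if negative, every sampled scaling D gives a contractive D A
```

Sampling of course cannot certify contractivity over all admissible $D$; that is exactly why the eigenvalue optimization problem in the paper matters.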
Abstract:
Recent work in deep learning has shown strong empirical and theoretical evidence of an implicit low-rank bias: weight matrices in deep networks tend to be approximately low-rank and removing relatively small singular values during training or from available trained models may significantly reduce model size while maintaining or even improving model performance. However, the majority of the theoretical investigations around low-rank bias in neural networks deal with oversimplified deep linear networks.
...
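The pruning operation underlying this line of work is plain SVD truncation: keep the leading singular values, drop the rest. A minimal sketch on a synthetic weight matrix with the kind of fast singular-value decay that low-rank bias produces (synthetic data, not an actual trained network):

```python
import numpy as np

def truncate_rank(W, r):
    """Keep the r largest singular values of W (low-rank compression)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * s[:r] @ Vt[:r]

rng = np.random.default_rng(2)
n, r_true = 64, 5
# Rank-5 signal plus small full-rank noise, mimicking an implicitly low-rank weight
W = rng.standard_normal((n, r_true)) @ rng.standard_normal((r_true, n))
W += 1e-3 * rng.standard_normal((n, n))
W5 = truncate_rank(W, 5)
err = np.linalg.norm(W - W5) / np.linalg.norm(W)
print(err)  # small relative error: rank 5 out of 64 captures almost all of W
```

Storing the factors costs $2nr$ numbers instead of $n^2$, which is where the memory savings mentioned above come from.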
--- The rank of the weight matrix $W_{\ell}$ of layer $\ell$ trained with weight decay $\lambda$ decreases with the total class variability $\mathrm{TCV}$ of any latent space $X_k$, with $k<\ell$.
Abstract:
Techniques based on $k$-th order Hodge Laplacian operators $L_k$ are widely used to describe the topology as well as the governing dynamics of high-order systems modeled as simplicial complexes. All of them require solving a number of least-squares problems with $L_k$ as the coefficient matrix, for example to compute portions of the spectrum or to integrate the dynamical system. In this work, we introduce the notion of an optimal collapsible subcomplex and we present a fast combinatorial algorithm for the computation of a sparse Cholesky-like preconditioner for $L_k$ that exploits the topological structure of the simplicial complex.
...
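The objects involved can be made concrete on the smallest possible example. A sketch (standard boundary matrices only, not the paper's preconditioner) building the 1-Hodge Laplacian $L_1 = B_1^\top B_1 + B_2 B_2^\top$ of a triangle and reading off the number of 1-dimensional holes from its kernel:

```python
import numpy as np

# Triangle {0,1,2} with edges e01, e02, e12 (columns of B1), oriented lexicographically.
B1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]], float)        # node-edge incidence (boundary of edges)
B2 = np.array([[ 1], [-1], [ 1]], float)    # edge-triangle boundary of the 2-cell

def betti1(L1, tol=1e-10):
    """Dimension of ker L1 = number of 1-dimensional holes."""
    return int(np.sum(np.linalg.eigvalsh(L1) < tol))

L1_hollow = B1.T @ B1                        # boundary edges only: one hole
L1_filled = B1.T @ B1 + B2 @ B2.T            # filled triangle: no hole
print(betti1(L1_hollow), betti1(L1_filled))  # 1 0
```

For large complexes these least-squares solves with $L_k$ become the bottleneck, which is what the sparse Cholesky-like preconditioner above is designed to accelerate.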
Abstract:
Collaboration is a key driver of science and innovation. Mainly motivated by the need to leverage different capacities and expertise to solve a scientific problem, collaboration is also an excellent source of information about the future behavior of scholars. In particular, it allows us to infer the likelihood that scientists choose future research directions via the intertwined mechanisms of selection and social influence. Here we thoroughly investigate the interplay between collaboration and topic switches.
...
Call for PhD applications @ Maxwell Institute’s Graduate School
I am looking for new PhD students on the following three projects within the Maxwell Institute’s Graduate School at The University of Edinburgh:
Structured reduced-order deep learning for scientific and industrial applications
Modern numerical linear algebra techniques for efficient learning and optimization (co-supervised with John Pearson)
Stability of Artificial Intelligence Algorithms (co-supervised with Des Higham)
For more details and to apply:
https://www.mac-migs.ac.uk/mac-migs-2024/
The deadline for applications is 22 January 2024. The PhD starts in September 2024 and lasts 4 years. The first year is devoted to training, with several available courses and training activities (also in collaboration with industry). Successful applicants will receive a full-time scholarship for the entire duration of the program.
Exciting research days ahead visiting MaLGa Machine Learning Genoa Center! I will also present our recent work on reducing model parameters in deep learning and low-rank bias at the ML seminar. Thanks Lorenzo Rosasco for the kind invitation!
Abstract:
Core-periphery detection aims to separate the nodes of a complex network into two subsets: a core that is densely connected to the entire network and a periphery that is densely connected to the core but sparsely connected internally. The definition of core-periphery structure in multiplex networks that record different types of interactions between the same set of nodes but on different layers is nontrivial since a node may belong to the core in some layers and to the periphery in others.
...
--- Our NSM vs multilayer degree on a 2-layer Internet network with different noise levels
Paper accepted in EURO J Computational Optimization
Excited that our paper on Robust low-rank training has been accepted at NeurIPS 2023! We propose a method to train networks with low-rank weights while reducing the network’s condition number, thus increasing its robustness against adversarial attacks. Congrats to my two PhD students Dayana Savostianova and Emanuele Zangrando!
Abstract:
Dynamical systems on hypergraphs can display a rich set of behaviours not observable for systems with pairwise interactions. Given a distributed dynamical system with a putative hypergraph structure, an interesting question is thus how much of this hypergraph structure is actually necessary to faithfully replicate the observed dynamical behaviour. To answer this question, we propose a method to determine the minimum order of a hypergraph necessary to approximate the corresponding dynamics accurately.
...
Abstract:
With the growth of model and data sizes, a broad effort has been made to design pruning techniques that reduce the resource demand of deep learning pipelines while retaining model performance. In order to reduce both inference and training costs, a prominent line of work uses low-rank matrix factorizations to represent the network weights. However, although low-rank methods are able to retain accuracy, we observe that they tend to compromise model robustness against adversarial perturbations.
...
--- Evolution of loss, accuracy, and condition number for Lenet5 on MNIST dataset. The proposed approach (CondLR) converges faster while maintaining a well-conditioned neural network.
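The mechanism can be illustrated in a stylized way: truncating to the leading singular values can leave the retained spectrum badly conditioned, while capping the spread of the retained singular values keeps the factorized layer well conditioned. This is only a toy stand-in for the actual CondLR training scheme, with a hypothetical conditioning bound `kappa`:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 50, 10
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 2.0 ** -np.arange(n)                   # fast singular-value decay
W = Q1 @ np.diag(s) @ Q2.T                 # synthetic "weight" with known spectrum

U, s, Vt = np.linalg.svd(W, full_matrices=False)
cond_trunc = s[0] / s[r - 1]               # condition number of plain rank-r truncation
kappa = 10.0                               # hypothetical conditioning bound
s_clip = np.clip(s[:r], s[0] / kappa, None)
cond_clip = s_clip[0] / s_clip[-1]         # bounded by kappa by construction
print(cond_trunc, cond_clip)               # ~512 vs 10: clipping controls conditioning
```

A well-conditioned factorized layer amplifies input perturbations far less, which is the intuition behind the robustness gains reported above.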
Abstract:
The computing cost and memory demand of deep learning pipelines have grown rapidly in recent years, and thus a variety of pruning techniques have been developed to reduce the number of model parameters. The majority of these techniques focus on reducing inference costs by pruning the network after a pass of full training. A smaller number of methods address the reduction of training costs, mostly based on compressing the network via low-rank layer factorizations.
...
--- Comparison of vanilla compression approaches with different tensor formats against the proposed TDLRT method. Mean and standard deviation over 20 weight initializations are displayed. TDLRT achieves higher compression rates at higher accuracy, with lower variance between initializations.
This is the 18th workshop in a series dedicated to Numerical Linear Algebra and Applications, aiming to gather the (mostly Italian) Numerical Linear Algebra scientific community to discuss recent advances in the area and to promote the exchange of novel ideas and collaboration among researchers.
Abstract:
We present a unifying Perron–Frobenius theory for nonlinear spectral problems defined in terms of nonnegative tensors. By using the concept of tensor shape partition, our results include, as a special case, a wide variety of particular tensor spectral problems considered in the literature and can be applied to a broad set of problems involving tensors (and matrices), including the computation of operator norms, graph and hypergraph matching in computer vision, hypergraph spectral theory, higher-order network analysis, and multimarginal optimal transport.
...
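For the simplest special case, the Perron eigenpair of a single entrywise positive cubic tensor, the classical power-method recipe already works; a sketch in the spirit of the NQZ algorithm (not the paper's general shape-partition setting), for the H-eigenvalue problem $T x^2 = \lambda x^{[2]}$:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
T = rng.uniform(0.1, 1.0, size=(n, n, n))        # entrywise positive order-3 tensor
x = np.ones(n)
for _ in range(1000):
    x = np.sqrt(np.einsum('ijk,j,k->i', T, x, x))  # x <- (T x^2)^(1/2)
    x /= x.sum()                                   # normalize to the simplex
lam = np.einsum('ijk,j,k->i', T, x, x) / x**2
print(lam)  # all entries agree: the common value is the Perron H-eigenvalue
```

The unified theory in the paper explains when and why such iterations converge for much more general, partially symmetric, multi-parameter spectral problems.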
Congrats to Sara Venturini on one more important achievement!
Joining the editorial board of CMM
I just accepted an invitation to join the editorial board of the Springer journal Computational Mathematics and Modeling, a journal established and run by the department of Computational Mathematics and Cybernetics of Lomonosov Moscow State University, a place that is very important to me.
Abstract:
The modeling of opinion dynamics has seen much study across academic disciplines. Understanding the complex ways information can be disseminated is a complicated problem for mathematicians as well as social scientists. Inspired by the Cucker-Smale system of flocking dynamics, we present a nonlinear model of opinion dynamics that utilizes an environmental averaging protocol similar to the DeGroot and Friedkin-Johnsen models. Indeed, the way opinions evolve is complex, and nonlinear effects ought to be considered when modelling.
...
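For reference, the linear DeGroot baseline that the nonlinear model generalizes is a one-liner: each agent repeatedly replaces its opinion with a weighted average of all opinions. A minimal sketch (the classical linear protocol, not the paper's nonlinear environmental-averaging model):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.uniform(size=(n, n)) + np.eye(n)   # positive trust weights + self-confidence
P = A / A.sum(axis=1, keepdims=True)       # row-stochastic averaging matrix
x = rng.uniform(size=n)                    # initial opinions in [0, 1)
for _ in range(200):
    x = P @ x                              # one round of averaging
print(x.std())  # ~0: all opinions agree (consensus)
```

With positive weights, consensus is guaranteed by Perron-Frobenius theory; the nonlinear effects studied in the paper can produce richer limiting behavior.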
Additionally, our work on The COVID-19 research outbreak: how the pandemic culminated in a surge of new researchers has been accepted as an oral presentation at the same conference. This is based on a fantastic ongoing collaboration with Maia Majumder’s team at Harvard Medical School.
Three talks at NetSci 2023 conference
Three talks have been accepted at the upcoming NetSci 2023 conference in Vienna:
Quantifying the homological stability of simplicial complexes, presented by Anton Savostianov
Learning the right layers: a data-driven layer-aggregation strategy for semi-supervised learning on multilayer graphs, presented by Sara Venturini
Social Contagion in Science, presented by Satyaki Sikdar
SIGEST highlights a recent outstanding paper from one of SIAM’s specialized research journals, chosen on the basis of exceptional interest to the entire SIAM community.
The winning paper is reprinted in the SIGEST section of SIAM Review, the flagship journal of the society. The SIGEST version of the paper has a new title
Nonlinear Perron-Frobenius theorems for nonnegative tensors
and contains two major additions:
A substantially extended introduction, with many non-trivial examples of tensor eigenvalue problems in applications, including problems from computer vision and optimal transport. Here we detail how the nonlinear Perron-Frobenius theorems for tensors that we introduced can be of great help in tackling these problems.
A new nonlinear Perron-Frobenius theorem that significantly improves (parts of) the previous Perron-Frobenius theorems for tensors and allows us to address more general problems, such as the optimal transport problem discussed in the introduction.
Abstract:
We consider the problem of attaining either the maximal increase or the maximal reduction of the robustness of a complex network by means of a bounded modification of a subset of the edge weights. We propose two novel strategies combining Krylov subspace approximations with a greedy scheme and with limited-memory BFGS. The paper discusses the computational and modeling aspects of our methodology and illustrates the various optimization problems on networks that can be addressed within the proposed framework.
...
Abstract:
Graph semi-supervised learning is an important data analysis tool: given a graph and a set of labeled nodes, the aim is to infer labels for the remaining unlabeled nodes. In this paper, we start by considering an optimization-based formulation of the problem for an undirected graph, and then we extend this formulation to multilayer hypergraphs. We solve the problem using different coordinate descent approaches and compare the results with those obtained by the classic gradient descent method.
...
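The single-layer starting point is easy to make concrete. A toy label-propagation sketch (a standard baseline for illustration, not the paper's coordinate-descent method) that diffuses the known labels of two nodes to the rest of a small graph:

```python
import numpy as np

# 5-node undirected graph; node 0 is labeled +1, node 4 is labeled -1.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], float)
y = np.array([1.0, 0.0, 0.0, 0.0, -1.0])
P = A / A.sum(axis=1, keepdims=True)       # row-normalized adjacency
x, alpha = y.copy(), 0.9
for _ in range(500):
    x = alpha * (P @ x) + (1 - alpha) * y  # diffuse, with a pull toward known labels
print(x)  # node 0's neighborhood ends up positive; node 4 stays negative
```

The iteration converges to the minimizer of a quadratic smoothness-plus-fidelity objective; the paper's multilayer-hypergraph formulation generalizes exactly this kind of objective.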
Abstract:
The homology groups of a simplicial complex reveal fundamental properties of the topology of the data or the system, and the notion of topological stability naturally poses an important yet not fully investigated question. In the current work, we study stability in terms of the smallest perturbation sufficient to change the dimensionality of the corresponding homology group. Such a definition requires an appropriate weighting and normalization procedure for the boundary operators acting on the Hodge algebra’s homology groups.
...
--- Continuous and discretized manifold with a 1-dimensional hole.
Paper accepted on Applied and Computational Harmonic Analysis
Excited that our paper Nodal domain count for the generalized p-Laplacian – with Piero Deidda and Mario Putti – has been accepted for publication in Applied and Computational Harmonic Analysis. Among the main results, we prove that the eigenvalues of the p-Laplacian on a tree are all variational (and thus there are exactly n of them), we show that the number of nodal domains of the p-Laplacian for general graphs can be bounded both from above and from below, and we deduce that the higher-order Cheeger inequality is tight on trees.
Arturo De Marinis from our team is attending XMaths workshop at University of Bari this week, presenting preliminary results on our work on stability of neural dynamical systems.
Data Science in Action at University of Padua
I am giving an invited lecture today on Low-parametric deep learning at the Data Science in Action day organized by the University of Padua. You can find the slides of my talk here.
Presenting today @ Örebro University
I am presenting today my work on fast and efficient neural network training via low-rank gradient flows at the Research Seminars in Mathematics at the School of Science and Technology, Örebro University (Sweden). Thanks Andrii Dmytryshyn for the kind invitation!