Publications

2023

Compact Matrix Quantum Group Equivariant Neural Networks

We derive the existence of a new type of neural network, called a compact matrix quantum group equivariant neural network, that learns from data that has an underlying quantum symmetry. We apply the Woronowicz formulation of Tannaka-Krein duality to characterise the weight matrices that appear in these neural networks for any easy compact matrix quantum group. We show that compact matrix quantum group equivariant neural networks contain, as a subclass, all compact matrix group equivariant neural networks. Moreover, we obtain characterisations of the weight matrices for many compact matrix group equivariant neural networks that have not previously appeared in the machine learning literature.

Graph Automorphism Group Equivariant Neural Networks

For any graph $G$ with $n$ vertices and automorphism group $\mathrm{Aut}(G)$, we provide a full characterisation of all of the possible $\mathrm{Aut}(G)$-equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$. In particular, we find a spanning set of matrices for the learnable, linear, $\mathrm{Aut}(G)$-equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$.
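
To give a concrete feel for the first-order case (layers $\mathbb{R}^{n} \to \mathbb{R}^{n}$), the sketch below builds the spanning matrices as indicator matrices of the orbits of $\mathrm{Aut}(G)$ acting on ordered pairs of vertices, enumerating $\mathrm{Aut}(G)$ by brute force for a small graph. This is only an illustration of the underlying idea, not the paper's construction for higher tensor powers; the function names are ours.

```python
# Minimal sketch: spanning set for Aut(G)-equivariant layers R^n -> R^n.
# Brute-force enumeration of Aut(G); only sensible for small graphs.
import itertools
import numpy as np

def automorphisms(adj):
    """All vertex permutations that preserve the adjacency matrix."""
    n = adj.shape[0]
    return [p for p in itertools.permutations(range(n))
            if np.array_equal(adj[np.ix_(p, p)], adj)]

def orbit_basis(adj):
    """Indicator matrices of the orbits of Aut(G) acting on ordered vertex pairs.
    A matrix W is Aut(G)-equivariant iff its entries are constant on these
    orbits, so the indicator matrices span the space of first-order layers."""
    n = adj.shape[0]
    autos = automorphisms(adj)
    seen, basis = set(), []
    for i, j in itertools.product(range(n), repeat=2):
        if (i, j) in seen:
            continue
        orbit = {(p[i], p[j]) for p in autos}
        seen |= orbit
        B = np.zeros((n, n))
        for a, b in orbit:
            B[a, b] = 1.0
        basis.append(B)
    return basis

# Example: the 4-cycle, whose automorphism group is the dihedral group of order 8.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]])
print(len(orbit_basis(adj)))  # 3 learnable weights in a first-order layer
```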


An Algorithm for Computing with Brauer's Group Equivariant Neural Network Layers

The learnable, linear neural network layers between tensor power spaces of $\mathbb{R}^{n}$ that are equivariant to the orthogonal group, $O(n)$, the special orthogonal group, $SO(n)$, and the symplectic group, $Sp(n)$, were characterised in arXiv:2212.08630. We present an algorithm for multiplying a vector by any weight matrix for each of these groups, using category-theoretic constructions to implement the procedure. We achieve a significant reduction in computational cost compared with a naive implementation by making use of Kronecker product matrices to perform the multiplication. We show that our approach extends to the symmetric group, $S_n$, recovering the algorithm of arXiv:2303.06208 in the process.
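
The category-theoretic procedure itself is not reproduced here, but the basic Kronecker-product observation behind the cost reduction can be illustrated in a few lines: a matrix of the form $A \otimes B$ never has to be materialised, since $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(B X A^{\top})$ with column-major $\mathrm{vec}$. The NumPy snippet below is a minimal sketch of this identity only.

```python
# Illustration of the Kronecker-product identity behind the speedup:
# (A (x) B) vec(X) = vec(B X A^T), so the n^2 x n^2 product matrix
# never has to be formed explicitly.
import numpy as np

rng = np.random.default_rng(0)
n = 6
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

naive = np.kron(A, B) @ X.flatten(order="F")   # O(n^4) memory and work
fast  = (B @ X @ A.T).flatten(order="F")       # O(n^3) work, O(n^2) memory
assert np.allclose(naive, fast)
```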

Categorification of Group Equivariant Neural Networks

We present a novel application of category theory for deep learning. We show how category theory can be used to understand and work with the linear layer functions of group equivariant neural networks whose layers are some tensor power space of $\mathbb{R}^{n}$ for the groups $S_n$, $O(n)$, $Sp(n)$, and $SO(n)$. By using category-theoretic constructions, we build a richer structure that is not seen in the original formulation of these neural networks, leading to new insights. In particular, we outline the development of an algorithm for quickly computing the result of passing a vector through an equivariant, linear layer for each group in question. The success of our approach suggests that category theory could be beneficial for other areas of deep learning.

How Jellyfish Characterise Alternating Group Equivariant Neural Networks

Accepted at ICML 2023, Hawaii, USA.

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:27483-27495, 2023.

arXiv: 2301.10152

We provide a full characterisation of all of the possible alternating group ($A_n$) equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$. In particular, we find a basis of matrices for the learnable, linear, $A_n$--equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$.

2022

Brauer's Group Equivariant Neural Networks

Accepted for a live presentation at ICML 2023, Hawaii, USA.

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:27461-27482, 2023.

arXiv: 2212.08630

We provide a full characterisation of all of the possible group equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$ for three symmetry groups that are missing from the machine learning literature: $O(n)$, the orthogonal group; $SO(n)$, the special orthogonal group; and $Sp(n)$, the symplectic group. In particular, we find a spanning set of matrices for the learnable, linear, equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$ when the group is $O(n)$ or $SO(n)$, and in the symplectic basis of $\mathbb{R}^{n}$ when the group is $Sp(n)$.
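
For the smallest interesting case of the $O(n)$ characterisation, the space of equivariant maps $(\mathbb{R}^{n})^{\otimes 2} \to (\mathbb{R}^{n})^{\otimes 2}$ is spanned by the three Brauer diagrams on $2+2$ points: the identity, the swap of the two tensor factors, and the pairing $\delta_{ij}\delta_{kl}$ (contract, then re-expand). The sketch below only checks numerically that each of these matrices commutes with $Q \otimes Q$ for a random orthogonal $Q$; it is an illustration, not the paper's general construction.

```python
# The three Brauer diagrams on 2+2 points give spanning matrices for the
# O(n)-equivariant maps (R^n)^{(x)2} -> (R^n)^{(x)2}: the identity, the swap
# of the two tensor factors, and delta_{ij} delta_{kl} (contract, then re-expand).
# Equivariance means the matrix commutes with Q (x) Q for every orthogonal Q.
import numpy as np

n = 4
identity = np.eye(n * n)

swap = np.zeros((n * n, n * n))            # swap: e_i (x) e_j  ->  e_j (x) e_i
for i in range(n):
    for j in range(n):
        swap[j * n + i, i * n + j] = 1.0

vec_id = np.eye(n).flatten()               # vec of the identity matrix
contract_expand = np.outer(vec_id, vec_id) # delta_{ij} delta_{kl}

Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((n, n)))  # random orthogonal
QQ = np.kron(Q, Q)

for W in (identity, swap, contract_expand):
    assert np.allclose(W @ QQ, QQ @ W)     # each diagram matrix is O(n)-equivariant
```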

Connecting Permutation Equivariant Neural Networks and Partition Diagrams

arXiv: 2212.08648

We show how the Schur--Weyl duality that exists between the partition algebra and the symmetric group results in a stronger theoretical foundation for characterising all of the possible permutation equivariant neural networks whose layers are some tensor power of the permutation representation $M_n$ of the symmetric group $S_n$. In doing so, we unify two separate bodies of literature, and we correct some of the major results that are now widely quoted by the machine learning community. In particular, we find a basis of matrices for the learnable, linear, permutation equivariant layer functions between such tensor power spaces in the standard basis of $M_n$ by using an elegant graphical representation of a basis of set partitions for the partition algebra and its related vector spaces. Also, we show how we can calculate the number of weights that must appear in these layer functions by looking at certain paths through the McKay quiver for $M_n$. Finally, we describe how our approach generalises to the construction of neural networks that are equivariant to local symmetries.
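
One consequence of the set-partition basis is easy to check numerically: for $n \geq k + \ell$, the number of learnable weights in a layer $(M_n)^{\otimes k} \to (M_n)^{\otimes \ell}$ is the Bell number $B(k+\ell)$, since the basis matrices are indexed by set partitions of $k+\ell$ elements. The sketch below verifies this by brute-force orbit counting for small cases; it is for illustration only and does not use the diagrammatic machinery of the paper.

```python
# The basis matrices for S_n-equivariant layers (M_n)^{(x)k} -> (M_n)^{(x)l}
# are indexed by set partitions of k+l elements, so for n >= k+l the number
# of learnable weights is the Bell number B(k+l).  Brute-force check: count
# orbits of S_n acting diagonally on index tuples in [n]^{k+l}.
import itertools

def num_orbits(n, m):
    """Orbits of S_n on [n]^m, identified by the 'first occurrence' pattern."""
    patterns = set()
    for t in itertools.product(range(n), repeat=m):
        relabel, pattern = {}, []
        for x in t:
            relabel.setdefault(x, len(relabel))
            pattern.append(relabel[x])
        patterns.add(tuple(pattern))
    return len(patterns)

def bell(m):
    """Bell numbers via the Bell triangle."""
    row = [1]
    for _ in range(m - 1):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[-1]

for k, l in [(1, 1), (2, 1), (2, 2)]:
    n = k + l          # any n >= k + l gives the same count
    assert num_orbits(n, k + l) == bell(k + l)
    print(k, l, bell(k + l))   # 2, 5, 15 learnable weights respectively
```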

A Multigraph Approach for Performing the Quantum Schur Transform

Accepted for a poster presentation at QCTiP 2023, Cambridge, UK.

arXiv: 2204.10694

We take inspiration from the Okounkov--Vershik approach to the representation theory of the symmetric groups to develop a new way of understanding how the Schur--Weyl duality can be used to perform the Quantum Schur Transform. The Quantum Schur Transform is a unitary change of basis transformation between the computational basis of $(\mathbb{C}^d)^{\otimes n}$ and the Schur--Weyl basis of $(\mathbb{C}^d)^{\otimes n}$. We describe a new multigraph, which we call the Schur--Weyl--Young graph, that represents both standard Weyl tableaux and standard Young tableaux in the same diagram. We suggest a major improvement on Louck's formula for calculating the transition amplitudes between two standard Weyl tableaux appearing in adjacent levels of the Schur--Weyl--Young graph for the case $d=2$, merely by looking at the entries in the two tableaux. The key theoretical component that underpins our results is the discovery of a branching rule for the Schur--Weyl states, which we call the Schur--Weyl branching rule. This branching rule allows us to perform the change of basis transformation described above in a straightforward manner for any $n$ and $d$.
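
The smallest concrete instance of this change of basis may help fix ideas: for $n = 2$, $d = 2$, the Schur--Weyl basis of $(\mathbb{C}^{2})^{\otimes 2}$ is the triplet/singlet basis, and the Quantum Schur Transform is the familiar $4 \times 4$ unitary of Clebsch--Gordan coefficients. The snippet below simply writes this matrix down and checks the expected properties; it is not the multigraph-based algorithm of the paper.

```python
# Smallest instance of the Quantum Schur Transform (n = 2 qubits, d = 2):
# the Schur-Weyl basis of (C^2)^{(x)2} is the symmetric (triplet) subspace
# together with the antisymmetric (singlet) state, so the transform is the
# 4 x 4 unitary whose rows are those basis states in the computational basis.
import numpy as np

s = 1 / np.sqrt(2)
U_schur = np.array([
    [1, 0, 0, 0],    # |00>                      (spin-1, m = +1)
    [0, s, s, 0],    # (|01> + |10>) / sqrt(2)   (spin-1, m =  0)
    [0, 0, 0, 1],    # |11>                      (spin-1, m = -1)
    [0, s, -s, 0],   # (|01> - |10>) / sqrt(2)   (spin-0, singlet)
])

assert np.allclose(U_schur @ U_schur.T, np.eye(4))   # unitary change of basis

# The singlet row changes sign under swapping the two qubits, while the
# first three rows span the swap-symmetric subspace.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])
assert np.allclose(SWAP @ U_schur[3], -U_schur[3])
assert np.allclose(SWAP @ U_schur[:3].T, U_schur[:3].T)
```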