
What Is the Gerstenhaber Problem?

When Cayley introduced matrix algebra in 1858, he did much more than merely arrange numbers in a rectangular array. His definitions of addition, multiplication, and inversion produced an algebraic structure that has proved to be immensely useful, and which still holds many mysteries today.

An n\times n matrix has n^2 parameters, so the vector space \mathbb{R}^{n\times n} of real matrices has dimension n^2, with a basis given by the matrices E_{ij} = e_i^{}e_j^T, where e_i is the vector that is zero except for a 1 in the ith entry. In his original paper, Cayley noticed the important property that the powers of a particular matrix A can never span \mathbb{R}^{n\times n}. The Cayley–Hamilton theorem says that A satisfies its own characteristic equation, that is, p(A) = 0 where p(t) = \det(t I - A) is the characteristic polynomial of A. This means that A^n can be expressed as a linear combination of I, A, …, A^{n-1}, so the powers of A span a vector space of dimension at most n.
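
As a quick numerical illustration (a sketch in Python/NumPy, not part of the original discussion), we can stack the vectorized powers of a random matrix and confirm that they span a space of dimension at most n:

```python
# Sketch: check that I, A, A^2, ..., A^n span a space of dimension at most n.
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))

# Stack vec(A^k), k = 0, 1, ..., n, as the columns of Y (column-major vec).
powers = [np.linalg.matrix_power(A, k) for k in range(n + 1)]
Y = np.column_stack([P.flatten(order="F") for P in powers])

print(np.linalg.matrix_rank(Y))  # at most n = 6, by the Cayley-Hamilton theorem
```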

Gerstenhaber proved a generalization of this property in 1961: if A and B are two commuting n\times n matrices then the matrices A^iB^j, for all nonnegative i and j, generate a vector space of dimension at most n. This result is much more difficult to prove than the Cayley–Hamilton theorem. Gerstenhaber’s proof was based on algebraic geometry, but purely matrix-theoretic proofs have been given.
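
A small numerical check of Gerstenhaber's theorem (again my own sketch): matrices of the form A = X \otimes I and B = I \otimes Y always commute, and generically neither is a polynomial in the other, so they make a convenient commuting test pair.

```python
# Sketch: dim of span{A^i B^j} <= n for a commuting pair built from Kronecker products.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 2))
Y = rng.standard_normal((2, 2))

A = np.kron(X, np.eye(2))   # 4 x 4
B = np.kron(np.eye(2), Y)   # 4 x 4
assert np.allclose(A @ B, B @ A)   # A and B commute
n = A.shape[0]

# Stack vec(A^i B^j) for 0 <= i, j <= n-1 and compute the rank.
cols = [(np.linalg.matrix_power(A, i) @ np.linalg.matrix_power(B, j)).flatten(order="F")
        for i in range(n) for j in range(n)]
print(np.linalg.matrix_rank(np.column_stack(cols)))  # at most n = 4
```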

A natural question, called the Gerstenhaber problem, is: does this result extend to three matrices, that is, does the vector space

S_n = \mathrm{span}\{\, A^iB^jC^k: 0\le i,j,k \le n-1 \,\}

have dimension at most n for any n\times n matrices A, B, and C that commute with each other? (We can limit the powers to n-1 by the Cayley–Hamilton theorem.) The problem is defined over any field, but here I focus on the reals.

Before considering the three matrix case let us note that the answer is known to be “no” for four commuting 4\times 4 matrices A, B, C, and D. Indeed let

A = e_1^{}e_3^T, \quad  B = e_1^{}e_4^T, \quad  C = e_2^{}e_3^T, \quad  D = e_2^{}e_4^T

and note that all possible products of two of these matrices are zero, so the matrices commute pairwise. Yet I = A^0, A, B, C, D are clearly five linearly independent matrices. Hence Gerstenhaber’s result does not extend to four matrices. It also does not extend to five or more matrices because it is known that the failure of the result for one value of n implies failure for all larger n. The question, then, is whether the three matrix case is like the two matrix case or the case for four or more matrices.
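
This counterexample is easily verified numerically; the following sketch (mine, not from the article) checks pairwise commutativity and the linear independence of I, A, B, C, and D.

```python
# Sketch: verify the 4 x 4 counterexample for four commuting matrices.
import numpy as np
from itertools import permutations

def outer(i, j, n=4):
    """Return e_i e_j^T (1-based indices)."""
    E = np.zeros((n, n))
    E[i - 1, j - 1] = 1.0
    return E

A, B, C, D = outer(1, 3), outer(1, 4), outer(2, 3), outer(2, 4)

# All products of two of these matrices are zero, so they commute pairwise.
assert all(np.allclose(P @ Q, 0) for P, Q in permutations([A, B, C, D], 2))

# I, A, B, C, D are linearly independent: the stacked vecs have rank 5 > 4.
Z = np.column_stack([M.flatten(order="F") for M in [np.eye(4), A, B, C, D]])
print(np.linalg.matrix_rank(Z))  # 5
```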

A great deal of effort has been put into resolving the Gerstenhaber problem, so far without success. Here are two known facts.

  • The result holds for all n\le 11.
  • By a 1905 result of Schur, the dimension of S_n is at most 1 + \lfloor   n^2/4\rfloor, which is less than n^2 but still much bigger than n. (Here, \lfloor \cdot \rfloor is the floor function.)

One approach to investigating this problem is to look for a counterexample computationally. For some n\ge 12, choose three commuting n\times n matrices A, B, and C, select m\ge n monomials

X_p =  A^{i_p} B^{j_p} C^{k_p}, \quad 1\le p \le m,

form the matrix

Y = [\mathrm{vec}(X_1),             \mathrm{vec}(X_2), \dots,             \mathrm{vec}(X_m)],

where \mathrm{vec} converts a matrix into a vector by stacking the columns on top of each other, then compute \mathrm{rank}(Y), which is a lower bound on \mathrm{dim}(S_n), and check whether it is greater than n.
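
A bare-bones version of this computation might look as follows (my own sketch; the commuting triple used here is a trivial one, built from powers of a single nilpotent shift matrix, so it cannot yield a counterexample, but it exercises the machinery).

```python
# Sketch: rank(Y) as a lower bound on dim(S_n) for a commuting triple A, B, C.
import numpy as np

n = 12
N = np.diag(np.ones(n - 1), 1)        # nilpotent shift matrix
A, B, C = N, N @ N, N @ N @ N         # trivially commuting (all powers of N)

rng = np.random.default_rng(2)
m = 3 * n                             # number of monomials to sample
cols = []
for _ in range(m):
    i, j, k = map(int, rng.integers(0, n, size=3))
    M = (np.linalg.matrix_power(A, i)
         @ np.linalg.matrix_power(B, j)
         @ np.linalg.matrix_power(C, k))
    cols.append(M.flatten(order="F"))

Y = np.column_stack(cols)
r = np.linalg.matrix_rank(Y)          # lower bound on dim(S_n)
print(r, "exceeds n?", r > n)         # a "Eureka" would have r > n
```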

This simple-minded approach has some obvious difficulties. How do we choose A, B, and C? How do we choose the powers? How do we avoid overflow and underflow and compute a reliable rank, given that we might be dealing with large powers of large matrices?

Holbrook and O’Meara (2020), who have written several papers on the Gerstenhaber problem, which they call the GP problem, state that they “firmly believe the GP will turn out to have a negative answer”, and they have developed a sophisticated approach to searching for a case with \mathrm{dim}(S_n) > n, which they call a “Eureka”. They first note that A, B, and C can be assumed to be nilpotent. This means that all three matrices must be defective, because a nondefective nilpotent matrix is zero. Next they note that since commuting matrices are simultaneously unitarily triangularizable, A, B, and C can be assumed to be strictly upper triangular. Then they note that A can be assumed to be in Weyr canonical form.

The Weyr canonical form is a dual of the Jordan canonical form in which the Jordan matrix is replaced by a Weyr matrix, which is a direct sum of basic Weyr matrices. A basic Weyr matrix has one distinct eigenvalue and is upper block bidiagonal with diagonal blocks that are multiples of the identity matrix and superdiagonal blocks that are rectangular identity matrices. The difference between Jordan and Weyr matrices is illustrated by the example

J(\lambda) =     \left[\begin{array}{ccc|cc}     \lambda &  1      & 0 &&\\     0       & \lambda & 1 &&\\     0       & 0       & \lambda &&\\\hline             &         &   &\lambda & 1\\             &         &   & 0 &\lambda      \end{array}\right], \quad     W(\lambda) =     \left[\begin{array}{cc|cc|c}     \lambda &  0      & 1 & 0&\\     0       & \lambda & 0 & 1&\\\hline             &         & \lambda & 0 &1\\             &         & 0  &\lambda & 0\\\hline             &         &   & 0 &\lambda      \end{array}\right],

where the Jordan matrix J(\lambda) and Weyr matrix W(\lambda) are related by J(\lambda) = P^TW(\lambda)P for some permutation matrix P. There is an elegant way of relating the block sizes in the Jordan and Weyr matrices via a Young diagram. Form an array whose kth column contains j_k dots, where j_k is the size of the kth diagonal block of J:

\begin{array}{c|ccc}         &  j_1 & j_2\\\hline   w_1 &  \bullet & \bullet\\   w_2 &  \bullet & \bullet\\   w_3 &  \bullet & \end{array}

Then the number of dots in the kth row is the size of the kth diagonal block in the Weyr form.
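
In other words, the Weyr block sizes are the conjugate partition of the Jordan block sizes. A short sketch of the conversion (my own illustration):

```python
# Sketch: Weyr block sizes as the conjugate partition of the Jordan block sizes.
def jordan_to_weyr(jordan_sizes):
    """Given the Jordan block sizes for one eigenvalue, return the Weyr block sizes."""
    sizes = sorted(jordan_sizes, reverse=True)
    # w_k = number of Jordan blocks of size at least k (the dots in row k of the diagram).
    return [sum(1 for s in sizes if s >= k) for k in range(1, sizes[0] + 1)]

print(jordan_to_weyr([3, 2]))  # [2, 2, 1], matching the example above
```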

The reason for using the Weyr form is that whereas any matrix that commutes with a Jordan matrix has Toeplitz blocks, any matrix that commutes with a Weyr matrix is block upper triangular and is uniquely determined by the first block row. By choosing a Weyr form for A, commuting matrices B and C can be built up in a systematic way.
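
One way to see this structure numerically (a sketch of my own, not the authors' code) is to solve the commutation equation WK = KW, which in vectorized form is (I \otimes W - W^T \otimes I)\mathrm{vec}(K) = 0, and inspect the nullspace: every basis matrix is block upper triangular with respect to the Weyr block sizes.

```python
# Sketch: matrices commuting with a basic Weyr matrix are block upper triangular.
import numpy as np
from scipy.linalg import block_diag, null_space

def basic_weyr(lam, sizes):
    """Basic Weyr matrix: diagonal blocks lam*I, rectangular identity superdiagonal blocks."""
    W = block_diag(*[lam * np.eye(s) for s in sizes])
    offsets = np.cumsum([0] + list(sizes))
    for b in range(len(sizes) - 1):
        r, c = offsets[b], offsets[b + 1]
        W[r:r + sizes[b], c:c + sizes[b + 1]] = np.eye(sizes[b], sizes[b + 1])
    return W

sizes = [2, 2, 1]
W = basic_weyr(0.0, sizes)        # the 5 x 5 W(lambda) above, with lambda = 0
n = W.shape[0]

# Nullspace of the vectorized commutation equation W K - K W = 0.
basis = null_space(np.kron(np.eye(n), W) - np.kron(W.T, np.eye(n)))
for v in basis.T:
    K = v.reshape(n, n, order="F")
    # Entries below the (2, 2, 1) block diagonal should vanish.
    assert abs(K[2:, :2]).max() < 1e-10 and abs(K[4:, :4]).max() < 1e-10
print(basis.shape[1], "commuting basis matrices, all block upper triangular")
```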

Thanks to a result of O’Meara (2020) it suffices to compute modulo a prime p, so the computations can be done in exact arithmetic, removing the need to worry about rounding errors or overflow.
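
Here is a minimal sketch (my own, not the Holbrook and O'Meara code) of the kind of calculation this result makes possible: an exact rank computation over GF(p) by Gaussian elimination.

```python
# Sketch: exact rank of an integer matrix modulo a prime p by Gaussian elimination.
import numpy as np

def rank_mod_p(Y, p):
    """Rank of the integer matrix Y over the field GF(p)."""
    M = np.array(Y, dtype=np.int64) % p
    rows, cols = M.shape
    rank = 0
    for col in range(cols):
        # Find a pivot in this column at or below the current rank row.
        pivots = np.nonzero(M[rank:, col])[0]
        if pivots.size == 0:
            continue
        piv = rank + pivots[0]
        M[[rank, piv]] = M[[piv, rank]]            # swap the pivot row into place
        M[rank] = (M[rank] * pow(int(M[rank, col]), -1, p)) % p   # scale by modular inverse
        for r in range(rows):                      # eliminate the column elsewhere
            if r != rank and M[r, col]:
                M[r] = (M[r] - M[r, col] * M[rank]) % p
        rank += 1
        if rank == rows:
            break
    return rank

# Example: singular over the rationals, and of rank 2 over GF(7) too.
print(rank_mod_p([[1, 2, 3], [4, 5, 6], [7, 8, 9]], p=7))  # 2
```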

Holbrook and O’Meara have mostly tried matrices up to dimension 50, but they feel that “the Loch Ness monster probably lives in deeper water, closer to 100\times 100”. Their MATLAB codes (40 nicely commented but not optimized M-files) are available on request from the address given in their preprint. If you have the time to spare you might want to experiment with the codes and try to find a Eureka.

References

This is a minimal set of references, which contain further useful references within.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

