

The 100th anniversary of Moore-Penrose inverse and its role in statistics and ma...
source link: http://www.zhengwenjie.net/pseudoinverse/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

The 100th anniversary of Moore-Penrose inverse and its role in statistics and machine learning
All men are equal, but not all matrices have inverses. For instance, rectangular matrices do not have inverses; square matrices without full rank do not have inverses. The matrix rights activists (i.e. E. H. Moore, 1920; Arne Bjerhammar, 1951; and Roger Penrose, 1955) among mathematicians thus stood out and spoke for these computationally unfavored matrices. Thanks to their continual efforts, every matrix finally got an inverse, dubbed the Moore-Penrose (pseudo) inverse. These previously unfavored matrices have since contributed to the academia and revolutionized statistics and machine learning. In memory of its 100th anniversary, let me talk, in this post, about the Moore-Penrose inverse and its applications.
The traditional inverse
For a matrix An×nAn×n, if there exists such a matrix A−1A−1 that AA−1=A−1A=I,AA−1=A−1A=I, where II is the identity matrix, then A−1A−1 is called the inverse of AA.
This concept is taught in the university and is what helps us solve the linear systems. That is, for an invertible matrix AA, if Ax=b,Ax=b, then we have x=A−1b.x=A−1b. Nonetheless, not all matrices are invertible, and hence some matrices have multiple solutions, and others have none.
The Moore-Penrose inverse
The existence of inverse brings us great convenience, and its absence is like the pain in the eye. Indeed, sometimes we desperately want an inverse, just like some of us desperately wanting a boyfriend or a girlfriend. To help those lonely matrices, E.H. Moore (1920) generalized the concept of inverse and hence helped everyone find his match, dubbed the Moore-Penrose inverse.
Definition
For a rectangular matrix Am×nAm×n, its Moore-Penrose inverse is defined as a matrix A+n×mAn×m+ satisfying all of the following four criteria:
- AA+A=AAA+A=A
- A+AA+=A+A+AA+=A+
- (AA+)T=AA+(AA+)T=AA+
- (A+A)T=A+A(A+A)T=A+A
Examples
The above equations look frightening, so let us see some examples first. Obviously, an inverse is itself a Moore-Penrose inverse. Another less trivial example takes the form of (ATA)−1AT(ATA)−1AT. Does it sound familiar? Yes, this is the projection operator in the least square. Indeed, when AA has full rank and m>nm>n (i.e., AA is tall and thin), we have A+=(ATA)−1AT.A+=(ATA)−1AT.
The next time you are asked during the interview the least square problem minβ∥y−Xβ∥22,minβ‖y−Xβ‖22, you can showoff and write ^β=X+y,β^=X+y, leaving the interviewer a troubled look. Also, remember: true men do not give explanations.
Existence
You might question the existence of Moore-Penrose inverse, just like some heartbroken young ladies desert the fairy tale and no longer believe the existence of true love.
To prove that I am not a charlatan, let me humbly prove the existence of the Moore-Penrose by constructing itself. Let A=UΣVTA=UΣVT be its Singular Value Decomposition (yes, SVD always exists), then A+=VΣ+UT.A+=VΣ+UT. For a rectangular diagonal matrix such as ΣΣ, we get its Moore-Penrose inverse by taking the reciprocal of each nonzero element on the diagonal, leaving the zeros in place, and then transposing the matrix.
Uniqueness
Therefore, the true love exists, but is it unique? Is the Moore-Penrose inverse unique? Yes, it is. There is precisely one matrix A+A+ satisfying all four conditions of the definition.
Moreover, the Moore-Penrose inverse of the Moore-Penrose inverse is the original matrix. That is, (A+)+=A.(A+)+=A. You are the true love of your true love; the one you love loves you. No triangular relationship or polygamy here.
By the way, the uniqueness is a consequence of the last two conditions of the four. A matrix satisfying only the first condition is known as a generalized inverse. If it also satisfies the second condition, it is called a generalized reflexive inverse. They are not unique.
Applications of the Moore-Penrose inverse
E. H. Moore built a utopia and made matrices fall in love again. In this section, let us see the power of love by visiting some applications in statistics and machine learning.
Linear least square
Like previously mentioned, the least square estimate ^β=X+yβ^=X+y.
Minimum norm solution to a linear system
In the previous case XX is a tall and thin matrix (i.e., m>nm>n), whereas in this case XX is a short and fat matrix (i.e., m<nm<n). Needless to say, the linear system Xβ=yXβ=y is under-determined. In this case, the solution X+yX+y is the one with the minimum Euclidean norm. Also, see Bartlett et al. (2019)’s paper Benign Overfitting in Linear Regression for recent development.
Obtain all solutions of a linear system
If the linear system Xβ=yXβ=y has any solutions, they are all given by β=X+y+(I−X+X)wβ=X+y+(I−X+X)w for arbitrary vector ww.
Chi-squared test
We have all been taught in the statistics class how to use Student’s t-test for the mean of a single random variable, such as the height of a population. However, sometimes we are asked to test the means of several random variables (i.e., random vector) in one fell swoop. For instance, the economists may want to test the average of GDP, of inflation rate, and of unemployment rate altogether. In this case, we need to use the Chi-squared test, where we have to construct a statistic in the form of xTAxxTAx. Indeed, this AA is taken as the Moore-Penrose inverse of the covariance matrix.
Proof. Let xx be a multivariate Gaussian random vector with zero mean vector and covariance matrix ΣΣ. That is, E[x]=0E[x]=0 and E[xxT]=ΣE[xxT]=Σ.
Obviously, ΣΣ is a positive semi-definite matrix, and it is diagonalizable. That is, there exists an orthogonal matrix PP such that Σ=PTDP=PT√D√DP,Σ=PTDP=PTDDP, where √DD is a matrix taking square root on DD elementwise.
To prove that xTΣ+xxTΣ+x follows a χ2χ2 distribution, first we need to show that it can be expressed as the squared norm of a multivariate Gaussian random vector. Indeed, xTΣ+x=xTPT√D+√D+Px=(√D+Px)T(√D+Px),xTΣ+x=xTPTD+D+Px=(D+Px)T(D+Px), where the first equality follows from the construction of the Moore-Penrose inverse via SVD. Since the linear transform of a Gaussian vector is still a Gaussian vector, √D+PxD+Px is effectively a Gaussian vector, and it also preserves the zero mean.
Then, we need to show that this Gaussian random vector has idempotent covariance matrix. E[(√D+Px)(xTPT√D+)]=√D+PE[xxT]PT√D+=√D+PΣPT√D+=√D+D√D+.E[(D+Px)(xTPTD+)]=D+PE[xxT]PTD+=D+PΣPTD+=D+DD+. The last term is a diagonal matrix with entries 0’s or 1’s, and it is hence idempotent. Moreover, the corresponding χ2χ2 distribution has rank(ΣΣ) degrees of freedom.
Summary
In this post, I introduced the Moore-Penrose inverse, its existence and uniqueness, as well as some examples. I also show four applications in statistics and machine learning:
- Least square
- Minimum norm solution to a linear system
- Obtain all solutions of a linear system
- Chi-squared test
You may also like
Recommend
-
20
A team of researchers from CMU and Technion recently introduced a new system, Penrose, that can turn complex mathematical notations into various styles of simple diagrams. The novel system rapidly attracted attention on s...
-
22
网友:图表界的LaTeX 鱼羊 发自 凹非寺 量子位 报道 | 公众号 QbitAI 画数学插图令人头秃? 现在,CMU的研究人员们开发出了一款实用工具 Penrose : 以 纯文本...
-
11
A singular mind: Roger Penrose on his Nobel Prize[Illustration by Morten Morland]Text settingsCommentsShareSir Roger Penrose was at school when he realised that his mind worked in an unusual way. ‘I thought, maybe when I...
-
11
Game of Life running on Penrose tilesWe value your privacyWe and our st...
-
12
penrose (0.2.0)1,469 views•Jan 17, 2021441ShareSave
-
22
Audeze Penrose Wireless Gaming Headphones Review: A Sensational Audio Experience By Gavin Phillips Published 3 hours ago...
-
9
Roger Penrose and Black Holes 网站首页 往昔追忆 浮光掠影 科学园地 技术广角 笑傲江湖 翻译作品 站长微博 评论选录 Welcome to Changhai Lu's Homepage I know nothi...
-
14
News & Events 100th Anniversary of the Tulsa Massacre: A conversation with Cord Je...
-
12
Avalon Penrose on Twitter: "a normal person explains cryptocurrency: https://t.co/FVSIGcaMcD"Don’t miss what’s happeningPeople on Twitter are the first to know.
-
1
Disney launches web3 music store to celebrate its 100th anniversary about 3 hours ago
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK