
Can QR Decomposition Be Actually Faster? Schwarz-Rutishauser Algorithm

source link: https://www.codeproject.com/Articles/5319754/Can-QR-Decomposition-Be-Actually-Faster-Schwarz-Ru

Introduction

QR decomposition is one of the fundamental and, arguably, most interesting operations in linear algebra, with well-known applications in many fields of science and engineering. Research on QR decomposition dates back to the beginning of the 20th century. Today, an entire class of QR decomposition algorithms is effectively used in mathematics to solve the linear least-squares (LS) problem and systems of linear equations, and serves as an essential step in eigenvalue computation and singular value decomposition (SVD) methods [1].

Over the past few decades, the rapid growth of computer science and IT has drawn researchers' attention to efficient QR factorization methods for large matrices, used as an essential step of many other algorithms that solve real-world problems, such as:

  • Transformation of graphical images and digital photography, including the rotation, distortion, and 3D-perspective filters widely used in a variety of graphics editing apps, such as Adobe Photoshop, Affinity Photo, GIMP, etc.
  • Determining the unique (i.e., "singular") parameters of electric signals via interpolation-based methods, widely used in digital signal processing to recognize noise and other artifacts, and thus to recover data corrupted by a distorted signal.
  • Analysis of vibration levels in industrial machines and mechanisms, based on principal component analysis (PCA) methods that represent parameter magnitudes in terms of factors, which can be explanatory when detecting vibration and acoustic environmental interference [7].
  • Extracting knowledge and insights from huge statistical datasets to reveal trends, showing what has the most critical impact on sales volumes, annual stock-market closing prices, and other estimates made by trading companies, helping them run their business effectively, increase popularity, and reduce the costs of goods, digital content, and services merchandised to billions of online customers worldwide [1,3,8].

Moreover, the QR factorization process yields an orthogonal matrix of factors, a special kind of matrix in which the most common or similar factor values are aggregated into distinct regions. This, in turn, benefits effective data clustering when solving various classification problems, serving as an alternative to many AI-based supervised learning algorithms that require continuous training.

In this article, I will introduce a number of fundamental techniques for performing QR factorization, such as the classical Gram-Schmidt orthogonalization process and Householder reflections, as well as the optimized Schwarz-Rutishauser algorithm, which addresses the problem of decomposing large matrices [1,5].

Specifically, readers will learn how the famous Gram-Schmidt orthogonalization process can be optimized, significantly reducing its complexity, based on the Schwarz-Rutishauser algorithm introduced below [5].

Additionally, I will survey the complexity of the Schwarz-Rutishauser algorithm, comparing it to other existing QR factorization methods, such as the classical Gram-Schmidt process and Householder reflections [1,4].

Background

Before we begin, let's take a short glance at the QR decomposition procedure. The entire QR factorization process relies on the decomposition of a real or complex matrix A into a product of two matrices, Q and R. Each of these matrices can be computed via several methods.

For instance, the matrix Q of orthogonal column vectors can be found using a variety of methods, such as the Gram-Schmidt process, Householder reflections, Givens rotations, etc. In turn, the upper triangular matrix R can be found by multiplying Q's transpose by the matrix A. Finally, it is convenient to approximate the diagonal matrix of eigenvalues E by iterating the product of R and Q (the QR eigenvalue algorithm), and to find the corresponding eigenvectors by solving the resulting linear system in R via the Gauss-Jordan method.

The entire matrix factorization process is mainly based on the computation of the orthogonal matrix Q, which is the most essential step and can be done by applying one of several algorithms, discussed in the succeeding paragraphs.
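
To make this background concrete, here is a minimal sketch (my own illustration, relying on NumPy's built-in np.linalg.qr rather than the implementations developed below) that factors a sample matrix, recovers R as the product of Q's transpose and A, and approximates the eigenvalues by iterating the product of R and Q:

Python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Factor A into an orthogonal Q and an upper triangular R.
Q, R = np.linalg.qr(A)

# R can equivalently be recovered as the product of Q's transpose and A.
assert np.allclose(R, Q.T @ A)

# QR eigenvalue iteration: repeatedly re-factorize and form A' = R * Q.
# The iterates converge to an upper triangular matrix whose diagonal
# holds the eigenvalues of A (here, 5 and 2).
for _ in range(50):
    Q, R = np.linalg.qr(A)
    A = R @ Q

print(np.diag(A))   # ~ [5. 2.]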

QR Decomposition Algorithm

QR factorization is a decomposition of a real or complex matrix A (m x n) into a product of an orthogonal matrix Q and an upper triangular matrix R, such that [1]:

$$A = Q R \qquad (1.1)$$

$$Q^T Q = Q Q^T = I \qquad (1.2)$$

Normally, the computation of the matrix 𝙌 yields the factorization of the matrix 𝘼 [1,4], with 𝙍 obtained as:

$$R = Q^T A \qquad (1.3)$$

In the paragraphs below, we will thoroughly discuss the most common features of both matrices 𝙌 and 𝙍 [4].

An orthogonal matrix Q:

Generally, the computation of the orthogonal matrix 𝙌 leads to the factorization of matrix 𝘼 [4]; both full and reduced factorizations exist. QR decomposition can be applied not only to square but also to rectangular matrices, even if a matrix does not have full rank. The reduced (thin) factorization of 𝘼 gives a matrix 𝙌 of the same shape as 𝘼, so that the number of columns in 𝙌 and 𝘼 is equal [5]. The factorization of matrix 𝘼 is a useful feature, applied whenever the values in 𝘼 must be represented in terms of their factors [1,3].

By definition [1], 𝙌 is an orthogonal matrix: its product with its own transpose is the identity matrix 𝙄, due to the orthogonal property (1.2). Each column vector in 𝙌 is orthogonal to all the others. Together, those column vectors form an orthonormal basis of 𝘼, which is much like mapping each column vector from the Euclidean space ℝⁿ onto the coordinate system of another space ℝ′ⁿ. The term "orthonormal" means that the orthogonal column vectors of 𝙌 are normalized by dividing them by their Euclidean norms [1,2,6].

An upper triangular matrix R:

The upper triangular matrix 𝙍 can be easily computed as the product of 𝙌's transpose and the matrix 𝘼, so that 𝙍 is a matrix all of whose sub-diagonal elements are 0s. Specifically, the computation of 𝙍 amounts to the Gaussian elimination of 𝘼, reducing it to row-echelon form [1,4,5,6]. To eliminate the elements below the diagonal element of 𝙍, the scalar elements 𝒓ᵢₖ and 𝒓ₖₖ are multiplied by the corresponding unit basis vector 𝒆ₖ of the sub-space ℏ in a Hilbert ambient space ℋ [2].

The upper triangular matrix 𝙍, mentioned above, can be effectively used for solving the linear least-squares (LS) problem, as well as for finding a single non-trivial solution of the linear equation system 𝑨𝒙=𝒃 [1].
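
As an illustration of this use, here is a hedged sketch (the helper name lstsq_via_qr and the use of np.linalg.qr are my own choices for illustration) that solves the least-squares problem through the factorization, by back-substitution on the triangular system Rx = Qᵀb:

Python
import numpy as np

def lstsq_via_qr(A, b):
    # A = QR reduces min |Ax - b| to the triangular system R x = Q^T b.
    Q, R = np.linalg.qr(A)
    y = Q.T @ b
    n = R.shape[1]
    x = np.zeros(n)
    # Back-substitution over the upper triangle of R.
    for k in range(n - 1, -1, -1):
        x[k] = (y[k] - R[k, k + 1:] @ x[k + 1:]) / R[k, k]
    return x

A = np.array([[3.0, 7.0], [4.0, 5.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
print(lstsq_via_qr(A, b))   # least-squares solution of Ax = b
assert np.allclose(lstsq_via_qr(A, b), np.linalg.lstsq(A, b, rcond=None)[0])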

A QR decomposition of a trivial 2-by-2 real matrix 𝘼 can be pictured as:

$$\begin{bmatrix} 3 & 7 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} -0.6 & -0.8 \\ -0.8 & 0.6 \end{bmatrix} \begin{bmatrix} -5 & -8.2 \\ 0 & -2.6 \end{bmatrix}$$

[Figure: a) the column vectors of A and the orthonormal columns of Q; b) the column vectors of R against the unit basis vectors e₁ and e₂.]

The 1st and 2nd column vectors a₁(3;4) and a₂(7;5), (red) and (navy), are shown in the figure above.

Figure a), on the left, illustrates the column vectors q₁(-0.6,-0.8) and q₂(-0.8,0.6), (gold) and (blue), obtained as the result of the matrix 𝘼 factorization. Each of these orthogonal unit vectors 𝙦 always has length |𝙦|=1.

Vectors q₁ and q₂ are perpendicular to one another, and thus orthogonal. In the simplest case, each of these vectors can be obtained by rotating the corresponding column vector of 𝘼 by a 90º angle. Alternatively, such a rotation can be performed by multiplying each column of the multi-vector 𝘼 by a trivial Jacobi (Givens) rotation matrix.

The corresponding vectors r₁(-5;-8.2) and r₂(0;-2.6) in 𝙍, (gold) and (blue), are plotted in figure b), on the right. Unlike the columns of the multi-vector 𝙌, r₁ and r₂ are not orthogonal. These vectors are obtained column-by-column as the result of the scalar multiplication of each column vector of 𝙍 by the corresponding unit basis vectors 𝒆₁ and 𝒆₂ (green) of each sub-space in a Hilbert ambient space ℋ, respectively, as shown above.
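
This toy example can be reproduced numerically. A small sketch follows (using np.linalg.qr purely for illustration; note that the signs of Q's columns and R's rows may be flipped by a factor of -1 depending on the implementation):

Python
import numpy as np

A = np.array([[3.0, 7.0],
              [4.0, 5.0]])

Q, R = np.linalg.qr(A)
print(Q)    # [[-0.6 -0.8]
            #  [-0.8  0.6]]
print(R)    # [[-5.  -8.2]
            #  [ 0.  -2.6]]

# The columns of Q are orthonormal, property (1.2) ...
assert np.allclose(Q.T @ Q, np.eye(2))
# ... and the factors reproduce A.
assert np.allclose(Q @ R, A)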

Classical QR Decomposition Methods

Presently, there are at least three classical methods, widely used for computing the QR decomposition of real and complex matrices:

  • Gram-Schmidt Orthogonalization
  • Householder Reflections
  • Givens Rotations

Each of these methods has its benefits and disadvantages. For example, Householder reflections typically become very involved when used for the QR decomposition of complex (non-real) matrices. In turn, the Givens rotations method incurs a precision loss over the final iterations and is inefficient when used for the decomposition of large matrices.

In this article, I will limit the discussion to an explanation of the classical Gram-Schmidt process and the Schwarz-Rutishauser algorithm. I will use Householder reflections and Givens rotations solely for performance assessment, surveying their complexity compared to the Schwarz-Rutishauser algorithm being discussed.

In the succeeding paragraphs, I will introduce the Schwarz-Rutishauser algorithm, which is one of the most successful modifications of the Gram-Schmidt orthogonalization process.

Gram-Schmidt Orthogonalization

The Gram-Schmidt process, developed at the turn of the 20th century, was one of the first successful methods for the orthogonalization of real or complex matrices [1,4,5,6].

This method iteratively finds the matrix 𝑸 of 𝑨, based on the orthogonal projection of each vector 𝒂ₖ ∈ 𝑨, 𝒌=𝟏..𝒏 onto the span of the already known vectors 𝗾ᵢ ∈ 𝙌(𝒌), 𝒊=𝟏..𝒌−𝟏. Each new orthogonal vector 𝙦ₖ ∈ 𝙌 is computed by subtracting from the corresponding vector 𝒂ₖ ∈ 𝑨 the sum of its projections onto those vectors. Finally, the upper triangular matrix 𝑹 can be easily found as the product of 𝑸's transpose and the matrix 𝑨, using equation (1.3) above.

An orthogonal projection of 𝒂 onto 𝒒 is obtained by using equation (2) below:

$$\mathrm{proj}_{q}\,a = \frac{\langle q, a \rangle}{\langle q, q \rangle}\, q \qquad (2)$$

Normally, the Gram-Schmidt process yields an orthogonal matrix 𝙌 and can be algebraically expressed as:

$$q_k = a_k - \sum_{i=1}^{k-1} \mathrm{proj}_{q_i} a_k, \qquad k = 1..n \qquad (3)$$

or, alternatively:

$$q_k = a_k - \sum_{i=1}^{k-1} \frac{\langle q_i, a_k \rangle}{\langle q_i, q_i \rangle}\, q_i \qquad (3.1)$$

Each new vector 𝙦ₖ ∈ 𝙌 is finally normalized by dividing it by its Euclidean norm |𝙦ₖ|.

A fragment of code in Python, listed below, demonstrates an implementation of the classical Gram-Schmidt orthogonalization process:

Python
import numpy as np
import numpy.linalg as lin

def conj_transpose(a):
    # Conjugate (Hermitian) transpose of a real or complex array.
    return np.conj(np.transpose(a))

def proj_ab(a, b):
    # Orthogonal projection of vector a onto vector b, equation (2):
    # proj_b(a) = (<b, a> / <b, b>) * b.
    return a.dot(conj_transpose(b)) / lin.norm(b) ** 2 * b

def qr_gs(A, type=complex):
    A = np.array(A, dtype=type)
    (m, n) = np.shape(A)

    # For wide matrices (m < n), factorize only the leading m columns.
    (m, n) = (m, n) if m > n else (m, m)

    Q = np.zeros((m, n), dtype=type)

    for k in range(n):
        # Sum the projections of the k-th column of A onto the
        # previously orthogonalized columns of Q, equation (3).
        pr_sum = np.zeros(m, dtype=type)
        for j in range(k):
            pr_sum += proj_ab(A[:, k], Q[:, j])

        # Subtract the projections, then normalize the new column.
        Q[:, k] = A[:, k] - pr_sum
        Q[:, k] = Q[:, k] / lin.norm(Q[:, k])

    # R is the (conjugate) transpose of Q times A, equation (1.3).
    if type == complex:
        R = conj_transpose(Q).dot(A)
    else:
        R = np.transpose(Q).dot(A)

    return -Q, -R

The Gram-Schmidt orthogonalization process is intuitively simpler than the Householder reflections or Givens rotations methods. However, it has some disadvantages, such as numerical instability, as well as a notably high computational complexity, about O(2mn²), when applied to large matrices. Also, it requires a modification based on determining a matrix's leading dimension when performing the decomposition of rectangular m-by-n matrices [4].

Schwarz-Rutishauser Algorithm

As you've probably noticed, the classical Gram-Schmidt algorithm has a typically high complexity, caused by the computation of the orthogonal projection of each vector 𝒂ₖ ∈ 𝑨 onto the vectors 𝗾ᵢ ∈ 𝙌(𝒌). In this case, the complexity of the orthogonal projection operator is about O(3m), which has a negative impact on the overall performance of the Gram-Schmidt process itself [4].

Thus, we need a slightly different method for the orthogonalization of 𝑨, rather than computing multiple sums of each 𝒂ₖ vector's projections onto 𝙌(𝒌).

The Schwarz-Rutishauser algorithm is a modification of the Gram-Schmidt matrix orthogonalization method, proposed by H. R. Schwarz, H. Rutishauser, and E. Stiefel in their work "Numerik symmetrischer Matrizen" (Stuttgart, 1968) [6]. The research was motivated by the goal of reducing the complexity of the existing Gram-Schmidt projection-based methods, improving their efficiency and numerical stability [5] over the popular Householder reflections and rotation methods.

The main idea behind the Schwarz-Rutishauser algorithm is that the complexity of 𝑨's orthogonalization can be drastically reduced, compared to the classical Gram-Schmidt process, O(2mn²), and the Householder reflections method, O(2mn² − 2n³/3) [5].

To provide a better understanding of the Schwarz-Rutishauser algorithm, let's take a closer look at equations (2) and (3.1), shown above.

While computing the sum of projections of 𝒂ₖ onto the span 𝙌(𝒌), we must divide the scalar product ⟨𝒒ᵢ,𝒂ₖ⟩ by the squared norm |𝙦ᵢ|², which normalizes the product [2]. However, this is not necessary, since each vector 𝗾ᵢ ∈ 𝙌(𝒌) has already been normalized during one of the previous steps of the algorithm.

Accordingly, equation (3.1) can be simplified by eliminating the division by the squared norm in the sum of projections [5]:

$$q_k = a_k - \sum_{i=1}^{k-1} \langle q_i, a_k \rangle\, q_i \qquad (4.1)$$

Since 𝑹 is the product of 𝑸's transpose and 𝑨, we can easily compute the 𝒊-th element of each column vector 𝒓ₖ as:

$$r_{ik} = \langle q_i, a_k \rangle \qquad (4.2)$$

Herein, the scalar product of the vectors 𝗾ᵢ and 𝒂ₖ is multiplied by the unit basis vector 𝙚ₖ of the corresponding hyperplane in Hilbert's ambient space ℋ [2].

Finally, let's substitute equation (4.2) into (4.1), so that we have:

$$q_k = a_k - \sum_{i=1}^{k-1} r_{ik}\, q_i \qquad (4.3)$$

Since 𝑸 is initially set equal to 𝑨 (𝑸 = 𝑨), the computation of the vectors 𝗾 and 𝒓 can be done recursively, using equations (4.2) and (4.3) above [5]:

$$r_{ik} = \langle q_i, q_k \rangle \qquad (5.1)$$

$$q_k \leftarrow q_k - r_{ik}\, q_i, \qquad i = 1..k-1 \qquad (5.2)$$

According to vector arithmetic, to get each new 𝒌-th column vector 𝙦ₖ ∈ 𝙌, we subtract the product of each previous vector 𝗾ᵢ ∈ 𝙌(𝒌) and 𝒓ᵢₖ from the current vector 𝙦ₖ, which is also called the vector rejection of 𝒂ₖ. Recall that 𝙦ₖ is initially equal to the original vector 𝒂ₖ (𝙦ₖ=𝒂ₖ), so the sum operator can finally be removed from equation (4.3).

In this case, we also need to compute 𝒓ₖₖ, the 𝒌-th diagonal element of 𝑹, as the norm |𝙦ₖ|:

$$r_{kk} = \lVert q_k \rVert \qquad (5.3)$$

and then normalize the new 𝒌-th column vector 𝙦ₖ ∈ 𝙌, dividing it by its norm, which equals the diagonal element 𝒓ₖₖ from the preceding step [5].

Finally, we can briefly formulate the Schwarz-Rutishauser algorithm:

1. Initialize the matrix 𝙌 equal to 𝑨 (𝙌=𝑨), and 𝑹 as a matrix of 0s.

2. For each 𝒌-th column vector 𝙦ₖ ∈ 𝙌, 𝒌=𝟏..𝒏, do the following:

2.1. For each of the already known vectors 𝗾ᵢ ∈ 𝙌(𝒌), 𝒊=𝟏..𝒌−𝟏:

  • Get the 𝒊-th element 𝒓ᵢₖ of the 𝒌-th column vector 𝒓ₖ ∈ 𝑹, equation (5.1);
  • Update the 𝒌-th column vector 𝙦ₖ ∈ 𝙌, equation (5.2);

2.2. Repeat step 2.1 for all 𝒊 to orthogonalize the vector 𝙦ₖ and obtain the 𝒌-th column vector 𝒓ₖ.

2.3. Compute the diagonal element 𝒓ₖₖ as the 𝙦ₖ-vector's norm, 𝒓ₖₖ=|𝙦ₖ|, equation (5.3).

2.4. Normalize the 𝒌-th column vector 𝙦ₖ ∈ 𝙌, dividing it by its norm, equal to 𝒓ₖₖ from step 2.3.

3. Proceed with the orthogonalization of 𝑨 in steps 2.1–2.4 for every column to compute 𝑨's orthonormal basis 𝙌 and the upper triangular matrix 𝑹.

4. Once completed, return the negated resultant matrices -𝙌 and -𝑹 from the procedure.

As you can see from the fragment of code below, the implementation of the Schwarz-Rutishauser algorithm is drastically shorter than that of the classical Gram-Schmidt process. Moreover, it provides the ability to compute the vectors of the matrices 𝑹 and 𝙌 in place [5].

A fragment of code demonstrating an implementation of the Schwarz-Rutishauser algorithm is listed below:

Python
import numpy as np
import numpy.linalg as lin

def qr_gs_modsr(A, type=complex):
    A = np.array(A, dtype=type)
    (m, n) = np.shape(A)

    # Q starts as a copy of A and is orthogonalized in place;
    # R accumulates the projection coefficients.
    Q = np.array(A, dtype=type)
    R = np.zeros((n, n), dtype=type)

    for k in range(n):
        for i in range(k):
            # r_ik = <q_i, q_k>, equation (5.1); conjugating q_i makes
            # the inner product Hermitian, so complex input works too.
            R[i, k] = np.conj(Q[:, i]).dot(Q[:, k])
            # Vector rejection of the k-th column, equation (5.2).
            Q[:, k] = Q[:, k] - R[i, k] * Q[:, i]

        # Diagonal element r_kk = |q_k|, equation (5.3), then normalize.
        R[k, k] = lin.norm(Q[:, k])
        Q[:, k] = Q[:, k] / R[k, k]

    return -Q, -R
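
A quick usage sketch (assuming both qr_gs and qr_gs_modsr from the listings above are in scope) confirms that the two implementations produce the same factorization of a sample rectangular matrix:

Python
import numpy as np

A = [[3, 7], [4, 5], [1, 2]]

Q1, R1 = qr_gs(A, type=float)          # classical Gram-Schmidt
Q2, R2 = qr_gs_modsr(A, type=float)    # Schwarz-Rutishauser

# Both variants reconstruct A up to floating-point error ...
assert np.allclose(Q2.dot(R2), np.array(A, dtype=float))
# ... and agree with one another.
assert np.allclose(Q1, Q2) and np.allclose(R1, R2)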

Numerical Stability

In many cases, floating-point computation, even at the highest available precision, is numerically unstable due to persistent rounding errors and, in extreme cases, arithmetic overflow [1].

The stability of a matrix decomposition largely depends on the numerical representation of its elements (real or complex), as well as on their number.

According to best practices, to avoid significant precision loss while performing the QR decomposition, it is recommended to represent the elements as high-precision numbers using one of the existing multi-precision libraries (such as mpmath or GMPY) for Python and other programming languages, and then perform the matrix factorization.
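
As a minimal sketch of this practice (assuming the mpmath package is installed), the loop below mirrors the Schwarz-Rutishauser listing above over arbitrary-precision mpf values, restricted to real-valued matrices for brevity; note that exact rational arithmetic cannot survive the square root in the normalization step, so high-precision floats are used instead:

Python
from mpmath import mp, mpf, sqrt

mp.dps = 50    # work with 50 significant decimal digits

def qr_mp(A):
    # Schwarz-Rutishauser orthogonalization over arbitrary-precision values.
    m, n = len(A), len(A[0])
    Q = [[mpf(A[i][j]) for j in range(n)] for i in range(m)]
    R = [[mpf(0)] * n for _ in range(n)]
    for k in range(n):
        for i in range(k):
            R[i][k] = sum(Q[r][i] * Q[r][k] for r in range(m))   # (5.1)
            for r in range(m):
                Q[r][k] -= R[i][k] * Q[r][i]                     # (5.2)
        R[k][k] = sqrt(sum(Q[r][k] ** 2 for r in range(m)))      # (5.3)
        for r in range(m):
            Q[r][k] /= R[k][k]
    return Q, R

Q, R = qr_mp([[3, 7], [4, 5]])
print(R[0][0])   # 5.0, carried to 50 significant digits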

Complexity

Finally, let's take a look at the Schwarz-Rutishauser algorithm's complexity. In most cases, its complexity evaluates to O(mn²), at least twice (2x) lower than that of the classical Gram-Schmidt and Householder reflections methods [4,5]. Also, the QR decomposition of a huge complex matrix using the Gram-Schmidt and Householder methods requires the computation of the conjugate transpose, which, in turn, further degrades the overall performance.

The complexity of all three QR decomposition algorithms is summarized below:

Algorithm                    Complexity
Classical Gram-Schmidt       O(2mn²)
Householder reflections      O(2mn² − 2n³/3)
Schwarz-Rutishauser          O(mn²)

Notably, the Schwarz-Rutishauser algorithm's complexity remains the same for both real and complex matrix decomposition, because the algorithm does not additionally require the computation of the matrix's conjugate transpose. Instead, it is entirely based on the same orthogonalization procedure, applied to any matrix regardless of its data type [5].

The complexity survey of the Schwarz-Rutishauser and other QR-decomposition algorithms is illustrated in the comparison chart, below:

[Chart: complexity comparison of the Schwarz-Rutishauser (navy), Householder reflections (red), and classical Gram-Schmidt (green) algorithms.]

As you can see, the overall complexity of the Schwarz-Rutishauser algorithm (navy) is reduced by almost 61% compared to Householder reflections (red), and by 50% compared to the classical Gram-Schmidt orthogonalization process (green).
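
For readers who want to reproduce such figures on their own hardware, here is a rough benchmark sketch (assuming qr_gs and qr_gs_modsr from the listings above are in scope; the 400x400 size is an arbitrary choice, and absolute timings will vary widely by machine):

Python
import time
import numpy as np

def bench(fn, A, repeat=3):
    # Best-of-N wall-clock timing of one factorization.
    best = float("inf")
    for _ in range(repeat):
        t0 = time.perf_counter()
        fn(A)
        best = min(best, time.perf_counter() - t0)
    return best

A = np.random.rand(400, 400)

for name, fn in [("Gram-Schmidt", lambda M: qr_gs(M, type=float)),
                 ("Schwarz-Rutishauser", lambda M: qr_gs_modsr(M, type=float)),
                 ("np.linalg.qr", np.linalg.qr)]:
    print(f"{name:>20}: {bench(fn, A):.3f} s")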

Points of Interest

Finally, the Schwarz-Rutishauser algorithm being discussed is extremely efficient when used for the factorization of large real and complex matrices of arbitrary shape, with more than 10³ elements in each dimension. It is also considered to have better numerical stability, and it is attractive for running the factorization code in parallel, scaling it across multiple cores of modern symmetric multiprocessing CPUs.

The overall complexity of the Schwarz-Rutishauser algorithm is just O(mn²), half that of the classical Gram-Schmidt process, and very close to the complexity of the matrix product itself.

References

History

  • December 11, 2021 - article "Can QR Decomposition Be Actually Faster? Schwarz-Rutishauser Algorithm" was published.
