In a multiplicative mathematical system, it is sometimes useful to write an object as a product of smaller or simpler objects of the same kind. This process is called factorization (more generally, decomposition). Mathematicians, engineers, and data scientists use different kinds of factorization (decomposition) to simplify complex calculations and, for example, to find solutions of equations. A QR factorization (decomposition) of a matrix \(A\) expresses \(A\) as the product of an orthonormal matrix \(Q\) and an upper-triangular matrix \(R\). Later, we will see that QR factorization is often useful for solving the “linear least squares problems” that arise in data science and analysis.

Recall that an \(m \times n\) real matrix is orthogonal if the set of its columns forms an orthonormal subset of the real vector space \(\mathbb{R}^m\) (see orthogonal matrices).

#### QR matrix factorization

**The QR matrix factorization**. Let \(A\) be an \(m \times n \) matrix having linearly independent columns. Then, \(A\) has a QR factorization. More precisely, there is an \(m \times n\) matrix \(Q\) whose columns form an orthonormal basis for \(\hbox{Col}(A)\) and there is an upper-triangular \(n \times n\) invertible matrix \(R\) with positive entries on its main diagonal such that \(A = QR\).
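The theorem can be checked numerically. The sketch below, using a hypothetical \(3 \times 2\) matrix with linearly independent columns, computes a reduced QR factorization with NumPy's `np.linalg.qr`. Note that NumPy does not guarantee positive entries on the diagonal of \(R\); flipping the signs of the matching column of \(Q\) and row of \(R\) enforces the convention of the theorem without changing the product \(QR\).

```python
import numpy as np

# A tall matrix with linearly independent columns (hypothetical example data).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 7.0]])

# Reduced QR: Q is m x n with orthonormal columns, R is n x n upper triangular.
Q, R = np.linalg.qr(A, mode='reduced')

# Enforce positive diagonal entries in R: multiplying Q's columns and R's
# rows by the same signs leaves the product QR unchanged.
signs = np.sign(np.diag(R))
Q, R = Q * signs, R * signs[:, None]

print(np.allclose(Q @ R, A))            # QR reproduces A
print(np.allclose(Q.T @ Q, np.eye(2)))  # columns of Q are orthonormal
print(np.all(np.diag(R) > 0))           # positive main diagonal
```

The sign fix works because the diagonal entries of \(R\) are nonzero whenever the columns of \(A\) are linearly independent.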

**Exercise**. Let \(A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \). By direct computation, show that \(A = QR\), where $$Q = \begin{pmatrix} 1/\sqrt{10} & 3/\sqrt{10} \\ 3/\sqrt{10} & -1/\sqrt{10}\end{pmatrix} $$ is an orthogonal matrix and $$R = \begin{pmatrix} \sqrt{10} & 7 \sqrt{2/5} \\ 0 & \sqrt{2/5}\end{pmatrix}$$ is an invertible upper-triangular matrix with positive entries on its main diagonal.
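The direct computation asked for in the exercise can also be verified numerically; the following sketch builds the stated \(Q\) and \(R\) and checks the claimed properties.

```python
import numpy as np

s10 = np.sqrt(10.0)      # sqrt(10)
s25 = np.sqrt(2.0 / 5.0) # sqrt(2/5)

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Q = np.array([[1 / s10,  3 / s10],
              [3 / s10, -1 / s10]])
R = np.array([[s10, 7 * s25],
              [0.0,     s25]])

print(np.allclose(Q @ R, A))            # A = QR
print(np.allclose(Q.T @ Q, np.eye(2)))  # Q is orthogonal
print(np.all(np.diag(R) > 0))           # R has positive diagonal
```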

#### The proof of the QR matrix factorization result

Since the columns of \(A\) are linearly independent, they form a basis for \(\hbox{Col}(A)\). By the Gram-Schmidt process, we can convert the columns of \(A\) into an orthonormal basis $$\{v_1, \dots, v_n\}$$ for \(W = \hbox{Col}(A)\).

Set $$Q = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix},$$ and let \(C_1, \dots, C_n\) denote the columns of \(A\). A key property of the Gram-Schmidt process is that, for each \(k\), the sets \(\{C_i\}_{i=1}^k\) and \(\{v_i\}_{i=1}^k\) span the same subspace, and so we can find real numbers $$r_{1k}, \dots, r_{kk}$$ such that $$C_k = r_{1k}v_1 + \dots + r_{kk}v_k + 0v_{k+1} + \dots + 0v_n.$$

From the beginning, it was possible to assume that \(r_{kk} \geq 0\), because if \(r_{kk} < 0\), then we could multiply both \(r_{kk}\) and \(v_k\) by \(-1\).

This means that for each \(k\), there is a column vector of the form $$r_k = \begin{pmatrix} r_{1k} \\ \vdots \\ r_{kk} \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$ such that each column of \(A\) satisfies the equality $$C_k = Q r_k.$$ Now, set $$R = \begin{pmatrix} r_1 & \cdots & r_n \end{pmatrix},$$ and observe that $$QR = \begin{pmatrix} Qr_1 & \cdots & Qr_n\end{pmatrix} = \begin{pmatrix} C_1 & \cdots & C_n \end{pmatrix} = A.$$
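The construction in the proof translates directly into code. The sketch below (a minimal implementation of classical Gram-Schmidt, not an industrial-strength routine) builds the columns of \(Q\) by orthonormalizing the columns of \(A\) and records the coefficients \(r_{1k}, \dots, r_{kk}\) in column \(k\) of \(R\).

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR factorization of a matrix with linearly independent columns,
    mirroring the proof: the columns of Q are the Gram-Schmidt
    orthonormalization of the columns of A, and column k of R holds
    the coefficients r_1k, ..., r_kk, with r_kk > 0."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].astype(float)
        for i in range(k):
            R[i, k] = Q[:, i] @ A[:, k]   # coefficient r_ik
            v = v - R[i, k] * Q[:, i]     # subtract the projection onto v_i
        R[k, k] = np.linalg.norm(v)       # r_kk > 0 by construction
        Q[:, k] = v / R[k, k]
    return Q, R

A = np.array([[1.0, 2.0], [3.0, 4.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A))
```

Running this on the exercise matrix reproduces the \(Q\) and \(R\) computed there, up to floating-point rounding.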

Finally, in order to prove the invertibility of \(R\), we assume that \(x\) is a vector such that \(Rx = 0\). Multiplying both sides of the equation on the left by \(Q\), we obtain \(QRx = Ax = 0\). Since the columns of \(A\) are linearly independent, \(x\) must be the zero vector.

Hence, by the result on the characterizations of invertible matrices discussed in the post on the inverse of a matrix, \(R\) is invertible. In particular, since \(R\) is upper triangular, the entries on its main diagonal are nonzero; combined with our choice \(r_{kk} \geq 0\), they are positive, and the proof is complete.