In the post on vector orthogonal projection, we discussed the concept of orthogonal projection of a vector onto a nonzero vector. In this post, we will discuss the concept of orthogonal projection of a vector onto a nonzero subspace of an inner product vector space.

**The orthogonal decomposition theorem**

First, we prove the following result:

**The orthogonal decomposition theorem**. Let \(W\) be a nonzero subspace of \(\mathbb{R}^n\). Then any vector \(y\) in \(\mathbb{R}^n\) can be written uniquely in the form $$y = \hat{y} + z,$$ where \(\hat{y}\) sits in \(W\) and \(z\) in \(W^{\perp}\). Moreover, if \(\{w_1, \dots, w_p\}\) is an orthogonal basis of \(W\), then $$\hat{y} = \frac{y \cdot w_1}{w_1 \cdot w_1} w_1 + \dots + \frac{y \cdot w_p}{w_p \cdot w_p} w_p.$$

The proof goes as follows: Let \(W\) be a nonzero subspace of \(\mathbb{R}^n\), so the dimension of \(W\) is at most \(n\). Assume that \(\dim W = p\). By the Gram-Schmidt process (see basis for vector spaces), we can convert any basis of \(W\), which has \(p\) elements, into an orthogonal basis with \(p\) elements.

Now, assume that $$\{w_1, \dots, w_p\}$$ is an orthogonal basis of \(W\). Set $$\hat{y} = \frac{y \cdot w_1}{w_1 \cdot w_1} w_1 + \dots + \frac{y \cdot w_p}{w_p \cdot w_p} w_p.$$ The vector \(\hat{y}\) sits in \(W\) because \(\hat{y}\) is a linear combination of vectors in \(W\). On the other hand, if we set \(z = y - \hat{y}\), then an easy calculation shows that \(z \cdot w_i = 0\) for each \(i\). This shows that \(z\) is orthogonal to \(W\), and so it sits in \(W^{\perp}\). For uniqueness, suppose \(y = \hat{y}_1 + z_1 = \hat{y}_2 + z_2\) with \(\hat{y}_i \in W\) and \(z_i \in W^{\perp}\); then \(\hat{y}_1 - \hat{y}_2 = z_2 - z_1\) lies in both \(W\) and \(W^{\perp}\). Since the intersection of \(W\) and \(W^{\perp}\) is \(\{0\}\) (see orthogonal complement), the two decompositions coincide, and the proof is complete.
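The decomposition above can be checked numerically. Below is a small sketch in \(\mathbb{R}^4\): the subspace \(W\) and the vector \(y\) are my own illustrative choices, not taken from the text. Gram-Schmidt turns a spanning set into an orthogonal basis, and the theorem's formula produces \(\hat{y}\) with \(z = y - \hat{y}\) orthogonal to \(W\).

```python
import numpy as np

# Illustrative subspace W of R^4 spanned by u1 and u2 (example choice).
u1 = np.array([1.0, 1.0, 0.0, 0.0])
u2 = np.array([1.0, 0.0, 1.0, 0.0])

# Gram-Schmidt: v1 = u1, v2 = u2 - proj_{v1}(u2), so v1 . v2 = 0.
v1 = u1
v2 = u2 - (u2 @ v1) / (v1 @ v1) * v1

y = np.array([1.0, 2.0, 3.0, 4.0])

# Projection onto W via the theorem's formula.
y_hat = (y @ v1) / (v1 @ v1) * v1 + (y @ v2) / (v2 @ v2) * v2
z = y - y_hat

# z is orthogonal to every basis vector of W, and y = y_hat + z.
print(np.allclose([z @ v1, z @ v2], 0.0))  # True
print(np.allclose(y, y_hat + z))           # True
```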

#### Definition and formula of orthogonal projection

- Let \(W\) be a nonzero subspace of \(\mathbb{R}^n\) and $$\{w_1, \dots, w_p\}$$ be an orthogonal basis for the subspace \(W\). The projection of any vector \(y\) onto \(W\) is, by definition, the vector \(\hat{y}\) in the orthogonal decomposition theorem (proved above): $$\hat{y} = \sum_{i=1}^{p} \frac{y \cdot w_i}{w_i \cdot w_i} w_i.$$
- The formula for the projection of any vector onto a nonzero subspace with orthonormal basis is even simpler. Let $$\{w_1, \dots, w_p\}$$ be an orthonormal basis for the subspace \(W\) of \(\mathbb{R}^n\). The orthogonal projection of a vector \(y\) onto \(W\) is obtained by the following formula: $$\hat{y} = \sum_{i=1}^{p} (y\cdot w_i) w_i.$$
- In this orthonormal case, if we collect the basis vectors as the columns of a matrix \(O\), $$O = \begin{pmatrix} w_1 & \cdots & w_p \end{pmatrix},$$ then $$\hat{y} = O O^T y.$$
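The sum formula and the matrix formula agree, which is easy to confirm numerically. In this sketch, \(W\) is an example plane in \(\mathbb{R}^3\) spanned by two orthonormal vectors of my own choosing; the matrix form \(O O^T y\) requires the columns of \(O\) to be orthonormal.

```python
import numpy as np

# Orthonormal basis of an example plane W in R^3 (illustrative choice).
w1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
w2 = np.array([0.0, 0.0, 1.0])

O = np.column_stack([w1, w2])   # 3 x 2 matrix with orthonormal columns
y = np.array([3.0, 1.0, 2.0])

y_hat_matrix = O @ O.T @ y                  # matrix form
y_hat_sum = (y @ w1) * w1 + (y @ w2) * w2   # sum form from the bullet above

print(np.allclose(y_hat_matrix, y_hat_sum))  # True
```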

One of the interesting applications of the orthogonal decomposition theorem is the following result:

#### The best approximation theorem

**The best approximation theorem**. Let \(W\) be a nonzero subspace of \(\mathbb{R}^n\) and \(y\) an arbitrary vector. The closest point in \(W\) to \(y\) is the projection of \(y\) onto \(W\), i.e. $$\Vert y - \hat{y} \Vert < \Vert y - v \Vert,$$ for all \(v \in W\) with \(v \neq \hat{y}\).

The proof of the best approximation theorem is as follows:

Let \(v \neq \hat{y}\) be an element of \(W\). Then \(\hat{y} - v\) is a nonzero element of \(W\). By the orthogonal decomposition theorem, \(z = y - \hat{y}\) is orthogonal to \(W\); in particular, \(y - \hat{y}\) is orthogonal to \(\hat{y} - v\). Now, since $$y - v = (y - \hat{y}) + (\hat{y} - v),$$ by the Pythagorean theorem we have $$\Vert y - v \Vert^2 = \Vert y - \hat{y} \Vert^2 + \Vert \hat{y} - v \Vert^2.$$ Because \(v \neq \hat{y}\), the vector \(\hat{y} - v\) is nonzero, and so \(\Vert \hat{y} - v \Vert^2\) is a positive real number.

This means that $$\Vert y - \hat{y} \Vert < \Vert y - v \Vert,$$ for each \(v \in W\) with \(v \neq \hat{y}\). In other words, the projection of \(y\) onto \(W\) is the closest point of \(W\) to \(y\). This completes the proof.
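The strict inequality can also be observed numerically: sampling many points of \(W\), none comes closer to \(y\) than \(\hat{y}\). The subspace and vector below are illustrative choices of mine, not from the text.

```python
import numpy as np

# Example subspace W of R^3 with an orthogonal basis (v1 . v2 = 0).
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 0.0, -1.0])
y = np.array([1.0, 2.0, 3.0])

# Projection of y onto W via the orthogonal-basis formula.
y_hat = (y @ v1) / (v1 @ v1) * v1 + (y @ v2) / (v2 @ v2) * v2

# Sample random points v = a*v1 + b*v2 of W and compare distances to y.
rng = np.random.default_rng(0)
for a, b in rng.normal(size=(1000, 2)):
    v = a * v1 + b * v2
    if not np.allclose(v, y_hat):
        assert np.linalg.norm(y - y_hat) < np.linalg.norm(y - v)
print("y_hat is strictly closest among 1000 sampled points of W")
```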

#### Some exercises for orthogonal projection

**Exercise**. Find the nearest point in a subspace \(W\) of \(C[0,1]\) generated by \(e^x\) and \(e^{2x}\) to the point \(e^{3x}\).

**Solution**. It is easy to see that the vectors \(u_1 = e^x\) and \(u_2 = e^{2x}\) in the real inner product vector space \(C[0,1]\), with inner product \(f \cdot g = \int_0^1 f(x) g(x)\, dx\), are linearly independent. However, \(u_1\) and \(u_2\) are not perpendicular to each other because their inner product is not zero:

$$u_1 \cdot u_2 = \int_{0}^{1} e^{x} e^{2x}\, dx = \frac{e^3 - 1}{3}.$$ By the Gram-Schmidt process (see basis for vector spaces), the vectors \(v_1\) and \(v_2\) are perpendicular to each other if we set $$v_1 = u_1 \quad \text{and} \quad v_2 = u_2 - \frac{u_2 \cdot v_1}{v_1 \cdot v_1} v_1.$$ Observe that $$u_2 \cdot v_1 = \frac{e^3 - 1}{3} \quad \text{and} \quad v_1 \cdot v_1 = \frac{e^2 - 1}{2}.$$ So, \(v_1 = e^x\) and \(v_2 = e^{2x} - \frac{2(e^3-1)}{3(e^2-1)} e^x\) form an orthogonal basis for \(W\). The nearest (closest) point of \(W\) to \(y = e^{3x}\) is the projection of \(y\) onto \(W\): $$\hat{y} = \frac{y \cdot v_1}{v_1 \cdot v_1} v_1 + \frac{y \cdot v_2}{v_2 \cdot v_2} v_2,$$ where $$y \cdot v_1 = \frac{e^4-1}{4}, \qquad v_1 \cdot v_1 = \frac{e^2-1}{2},$$ $$y \cdot v_2 = \frac{(e-1)^3 (e^2 + 3e + 1)}{30}, \qquad v_2 \cdot v_2 = \frac{(e-1)^3 (e^2+4e+1)}{36(e+1)}.$$
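The closed forms above are easy to verify by machine. The sketch below assumes the inner product \(f \cdot g = \int_0^1 f(x) g(x)\, dx\) and uses the antiderivative \(\int_0^1 e^{ax} e^{bx}\, dx = \frac{e^{a+b}-1}{a+b}\); in particular it confirms that \(y \cdot v_2 = \frac{(e-1)^3(e^2+3e+1)}{30}\).

```python
from math import exp

def ip(a, b):
    """Inner product <e^{ax}, e^{bx}> = integral of e^{(a+b)x} over [0, 1]."""
    return (exp(a + b) - 1) / (a + b)

e = exp(1)

assert abs(ip(1, 2) - (e**3 - 1) / 3) < 1e-10   # u1 . u2
assert abs(ip(1, 1) - (e**2 - 1) / 2) < 1e-10   # v1 . v1
assert abs(ip(3, 1) - (e**4 - 1) / 4) < 1e-10   # y . v1

# Gram-Schmidt coefficient in v2 = e^{2x} - c * e^x.
c = 2 * (e**3 - 1) / (3 * (e**2 - 1))

# y . v2 = <e^{3x}, e^{2x} - c e^x>.
y_v2 = ip(3, 2) - c * ip(3, 1)
assert abs(y_v2 - (e - 1)**3 * (e**2 + 3 * e + 1) / 30) < 1e-10

# v2 . v2 = <e^{2x} - c e^x, e^{2x} - c e^x>.
v2_v2 = ip(2, 2) - 2 * c * ip(2, 1) + c**2 * ip(1, 1)
assert abs(v2_v2 - (e - 1)**3 * (e**2 + 4 * e + 1) / (36 * (e + 1))) < 1e-10

print("all closed-form inner products verified")
```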

**Exercise**. Find the orthogonal projection of the point \((1,1,1)\) onto the following plane $$\{(x,y,z): x+y+z = 0\}.$$
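After working the plane exercise by hand, the result can be sanity-checked numerically. Since the plane has normal vector \(n = (1,1,1)\), one convenient route (an alternative to building an orthogonal basis of the plane) is to subtract the component of the point along \(n\):

```python
import numpy as np

# Project onto W = {(x, y, z) : x + y + z = 0} by removing the
# component of the point along the plane's normal n = (1, 1, 1).
n = np.array([1.0, 1.0, 1.0])
y = np.array([1.0, 1.0, 1.0])

y_hat = y - (y @ n) / (n @ n) * n

print(y_hat)                     # [0. 0. 0.]
print(np.isclose(y_hat @ n, 0))  # True: y_hat lies in the plane
```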

**Exercise**. Find the orthogonal projection of \(x^3\) onto a subspace of \(C[1,2]\) generated by \(x\) and \(x^2\).