Orthogonal Transformations and Orthogonal Matrices

A linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ is called an orthogonal transformation if it preserves the length of vectors: $\left|\left|T(x)\right|\right| = \left|\left|x\right|\right|$ for all $x\in \mathbb{R}^n.$ If $T(x)=Ax$ is an orthogonal transformation, we say $A$ is an orthogonal matrix.
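
As a concrete illustration, here is a minimal numerical sketch (in Python with NumPy, which these notes do not otherwise assume): a rotation of the plane preserves the length of every vector.

```python
import numpy as np

# Rotation by 30 degrees, the prototypical orthogonal transformation.
theta = np.pi / 6
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, 4.0])
print(np.linalg.norm(x))      # 5.0
print(np.linalg.norm(A @ x))  # 5.0: the rotation preserves length
```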

Lemma. (Orthogonal Transformation) Let $T$ be an orthogonal transformation from $\mathbb{R}^n$ to $\mathbb{R}^n.$ If $v, w \in \mathbb{R}^n$ are orthogonal, then $T(v), T(w) \in \mathbb{R}^n$ are orthogonal.

Proof. We want to show that $T(v)$ and $T(w)$ are orthogonal. Expanding $\left|\left| T(v)+T(w)\right|\right|^2 = \left|\left| T(v)\right|\right|^2 + 2\, T(v)\cdot T(w) + \left|\left|T(w)\right|\right|^2$, it suffices to show $$ \left|\left| T(v)+T(w)\right|\right|^2= \left|\left| T(v)\right|\right| ^2 + \left|\left|T(w)\right|\right|^2, $$ for then the cross term $T(v)\cdot T(w)$ must vanish. This equality holds because \begin{align*} \left|\left| T(v)+T(w)\right|\right|^2 & = \left|\left|T(v+w)\right|\right|^2 =\left|\left|v+w\right|\right|^2 \\ & =\left|\left|v\right|\right|^2+\left|\left| w\right|\right|^2 =\left|\left| T(v)\right|\right|^2 + \left|\left| T(w)\right|\right|^2, \end{align*} where the equalities use, in order, the linearity of $T$, the fact that $T$ preserves length, the Pythagorean theorem for the orthogonal vectors $v$ and $w$, and length preservation once more.

Theorem. A linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ is orthogonal if and only if the vectors $T(e_1), \ldots, T(e_n)$ form an orthonormal basis.

Proof. If $T$ is an orthogonal transformation, then the vectors $T(e_i)$ are unit vectors, since $\left|\left|T(e_i)\right|\right|=\left|\left|e_i\right|\right|=1$, and they are pairwise orthogonal by the Orthogonal Transformation lemma. Therefore, $T(e_1),\ldots, T(e_n)$ form an orthonormal basis. Conversely, suppose $T(e_1), \ldots, T(e_n)$ form an orthonormal basis. Consider a vector $x=x_1 e_1+\cdots +x_n e_n.$ Then \begin{align*} \left|\left|T(x)\right|\right|^2 &=\left|\left|T(x_1 e_1+\cdots + x_n e_n)\right|\right|^2 =\left|\left|x_1 T(e_1)+\cdots + x_n T(e_n)\right|\right|^2 \\ &=\left|\left|x_1T(e_1)\right|\right|^2+\cdots + \left|\left|x_nT(e_n)\right|\right|^2 = x_1^2+\cdots + x_n^2 =\left|\left| x\right|\right|^2, \end{align*} where the third equality is the Pythagorean theorem applied to the pairwise orthogonal vectors $x_i T(e_i)$, and the fourth uses that each $T(e_i)$ is a unit vector. Taking the square root of both sides shows that $T$ preserves lengths and therefore, $T$ is an orthogonal transformation.

Corollary. An $n \times n$ matrix $A$ is orthogonal if and only if its columns form an orthonormal basis.

Proof. The proof is left for the reader.
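
Although the proof is an exercise, the statement is easy to probe numerically. A minimal sketch (Python with NumPy, an assumption of this illustration) on a matrix whose columns are orthonormal:

```python
import numpy as np

# Columns of A are the images A @ e_i of the standard basis vectors.
A = np.array([[2/3, -2/3, 1/3],
              [2/3,  1/3, -2/3],
              [1/3,  2/3,  2/3]])

# Entry (i, j) of A^T A is the dot product of columns i and j, so
# orthonormal columns show up as the identity matrix.
print(np.round(A.T @ A, 10))                # identity: columns orthonormal
print(np.linalg.norm(A @ [1.0, 2.0, 2.0]))  # 3.0 = ||(1, 2, 2)||: length kept
```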

The transpose $A^T$ of an $n\times n$ matrix $A$ is the $n\times n$ matrix whose $ij$-th entry is the $ji$-th entry of $A.$ We say that a square matrix $A$ is symmetric if $A^T=A$, and $A$ is called skew-symmetric if $A^T=-A.$
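
A short sketch of these definitions (again Python with NumPy, an assumption): the transpose swaps row and column indices, and the combinations $A+A^T$ and $A-A^T$ are always symmetric and skew-symmetric, respectively, since $(A+A^T)^T=A^T+A$ and $(A-A^T)^T=A^T-A.$

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
print(A.T)   # the ij-th entry of A^T is the ji-th entry of A

S = A + A.T  # symmetric: S^T = S
K = A - A.T  # skew-symmetric: K^T = -K
print(np.allclose(S, S.T), np.allclose(K, -K.T))  # True True
```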

Theorem. (Orthogonal and Transpose Properties)

(1) The product of two orthogonal $n\times n$ matrices is orthogonal.

(2) The inverse of an orthogonal matrix is orthogonal.

(3) If the products $(A B)^T$ and $B^T A^T$ are defined then they are equal.

(4) If $A$ is invertible then so is $A^T$, and $(A^T)^{-1}=(A^{-1})^T.$

(5) For any matrix $A$, $\text{rank}\,(A) = \text{rank} \,(A^T).$

(6) If $v$ and $w$ are two column vectors in $\mathbb{R}^n$, then $v \cdot w = v^T w.$

(7) The $n \times n$ matrix $A$ is orthogonal if and only if $A^{-1}=A^T.$

Proof. The proof of each part follows.

  • Suppose $A$ and $B$ are orthogonal matrices, then $AB$ is an orthogonal matrix since $T(x)=AB x$ preserves length because $$ \left|\left|T(x)\right|\right| = \left|\left|AB x\right|\right| = \left|\left|A(B x)\right|\right| = \left|\left|B x\right|\right| = \left|\left|x\right|\right|. $$
  • Suppose $A$ is an orthogonal matrix, then $A^{-1}$ is an orthogonal matrix since $T(x)=A^{-1} x$ preserves length: because $A$ preserves lengths, $\left|\left| A^{-1}x \right|\right| = \left|\left| A(A^{-1}x) \right|\right| = \left|\left| x \right|\right|.$
  • We compare the entries of the matrices $(AB)^T$ and $B^T A^T$ as follows: $$ \begin{array}{rl} i j \text{-th entry of }(AB)^T &= ji \text{-th entry of }AB\\ & = (j \text{-th row of } A) \cdot (i \text{-th column of } B)\\ i j \text{-th entry of }B^TA^T &=(i \text{-th row of } B^T) \cdot (j \text{-th column of } A^T)\\ & = (i \text{-th column of } B) \cdot (j \text{-th row of } A)\\ & = (j \text{-th row of } A) \cdot (i \text{-th column of } B). \end{array} $$ Therefore, the $ij$-th entry of $(AB)^T$ equals the $ij$-th entry of $B^T A^T.$
  • Suppose $A$ is invertible, then $A A^{-1}=I_n.$ Taking the transpose of both sides and applying (3) yields $(A A^{-1})^T=(A^{-1})^T A^T=I_n.$ Thus $A^T$ is invertible, and since inverses are unique, it follows that $(A^T)^{-1}=(A^{-1})^T.$
  • Exercise.
  • If $v=\begin{bmatrix}a_1\\ \vdots \\ a_n\end{bmatrix}$ and $w=\begin{bmatrix}b_1 \\ \vdots \\ b_n\end{bmatrix}$, then $$ v \cdot w=\begin{bmatrix}a_1 \\ \vdots \\ a_n \end{bmatrix} \cdot \begin{bmatrix} b_1\\ \vdots\\ b_n\end{bmatrix} = a_1b_1+\cdots +a_n b_n =\begin{bmatrix} a_1 & \cdots & a_n\end{bmatrix} \begin{bmatrix} b_1 \\ \vdots\\ b_n\end{bmatrix} =\begin{bmatrix}a_1 \\ \vdots \\ a_n\end{bmatrix}^T w=v^T w. $$
  • Let’s write $A$ in terms of its columns: $A=\begin{bmatrix}v_1 & \cdots & v_n \end{bmatrix}.$ Then \begin{equation*} A^T A= \begin{bmatrix} v_1^T \\ \vdots \\ v_n^T \end{bmatrix} \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} =\begin{bmatrix}v_1 \cdot v_1 & \cdots & v_1 \cdot v_n \\ \vdots & \ddots & \vdots \\ v_n \cdot v_1 & \cdots & v_n \cdot v_n\end{bmatrix}. \end{equation*} By the Corollary, $A$ is orthogonal if and only if its columns are orthonormal, which by the computation above holds if and only if $A^TA=I_n$. Therefore, $A$ is orthogonal if and only if $A^{-1}=A^T.$ (A numerical check of several of these properties follows this list.)
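
Parts (1), (2), and (7) are easy to confirm numerically. A minimal sketch (Python with NumPy, an assumption; the notes themselves use no code) on a rotation and a reflection:

```python
import numpy as np

theta = np.pi / 5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation, orthogonal
F = np.array([[0.0, 1.0],
              [1.0, 0.0]])                        # reflection, orthogonal

I = np.eye(2)
print(np.allclose((R @ F).T @ (R @ F), I))  # (1): the product R F is orthogonal
Rinv = np.linalg.inv(R)
print(np.allclose(Rinv.T @ Rinv, I))        # (2): R^{-1} is orthogonal
print(np.allclose(Rinv, R.T))               # (7): R^{-1} = R^T
```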

Theorem. (Orthogonal Projection Matrix)

Let $V$ be a subspace of $\mathbb{R}^n$ with orthonormal basis $u_1, \ldots, u_m.$ The matrix of the orthogonal projection onto $V$ is $Q Q^T$ where $Q= \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix}$.

Let $V$ be a subspace of $\mathbb{R}^n$ with basis $v_1,\ldots,v_m$ and let $A=\begin{bmatrix}v_1 & \cdots & v_m \end{bmatrix}$; then the matrix of the orthogonal projection onto $V$ is $A(A^T A)^{-1}A^T.$

Proof. The proof of each part follows.

  • Since $u_1$, \ldots, $u_m$ is an orthonormal basis of $V$ we can, by Orthogonal Projection, write, \begin{align*} \text{proj}_V (x) & =(u_1 \cdot x) u_1 + \cdots + (u_m \cdot x) u_m =u_1 u_1^T x + \cdots +u_m u_m^T x & \\ &=(u_1 u_1^T + \cdots +u_m u_m^T) x = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix} \begin{bmatrix} u_1^T \\ \vdots \\ u_m^T \end{bmatrix} x =QQ^Tx. \end{align*}
  • Since $v_1,\ldots,v_m$ form a basis of $V$, there exist unique scalars $c_1,\ldots,c_m$ such that $\text{proj}_V(x)=c_1 v_1+\cdots +c_m v_m.$ Since $A=\begin{bmatrix}v_1 & \cdots & v_m \end{bmatrix}$, we can write $\text{proj}_V(x)=A c.$ Consider the system $A^TAc =A^T x$ with coefficient matrix $A^TA$ and unknown $c.$ Since $c$ is the coordinate vector of $\text{proj}_V(x)$ with respect to the basis $(v_1,\ldots,v_m)$, the system has a unique solution. Thus $A^TA$ must be invertible, and so we can solve for $c$, namely $c=(A^T A)^{-1}A^Tx.$ Therefore, $\text{proj}_V(x)=A c =A (A^T A)^{-1}A^Tx$ as desired. Notice it suffices to consider the system $A^TAc =A^T x$, or equivalently $A^T(x-A c)=0$, because $$ A^T(x -A c)=A^T(x-c_1 v_1-\cdots - c_m v_m) $$ is the vector whose $i$-th component is $$ (v_i)^T(x-c_1 v_1-\cdots -c_m v_m)=v_i\cdot(x-c_1v_1-\cdots -c_m v_m), $$ which we know to be zero since $x-\text{proj}_V(x)$ is orthogonal to $V.$ (A numerical comparison of the two formulas follows this list.)
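
As a sanity check, here is a short numerical sketch (Python with NumPy, an assumption of this illustration) comparing the two formulas on an arbitrary basis of a plane in $\mathbb{R}^3$:

```python
import numpy as np

# V = span of the columns of A, a (non-orthonormal) basis of a plane in R^3.
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T    # A (A^T A)^{-1} A^T

# QR factorization yields an orthonormal basis Q of the same column space.
Q, _ = np.linalg.qr(A)
print(np.allclose(P, Q @ Q.T))   # True: both formulas give the same projection
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```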

Example. Is there an orthogonal transformation $T$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ such that $$ T\begin{bmatrix} 2\\ 3\\ 0\end{bmatrix} =\begin{bmatrix} 3\\ 0\\ 2\end{bmatrix} \qquad \text{and} \qquad T\begin{bmatrix}-3\\ 2\\ 0\end{bmatrix} = \begin{bmatrix} 2\\ -3\\ 0\end{bmatrix}? $$ No, since the vectors $\begin{bmatrix}2\\ 3\\ 0\end{bmatrix}$ and $\begin{bmatrix}-3\\ 2\\ 0\end{bmatrix}$ are orthogonal, whereas $\begin{bmatrix}3\\ 0\\ 2\end{bmatrix}$ and $\begin{bmatrix}2\\ -3\\ 0\end{bmatrix}$ are not, by Orthogonal Transformation.

Example. Find an orthogonal transformation $T$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ such that $$ T\begin{bmatrix}2/3\\ 2/3\\ 1/3\end{bmatrix} = \begin{bmatrix}0 \\ 0\\ 1\end{bmatrix}. $$ Let’s think about the inverse of $T$ first. The inverse of $T$, if it exists, must satisfy $T^{-1}(e_3) = \begin{bmatrix}2/3\\ 2/3\\ 1/3\end{bmatrix} = v_3.$ Furthermore, the vectors $v_1, v_2, v_3$ must form an orthonormal basis of $\mathbb{R}^3$, where $T^{-1}(x)=\begin{bmatrix}v_1 & v_2 & v_3\end{bmatrix} x.$ We require a vector $v_1$ with $v_1\cdot v_3=0$ and $\left|\left|v_1 \right|\right| =1.$ By inspection, we find $v_1=\begin{bmatrix} -2/3\\ 1/3\\ 2/3\end{bmatrix}.$ Then $$
v_2=v_1\times v_3 =\begin{bmatrix} -2/3\\ 1/3\\ 2/3\end{bmatrix} \times \begin{bmatrix} 2/3\\ 2/3\\ 1/3\end{bmatrix} =\begin{bmatrix} 1/9-4/9\\ -(-2/9-4/9)\\ -4/9-2/9 \end{bmatrix} = \begin{bmatrix} -1/3\\ 2/3\\ -2/3\end{bmatrix} $$ does the job since $$ \left|\left| v_1 \right| \right| = \left|\left| v_2 \right| \right| = \left|\left| v_3 \right| \right| =1 $$ and $$ v_1\cdot v_2=v_1\cdot v_3=v_2\cdot v_3=0. $$ In summary, $$ T^{-1}(x)=\begin{bmatrix}-2/3 & -1/3 & 2/3 \\ 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3\end{bmatrix}x. $$ By Orthogonal and Transpose Properties, the matrix of $T^{-1}$ is orthogonal, and the matrix of $T=(T^{-1})^{-1}$ is the transpose of the matrix of $T^{-1}.$ Therefore, it suffices to use $$ T(x)=\begin{bmatrix}-2/3 & -1/3 & 2/3 \\ 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3\end{bmatrix}^Tx=\begin{bmatrix}-2/3 & 1/3 & 2/3 \\ -1/3 & 2/3 & -2/3 \\ 2/3 & 2/3 & 1/3 \end{bmatrix} x. $$
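
To double-check the answer, one can verify numerically that the matrix found is orthogonal and sends $v_3$ to $e_3$ (a quick sketch in Python with NumPy, which the notes do not otherwise use):

```python
import numpy as np

T = np.array([[-2/3,  1/3,  2/3],
              [-1/3,  2/3, -2/3],
              [ 2/3,  2/3,  1/3]])
v3 = np.array([2/3, 2/3, 1/3])

print(np.allclose(T @ v3, [0.0, 0.0, 1.0]))  # True: T(v3) = e3
print(np.allclose(T.T @ T, np.eye(3)))       # True: T is orthogonal
```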

Example. Show that a matrix with orthogonal columns need not be an orthogonal matrix. For example, the columns of $A=\begin{bmatrix}4 & -3 \\ 3 & 4 \end{bmatrix}$ are orthogonal, but $A$ is not an orthogonal matrix because $T(x)=Ax$ does not preserve length: for $x=\begin{bmatrix}-3\\ 4\end{bmatrix}$ we have $\left|\left|x\right|\right|=5$, whereas $\left|\left|Ax\right|\right|=\left|\left|\begin{bmatrix}-24\\ 7\end{bmatrix}\right|\right|=25.$
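
The same comparison, carried out numerically (a minimal sketch assuming NumPy):

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [3.0,  4.0]])
x = np.array([-3.0, 4.0])

print(A[:, 0] @ A[:, 1])       # 0.0: the columns are orthogonal...
print(np.linalg.norm(x))       # 5.0
print(np.linalg.norm(A @ x))   # 25.0: ...yet length is not preserved
```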

Example. Find all orthogonal $2\times 2$ matrices. Write $A=\begin{bmatrix}v_1 & v_2\end{bmatrix}.$ The unit vector $v_1$ can be expressed as $v_1=\begin{bmatrix}\cos \theta\\ \sin \theta\end{bmatrix}$, for some $\theta.$ Then $v_2$ will be one of the two unit vectors orthogonal to $v_1$, namely $v_2=\begin{bmatrix}-\sin \theta \\ \cos \theta\end{bmatrix}$ or $v_2=\begin{bmatrix} \sin \theta\\ -\cos \theta\end{bmatrix}.$ Therefore, an orthogonal $2\times 2$ matrix is either of the form $$ A=\begin{bmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}\hspace{1cm} \text{or} \hspace{1cm} A=\begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{bmatrix} $$ representing a rotation or a reflection, respectively.
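
The two families can be told apart by the determinant: $+1$ for rotations and $-1$ for reflections (a standard fact not proved in these notes). A minimal numerical sketch, again assuming NumPy:

```python
import numpy as np

theta = 1.2  # any angle works
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
ref = np.array([[np.cos(theta),  np.sin(theta)],
                [np.sin(theta), -np.cos(theta)]])

for A in (rot, ref):
    # both are orthogonal; the determinant separates rotation from reflection
    print(np.allclose(A.T @ A, np.eye(2)), round(np.linalg.det(A), 10))
```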

Example. Given $n\times n$ matrices $A$ and $B$ which of the following must be symmetric?

  • $B B^T$
  • $A^T B^TB A$
  • $B(A+A^T)B^T$

The solution to each part follows.

  • By Orthogonal and Transpose Properties, $B B^T$ is symmetric because $$(B B^T)^T=(B^T)^TB^T=B B^T. $$
  • By Orthogonal and Transpose Properties, $A^T B^TB A$ is symmetric because $$ (A^TB^TBA)^T=A^TB^T(B^T)^T(A^T)^T=A^TB^TBA. $$
  • By Orthogonal and Transpose Properties, $B(A+A^T)B^T$ is symmetric because $$ (B(A+A^T)B^T)^T=((A+A^T)B^T)^TB^T=B(A+A^T)^TB^T $$ $$ =B(A^T+A)^TB^T=B((A^T)^T+A^T)B^T=B(A+A^T)B^T. $$
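
Since symmetry is just the identity $M=M^T$, all three parts can be confirmed on random matrices (a short sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Each product below should equal its own transpose.
for M in (B @ B.T, A.T @ B.T @ B @ A, B @ (A + A.T) @ B.T):
    print(np.allclose(M, M.T))   # True, True, True
```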

Example. If the $n\times n$ matrices $A$ and $B$ are symmetric which of the following must be symmetric as well?

  • $2I_n+3A-4 A^2$,
  • $A B^2 A.$

The solution to each part follows.

  • First note that $(A^2)^T=(A^T)^2=A^2$ for a symmetric matrix $A.$ Now we can use the linearity of the transpose, $$ (2I_n+3A-4 A^2)^T=2I_n^T+3A^T-4 (A^2)^T=2I_n+3A-4 A^2 $$ showing that the matrix $2I_n+3A-4 A^2$ is symmetric.
  • The matrix $A B^2 A$ is symmetric since, $$ (AB^2A)^T=(ABBA)^T=(BA)^T(AB)^T=A^TB^TB^TA^T=AB^2A. $$

Example. Use Orthogonal Projection Matrix to find the matrix $A$ of the orthogonal projection onto $$ W=\mathop{span} \left(\begin{bmatrix} 1\\ 1\\ 1\\ 1\end{bmatrix}, \begin{bmatrix} 1\\ 9\\ -5\\ 3\end{bmatrix}\right). $$ Then find the matrix of the orthogonal projection onto the subspace of $\mathbb{R}^4$ spanned by the vectors $\begin{bmatrix}1\\ 1\\ 1\\ 1\end{bmatrix}$ and $\begin{bmatrix}1\\ 2\\ 3\\ 4\end{bmatrix}.$

First we apply the Gram-Schmidt Process to $W=\mathop{span}(v_1, v_2)$ to find that the vectors $$ u_1=\frac{v_1}{\left|\left| v_1 \right|\right| } =\begin{bmatrix}1/2 \\ 1/2 \\ 1/2 \\ 1/2\end{bmatrix}, \qquad u_2 =\frac{v_2^\perp}{\left|\left| v_2^\perp \right|\right| } =\frac{v_2-\left(u_1 \cdot v_2\right) u_1}{\left|\left| v_2-\left(u_1 \cdot v_2\right)u_1 \right|\right| } =\begin{bmatrix}-1/10 \\ 7/10 \\ -7/10 \\ 1/10\end{bmatrix} $$ form an orthonormal basis of $W.$ By Orthogonal Projection Matrix, the matrix of the projection onto $W$ is $A=Q Q^T$ where $Q=\begin{bmatrix}u_1 & u_2\end{bmatrix}.$ Therefore the matrix of the orthogonal projection onto $W$ is $$ A= \begin{bmatrix} 1/2 & -1/10 \\ 1/2 & 7/10 \\ 1/2 & -7/10 \\ 1/2 & 1/10 \end{bmatrix} \begin{bmatrix} 1/2 & 1/2 & 1/2 & 1/2 \\ -1/10 & 7/10 & -7/10 & 1/10 \end{bmatrix} =\frac{1}{100} \begin{bmatrix} 26 & 18 & 32 & 24 \\ 18 & 74 & -24 & 32 \\ 32 & -24 & 74 &18 \\ 24 & 32 & 18 & 26 \end{bmatrix}. $$ For the second subspace, let $B=\begin{bmatrix}1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}$; then by the second part of Orthogonal Projection Matrix, the matrix of the orthogonal projection is $$ B(B^TB)^{-1}B^T =\frac{1}{10}\begin{bmatrix}7 & 4 & 1 & -2 \\ 4 & 3 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ -2 & 1 & 4 & 7 \end{bmatrix}. $$
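
Both answers are easy to reproduce numerically with the formula $A(A^TA)^{-1}A^T$, since it does not require an orthonormal basis (a sketch assuming NumPy; the matrices below hold the bases from the example as columns):

```python
import numpy as np

# W = span((1,1,1,1), (1,9,-5,3))
A = np.array([[1.0, 1.0],
              [1.0, 9.0],
              [1.0, -5.0],
              [1.0, 3.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.round(100 * P))   # the matrix (1/100)[[26,18,32,24], ...] above

# span((1,1,1,1), (1,2,3,4))
B = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
print(np.round(10 * B @ np.linalg.inv(B.T @ B) @ B.T))  # (1/10)[[7,4,1,-2], ...]
```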