Diagonalization by a Similarity Transformation
Definition.
A matrix \(\ \boldsymbol{A}\in M_n(C)\ \) is diagonalizable by a similarity transformation if there exists an invertible matrix \(\ \boldsymbol{P}\in M_n(C)\ \) such that
\[\boldsymbol{P}^{-1}\boldsymbol{A}\,\boldsymbol{P}\ =\ \boldsymbol{D}\,,\tag{1}\]
where \(\ \boldsymbol{D}\in M_n(C)\ \) is a diagonal matrix.
We say that \(\ \boldsymbol{P}\ \) is a diagonalizing matrix or that it diagonalizes the matrix \(\ \boldsymbol{A}.\)
Lemma.
Consider matrices \(\ \boldsymbol{A},\,\boldsymbol{P}\in M_n(C).\) Columns \(\ \boldsymbol{X}_1,\ \boldsymbol{X}_2,\ldots, \boldsymbol{X}_n\ \) of the matrix \(\ \boldsymbol{P}\ \) are eigenvectors of the matrix \(\ \boldsymbol{A}:\)
\[\boldsymbol{A}\,\boldsymbol{X}_i\ =\ \lambda_i\,\boldsymbol{X}_i\,,\qquad i\,=\,1,2,\ldots,n\,,\tag{2}\]
if and only if
\[\boldsymbol{A}\,\boldsymbol{P}\ =\ \boldsymbol{P}\,\boldsymbol{D}\,,\tag{3}\]
where \(\ \boldsymbol{D}\,=\, \text{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)\,.\)
Indeed, according to the column rule of matrix multiplication:
\[\boldsymbol{A}\,\boldsymbol{P}\ =\ [\ \boldsymbol{A}\boldsymbol{X}_1\,|\ \boldsymbol{A}\boldsymbol{X}_2\,|\ \ldots\,|\ \boldsymbol{A}\boldsymbol{X}_n\ ]\,,\qquad \boldsymbol{P}\,\boldsymbol{D}\ =\ [\ \lambda_1\boldsymbol{X}_1\,|\ \lambda_2\boldsymbol{X}_2\,|\ \ldots\,|\ \lambda_n\boldsymbol{X}_n\ ]\,,\]
and thus the conditions (2) and (3) are equivalent.
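The Lemma can be verified numerically; a minimal NumPy sketch (the matrix below is an arbitrary illustrative choice):

```python
import numpy as np

# An illustrative matrix; numpy.linalg.eig returns the eigenvalues lambda_i
# and a matrix P whose columns X_i are the associated eigenvectors.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
lam, P = np.linalg.eig(A)
D = np.diag(lam)

# Condition (2): each column satisfies A X_i = lambda_i X_i ...
for i in range(2):
    assert np.allclose(A @ P[:, i], lam[i] * P[:, i])

# ... which by the column rule is the single matrix equation (3): A P = P D.
assert np.allclose(A @ P, P @ D)
```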
Theorem 7.
A matrix \(\ \boldsymbol{A}\ \) is diagonalizable by a similarity transformation (1) if and only if the space \(\,C^n\,\) has a basis \(\,\mathcal{B} = (\boldsymbol{X}_1,\, \boldsymbol{X}_2,\,\ldots,\,\boldsymbol{X}_n)\ \) consisting of eigenvectors of the matrix \(\ \boldsymbol{A}:\)
\[\boldsymbol{A}\,\boldsymbol{X}_i\ =\ \lambda_i\,\boldsymbol{X}_i\,,\qquad i\,=\,1,2,\ldots,n\,.\]
Then the matrix \(\ \boldsymbol{P}\,=\, [\ \boldsymbol{X}_1\,|\,\boldsymbol{X}_2\,|\,\ldots\,|\,\boldsymbol{X}_n\ ],\ \) whose columns are vectors from the basis \(\,\mathcal{B}\,,\ \) diagonalizes the matrix \(\,\boldsymbol{A}.\)
Proof.
A matrix \(\ \boldsymbol{A}\ \) is diagonalizable by a similarity transformation if there exists an invertible matrix \(\ \boldsymbol{P}\,\equiv\, [\ \boldsymbol{X}_1\,|\ \boldsymbol{X}_2\,|\ \ldots\,|\ \boldsymbol{X}_n\,]\ \) such that
\[\boldsymbol{P}^{-1}\boldsymbol{A}\,\boldsymbol{P}\ =\ \boldsymbol{D}\,,\]
where \(\ \boldsymbol{D}\ \) is a diagonal matrix: \(\ \boldsymbol{D}\,=\,\text{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n).\ \) This is equivalent to the conditions
\[\boldsymbol{A}\,\boldsymbol{P}\ =\ \boldsymbol{P}\,\boldsymbol{D}\,,\qquad \det{\boldsymbol{P}}\ \neq\ 0\,.\]
The condition \(\ \det{\boldsymbol{P}}\neq 0\ \) means that \(\ \mathcal{B} = (\boldsymbol{X}_1,\, \boldsymbol{X}_2,\,\ldots,\,\boldsymbol{X}_n)\ \) comprises a linearly independent set of vectors of the space \(\ C^n.\ \) Moreover, by the above Lemma:
\[\boldsymbol{A}\,\boldsymbol{X}_i\ =\ \lambda_i\,\boldsymbol{X}_i\,,\qquad i\,=\,1,2,\ldots,n\,.\]
In an \(\ n\)-dimensional vector space every set of \(\,n\,\) linearly independent vectors is a basis. Therefore, since \(\ \dim{C^n}=n,\ \) \(\,\mathcal{B}\ \) is a basis of the space \(\ C^n;\ \) it is a basis consisting of eigenvectors of the matrix \(\,\boldsymbol{A}.\)
Conversely, if eigenvectors \(\ \boldsymbol{X}_1,\,\boldsymbol{X}_2,\, \ldots,\,\boldsymbol{X}_n\ \) of the matrix \(\,\boldsymbol{A}\in M_n(C)\,\) form a basis of the space \(\,C^n,\ \) then the matrix \(\,\boldsymbol{P}\,\) whose columns are these basis vectors is invertible and, by the Lemma, diagonalizes the matrix \(\,\boldsymbol{A}.\)
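Theorem 7 can be illustrated numerically; the matrix and its eigenpairs below are a hypothetical example, verified in the code itself:

```python
import numpy as np

# Hypothetical example: A has eigenpairs (5, (1,1)^T) and (2, (1,-2)^T).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
X1 = np.array([1.0, 1.0])
X2 = np.array([1.0, -2.0])
assert np.allclose(A @ X1, 5.0 * X1)   # A X1 = 5 X1
assert np.allclose(A @ X2, 2.0 * X2)   # A X2 = 2 X2

# X1 and X2 form a basis of C^2, so P = [X1 | X2] diagonalizes A.
P = np.column_stack([X1, X2])
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag([5.0, 2.0]))
```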
Comments and corollaries.
1.) Every matrix \(\,\boldsymbol{A}\in M_n(C)\,\) has at least one eigenvalue \(\,\lambda\,\) and an associated eigenvector \(\,\boldsymbol{X}.\ \) Hence, because the equation (2) does not require that the eigenvalues \(\,\lambda_i\,\) and the associated eigenvectors \(\,\boldsymbol{X}_i\,\) are distinct, there always exists a matrix \(\,\boldsymbol{P}\ \) such that the equation (3) holds. In particular, one may take
\[\boldsymbol{P}\,=\,[\ \boldsymbol{X}\,|\ \boldsymbol{X}\,|\ \ldots\,|\ \boldsymbol{X}\ ]\,,\qquad \boldsymbol{D}\,=\,\text{diag}(\lambda,\lambda,\ldots,\lambda)\,=\,\lambda\,\boldsymbol{I}_n\,.\]
Then \(\,\boldsymbol{A}\boldsymbol{P}= \boldsymbol{P}\boldsymbol{D}=\lambda\,\boldsymbol{P},\ \) but the matrix \(\ \boldsymbol{P}\ \) is not invertible and thus the relation (1) does not hold.
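This degenerate situation can be sketched with a Jordan block (an assumed example, not from the text):

```python
import numpy as np

# A 2x2 Jordan block: the single eigenvalue 3 with a one-dimensional eigenspace.
A = np.array([[3.0, 1.0],
              [0.0, 3.0]])

# Every eigenvector of A is a multiple of X = (1, 0)^T.
X = np.array([1.0, 0.0])
assert np.allclose(A @ X, 3.0 * X)

# Repeating X in every column gives A P = P D = 3 P, so equation (3) holds ...
P = np.column_stack([X, X])
D = 3.0 * np.eye(2)
assert np.allclose(A @ P, P @ D)

# ... but P is singular (rank 1 < 2), so relation (1) cannot be formed.
assert np.linalg.matrix_rank(P) == 1
```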
2.) The formula \(\ \boldsymbol{D}\,=\, \boldsymbol{P}^{-1}\boldsymbol{A}\,\boldsymbol{P}\ \) may be interpreted in terms of the transformation of a matrix of a linear operator under a change of basis. Consider the space \(\,C^n\ \) with the canonical basis \(\ \mathcal{E}\,=\,(\boldsymbol{e}_1,\boldsymbol{e}_2,\ldots,\,\boldsymbol{e}_n).\ \) Let \(\,\boldsymbol{A}\ \) be the matrix of a linear operator \(\,F\in \text{End}(C^n)\ \) defined by \(\ F(\boldsymbol{x})\,:=\,\boldsymbol{A}\boldsymbol{x},\ \) \(\,\boldsymbol{x}\in C^n.\ \) If the eigenvectors \(\ \boldsymbol{X}_1,\boldsymbol{X}_2,\ldots,\boldsymbol{X}_n\ \) of the operator \(\,F\,\) are linearly independent, then the matrix \(\ \boldsymbol{P}\,=\, [\ \boldsymbol{X}_1\,|\,\boldsymbol{X}_2\,|\,\ldots\,|\,\boldsymbol{X}_n\ ]\ \) is the transition matrix from the canonical basis \(\,\mathcal{E}\,\) to the basis \(\,\mathcal{B}\,=\, (\boldsymbol{X}_1,\boldsymbol{X}_2,\ldots,\,\boldsymbol{X}_n)\ \) consisting of the eigenvectors.
Hence, \(\boldsymbol{D}\ \) is a matrix of the operator \(\,F\ \) in the basis \(\,\mathcal{B}\ \) consisting of its eigenvectors. As one should expect, this is a diagonal matrix with the eigenvalues of \(\,F\ \) on the diagonal.
3.) We know already that the eigenvectors of a linear operator which are associated with distinct eigenvalues are linearly independent.
Corollary. If a matrix \(\,\boldsymbol{A}\in M_n(C)\ \) has \(\,n\,\) distinct eigenvalues, then there exists a similarity transformation which diagonalizes this matrix.
Indeed, if columns of the matrix \(\,\boldsymbol{P}\,\) are eigenvectors of the matrix \(\,\boldsymbol{A}\,\) which are associated with distinct eigenvalues, then the matrix \(\,\boldsymbol{P}\,\) is non-degenerate: \(\,\det{\boldsymbol{P}}\neq 0,\ \) and thus invertible.
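A quick numerical check of the Corollary (the matrix is an illustrative choice):

```python
import numpy as np

# An illustrative matrix with two distinct eigenvalues, 1 and 3.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
lam, P = np.linalg.eig(A)

# Eigenvectors associated with distinct eigenvalues are linearly independent,
# so det P != 0 and the similarity transformation (1) exists.
assert len(set(np.round(lam, 8))) == 2
assert abs(np.linalg.det(P)) > 1e-12
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag(lam))
```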
4.) Eigenvectors of a normal operator which are associated with distinct eigenvalues comprise an orthogonal system, and, after normalization, an orthonormal system. A matrix whose columns comprise an orthonormal system is unitary.
Corollary. Let \(\,\boldsymbol{A}\in M_n(C)\ \) be a normal (e.g. Hermitian or unitary) matrix. If \(\,\boldsymbol{A}\ \) has \(\,n\,\) distinct eigenvalues, then there exists a unitary similarity transformation which diagonalizes this matrix (a diagonalizing matrix \(\,\boldsymbol{P}\ \) is unitary: \(\ \boldsymbol{P}^+\boldsymbol{P}=\boldsymbol{I}_n).\)
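A sketch of this Corollary for a Hermitian matrix (the matrix is an assumed example; `numpy.linalg.eigh` is NumPy's routine for Hermitian input):

```python
import numpy as np

# A Hermitian matrix: A^+ = A, hence normal; its eigenvalues are 1 and 4.
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
assert np.allclose(A, A.conj().T)

# eigh returns real eigenvalues and a matrix of orthonormal eigenvectors.
lam, P = np.linalg.eigh(A)

# P^+ P = I_n: the eigenvector columns form an orthonormal system, so P is unitary.
assert np.allclose(P.conj().T @ P, np.eye(2))

# The unitary similarity transformation diagonalizes A.
assert np.allclose(P.conj().T @ A @ P, np.diag(lam))
```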
Remark. A normal matrix does not have to have \(\,n\,\) distinct eigenvalues to be diagonalizable. Namely, one can prove a more general
Theorem 8.
A matrix \(\,\boldsymbol{A}\in M_n(C)\ \) is diagonalizable by a unitary similarity transformation if and only if it is normal.
Application to real matrices.
For a real matrix \(\,\boldsymbol{A}\in M_n(R)\ \) we have \(\,\boldsymbol{A}^+=\boldsymbol{A}^T.\ \) Therefore
\[\boldsymbol{A}^+\,=\,\boldsymbol{A}\quad\Leftrightarrow\quad \boldsymbol{A}^T\,=\,\boldsymbol{A}\]
(a real Hermitian matrix is symmetric), and
\[\boldsymbol{A}^+\boldsymbol{A}\,=\,\boldsymbol{I}_n\quad\Leftrightarrow\quad \boldsymbol{A}^T\boldsymbol{A}\,=\,\boldsymbol{I}_n\]
(a real unitary matrix is orthogonal).
Theorem 9.
Every real symmetric or orthogonal matrix is diagonalizable by a unitary similarity transformation.
Eigenvalues of a real symmetric matrix are real, and its eigenvectors may be chosen real. Hence, the unitary diagonalizing matrix may be taken to be a real orthogonal matrix.
Corollary. Every real symmetric matrix is diagonalizable by a real orthogonal similarity transformation.
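For a real symmetric matrix the diagonalizing matrix can indeed be taken real orthogonal; a minimal NumPy sketch with an illustrative matrix:

```python
import numpy as np

# A real symmetric matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, P = np.linalg.eigh(A)

# P is real and orthogonal: P^T P = I_2 ...
assert not np.iscomplexobj(P)
assert np.allclose(P.T @ P, np.eye(2))

# ... and the orthogonal similarity transformation diagonalizes A.
assert np.allclose(P.T @ A @ P, np.diag(lam))
```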
In contrast to the previous case, eigenvalues of a real orthogonal matrix (and thus also its eigenvectors) may be complex rather than real. In that case the unitary diagonalizing matrix is also complex rather than real.
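A rotation by 90 degrees illustrates this: a real orthogonal matrix whose eigenvalues are \(\,\pm i\,\) (an assumed example):

```python
import numpy as np

# A real orthogonal matrix: rotation by 90 degrees; Q^T Q = I_2.
Q = np.array([[0.0, -1.0],
              [1.0,  0.0]])
assert np.allclose(Q.T @ Q, np.eye(2))

# Its eigenvalues are the non-real numbers +i and -i ...
lam, P = np.linalg.eig(Q)
assert np.allclose(sorted(lam, key=lambda z: z.imag), [-1.0j, 1.0j])

# ... so the diagonalizing matrix P is complex, yet still unitary
# (eigenvectors for distinct eigenvalues of a normal matrix are orthogonal).
assert np.iscomplexobj(P)
assert np.allclose(P.conj().T @ P, np.eye(2))
assert np.allclose(np.linalg.inv(P) @ Q @ P, np.diag(lam))
```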
Theorem 10.
If a matrix \(\ \boldsymbol{A}\ \) is diagonalizable by a similarity transformation, then the algebraic multiplicity of every eigenvalue is equal to its geometric multiplicity.
Proof. \(\ \) If a transformation \(\ \boldsymbol{A}\ \rightarrow\ \boldsymbol{P}^{-1}\boldsymbol{A}\,\boldsymbol{P}\ \equiv\boldsymbol{D}\ \) diagonalizes the matrix \(\ \boldsymbol{A},\ \) then the diagonal entries of \(\ \boldsymbol{D}\ \) are the eigenvalues \(\ \lambda_1,\lambda_2,\ldots,\lambda_k\ \) of the matrix \(\,\boldsymbol{A}.\ \) The number of times \(\,\lambda_i\,\) occurs on the diagonal of the matrix \(\ \boldsymbol{D}\ \) is equal to both the algebraic and the geometric multiplicity of this eigenvalue.
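Theorem 10 can be checked on a matrix with a repeated eigenvalue; below, an illustrative symmetric (hence diagonalizable) matrix:

```python
import numpy as np

# A real symmetric matrix; its eigenvalues are 4, 1, 1.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
lam, P = np.linalg.eigh(A)

# Algebraic multiplicity of lambda = 1: how often it appears on the diagonal of D.
alg_mult = int(np.sum(np.isclose(lam, 1.0)))

# Geometric multiplicity of lambda = 1: dim ker(A - 1*I) = n - rank(A - I).
geo_mult = 3 - np.linalg.matrix_rank(A - np.eye(3))

assert alg_mult == geo_mult == 2
```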