Diagonalization by a Similarity Transformation

Definition.

A matrix $A \in M_n(\mathbb{C})$ is diagonalizable by a similarity transformation if there exists an invertible matrix $P \in M_n(\mathbb{C})$ such that

(1) $P^{-1}AP = D$,

where $D \in M_n(\mathbb{C})$ is a diagonal matrix.

We say that  P  is a diagonalizing matrix or that it diagonalizes the matrix  A.

Lemma.

Consider matrices $A, P \in M_n(\mathbb{C})$. The columns $X_1, X_2, \ldots, X_n$ of the matrix $P$ are eigenvectors of the matrix $A$:

(2) $AX_1 = \lambda_1 X_1, \quad AX_2 = \lambda_2 X_2, \quad \ldots, \quad AX_n = \lambda_n X_n$

if and only if

(3) $AP = PD$,

where $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.

Indeed, according to the column rule of matrix multiplication:

$$AP = A\,[\,X_1 \,|\, X_2 \,|\, \cdots \,|\, X_n\,] = [\,AX_1 \,|\, AX_2 \,|\, \cdots \,|\, AX_n\,],$$
$$PD = [\,\lambda_1 X_1 \,|\, \lambda_2 X_2 \,|\, \cdots \,|\, \lambda_n X_n\,],$$

and thus the conditions (2) and (3) are equivalent.
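The column rule can be checked numerically. A minimal NumPy sketch; the matrix $A$ and its eigenpairs below are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Illustrative matrix with eigenpairs (5, X1) and (2, X2).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
X1 = np.array([1.0, 1.0])              # A X1 = 5 X1
X2 = np.array([1.0, -2.0])             # A X2 = 2 X2
P = np.column_stack([X1, X2])          # P = [ X1 | X2 ]
D = np.diag([5.0, 2.0])                # D = diag(lambda_1, lambda_2)

# Column rule: the i-th column of AP is A X_i; the i-th column of PD is lambda_i X_i.
assert np.allclose((A @ P)[:, 0], A @ X1)
assert np.allclose((P @ D)[:, 1], 2.0 * X2)
# Hence (2) holds column by column iff (3) holds as a matrix equation:
assert np.allclose(A @ P, P @ D)
```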

Theorem 7.

A matrix $A$ is diagonalizable by a similarity transformation (1) if and only if the space $\mathbb{C}^n$ has a basis $\mathcal{B} = (X_1, X_2, \ldots, X_n)$ consisting of eigenvectors of the matrix $A$:

$$AX_1 = \lambda_1 X_1, \quad AX_2 = \lambda_2 X_2, \quad \ldots, \quad AX_n = \lambda_n X_n, \qquad \lambda_1, \lambda_2, \ldots, \lambda_n \in \mathbb{C}.$$

Then the matrix $P = [\,X_1 \,|\, X_2 \,|\, \cdots \,|\, X_n\,]$, whose columns are the vectors of the basis $\mathcal{B}$, diagonalizes the matrix $A$.

Proof.

A matrix $A$ is diagonalizable by a similarity transformation if there exists an invertible matrix $P = [\,X_1 \,|\, X_2 \,|\, \cdots \,|\, X_n\,]$ such that

$$P^{-1}AP = D,$$

where $D$ is a diagonal matrix: $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. This is equivalent to the conditions

$$AP = PD \quad \text{and} \quad \det P \neq 0.$$

The condition $\det P \neq 0$ means that $\mathcal{B} = (X_1, X_2, \ldots, X_n)$ is a linearly independent set of vectors in the space $\mathbb{C}^n$. Moreover, by the Lemma above:

$$AX_1 = \lambda_1 X_1, \quad AX_2 = \lambda_2 X_2, \quad \ldots, \quad AX_n = \lambda_n X_n.$$

In an $n$-dimensional vector space every set of $n$ linearly independent vectors is a basis. Therefore, since $\dim \mathbb{C}^n = n$, $\mathcal{B}$ is a basis of the space $\mathbb{C}^n$; it is a basis consisting of eigenvectors of the matrix $A$.

Conversely, if eigenvectors $X_1, X_2, \ldots, X_n$ of a matrix $A \in M_n(\mathbb{C})$ span the space $\mathbb{C}^n$, then the matrix $P$ whose columns are these basis vectors diagonalizes the matrix $A$.
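Theorem 7 can be illustrated numerically: NumPy's `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are eigenvectors, which plays the role of $P$. A minimal sketch with an illustrative matrix (an assumption for this example):

```python
import numpy as np

# Illustrative matrix; np.linalg.eig returns eigenvalues and a matrix P
# whose columns are the corresponding eigenvectors.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, P = np.linalg.eig(A)

# The similarity transformation (1): P^{-1} A P = D = diag(eigenvalues).
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(eigvals))
```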

Comments and corollaries.

1.) Every matrix $A \in M_n(\mathbb{C})$ has at least one eigenvalue $\lambda$ with an associated eigenvector $X$. Hence, because equation (2) does not require the eigenvalues $\lambda_i$ and the associated eigenvectors $X_i$ to be distinct, there always exists a matrix $P$ such that equation (3) holds. In particular, one may take

$$\lambda_1 = \lambda_2 = \cdots = \lambda_n = \lambda, \qquad X_1 = X_2 = \cdots = X_n = X.$$

Then $AP = PD = \lambda P$, but the matrix $P$ is not invertible, and thus relation (1) does not hold.
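This failure mode is easy to exhibit numerically. In the sketch below (the matrix and eigenpair are illustrative assumptions) the same eigenvector is used for both columns of $P$, so (3) holds but $P$ is singular:

```python
import numpy as np

# Illustrative matrix: lambda = 3 is an eigenvalue with eigenvector X = (1, 0).
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])
X = np.array([1.0, 0.0])               # A X = 3 X
P = np.column_stack([X, X])            # X1 = X2 = X: a repeated eigenvector
D = np.diag([3.0, 3.0])                # lambda_1 = lambda_2 = 3

assert np.allclose(A @ P, P @ D)       # (3) holds: AP = PD = 3 P
assert np.isclose(np.linalg.det(P), 0.0)   # but P is singular, so (1) fails
```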

2.) The formula $D = P^{-1}AP$ may be interpreted in terms of the transformation of the matrix of a linear operator under a change of basis. Consider the space $\mathbb{C}^n$ with the canonical basis $\mathcal{E} = (e_1, e_2, \ldots, e_n)$. Let $A$ be the matrix of the linear operator $F \in \operatorname{End}(\mathbb{C}^n)$ defined by $F(x) := Ax$, $x \in \mathbb{C}^n$. If the eigenvectors $X_1, X_2, \ldots, X_n$ of the operator $F$ are linearly independent, then the matrix $P = [\,X_1 \,|\, X_2 \,|\, \cdots \,|\, X_n\,]$ is the transition matrix from the canonical basis $\mathcal{E}$ to the basis $\mathcal{B} = (X_1, X_2, \ldots, X_n)$ consisting of the eigenvectors.

Hence, $D$ is the matrix of the operator $F$ in the basis $\mathcal{B}$ consisting of its eigenvectors. As one should expect, it is a diagonal matrix with the eigenvalues of $F$ on the diagonal.

3.) We already know that eigenvectors of a linear operator associated with distinct eigenvalues are linearly independent.

Corollary. If a matrix $A \in M_n(\mathbb{C})$ has $n$ distinct eigenvalues, then there exists a similarity transformation which diagonalizes this matrix.

Indeed, if the columns of the matrix $P$ are eigenvectors of the matrix $A$ associated with distinct eigenvalues, then the matrix $P$ is non-degenerate: $\det P \neq 0$, and thus invertible.
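A numerical check of this corollary; the triangular matrix below is an illustrative assumption whose distinct eigenvalues can be read off the diagonal:

```python
import numpy as np

# Illustrative triangular matrix: its eigenvalues 1, 3, 5 are pairwise distinct.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])
eigvals, P = np.linalg.eig(A)

assert len(set(np.round(eigvals.real, 8))) == 3   # three distinct eigenvalues
assert abs(np.linalg.det(P)) > 1e-12              # hence det P != 0: P invertible
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag(eigvals))
```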

4.) Eigenvectors of a normal operator associated with distinct eigenvalues form an orthogonal system, and after normalization an orthonormal system. A matrix whose columns form an orthonormal system is unitary.

Corollary. Let $A \in M_n(\mathbb{C})$ be a normal (e.g. Hermitian or unitary) matrix. If $A$ has $n$ distinct eigenvalues, then there exists a unitary similarity transformation which diagonalizes this matrix (the diagonalizing matrix $P$ is unitary: $P^{+}P = I_n$).

Remark. A normal matrix does not have to have $n$ distinct eigenvalues to be diagonalizable. Namely, one can prove a more general

Theorem 8.

A matrix $A \in M_n(\mathbb{C})$ is diagonalizable by a unitary similarity transformation if and only if it is normal.
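A sketch of Theorem 8 for a normal matrix that is neither Hermitian nor real symmetric (a 90-degree rotation, chosen as an illustration). Note a caveat on the tooling: `np.linalg.eig` does not in general return orthonormal eigenvectors, but here the eigenvalues are distinct, so the unit-norm eigenvectors of this normal matrix are automatically orthogonal and $P$ comes out unitary:

```python
import numpy as np

# A rotation by 90 degrees: real orthogonal, hence unitary, hence normal.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # A A^+ = A^+ A: normal

eigvals, P = np.linalg.eig(A)          # eigenvalues are +i and -i
# Distinct eigenvalues of a normal matrix => orthogonal eigenvectors;
# eig normalizes them, so P is unitary: P^+ P = I.
assert np.allclose(P.conj().T @ P, np.eye(2))
assert np.allclose(P.conj().T @ A @ P, np.diag(eigvals))
```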

Application to real matrices.

For a real matrix $A$: $A \in M_n(\mathbb{R})$, we have $A^{+} = A^{T}$. Therefore

$$A^{+} = A \iff A^{T} = A$$

(a real Hermitian matrix is symmetric), and

$$A^{+}A = I_n \iff A^{T}A = I_n$$

(a real unitary matrix is orthogonal).

Theorem 9.

Every real symmetric or orthogonal matrix is diagonalizable by a unitary similarity transformation.

Eigenvalues of a real symmetric matrix are real, and thus its eigenvectors may also be chosen real. Hence, a unitary diagonalizing matrix may be chosen to be a real orthogonal matrix.

Corollary. Every real symmetric matrix is diagonalizable by a real orthogonal similarity transformation.
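This corollary is what `np.linalg.eigh` implements in practice; the symmetric matrix below is an illustrative assumption:

```python
import numpy as np

# np.linalg.eigh is intended for symmetric/Hermitian input and returns an
# orthonormal eigenbasis; for a real symmetric A the matrix Q is real orthogonal.
A = np.array([[5.0, 2.0],
              [2.0, 2.0]])             # eigenvalues 6 and 1
eigvals, Q = np.linalg.eigh(A)

assert np.isrealobj(Q)                           # Q is real
assert np.allclose(Q.T @ Q, np.eye(2))           # Q^T Q = I: Q is orthogonal
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))
```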

In contrast to the previous case, the eigenvalues of a real orthogonal matrix (and thus also its eigenvectors) may be complex rather than real. Then the unitary diagonalizing matrix will also be complex rather than real.
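A rotation matrix (chosen as an illustration, with an assumed angle $\theta = \pi/3$) shows this: the matrix is real orthogonal, yet its eigenvalues $e^{\pm i\theta}$ and its unitary diagonalizing matrix are complex:

```python
import numpy as np

# A real rotation by theta = pi/3: orthogonal, with complex eigenvalues
# exp(+-i*theta), so the unitary diagonalizing matrix is complex as well.
theta = np.pi / 3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(A.T @ A, np.eye(2))           # A is real orthogonal

eigvals, P = np.linalg.eig(A)
assert np.all(np.abs(eigvals.imag) > 1e-10)      # eigenvalues are not real
assert np.iscomplexobj(P)                        # neither is P
assert np.allclose(P.conj().T @ A @ P, np.diag(eigvals))  # P is unitary
```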

Theorem 10.

If a matrix $A$ is diagonalizable by a similarity transformation, then the algebraic multiplicity of every eigenvalue is equal to its geometric multiplicity.

Proof.   If a transformation $A \mapsto P^{-1}AP = D$ diagonalizes the matrix $A$, then $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of the matrix $A$, repeated according to multiplicity. The number of times an eigenvalue $\lambda_i$ occurs on the diagonal of the matrix $D$ is equal to both the algebraic and the geometric multiplicity of this eigenvalue, since similar matrices have the same characteristic polynomial and the same eigenspace dimensions.
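Theorem 10 can be checked on an example. The symmetric (hence diagonalizable) matrix below is an illustrative assumption with eigenvalues $4, 1, 1$:

```python
import numpy as np

# Illustrative symmetric matrix: eigenvalue 1 has algebraic multiplicity 2.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
eigvals = np.linalg.eigvalsh(A)

algebraic = int(np.sum(np.isclose(eigvals, 1.0)))      # occurrences of 1 in D
geometric = 3 - np.linalg.matrix_rank(A - np.eye(3))   # dim ker(A - I)
assert algebraic == 2 and geometric == 2               # multiplicities agree
```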