LU Factorization of a Matrix

Our aim is to represent a square matrix \(\,\boldsymbol{A}\in M_n(K)\,\) as a product of a lower and an upper triangular matrices: \(\ \boldsymbol{A}\ =\ \boldsymbol{L}\,\boldsymbol{U}.\)

We start out with a particular example. \(\,\) Let \(\ \boldsymbol{A}\ =\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2\end{array}\right]\in M_n(Q)\,.\)

The Gauss elimination, which converts \(\,\boldsymbol{A}\,\) to an upper triangular matrix \(\,\boldsymbol{U},\ \) is composed of the following consecutive elementary operations:

  1. From the 2nd row of matrix \(\,\boldsymbol{A}\,\) we subtract the doubled 1st row.
    This elementary operation \(\,O_3(1,0,-2)\,\) is represented by the elementary matrix \(\,\boldsymbol{E}^{(1)}= \boldsymbol{E}_3(1,0,-2).\ \) The result is \(\ \boldsymbol{A}_1=\boldsymbol{E}^{(1)}\boldsymbol{A}:\)
    \[\begin{split}\boldsymbol{A}_1\ =\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1\end{array}\right]\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2\end{array}\right]\ =\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ -2 & 7 & 2\end{array}\right]\,.\end{split}\]
  2. To the 3rd row of matrix \(\,\boldsymbol{A}_1\,\) we add the 1st row.
    This operation \(\,O_3(2,0,1)\,\) is represented by the matrix \(\,\boldsymbol{E}^{(2)}=\boldsymbol{E}_3(2,0,1).\,\)
    The result is \(\ \boldsymbol{A}_2= \boldsymbol{E}^{(2)}\boldsymbol{A}_1:\)
    \[\begin{split}\boldsymbol{A}_2\ =\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1\end{array}\right]\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ -2 & 7 & 2\end{array}\right]\ =\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 8 & 3\end{array}\right]\,.\end{split}\]
  3. To the 3rd row of matrix \(\,\boldsymbol{A}_2\,\) we add the 2nd row.
    This operation \(\,O_3(2,1,1)\,\) is represented by the matrix \(\,\boldsymbol{E}^{(3)}=\boldsymbol{E}_3(2,1,1).\,\)
    The result is the upper triangular matrix \(\ \boldsymbol{U}=\boldsymbol{E}^{(3)}\boldsymbol{A}_2:\)
    \[\begin{split}\boldsymbol{U}\ =\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1\end{array}\right]\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 8 & 3\end{array}\right]\ =\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 0 & 1\end{array}\right]\,.\end{split}\]

Gathering together the partial results, we may write:

(1)\[\boldsymbol{U}\ =\ \boldsymbol{E}^{(3)}\boldsymbol{E}^{(2)}\boldsymbol{E}^{(1)}\boldsymbol{A}\,.\]

In each of the the three applied elementary operations a multiple of an upper row (with a lesser index) has been added to a lower row (with a greater index). The corresponding elementary matrices, obtained by application of these operations to the identity matrix, are therefore lower triangular matrices. So are their products and inverses.

Thus the Eq. (1) can be rewritten as

\[\boldsymbol{A}\ =\ \boldsymbol{L}\,\boldsymbol{U}\]

where \(\quad\boldsymbol{L}\ \ =\ \ \left[\, \boldsymbol{E}^{(3)}\boldsymbol{E}^{(2)}\boldsymbol{E}^{(1)} \right]^{-1}\ =\ \ \ \left[\boldsymbol{E}^{(1)}\right]^{-1} \left[\boldsymbol{E}^{(2)}\right]^{-1} \left[\boldsymbol{E}^{(3)}\right]^{-1}\quad\) is a lower, \(\\\) and \(\ \boldsymbol{U}\ \) \(\,\) - \(\,\) an upper triangular matrix.

The matrix \(\ \boldsymbol{L}\ \) can be calculated by hand:

\[ \begin{align}\begin{aligned}\boldsymbol{E}^{(3)}\boldsymbol{E}^{(2)}\boldsymbol{E}^{(1)}\ =\ \boldsymbol{E}_3(2,1,1)\,\cdot\, \boldsymbol{E}_3(2,0,1)\,\cdot\, \boldsymbol{E}_3(1,0,-2)\ =\\\begin{split}=\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1\end{array}\right]\,\cdot\, \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1\end{array}\right]\,\cdot\, \left[\begin{array}{rrr} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1\end{array}\right]\ =\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1 & 1 & 1\end{array}\right]\,;\end{split}\\\begin{split}\boldsymbol{L}\quad =\quad \left[\begin{array}{rrr} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1 & 1 & 1\end{array}\right] ^{-1}\ =\quad \left[\begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -1 & 1\end{array}\right]\,.\end{split}\end{aligned}\end{align} \]

or using the Sage code:

sage: E1 = elementary_matrix(QQ, 3, row1=1, row2=0, scale=-2)
sage: E2 = elementary_matrix(QQ, 3, row1=2, row2=0, scale= 1)
sage: E3 = elementary_matrix(QQ, 3, row1=2, row2=1, scale= 1)
sage: E = E3*E2*E1
sage: L = E.inverse(); L

[ 1  0  0]
[ 2  1  0]
[-1 -1  1]

Finally, the decomposition \(\,\boldsymbol{A}= \boldsymbol{L}\boldsymbol{U}\ \) of the matrix \(\,\boldsymbol{A}\,\) into a product of a lower and upper triangular matrices, reads as follows:

(2)\[\begin{split}\blacktriangleright\quad \left[\begin{array}{rrr} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2\end{array}\right]\ \ =\ \ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -1 & 1\end{array}\right] \ \cdot\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 0 & 1\end{array}\right]\,.\end{split}\]

The above-mentioned procedure would not be directly applicable, if the upper left-most coefficient \(\,a_{11}\,\) vanished. Then, the operations on rows should be preceded by their permutation, which however involves a non-triangular elementary matrix. In such a case the procedure would be applied to the matrix \(\,\boldsymbol{P}\boldsymbol{A},\,\) where \(\,\boldsymbol{P}\,\) is the matrix of an appropriate permutation: \(\,\boldsymbol{P}\boldsymbol{A}=\boldsymbol{L}\,\boldsymbol{U}.\)

In Sage the decomposition is performed by the method LU(), 1 \(\ \) applicable to a \(\,m \times n\,\) rectangular matrix \(\,\boldsymbol{A}.\ \) By default, the output is a triplet \(\,(\boldsymbol{P},\boldsymbol{L},\boldsymbol{U}),\ \) where \(\,\boldsymbol{L}\,\) is a \(\,m \times m\,\) lower triangular matrix with all diagonal elements equal to one, \(\,\boldsymbol{U}\,\) is a \(\,m \times n\,\) upper diagonal matrix and \(\,\boldsymbol{P}\,\) is a \(\,m \times m\,\) permutation matrix such that \(\,\boldsymbol{A}\ =\ \boldsymbol{P}\,\boldsymbol{L}\,\boldsymbol{U}\,.\)

A direct application of the method LU() to the matrix \(\,\boldsymbol{A}\,\) from our example

provides a decomposition different from (2):

\[\begin{split}\left[\begin{array}{rrr} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{array}\right]\ \ =\ \ \left[\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]\ \cdot\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ \textstyle{1\over 2} & 1 & 0 \\ -\textstyle{1\over 2} & 1 & 1 \end{array}\right]\ \cdot\ \left[\begin{array}{rrr} 4 & -6 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 1 \end{array}\right]\,.\end{split}\]

To obtain the result (2), the ‘nonzero’ pivoting strategy is to be assumed:

Now the output reads:

\[\begin{split}\left[\begin{array}{rrr} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{array}\right]\ \ =\ \ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]\ \cdot\ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -1 & 1 \end{array}\right]\ \cdot\ \left[\begin{array}{rrr} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 0 & 1 \end{array}\right]\,.\end{split}\]

\(\;\)

Note

The method LU() has been designed for use with exact rings, e.g. the field QQ of rational numbers. For numerical results, the base ring should be converted to the field RDF of real numbers with double precision or to the field CDF of complex numbers with double precision.

1

http://doc.sagemath.org/html/en/reference/matrices/sage/matrix/matrix2.html