Operations upon Matrices

Elementary Operations and Elementary Matrices

To perform an elementary operation \(\,O\,\) on a product of two matrices \(\,\boldsymbol{A}\ \) and \(\ \boldsymbol{B},\ \) it suffices to apply it to the first factor of the product: \(\ O(\boldsymbol{A}\boldsymbol{B}) = (O\boldsymbol{A})\,\boldsymbol{B}.\ \) This is stated precisely in the following

Lemma. \(\,\)

If \(\,\boldsymbol{A}\in M_{m\times p}(K),\ \boldsymbol{B}\in M_{p\times n}(K),\ \) then \(\,\) for \(\ i,j=0,1,\ldots,m-1:\)

  1. \(\ O_1(i,j)\,(\boldsymbol{A}\boldsymbol{B})\ \ =\ \ [\,O_1(i,j)\,\boldsymbol{A}\,]\ \boldsymbol{B}\,,\)

  2. \(\ O_2(i,a)\,(\boldsymbol{A}\boldsymbol{B})\ \ =\ \ [\,O_2(i,a)\,\boldsymbol{A}\,]\ \boldsymbol{B}\,,\qquad (a\ne 0)\)

  3. \(\ O_3(i,j,a)\,(\boldsymbol{A}\boldsymbol{B})\ \ =\ \ [\,O_3(i,j,a)\,\boldsymbol{A}\,]\ \boldsymbol{B}\,.\)

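The proof uses the row rule of matrix multiplication, by which each row of the product \(\,\boldsymbol{A}\boldsymbol{B}\,\) is the corresponding row of \(\,\boldsymbol{A}\,\) multiplied by \(\,\boldsymbol{B}:\)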

\[\begin{split}\boldsymbol{A}\boldsymbol{B}\ \equiv\ \left[\begin{array}{c} \boldsymbol{A}_1 \\ \boldsymbol{A}_2 \\ \dots \\ \boldsymbol{A}_m \end{array}\right]\boldsymbol{B} \ \ =\ \ \left[\begin{array}{c} \boldsymbol{A}_1\,\boldsymbol{B} \\ \boldsymbol{A}_2\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_m\,\boldsymbol{B} \end{array}\right]\,.\end{split}\]

Hence, identities 1., 2. and 3. may be derived as follows:

\[ \begin{align}\begin{aligned}\begin{split}O_1(i,j)\,(\boldsymbol{A}\boldsymbol{B})\ =\ O_1(i,j)\, \left[\begin{array}{c} \dots \\ \boldsymbol{A}_i\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \end{array} \right]\ =\ \left[\begin{array}{c} \dots \\ \boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_i\,\boldsymbol{B} \\ \dots \end{array} \right]\ =\ \left[\begin{array}{c} \dots \\ \boldsymbol{A}_j \\ \dots \\ \boldsymbol{A}_i \\ \dots \end{array} \right]\,\boldsymbol{B}\ =\ [\,O_1(i,j)\,\boldsymbol{A}\,]\,\boldsymbol{B}\ ;\end{split}\\\begin{split}O_2(i,a)\,(\boldsymbol{A}\boldsymbol{B})\ =\ O_2(i,a)\, \left[\begin{array}{c} \boldsymbol{A}_1\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_i\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_m\,\boldsymbol{B} \\ \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{A}_1\,\boldsymbol{B} \\ \dots \\ a\,\boldsymbol{A}_i\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_m\,\boldsymbol{B} \\ \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{A}_1 \\ \dots \\ a\,\boldsymbol{A}_i \\ \dots \\ \boldsymbol{A}_m \\ \end{array} \right]\boldsymbol{B}\ =\ [\,O_2(i,a)\,\boldsymbol{A}\,]\ \boldsymbol{B}\,;\end{split}\end{aligned}\end{align} \]
\[ \begin{align}\begin{aligned}\begin{split}O_3(i,j,a)\,(\boldsymbol{A}\boldsymbol{B})\ \ =\ \ O_3(i,j,a)\, \left[\begin{array}{c} \dots \\ \boldsymbol{A}_i\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \end{array} \right]\ \ =\ \ \left[\begin{array}{c} \dots \\ \boldsymbol{A}_i\,\boldsymbol{B}\, +\, a\,\boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \end{array} \right]\ \ =\end{split}\\\begin{split}=\ \ \ \left[\begin{array}{c} \dots \\ (\boldsymbol{A}_i\ + \, a\boldsymbol{A}_j)\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_j\,\boldsymbol{B} \\ \dots \end{array} \right]\ \ \ =\ \ \ \left[\begin{array}{c} \dots \\ \boldsymbol{A}_i\ + a\boldsymbol{A}_j \\ \dots \\ \boldsymbol{A}_j \\ \dots \end{array} \right]\,\boldsymbol{B}\ \ \ =\ \ \ [\,O_3(i,j,a)\,\boldsymbol{A}\,]\ \boldsymbol{B}\,.\end{split}\end{aligned}\end{align} \]
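The lemma can be checked numerically on a small example. The following is a minimal sketch in plain Python, not part of the original text. It assumes that matrices are stored as lists of rows and that indices are 0-based, as in the statement of the lemma. The helper names `matmul`, `swap_rows`, `scale_row` and `add_row_multiple` are hypothetical stand-ins for the matrix product and for \(\,O_1,\,O_2,\,O_3.\)

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def swap_rows(M, i, j):
    """O_1(i, j): interchange rows i and j (returns a new matrix)."""
    R = [row[:] for row in M]
    R[i], R[j] = R[j], R[i]
    return R

def scale_row(M, i, a):
    """O_2(i, a): multiply row i by a nonzero scalar a."""
    R = [row[:] for row in M]
    R[i] = [a * x for x in R[i]]
    return R

def add_row_multiple(M, i, j, a):
    """O_3(i, j, a): add a times row j to row i."""
    R = [row[:] for row in M]
    R[i] = [x + a * y for x, y in zip(R[i], R[j])]
    return R

A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2 example matrix
B = [[1, 0, 2], [-1, 3, 1]]     # 2 x 3 example matrix
AB = matmul(A, B)

# Each operation applied to AB agrees with the operation applied to A alone.
lemma_holds = (
    swap_rows(AB, 0, 2) == matmul(swap_rows(A, 0, 2), B)
    and scale_row(AB, 1, 5) == matmul(scale_row(A, 1, 5), B)
    and add_row_multiple(AB, 0, 1, -3) == matmul(add_row_multiple(A, 0, 1, -3), B)
)
print(lemma_holds)  # True
```

A single example does not prove the lemma, but it illustrates that the second factor \(\,\boldsymbol{B}\,\) is never touched by the row operation.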

Applying an elementary operation to a matrix \(\,\boldsymbol{A}\ \) is equivalent to multiplying this matrix on the left by a suitable elementary matrix. We state this as

Theorem. \(\,\)

Let \(\,\boldsymbol{A}\in M_{m\times n}(K).\ \) Then \(\,\) for \(\ i,j=0,1,\ldots,m-1:\)

  1. \(\,O_1(i,j)\,\boldsymbol{A}\ =\ \boldsymbol{E}_1(i,j)\,\boldsymbol{A}\,,\)

  2. \(\,O_2(i,a)\,\boldsymbol{A}\ =\ \boldsymbol{E}_2(i,a)\,\boldsymbol{A}\,,\qquad (a\ne 0)\)

  3. \(\,O_3(i,j,a)\,\boldsymbol{A}\ = \boldsymbol{E}_3(i,j,a)\,\boldsymbol{A}\,,\)

where \(\ \boldsymbol{E}_1(i,j),\ \boldsymbol{E}_2(i,a),\ \boldsymbol{E}_3(i,j,a)\in M_m(K).\)

Proof. Taking into account that \(\,\boldsymbol{A} = \boldsymbol{I}_m\boldsymbol{A},\ \) the above Lemma and the definition of elementary matrices imply:

\(\ O_1(i,j)\,\boldsymbol{A}\ =\ O_1(i,j)\,(\boldsymbol{I}_m\boldsymbol{A})\ =\ [\,O_1(i,j)\,\boldsymbol{I}_m\,]\,\boldsymbol{A}\ =\ \boldsymbol{E}_1(i,j)\,\boldsymbol{A}\,,\)

\(\ O_2(i,a)\,\boldsymbol{A}\ =\ O_2(i,a)\,(\boldsymbol{I}_m\boldsymbol{A})\ =\ [\,O_2(i,a)\,\boldsymbol{I}_m\,]\,\boldsymbol{A}\ =\ \boldsymbol{E}_2(i,a)\,\boldsymbol{A}\,,\)

\(\ O_3(i,j,a)\,\boldsymbol{A}\ =\ O_3(i,j,a)\,(\boldsymbol{I}_m\boldsymbol{A})\ =\ [\,O_3(i,j,a)\,\boldsymbol{I}_m\,]\,\boldsymbol{A}\ =\ \boldsymbol{E}_3(i,j,a)\,\boldsymbol{A}\,.\)
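As an illustration of the theorem, the sketch below builds the three elementary matrices by applying the corresponding operation to \(\,\boldsymbol{I}_m\,\) and then multiplies a sample matrix by them on the left. It is plain Python written for this purpose only. It assumes lists of rows and 0-based indices, and the function names `E1`, `E2`, `E3`, `identity` and `matmul` are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def identity(m):
    """The m x m identity matrix I_m."""
    return [[1 if r == c else 0 for c in range(m)] for r in range(m)]

def E1(m, i, j):
    """E_1(i, j) = O_1(i, j) I_m: rows i and j of I_m interchanged."""
    E = identity(m)
    E[i], E[j] = E[j], E[i]
    return E

def E2(m, i, a):
    """E_2(i, a) = O_2(i, a) I_m: row i of I_m multiplied by a (a != 0)."""
    E = identity(m)
    E[i] = [a * x for x in E[i]]
    return E

def E3(m, i, j, a):
    """E_3(i, j, a) = O_3(i, j, a) I_m: a times row j added to row i of I_m."""
    E = identity(m)
    E[i] = [x + a * y for x, y in zip(E[i], E[j])]
    return E

A = [[1, 2], [3, 4], [5, 6]]          # a 3 x 2 example, so m = 3

swapped = matmul(E1(3, 0, 1), A)      # rows 0 and 1 of A interchanged
scaled = matmul(E2(3, 2, -1), A)      # row 2 of A multiplied by -1
combined = matmul(E3(3, 0, 2, 2), A)  # 2 * (row 2) added to row 0 of A

print(swapped)   # [[3, 4], [1, 2], [5, 6]]
print(scaled)    # [[1, 2], [3, 4], [-5, -6]]
print(combined)  # [[11, 14], [3, 4], [5, 6]]
```

In each case the product equals the matrix obtained by performing the row operation directly on \(\,\boldsymbol{A}.\)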

Permutation Matrices

To perform a row permutation \(\,O_{\sigma}\,\) on a product of two matrices \(\,\boldsymbol{A}\ \) and \(\ \boldsymbol{B},\ \) it suffices to apply it to the first factor of the product.

Applying the row permutation \(\,O_{\sigma}\,\) to a rectangular matrix \(\,\boldsymbol{A}\ \) is equivalent to multiplying this matrix on the left by a suitable permutation matrix.

Both facts are stated precisely in the following

Theorem. \(\,\)

If \(\,\boldsymbol{A}\in M_{m\times p}(K),\ \boldsymbol{B}\in M_{p\times n}(K),\ \ \sigma\in S_m,\ \ \) then:

  1. \(\ \,O_\sigma\,(\boldsymbol{A}\boldsymbol{B})\ =\ (O_\sigma\boldsymbol{A})\,\boldsymbol{B}\,;\)

  2. \(\ \,O_\sigma\,\boldsymbol{A}\ =\ \boldsymbol{P}_\sigma\,\boldsymbol{A}\,,\qquad \text{where}\quad\boldsymbol{P}_\sigma\,=\,O_\sigma\,\boldsymbol{I}_m\in M_m(K)\,.\)

The proof is based on the row rule of matrix multiplication:

\[\begin{split}\boldsymbol{A}\boldsymbol{B}\ \equiv\ \left[\begin{array}{c} \boldsymbol{A}_1 \\ \boldsymbol{A}_2 \\ \dots \\ \boldsymbol{A}_m \end{array} \right]\boldsymbol{B}\ \ =\ \ \left[\begin{array}{c} \boldsymbol{A}_1\,\boldsymbol{B} \\ \boldsymbol{A}_2\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_m\,\boldsymbol{B} \end{array} \right]\,.\end{split}\]

In this way we obtain the first part of the theorem:

\[\begin{split}O_\sigma\,(\boldsymbol{A}\boldsymbol{B})\ =\ O_\sigma \left[\begin{array}{c} \boldsymbol{A}_1\,\boldsymbol{B} \\ \boldsymbol{A}_2\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_m\,\boldsymbol{B} \end{array} \right]\ = \left[\begin{array}{c} \boldsymbol{A}_{\sigma(1)}\,\boldsymbol{B} \\ \boldsymbol{A}_{\sigma(2)}\,\boldsymbol{B} \\ \dots \\ \boldsymbol{A}_{\sigma(m)}\,\boldsymbol{B} \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{A}_{\sigma(1)} \\ \boldsymbol{A}_{\sigma(2)} \\ \dots \\ \boldsymbol{A}_{\sigma(m)} \end{array} \right]\boldsymbol{B}\ =\ (O_\sigma\boldsymbol{A})\,\boldsymbol{B}\,.\end{split}\]

This easily implies the second part of the theorem:

\[O_\sigma\,\boldsymbol{A}\ \ =\ \ O_\sigma\,(\boldsymbol{I}_m\,\boldsymbol{A})\ \ =\ \ (O_\sigma\,\boldsymbol{I}_m)\,\boldsymbol{A}\ \ =\ \ \boldsymbol{P}_\sigma\,\boldsymbol{A}\,, \qquad\sigma\in S_m\,.\]
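Both parts of the theorem can be illustrated with a short plain-Python sketch. It is an assumption of this example that a permutation \(\,\sigma\in S_m\,\) is stored as a tuple `sigma` with 0-based entries, so that `sigma[i]` plays the role of \(\,\sigma(i),\ \) and that row \(\,i\,\) of \(\,O_\sigma\boldsymbol{A}\,\) is row \(\,\sigma(i)\,\) of \(\,\boldsymbol{A},\ \) as in the proof above. The names `permute_rows`, `perm_matrix` and `matmul` are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def permute_rows(M, sigma):
    """O_sigma: row i of the result is row sigma[i] of M."""
    return [M[sigma[i]][:] for i in range(len(M))]

def perm_matrix(sigma):
    """P_sigma = O_sigma I_m, where m = len(sigma)."""
    m = len(sigma)
    I = [[1 if r == c else 0 for c in range(m)] for r in range(m)]
    return permute_rows(I, sigma)

sigma = (2, 0, 1)               # 0-based: 0 -> 2, 1 -> 0, 2 -> 1
A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2 example matrix
B = [[1, 0, 2], [-1, 3, 1]]     # 2 x 3 example matrix

P = perm_matrix(sigma)
print(P)  # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

# Part 2: permuting the rows of A equals multiplying by P_sigma on the left.
part2 = permute_rows(A, sigma) == matmul(P, A)
# Part 1: permuting the rows of AB equals permuting the rows of A first.
part1 = permute_rows(matmul(A, B), sigma) == matmul(permute_rows(A, sigma), B)
print(part1, part2)  # True True
```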


The product of two permutation matrices is again a permutation matrix. More precisely, we have the following

Theorem. \(\,\)

If \(\quad\boldsymbol{P}_\rho = O_\rho\,\boldsymbol{I}_m,\ \,\boldsymbol{P}_\sigma = O_\sigma\,\boldsymbol{I}_m,\quad\) then \(\quad\boldsymbol{P}_\rho\,\boldsymbol{P}_\sigma\ =\ \boldsymbol{P}_{\sigma\,\circ\,\rho}\,, \qquad\rho,\sigma\in S_m\,.\)

Proof.

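Observe first that, since left multiplication by \(\,\boldsymbol{P}_\sigma\,\) performs the row permutation \(\,O_\sigma,\)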

\[\begin{split}\boldsymbol{P}_\sigma\,\boldsymbol{I}_m\ =\ \boldsymbol{P}_\sigma\, \left[\begin{array}{c} \boldsymbol{e}_1 \\ \boldsymbol{e}_2 \\ \dots \\ \boldsymbol{e}_m \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{e}_{\sigma(1)} \\ \boldsymbol{e}_{\sigma(2)} \\ \dots \\ \boldsymbol{e}_{\sigma(m)} \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{e}'_1 \\ \boldsymbol{e}'_2 \\ \dots \\ \boldsymbol{e}'_m \end{array} \right]\,, \quad\text{where}\quad\boldsymbol{e}'_i\ =\ \boldsymbol{e}_{\sigma(i)}\,,\quad i=1,2,\ldots,m.\end{split}\]

Hence, a product of two permutation matrices may be written as

\[\begin{split}\boldsymbol{P}_\rho\,\boldsymbol{P}_\sigma\ =\ (\boldsymbol{P}_\rho\,\boldsymbol{P}_\sigma)\,\boldsymbol{I}_m\ =\ \boldsymbol{P}_\rho\,(\boldsymbol{P}_\sigma\,\boldsymbol{I}_m)\ =\ \boldsymbol{P}_\rho\, \left[\begin{array}{c} \boldsymbol{e}'_1 \\ \boldsymbol{e}'_2 \\ \dots \\ \boldsymbol{e}'_m \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{e}'_{\rho(1)} \\ \boldsymbol{e}'_{\rho(2)} \\ \dots \\ \boldsymbol{e}'_{\rho(m)} \end{array} \right]\,.\end{split}\]

Substituting \(\ \ i\rightarrow\rho(i)\ \ \) in the equation \(\ \ \boldsymbol{e}'_i\ =\ \boldsymbol{e}_{\sigma(i)}\ \ \) gives

\[\boldsymbol{e}'_{\rho(i)}\ =\ \boldsymbol{e}_{\sigma[\rho(i)]}\ =\ \boldsymbol{e}_{(\sigma\,\circ\,\rho)(i)}\,,\qquad i=1,2,\ldots,m.\]

Hence,

\[\begin{split}\boldsymbol{P}_\rho\,\boldsymbol{P}_\sigma\ =\ \left[\begin{array}{c} \boldsymbol{e}'_{\rho(1)} \\ \boldsymbol{e}'_{\rho(2)} \\ \dots \\ \boldsymbol{e}'_{\rho(m)} \end{array} \right]\ =\ \left[\begin{array}{c} \boldsymbol{e}_{(\sigma\,\circ\,\rho)(1)} \\ \boldsymbol{e}_{(\sigma\,\circ\,\rho)(2)} \\ \dots \\ \boldsymbol{e}_{(\sigma\,\circ\,\rho)(m)} \end{array} \right]\ =\ \boldsymbol{P}_{\sigma\,\circ\,\rho} \left[\begin{array}{c} \boldsymbol{e}_1 \\ \boldsymbol{e}_2 \\ \dots \\ \boldsymbol{e}_m \end{array} \right]\ =\ \boldsymbol{P}_{\sigma\,\circ\,\rho}\ \boldsymbol{I}_m\ =\ \boldsymbol{P}_{\sigma\,\circ\,\rho}\,.\end{split}\]
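The order of composition in this theorem is easy to get wrong, so a numerical check may be helpful. The plain-Python sketch below is an illustration under stated assumptions: permutations are 0-based tuples with `sigma[i]` standing for \(\,\sigma(i),\ \) the rows of \(\,\boldsymbol{P}_\sigma\,\) are \(\,\boldsymbol{e}_{\sigma(i)}\,\) as in the proof, and the names `perm_matrix`, `compose` and `matmul` are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def perm_matrix(sigma):
    """P_sigma: row i is the unit row vector e_{sigma[i]}."""
    m = len(sigma)
    return [[1 if c == sigma[i] else 0 for c in range(m)] for i in range(m)]

def compose(sigma, rho):
    """The composition sigma o rho, that is i -> sigma[rho[i]]."""
    return tuple(sigma[rho[i]] for i in range(len(rho)))

rho = (1, 0, 2)
sigma = (2, 0, 1)

product = matmul(perm_matrix(rho), perm_matrix(sigma))
print(product)                                        # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(compose(sigma, rho))                            # (0, 2, 1)
print(product == perm_matrix(compose(sigma, rho)))    # True
print(product == perm_matrix(compose(rho, sigma)))    # False
```

For this pair of permutations \(\,\sigma\circ\rho\ne\rho\circ\sigma,\ \) so the last line confirms that the product \(\,\boldsymbol{P}_\rho\,\boldsymbol{P}_\sigma\,\) corresponds to \(\,\sigma\circ\rho\,\) and not to \(\,\rho\circ\sigma.\)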