Example with Discussion

We will use the aforementioned methods to solve a system of equations over a field \(\,Q:\)

\begin{alignat*}{6} x_1 & {\,} + {\,} & 3\,x_2 & {\,} {\,} & & {\,} + {\,} & 2\,x_4 & {\,} - {\,} & x_5 & {\;} = {\;} & 2 \\ & {\,} {\,} & & {\,} {\,} & x_3 & {\,} + {\,} & 4\,x_4 & {\,} - {\,} & 3\,x_5 & {\;} = {\;} & 2 \\ x_1 & {\,} + {\,} & 3\,x_2 & {\,} + {\,} & x_3 & {\,} + {\,} & 6\,x_4 & {\,} - {\,} & 4\,x_5 & {\;} = {\;} & 4 \end{alignat*}

First, write down the coefficient matrix \(\,\boldsymbol{A},\ \) the column of constants \(\,\boldsymbol{b}\ \) and the augmented matrix \(\,\boldsymbol{B}\,\) of this system:

\[\begin{split}\boldsymbol{A}\ =\ \left[\begin{array}{rrrrr} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & 4 & -3 \\ 1 & 3 & 1 & 6 & -4 \end{array}\right]\,,\quad \boldsymbol{b}\ =\ \left[\begin{array}{r} 2 \\ 2 \\ 4 \end{array}\right]\,;\qquad \boldsymbol{B}\ =\ \left[\begin{array}{rrrrrr} 1 & 3 & 0 & 2 & -1 & 2 \\ 0 & 0 & 1 & 4 & -3 & 2 \\ 1 & 3 & 1 & 6 & -4 & 4 \end{array}\right]\,.\end{split}\]

Elimination Method

Transformation of the augmented matrix to the reduced row echelon form:

sage: B = matrix(QQ,[[1, 3, 0, 2,-1, 2],
                     [0, 0, 1, 4,-3, 2],
                     [1, 3, 1, 6,-4, 4]])

sage: table([[B, '$\\rightarrow$', B.rref()]])
(1)\[\begin{split}\left(\begin{array}{rrrrrr} 1 & 3 & 0 & 2 & -1 & 2 \\ 0 & 0 & 1 & 4 & -3 & 2 \\ 1 & 3 & 1 & 6 & -4 & 4 \end{array}\right)\quad\rightarrow\quad\left(\begin{array}{rrrrrr} 1 & 3 & 0 & 2 & -1 & 2 \\ 0 & 0 & 1 & 4 & -3 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right)\end{split}\]

Hence, the initial system of three equations is equivalent to a system consisting of the first two equations (which is understandable as the third equation is just a sum of the first two):

\begin{alignat*}{6} x_1 & {\,} + {\,} & 3\,x_2 & {\,} {\,} & & {\,} + {\,} & 2\,x_4 & {\,} - {\,} & x_5 & {\;} = {\;} & 2 \\ & {\,} {\,} & & {\,} {\,} & x_3 & {\,} + {\,} & 4\,x_4 & {\,} - {\,} & 3\,x_5 & {\;} = {\;} & 2 \\ \end{alignat*}

It is worth to notice that one can read off a particular solution of the initial system from the reduced row echelon form of the augmented matrix (1): \(\,\) the numbers in the last column of this matrix are the values of the unknowns corresponding to the columns with leading units, if the values of the other unknowns are taken to be zero. This particular solution is

(2)\[x_1 = x_3 = 2,\quad x_2 = x_4 = x_5 = 0.\]

We rewrite the above system of two equations as

\begin{alignat*}{5} x_1 & {\;} = {\;} & 2 & {\,} - {\,} & 3\,x_2 & {\,} - {\,} & 2\,x_4 & {\,} + {\,} & x_5 \\ x_3 & {\,} = {\,} & 2 & {\,} - {\,} & 4\,x_4 & {\,} + {\,} & 3\,x_5 \end{alignat*}

We take the unknowns \(\ x_2,\,x_4\ \,\text{and}\ \, x_5\ \) (corresponding to the columns without leading units) as parameters:

\[x_2 = s,\ \,x_4 = t,\ \,x_5 = u,\qquad s,t,u\,\in\,Q,\]

so that the solution is of the form

\[\begin{split}\begin{array}{l} x_1 \ =\ 2 \ - \ 3\,s \ - \ 2\,t \ + \ u \\ x_2 \ = \ s \\ x_3 \ = \ 2 \ - \ 4\,t \ + \ 3\,u \\ x_4 \ = \ t \\ x_5 \ = \ u \end{array}\qquad\quad s,t,u\,\in\,Q\,.\end{split}\]

Finally, a solution of the system may be written in a vector form

(3)\[\begin{split}\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{array}\right]\ \,=\ \, \left[\begin{array}{r} 2 \\ 0 \\ 2 \\ 0 \\ 0 \end{array}\right]\ +\ s\ \left[\begin{array}{r} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{array}\right]\ +\ t\ \left[\begin{array}{r} -2 \\ 0 \\ -4 \\ 1 \\ 0 \end{array}\right]\ +\ u\ \left[\begin{array}{r} 1 \\ 0 \\ 3 \\ 0 \\ 1 \end{array}\right]\,,\quad s,t,u\,\in\,Q.\end{split}\]

A Direct Approach

We use the methods solve_right() and right_kernel_matrix() to determine a particular solution of the initial unhomogeneous system, and also a general solution of the associated homogeneous system:

sage: A = matrix(QQ,[[1, 3, 0, 2,-1],
                     [0, 0, 1, 4,-3],
                     [1, 3, 1, 6,-4]])

sage: b = vector(QQ,[2,2,4])

sage: X = A\b   # a particular solution of the original system

# Rows of matrix B0 form a basis of the space of solutions
# of the homogeneous system associated with the original one
sage: B0 = A.right_kernel_matrix()
sage: B0 = 2*B0 # to get rid of fractions

sage: show((X, B0))
\[\begin{split}\left(\quad\left(2,\,0,\,2,\,0,\,0\right),\quad \left(\ \begin{array}{rrrrr} 2 & 0 & 0 & -3 & - 4 \\ 0 & 2 & 0 & -9 & -12 \\ 0 & 0 & 2 & 1 & 2 \end{array}\ \right)\quad\right)\end{split}\]

Hence, a general solution comprises a set of vectors of the form

(4)\[\begin{split}\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{array}\right]\ \, =\ \, \left[\begin{array}{r} 2 \\ 0 \\ 2 \\ 0 \\ 0 \end{array}\right]\ +\ s\ \left[\begin{array}{r} 2 \\ 0 \\ 0 \\ -3 \\ -4 \end{array}\right]\ +\ t\ \left[\begin{array}{r} 0 \\ 2 \\ 0 \\ -9 \\ -12 \end{array}\right]\ +\ u\ \left[\begin{array}{r} 0 \\ 0 \\ 2 \\ 1 \\ 2 \end{array}\right]\,,\quad s,t,u\,\in\,Q.\end{split}\]

Comparison of Results

Formulae (3) and (4) presenting a general solution of the system are not identical: they contain the same particular solution, but the vectors spanning the solution spaces of the homogeneous system are different.

In order to check the consistence of these formulae we construct and compare the aforementioned space. We make use of the function span(), which returns the space spanned by the given (in a list form) vectors:

sage: V1 = span(QQ,[[-3, 1, 0, 0, 0],[-2, 0,-4, 1,  0],[1, 0, 3, 0, 1]])
sage: V2 = span(QQ,[[ 2, 0, 0,-3,-4],[ 0, 2, 0,-9,-12],[0, 0, 2, 1, 2]])

sage: V1==V2

True

A difference between the formulae (3) and (4) might have occured because a basis of the vector space is not given unambigously: every maximal set of linearly independent vectors defines a basis. Non-trivial spaces over number fields \(\,Q,\,R,\,C\,\) have infinitely many bases.

Alternative Solution of the Homogeneous System

The homogeneous system of equations from the previous example, having a matrix form

(5)\[\begin{split}\boldsymbol{A}\,\boldsymbol{x}\ =\ \boldsymbol{0}\,, \qquad \boldsymbol{A}\ =\ \left[\begin{array}{ccccc} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & 4 & -3 \\ 1 & 3 & 1 & 6 & -4 \end{array}\right]\,,\end{split}\]

may be solved in a different, non-standard way.

First, we determine rank of the matrix \(\,\boldsymbol{A}\ \,\) and \(\ \) dimension of the solution space \(\,S_0.\ \) Note that:

  • \(\ \text{rk}\boldsymbol{A} < 3,\,\) because the rows are linearly dependent (the third is a sum of the first two);

  • \(\ \text{rk}\boldsymbol{A}\geq 2,\,\) because there exist non-zero minors of order two
    \(\qquad\qquad\ \) (eg. a minor in a top right corner).

Hence, \(\ \,\text{rk}\boldsymbol{A} = 2\ \) and a dimension of the space \(\,S_0\,,\) equal to a difference between the number of the unknowns and rank of the matrix \(\,\boldsymbol{A},\,\) is \(\ 5 - 2 = 3.\ \) To determine the space \(\,S_0\,\) it suffices to provide any of its bases consisting of three linearly independent column vectors belonging to the space \(\,Q^5.\)

General discussion.

In this situation a problem of solving the system (5) is equivalent to finding a matrix \(\,\boldsymbol{X}\,\) with five rows and three linearly independent columns, which satisfies the condition

(6)\[\boldsymbol{A}\,\boldsymbol{X}\ =\ \boldsymbol{O}_3\,,\]

where the right hand side is a square zero matrix of order three.

Indeed, assume that the matrix \(\,\boldsymbol{X}\in M_{5\times 3}(Q)\,\) satisfies the equation (6). Denote its columns by \(\,\boldsymbol{X}_1,\,\boldsymbol{X}_2,\,\boldsymbol{X}_3\ \,\) so that \(\ \,\) in column matrix notation

\[\boldsymbol{A}\,\boldsymbol{X}\ \,=\ \, \boldsymbol{A}\ \left[\,\boldsymbol{X}_1\,|\;\boldsymbol{X}_2\,|\;\boldsymbol{X}_3\,\right]\ \,=\ \, \left[\, \boldsymbol{A}\boldsymbol{X}_1\,|\; \boldsymbol{A}\boldsymbol{X}_2\,|\; \boldsymbol{A}\boldsymbol{X}_3\,\right]\ \,=\ \, \left[\,\boldsymbol{0}\,|\,\boldsymbol{0}\,|\,\boldsymbol{0}\,\right]\,.\]

Now, if we compare the columns of the last two matrices:

\[\boldsymbol{A}\boldsymbol{X}_1\ =\ \boldsymbol{0},\qquad \boldsymbol{A}\boldsymbol{X}_2\ =\ \boldsymbol{0},\qquad \boldsymbol{A}\boldsymbol{X}_3\ =\ \boldsymbol{0}\,,\]

we see that the columns \(\,\boldsymbol{X}_1,\,\boldsymbol{X}_2,\,\boldsymbol{X}_3\ \) of the matrix \(\,\boldsymbol{X}\,\) satisfy the equation (5) \(\,\) and \(\,\) (by their linear independence) comprise a basis of the space \(\,S_0.\)

Determination of a basis for the solution space.

The desired matrix \(\,\boldsymbol{X}\,\) may be constructed from the reduced row echelon form \(\,\boldsymbol{C}\,\) of the coefficient matrix \(\,\boldsymbol{A}\,\) of the system (5). \(\ \) By the formula (1):

(7)\[\begin{split}\boldsymbol{C}\quad =\quad \left[\begin{array}{rrrrr} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & 4 & -3 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]\,.\end{split}\]

Swapping the second and the third column of \(\,\boldsymbol{C}\,\) leads to a matrix \(\,\boldsymbol{D}\,\) with a clear block structure (notation of blocks includes their sizes):

(8)\[\begin{split}\boldsymbol{D}\quad =\quad \left[\begin{array}{cc|ccc} 1 & 0 & 3 & 2 & -1 \\ 0 & 1 & 0 & 4 & -3 \\ \hline 0 & 0 & 0 & 0 & 0 \end{array}\right] \quad \equiv\quad \left[\begin{array}{c|c} \boldsymbol{I_2} & \boldsymbol{F_{23}} \\ \hline \boldsymbol{O_{12}} & \boldsymbol{O_{13}} \end{array}\right]\,.\end{split}\]

Note that \(\,\boldsymbol{D}\,\) is a matrix (in a reduced echelon form) of the system which we obtaind from (5) by changing the numeration of unknowns: \(\ x_2\leftrightarrow x_3.\ \,\) We define a new matrix

(9)\[\begin{split}\boldsymbol{Y}\quad :\,=\quad \left[\begin{array}{c} \boldsymbol{-F_{23}} \\ \hline \boldsymbol{I_3} \end{array}\right] \quad =\quad \left[\begin{array}{rrr} -3 & -2 & 1 \\ 0 & -4 & 3 \\ \hline 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]\,.\end{split}\]

Applying the rules for multiplication of block matrices, we get

(10)\[ \begin{align}\begin{aligned}\begin{split}\boldsymbol{D}\cdot\boldsymbol{Y}\ \,=\ \, \left[\begin{array}{c|c} \boldsymbol{I_2} & \boldsymbol{F_{23}} \\ \hline \boldsymbol{O_{12}} & \boldsymbol{O_{13}} \end{array}\right]\ \cdot\ \left[\begin{array}{c} \boldsymbol{-F_{23}} \\ \hline \boldsymbol{I_3} \end{array}\right]\ \,=\ \, \left[\begin{array}{c} -\boldsymbol{I_2 F_{23}}+\boldsymbol{F_{23}I_3} \\ \hline \boldsymbol{-O_{12}F_{23}}+\boldsymbol{O_{13}I_3} \end{array}\right]\ \,=\end{split}\\\begin{split}=\ \, \left[\begin{array}{c} \boldsymbol{-F_{23}}+\boldsymbol{F_{23}} \\ \hline \boldsymbol{-O_{13}}+\boldsymbol{O_{13}} \end{array}\right]\ \,=\ \, \left[\begin{array}{c} \boldsymbol{O_{23}} \\ \hline \boldsymbol{O_{13}} \end{array}\right]\ \,=\ \,\boldsymbol{O_3}.\end{split}\end{aligned}\end{align} \]

The columns of the matrix \(\,\boldsymbol{Y}\,\) are then solutions of the system with a matrix \(\,\boldsymbol{D}\,\) in a reduced echelon form, that is with the swapped unknowns \(\ x_2,\,x_3.\ \) Solutions of the initial system (5) are the columns of the matrix \(\,\boldsymbol{X}\,\) obtained from \(\,\boldsymbol{Y}\,\) by swapping the second and the third row (because in these matrices, rows correspond with consecutive unknowns):

(11)\[\begin{split}\boldsymbol{X}\quad =\quad \left[\begin{array}{rrr} -3 & -2 & 1 \\ 1 & 0 & 0 \\ 0 & -4 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]\,.\end{split}\]

We still have to check that the columns of the matrix \(\,\boldsymbol{X}\,\) are linearly independent, i.e. that \(\,\text{rk}\,\boldsymbol{X} = 3.\,\) This is testified by the non-zero minor of order 3. composed from first three rows of the matrix:

\[\begin{split}\det\, \left[\begin{array}{rrr} -3 & -2 & 1 \\ 1 & 0 & 0 \\ 0 & -4 & 3 \end{array}\right]\ \,=\ \, 2.\end{split}\]

Finaly, a general solution of the homogeneous sytem (5) is of the form

(12)\[\begin{split}\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{array}\right]\quad =\quad s\ \,\left[\begin{array}{r} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{array}\right]\ +\ t\ \,\left[\begin{array}{r} -2 \\ 0 \\ -4 \\ 1 \\ 0 \end{array}\right]\ +\ u\ \,\left[\begin{array}{r} 1 \\ 0 \\ 3 \\ 0 \\ 1 \end{array}\right]\,,\qquad s,t,u\,\in\,Q\,,\end{split}\]

which coincides with the solution (3) of unhomogeneous system. \(\\\)

Application of computer algebra.

Now we use tools of Sage to perform matrix operations described above.

  1. Transformation of the matrix \(\,\boldsymbol{A}\,\) to reduced echelon form \(\,\boldsymbol{C}\,\) (formula (7)):

    sage: A = matrix(QQ,[[1, 3, 0, 2,-1],
                         [0, 0, 1, 4,-3],
                         [1, 3, 1, 6,-4]])
    
    sage: C = A.rref(); C
    
    [ 1  3  0  2 -1]
    [ 0  0  1  4 -3]
    [ 0  0  0  0  0]
    
  2. Construction of the matrix \(\,\boldsymbol{D}\,\) by swapping the second and the third column of the matrix \(\,\boldsymbol{C},\,\) and indication of its block structure (formula (8)):

    sage: D = copy(C).with_swapped_columns(1,2)
    sage: D.subdivide(2,2); D
    
    [ 1  0| 3  2 -1]
    [ 0  1| 0  4 -3]
    [-----+--------]
    [ 0  0| 0  0  0]
    
  3. Isolation of the block \(\,\boldsymbol{F_{23}}\,\) and construction of the matrix \(\,\boldsymbol{Y}\ \) (formula (9)):

    sage: F23 = D.subdivision(0,1)
    sage: I3 = identity_matrix(3)
    sage: Y = block_matrix([[-F23],[I3]]); Y
    
    [-3 -2  1]
    [ 0 -4  3]
    [--------]
    [ 1  0  0]
    [ 0  1  0]
    [ 0  0  1]
    
  4. Verification that a product of matrices \(\,\boldsymbol{D}\,\) and \(\,\boldsymbol{Y}\,\) is equal to the zero matrix of order 3. \(\\\) (formula (10)):

    sage: D*Y
    
    [0 0 0]
    [0 0 0]
    [-----]
    [0 0 0]
    
  5. Construction of the matrix \(\,\boldsymbol{X}\,\) by swapping the second and the third row of the matrix \(\,\boldsymbol{Y},\,\) and removing block structure of \(\,\boldsymbol{X}\,\) (formula (11)):

    sage: X = Y.with_swapped_rows(1,2)
    sage: X.subdivide(); X
    
    [-3 -2  1]
    [ 1  0  0]
    [ 0 -4  3]
    [ 0  1  0]
    [ 0  0  1]
    
  6. Verification that the matrix \(\,\boldsymbol{X}\,\) satisfies the equation (6):

    sage: A*X
    
    [0 0 0]
    [0 0 0]
    [0 0 0]
    

Remark:

As opposed to the methods swap_columns() and swap_rows(), which perform the operations directly on the original matrix, the methods with_swapped_columns() and with_swapped_rows() used here output a tranformed matrix without changing the original one.

Joining all the operations we obtain the result (12):

We should emphasize though that a method presented in this section to solve the system (5), even though it is instructive, it is not universal: it turned out effective only because of a specific structure of the coefficient matrix \(\,\boldsymbol{A},\,\) which led to a block form (8). The standard procedure is either a direct method described before or the elimination method.

Exercise.
Check by hand calculation that the matrix \(\,\boldsymbol{X}\,\) given by the formula (11) satisfies the equation (6).