Matrix Formulation of the Elimination Method¶
The elimination method (discussed in Chapter 2) transforms a system of linear equations into the echelon form by applying elementary operations to the equations themselves.
An analogous result may be obtained by performing the corresponding operations on rows of the augmented matrix representing a given set of equations.
Augmented Matrix of a System of Equations¶
For a system of \(\,m\,\) linear equations with \(\,n\,\) unknowns:
where coefficients \(\,a_{ij}\,\) and constants \(\,b_i\ \ (i=1,2,\ldots,m;\ j=1,2,\ldots,n)\,\) belong to a field \(\,K,\,\) we have defined the coefficient matrix \(\boldsymbol{A},\ \) the column vector of constants \(\boldsymbol{b},\,\) and the column vector of unknowns \(\boldsymbol{x}:\)
Then system (1) may be rewritten compactly as
When all constants vanish: \(\ \boldsymbol{b} = \boldsymbol{0},\ \) the system of equations is said to be homogeneous. \(\\\) The homogeneous system
is associated with the unhomogeneous system (2).
Two systems of equations are equivalent, \(\,\) when they possess the same set of solutions.
The system of equations (1) is represented in one-to-one way by the augmented matrix \(\,\boldsymbol{B}:\)
The three types of elementary operations on equations of a system correspond to the row operations on the augmented matrix:
\(\,\) swap around the positions of two rows
\(\,\) multiply a row by a nonzero scalar
- \(\,\) add to a row a scalar multiple of another row\(\,\) (in particular: add or subtract two different rows).
The augmented matrix \(\,\boldsymbol{B}\,\) transformed by the above row operations refers to a system of equations equivalent to the original one. On the other hand, elementary operations applied to columns of matrix \(\,\boldsymbol{B}\,\) usually do change the set of solutions and hence are not applicable in the process of solving a system. Nevertheless, a permutation of columns of matrix \(\,\boldsymbol{A}\,\) (which only reorders variables) may sometimes facilitate the solution.
Echelon Form of a Matrix¶
We start from a few definitions.
In a zero row all elements vanish; a non-zero row has at least one non-zero element.
The leading entry (pivot) of a non-zero row is the first non-zero element from the left.
A matrix is in the row echelon form, if
\(\,\) all zero rows (if any) are below the non-zero ones, and
- \(\,\) for any two non-zero rows, the leading entry of the lower row\(\,\) is strictly to the right of the leading entry of the upper row.
These conditions imply that if a column contains the pivot of a non-zero row, then all elements of that column below the pivot vanish. Consequently, all elements of the matrix below the main diagonal are zeroes.
This is an example of a matrix in row echelon form (the leading entries are boxed):
A matrix is in reduced row echelon form if, in addition to being in row echelon form, it satifies the two following conditions:
\(\,\) all leading entries are equal to 1 \(\,\) (they are then called leading unities), \(\,\) and
\(\,\) each leading unity is the only non-zero element in its column.
An example of a matrix in reduced row echelon form is (the leading unities are boxed):
Replacing rows with columns in the above formulae, one may define the (reduced) column echelon form of a matrix. Thus a matrix is in the (reduced) column echelon form if and only if its transpose is in the (reduced) row echelon form. Since operations on equations of a system correspond to operations on rows of the augmented matrix, in the following we shall consider the row version of echelon form only.
The discussion may be generalized to the case of a matrix over an arbitrary ring with identity (e.g. the integer ring \(\,Z\)). To assure reversibility, definition of the 2nd elementary row operation is to be slightly modified (descriptions of the other two operations are unchanged):
\(\,\) multiply a row by an invertible element of the ring
Every matrix over a ring \(\,P\,\) with identity may be transformed into a row echelon form by row elementary operations - this procedure is called Gauss elimination. On the other hand, every matrix over a field \(\,K\,\) may be converted in this way into the (unique) reduced row echelon form. This row-reducing of a matrix is referred to as Gauss-Jordan elimination.
The matrix \(\,\boldsymbol{A}\,\) may be considered both over the integer ring \(\,Z\,\) or over the rational field \(\,Q:\)
If \(\,\boldsymbol{A}\,\) is a matrix over \(\,Z,\,\) then elementary operations on its rows \(\,r0,\,r1,\,r2\,\) (in Sage the numbering starts from zero!) make way to an echelon form only:
(the echelon form is actually obtained at the first step; the next operations try to get the reduced form, which is however not achievable in this framework).
Alternatively, if we assume that \(\,\boldsymbol{A}\,\) is a matrix over the field \(\,Q,\,\) the operations may be pushed further, up to the reduced echelon form:
Practical Elimination in Sage¶
In Sage there are built-in functions (strictly: methods), which perform the row and column elementary operations on matrices:
\(\,\)
swap_rows(i,j)
\(\ \) swaps around rows \(\,\) i \(\,\) and \(\,\) j;\(\,\)
rescale_row(i,a)
\(\ \) multiplies row \(\,\) i \(\,\) by the scalar \(\,\) a;\(\,\)
add_multiple_of_row(i,j,a)
\(\,\) adds \(\,\) a times row j \(\,\) to row i.
The column counterparts of these methods are swap_columns(i,j)
,
rescale_col(i,a)
and add_multiple_of_column(i,j,a)
.
The above methods perform the operations directly on the original matrix
and do not return any value. In contrast to them, the methods
with_swapped_rows(i,j)
, with_rescaled_row(i,a)
and
with_added_multiple_of_row(i,j,a)
as well as
with_swapped_columns(i,j)
, with_rescaled_col(i,a)
and
with_added_multiple_of_column(i,j,a)
do return the modified matrix while leaving the original unchanged. 0
If \(\,\boldsymbol{A}\,\) is a matrix,
\(\ \,\boldsymbol{b}\ \) is a matrix or a vector,
then the command A.augment(b)
returns the matrix, obtained by appending
\(\,\boldsymbol{b}\,\) onto the right side of \(\,\boldsymbol{A}\,\)
(vector \(\,\boldsymbol{b}\,\) is first converted to a one-column matrix).
The method augment()
may be thus used to construct the augmented matrix
of a system of linear equations from the coefficient matrix and the column
vector of constants. 1
The method echelon_form()
returns a matrix transformed to the row echelon
or (when the base ring of the matrix is a field) to the reduced row
echelon form. The method rref()
returns a matrix in the reduced
row echelon form (if the base ring is not a field, the original matrix
is replaced by an equivalent matrix over the rational field prior to
row-reducing). Both methods, echelon_form()
and rref()
,
return the transformed matrix and do not change the original one.
To convert directly the original into the echelon form,
the function echelonize()
is to be be applied instead. 2
We shall use these methods to check the results from the last section:
sage: A = matrix([[2, 5, 3, 0],
[2, 0,-2,-1],
[0, 0, 4, 5]])
sage: (A, A.echelon_form(), A.rref())
(
[ 2 5 3 0] [ 2 0 2 4] [ 1 0 0 3/4]
[ 2 0 -2 -1] [ 0 5 1 -4] [ 0 1 0 -21/20]
[ 0 0 4 5], [ 0 0 4 5], [ 0 0 1 5/4]
)
\(\;\)
Example 1. \(\,\) Consistent linear system with a unique solution
We shall apply the elimination method to the system of equations over the rational field \(\,Q:\)
The coefficient matrix \(\,\boldsymbol{A},\,\) column vector of constants \(\,\boldsymbol{b},\,\) and the augmented matrix \(\,\boldsymbol{B}:\)
The Gauss-Jordan elimination consists of the operations on rows \(\,r0,\,r1,\,r2\,\) of matrix \(\,\boldsymbol{B}:\)
\(\ \)
The following program executes this procedure step-by-step:
sage: A = matrix(QQ,[[2,-1,-1], # the coefficient matrix
[3, 4,-2],
[3,-2, 4]])
sage: b = vector([4,11,11]) # the vector of constants
sage: B = A.augment(b) # compose the augmented matrix
sage: B.add_multiple_of_row(2,1,-1) # from the third row subtract the second
sage: B.add_multiple_of_row(1,0,-1) # from the second row subtract the first
sage: B.rescale_row(2,-1/6) # third row divide by -6
sage: B.add_multiple_of_row(0,1,-2) # from first row subtract doubled second
sage: B.swap_rows(0,1) # swap first and second rows
sage: B.swap_rows(1,2) # swap second and third rows
sage: B.add_multiple_of_row(2,1,11) # to third row add the second
# multiplied by 11
sage: B.rescale_row(2,-1/10) # third row divide by -10
sage: B.add_multiple_of_row(0,2,1) # to the first row add the third
sage: B.add_multiple_of_row(1,2,1) # to the second row add the third
sage: B.add_multiple_of_row(0,1,-5) # from the first row subtract the second
# multiplied by 5
sage: B # display the transformed matrix B
[1 0 0 3]
[0 1 0 1]
[0 0 1 1]
This result may be obtained directly by means of the method rref()
:
sage: A = matrix(QQ,[[2,-1,-1], # the coefficient matrix
[3, 4,-2],
[3,-2, 4]])
sage: b = vector([4,11,11]) # the vector of constants
sage: B = A.augment(b) # compose the augmented matrix
sage: B.rref() # display the transformed matrix B
[1 0 0 3]
[0 1 0 1]
[0 0 1 1]
The reduced echelon matrix corresponds to the trivial system of equations
where we recognize immediately the solution: \(\ \ x_1 = 3,\ x_2=x_3 = 1.\)
Example 2. \(\,\) Consistent linear system with infinitely many solutions
We shall consider the system of three equations in four variables over the rational field \(\,Q:\)
The Sage code
sage: B = matrix(QQ,[[1,-1, 2,-1, 1],
[2,-3,-1, 1,-1],
[1, 0, 7,-4, 4]])
sage: table([[B, '$\\rightarrow$', B.rref()]])
displays both the original augmented matrix and its row-reduced echelon form:
The latter matrix refers to the system of two equations (the third one is fulfilled identically):
which may be rewritten as
For each pair of values \(\,x_3,\,x_4\,\) there exists exactly one pair of values \(\,x_1,\,x_2,\) for which the system is satisfied. Therefore we consider \(\,x_3,\,x_4\,\) as free parameters: \(\ x_3 = s,\ x_4 = t,\ \) and the solution takes the form
where \(\,s\ \,\text{and}\ \,t\,\) are arbitrary rational numbers. \(\,\) In the vector notation
The above discussion suggests a general approach to such systems of linear equations with infinitely many solutions. The augmented matrix being transformed to the reduced row echelon form, the unknowns corresponding to columns with leading unities are to be expressed by the remaining ones, whereupon the latter are assumed to be free parameters.
Example 3. \(\,\) Inconsistent linear system
The present system of equations differs from the previous one only in a constant term on the right-hand side:
whereby it turns out to be inconsistent. \(\\\) Indeed, the augmented matrix transformed to the reduced row echelon form
corresponds to the system
which evidently has no solution, since there are no values of \(\,x_1,\,x_2,\,x_3,\,x_4,\,\) satisfying the last equation \(\ 0=1.\)
\(\ \)
Example 4. \(\,\) Homogeneous linear system
Let’s consider the homogeneous system associated with that in Example 2.:
The augmented matrix converted to the reduced row echelon form
leads to the system of two equations
which are equivalent to
As previously, we treat \(\,x_3,\,x_4\,\) as parameters with arbitrary values: \(\ x_3 = s,\ x_4 = t :\)
and get finally the solution in the vector form:
Comparison of formulae (3) and (4) in Examples 2. and 4. suggests a relationship between the solution of an (inhomogeneous) linear system and the solution of its homogeneous counterpart. This question will be brought up in a further chapter.