Basic Operations on Matrices

In this section we define addition of matrices of the same size, multiplication of a matrix by a number, and matrix multiplication, in which the product of two matrices of compatible sizes is another matrix.

Matrix multiplication is introduced in two stages: the product of a matrix by a column vector is defined first (which makes possible a compact notation for a system of linear equations), and the definition is then generalized to a multiplicand with several columns.

Matrix Definition and Terminology

Consider a system of \(\,m\,\) linear equations in \(\,n\,\) variables

(1)\[\begin{split}\begin{array}{c} a_{11}\,x_1\;+\ \,a_{12}\,x_2\;+\ \,\ldots\ +\ \;a_{1n}\,x_n\ \,=\ \ b_1 \\ a_{21}\,x_1\;+\ \,a_{22}\,x_2\;+\ \,\ldots\ +\ \;a_{2n}\,x_n\ \,=\ \ b_2 \\ \ldots\qquad\ \ \ldots\qquad\ \ \ldots\qquad\ldots\qquad\quad\ldots \\ a_{m1}\,x_1\;+\ \,a_{m2}\,x_2\;+\ \,\ldots\ +\ \;a_{mn}\,x_n\ \,=\ \ b_m\,. \end{array}\end{split}\]

The coefficients \(\,a_{ij}\,\) of the variables \((i=1,2,\ldots,m;\ \;j=1,2,\ldots,n)\) form a rectangular matrix \(\,\boldsymbol{A}\ \) (hereafter called the coefficient matrix of the system (1)) with \(\,m\,\) rows and \(\,n\,\) columns, denoted briefly by \(\,[a_{ij}]_{m\times n}:\)

\[\begin{split}\boldsymbol{A}\ =\ [a_{ij}]_{m\times n}\ =\ \left[\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array}\right]\,.\end{split}\]

Note

When an entry of a matrix is denoted by \(\,a_{ij},\,\) the first index (here \(i\)) and the second index (here \(j\)) designate, respectively, the row and the column in which the entry is located.

The set of all rectangular matrices with \(\ m\ \) rows and \(\ n\ \) columns, whose elements belong to a field \(\,K,\,\) is denoted by \(\,M_{m\times n}(K)\ \) (\(\,K\,\) is usually the field \(\,R\,\) of real numbers or the field \(\,C\ \) of complex numbers).

When \(\,m=n,\,\) \(\ \boldsymbol{A}\ \) is said to be a square matrix of size \(\,n.\ \) The set of all such square matrices is denoted by \(M_n(K).\)

The one-column matrices from \(\,M_{n\times 1}(K)\ \) may be considered identical with the corresponding column vectors from \(\,K^n.\ \) Examples thereof related to the system (1) are the solution vector \(\,\boldsymbol{x}\ \) and the vector of constants \(\,\boldsymbol{b}:\)

\[\begin{split}\boldsymbol{x}\,=\, \left[\begin{array}{c} x_{1} \\ x_{2} \\ \ldots \\ x_{n} \end{array}\right] \ \in\ K^n\simeq M_{n\times 1}(K)\,, \qquad \boldsymbol{b}\,=\, \left[\begin{array}{c} b_{1} \\ b_{2} \\ \ldots \\ b_{m} \end{array}\right] \ \in\ K^m\simeq M_{m\times 1}(K)\,.\end{split}\]
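The notation above can be tried out on a small concrete system. The following is only a sketch: the numbers and the helper names `shape` and `entry` are chosen here for illustration and are not part of the text. A matrix is stored as a list of rows, and the 1-based indices of the text are translated to Python's 0-based ones.

```python
# A sample system with m = 2 equations and n = 3 variables (numbers assumed):
#   1*x1 + 2*x2 + 3*x3 = 6
#   4*x1 + 5*x2 + 6*x3 = 15
A = [[1, 2, 3],
     [4, 5, 6]]      # coefficient matrix, stored as a list of rows
b = [6, 15]          # vector of constants, an element of K^m

def shape(M):
    """Return (number of rows, number of columns) of a list-of-rows matrix."""
    return len(M), len(M[0])

def entry(M, i, j):
    """Return a_ij using the 1-based indices of the text (row i, column j)."""
    return M[i - 1][j - 1]

m, n = shape(A)
print(m, n)            # 2 3
print(entry(A, 2, 1))  # 4  (second row, first column)
```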

Addition and Scalar Multiplication. Vector Space of Matrices

Matrices from the set \(\,M_{m\times n}(K)\,\) may be added:

\[\begin{split}\left[\begin{array}{ccc} a_{11} & \ldots & a_{1n} \\ a_{21} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots \\ a_{m1} & \ldots & a_{mn} \end{array}\right] \ \ + \ \ \left[\begin{array}{ccc} b_{11} & \ldots & b_{1n} \\ b_{21} & \ldots & b_{2n} \\ \ldots & \ldots & \ldots \\ b_{m1} & \ldots & b_{mn} \end{array}\right] \ \ :\,= \ \ \left[\begin{array}{ccc} a_{11} + b_{11} & \ldots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & \ldots & a_{2n} + b_{2n} \\ \ldots & \ldots & \ldots \\ a_{m1} + b_{m1} & \ldots & a_{mn} + b_{mn} \end{array}\right]\end{split}\]

and multiplied by numbers (called henceforth scalars) \(\, c \in K\):

\[\begin{split}c \ \ \left[\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array}\right] \ \ :\,= \ \ \left[\begin{array}{cccc} c \; a_{11} & c \; a_{12} & \ldots & c \; a_{1n} \\ c \; a_{21} & c \; a_{22} & \ldots & c \; a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ c \; a_{m1} & c \; a_{m2} & \ldots & c \; a_{mn} \end{array}\right]\,.\end{split}\]
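Both definitions act entry by entry, which a short sketch makes explicit. The function names `mat_add` and `scalar_mul` and the sample matrices are assumptions made for this illustration; matrices are stored as lists of rows.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    """Multiply every entry of the matrix A by the scalar c."""
    return [[c * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
print(mat_add(A, B))     # [[11, 22], [33, 44]]
print(scalar_mul(3, A))  # [[3, 6], [9, 12]]
```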

In the above definitions the addition and scalar multiplication of matrices are expressed by the respective operations on their elements, which are numbers. Properties of the operations on numbers transfer in a natural way to the matrix domain. On that basis we claim that:

  • matrix addition is associative and commutative:

    \(\quad (\boldsymbol{A} + \boldsymbol{B}) \, + \, \boldsymbol{C} \ \; = \ \; \boldsymbol{A} \, + \, (\boldsymbol{B} + \boldsymbol{C})\,,\)

    \(\quad\ \boldsymbol{A}\, + \,\boldsymbol{B}\ \,=\ \, \boldsymbol{B}\, + \,\boldsymbol{A},\qquad \boldsymbol{A}, \, \boldsymbol{B}, \, \boldsymbol{C}\, \in \, M_{m\times n}(K).\)

  • scalar multiplication is distributive over scalar and matrix addition, and is compatible with field multiplication:

    \(\quad (a + b)\,\boldsymbol{A}\ =\ a\,\boldsymbol{A}\, +\, b\,\boldsymbol{A}\,,\) \(\quad a\,(\boldsymbol{A} + \boldsymbol{B})\ =\ a\,\boldsymbol{A}\, +\, a\,\boldsymbol{B}\,,\)

    \(\quad a\,(b\boldsymbol{A})\ =\ (ab)\,\boldsymbol{A},\qquad a,\,b\,\in K,\quad\boldsymbol{A},\,\boldsymbol{B}\,\in\, M_{m\times n}(K)\,,\)

  • multiplication by the unit element of the field leaves every matrix unchanged:

    \(\quad 1\,\boldsymbol{A}\ =\ \boldsymbol{A},\qquad 1\in K,\quad\boldsymbol{A}\,\in\, M_{m\times n}(K)\,.\)

These properties, together with the existence of a neutral element of addition and of an additive inverse for every matrix, imply that the set \(\,M_{m\times n}(K)\,\) of matrices with \(\,m\,\) rows and \(\,n\,\) columns over a field \(\,K\,\) is a vector space over that field. The zero vector of this space is the zero matrix \(\ \boldsymbol{O}\,=\,[\,0\,]_{m\times n},\ \) and the opposite (additive inverse) of the matrix \(\ \boldsymbol{A}\,=\,[a_{ij}]_{m\times n}\ \) is the matrix \(\ \boldsymbol{-A}\,:\,=\,[-a_{ij}]_{m\times n}.\)
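The listed properties can be spot-checked numerically. The sketch below is not a proof: it only tests the identities on three sample matrices over the rationals, using exact `Fraction` arithmetic so that equality is meaningful. The helper names and the sample data are assumptions made for this illustration.

```python
from fractions import Fraction as F

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    """Multiply every entry of the matrix A by the scalar c."""
    return [[c * a for a in row] for row in A]

A = [[F(1), F(2)], [F(3), F(4)]]
B = [[F(0), F(-1)], [F(1, 2), F(5)]]
C = [[F(7), F(1, 3)], [F(-2), F(0)]]
a, b = F(2, 3), F(-5)

O = [[F(0), F(0)], [F(0), F(0)]]   # zero matrix
neg_A = scalar_mul(F(-1), A)       # opposite matrix [-a_ij]

checks = {
    "associativity": mat_add(mat_add(A, B), C) == mat_add(A, mat_add(B, C)),
    "commutativity": mat_add(A, B) == mat_add(B, A),
    "distributivity over scalars": scalar_mul(a + b, A)
        == mat_add(scalar_mul(a, A), scalar_mul(b, A)),
    "distributivity over matrices": scalar_mul(a, mat_add(A, B))
        == mat_add(scalar_mul(a, A), scalar_mul(a, B)),
    "compatibility": scalar_mul(a, scalar_mul(b, A)) == scalar_mul(a * b, A),
    "unit scalar": scalar_mul(F(1), A) == A,
    "zero matrix": mat_add(A, O) == A,
    "opposite matrix": mat_add(A, neg_A) == O,
}
print(all(checks.values()))  # True
```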

Multiplication of a Matrix by a Column Vector

The product of a matrix \(\,\boldsymbol{A}\in M_{m\times n}(K)\,\) with \(\,m\,\) rows and \(\,n\,\) columns by a column vector \(\,\boldsymbol{x}\in K^n\,\) of size \(\,n\ \) is defined as follows:

(2)\[\begin{split}\left[\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \\ \end{array}\right] \ \left[\begin{array}{c} x_1 \\ x_2 \\ \ldots \\ x_n \end{array}\right] \ :\,=\ \left[\begin{array}{c} a_{11}\,x_1 +\,a_{12}\,x_2 + \,\ldots\, +\,a_{1n}\,x_n \\ a_{21}\,x_1 +\,a_{22}\,x_2 + \,\ldots\, +\,a_{2n}\,x_n \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,x_1 +\,a_{m2}\,x_2 + \,\ldots\, +\,a_{mn}\,x_n \end{array}\right]\end{split}\]

(the operation is possible iff the number of columns of the matrix equals the size of the vector).

According to equation (2), multiplying a column vector \(\,\boldsymbol{x}\,\) of size \(\,n\ \) on the left by a matrix \(\,\boldsymbol{A}\,\) with \(\,m\,\) rows and \(\,n\,\) columns yields a column vector \(\,\boldsymbol{y}\,\) of size \(\,m\,\):

\[\boldsymbol{A}\,\boldsymbol{x}\ =\ \boldsymbol{y}\,,\qquad\text{where} \quad y_i\ = \ a_{i1}\,x_1 + \,a_{i2}\,x_2 + \,\ldots\, + \,a_{in}\,x_n\,, \quad i=1,2,\ldots,m.\]
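The formula for \(\,y_i\,\) translates directly into code: each component of the result is the sum of products of one row of the matrix with the vector. This is a minimal sketch; the name `mat_vec` and the sample data are assumptions made for this illustration.

```python
def mat_vec(A, x):
    """Return y = A x, where y_i = a_i1*x_1 + ... + a_in*x_n."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("number of columns of A must equal the size of x")
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]     # m = 2 rows, n = 3 columns
x = [1, 0, -1]      # vector of size n = 3
print(mat_vec(A, x))  # [-2, -2], a vector of size m = 2
```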

The right-hand side of equation (2) may be rewritten as

\[\begin{split}\left[\begin{array}{c} a_{11}\,x_1 +\,a_{12}\,x_2 + \,\ldots\, +\,a_{1n}\,x_n \\ a_{21}\,x_1 +\,a_{22}\,x_2 + \,\ldots\, +\,a_{2n}\,x_n \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,x_1 +\,a_{m2}\,x_2 + \,\ldots\, +\,a_{mn}\,x_n \end{array}\right] \ \,=\ \, x_1 \left[\begin{array}{c} a_{11} \\ a_{21} \\ \ldots \\ a_{m1} \end{array}\right] \; +\ x_2 \left[\begin{array}{c} a_{12} \\ a_{22} \\ \ldots \\ a_{m2} \end{array}\right] \; +\ \ldots \ + \ x_n \left[\begin{array}{c} a_{1n} \\ a_{2n} \\ \ldots \\ a_{mn} \end{array}\right]\,.\end{split}\]

Denoting by \(\ \,\boldsymbol{A}_1,\ \boldsymbol{A}_2,\ \ldots,\,\boldsymbol{A}_n\ \,\) the columns of the matrix \(\,\boldsymbol{A}\,:\)

\[\boldsymbol{A}\ \,=\ \, [\,\boldsymbol{A}_1\,|\,\boldsymbol{A}_2\,|\,\ldots\,|\,\boldsymbol{A}_n\,]\]

we may rewrite equation (2) in the form

(3)\[\boldsymbol{A} \, \boldsymbol{x} \ =\ x_1\,\boldsymbol{A}_1 \ +\ x_2\,\boldsymbol{A}_2 \ +\ \ldots \ + \ x_n\,\boldsymbol{A}_n\,.\]

Rule 0. \(\ \) Product of a Matrix and a Vector.

Suppose \(\,\boldsymbol{A}\in M_{m\times n}(K)\,,\ \boldsymbol{x}\in K^n\,.\ \) Then the product \(\,\boldsymbol{A}\,\boldsymbol{x}\ \) is the linear combination of columns of matrix \(\,\boldsymbol{A},\ \) the coefficients being consecutive elements of the vector \(\,\boldsymbol{x}.\)
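Rule 0 can be sketched as an alternative way of computing the same product: instead of working row by row, the columns of the matrix are accumulated with the elements of the vector as coefficients. The function names and sample data below are assumptions made for this illustration.

```python
def columns(A):
    """Return the columns A_1, ..., A_n of a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]

def mat_vec_by_columns(A, x):
    """Return A x as the linear combination x_1*A_1 + ... + x_n*A_n."""
    cols = columns(A)
    if len(cols) != len(x):
        raise ValueError("number of columns of A must equal the size of x")
    y = [0] * len(A)
    for xk, col in zip(x, cols):
        # add x_k times the k-th column to the running sum
        y = [yi + xk * ci for yi, ci in zip(y, col)]
    return y

A = [[1, 2, 3],
     [4, 5, 6]]
x = [2, -1, 1]
# 2*[1, 4] - 1*[2, 5] + 1*[3, 6]
print(mat_vec_by_columns(A, x))  # [3, 9]
```

The result agrees with the row-by-row formula: \(1\cdot 2 + 2\cdot(-1) + 3\cdot 1 = 3\) and \(4\cdot 2 + 5\cdot(-1) + 6\cdot 1 = 9.\)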

Going back to the generic system of linear equations (1), we shall rewrite it in the form of equality of two column vectors:

\[\begin{split}\left[\begin{array}{c} a_{11}\,x_1 +\,a_{12}\,x_2 + \,\ldots\, +\,a_{1n}\,x_n \\ a_{21}\,x_1 +\,a_{22}\,x_2 + \,\ldots\, +\,a_{2n}\,x_n \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,x_1 +\,a_{m2}\,x_2 + \,\ldots\, +\,a_{mn}\,x_n \end{array}\right] \ \ =\ \ \left[\begin{array}{c} b_{1} \\ b_{2} \\ \ldots \\ b_{m} \end{array}\right]\,.\end{split}\]

The definition (2) of the matrix-vector product allows for the compact notation of (1):

\[\boldsymbol{A} \, \boldsymbol{x} \ =\ \boldsymbol{b}\,.\]

Finally, by formula (3), we obtain the column picture of a system of linear equations:

\[x_1\,\boldsymbol{A}_1\ +\ x_2\,\boldsymbol{A}_2\ +\ \ldots\ + \ x_n\,\boldsymbol{A}_n\ =\ \boldsymbol{b}\,.\]
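As a small worked illustration (the numbers are chosen here and do not come from the text), consider the system \(\ x_1 + 2\,x_2 = 5,\ \ 3\,x_1 + 4\,x_2 = 11.\ \) Its compact form and its column picture read

\[\begin{split}\left[\begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right] \left[\begin{array}{c} x_1 \\ x_2 \end{array}\right] \ =\ \left[\begin{array}{c} 5 \\ 11 \end{array}\right]\,, \qquad x_1 \left[\begin{array}{c} 1 \\ 3 \end{array}\right] \ +\ x_2 \left[\begin{array}{c} 2 \\ 4 \end{array}\right] \ =\ \left[\begin{array}{c} 5 \\ 11 \end{array}\right]\,.\end{split}\]

The choice \(\ x_1 = 1,\ x_2 = 2\ \) satisfies both forms, since \(\ 1\cdot 1 + 2\cdot 2 = 5\ \) and \(\ 3\cdot 1 + 4\cdot 2 = 11;\ \) in the column picture this means that \(\,\boldsymbol{b}\,\) is the combination \(\ \boldsymbol{A}_1 + 2\,\boldsymbol{A}_2\ \) of the columns of \(\,\boldsymbol{A}.\)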

Matrix Multiplication (Product of Two Matrices)

We mentioned earlier that column vectors from \(\,K^n\,\) may be identified with one-column matrices from \(\,M_{n\times 1}(K).\ \) Thus formula (2) may be interpreted as the recipe for the product of an \(\ m\times n\ \) matrix by a one-column matrix with \(\,n\,\) rows. In this section we generalize it so that the multiplicand may be any multi-column matrix with \(\,n\,\) rows.

Within this interpretation the product of matrices \(\ \boldsymbol{A}\,=\,[a_{ij}]_{m\times p}\ \;\) and \(\quad \boldsymbol{B}\,=\,[b_{ij}]_{p\times 1}\ \,\) reads:

\[\begin{split}\boldsymbol{A} \boldsymbol{B} \ =\ \left[\,\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1p} \\ a_{21} & a_{22} & \ldots & a_{2p} \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & a_{m2} & \ldots & a_{mp} \\ \end{array}\right] \ \left[\begin{array}{c} b_{11} \\ b_{21} \\ \ldots \\ b_{p1} \end{array}\right] \ =\ \left[\begin{array}{c} a_{11}\,b_{11} +\,a_{12}\,b_{21} + \,\ldots\, +\,a_{1p}\,b_{p1} \\ a_{21}\,b_{11} +\,a_{22}\,b_{21} + \,\ldots\, +\,a_{2p}\,b_{p1} \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,b_{11} + a_{m2}\,b_{21} + \,\ldots\, +\,a_{mp}\,b_{p1} \end{array}\right]\,.\end{split}\]

Denoting \(\ \boldsymbol{A} \boldsymbol{B}\ =\ \boldsymbol{C}\ =\ [c_{ij}]_{m\times 1}\ \) we get

(4)\[\begin{split}\boldsymbol{C}\ =\ \left[\begin{array}{c} c_{11} \\ c_{21} \\ \ldots \\ c_{m1} \end{array}\right] \ =\ \left[\begin{array}{c} a_{11}\,b_{11} +\,a_{12}\,b_{21} + \,\ldots\, +\,a_{1p}\,b_{p1} \\ a_{21}\,b_{11} +\,a_{22}\,b_{21} + \,\ldots\, +\,a_{2p}\,b_{p1} \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,b_{11} + a_{m2}\,b_{21} + \,\ldots\, +\,a_{mp}\,b_{p1} \end{array}\right]\,.\end{split}\]

The columns of matrix \(\ \boldsymbol{A}\ \) being denoted by \(\ \boldsymbol{A}_1,\,\boldsymbol{A}_2,\,\dots,\,\boldsymbol{A}_p,\ \) this may be written as

(5)\[\begin{split}\begin{array}{lll} & \qquad & \boldsymbol{C}\ =\ b_{11}\,\boldsymbol{A}_1\ +\ b_{21}\,\boldsymbol{A}_2\ +\ \dots\ +\ b_{p1}\,\boldsymbol{A}_p \\[6pt] \text{and} & \qquad & c_{i1}\ =\ a_{i1}\,b_{11} + a_{i2}\,b_{21} + \,\ldots\, + a_{ip}\,b_{p1} \,,\quad i\,=\,1,2,\ldots,m\,. \end{array}\end{split}\]

Now let matrix \(\ \boldsymbol{B}\ \) be composed of \(\,n\,\) columns of size \(\,p\):

\[\begin{split}\boldsymbol{B}\ \ =\ \ \left[\,\boldsymbol{B}_1\,|\,\boldsymbol{B}_2\,|\, \ldots\,|\,\boldsymbol{B}_n\,\right]\ \ =\ \ \left[\begin{array}{cccc} b_{11} & b_{12} & \ldots & b_{1n} \\ b_{21} & b_{22} & \ldots & b_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ b_{p1} & b_{p2} & \ldots & b_{pn} \\ \end{array}\right]\,.\end{split}\]

The product \(\ \boldsymbol{A}\boldsymbol{B}\ \,\) is then defined as the matrix obtained by multiplication (from the left) of each column of \(\ \boldsymbol{B}\ \,\) by the matrix \(\ \boldsymbol{A}:\)

(6)\[\boldsymbol{A}\boldsymbol{B}\ \equiv\ \boldsymbol{A}\ \left[\,\boldsymbol{B}_1\,|\, \boldsymbol{B}_2\,|\,\ldots\,|\, \boldsymbol{B}_n\,\right]\ \ :\,=\ \ \left[\;\boldsymbol{A}\boldsymbol{B}_1\;|\; \boldsymbol{A}\boldsymbol{B}_2\;|\;\ldots\;|\; \boldsymbol{A}\boldsymbol{B}_n\;\right]\,.\]

Denoting \(\,\boldsymbol{A}\boldsymbol{B} = \boldsymbol{C} = [\;\boldsymbol{C}_1\,|\,\boldsymbol{C}_2\,|\,\ldots\,|\, \boldsymbol{C}_n\;] = [c_{ij}]_{m\times n}\,\) we get, by analogy with (4), (5):

\[\begin{split}\boldsymbol{C}_j\ =\ \left[\begin{array}{c} c_{1j} \\ c_{2j} \\ \ldots \\ c_{mj} \end{array}\right]\ =\ \left[\begin{array}{c} a_{11}\,b_{1j} +\,a_{12}\,b_{2j} + \,\ldots\, +\,a_{1p}\,b_{pj} \\ a_{21}\,b_{1j} +\,a_{22}\,b_{2j} + \,\ldots\, +\,a_{2p}\,b_{pj} \\ \ \ldots\qquad\ \ldots\qquad\ldots\qquad\ldots \\ a_{m1}\,b_{1j} +\,a_{m2}\,b_{2j} + \,\ldots\, +\,a_{mp}\,b_{pj} \end{array}\right]\,,\end{split}\]
(7)\[\begin{split}\begin{array}{l} \boldsymbol{C}_j\ =\ b_{1j}\,\boldsymbol{A}_1\ +\ b_{2j}\,\boldsymbol{A}_2\ +\ \ldots\ + \ b_{pj}\,\boldsymbol{A}_p \\ c_{ij}\ =\ a_{i1}\,b_{1j} +\,a_{i2}\,b_{2j} + \,\ldots\, +\,a_{ip}\,b_{pj} \,,\qquad \begin{array}{l} i\,=\,1,2,\ldots,m\,; \\ j\,=\,1,2,\ldots,n. \end{array} \end{array}\end{split}\]

The definition (6) and the formula (7) that arises from it may be restated as:

Rule 1. \(\ \) Column Rule of Matrix Multiplication.

Let \(\ \boldsymbol{A}\,\in M_{m\times p}(K),\ \boldsymbol{B}\,\in M_{p\times n}(K).\ \) Then the \(\ j\)-th column of the product \(\ \boldsymbol{A}\boldsymbol{B}\ \) is:

  1. the product of the matrix \(\ \boldsymbol{A}\,\) by the \(\ j\)-th column of matrix \(\boldsymbol{B};\)

  2. the linear combination of columns of matrix \(\ \boldsymbol{A},\,\) the coefficients being the consecutive elements of the \(\ j\)-th column of matrix \(\boldsymbol{B},\ \ j=1,2,\ldots,n.\)
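The column rule can be sketched as follows: the product is assembled column by column, each column being a matrix-vector product. The function names and the sample matrices are assumptions made for this illustration; matrices are stored as lists of rows.

```python
def mat_vec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def mat_mul_by_columns(A, B):
    """Return A B, built as [ A B_1 | A B_2 | ... | A B_n ]."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must match rows of B")
    B_cols = [list(col) for col in zip(*B)]      # columns B_1, ..., B_n
    C_cols = [mat_vec(A, Bj) for Bj in B_cols]   # columns C_j = A B_j
    return [list(row) for row in zip(*C_cols)]   # back to a list of rows

A = [[1, 2],
     [3, 4],
     [5, 6]]          # 3 x 2
B = [[1, 0, 2],
     [0, 1, -1]]      # 2 x 3
print(mat_mul_by_columns(A, B))  # [[1, 2, 0], [3, 4, 2], [5, 6, 4]]
```

In this example the third column of the product is \(\ 2\,\boldsymbol{A}_1 - \boldsymbol{A}_2,\ \) in agreement with the second part of the rule.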

If the space \(\,M_{1\times 1}(K)\,\) of 1-element matrices over the field \(\,K\,\) is identified with the field \(\,K\,\) itself, \(\ M_{1\times 1}(K)\simeq K,\ \) then the element \(\ c_{ij}\,\) of the matrix \(\,\boldsymbol{C}=\boldsymbol{A}\boldsymbol{B}\,\) is the product (in the sense of equation (3) and Rule 0) of the \(\,i\)-th row of matrix \(\,\boldsymbol{A}\,\) by the \(\,j\)-th column of matrix \(\,\boldsymbol{B}:\)

\[\begin{split}c_{ij}\ =\ a_{i1}\,b_{1j} + a_{i2}\,b_{2j} + \,\ldots \;+\; a_{ip}\,b_{pj}\ \,=\ \; \left[\begin{array}{cccc} a_{i1} & a_{i2} & \ldots & a_{ip} \end{array}\right]\ \left[\begin{array}{c} b_{1j} \\ b_{2j} \\ \ldots \\ b_{pj} \end{array}\right]\,.\end{split}\]

In this way we get the practical recipe for calculating elements of the matrix product:

Rule 2. \(\ \) Practical Calculation of Matrix Product. \(\,\)

If \(\,\boldsymbol{A}\,\in M_{m\times p}(K),\ \boldsymbol{B}\,\in M_{p\times n}(K)\,,\ \) then the element at the \(\,i\)-th row and the \(\,j\)-th column of the product \(\,\boldsymbol{A}\boldsymbol{B}\,\) is the product of the \(\,i\)-th row of matrix \(\,\boldsymbol{A}\,\) by the \(\,j\)-th column of matrix \(\,\boldsymbol{B},\) \(\ \ i\,=\,1,2,\ldots,m\,,\ \,j\,=\,1,2,\ldots,n.\)
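Rule 2 lets a single element of the product be computed without forming the whole matrix. The sketch below uses the 1-based indices of the text; the names `row_times_column` and `product_entry` and the sample matrices are assumptions made for this illustration.

```python
def row_times_column(row, col):
    """Product of a row of length p by a column of length p (a scalar)."""
    if len(row) != len(col):
        raise ValueError("row and column must have the same length")
    return sum(a * b for a, b in zip(row, col))

def product_entry(A, B, i, j):
    """Return c_ij of C = A B, with 1-based indices i and j."""
    row_i = A[i - 1]                      # i-th row of A
    col_j = [row[j - 1] for row in B]     # j-th column of B
    return row_times_column(row_i, col_j)

A = [[1, 2],
     [3, 4],
     [5, 6]]          # 3 x 2
B = [[1, 0, 2],
     [0, 1, -1]]      # 2 x 3
print(product_entry(A, B, 3, 3))  # 5*2 + 6*(-1) = 4
print(product_entry(A, B, 2, 1))  # 3*1 + 4*0 = 3
```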

The above formulae pertaining to the matrix product may be summarized as follows:

General Formulae for Matrix Multiplication.

Let \(\,\boldsymbol{A}\,\) and \(\,\boldsymbol{B}\,\) be matrices over a field \(\,K.\ \) The product \(\,\boldsymbol{A}\boldsymbol{B}\,\) exists if and only if the number of columns of matrix \(\,\boldsymbol{A}\,\) equals the number of rows of matrix \(\,\boldsymbol{B}.\,\) Then the number of rows of the product \(\,\boldsymbol{A}\boldsymbol{B}\,\) equals that of \(\,\boldsymbol{A}\,\) and the number of columns of \(\,\boldsymbol{A}\boldsymbol{B}\,\) equals that of \(\,\boldsymbol{B}.\,\) The elements of \(\,\boldsymbol{A}\boldsymbol{B}\,\) are products of rows of \(\,\boldsymbol{A}\,\) by columns of \(\,\boldsymbol{B}.\)

Specifically, if \(\ \boldsymbol{A}\,=\, [a_{ij}]_{m\times p}\,,\ \boldsymbol{B}\,=\,[b_{ij}]_{p\times n}\,,\ \) then \(\ \,\boldsymbol{A} \boldsymbol{B} = \boldsymbol{C} = [c_{ij}]_{m\times n}\,,\ \) where

\[\begin{split}c_{ij}\ =\ \left[\begin{array}{cccc} a_{i1} & a_{i2} & \ldots & a_{ip} \end{array}\right]\ \left[\begin{array}{c} b_{1j} \\ b_{2j} \\ \ldots \\ b_{pj} \end{array}\right] \ \, =\ \,\sum_{k=1}^p \; a_{ik}\,b_{kj}\,, \qquad \begin{array}{l} i\,=\,1,2,\ldots,m\,; \\ j\,=\,1,2,\ldots,n. \end{array}\end{split}\]
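The general formula, including the existence condition and the size of the result, may be sketched in a few lines. The function name `mat_mul` and the sample matrices are assumptions made for this illustration; matrices are stored as lists of rows.

```python
def mat_mul(A, B):
    """Return C = A B with c_ij = sum over k of a_ik * b_kj."""
    m, p = len(A), len(A[0])
    if len(B) != p:
        raise ValueError("number of columns of A must equal number of rows of B")
    n = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

A = [[1, 2],
     [3, 4],
     [5, 6]]          # 3 x 2
B = [[1, 0, 2],
     [0, 1, -1]]      # 2 x 3

print(mat_mul(A, B))  # [[1, 2, 0], [3, 4, 2], [5, 6, 4]]   (3 x 3)
print(mat_mul(B, A))  # [[11, 14], [-2, -2]]                (2 x 2)

try:
    mat_mul(A, A)     # 3 x 2 times 3 x 2: the product does not exist
except ValueError as err:
    print("undefined:", err)
```

The example also shows that \(\,\boldsymbol{A}\boldsymbol{B}\,\) and \(\,\boldsymbol{B}\boldsymbol{A}\,\) may both exist and yet have different sizes.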