Part A - Introduction

Linear Algebra and Trigonometry

Introduce linear algebra and trigonometry
Define vectors, matrices and operations on them

Vectors | Matrices | Exercises


Many engineers, physicists, computer scientists, computer animators and economists depend on compute-intensive applications.  They describe the solutions to their problems in terms of large numbers of variables and express the relationships between these variables as linear equations.  To relate n variables, they use n equations.  They represent each variable geometrically as a scalar value along one direction in an n-dimensional problem space.  That is, they use computers to solve problems defined in terms of sets of n variables in n-dimensional space.

Variables related to one another through linear equations can be expressed in terms of operations on vectors and matrices.  The branch of mathematics that models such linear transformations of a problem space is called linear algebra.  This branch treats each set of n variables as an n-dimensional vector and describes transformations of these vectors in terms of matrices.  Certain transformations are best expressed as rotations through angles.  The branch of mathematics that defines the rules for operating on angles is called trigonometry.

This chapter introduces basic rules of linear algebra and trigonometry.  This initial presentation is restricted to two-dimensional space for simplicity and clarity.  The following chapter extends these definitions and rules to n-dimensional space. 


Vectors

Vectors describe entities of information in a problem space.  Consider, for instance, a vehicle's velocity over some terrain.  On a flat terrain, a complete description of the vehicle's velocity requires only two scalar values: one in, say, the north direction and one in the west direction.  Taken together, these two values determine the vehicle's speed and its direction. 

The formal definition of a vector is an entity with a direction and a finite magnitude.  Examples include displacement, velocity, and force.  We depict a vector geometrically by a directed line segment with an arrowhead.  We call the arrow tip the vector's head and the furthest point away from the head the vector's tail.  A vector's magnitude is the distance from its tail to its head. 

vector

A vector's magnitude is always non-negative. 

In these notes, we represent a vector using boldface lowercase notation: 

 p

Scalars

A scalar differs from a vector.  A scalar has magnitude but no direction.  Examples of scalar quantities include energy, mass and temperature. 

We represent a scalar using normal-face lowercase notation and distinguish it from a vector: 

 s

Operations on Vectors

Common operations on vectors include reversal, scaling, addition and subtraction.

Reversal

The reverse of a vector is the vector with its direction reversed; that is, its head and tail exchanged.  We preface a vector's symbol with a negative sign to reverse its direction: 

  r = - p

reverse vector

Scaling

We scale a vector's magnitude while preserving its direction by multiplying the vector by a positive-valued scalar: 

 r = sp

scalar multiplication of a vector

Multiplying a vector by a negative-valued scalar reverses its direction and scales the reversed vector by that scalar's absolute value.

Addition

Adding one vector to another vector produces a third vector.  This third vector extends from the tail of the original vector to the head of the added vector after having attached the tail of the added vector to the head of the original vector.  In other words, the addition of a vector to another vector translates the latter's head while holding its tail in place.  We use the plus operator to identify the addition of two vectors:

  r = p + q

adding two vectors

Subtraction

Subtracting one vector from another vector produces a third vector.  The third vector extends from the head of the subtracted vector to the head of the original vector after attaching the tail of the subtracted vector to the tail of the original vector.  In other words, subtraction translates the tail of the original vector while holding its head in place.  We use the minus operator to identify subtraction of two vectors:

  r = p - q 

subtracting vectors

Free Vectors

These head-tail descriptions of vectors are relative descriptions.  Vectors are free to translate throughout the problem space.  As long as a vector's magnitude and direction remain unchanged, its actual position in the problem space is of no consequence.  That is, the problem space itself has no origin.  For example, 40 kilometers per hour north-west from Toronto is the same as 40 kilometers per hour north-west from Ottawa.  Each vector only defines the position of its head with respect to its own tail.  Its head and tail are free to translate in the same direction without altering the vector itself.  We refer to any vector that is free to translate in this way as a free vector.

The three free vectors shown below are identical:

free vectors

We express their equivalence by writing:

  r = q = p

Displacement, velocity, force and acceleration are represented by free vectors. 

Coordinate System

To identify specific points in a problem space, we add a coordinate system to the problem space.  We call this system our frame of reference.  The most common two-dimensional system is rectangular (Cartesian).  It consists of two mutually perpendicular axes crossing one another at a point called the origin (as shown in the figure below).  We label the axes x and y: x drawn horizontally to the right and y drawn at a right angle (an angle of 90 degrees) counter-clockwise from the x axis.  We identify points along the x axis by their distance from the origin, with positive values increasing to the right and negative values decreasing to the left.  Similarly, we identify points along the y axis by their distance from the origin, with positive values increasing upwards and negative values decreasing downwards. 

2 d coordinate system

Each point in a problem space with a coordinate system has its own unique pair of values.  Each value of a pair is the distance from the origin along one of the two axes.  The pair [px, py] (as shown in the figure below) identifies point P.  We call the two values that identify point P the x and y coordinates of point P.

2 d coordinates

Vectors in a Coordinate System

We identify each point in a problem space with a coordinate system in terms of its own unique vector by attaching the vector's tail to the origin and its head to the point itself.  We refer to such vectors as bound vectors (as opposed to free vectors). 

2 d vector

We express the bound vector for a point in terms of its value pair and write the pair as a column of values as shown below:

      | px |
  p = |    |
      | py |

px and py are the x and y coordinates of the vector that identifies point P in the coordinate system.  Each coordinate is itself a scalar.

Transpose

We also express a vector using a row of values written in the form:

  pT = [px py]

The T in pT denotes transpose.  Transposing switches columns into rows (or rows into columns) without changing the coordinate values themselves.  The transpose of a column vector is a row vector with its values preserved.  The transpose of a row vector is a column vector with its values preserved. 

The transpose of the transpose of a vector is the original vector:

  (pT)T = p

For compactness of notation, we write a column vector in the form:

  p = [px py]T

Components of a Vector

Any vector can be represented by a number of independent components equal to the dimension of the problem space.  In a 2-dimensional space, any vector can be represented by a pair of components.  Each component is a scalar measure in a separate direction in the problem space.  The components of any vector are the scalar values that identify the vector in the problem space.  The following section describes operations on vectors in terms of their component values.
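
The sketch below, which is not part of the original notes, stores a two-dimensional vector in C++ as a pair of scalar components; the struct name Vec2 and its members x and y are choices made for this example only:

  // a minimal sketch: a 2-d vector stored as two scalar components
  #include <iostream>

  struct Vec2 {
      double x;   // component along the x axis
      double y;   // component along the y axis
  };

  int main() {
      Vec2 p{3.0, 4.0};                                    // p = [3 4]T
      std::cout << "p = [" << p.x << " " << p.y << "]T\n";
  }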

Free Vectors

Consider vectors qh and qt with their tails bound to the same point (the origin) as shown in the figure below.  The free vector q is the result of subtracting one of these vectors from the other: 

  q = qh - qt

The components of the free vector are the result of subtracting the coordinate values of one bound vector from the coordinate values of the other bound vector: 

  q = qh - qt = [qhx - qtx    qhy - qty]T

The head of the bound vector qh [qhx, qhy] identifies the head of the free vector q and the head of the bound vector qt [qtx, qty] identifies the tail of the free vector q as shown below. 

free vector

Consider two locations on a map represented by two vectors bound to the map's origin.  The difference between the two locations is a free vector.  If the displacement between the two locations is the same as the displacement between two other locations, the two displacement vectors themselves are identical. 

two free vectors

  q = qh - qt = [qhx - qtx    qhy - qty]T
  r = rh - rt = [rhx - rtx    rhy - rty]T
  q = r

For two identical displacements, their component values are identical, even though the bound vectors from which they were derived differ.

Reversal

Reversing a vector multiplies its components by minus one:

 r = -p = [-px   -py]T
        = [ptx - phx    pty - phy]T

Scaling

Multiplying a vector by a scalar multiplies each component by that scalar: 

  r = sp = [spx   spy]T

Addition

Adding one vector to another vector produces a third vector with components equal to the sum of the components of the contributing vectors:

 r = p + q
   = [px   py]T + [qx   qy]T
   = [px + qx    py + qy]T
 

By aligning the tail of the added vector with the head of the original vector, we see that the result of the addition is a vector from the tail of the original vector to the head of the added vector:

  r = p + q
    = [phx - ptx   phy - pty]T + [qhx - phx   qhy - phy]T
    = [phx - ptx + qhx - phx   phy - pty + qhy - phy]T
    = [qhx - ptx   qhy - pty]T

Subtraction

Subtracting one vector from another vector produces a third vector with components equal to the difference of the components of the contributing vectors:

 r = p - q
   = [px   py]T - [qx   qy]T
   = [px - qx   py - qy]T
 

By aligning the tails of the two vectors, we see that the result of the subtraction is a vector from the head of the subtracted vector to the head of the original vector:

  r = p - q
    = [phx - ptx   phy - pty]T - [qhx - ptx   qhy - pty]T
    = [phx - ptx - (qhx - ptx)   phy - pty - (qhy - pty)]T
    = [phx - qhx   phy - qhy]T
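
The component rules for reversal, scaling, addition and subtraction translate directly into code.  The following C++ sketch applies each rule component by component; the function names reverse, scale, add and subtract are hypothetical, chosen for this illustration:

  #include <iostream>

  struct Vec2 { double x, y; };

  Vec2 reverse(Vec2 p)          { return { -p.x, -p.y }; }           // r = -p
  Vec2 scale(double s, Vec2 p)  { return { s * p.x, s * p.y }; }     // r = sp
  Vec2 add(Vec2 p, Vec2 q)      { return { p.x + q.x, p.y + q.y }; } // r = p + q
  Vec2 subtract(Vec2 p, Vec2 q) { return { p.x - q.x, p.y - q.y }; } // r = p - q

  int main() {
      Vec2 p{3.0, 4.0}, q{1.0, 2.0};
      Vec2 r = add(p, q);                                  // [4 6]T
      Vec2 d = subtract(p, q);                             // [2 2]T
      std::cout << "[" << r.x << " " << r.y << "]T\n"
                << "[" << d.x << " " << d.y << "]T\n";
  }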

Dot Product

The product of a row vector and a column vector is the sum of the products of their corresponding components: the first component of the row vector times the first component of the column vector, plus the second component of the row vector times the second component of the column vector:

  [a b]| c | = ac + bd
       | d |

Multiplying a row vector by a column vector produces a scalar value.  We call this value the scalar or dot product of the two vectors.  This scalar result is given by

  s = p∙q = pTq

              | qx |
    = [px py] |    |
              | qy |

    = pxqx + pyqy

Magnitude

The formula for a vector's magnitude follows from Pythagoras' theorem.  For a two-component vector:

  ||p|| = √[px² + py²] = √[(phx - ptx)² + (phy - pty)²]

The double bar symbol denotes magnitude.

The dot product of a vector with itself is the square of the vector's magnitude:

  s = p∙p = pTp

              | px |
    = [px py] |    |
              | py |

    = px² + py²
    = ||p||²
 

The magnitude of a reversed vector is the same as the magnitude of the original vector.

Norm

The norm of a vector is the vector with the same direction but unit magnitude:

  n = p / ||p||
    = p / √(p ∙ p)
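
A minimal C++ sketch of the dot product, magnitude and norm calculations might look as follows; the helper names dot, magnitude and norm are assumptions for this example:

  #include <cmath>
  #include <iostream>

  struct Vec2 { double x, y; };

  double dot(Vec2 p, Vec2 q) { return p.x * q.x + p.y * q.y; } // s = p.q
  double magnitude(Vec2 p)   { return std::sqrt(dot(p, p)); }  // ||p||
  Vec2   norm(Vec2 p) {                                        // p / ||p||
      double m = magnitude(p);
      return { p.x / m, p.y / m };
  }

  int main() {
      Vec2 p{3.0, 4.0};
      Vec2 n = norm(p);
      std::cout << "p.p   = " << dot(p, p)    << "\n"          // 25
                << "||p|| = " << magnitude(p) << "\n"          // 5
                << "n     = [" << n.x << " " << n.y << "]T\n"; // [0.6 0.8]T
  }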

Matrices

Matrices transform vectors into other vectors.  In other words, matrices transform vectors from one coordinate system to another.  A transformation modifies the components of a vector to components appropriate to the new coordinate system. 

For example, consider a unit vector pointing north in a coordinate system with its x-axis pointing east and its y-axis pointing north.  The vector's components are [0, 1].  In a coordinate system with its x-axis pointing south and its y-axis pointing east, the vector's components are [-1, 0].  The matrix that transforms the vector from the first system to the second defines the changes that we need to make to the vector's components. 

A matrix describes any such transformation.  The transformation is independent of the vector on which it operates.  A matrix in two-dimensional space consists of two rows and two columns and contains one coefficient for each row and column combination.  We represent a matrix using upper-case bold-face notation: 

     | m00  m01 |
 M = |          | 
     | m10  m11 |

mij denotes the matrix coefficient for row i column j

A matrix with an equal number of rows and columns is a square matrix. 

A matrix whose only non-zero values lie on its diagonal is a diagonal matrix.

Identity Matrix

An identity transformation leaves all vectors unchanged.  The matrix that represents the identity transformation is called the identity matrix.  Its components are 1's along the diagonal and 0's elsewhere:

 | 1  0 | 
 | 0  1 |

Operations

We can perform the following operations on matrices:

  • addition
  • subtraction
  • multiplication by scalars, vectors and matrices

Addition

Adding one matrix to another produces a third matrix.  The coefficients of the resultant matrix are equal to the sum of the coefficients of the contributing matrices:

 C = A + B
     | A00  A01 |   | B00  B01 |
   = |          | + |          |
     | A10  A11 |   | B10  B11 |

     | A00 + B00   A01 + B01 |
   = |                       |
     | A10 + B10   A11 + B11 | 

Subtraction

Subtracting one matrix from another produces a third matrix.  The coefficients of the resultant matrix are equal to the difference of the coefficients of the contributing matrices:

 C = A - B
     | A00  A01 |   | B00  B01 |
   = |          | - |          |
     | A10  A11 |   | B10  B11 |

     | A00 - B00   A01 - B01 |
   = |                       |
     | A10 - B10   A11 - B11 | 

Scalar Multiplication

Multiplying a matrix by a scalar produces another matrix.  The coefficients of the resultant matrix are equal to the product of the scalar and the coefficients of the contributing matrix:

 C = s A
       | A00  A01 |
   = s |          |
       | A10  A11 |

     | s A00   s A01 |
   = |               |
     | s A10   s A11 | 
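
The following C++ sketch illustrates these coefficient-by-coefficient rules on two arbitrarily chosen matrices and an arbitrary scalar:

  #include <iostream>

  int main() {
      double A[2][2] = { { 1.0, 2.0 }, { 3.0, 4.0 } };
      double B[2][2] = { { 5.0, 6.0 }, { 7.0, 8.0 } };
      double sum[2][2], diff[2][2], scaled[2][2];
      double s = 0.5;

      for (int i = 0; i < 2; i++)
          for (int j = 0; j < 2; j++) {
              sum[i][j]    = A[i][j] + B[i][j];   // C = A + B
              diff[i][j]   = A[i][j] - B[i][j];   // C = A - B
              scaled[i][j] = s * A[i][j];         // C = s A
          }

      std::cout << sum[0][0] << " " << sum[0][1] << "\n"   // 6 8
                << sum[1][0] << " " << sum[1][1] << "\n";  // 10 12
  }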

Matrix Column-Vector Multiplication

Post-multiplying a matrix by a column vector produces a column vector.  The components of the resultant vector are equal to the dot product of each row of the matrix and the column vector:

 c = A b
     | A00  A01 | | b0 |
   = |          | |    |
     | A10  A11 | | b1 |

     | A00 * b0 + A01 * b1 |
   = |                     |
     | A10 * b0 + A11 * b1 | 

Multiplying the row of a matrix by a column vector follows the dot product rule.  The dot product of the first row of the matrix with the column vector gives the first row coefficient in the resultant vector.  The dot product of the second row of the matrix with the column vector gives the second row coefficient in the resultant vector. 
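
A rough C++ sketch of this rule, using a sample matrix and a sample column vector chosen only for illustration:

  #include <iostream>

  int main() {
      double A[2][2] = { { 1.0, 2.0 },
                         { 3.0, 4.0 } };                     // a sample 2 x 2 matrix
      double b[2]    = { 5.0, 6.0 };                         // a sample column vector
      double c[2];

      // each component of c is the dot product of a row of A with b
      for (int i = 0; i < 2; i++)
          c[i] = A[i][0] * b[0] + A[i][1] * b[1];

      std::cout << "c = [" << c[0] << " " << c[1] << "]T\n"; // [17 39]T
  }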

Matrix-Matrix Multiplication

Multiplying one matrix by another produces a third matrix.  Each coefficient of the resultant matrix is equal to the dot product of the corresponding row of the left matrix and the corresponding column of the right matrix:

 C = A B
     | A00  A01 | | B00  B01 |
   = |          | |          |
     | A10  A11 | | B10  B11 |

     | A00 * B00 +  A01 * B10   A00 * B01 + A01 * B11 | 
   = |                                                |
     | A10 * B00 +  A11 * B10   A10 * B01 + A11 * B11 |
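
A similar C++ sketch for the matrix-matrix rule, again with arbitrarily chosen matrices:

  #include <iostream>

  int main() {
      double A[2][2] = { { 1.0, 2.0 }, { 3.0, 4.0 } };
      double B[2][2] = { { 5.0, 6.0 }, { 7.0, 8.0 } };
      double C[2][2];

      // C[i][j] is the dot product of row i of A with column j of B
      for (int i = 0; i < 2; i++)
          for (int j = 0; j < 2; j++)
              C[i][j] = A[i][0] * B[0][j] + A[i][1] * B[1][j];

      std::cout << C[0][0] << " " << C[0][1] << "\n"   // 19 22
                << C[1][0] << " " << C[1][1] << "\n";  // 43 50
  }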
          

Vector Transformations

Sample transformations on vectors include the identity, rotation and uniform scaling transformations.  The identity matrix does not change a vector.  A uniform scaling matrix changes a vector's magnitude while preserving its direction.  A rotation matrix changes a vector's direction while preserving its magnitude.  (A non-uniform scaling matrix generally changes both its direction and its magnitude.) 

Consider a coordinate system with its x-axis pointing east and its y-axis pointing north.  A column vector pointing east has components:

 | 1 |
 | 0 |

east

Identity Transformation

The product of the identity matrix and this column vector leaves the arrow pointing east:

 | 1 0 | | 1 |   | 1 * 1 +  0 * 0 |   | 1 |
 |     | |   | = |                | = |   | 
 | 0 1 | | 0 |   | 0 * 1 +  1 * 0 |   | 0 |

east

Uniform Scaling Transformation

The matrix that doubles the magnitude of this column vector is given by:

 | 2  0 |
 | 0  2 |

The product of this matrix and the column vector doubles the size of the arrow but keeps it pointing east:

 | 2  0 | | 1 |   | 2 * 1 +  0 * 0 |   | 2 |
 |      | |   | = |                | = |   | 
 | 0  2 | | 0 |   | 0 * 1 +  2 * 0 |   | 0 |

east

Rotation Transformation

The matrix that rotates this vector through 90° counter-clockwise is given by:

 | 0 -1 |
 | 1  0 |

The product of this matrix and the column vector rotates the arrow pointing east so that it points north:

 | 0 -1 | | 1 |   | 0 * 1 + -1 * 0 |   | 0 |
 |      | |   | = |                | = |   | 
 | 1  0 | | 0 |   | 1 * 1 +  0 * 0 |   | 1 |

north
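
For illustration, the following C++ sketch reproduces the three examples above, applying the identity, uniform scaling and rotation matrices in turn to the vector pointing east; the helper function transform is an assumption made for this sketch:

  #include <iostream>

  // multiply a 2 x 2 matrix by a column vector
  void transform(double M[2][2], double v[2], double r[2]) {
      r[0] = M[0][0] * v[0] + M[0][1] * v[1];
      r[1] = M[1][0] * v[0] + M[1][1] * v[1];
  }

  int main() {
      double east[2]        = { 1.0, 0.0 };
      double identity[2][2] = { { 1.0, 0.0 }, { 0.0, 1.0 } };
      double scaling[2][2]  = { { 2.0, 0.0 }, { 0.0, 2.0 } };
      double rotation[2][2] = { { 0.0, -1.0 }, { 1.0, 0.0 } }; // 90 degrees counter-clockwise
      double r[2];

      transform(identity, east, r);
      std::cout << "[" << r[0] << " " << r[1] << "]T\n";  // [1 0]T - still east
      transform(scaling, east, r);
      std::cout << "[" << r[0] << " " << r[1] << "]T\n";  // [2 0]T - twice as long, still east
      transform(rotation, east, r);
      std::cout << "[" << r[0] << " " << r[1] << "]T\n";  // [0 1]T - now north
  }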

To understand the coefficients of a rotation matrix we need to introduce angles and apply the rules of trigonometry. 

Trigonometry

Trigonometry describes the relationships between the sides of a right-angled triangle and the angles between adjacent sides. 

We express angles in either degrees or radians, where a circle contains 360 degrees or 2π radians.  To convert from degrees to radians, we multiply the degree value by π/180:

  θ (in radians) = ( π / 180 ) * θ (in degrees)
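
A short C++ sketch of this conversion, with an arbitrarily chosen angle:

  #include <cmath>
  #include <iostream>

  int main() {
      const double pi = std::acos(-1.0);        // pi to machine precision
      double degrees  = 30.0;
      double radians  = (pi / 180.0) * degrees; // θ (in radians)
      std::cout << degrees << " degrees = " << radians << " radians\n"; // ~0.523599
  }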

Interactive Example

Consider the bound vector in the following interactive example.  Its tail is bound to the origin of the coordinate system.  By moving the vector's head, we change its components as well as the angle between the vector and the x axis of the coordinate system.  Let us label that angle θ.  Its value in both degrees and radians is shown below the graph.

Trigonometric Functions

Standard trigonometric functions are functions that take angles as arguments.  These functions express an angle in terms of a ratio of two lengths.  Consider the right-angled triangle shown below.  The shaded angle is the right angle (90 degrees or π/2 radians). 

trigonometry

We call the longest side - the one opposite the right angle - the hypotenuse.  We set it to unit length and let θ (theta) denote the angle between this hypotenuse and another side of the triangle.  We call that side the adjacent side.  We call the third side the opposite side. 

The trigonometric function for the length of the adjacent side is

  cos θ

cos is the function name. The trigonometric function for the length of the opposite side is

   sin θ

sin is the function name. 

Values of sin θ and cos θ for common values of θ are listed in the table below:

  θ (degrees)   θ (radians)     sin θ       cos θ
       0             0            0           1
      30            π/6          1/2       (√3)/2
      45            π/4        (√2)/2      (√2)/2
      60            π/3        (√3)/2        1/2
      90            π/2           1           0
     120           2π/3        (√3)/2       -1/2
     135           3π/4        (√2)/2     -(√2)/2
     150           5π/6          1/2      -(√3)/2
     180            π             0          -1
     210           7π/6         -1/2      -(√3)/2
     225           5π/4       -(√2)/2     -(√2)/2
     240           4π/3       -(√3)/2       -1/2
     270           3π/2          -1           0
     300           5π/3       -(√3)/2        1/2
     315           7π/4       -(√2)/2      (√2)/2
     330          11π/6         -1/2       (√3)/2

The trigonometric function that defines the ratio of the length of the opposite side to that of the adjacent side is the tangent of the angle that subtends the two sides.  This tangent is given by

  tan θ = sin θ / cos θ

tan is the function name. 

Using Pythagoras' theorem, we have the trigonometric identity:

  cos² θ + sin² θ = 1

Rotations

We can express the coefficients of the matrix that rotates a vector through angle θ in terms of the cos and sin trigonometric functions. 

The following pair of equations gives the coefficients of vector p rotated through angle θ into vector p':

  p'x = cos θ * px  -  sin θ * py 
  p'y = sin θ * px  +  cos θ * py

Rewriting this pair of equations as a matrix-vector product:

  | p'x |   | cos θ  -sin θ | | px | 
  |     | = |               | |    |
  | p'y |   | sin θ   cos θ | | py |

In more compact notation:

 p' = R p

where R is the rotation matrix defined by 

      | cos θ  -sin θ |
  R = |               |
      | sin θ   cos θ |
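
The following C++ sketch, included only as an illustration, builds R for a chosen angle and applies it to the vector pointing east; with θ set to 90 degrees it reproduces, within rounding error, the rotation example shown earlier:

  #include <cmath>
  #include <iostream>

  int main() {
      const double pi = std::acos(-1.0);
      double theta = pi / 2.0;                             // 90 degrees counter-clockwise

      double R[2][2] = { { std::cos(theta), -std::sin(theta) },   // rotation matrix R
                         { std::sin(theta),  std::cos(theta) } };

      double p[2]  = { 1.0, 0.0 };                         // east
      double pp[2] = { R[0][0] * p[0] + R[0][1] * p[1],    // p' = R p
                       R[1][0] * p[0] + R[1][1] * p[1] };

      std::cout << "[" << pp[0] << " " << pp[1] << "]T\n"; // approximately [0 1]T - north
  }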

Transpose

The transpose (T) of a matrix is the matrix with its rows and columns exchanged.  To obtain RT from R, we exchange the rows and columns of R:

       |  cos θ  sin θ |
  RT = |               |
       | -sin θ  cos θ |

Consider a rotation matrix that rotates a vector in a certain direction.  Then, the transpose of that rotation matrix rotates the vector in the opposite direction.

Scaling

We can scale a vector using a diagonal matrix with non-negative coefficients: 

      | sx  0 |
  S = |       |
      | 0  sy |

sx is the magnification factor for distances in the direction of the x-axis.  sy is the magnification factor for distances in the direction of the y-axis. 

For example, the scaling matrix

      | 3    0 |
  S = |        |
      | 0  1.8 |
 

magnifies the x component by 3 and the y component by 1.8.  Applied to the vector [1 1], this matrix yields the result shown in the Figure below: 

scaling
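
A brief C++ sketch that applies this scaling matrix to the vector [1 1]:

  #include <iostream>

  int main() {
      double S[2][2] = { { 3.0, 0.0 },
                         { 0.0, 1.8 } };                  // sx = 3, sy = 1.8
      double p[2] = { 1.0, 1.0 };

      double r[2] = { S[0][0] * p[0] + S[0][1] * p[1],    // r = S p
                      S[1][0] * p[0] + S[1][1] * p[1] };

      std::cout << "[" << r[0] << " " << r[1] << "]T\n";  // [3 1.8]T
  }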

Matrix Multiplication

Solutions to scientific problems may involve several transformations.  These transformations are represented by matrix multiplications.  Since matrix multiplications are O(n³), these operations are highly compute intensive. 

A matrix represents a transformation that applies to all vectors defined in the space of a particular coordinate system.  That is, the matrix only needs to be determined once for all vectors in that space.  Similarly, the product of several matrices only needs to be calculated once for all vectors in that space.  Each multiplication of one matrix by another creates a new matrix, which represents a compound transformation.  Multiplying matrices by one another allows us to compress the set of transformations into one equivalent transformation. 

Consider the transformation of vector p into vector p' using matrix X:

  p' = X p

Consider a second transformation using matrix Y

  p" = Y p'

Substituting the first result for p' into this second equation yields

  p" = Y X p

We can express the final result directly through a composite transformation

 p" = C p

where the composite matrix (C) is the product of the two matrices that describe the individual transformations:

 C = Y X

A composite transformation may be the product of many transformations.  Note how each successive matrix pre-multiplies the resultant matrix for the previous transformations: 

  C = Xi Xi-1 … X1 X0

Consider, for example,

  1. a rotation through 30 degrees counter-clockwise, followed by
  2. a scaling of 2 along the x axis and 4 along the y axis

The composite transformation is represented by the product of the following two matrices:

  scaling *  rotation

  | 2 0 |  | (√3)/2  -1/2  |
  | 0 4 |  |  1/2   (√3)/2 |

Matrix Multiplication Rule

The rule for multiplying one matrix by another follows from the rule for multiplying a matrix by a column vector.  Each coefficient in the resultant matrix is the dot product of the corresponding row of the left matrix (treated as a row vector) and the corresponding column of the right matrix (treated as a column vector):

  c00 = a00 * b00 + a01 * b10     c10 = a10 * b00 + a11 * b10 
  c01 = a00 * b01 + a01 * b11     c11 = a10 * b01 + a11 * b11

That is, the coefficient for the second row, first column of the resultant matrix is the dot product of the second row of the left matrix and the first column of the right matrix, as shown below:

  scaling       rotation

  | 2 0 |  | (√3)/2  -1/2  |
  | 0 4 |  |  1/2   (√3)/2 |

    = 0*((√3)/2) + 4*(1/2)
    = 2

Calculating the dot products for each coefficient yields

  composite matrix

  | √3  -1 |
  |  2 2√3 |
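
The following C++ sketch, offered as an illustration, forms this composite matrix numerically: it builds the 30-degree rotation matrix, post-multiplies the scaling matrix by it, and prints coefficients that should match the composite above within rounding error:

  #include <cmath>
  #include <iostream>

  int main() {
      const double pi = std::acos(-1.0);
      double theta = pi / 6.0;                                 // 30 degrees counter-clockwise
      double R[2][2] = { { std::cos(theta), -std::sin(theta) },
                         { std::sin(theta),  std::cos(theta) } };
      double S[2][2] = { { 2.0, 0.0 }, { 0.0, 4.0 } };         // scale x by 2, y by 4
      double C[2][2];                                          // C = S R

      for (int i = 0; i < 2; i++)
          for (int j = 0; j < 2; j++)
              C[i][j] = S[i][0] * R[0][j] + S[i][1] * R[1][j];

      std::cout << C[0][0] << " " << C[0][1] << "\n"   // 1.73205 (√3)   -1
                << C[1][0] << " " << C[1][1] << "\n";  // 2              3.4641 (2√3)
  }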

Order of Multiplication

Matrix multiplication is not commutative.  The positioning of the matrices matters.  Consider the above transformations but in opposite order

  1. a scaling of 2 along the x axis and 4 along the y axis, followed by
  2. a rotation through 30 degrees counter-clockwise

The matrix product is

       rotation   * scaling

  | (√3)/2  -1/2  | | 2 0 |
  |  1/2   (√3)/2 | | 0 4 |

which reduces to

  composite

  | √3  -2 |
  |  1 2√3 |

Note the difference in the off-diagonal terms.

Since matrix multiplication is a non-commutative operation, we cannot exchange the matrices and expect the same result.

Associative Property

Matrix multiplication is an associative operation.  The order of evaluation of a sequence of matrix multiplications has no effect on the result

  Xi Xj Xk = (Xi Xj) Xk = Xi (Xj Xk)

Special Properties

Rotation Matrices

The product of any rotation matrix R and its transpose is the identity matrix:

         | cos θ  -sin θ | |  cos θ  sin θ |   | 1  0 |
  R RT = |               | |               | = |      | 
         | sin θ   cos θ | | -sin θ  cos θ |   | 0  1 |

This property holds for all rotation matrices.
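
A quick numerical check of this property, using an arbitrarily chosen angle:

  #include <cmath>
  #include <iostream>

  int main() {
      const double pi = std::acos(-1.0);
      double theta = pi / 6.0;                               // any angle gives the same result
      double R[2][2]  = { { std::cos(theta), -std::sin(theta) },
                          { std::sin(theta),  std::cos(theta) } };
      double RT[2][2] = { { R[0][0], R[1][0] },              // transpose: rows and columns exchanged
                          { R[0][1], R[1][1] } };

      for (int i = 0; i < 2; i++) {                          // print R RT row by row
          for (int j = 0; j < 2; j++)
              std::cout << R[i][0] * RT[0][j] + R[i][1] * RT[1][j] << " ";
          std::cout << "\n";                                 // 1 0 / 0 1 (within rounding)
      }
  }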

Uniform Scaling

A matrix with identical scaling factors in each direction changes a vector's magnitude but not its direction.  The product of such a matrix and any vector reduces to a scalar multiplication of the vector

    | a |   | s  0 || a |     | 1  0 || a |     | a |   | sa |
  S |   | = |      ||   | = s |      ||   | = s |   | = |    | 
    | b |   | 0  s || b |     | 0  1 || b |     | b |   | sb |
 

Note that the matrix between the second and third equality signs is the identity matrix.  Since it leaves the vector unchanged, we remove it directly. 


Exercises

  • Consider a vector that extends from the point [1 1] in a rectangular coordinate system to the point [2 2].  Determine the magnitude and the norm of this vector.  Define the matrix that rotates any vector 45 degrees counter-clockwise.  Determine the product of this matrix and the original vector. 
  • Define the matrix that rotates any vector 30 degrees counter-clockwise and the matrix that stretches any vector by 2 in the x direction and 4 in the y direction.  Determine the composite matrix that performs the scaling followed by the rotation.  Check your result using the vectors [1 0] and [0 1].
  • Complete the Workshop on Initial Profiling


