Since linear algebra is used throughout machine learning, here are the concepts that I think are important.

  1. Dot Product
    • the dot product of two vectors can be seen as a linear transformation that projects one vector onto the 1D line defined by the other (scaled by the length of the other)
    • \vec{v} \cdot \vec{u} = |\vec{v}||\vec{u}|\cos\theta
    • \vec{v} \cdot \vec{u} = \vec{u} \cdot \vec{v}
    • c(\vec{v}\cdot\vec{u}) = (c\vec{v})\cdot\vec{u}
    • (c\vec{v} + d\vec{u})\cdot\vec{w} = (c\vec{v})\cdot\vec{w} + (d\vec{u})\cdot\vec{w}
    • |\vec{v}|^2 =\vec{v}\cdot\vec{v} = \vec{v}^T\vec{v} =\langle\vec{v},\vec{v}\rangle
    • \vec{v} \cdot \vec{u} = 0 if and only if \vec{v} and \vec{u} are orthogonal to each other
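The dot-product identities above can be checked numerically. This is a minimal NumPy sketch; the vectors `v`, `u`, `w` and the scalars `c`, `d` are arbitrary values picked for illustration, not anything from the list itself.

```python
import numpy as np

v = np.array([3.0, 4.0])
u = np.array([4.0, -3.0])   # chosen so that u is orthogonal to v
w = np.array([1.0, 2.0])
c, d = 2.0, -0.5

# Geometric form: v . w = |v||w| cos(theta)
cos_theta = (v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.arccos(cos_theta)
geometric = np.linalg.norm(v) * np.linalg.norm(w) * np.cos(theta)

# Projection view: the signed length of v along the unit vector w/|w|;
# multiplying it by |w| recovers v . w
proj_length = v @ (w / np.linalg.norm(w))

symmetric = np.isclose(v @ u, u @ v)
linear = np.isclose((c * v + d * u) @ w, c * (v @ w) + d * (u @ w))
norm_squared = v @ v                  # |v|^2
orthogonal = np.isclose(v @ u, 0.0)   # True for this pair

print(geometric, proj_length * np.linalg.norm(w), v @ w)
print(symmetric, linear, norm_squared, orthogonal)
```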
  2. Matrix Multiplication
    • AB\not= BA in general
    • A(B+C) = AB + AC
    • (A+B)C = AC + BC
    • c(AB) = (cA)B = A(cB)
    • A(BC) = (AB)C
    • AI = A = IA
    • AA^{-1} = I = A^{-1}A
    • (AB)^{-1} = B^{-1}A^{-1}
    • det(AB) = det(A) det(B)
    • A [\vec{v}_1\:\vec{v}_2\:\vec{v}_3] = [A\vec{v}_1\:A\vec{v}_2\:A\vec{v}_3]
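A few of the matrix-multiplication rules above, checked on small examples. This is only a sketch: the matrices `A`, `B`, `C` and `V` are made up for illustration, and `A` and `B` were chosen to be invertible so that the inverse rule applies.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[0.0, 1.0], [4.0, 2.0]])
C = np.array([[1.0, -1.0], [2.0, 5.0]])

commutes = np.allclose(A @ B, B @ A)                 # False for this pair
associative = np.allclose(A @ (B @ C), (A @ B) @ C)
inverse_of_product = np.allclose(
    np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)
)
det_of_product = np.isclose(
    np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)
)

# Column view: multiplying A by a matrix applies A to each column separately
V = np.column_stack([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
columnwise = np.column_stack([A @ V[:, j] for j in range(V.shape[1])])

print(commutes, associative, inverse_of_product, det_of_product)
print(np.allclose(A @ V, columnwise))
```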
  3. Transpose
    • (A^T)^T = A
    • (A+B)^T = A^T + B^T
    • (cA)^T = cA^T
    • (AB)^T = B^TA^T
    • (A^{-1})^T = (A^T)^{-1}
  4. Vector Differentiation
    • \frac{d(x^TAx)}{d(x)} = x^T(A^T + A)
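One way to sanity-check the derivative of the quadratic form is to compare it with central finite differences. The sketch below assumes a random, deliberately non-symmetric `A`; the helper `f`, the step size `eps`, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # not symmetric, so A^T + A != 2A
x = rng.standard_normal(n)

def f(x):
    """Quadratic form x^T A x."""
    return x @ A @ x

# Analytic derivative, written as the row vector x^T (A^T + A)
analytic = x @ (A.T + A)

# Central finite differences along each coordinate direction
eps = 1e-6
numeric = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.allclose(analytic, numeric, atol=1e-6))
```

When A is symmetric the expression simplifies to 2x^TA, which is the form that usually shows up in least-squares derivations.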
  5. Determinants
    • the determinant of two 2D vectors is the signed area of the parallelogram formed by these two vectors (\det(\vec{a},\vec{b}) = \left| \begin{array}{cc} a_1 & a_2 \\ b_1 & b_2 \end{array} \right| )
    • the determinant of three 3D vectors is the signed volume of the parallelepiped formed by these three vectors
    • if the determinant of a matrix A is 0, then A is singular. Below are some more properties of the determinant of a matrix:
    • |aA| = a^d|A| for a scalar a and A\in\mathbb{R}^{d\times d}
    • |AB| = |A||B|
    • |A| = |A^T|
    • |A| = \frac{1}{|A^{-1}|} when A is invertible
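The determinant facts above can be illustrated with a short NumPy sketch. All matrices and the scalar `s` are made-up examples; `n` is the size of the square matrix, which is the exponent in the scaling rule.

```python
import numpy as np

# Signed area of the parallelogram spanned by two 2D vectors
a = np.array([3.0, 0.0])
b = np.array([1.0, 2.0])
signed_area = np.linalg.det(np.array([a, b]))    # positive orientation
flipped_area = np.linalg.det(np.array([b, a]))   # swapping the vectors flips the sign

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0],
              [3.0, 0.0, 1.0]])
n = A.shape[0]
s = 2.0

scaling = np.isclose(np.linalg.det(s * A), s**n * np.linalg.det(A))
product = np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
transpose = np.isclose(np.linalg.det(A), np.linalg.det(A.T))
inverse = np.isclose(np.linalg.det(A), 1.0 / np.linalg.det(np.linalg.inv(A)))

# A matrix with linearly dependent rows has determinant 0 (it is singular)
S = np.array([[1.0, 2.0], [2.0, 4.0]])
singular = np.isclose(np.linalg.det(S), 0.0)

print(signed_area, flipped_area)
print(scaling, product, transpose, inverse, singular)
```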
  6. Rank of a matrix (for a real matrix A\in\mathbb{R}^{m\times n})
    • columnrank(A) = rowrank(A) = rank(A)
    • rank(A) = rank(A^T)
    • rank(A) \le \min(m,n)
    • rank(AB) \le \min(rank(A), rank(B))
    • rank(A+B) \le rank(A) + rank(B)
    • rank(A^TA) = rank(AA^T) = rank(A)
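The rank properties above can be checked with `np.linalg.matrix_rank`, which estimates rank from singular values. The matrices here are hand-picked for illustration: `A` is a 5×3 matrix whose third column is the sum of the first two, so its rank is 2.

```python
import numpy as np

rank = np.linalg.matrix_rank

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])       # 5x3, rank 2
B = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])  # 3x4, rank 3
C = np.vstack([np.eye(3), np.zeros((2, 3))])  # 5x3, rank 3

m, n = A.shape
print(rank(A), rank(A.T))                       # row rank equals column rank
print(rank(A) <= min(m, n))
print(rank(A @ B) <= min(rank(A), rank(B)))
print(rank(A + C) <= rank(A) + rank(C))
print(rank(A.T @ A), rank(A @ A.T), rank(A))    # all equal
```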
  7. Symmetric Matrices
    • A matrix A is symmetric iff A=A^T
    • A symmetric matrix A is positive semidefinite if \vec{v}^TA\vec{v} \ge 0 for all vectors \vec{v}
      • A=U^TU is positive semidefinite
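To see why U^TU is positive semidefinite, note that \vec{v}^TU^TU\vec{v} = \|U\vec{v}\|^2 \ge 0. The sketch below checks this numerically; the random `U`, the seed, and the small negative tolerance (to absorb floating-point rounding) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
U = rng.standard_normal((3, 4))
A = U.T @ U                        # 4x4 and symmetric by construction

is_symmetric = np.allclose(A, A.T)

# For a symmetric matrix, positive semidefiniteness is equivalent to
# all eigenvalues being >= 0; eigvalsh returns them in ascending order.
eigenvalues = np.linalg.eigvalsh(A)
min_eig = eigenvalues[0]           # close to 0 here, since rank(A) <= 3 < 4

# Direct check of the definition on random test vectors
quad_forms = np.array([v @ A @ v for v in rng.standard_normal((1000, 4))])

tol = -1e-10                       # allow for rounding error
print(is_symmetric, min_eig >= tol, quad_forms.min() >= tol)
```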
  8. Trace (for a square matrix)
    • tr(A) = \sum\limits_{i=1}^nA_{ii} where A\in\mathbb{R}^{n\times n}
    • tr(A) = tr(A^T)
    • tr(A+B) = tr(A) + tr(B)
    • tr(cA) = c\,tr(A)
    • tr(AB) = tr(BA) = tr(B^TA^T) = tr(A^TB^T)
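A quick numerical check of the trace identities above, using two arbitrary 2×2 matrices and a scalar `c` chosen for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [5.0, -2.0]])
c = 3.0

trace_A = np.trace(A)   # sum of the diagonal entries
transpose_ok = np.isclose(np.trace(A), np.trace(A.T))
additive_ok = np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))
scalar_ok = np.isclose(np.trace(c * A), c * np.trace(A))

# tr(AB) = tr(BA) even though AB != BA for this pair
cyclic = [np.trace(A @ B), np.trace(B @ A),
          np.trace(B.T @ A.T), np.trace(A.T @ B.T)]

print(trace_A, transpose_ok, additive_ok, scalar_ok)
print(cyclic, np.allclose(A @ B, B @ A))
```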
  9. Linear Transformation
  10. Invertibility of a matrix
    • A square matrix is invertible iff it has full rank
  11. Orthogonal Matrix
    • “an orthogonal matrix is a square matrix with real entries whose columns and rows are orthogonal unit vectors” (from Wikipedia)
    • “The rows of an orthogonal matrix are an orthonormal basis. That is, each row has length one, and are mutually perpendicular. Similarly, the columns are also an orthonormal basis. In fact, given any orthonormal basis, the matrix whose rows are that basis is an orthogonal matrix. It is automatically the case that the columns are another orthonormal basis.” (from Wolfram MathWorld)
    • Let A and B be orthogonal matrices:
      • A^TA = I = AA^T
      • \|A\vec{v}\|_2 = \|\vec{v}\|_2
      • \langle \vec{v},\vec{u} \rangle = \langle A\vec{v},A\vec{u} \rangle
      • \det(A)=\pm 1 (when it equals +1, A is a rotation matrix; otherwise A is a reflection, possibly combined with a rotation)
      • AB and A^{-1} are both orthogonal matrices
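The orthogonal-matrix properties can be illustrated with a 2D rotation and a reflection. This is a sketch with hand-picked examples: `R` rotates by 30 degrees, `F` reflects across the x-axis, and `v`, `u` are arbitrary vectors.

```python
import numpy as np

theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation, det = +1
F = np.array([[1.0,  0.0],
              [0.0, -1.0]])                        # reflection, det = -1
v = np.array([3.0, 4.0])
u = np.array([-1.0, 2.0])

for Q in (R, F):
    # Q^T Q = I = Q Q^T
    assert np.allclose(Q.T @ Q, np.eye(2)) and np.allclose(Q @ Q.T, np.eye(2))
    # Lengths and inner products are preserved
    assert np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v))
    assert np.isclose((Q @ v) @ (Q @ u), v @ u)
    # The inverse is the transpose, and it is orthogonal as well
    Q_inv = np.linalg.inv(Q)
    assert np.allclose(Q_inv, Q.T)
    assert np.allclose(Q_inv.T @ Q_inv, np.eye(2))

det_R = np.linalg.det(R)
det_F = np.linalg.det(F)
print(det_R, det_F)
```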
  12. Eigenvectors and Eigenvalues
    1. when a transformation only scales or reverses a nonzero vector, without otherwise changing its direction, we say the vector is an eigenvector of the transformation
    2. A\vec{v} = \lambda\vec{v}, where \lambda is the eigenvalue associated with the eigenvector \vec{v}
    3. a nonzero \vec{v} satisfying this equation exists only when \lambda I - A is singular, so the eigenvalues are the solutions of \det(\lambda I - A) = 0; the eigenvectors for each \lambda are then the nonzero solutions of (\lambda I - A)\vec{v} = 0
    4. eigenvalue decomposition
      •  Definition: “Let P be a matrix of eigenvectors of a given square matrix A and D be a diagonal matrix with the corresponding eigenvalues on the diagonal. Then, as long as P is a square matrix, A can be written as an eigen decomposition A = PDP^{-1}. Furthermore, if A is symmetric, then the columns of P are orthogonal vectors.” (from Wolfram MathWorld)
    5. Properties
      • tr(A) = \sum\limits_{i=1}^n\lambda_i
      • |A| = \prod\limits_{i=1}^n\lambda_i
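The eigenvalue facts above, checked on a small symmetric matrix with `np.linalg.eig`. The matrix `A` is an illustrative choice (its eigenvalues are 1 and 3); note that NumPy does not guarantee the order in which eigenvalues are returned.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # symmetric; eigenvalues are 1 and 3

eigenvalues, P = np.linalg.eig(A)     # columns of P are the eigenvectors
D = np.diag(eigenvalues)

# A v = lambda v for every column; P * eigenvalues scales column j by lambda_j
residuals = A @ P - P * eigenvalues

# Each eigenvalue is a root of det(lambda I - A) = 0
char_poly_values = [np.linalg.det(lam * np.eye(2) - A) for lam in eigenvalues]

# Eigendecomposition A = P D P^{-1}
reconstructed = P @ D @ np.linalg.inv(P)

trace_ok = np.isclose(np.trace(A), eigenvalues.sum())
det_ok = np.isclose(np.linalg.det(A), eigenvalues.prod())
# A is symmetric, so eigenvectors for distinct eigenvalues are orthogonal
columns_orthogonal = np.isclose(P[:, 0] @ P[:, 1], 0.0)

print(sorted(eigenvalues), trace_ok, det_ok, columns_orthogonal)
print(np.allclose(reconstructed, A))
```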
  13. Singular Value Decomposition
