Linear algebra for machine learning

I’ve been reviewing linear algebra with the Coursera course Mathematics for Machine Learning: Linear Algebra. I finished the Week 2 module. The course has been easy to follow so far. Here I write down what I did in the Week 1 and Week 2 modules so that I remember it.

The three properties of dot product

Commutative

\[
r \cdot s = r_i s_i + r_j s_j \\
= 3 \times (-1) + 2 \times 2 = 1 \\
= s \cdot r
\]

Distributive

\[
r \cdot (s + t) = r \cdot s + r \cdot t
\] \[
r =
\begin{bmatrix}
r_1 \\
r_2 \\
\vdots \\
r_n \\
\end{bmatrix}
s =
\begin{bmatrix}
s_1 \\
s_2 \\
\vdots \\
s_n \\
\end{bmatrix}
t =
\begin{bmatrix}
t_1 \\
t_2 \\
\vdots \\
t_n \\
\end{bmatrix} \\
r \cdot (s + t) = r_1(s_1 + t_1) + r_2(s_2 + t_2) + \cdots + r_n (s_n + t_n) \\
= r_1s_1 + r_1t_1 + r_2s_2 + r_2t_2 + \cdots + r_ns_n + r_nt_n \\
= r \cdot s + r \cdot t
\]

Associative over scalar multiplication

\[
r \cdot (as) = a(r \cdot s) \\
r_i(as_i) + r_j(a s_j) = a(r_is_i + r_js_j)
\]

The dot product of r with itself is equal to the square of the size of r.

\[
r \cdot r = r_ir_i + r_jr_j \\
= r_i^2 + r_j^2 \\
r \cdot r = |r|^2
\]
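The properties above can be checked numerically. The following is a minimal sketch in plain Python, using the vectors r = [3, 2] and s = [-1, 2] from the commutative example. The third vector `t`, the scalar `a`, and the helper names `dot`, `add`, and `scale` are assumptions made only for this illustration.

```python
import math

def dot(r, s):
    """Dot product of two equal-length vectors given as lists."""
    return sum(ri * si for ri, si in zip(r, s))

def add(r, s):
    """Component-wise sum of two vectors."""
    return [ri + si for ri, si in zip(r, s)]

def scale(a, r):
    """Multiply every component of r by the scalar a."""
    return [a * ri for ri in r]

r = [3, 2]
s = [-1, 2]
t = [4, -5]  # hypothetical third vector, not from the course
a = 3        # hypothetical scalar

commutative = dot(r, s) == dot(s, r)
distributive = dot(r, add(s, t)) == dot(r, s) + dot(r, t)
associative = dot(r, scale(a, s)) == a * dot(r, s)
# |r| is computed independently with math.hypot, so compare with a tolerance
size_squared = math.isclose(dot(r, r), math.hypot(*r) ** 2)

print(commutative, distributive, associative, size_squared)
```

Integer components keep the first three comparisons exact. With floating-point components, `math.isclose` would be the safer comparison throughout.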

Cosine and dot product

Cosine rule

\[
c^2 = a^2 + b^2 - 2ab \cos\theta
\] \[
|r - s|^2 = |r|^2 + |s|^2 - 2|r||s|\cos\theta \\
(r-s) \cdot (r-s) = r \cdot r - s \cdot r - s \cdot r + s \cdot s \\
= |r|^2 - 2s \cdot r + |s|^2 \\
-2s \cdot r = -2|r||s|\cos\theta \\
2s \cdot r = 2|r||s|\cos\theta \\
r \cdot s = |r||s|\cos\theta
\]

The dot product takes the sizes of the two vectors and multiplies them by the cosine of the angle between them. It tells us the extent to which the two vectors point in the same direction.

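When two vectors point in the same direction, \(\cos 0^\circ = 1\) and \(r \cdot s = |r||s|\).
When two vectors are orthogonal to each other, \(\cos 90^\circ = 0\) and \(r \cdot s = |r||s| \times 0 = 0\).
When two vectors point in opposite directions, \(\cos 180^\circ = -1\) and \(r \cdot s = -|r||s|\).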
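Rearranging \(r \cdot s = |r||s|\cos\theta\) gives the angle between two vectors. Here is a small sketch of that in plain Python. The function names and the three test vectors are assumptions chosen to reproduce the three cases above, not something taken from the course.

```python
import math

def dot(r, s):
    """Dot product of two equal-length vectors given as lists."""
    return sum(ri * si for ri, si in zip(r, s))

def size(r):
    """Length of r, using r . r = |r|^2."""
    return math.sqrt(dot(r, r))

def angle_degrees(r, s):
    """Angle between r and s from r . s = |r||s| cos(theta)."""
    cos_theta = dot(r, s) / (size(r) * size(s))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

same = angle_degrees([1, 0], [2, 0])        # same direction
orthogonal = angle_degrees([1, 0], [0, 3])  # at right angles
opposite = angle_degrees([1, 0], [-2, 0])   # opposite directions

print(same, orthogonal, opposite)
```

The three results should come out at about 0, 90, and 180 degrees. The function assumes neither vector is the zero vector, since the angle is undefined in that case.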

Projection

Imagine a light shining down on s, perpendicular to r. The shadow that s casts on r is called the projection of s onto r.

\[
\cos \theta = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{\text{adjacent}}{|s|} \\
r \cdot s = |r| \underbrace{|s| \cos \theta}_{\text{adjacent}} = |r| \times \text{projection}
\]

Scalar projection

\[
\frac {r \cdot s}{|r|} = |s| \cos \theta
\]

Vector projection

The vector projection is the scalar projection multiplied by a unit vector in the direction of r, so it also encodes the direction of r.

\[
\frac {r \cdot s}{|r||r|}r = \frac {r \cdot s}{r \cdot r}r
\]
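The two formulas translate directly into code. Below is a minimal sketch in plain Python. The function names and the example vectors r = [3, 4] and s = [10, 5] are assumptions picked so the numbers come out round.

```python
import math

def dot(r, s):
    """Dot product of two equal-length vectors given as lists."""
    return sum(ri * si for ri, si in zip(r, s))

def scalar_projection(s, r):
    """Length of the shadow of s on r: (r . s) / |r|."""
    return dot(r, s) / math.sqrt(dot(r, r))

def vector_projection(s, r):
    """The shadow of s on r as a vector: ((r . s) / (r . r)) r."""
    k = dot(r, s) / dot(r, r)
    return [k * ri for ri in r]

r = [3, 4]   # hypothetical vector to project onto, |r| = 5
s = [10, 5]  # hypothetical vector being projected

length = scalar_projection(s, r)  # 50 / 5
shadow = vector_projection(s, r)  # (50 / 25) * r

print(length, shadow)
```

For these inputs the scalar projection is 10 and the vector projection is [6, 8], which has length 10 and points along r, as expected.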

Changing Basis

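To change basis using projections like this, the new basis vectors must be orthogonal to each other.

Convert \(r_e = \begin{bmatrix}3\\4\end{bmatrix}\) from the e set of basis vectors to the b set of basis vectors, \(b_1 = \begin{bmatrix}2\\1\end{bmatrix}\) and \(b_2 = \begin{bmatrix}-2\\4\end{bmatrix}\).

The projection of \(r_e\) onto \(b_1\) is 2 times \(b_1\).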

\[
\frac {r_e \cdot b_1}{|b_1|^2} = \frac {3 \times 2 + 4 \times 1}{2^2 + 1^2} = \frac {10}{5} = 2
\] \[
\frac {r_e \cdot b_1}{|b_1|^2} b_1 = 2 \begin{bmatrix}2\\1 \end{bmatrix} = \begin{bmatrix}4\\2 \end{bmatrix}
\]

The projection of \(r_e\) onto \(b_2\) is \(\frac{1}{2}\) times \(b_2\).

\[
\frac {r_e \cdot b_2}{|b_2|^2} = \frac {3 \times (-2) + 4 \times 4}{(-2)^2 + 4^2} = \frac {10}{20} = \frac {1}{2}
\] \[
\frac {r_e \cdot b_2}{|b_2|^2} b_2 = \frac {1}{2} \begin{bmatrix}-2\\4 \end{bmatrix} = \begin{bmatrix}-1\\2 \end{bmatrix}
\]

Adding the two vector projections gives back the original vector \(r_e\).

\[
\begin{bmatrix}4\\2\end{bmatrix} + \begin{bmatrix}-1\\2\end{bmatrix} = \begin{bmatrix}3\\4\end{bmatrix}
\]

In the basis b, it’s going to be
\[
r_b =
\begin{bmatrix}
2 \\
\frac{1}{2} \\
\end{bmatrix}
\]

We can redescribe a vector given in the original axes using some other axes, that is, some other basis vectors. The basis vectors are what we use to describe the space the data lives in.
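The worked example in this section can be reproduced in a few lines. This is a sketch in plain Python using the same numbers, \(r_e = (3, 4)\), \(b_1 = (2, 1)\), and \(b_2 = (-2, 4)\). The function name `change_basis` is an assumption, and the method only applies when the new basis vectors are orthogonal.

```python
def dot(r, s):
    """Dot product of two equal-length vectors given as lists."""
    return sum(ri * si for ri, si in zip(r, s))

def change_basis(r_e, basis):
    """Coordinates of r_e in an orthogonal basis, one projection per basis vector."""
    return [dot(r_e, b) / dot(b, b) for b in basis]

r_e = [3, 4]
b_1 = [2, 1]
b_2 = [-2, 4]

# The projection method relies on the new basis vectors being orthogonal
orthogonal = dot(b_1, b_2) == 0

r_b = change_basis(r_e, [b_1, b_2])

# Rebuild r_e by adding the two vector projections back together
rebuilt = [r_b[0] * b_1[i] + r_b[1] * b_2[i] for i in range(len(r_e))]

print(orthogonal, r_b, rebuilt)
```

The coordinates come out as 2 and 1/2, and adding 2 times \(b_1\) to 1/2 times \(b_2\) returns (3, 4), matching the calculation by hand.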

Basis, vector space, and linear independence

A basis is a set of n vectors that:

  • are not linear combinations of each other (linearly independent)
  • span the space

The space is then n-dimensional.
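For two vectors in the plane, linear independence can be checked with a single number. The sketch below is plain Python, and the test it uses, \(b_{1,1} b_{2,2} - b_{1,2} b_{2,1} \neq 0\), is a standard fact for 2D vectors rather than something taken from these notes. The function name `is_basis_2d` and the second pair of vectors are assumptions for illustration.

```python
def is_basis_2d(b_1, b_2):
    """True if two 2D vectors are linearly independent, so they span the plane."""
    # The expression is zero exactly when one vector is a multiple of the other
    return b_1[0] * b_2[1] - b_1[1] * b_2[0] != 0

independent = is_basis_2d([2, 1], [-2, 4])  # the b_1 and b_2 used earlier
dependent = is_basis_2d([2, 1], [4, 2])     # second vector is 2 times the first

print(independent, dependent)
```

The first pair forms a basis of the plane and the second does not, because (4, 2) lies along the same line as (2, 1). Note that a basis does not have to be orthogonal. Orthogonality is only needed for the projection method of changing basis.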

Applications of changing basis

We get the minimum possible number for the noisiness.
