Linear Algebra II: Matrix Operations for Machine Learning (Lecture notes, Linear Algebra)

A comprehensive introduction to matrix operations in linear algebra, focusing on their applications in machine learning. It covers key concepts such as matrix multiplication, inversion, eigendecomposition, and singular value decomposition (SVD), with code examples in Python using NumPy and PyTorch that illustrate the practical implementation of these operations. It also explores the use of SVD for image compression and demonstrates the relationship between SVD and eigendecomposition.


2-linear-algebra-ii

June 23, 2024

1 Linear Algebra II: Matrix Operations

This topic, Linear Algebra II: Matrix Operations, builds on the basics of linear algebra. It is essential because these intermediate-level manipulations of tensors lie at the heart of most machine learning approaches and are especially predominant in deep learning.

Through the measured exposition of theory paired with interactive examples, you’ll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces as well as to reduce the dimensionality of complex spaces. The content covered in this topic is itself foundational for several other topics in the Machine Learning Foundations series, especially Probability & Information Theory and Optimization.

Over the course of studying this topic, you’ll:

  • Develop a geometric intuition of what’s going on beneath the hood of machine learning algorithms, including those used for deep learning.
  • Be able to more intimately grasp the details of machine learning papers as well as all of the other subjects that underlie ML, including calculus, statistics, and optimization algorithms.
  • Reduce the dimensionality of complex spaces down to their most informative elements with techniques such as eigendecomposition, singular value decomposition, and principal component analysis.

Note that this Jupyter notebook is not intended to stand alone. It is the companion code to a lecture or to videos from Jon Krohn’s Machine Learning Foundations series, which offer detail on the following:

Segment 1: Review of Introductory Linear Algebra

  • Modern Linear Algebra Applications
  • Tensors, Vectors, and Norms
  • Matrix Multiplication
  • Matrix Inversion
  • Identity, Diagonal and Orthogonal Matrices

Segment 2: Eigendecomposition

  • Affine Transformation via Matrix Application
  • Eigenvectors and Eigenvalues
  • Matrix Determinants
  • Matrix Decomposition
  • Applications of Eigendecomposition

Segment 3: Matrix Operations for Machine Learning

  • Singular Value Decomposition (SVD)
  • The Moore-Penrose Pseudoinverse
  • The Trace Operator
  • Principal Component Analysis (PCA): A Simple Machine Learning Algorithm
  • Resources for Further Study of Linear Algebra

1.1 Segment 1: Review of Introductory Linear Algebra

: import numpy as np
  import torch

1.1.1 Vector Transposition

: x = np.array([25, 2, 5])
  x

: array([25, 2, 5])

: x = np.array([[25, 2, 5]])
  x

: array([[25, 2, 5]])

: x.shape

: (1, 3)

: x.T

: array([[25], [ 2], [ 5]])

: x.T.shape

: (3, 1)

: x_p = torch.tensor([25, 2, 5])
  x_p

: tensor([25, 2, 5])

: X = np.array([[25, 2], [5, 26], [3, 7]])
  X

: array([[25, 2], [ 5, 26], [ 3, 7]])

: X.shape

: (3, 2)

: X_p = torch.tensor([[25, 2], [5, 26], [3, 7]])
  X_p

: tensor([[25, 2], [ 5, 26], [ 3, 7]])

: X_p.shape

: torch.Size([3, 2])

1.2.2 Matrix Transposition

: X

: array([[25, 2], [ 5, 26], [ 3, 7]])

: X.T

: array([[25, 5, 3], [ 2, 26, 7]])

: X_p.T

: tensor([[25, 5, 3], [ 2, 26, 7]])

Return to slides here.

1.2.3 Matrix Multiplication

Scalars are applied to each element of a matrix:

: X * 3

: array([[75, 6], [15, 78], [ 9, 21]])

: X * 3 + 3

: array([[78, 9], [18, 81], [12, 24]])

: X_p * 3

: tensor([[75, 6], [15, 78], [ 9, 21]])

: X_p * 3 + 3

: tensor([[78, 9], [18, 81], [12, 24]])

Using the multiplication operator on two tensors of the same size in PyTorch (or NumPy or TensorFlow) applies element-wise operations. This is the Hadamard product (denoted by the ⊙ operator, e.g., 𝐴 ⊙ 𝐵), not matrix multiplication:

: A = np.array([[3, 4], [5, 6], [7, 8]])
  A

: array([[3, 4], [5, 6], [7, 8]])

: array([[25, 2], [ 5, 26], [ 3, 7]])

: X * A

: array([[ 75, 8], [ 25, 156], [ 21, 56]])

: A_p = torch.tensor([[3, 4], [5, 6], [7, 8]])
  A_p

: tensor([[3, 4], [5, 6], [7, 8]])

: X_p * A_p

: tensor([[ 75,   8], [ 25, 156], [ 21,  56]])
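
The preview skips the cells that demonstrate true matrix multiplication, so here is a minimal sketch (not from the original notebook) contrasting the Hadamard product with the matrix product, reusing X and A from above:

: # Sketch (added for illustration): X and A are both (3, 2), so for a true
  # matrix product we multiply X by A's transpose, giving a (3, 3) result.
  X * A                                       # Hadamard (element-wise) product, shape (3, 2)
  np.matmul(X, A.T)                           # matrix product, shape (3, 3)
  X @ A.T                                     # @ is shorthand for np.matmul()
  torch.matmul(X_p.float(), A_p.T.float())    # PyTorch equivalent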

1.2.4 Matrix Inversion

: X = np.array([[4, 2], [-5, -3]])
  X

: array([[ 4, 2], [-5, -3]])

: Xinv = np.linalg.inv(X)
  Xinv

: array([[ 1.5, 1. ], [-2.5, -2. ]])

: y = np.array([4, -7])
  y

: array([ 4, -7])

: w = np.dot(Xinv, y)
  w

: array([-1., 4.])

Show that 𝑦 = 𝑋𝑤:

: np.dot(X, w)

: array([ 4., -7.])

: X_p = torch.tensor([[4, 2], [-5, -3.]])  # note that torch.inverse() requires floats
  X_p

: tensor([[ 4., 2.], [-5., -3.]])

: Xinv_p = torch.inverse(X_p)
  Xinv_p

: tensor([[ 1.5000, 1.0000], [-2.5000, -2.0000]])

: y_p = torch.tensor([4, -7.])
  y_p

: tensor([ 4., -7.])

: w_p = torch.matmul(Xinv_p, y_p)
  w_p

: tensor([-1., 4.])

: torch.matmul(X_p, w_p)

: tensor([ 4., -7.])
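
As an aside that is not in the original notebook: for solving 𝑋𝑤 = 𝑦 it is generally preferable to call a solver rather than form the inverse explicitly; a minimal sketch with the same X and y:

: # Sketch (added for illustration): solve Xw = y directly; this avoids
  # explicitly computing the inverse and is more numerically stable.
  np.linalg.solve(X, y)          # array([-1., 4.]), matching w above
  torch.linalg.solve(X_p, y_p)   # tensor([-1., 4.]), matching w_p above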

Return to slides here.

1.3 Segment 2: Eigendecomposition

1.3.1 Affine Transformation via Matrix Application

Let’s say we have a vector 𝑣:

: v = np.array([3, 1])
  v

: array([3, 1])

Let’s plot 𝑣 using my plot_vectors() function (which is based on Hadrien Jean’s plotVectors() function from this notebook, under MIT license).

: import matplotlib.pyplot as plt

: def plot_vectors(vectors, colors):
      """
      Plot one or more vectors in a 2D plane, specifying a color for each.

      Arguments
      ---------
      vectors: list of lists or of arrays
          Coordinates of the vectors to plot. For example, [[1, 3], [2, 2]]
          contains two vectors to plot, [1, 3] and [2, 2].
      colors: list
          Colors of the vectors. For instance: ['red', 'blue'] will display the
          first vector in red and the second in blue.

      Example
      -------
      plot_vectors([[1, 3], [2, 2]], ['red', 'blue'])
      plt.xlim(-1, 4)
      plt.ylim(-1, 4)
      """
      plt.figure()
      # The rest of this cell is cut off in the preview; a typical body (after
      # Hadrien Jean's plotVectors()) draws each vector as an arrow from the
      # origin with plt.quiver():
      plt.axvline(x=0, color='lightgray', zorder=0)
      plt.axhline(y=0, color='lightgray', zorder=0)
      for i in range(len(vectors)):
          x = np.concatenate([[0, 0], vectors[i]])
          plt.quiver([x[0]], [x[1]], [x[2]], [x[3]],
                     angles='xy', scale_units='xy', scale=1, color=colors[i])

: I = np.array([[1, 0], [0, 1]])  # the identity matrix leaves any vector unchanged
  Iv = np.dot(I, v)
  Iv

: array([3, 1])

: v == Iv

: array([ True, True])

: plot_vectors([Iv], ['blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)
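
As a small aside not in the preview, both libraries provide builders for identity matrices, so 𝐼 need not be typed out by hand:

: # Sketch (added for illustration): construct identity matrices directly.
  np.eye(2)       # array([[1., 0.], [0., 1.]])
  torch.eye(2)    # tensor([[1., 0.], [0., 1.]])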

In contrast, consider this matrix (let’s call it 𝐸) that flips vectors over the 𝑥-axis:

: E = np.array([[1, 0], [0, -1]])
  E

: array([[ 1, 0], [ 0, -1]])

: Ev = np.dot(E, v)
  Ev

: array([ 3, -1])

: plot_vectors([v, Ev], ['lightblue', 'blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-3, 3)

Or, this matrix, 𝐹 , which flips vectors over the 𝑦-axis:

: F = np.array([[-1, 0], [0, 1]])
  F

: array([[-1, 0], [ 0, 1]])

: Fv = np.dot(F, v)
  Fv

: array([-3, 1])

: plot_vectors([v, Fv], ['lightblue', 'blue'])
  plt.xlim(-4, 4)
  _ = plt.ylim(-1, 5)
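
Another classic transformation by a single matrix, shown here as an illustrative sketch that is not part of the original preview, is rotation; this matrix R rotates 𝑣 by 90° counter-clockwise:

: # Sketch (added for illustration): 90-degree counter-clockwise rotation.
  # R maps v = [3, 1] to [-1, 3].
  R = np.array([[0, -1], [1, 0]])
  Rv = np.dot(R, v)
  plot_vectors([v, Rv], ['lightblue', 'blue'])
  plt.xlim(-4, 4)
  _ = plt.ylim(-1, 5)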

: A = np.array([[-1, 4], [2, -2]])
  plot_vectors([v, np.dot(A, v)], ['lightblue', 'blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)

: # Another example of applying A:
  v2 = np.array([2, 1])
  plot_vectors([v2, np.dot(A, v2)], ['lightgreen', 'green'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)

We can concatenate several vectors together into a matrix (say, 𝑉 ), where each column is a separate vector. Then, whatever linear transformations we apply to 𝑉 will be independently applied to each column (vector):

: v

: array([3, 1])

: # recall that we need to convert the array to 2D to transpose it into a column, e.g.:
  np.matrix(v).T

: matrix([[3], [1]])

: v3 = np.array([-3, -1])  # mirror image of v over both axes
  v4 = np.array([-1, 1])

: V = np.concatenate((np.matrix(v).T, np.matrix(v2).T, np.matrix(v3).T, np.matrix(v4).T), axis=1)
  V

: matrix([[ 3, 2, -3, -1], [ 1, 1, -1, 1]])
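
The preview omits the cell that actually applies a transformation to 𝑉; as a minimal sketch of the claim above (assuming the same matrix A = [[-1, 4], [2, -2]] used in the surrounding examples):

: # Sketch (added for illustration): applying A to V transforms every column
  # (vector) independently.
  np.dot(A, V)   # matrix([[ 1, 2, -1, 5], [ 4, 2, -4, -4]])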

Now that we can appreciate the linear transformation of vectors by matrices, let’s move on to working with eigenvectors and eigenvalues…

Return to slides here.

1.3.2 Eigenvectors and Eigenvalues

An eigenvector (eigen is German for “typical”; we could translate eigenvector to “characteristic vector”) is a special vector 𝑣 such that when it is transformed by some matrix (let’s say 𝐴), the product 𝐴𝑣 has the exact same direction as 𝑣. An eigenvalue is a scalar (traditionally represented as 𝜆) that simply scales the eigenvector 𝑣 such that the following equation is satisfied:

𝐴𝑣 = 𝜆𝑣

The easiest way to understand this is to work through an example:

: A

: array([[-1, 4], [ 2, -2]])

Eigenvectors and eigenvalues can be derived algebraically (e.g., with the QR algorithm, which was independently developed in the 1950s by both Vera Kublanovskaya and John Francis); however, this is outside the scope of the ML Foundations series. We’ll cheat by using NumPy’s eig() method, which returns a tuple of:

  • a vector of eigenvalues
  • a matrix of eigenvectors

: lambdas, V = np.linalg.eig(A)

The matrix contains as many eigenvectors as there are columns of A:

: V  # each column is a separate eigenvector

: array([[ 0.86011126, -0.76454754], [ 0.51010647, 0.64456735]])

With a corresponding eigenvalue for each eigenvector:

: lambdas

: array([ 1.37228132, -4.37228132])

Let’s confirm that 𝐴𝑣 = 𝜆𝑣 for the first eigenvector:

: v = V[:,0]
  v

: array([0.86011126, 0.51010647])

: lambduh = lambdas[0]  # note that "lambda" is a reserved word in Python
  lambduh

: Av = np.dot(A, v)
  Av

: array([1.18031462, 0.70000958])

: lambduh * v

: array([1.18031462, 0.70000958])

: plot_vectors([Av, v], ['blue', 'lightblue'])
  plt.xlim(-1, 2)
  _ = plt.ylim(-1, 2)
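
As an optional sanity check that is not in the original preview, np.allclose() confirms 𝐴𝑣 = 𝜆𝑣 for every eigenpair at once:

: # Sketch (added for illustration): A @ V scales each eigenvector column by
  # its eigenvalue, so A @ V equals V @ diag(lambdas).
  np.allclose(A @ V, V @ np.diag(lambdas))   # True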

Using the PyTorch eig() method, we can do exactly the same:

: A

: array([[-1, 4], [ 2, -2]])

: A_p = torch.tensor([[-1, 4], [2, -2.]])  # must be float for PyTorch eig()
  A_p

: tensor([[-1., 4.], [ 2., -2.]])

: lambdas_cplx, V_cplx = torch.linalg.eig(A_p)  # outputs complex numbers because real matrices can have complex eigenvectors

: V_cplx # complex-typed values with "0.j" imaginary part are in fact real numbers

: tensor([[ 0.8601+0.j, -0.7645+0.j], [ 0.5101+0.j, 0.6446+0.j]])

: V_p = V_cplx.float()
  V_p

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: UserWarning: Casting
complex values to real discards the imaginary part (Triggered internally at
/pytorch/aten/src/ATen/native/Copy.cpp:240.)
  """Entry point for launching an IPython kernel.

: tensor([[ 0.8601, -0.7645], [ 0.5101, 0.6446]])

: v_p = V_p[:,0]
  v_p

: tensor([0.8601, 0.5101])

: lambdas_cplx

: tensor([ 1.3723+0.j, -4.3723+0.j])

: lambdas_p = lambdas_cplx.float()
  lambdas_p

: tensor([ 1.3723, -4.3723])

: lambda_p = lambdas_p[0]
  lambda_p

: Av_p = torch.matmul(A_p, v_p)  # matmul() expects float-typed tensors
  Av_p

: tensor([1.1803, 0.7000])

: lambda_p * v_p

: tensor([1.1803, 0.7000])

: v2_p = V_p[:,1]
  v2_p

: tensor([-0.7645, 0.6446])

: lambda2_p = lambdas_p[1]
  lambda2_p

: Av2_p = torch.matmul(A_p.float(), v2_p.float())
  Av2_p

: tensor([ 3.3428, -2.8182])
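
The preview ends here; to close the loop the same way as for the first eigenvector, this sketch (not part of the preview) confirms that the second eigenvalue scales its eigenvector to match Av2_p:

: # Sketch (added for illustration): lambda2_p * v2_p should reproduce Av2_p.
  lambda2_p * v2_p                          # tensor([ 3.3428, -2.8182])
  torch.allclose(Av2_p, lambda2_p * v2_p)   # True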