








































These lecture notes provide a comprehensive introduction to matrix operations in linear algebra, focusing on their applications in machine learning. They cover key concepts such as matrix multiplication, inversion, eigendecomposition, and singular value decomposition (SVD), with code examples in Python using NumPy and PyTorch that illustrate the practical implementation of these operations. They also explore the use of SVD for image compression and demonstrate the relationship between SVD and eigendecomposition.
This topic, Linear Algebra II: Matrix Operations , builds on the basics of linear algebra. It is essential because these intermediate-level manipulations of tensors lie at the heart of most machine learning approaches and are especially predominant in deep learning.
Through the measured exposition of theory paired with interactive examples, you’ll develop an understanding of how linear algebra is used to solve for unknown values in high-dimensional spaces as well as to reduce the dimensionality of complex spaces. The content covered in this topic is itself foundational for several other topics in the Machine Learning Foundations series, especially Probability & Information Theory and Optimization.
Over the course of studying this topic, you’ll:
Note that this Jupyter notebook is not intended to stand alone. It is the companion code to a lecture or to videos from Jon Krohn’s Machine Learning Foundations series, which offer detail on the following:
Segment 1: Review of Introductory Linear Algebra
Segment 2: Eigendecomposition
Segment 3: Matrix Operations for Machine Learning
: import numpy as np
  import torch
1.1.1 Vector Transposition
: x = np.array([[25, 2, 5]])
  x
: array([[25,  2,  5]])

: x.T

: array([[25],
         [ 2],
         [ 5]])
: x_p = torch.tensor([25, 2, 5])
  x_p
: X_p = torch.tensor([[25, 2], [5, 26], [3, 7]])
  X_p
: tensor([[25, 2], [ 5, 26], [ 3, 7]])
1.2.2 Matrix Transposition
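The NumPy definition of the matrix X does not appear in this excerpt; a minimal reconstruction follows, with its values inferred from the output shown below:

: X = np.array([[25, 2], [5, 26], [3, 7]])
  X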
: array([[25, 2], [ 5, 26], [ 3, 7]])
: X.T

: array([[25,  5,  3],
         [ 2, 26,  7]])
: X_p.T

: tensor([[25,  5,  3],
          [ 2, 26,  7]])
Return to slides here.
1.2.3 Matrix Multiplication
Scalars are applied to each element of a matrix:
: X * 3

: array([[75,  6],
         [15, 78],
         [ 9, 21]])

: X * 3 + 3

: array([[78,  9],
         [18, 81],
         [12, 24]])

: X_p * 3

: tensor([[75,  6],
          [15, 78],
          [ 9, 21]])

: X_p * 3 + 3

: tensor([[78,  9],
          [18, 81],
          [12, 24]])
Using the multiplication operator on two tensors of the same size in PyTorch (or NumPy or TensorFlow) applies element-wise operations. This is the Hadamard product (denoted by the ⊙ operator, e.g., 𝐴 ⊙ 𝐵), not matrix multiplication:
: A = np.array([[3, 4], [5, 6], [7, 8]])
  A
: array([[3, 4], [5, 6], [7, 8]])
: X

: array([[25,  2],
         [ 5, 26],
         [ 3,  7]])

: A * X

: array([[ 75,   8],
         [ 25, 156],
         [ 21,  56]])
: A_p = torch.tensor([[3, 4], [5, 6], [7, 8]])
  A_p
: tensor([[3, 4], [5, 6], [7, 8]])
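The PyTorch Hadamard-product cell is not shown in this excerpt. A brief sketch follows, together with an illustrative contrast (not from the original notebook) that uses torch.matmul() for true matrix multiplication; since A_p and X_p are both 3×2, A_p is transposed so the inner dimensions match:

: A_p * X_p   # Hadamard (element-wise) product; same shape as A_p and X_p

: torch.matmul(A_p.T, X_p)   # true matrix multiplication: (2x3)(3x2) -> 2x2
  # expected values: [[121, 185], [154, 220]]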
1.2.4 Matrix Inversion
: X = np.array([[4, 2], [-5, -3]])
  X
: Xinv = np.linalg.inv(X)
  Xinv
: array([[ 1.5, 1. ], [-2.5, -2. ]])
Show that 𝑦 = 𝑋𝑤:
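The NumPy cells for this step are not in this excerpt; a sketch that parallels the PyTorch version below, assuming the same 𝑦 = [4, −7]:

: y = np.array([4, -7])
  w = np.dot(Xinv, y)
  w

: np.dot(X, w)   # recovers y, confirming that y = Xw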
: X_p = torch.tensor([[4, 2], [-5, -3.]])   # note that torch.inverse() requires floats
  X_p
: tensor([[ 4., 2.], [-5., -3.]])
: Xinv_p = torch.inverse(X_p)
  Xinv_p
: tensor([[ 1.5000, 1.0000], [-2.5000, -2.0000]])
: y_p = torch.tensor([4, -7.])
  y_p
: w_p = torch.matmul(Xinv_p, y_p)
  w_p
Return to slides here.
1.3.1 Affine Transformation via Matrix Application
Let’s say we have a vector 𝑣:
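The cell defining 𝑣 does not appear in this copy; a minimal reconstruction, with the value inferred from the first column of the matrix V assembled later in this section:

: v = np.array([3, 1])
  v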
Let’s plot 𝑣 using my plot_vectors() function (which is based on Hadrien Jean’s plotVectors() function from this notebook, under MIT license).
: import matplotlib.pyplot as plt
: def plot_vectors(vectors, colors):
      """
      Plot one or more vectors in a 2D plane, specifying a color for each.

      vectors: list of lists or of arrays
          Coordinates of the vectors to plot. For example,
          [[1, 3], [2, 2]] contains two vectors to plot, [1, 3] and [2, 2].
      colors: list
          Colors of the vectors. For instance: ['red', 'blue'] will display the
          first vector in red and the second in blue.

      Example
      -------
      plot_vectors([[1, 3], [2, 2]], ['red', 'blue'])
      plt.xlim(-1, 4)
      plt.ylim(-1, 4)
      """
      plt.figure()
      # the arrow-plotting body was cut off in this copy; the loop below is a
      # reconstruction of the behavior described in the docstring
      plt.axvline(x=0, color='lightgray')
      plt.axhline(y=0, color='lightgray')
      for i in range(len(vectors)):
          x = np.concatenate([[0, 0], vectors[i]])
          plt.quiver([x[0]], [x[1]], [x[2]], [x[3]],
                     angles='xy', scale_units='xy', scale=1, color=colors[i])
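The cells that plot 𝑣 and apply the identity matrix to it also appear to be missing; a brief sketch, assuming the standard 2×2 identity matrix (the plot_vectors([Iv], ...) call below relies on Iv being defined):

: plot_vectors([v], ['lightblue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)

: I = np.array([[1, 0], [0, 1]])
  Iv = np.dot(I, v)   # applying the identity matrix leaves v unchanged
  Iv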
: plot_vectors([Iv], ['blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)
In contrast, consider this matrix (let’s call it 𝐸) that flips vectors over the 𝑥-axis:
: E = np.array([[1, 0], [0, -1]])
  E
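The cell applying E to 𝑣 is not shown; a one-line sketch so that the Ev plotted below is defined:

: Ev = np.dot(E, v)
  Ev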
: plot_vectors([v, Ev], ['lightblue', 'blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-3, 3)
Or, this matrix, 𝐹, which flips vectors over the 𝑦-axis:
: F = np.array([[-1, 0], [0, 1]])
  F
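Likewise, the cell computing Fv appears to be missing; a minimal sketch:

: Fv = np.dot(F, v)
  Fv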
: plot_vectors([v, Fv], ['lightblue', 'blue'])
  plt.xlim(-4, 4)
  _ = plt.ylim(-1, 5)
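The next two plots apply a single matrix that both flips and rescales vectors. The cells defining that matrix are not in this excerpt; a sketch, assuming the same matrix A = [[-1, 4], [2, -2]] that reappears as A_p in the eigendecomposition section below:

: A = np.array([[-1, 4], [2, -2]])   # assumed to match A_p defined later
  Av = np.dot(A, v)
  Av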
: plot_vectors([v, Av], ['lightblue', 'blue'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)
: # Another example of applying A:
  v2 = np.array([2, 1])
  plot_vectors([v2, np.dot(A, v2)], ['lightgreen', 'green'])
  plt.xlim(-1, 5)
  _ = plt.ylim(-1, 5)
We can concatenate several vectors together into a matrix (say, 𝑉 ), where each column is a separate vector. Then, whatever linear transformations we apply to 𝑉 will be independently applied to each column (vector):
: # recall that we need to convert array to 2D to transpose into column, e.g.:
  np.matrix(v).T
: matrix([[3],
          [1]])
: v3 = np.array([-3, -1])   # mirror image of v over both axes
  v4 = np.array([-1, 1])
: V = np.concatenate((np.matrix(v).T,
                      np.matrix(v2).T,
                      np.matrix(v3).T,
                      np.matrix(v4).T),
                     axis=1)
  V
: matrix([[ 3, 2, -3, -1], [ 1, 1, -1, 1]])
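The excerpt does not show a transformation actually being applied to V; a short illustrative sketch (not from the original notebook) using the 𝑦-axis flip matrix F defined above, which flips every column of V at once:

: FV = np.matmul(F, V)   # negates the first row (x-coordinates), leaves the second row unchanged
  FV
  # expected values: [[-3, -2, 3, 1], [1, 1, -1, 1]]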
Now that we can appreciate the linear transformation of vectors by matrices, let’s move on to working with eigenvectors and eigenvalues…
Return to slides here.
1.3.2 Eigenvectors and Eigenvalues
An eigenvector (eigen is German for “typical”; we could translate eigenvector to “characteristic vector”) is a special vector 𝑣 such that when it is transformed by some matrix (let’s say 𝐴), the product 𝐴𝑣 has the exact same direction as 𝑣. An eigenvalue is a scalar (traditionally represented as 𝜆) that simply scales the eigenvector 𝑣 such that the following equation is satisfied:

𝐴𝑣 = 𝜆𝑣

The easiest way to understand this is to work through an example:
Eigenvectors and eigenvalues can be derived algebraically (e.g., with the QR algorithm, which was independently developed in the 1950s by both Vera Kublanovskaya and John Francis), but this is outside the scope of the ML Foundations series. We’ll cheat with NumPy’s eig() method, which returns a tuple of a vector of eigenvalues and a matrix of eigenvectors:
: lambdas, V = np.linalg.eig(A)
The matrix contains as many eigenvectors as there are columns of A:
: V # each column is a separate eigenvector v
: array([[ 0.86011126, -0.76454754], [ 0.51010647, 0.64456735]])
With a corresponding eigenvalue for each eigenvector:
: lambdas

: array([ 1.37228132, -4.37228132])
Let’s confirm that 𝐴𝑣 = 𝜆𝑣 for the first eigenvector:
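The cell extracting the first eigenvector is not shown in this copy; a one-line sketch so that the output below has a source:

: v = V[:, 0]   # first column of the eigenvector matrix returned by eig()
  v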
: array([0.86011126, 0.51010647])
: lambduh = lambdas[0]   # note that "lambda" is a reserved term in Python
  lambduh
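The input that produced the next output is missing; it is presumably the product 𝐴𝑣:

: Av = np.dot(A, v)
  Av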
: array([1.18031462, 0.70000958])
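And the output that follows matches 𝜆𝑣, confirming 𝐴𝑣 = 𝜆𝑣; its missing input was presumably:

: lambduh * v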
: array([1.18031462, 0.70000958])
: plot_vectors([Av, v], ['blue', 'lightblue'])
  plt.xlim(-1, 2)
  _ = plt.ylim(-1, 2)
Using the PyTorch eig() method, we can do exactly the same:
: A_p = torch.tensor([[-1, 4], [2, -2.]])   # must be float for PyTorch eig()
  A_p
: tensor([[-1., 4.], [ 2., -2.]])
: lambdas_cplx, V_cplx = torch.linalg.eig(A_p)
  # outputs complex numbers because real matrices can have complex eigenvectors
: V_cplx # complex-typed values with "0.j" imaginary part are in fact real numbers
: tensor([[ 0.8601+0.j, -0.7645+0.j], [ 0.5101+0.j, 0.6446+0.j]])
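The cell that casts the complex eigenvector matrix back to a real-valued tensor (which triggers the warning shown below) is not included; a minimal sketch:

: V_p = V_cplx.float()
  V_p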
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:240.)
  """Entry point for launching an IPython kernel.
: tensor([[ 0.8601, -0.7645], [ 0.5101, 0.6446]])
: lambdas_cplx

: tensor([ 1.3723+0.j, -4.3723+0.j])
: lambdas_p = lambdas_cplx.float()
  lambdas_p
: lambda_p = lambdas_p[0]
  lambda_p
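The first PyTorch eigenvector, v_p, is used in the next cell but never extracted in this copy; a one-line sketch using the real-valued matrix V_p from above:

: v_p = V_p[:, 0]
  v_p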
: Av_p = torch.matmul(A_p, v_p)   # matmul() expects float-typed tensors
  Av_p
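To complete the check, the corresponding 𝜆𝑣 product would be compared against Av_p; a brief sketch:

: lambda_p * v_p   # should match Av_p above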
: lambda2_p = lambdas_p[1]
  lambda2_p
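The excerpt ends here, before the second eigenpair is checked; a brief sketch of how that confirmation would look, mirroring the first:

: v2_p = V_p[:, 1]
  torch.matmul(A_p, v2_p)   # should equal lambda2_p * v2_p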