
Introduction to Support Vector Machines

Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received.

Thanks: Andrew Moore (CMU) and Martin Law (Michigan State University)

History of SVM

- SVM is related to statistical learning theory [3].
- SVM was first introduced in 1992 [1].
- SVM became popular because of its success in handwritten digit recognition: a 1.1% test error rate for SVM, the same as the error rate of a carefully constructed neural network, LeNet 4. (See Section 5.11 in [2] or the discussion in [3] for details.)
- SVM is now regarded as an important example of "kernel methods", one of the key areas in machine learning.
- Note: the meaning of "kernel" here is different from the "kernel" function used for Parzen windows.

[1] B. E. Boser et al. A Training Algorithm for Optimal Margin Classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp. 144-152, Pittsburgh, 1992.
[2] L. Bottou et al. Comparison of classifier methods: a case study in handwritten digit recognition. Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77-82.
[3] V. Vapnik. The Nature of Statistical Learning Theory. 2nd edition, Springer, 1999.

Linear Classifiers

[Figure: a scatter of datapoints, where + denotes one class and - denotes the other.]

A linear classifier maps an input x to an estimated label y_est via f(x, w, b) = sign(w · x - b). How would you classify this data?


Linear Classifiers

f(x, w, b) = sign(w · x - b)

Any of these candidate separating lines would be fine... but which is best?
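
A minimal sketch (plain NumPy; the weights and points below are made up purely for illustration) of what such a linear classifier computes:

```python
import numpy as np

def linear_classify(x, w, b):
    """Linear classifier f(x, w, b) = sign(w . x - b): returns +1 or -1."""
    return 1 if np.dot(w, x) - b > 0 else -1

# Made-up parameters and points, just to show the call.
w = np.array([1.0, 1.0])
b = 3.0
print(linear_classify(np.array([1.0, 1.0]), w, b))  # -> -1
print(linear_classify(np.array([3.0, 2.0]), w, b))  # -> +1
```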

Classifier Margin

f(x, w, b) = sign(w · x - b)

Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.

Maximum Margin

f(x, w, b) = sign(w · x + b)

The maximum margin linear classifier is the linear classifier with the, um, maximum margin. This is the simplest kind of SVM (called an LSVM: a linear SVM).

Support vectors are those datapoints that the margin pushes up against.
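
A minimal sketch (assuming scikit-learn is available; the toy data below is made up) of fitting a linear SVM and inspecting which training points end up as support vectors:

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D data (made up for illustration).
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],   # class +1
              [6.0, 5.0], [7.0, 8.0], [8.0, 6.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6)   # very large C ~ (nearly) hard margin
clf.fit(X, y)

# w and b of the learned boundary; scikit-learn uses the sign(w.x + b) convention.
print("w:", clf.coef_[0], "b:", clf.intercept_[0])

# The support vectors: the datapoints the margin pushes up against.
print("support vectors:\n", clf.support_vectors_)
```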

Why Maximum Margin?

f(x, w, b) = sign(w · x - b)

The maximum margin linear classifier is the linear classifier with the maximum margin; this is the simplest kind of SVM (called an LSVM). Intuitively, a boundary placed as far as possible from both classes is the least sensitive to small perturbations of the data or of the boundary itself, so it tends to generalize better than a boundary that barely clears the training points.

Support vectors are those datapoints that the margin pushes up against.

Estimate the Margin

What is the distance expression for a point x to the line w · x + b = 0?

$$
d(\mathbf{x}) \;=\; \frac{|\mathbf{w}\cdot\mathbf{x} + b|}{\|\mathbf{w}\|_2} \;=\; \frac{|\mathbf{w}\cdot\mathbf{x} + b|}{\sqrt{\sum_{i=1}^{d} w_i^{2}}}
$$

where x is the data vector, w is the normal vector of the line, and b is the scale (offset) value.
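
As a quick check of this formula, here is a minimal NumPy sketch (the hyperplane parameters and points are made up for illustration) that computes each point's distance to the line w · x + b = 0:

```python
import numpy as np

# Hypothetical hyperplane parameters (w is the normal vector, b the offset).
w = np.array([2.0, 1.0])
b = -4.0

# A few made-up 2-D points, one per row.
X = np.array([[1.0, 1.0],
              [3.0, 2.0],
              [0.5, 4.0]])

# Distance of each point to the line w.x + b = 0:
#   d(x) = |w.x + b| / ||w||_2
distances = np.abs(X @ w + b) / np.linalg.norm(w)
print(distances)

# With margin defined as the full width of the empty band around the boundary,
# it can be estimated as twice the distance of the closest point.
print("margin estimate:", 2 * distances.min())
```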

Large-margin Decision Boundary

The decision boundary should be as far away from the data of both classes as possible: we should maximize the margin, m. The distance between the origin and the line w^T x = -b is b / ||w||.

[Figure: Class 1 and Class 2 separated by the decision boundary, with the margin m marked between the two classes.]
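
In the standard derivation (a reference formulation, not a transcription of the slide), w and b are rescaled so that the points closest to the boundary satisfy |w^T x + b| = 1; the margin is then m = 2/||w||, and maximizing it is equivalent to the primal problem

$$
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^{2}
\qquad \text{subject to} \qquad
y_i\,\big(\mathbf{w}^{\mathsf T}\mathbf{x}_i + b\big) \ge 1, \quad i = 1,\dots,n,
$$

where y_i ∈ {+1, -1} are the class labels. This is the "primal problem" referred to on the next slides; its dual form is introduced below.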

Next step... (optional)

- Converting SVM to a form we can solve: the dual form
- Allowing a few errors: the soft margin
- Allowing a nonlinear boundary: kernel functions

A small code sketch illustrating the soft margin and kernel options follows this list.
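
A minimal sketch of those two options using scikit-learn (the data and parameter values below are made up and purely illustrative):

```python
import numpy as np
from sklearn.svm import SVC

# Made-up 2-D training data: two slightly overlapping blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (20, 2)), rng.normal(+1, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

# Soft margin: C controls how heavily margin violations are penalized.
# A small C tolerates more errors and gives a wider, softer margin.
linear_soft = SVC(kernel="linear", C=0.1).fit(X, y)

# Nonlinear boundary via a kernel function (here the RBF kernel).
nonlinear = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print("support vectors (soft linear):", len(linear_soft.support_))
print("support vectors (RBF):", len(nonlinear.support_))
```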

The Dual Problem (we ignore the derivation)

The new objective function is in terms of the Lagrange multipliers α_i only. It is known as the dual problem: if we know w, we know all α_i; if we know all α_i, we know w. The original problem is known as the primal problem. The objective function of the dual problem needs to be maximized. Its constraints arise in two ways: the bound on each α_i comes from the properties of the Lagrange multipliers when we introduce them, and the equality constraint is the result of differentiating the original Lagrangian with respect to b.
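
In its standard form for the hard-margin linear SVM (a reference formulation consistent with, e.g., [3], rather than a transcription of the slide), the dual problem is:

$$
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, \mathbf{x}_i^{\mathsf T}\mathbf{x}_j
\qquad \text{subject to} \qquad
\alpha_i \ge 0, \quad \sum_{i=1}^{n} \alpha_i y_i = 0 .
$$

The inequality constraint reflects the properties of the Lagrange multipliers, the equality constraint is what differentiating the Lagrangian with respect to b yields, and the weight vector is then recovered as w = Σ_i α_i y_i x_i.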

Characteristics of the Solution

- Many of the α_i are zero (see the next slide for an example).
- w is a linear combination of a small number of data points. This "sparse" representation can be viewed as data compression, as in the construction of a kNN classifier.
- The x_i with non-zero α_i are called support vectors (SV); the decision boundary is determined only by the SV.
- Let t_j (j = 1, ..., s) be the indices of the s support vectors. We can write

  $$ \mathbf{w} = \sum_{j=1}^{s} \alpha_{t_j} y_{t_j} \mathbf{x}_{t_j} $$

- For testing with a new data point z, compute

  $$ \mathbf{w}^{\mathsf T}\mathbf{z} + b = \sum_{j=1}^{s} \alpha_{t_j} y_{t_j} \big(\mathbf{x}_{t_j}^{\mathsf T}\mathbf{z}\big) + b $$

  and classify z as class 1 if the sum is positive, and as class 2 otherwise. Note that w need not be formed explicitly (a short numerical sketch of this follows the list).
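
A minimal sketch of that prediction rule in plain NumPy (the support vectors, multipliers, and bias below are made up and hand-picked so the two support vectors sit exactly at scores +1 and -1):

```python
import numpy as np

# Two made-up support vectors with labels +1 / -1, and multipliers/bias
# chosen by hand so that w.z + b = +1 and -1 exactly at these two points.
sv_x = np.array([[1.0, 2.0],       # x_{t_1}, class +1
                 [6.0, 5.0]])      # x_{t_2}, class -1
sv_y = np.array([+1.0, -1.0])      # y_{t_j}
sv_alpha = np.array([1/17, 1/17])  # alpha_{t_j}
b = 28/17                          # bias term

def predict(z):
    """Classify z using only the support vectors:
    score(z) = sum_j alpha_{t_j} * y_{t_j} * (x_{t_j} . z) + b; w is never formed."""
    score = np.sum(sv_alpha * sv_y * (sv_x @ z)) + b
    return "class 1" if score > 0 else "class 2"

print(predict(np.array([2.0, 2.0])))  # -> class 1 (near the +1 support vector)
print(predict(np.array([7.0, 6.0])))  # -> class 2 (near the -1 support vector)
```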

A Geometrical Interpretation

[Figure: ten datapoints from Class 1 and Class 2 with the decision boundary; the support vectors lie on the margin (non-zero α_i), and the remaining points have α_i = 0.]