What is the main advantage of SVM over e.g. KDE?
KDE need to accomodate outliers in training data
-> OCSVM has parameter that represents the amount of assumed outliers in the data…
What are SVM esentially?
hyperplane in featurespace
separating the datapoints
-> evaluate the datapoints based on their sign
formula: wx-b=0
What are the problems one faces when finding this hyperplane?
how to find w, b?
uniqueness? are aw and ab solutions as well?
why shouild this be optimal?
How does the math behidn SVM work?
the vector w is normal to the hyperplane
dotproduct (w,x) calculates the distance to the hyperplane
||w|| scales the distance
-> minimize ||w||, so that actual distance is as far as possible…
What is the difference between SVM hard margin and SVM soft margin?
hard margin:
all data points have to be on the correct side
soft margin:
not always possible to linearily separate data
-> use hinge loss, where loss is proportional to the distance of the datapoint to the hyperplane (if classified incorrectly)
-> classified wrong and far away -> small loss…
What are the support vectors in SVM?
the both vectors (datapoints) that actually determine hte hyperplane
-> the ones closest to the hyperplane, determining the margin we have…
Formula Soft margin SVM
lamdba tradeoff margin size and placing all on the correct side
-> lambda balances bias and variance
How can one solve problems where the data is not linearily separable?
project data into higher space where it is solvable
find hyperplane
map hyperplane back to lower dimensional space
or:
use kernel trick
What is the formula for OC-SVM?
What does the ksi represent?
slack -> allows xi outside…
What does v represent?
larger v -> -b more important -> more distance to origin
smaller v -> sum over ksi more important -> more poitns on correct side
Summary KDE and OC-SVM?
KDE:
semi supervised
training data must be clean
density estimaion
OC-SVM:
unsupervised
allows for training data to contain outliers
requires fraction of outliers v (nu)
derive from SVM, a supervised model
no density estimation but max margin model
Hard margin svm formula?
Minimise ||w||
So that
yi(wxi-b)>=1
Last changed2 years ago