How to calculate the derivative of an absolute value? (e.g. for gradient descent)
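A minimal sketch of one way to answer this: for x ≠ 0 the derivative of |x| is sign(x); at x = 0 the function is not differentiable, but any value in [-1, 1] is a valid subgradient. Returning 0 at x = 0 is a common convention and an assumption here, as are the helper names `abs_subgradient` and `minimize_abs`.

```python
def abs_subgradient(x):
    """Subgradient of |x|: sign(x) for x != 0, and 0 at x == 0 (a convention)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0  # any value in [-1, 1] would be a valid subgradient here


def minimize_abs(target, x0, steps, lr):
    """Plain (sub)gradient descent on the loss |x - target|."""
    x = x0
    for _ in range(steps):
        # Chain rule: d/dx |x - target| = sign(x - target)
        x -= lr * abs_subgradient(x - target)
    return x


if __name__ == "__main__":
    print(abs_subgradient(-2.0))
    print(minimize_abs(target=3.0, x0=0.0, steps=20, lr=0.5))
```

With a fixed step size the iterate can oscillate around the minimum when the step does not divide the distance evenly, which is why a decaying step size is often used with subgradients.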
How to evaluate an OC-SVM for optimality?
Write down the formula for the loss.
Find the optimal slack variables ξ (with the formula), i.e. assume the ξ are optimal for the given w, b.
Calculate the loss.
Find a better w, b so that the overall loss is reduced and the ξ are still feasible.
Optimality check:
Scale the ξ calculation with a factor α.
The optimal ξ will then be something like 0.5 α…
Substitute into the loss function.
The result will likely be a quadratic expression.
Calculate its minimum with the pq-formula (quadratic formula).
Adjust w and b accordingly.
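The steps above can be sketched as follows. This is one possible reading, not the notes' own derivation. It assumes the standard linear one-class SVM objective 0.5·||w||² + 1/(ν·n)·Σ ξ_i − ρ with constraints w·x_i ≥ ρ − ξ_i and ξ_i ≥ 0, where the offset `rho` plays the role of the notes' b. It also reads "scale with α" as scaling w and the offset together by α ≥ 0, so that the optimal slacks scale by α as well and the loss becomes a quadratic in α. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def optimal_slacks(w, rho, X):
    """For fixed (w, rho), the smallest feasible slack per point."""
    return [max(0.0, rho - dot(w, x)) for x in X]


def ocsvm_loss(w, rho, X, nu):
    """Assumed objective: 0.5*||w||^2 + sum(xi)/(nu*n) - rho."""
    xi = optimal_slacks(w, rho, X)
    return 0.5 * dot(w, w) + sum(xi) / (nu * len(X)) - rho


def best_scale(w, rho, X, nu):
    """Minimiser of alpha -> loss(alpha*w, alpha*rho) over alpha >= 0.

    The scaled loss is 0.5*alpha^2*||w||^2 + alpha*(sum(xi)/(nu*n) - rho),
    a quadratic in alpha whose vertex is computed below (requires w != 0).
    """
    xi = optimal_slacks(w, rho, X)
    linear = sum(xi) / (nu * len(X)) - rho
    return max(0.0, -linear / dot(w, w))


if __name__ == "__main__":
    X = [[2.0, 0.0], [0.5, 0.0]]
    w, rho, nu = [1.0, 0.0], 1.0, 0.5
    alpha = best_scale(w, rho, X, nu)
    print(ocsvm_loss(w, rho, X, nu))
    print(alpha, ocsvm_loss([alpha * wi for wi in w], alpha * rho, X, nu))
```

Under these assumptions, a candidate (w, rho) with a best scale different from 1 cannot be optimal, because rescaling alone already lowers the loss. A best scale of 1 is necessary but not sufficient for optimality.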
What are advantages and disadvantages of bounded kernels in KDE?
Advantages of kernels with bounded domains:
A) The estimator f̂ is possibly faster to compute, because far-away points do not contribute.
B) Far-away points do not influence one another. Put differently, locality is respected.
Disadvantages:
Local kernels are somewhat arbitrary due to their 'hard' margins: a point just outside the kernel's domain contributes nothing at all.
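To make the contrast concrete, here is a small sketch comparing a bounded kernel with an unbounded one in a one-dimensional KDE. The choice of the Epanechnikov and Gaussian kernels, the bandwidth `h` and the function names are assumptions for illustration.

```python
import math


def epanechnikov(u):
    """Bounded kernel: exactly zero outside |u| <= 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def gaussian(u):
    """Unbounded kernel: positive everywhere."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h, kernel):
    """Kernel density estimate f_hat(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    return sum(kernel((x - xi) / h) for xi in data) / (len(data) * h)


if __name__ == "__main__":
    data = [0.0, 0.2, 5.0]
    # Near 0 only the first two points contribute for the bounded kernel;
    # the point at 5.0 is beyond the hard cutoff and adds exactly 0.
    print(kde(0.1, data, 1.0, epanechnikov))
    print(kde(0.1, data, 1.0, gaussian))
```

This naive version still loops over all points. The speed-up from a bounded kernel only materialises if far-away points are skipped, for example with a spatial index.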
Why is it sufficient to compute ’proportional kernels’ only? Why could it be advantageous? What does this entail for the resulting probability distribution?
Since the kernels are summed, omitting a constant scalar factor λ only rescales the estimate: it changes the integral, i.e. 'de-normalizes' it, so the result is no longer a proper probability density. If we are only interested in relative likelihoods, this is not a disadvantage, and it saves a few computations (although inference time and space complexity are still in Θ(n)).
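A short sketch of why relative likelihoods are unaffected: dropping the Gaussian normalising constant multiplies every score by the same factor, which cancels in a ratio. The Gaussian kernel and the names `score`, `gauss_full` and `gauss_proportional` are assumptions chosen for illustration.

```python
import math

LAMBDA = 1.0 / math.sqrt(2.0 * math.pi)  # the constant factor being omitted


def gauss_full(u):
    return LAMBDA * math.exp(-0.5 * u * u)


def gauss_proportional(u):
    # Same shape, but without the normalising constant
    return math.exp(-0.5 * u * u)


def score(x, data, h, kernel):
    """Sum of kernel values; proportional to the KDE at x for fixed data and h."""
    return sum(kernel((x - xi) / h) for xi in data)


if __name__ == "__main__":
    data = [0.0, 1.0, 4.0]
    a, b, h = 0.5, 3.0, 1.0
    ratio_full = score(a, data, h, gauss_full) / score(b, data, h, gauss_full)
    ratio_prop = score(a, data, h, gauss_proportional) / score(b, data, h, gauss_proportional)
    print(ratio_full, ratio_prop)  # the two ratios agree up to rounding
```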
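How to calculate the NLL loss in general?
-log p_theta(y_i = c ; f_theta(x_i)), where c is the true class of sample i (e.g. c = 2)
For classification with a one-hot y_i: -log(y_i · softmax(f_theta(x_i)))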
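A minimal sketch of both forms for a single sample, assuming `logits` stands for the raw output f_theta(x_i) of a classifier. The function names are hypothetical, and the log-sum-exp shift is a standard numerical-stability trick that the notes do not mention.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax via the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def nll_index(logits, target):
    """-log p(y = target), with the label given as a class index."""
    return -log_softmax(logits)[target]


def nll_one_hot(logits, y):
    """-log(y . softmax(logits)), with the label given as a one-hot vector."""
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    return -math.log(sum(yi * pi for yi, pi in zip(y, probs)))


if __name__ == "__main__":
    logits = [1.0, 2.0, 0.5]
    print(nll_index(logits, 1))
    print(nll_one_hot(logits, [0.0, 1.0, 0.0]))  # same value up to rounding
```

For a dataset, the per-sample values are usually summed or averaged.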