What is an ML model, basically?
a function that maps inputs to outputs…
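As a minimal sketch of "model = parameterized function", here is a hypothetical linear model (the function, weight `w`, and bias `b` are illustrative, not from the notes):

```python
# A model is just a parameterized function from inputs to outputs.
def model(x, w, b):
    # w and b are the learnable parameters
    return w * x + b

print(model(2.0, w=3.0, b=1.0))  # 7.0
```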
How do NNs learn?
have a loss function that tells how good the output is (w.r.t. the known outputs…)
=> find weights in the model so that this loss is minimized…
=> use the loss value to adjust the weights so that the loss is smaller in the next iteration…
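The loop above can be sketched for a hypothetical one-parameter model y = w·x fit to y = 2x with squared-error loss (data, learning rate, and iteration count are assumed for illustration):

```python
# Toy training loop: use the loss gradient to adjust w each iteration.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0        # initial weight
alpha = 0.05   # learning rate (assumed value)
for _ in range(100):
    # gradient of the mean squared error 1/N * sum (w*x - y)^2
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= alpha * grad  # adjust weight so the loss shrinks next iteration
print(round(w, 3))     # converges toward 2.0
```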
When is a function differentiable?
at every point,
the limit from the left = the limit from the right…
=> of the difference quotient (f(x + h) − f(x)) / h
from the left: h → −0
from the right: h → +0
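A numeric check of the two one-sided quotients (the helper is hypothetical, not from the notes): |x| has mismatched slopes at 0, so it is not differentiable there, while x² is.

```python
# One-sided difference quotients (f(x+h) - f(x)) / h with h -> -0 and h -> +0.
def one_sided_slopes(f, x, h=1e-6):
    left = (f(x - h) - f(x)) / (-h)  # h -> -0
    right = (f(x + h) - f(x)) / h    # h -> +0
    return left, right

print(one_sided_slopes(abs, 0.0))             # (-1.0, 1.0): limits differ
print(one_sided_slopes(lambda x: x * x, 0.0)) # both near 0: limits agree
```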
NN optimization function
θ* = arg min_θ (1/N) Σᵢ L(f(xᵢ; θ), yᵢ)
-> minimize the mean of the loss over all N predictions
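The objective itself, sketched in code (squared-error loss and a linear model are assumed here; the notes leave the loss unspecified):

```python
# Mean loss over the dataset: the quantity minimized over theta.
def mean_loss(theta, data):
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.0)]
print(mean_loss(2.0, data))  # 0.0  (the minimizing theta)
print(mean_loss(0.0, data))  # 10.0 (larger away from it)
```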
What assumption must hold for backprop?
the NN function is differentiable
How to perform backprop?
calculate the gradient of the loss
-> a vector holding the derivative of the loss w.r.t. each parameter in the NN…
-> update the parameters with
θ(t+1) = θ(t) − α · ∇_θ (1/N Σ loss)
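One update step of that rule with a small parameter vector (theta, gradient, and alpha values are assumed for illustration):

```python
import numpy as np

# theta(t+1) = theta(t) - alpha * gradient of the mean loss
theta = np.array([0.5, -1.0])
grad = np.array([0.2, -0.4])  # derivative of the loss w.r.t. each parameter
alpha = 0.1
theta_next = theta - alpha * grad
print(theta_next)  # [ 0.48 -0.96]
```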
Why minus alpha?
because the gradient points in the direction of steepest increase
-> following it would move away from the minimum…
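A toy check of the sign, using f(x) = x² (the function and step size are assumed): stepping along +gradient increases the loss, along −gradient decreases it.

```python
# Plus-gradient step climbs, minus-gradient step descends toward the minimum at 0.
f = lambda x: x ** 2
x, alpha = 3.0, 0.1
grad = 2 * x                        # derivative of x^2
print(f(x + alpha * grad) > f(x))   # True: moved away from the minimum
print(f(x - alpha * grad) < f(x))   # True: moved toward the minimum
```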
How to initialize the weights?
randomly,
e.g. normally distributed
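A sketch of random normal initialization with NumPy (the layer shape and the 1/√fan_in scaling are assumptions; the notes only say "randomly, normally distributed"):

```python
import numpy as np

# Draw a weight matrix from a zero-mean normal distribution.
rng = np.random.default_rng(0)
fan_in, fan_out = 64, 32
W = rng.normal(loc=0.0, scale=1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))
print(W.shape)  # (64, 32)
```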