What is, loosely speaking, the goal of adversarial ML?
fool a neural network…
What types of attacks do we distinguish in adversarial ML?
targeted attacks
untargeted attacks
What is the goal of a targeted attack?
we have the function f (the network) and an input x
-> find delta such that x+delta is classified as something different, namely one specific other class y’
What is the goal of an untargeted attack?
we simply want x+delta to be classified as a different class than x alone, regardless of what the other class might be (hence untargeted)
What are some assumptions on adversarial attacks?
given network f
benign datapoint (x,y)
find delta (same dimensionality as x)
such that f(x+delta) != y
WHERE
delta is small
x+delta does not exceed feature domain
e.g. every entry of x+delta in [0,1] for images…
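The conditions above can be checked mechanically. The following is a minimal sketch, assuming the L2 norm as the measure of "small" and a threshold eps; the classifier toy_f and all numbers are hypothetical and only serve as illustration.

```python
import math

def is_valid_adversarial(f, x, y, delta, eps, lo=0.0, hi=1.0):
    """Check the conditions from the notes for a candidate delta."""
    x_adv = [xi + di for xi, di in zip(x, delta)]
    same_shape = len(delta) == len(x)                      # same dimensionality as x
    small = math.sqrt(sum(d * d for d in delta)) <= eps    # delta is small (L2 norm, assumed)
    in_domain = all(lo <= v <= hi for v in x_adv)          # x+delta stays in the feature domain
    fooled = f(x_adv) != y                                 # f(x+delta) != y
    return same_shape and small and in_domain and fooled

# Hypothetical toy classifier: class 1 if the mean pixel value exceeds 0.5, else class 0.
def toy_f(x):
    return 1 if sum(x) / len(x) > 0.5 else 0

x = [0.4, 0.5, 0.55]          # mean below 0.5 -> class 0
y = 0
delta = [0.05, 0.05, 0.05]    # pushes the mean above 0.5
ok = is_valid_adversarial(toy_f, x, y, delta, eps=0.1)
print(ok)
```

A delta of all zeros fails the check because the prediction does not change, and a very large delta fails because x+delta leaves [0,1].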
How can we find an optimal delta in a targeted attack?
arg min over delta
L(f(x+delta),y’) + lambda ||delta||_2
=> basically minimize the loss of x+delta with respect to the specified target class y’
plus a regularization term: factor lambda times the size (L2 norm) of delta
so that delta remains small…
How can we find an optimal delta in an untargeted attack?
same as in targeted
but maximize the loss (minimize negative loss)
for the initial class y…
How are the attacks written in general?
replace the objective with an attacker loss J
-> targeted attack on (x+delta, y’):
J(theta, x+delta, y’) = L(f_theta(x+delta), y’) + lambda ||delta||_2
(theta = NN parameters)
-> untargeted attack on (x+delta, y):
J(theta, x+delta, y) = -L(f_theta(x+delta), y) + lambda ||delta||_2
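Both variants of J differ only in the sign of the loss and in which label is plugged in. A minimal sketch, assuming cross-entropy as L and a hypothetical 2-class linear softmax model as f_theta (the weights W and the inputs are made up for illustration):

```python
import math

# Hypothetical 2-class linear softmax model standing in for f_theta.
W = [[1.0, -1.0], [-1.0, 1.0]]

def f_theta(x):
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def attacker_loss(x, delta, label, lam, targeted):
    """J = +L(f(x+delta), y') + lam*||delta||_2 if targeted, else -L(f(x+delta), y) + lam*||delta||_2."""
    x_adv = [xi + di for xi, di in zip(x, delta)]
    loss = -math.log(f_theta(x_adv)[label])        # cross-entropy (assumed choice of L)
    l2 = math.sqrt(sum(d * d for d in delta))
    return (loss if targeted else -loss) + lam * l2

x = [0.8, 0.2]                  # classified as class 0 by the toy model
zero = [0.0, 0.0]
push = [-0.3, 0.3]              # moves x toward class 1

j_t0 = attacker_loss(x, zero, label=1, lam=0.1, targeted=True)
j_t1 = attacker_loss(x, push, label=1, lam=0.1, targeted=True)
j_u0 = attacker_loss(x, zero, label=0, lam=0.1, targeted=False)
j_u1 = attacker_loss(x, push, label=0, lam=0.1, targeted=False)
print(j_t1 < j_t0, j_u1 < j_u0)
```

In both cases a lower J means a better delta from the attacker's point of view, so the same minimizer can be used for targeted and untargeted attacks.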
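How can we solve for the optimal delta?
using gradient descent…
delta(t+1) = delta(t) - alpha * gradient_delta J(theta, x+delta(t), y)
where delta(0) = [0,0,0,…,0] and alpha is the step size
=> the gradient is the vector of partial derivatives of J with respect to the entries of delta
=> it points in the direction of largest increase of J, so we step in the opposite direction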
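The update rule can be sketched end to end on a toy problem. Assumptions: a hypothetical 2-class linear softmax model as f_theta, cross-entropy as L, a targeted attack, and a finite-difference gradient in place of backpropagation (which a real attack would use). The values for lam, alpha and steps are arbitrary.

```python
import math

# Hypothetical 2-class linear softmax model standing in for f_theta.
W = [[1.0, -1.0], [-1.0, 1.0]]

def f_theta(x):
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def J(x, delta, y_target, lam):
    """Targeted attacker loss: cross-entropy to y_target plus lam * ||delta||_2."""
    x_adv = [xi + di for xi, di in zip(x, delta)]
    loss = -math.log(f_theta(x_adv)[y_target])
    return loss + lam * math.sqrt(sum(d * d for d in delta))

def grad_delta(x, delta, y_target, lam, h=1e-5):
    """Central finite differences; real attacks use backpropagation instead."""
    grad = []
    for i in range(len(delta)):
        up = list(delta)
        dn = list(delta)
        up[i] += h
        dn[i] -= h
        grad.append((J(x, up, y_target, lam) - J(x, dn, y_target, lam)) / (2 * h))
    return grad

def attack(x, y_target, lam=1.0, alpha=0.1, steps=50):
    delta = [0.0] * len(x)                                   # delta(0) = [0, ..., 0]
    for _ in range(steps):
        g = grad_delta(x, delta, y_target, lam)
        delta = [d - alpha * gi for d, gi in zip(delta, g)]  # delta(t+1) = delta(t) - alpha * grad
    return delta

x = [0.8, 0.2]
delta = attack(x, y_target=1)
x_adv = [xi + di for xi, di in zip(x, delta)]
pred_before = max(range(2), key=lambda k: f_theta(x)[k])
pred_after = max(range(2), key=lambda k: f_theta(x_adv)[k])
print(pred_before, pred_after, x_adv)
```

Note that the loop itself does not enforce the feature domain; in this toy setting x+delta happens to stay in [0,1], but in general that has to be checked or enforced separately.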
How do we adjust delta based on the gradient?
negative gradient entry -> increase that entry of delta
positive gradient entry -> decrease that entry of delta…
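The sign rule follows directly from subtracting alpha times the gradient. A small worked example with made-up gradient values:

```python
def update(delta, grad, alpha):
    """One gradient descent step: each entry moves against the sign of its gradient."""
    return [d - alpha * g for d, g in zip(delta, grad)]

delta = [0.0, 0.0, 0.0]
grad = [-2.0, 0.5, 0.0]       # hypothetical gradient of J with respect to delta
new_delta = update(delta, grad, alpha=0.1)
print(new_delta)              # -> [0.2, -0.05, 0.0]
```

The entry with a negative gradient grows, the entry with a positive gradient shrinks, and the entry with zero gradient stays where it is.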
What has been shown in terms of the susceptibility of NNs to adversarial attacks?
regularization factor lambda in NN -> minimum loss…
=> the larger the lambda, the larger the minimal distortion (how much change has to be introduced) needed to trick the network…
What are the downsides of gradient-descent-based adversarial attacks?
expensive to find an adversarial example x+delta
many forward and backward passes required…
no direct control over the size of the perturbation…