What is the goal of adversarial learning?
for a given network f
and some benign datapoint (x, y)
find some delta such that
f(x + delta) != y
where
delta is small (||delta||_2 < epsilon)
and x + delta does not exceed the feature domain (e.g. [0,1] for images)
What types of adversarial learning are there?
targeted attacks
specify what misclassification you want
untargeted attacks
whatever other class is fine…
What is the mathematical goal of targeted attacks?
delta* = argmin_delta L(f(x + delta), y_target) + lambda * ||delta||_2
-> find delta so that
-> the loss for the specific misclassification is minimized
-> while keeping delta itself small (weighted by the factor lambda)
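A minimal PyTorch sketch of this objective (cross_entropy as a stand-in for L and the helper name are assumptions):

```python
import torch
import torch.nn.functional as F

def targeted_objective(f, x, delta, y_target, lam):
    # loss toward the desired target class + penalty that keeps delta small
    return F.cross_entropy(f(x + delta), y_target) + lam * delta.norm(p=2)
```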
What is the mathematical goal of an untargeted attack?
delta* = argmin_delta -L(f(x + delta), y_correct) + lambda * ||delta||_2
=> minimize the negative loss (-> maximize the loss) of the correct classification
=> by introducing the delta
=> while still keeping delta small…
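The untargeted variant only flips the sign of the loss term (same assumptions as above):

```python
def untargeted_objective(f, x, delta, y_correct, lam):
    # negative loss of the correct class: minimizing this maximizes that loss
    return -F.cross_entropy(f(x + delta), y_correct) + lam * delta.norm(p=2)
```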
How are the attacks actually written and why?
instead of the loss
-> use an attacker loss function J
J(theta, x + delta, y_target) for targeted
J(theta, x + delta, y_true) for untargeted
=> does not necessarily have to incorporate the actual training loss…
How can adversarial learning (finding delta) be solved?
gradient descent…
iterating to improve delta…
where delta_0 = null vector
-> take actual derivatives over the components (x + delta)_1, (x + delta)_2, …
-> keep lambda * ||delta|| in the objective…
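A minimal sketch of this loop in PyTorch, reusing the hypothetical targeted_objective from above (step count, learning rate, and the clamp to the [0,1] feature domain are assumptions):

```python
import torch

def find_delta(f, x, y_target, lam=0.1, lr=0.01, steps=100):
    # delta_0 = null vector, then improved iteratively via gradient descent
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = targeted_objective(f, x, delta, y_target, lam)  # keeps lambda*||delta||_2 in it
        loss.backward()  # derivatives w.r.t. the components (x+delta)_1, (x+delta)_2, ...
        opt.step()
        with torch.no_grad():
            delta.clamp_(-x, 1 - x)  # keep x + delta inside the [0,1] feature domain
    return delta.detach()
```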
What has been shown when training with larger weight decay?
results in a larger average distortion required to find adversarial examples
-> trade-off between accuracy and robustness…
What is the effect of more layers on adversarial learning?
both deep and shallow networks are susceptible
How does weight decay work?
add the term lambda * sum_i (w_i^2 / k)
where k = number of units in the layer…
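A sketch of that term in PyTorch (the assumption here: weights is a list of per-layer weight matrices whose first dimension is the unit count k):

```python
def weight_decay_term(weights, lam):
    # lambda * sum of squared weights, each layer normalized by its unit count k
    return lam * sum((W ** 2).sum() / W.shape[0] for W in weights)
```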
What has been shown about transferring adversarial examples to other models?
possible…
On what (deep vs shallow) networks can adversarial examples be found?
both
What makes adversarial examples harder to find?
high weight regularization
What is the adversarial learning we had so far?
white box -> requires access to the model itself…
What are downsides of backprop for adversarial learning?
expensive to find an adversarial example x + delta
many forward and backward passes required
no direct control over the size of the perturbation delta -> only possible to penalize it in the loss… / no hard limit
What is FGSM?
fast gradient sign method
-> faster way to find bounded adversarial examples
optimal perturbation: d = epsilon * sign(w)
with
the difference between the classifications maximized
and d bounded by epsilon (infinity norm)
Give the maths why FGSM works
for a linear model g(x) = w^T x, we want to maximize |g(x + d) - g(x)|
= |w^T (x + d) - w^T x|
= |w^T x + w^T d - w^T x|
= |w^T d|
= |sum_i w_i * d_i|
=> maximized when all terms have the same sign…
=> thus set d_i = sign(w_i) -> each term w_i * sign(w_i) = |w_i| is always positive…
-> also multiply by epsilon as the actual distortion (as the distortion is bounded by epsilon…)
= sum_i |w_i| * epsilon
= n * avg(|w|) * epsilon…
What is an effect of FGSM?
the difference increases linearly with the input size n
-> but delta itself does not change with n…
=> the higher the input dimensionality
-> the higher the effect…
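A tiny NumPy check of this scaling (the random weights are an assumption; the per-feature distortion stays at epsilon while the activation shift grows with n):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.01
for n in (10, 1_000, 100_000):
    w = rng.normal(size=n)
    d = eps * np.sign(w)   # ||d||_inf = eps, independent of n
    print(n, abs(w @ d))   # |w^T d| = eps * sum|w_i| ~ n * avg(|w|) * eps
```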
How to calc infinity norm?
max (abs(x))
-> largest value disregarding the sign
How to apply FGSM?
d = epsilon * sign(gradient of the loss w.r.t. x)
=> finds the perturbation in a single step…
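A minimal PyTorch sketch of the single step (cross_entropy as the attacker loss J and the clamp to [0,1] are assumptions):

```python
import torch
import torch.nn.functional as F

def fgsm(f, x, y_true, eps):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(f(x), y_true)  # attacker loss J(theta, x, y_true)
    loss.backward()
    # single step: move every feature by eps in the direction of the gradient sign
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()     # stay inside the [0,1] feature domain
```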
What is a consideration in single-step FGSM?
assumes linear behavior of the model (and loss function)
-> works on deep non-linear NNs as they are locally linear (in a neighborhood of size epsilon)…
What is a defense to adversarial attacks?
adversarial retraining
-> consider the loss to be
actual loss
plus adversarial loss
weighted with alpha
Formula for adversarial retraining?
Loss = alpha * (actual loss) + (1 - alpha) * (loss over x + epsilon * sign(gradient))
=> the perturbed input is basically x + delta…
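As a sketch, reusing the hypothetical fgsm helper from above (alpha = 0.5 is an assumption):

```python
def adversarial_training_loss(f, x, y, eps, alpha=0.5):
    loss_clean = F.cross_entropy(f(x), y)      # actual loss
    x_adv = fgsm(f, x, y, eps)                 # x + eps * sign(gradient), i.e. x + delta
    loss_adv = F.cross_entropy(f(x_adv), y)    # adversarial loss
    return alpha * loss_clean + (1 - alpha) * loss_adv
```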
What is a result from adversarial retraining?
reduces the test-set error rate
improves model robustness
weights of the adversarially trained model are more localized
What is iterative FGSM?
apply FGSM T times, each time with a small step size alpha
-> after each iteration, clip the result to stay within the epsilon bound…
=> x'(t+1) = Clip( x'(t) + alpha * sign(gradient…) )
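A sketch of the iterative variant in PyTorch (T, alpha, and the projection into both the epsilon ball and the [0,1] domain follow the clip step above; names are assumptions):

```python
def iterative_fgsm(f, x, y_true, eps, alpha, T):
    x_adv = x.clone()
    for _ in range(T):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(f(x_adv), y_true)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # clip back into the epsilon ball around x, then into the feature domain
            x_adv = torch.clamp(x_adv, x - eps, x + eps).clamp(0, 1)
    return x_adv.detach()
```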
When is iterative FGSM useful?
if FGSM oversteps
-> e.g. local linearity does not hold…
With regards to what is FGSM optimal?
the infinity norm -> d = epsilon * sign(w) maximizes the change in the prediction among all d with ||d||_inf <= epsilon…