What is the goal of adversarial learning?
for a given network f
and some benign datapoint (x, y)
find some delta such that
f(x + delta) != y
where
delta is small (||delta||_2 < epsilon)
and x + delta does not exceed the feature domain (e.g. [0,1] for images)
What types of adversarial learning are there?
targeted attacks
specify what misclassification you want
untargeted attacks
whatever other class is fine…
What is the mathematical goal of targeted attacks?
delta* = argmin_delta L(f(x + delta), y_target) + lambda * ||delta||_2
-> find delta so that
-> the loss for the specific misclassification is minimized
-> while keeping delta itself small (weighted by the factor lambda)
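A minimal PyTorch sketch of this objective (cross_entropy as a stand-in for L and the helper name are assumptions):

```python
import torch
import torch.nn.functional as F

def targeted_objective(f, x, delta, y_target, lam):
    # loss toward the desired target class + penalty that keeps delta small
    return F.cross_entropy(f(x + delta), y_target) + lam * delta.norm(p=2)
```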
What is the mathematical goal of an untargeted attack?
delta* = argmin_delta -L(f(x + delta), y_correct) + lambda * ||delta||_2
=> minimize the negative loss (-> maximize the loss) of the correct classification
=> by introducing the delta
=> while still keeping delta small…
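The untargeted variant only flips the sign of the loss term (same assumptions as above):

```python
def untargeted_objective(f, x, delta, y_correct, lam):
    # negative loss of the correct class: minimizing this maximizes that loss
    return -F.cross_entropy(f(x + delta), y_correct) + lam * delta.norm(p=2)
```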
How are the attacks actually written and why?
instead of the loss
-> use an attacker loss function J
J(theta, x + delta, y_target) for targeted
J(theta, x + delta, y_true) for untargeted
=> does not necessarily have to incorporate the actual training loss…
How can adversarial learning (finding delta) be solved?
gradient descent…
iterating to improve delta…
where delta_0 = null vector
-> take actual derivatives over the components (x + delta)_1, (x + delta)_2, …
-> keep lambda * ||delta|| in the objective…
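A minimal sketch of this loop in PyTorch, reusing the hypothetical targeted_objective from above (step count, learning rate, and the clamp to the [0,1] feature domain are assumptions):

```python
import torch

def find_delta(f, x, y_target, lam=0.1, lr=0.01, steps=100):
    # delta_0 = null vector, then improved iteratively via gradient descent
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = targeted_objective(f, x, delta, y_target, lam)  # keeps lambda*||delta||_2 in it
        loss.backward()  # derivatives w.r.t. the components (x+delta)_1, (x+delta)_2, ...
        opt.step()
        with torch.no_grad():
            delta.clamp_(-x, 1 - x)  # keep x + delta inside the [0,1] feature domain
    return delta.detach()
```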
What has been shown when training with larger weight decay?
results in a larger average distortion required to find adversarial examples
-> trade-off between accuracy and robustness…
What is the effect of more layers on adversarial learning?
both deep and shallow networks are susceptible
How does weight decay work?
add the term lambda * sum_i (w_i^2 / k)
where k = number of units in the layer…
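A sketch of that term in PyTorch (the assumption here: weights is a list of per-layer weight matrices whose first dimension is the unit count k):

```python
def weight_decay_term(weights, lam):
    # lambda * sum of squared weights, each layer normalized by its unit count k
    return lam * sum((W ** 2).sum() / W.shape[0] for W in weights)
```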
What has been shown about transferring adversarial examples to other models?
possible…
On what (deep vs shallow) networks can adversarial examples be found?
both
What makes adversarial examples harder to find?
high weight regularization
What is the adversarial learning we had so far?
white box -> requires access to the model itself…
What are downsides of backprop for adversarial learning?
expensive to find an adversarial example x + delta
many forward and backward passes required
no direct control over the size of the perturbation delta -> only possible to penalize it in the loss… / no hard limit
What is FGSM?
fast gradient sign method
-> faster way to find bounded adversarial examples
optimal perturbation: d = epsilon * sign(w)
with
the difference between the classifications maximized
and d bounded by epsilon (infinity norm)
Give the maths why FGSM works
for a linear model g(x) = w^T x, we want to maximize |g(x + d) - g(x)|
= |w^T (x + d) - w^T x|
= |w^T x + w^T d - w^T x|
= |w^T d|
= |sum_i w_i * d_i|
=> maximized when all terms have the same sign…
=> thus set d_i = sign(w_i) -> each term w_i * sign(w_i) = |w_i| is always positive…
-> also multiply by epsilon as the actual distortion (as the distortion is bounded by epsilon…)
= sum_i |w_i| * epsilon
= n * avg(|w|) * epsilon…
What is an effect of FGSM?
the difference increases linearly with the input size n
-> but delta itself does not change with n…
=> the higher the input dimensionality
-> the higher the effect…
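A tiny NumPy check of this scaling (the random weights are an assumption; the per-feature distortion stays at epsilon while the activation shift grows with n):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.01
for n in (10, 1_000, 100_000):
    w = rng.normal(size=n)
    d = eps * np.sign(w)   # ||d||_inf = eps, independent of n
    print(n, abs(w @ d))   # |w^T d| = eps * sum|w_i| ~ n * avg(|w|) * eps
```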
How to calc infinity norm?
max (abs(x))
-> largest value disregarding the sign
How to apply FGSM?
d = epsilon * sign(gradient of the loss w.r.t. x)
=> finds the perturbation in a single step…
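A minimal PyTorch sketch of the single step (cross_entropy as the attacker loss J and the clamp to [0,1] are assumptions):

```python
import torch
import torch.nn.functional as F

def fgsm(f, x, y_true, eps):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(f(x), y_true)  # attacker loss J(theta, x, y_true)
    loss.backward()
    # single step: move every feature by eps in the direction of the gradient sign
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()     # stay inside the [0,1] feature domain
```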
What is a consideration in single-step FGSM?
assumes linear behavior of the model (and loss function)
-> works on deep non-linear NNs as they are locally linear (in a neighborhood of size epsilon)…
What is a defense to adversarial attacks?
adversarial retraining
-> consider the loss to be
actual loss
plus adversarial loss
weighted with alpha
Formula for adversarial retraining?
Loss = alpha * (actual loss) + (1 - alpha) * (loss over x + epsilon * sign(gradient))
=> the perturbed input is basically x + delta…
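As a sketch, reusing the hypothetical fgsm helper from above (alpha = 0.5 is an assumption):

```python
def adversarial_training_loss(f, x, y, eps, alpha=0.5):
    loss_clean = F.cross_entropy(f(x), y)      # actual loss
    x_adv = fgsm(f, x, y, eps)                 # x + eps * sign(gradient), i.e. x + delta
    loss_adv = F.cross_entropy(f(x_adv), y)    # adversarial loss
    return alpha * loss_clean + (1 - alpha) * loss_adv
```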
What is a result from adversarial retraining?
reduces the test-set error rate
improves model robustness
weights of the adversarially trained model are more localized
What is iterative FGSM?
apply FGSM T times, each time with a small step size alpha
-> after each iteration, clip the result to stay within the epsilon bound…
=> x'(t+1) = Clip( x'(t) + alpha * sign(gradient…) )
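A sketch of the iterative variant in PyTorch (T, alpha, and the projection into both the epsilon ball and the [0,1] domain follow the clip step above; names are assumptions):

```python
def iterative_fgsm(f, x, y_true, eps, alpha, T):
    x_adv = x.clone()
    for _ in range(T):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(f(x_adv), y_true)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # clip back into the epsilon ball around x, then into the feature domain
            x_adv = torch.clamp(x_adv, x - eps, x + eps).clamp(0, 1)
    return x_adv.detach()
```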
When is iterative FGSM useful?
if FGSM oversteps
-> e.g. local linearity does not hold…
With regards to what is FGSM optimal?
the infinity norm -> d = epsilon * sign(w) maximizes the change in the prediction among all d with ||d||_inf <= epsilon…