What were the goals of Szegedy et al. for targeted attacks?
minimize the difference between the original and the adversarial sample (x - x')
while also minimizing the classification loss of the adversarial example x' w.r.t. the target t
keep the distorted x' in the respective domain ([0,1] for images)
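In Szegedy et al.'s notation (c > 0 weights the two terms, J is the attacker loss), the three goals combine into the box-constrained problem:

\min_{x'} \; c \cdot \|x - x'\|_2 + J(x', t) \quad \text{s.t. } x' \in [0,1]^n

Both c and J reappear as open questions below.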
What are open questions in Szegedy et al.'s approach?
how to choose c?
what is the optimal attacker loss J?
how to enforce x' ∈ [0,1]^n?
How did Carlini and Wagner tackle the problem of finding the optimal c?
binary search…
-> very expensive
binary search:
choose an initial range [min, max] for c
-> if the current c leads to a successful misclassification
-> update the upper bound (try a smaller c)
-> if not
-> update the lower bound (increase c)
=> repeat until a balance between size of distortion and success is found
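A minimal Python sketch of this search (run_attack is a hypothetical helper that runs the optimization for a fixed c and reports success):

def binary_search_c(run_attack, c_min=1e-3, c_max=1e10, steps=10):
    best_adv = None
    for _ in range(steps):
        c = (c_min + c_max) / 2
        adv, success = run_attack(c)
        if success:
            best_adv = adv   # attack succeeded: try a smaller c for less distortion
            c_max = c        # update the upper bound
        else:
            c_min = c        # attack failed: increase the weight on the attack loss
    return best_adv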
How did C&W find a better loss function than the training loss?
they tested several different ones…
require:
x' classified as t
<=>
f_k(x') <= 0
-> the "<=>" holds for a 2-class network
-> for networks with more classes (n >= 3) only one direction is guaranteed:
f_k(x') <= 0 => x' classified as t
baseline loss: cross entropy loss
How is the baseline loss function in C&W defined?
we use the cross entropy loss
-> negative sum over the classes
-> where each term is the label y_i times the log of the softmax output F(x)_i
What is the cross entropy loss for a one-hot encoded y?
-> 0 if we classify correctly with full confidence (i.e. -log(1))
-> infinity if we miss completely (i.e. -log(0))
-> thus, for C&W's requirement: once the softmax output of the target class exceeds 1/2, the loss function should become negative!
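A tiny numeric example of the one-hot case (values are illustrative):

import numpy as np

y = np.array([0.0, 1.0, 0.0])   # one-hot label
p = np.array([0.2, 0.7, 0.1])   # softmax output of the model
ce = -np.sum(y * np.log(p))     # reduces to -log(0.7) ≈ 0.36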
What is the first proposed function of C&W?
->
f_1(x') = -log_2(F(x')_t) - 1
How to plot C&W functions?
y-axis: loss
x-axis: F(x’)_t
-> how sure is our model that x' is the target class?
=> the function intersects the x-axis at F(x')_t = 1/2!
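A short matplotlib sketch of such a plot for the first function:

import numpy as np
import matplotlib.pyplot as plt

F_t = np.linspace(0.01, 1.0, 200)              # softmax output of the target class
f1 = -np.log2(F_t) - 1                         # first C&W candidate loss
plt.plot(F_t, f1)
plt.axhline(0, color="gray", linewidth=0.5)
plt.axvline(0.5, color="gray", linestyle="--") # zero crossing at 1/2
plt.xlabel("F(x')_t")
plt.ylabel("loss")
plt.show()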
What is the second function of C&W?
f_2(x') = (max_{i != t} F(x')_i - F(x')_t)^+
-> maximum softmax output over the other classes
-> minus the softmax output of the target class
-> and then the maximum of that difference and 0
What is the fourth function of C&W?
f_4(x') = (1/2 - F(x')_t)^+
-> positive as long as the softmax output of the target class is below 1/2
What is the fifth function of C&W?
f_5(x') = -log(2 F(x')_t - 2) (as printed in the paper)
What is the sixth function of C&W?
f_6(x') = (max_{i != t} Z(x')_i - Z(x')_t)^+
where Z(x')_i is the logit
-> the output of the NN before applying the softmax
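The candidate functions are easy to compare on toy logits; a numpy sketch (the logit vector z and target class index t are assumed inputs):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def f2(z, t):   # (max_{i != t} F(x')_i - F(x')_t)^+
    F = softmax(z)
    return max(np.max(np.delete(F, t)) - F[t], 0.0)

def f4(z, t):   # (1/2 - F(x')_t)^+
    return max(0.5 - softmax(z)[t], 0.0)

def f6(z, t):   # (max_{i != t} Z(x')_i - Z(x')_t)^+, computed on the logits
    return max(np.max(np.delete(z, t)) - z[t], 0.0)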
How do C&W solve the third problem (box constraint x' ∈ [0,1]^n)?
Projected Gradient Descent
Clipped Gradient Descent
Change of Variables
How does Projected GD work?
clipping x’ after each update
works poorly for more complacted optimizers
example: manually moving the x’i interferes with momentum
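A minimal numpy sketch (grad_loss is a hypothetical function returning the gradient of the attack objective):

import numpy as np

def projected_gd(x, grad_loss, lr=0.01, num_steps=100):
    x_adv = x.copy()
    for _ in range(num_steps):
        x_adv = x_adv - lr * grad_loss(x_adv)  # plain gradient step on the objective
        x_adv = np.clip(x_adv, 0.0, 1.0)       # project back onto the box [0,1]^n
    return x_adv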
How does clipped GD work?
clip inside the objective function f_k
via min(max(x', 0), 1)
the authors report convergence problems
-> flat spots (zero gradient wherever the clipping is active)
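A minimal numpy sketch of why the flat spots arise (f is an assumed attack loss):

import numpy as np

def clipped_objective(f, x_adv):
    x_box = np.minimum(np.maximum(x_adv, 0.0), 1.0)  # min(max(x', 0), 1)
    # wherever the clip is active, changes in x_adv do not change x_box,
    # so the gradient there is zero -> "flat spots"
    return f(x_box)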
How does change of variables work?
replace x' by 1/2 (tanh(w) + 1), which lies in [0,1] for every real w
-> optimize over the unconstrained variable w instead
What is the final approach of C&W?
minimize w.r.t. w (the unconstrained replacement for x')
the first half is essentially ||x - x'||^2, where x' is replaced by 1/2 (tanh(w) + 1) to fulfill the box constraint
the second half is c times the sixth loss function f_6, where x' is likewise replaced by 1/2 (tanh(w) + 1)
-> the first part minimizes the distortion
-> the second part minimizes the loss w.r.t. the target class
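Putting it together, a minimal PyTorch sketch of the final objective (model is assumed to return logits for a batch; c, lr and steps are illustrative; kappa is C&W's confidence margin, 0 in the simplest case):

import torch

def cw_l2_attack(model, x, t, c, steps=1000, lr=0.01, kappa=0.0):
    # change of variables: x' = 0.5 * (tanh(w) + 1) always lies in [0,1]
    w = torch.atanh(2 * x.clamp(1e-6, 1 - 1e-6) - 1).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        z = model(x_adv)                                # logits Z(x')
        z_t = z[:, t]                                   # logit of the target class
        z_other = z.clone()
        z_other[:, t] = float("-inf")                   # exclude target from the max
        f6 = torch.clamp(z_other.max(dim=1).values - z_t, min=-kappa)
        loss = ((x_adv - x) ** 2).sum() + c * f6.sum()  # distortion + c * attack loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()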
Summary C&W?
strong attack
expensive (due to finding c empirically, i.e. binary search)
no threshold epsilon for the maximal distortion
as opposed to FGSM and iterative FGSM
distortion is instead controlled via the hyperparameter c
-> FGSM: fast
-> C&W: strong