What were the goals of Szegedy et al. for targeted attacks?
minimize the difference between the original and the adversarial sample (x - x')
while also minimizing the classification loss of the adversarial example x' w.r.t. the target t
keep the distorted x' in the respective domain ([0,1] for images)
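In Szegedy et al.'s notation (c > 0 weights the two terms, J is the attacker loss), the three goals combine into the box-constrained problem:

\min_{x'} \; c \cdot \|x - x'\|_2 + J(x', t) \quad \text{s.t. } x' \in [0,1]^n

Both c and J reappear as open questions below.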
What are open questions in Szegedy et al.'s approach?
how to choose c?
what is the optimal attacker loss J?
how to enforce x' ∈ [0,1]^n?
How did Carlini and Wagner tackle the problem of finding the optimal c?
binary search…
-> very expensive
binary search:
choose an initial range [min, max] for c
-> if the current c leads to a successful misclassification
-> update the upper bound (try a smaller c)
-> if not
-> update the lower bound (increase c)
=> repeat until a balance between size of distortion and success is found
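A minimal Python sketch of this search (run_attack is a hypothetical helper that runs the optimization for a fixed c and reports success):

def binary_search_c(run_attack, c_min=1e-3, c_max=1e10, steps=10):
    best_adv = None
    for _ in range(steps):
        c = (c_min + c_max) / 2
        adv, success = run_attack(c)
        if success:
            best_adv = adv   # attack succeeded: try a smaller c for less distortion
            c_max = c        # update the upper bound
        else:
            c_min = c        # attack failed: increase the weight on the attack loss
    return best_adv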
How did C&W find a better loss function than the training loss?
they tested several different ones…
require:
x' classified as t
<=>
f_k(x') <= 0
-> the "<=>" holds for a 2-class network
-> for networks with more classes (n >= 3) only one direction is guaranteed:
f_k(x') <= 0 => x' classified as t
baseline loss: cross entropy loss
How is the baseline loss function in C&W defined?
we use the cross entropy loss
-> negative sum over the classes
-> where each term is the label y_i times the log of the softmax output F(x)_i
What is the cross entropy loss for a one-hot encoded y?
-> 0 if we classify correctly with full confidence (i.e. -log(1))
-> infinity if we miss completely (i.e. -log(0))
-> thus, for C&W's requirement: once the softmax output of the target class exceeds 1/2, the loss function should become negative!
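A tiny numeric example of the one-hot case (values are illustrative):

import numpy as np

y = np.array([0.0, 1.0, 0.0])   # one-hot label
p = np.array([0.2, 0.7, 0.1])   # softmax output of the model
ce = -np.sum(y * np.log(p))     # reduces to -log(0.7) ≈ 0.36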
What is the first proposed function of C&W?
->
f_1(x') = -log_2(F(x')_t) - 1
How to plot C&W functions?
y-axis: loss
x-axis: F(x’)_t
-> how sure is our model that x' is the target class?
=> the function intersects the x-axis at F(x')_t = 1/2!
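A short matplotlib sketch of such a plot for the first function:

import numpy as np
import matplotlib.pyplot as plt

F_t = np.linspace(0.01, 1.0, 200)              # softmax output of the target class
f1 = -np.log2(F_t) - 1                         # first C&W candidate loss
plt.plot(F_t, f1)
plt.axhline(0, color="gray", linewidth=0.5)
plt.axvline(0.5, color="gray", linestyle="--") # zero crossing at 1/2
plt.xlabel("F(x')_t")
plt.ylabel("loss")
plt.show()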
What is the second function of C&W?
f_2(x') = (max_{i != t} F(x')_i - F(x')_t)^+
-> maximum softmax output over the other classes
-> minus the softmax output of the target class
-> and then the maximum of that difference and 0
What is the fourth function of C&W?
f_4(x') = (1/2 - F(x')_t)^+
-> positive as long as the softmax output of the target class is below 1/2
What is the fifth function of C&W?
f_5(x') = -log(2 F(x')_t - 2) (as printed in the paper)
What is the sixth function of C&W?
f_6(x') = (max_{i != t} Z(x')_i - Z(x')_t)^+
where Z(x')_i is the logit
-> the output of the NN before applying the softmax
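The candidate functions are easy to compare on toy logits; a numpy sketch (the logit vector z and target class index t are assumed inputs):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def f2(z, t):   # (max_{i != t} F(x')_i - F(x')_t)^+
    F = softmax(z)
    return max(np.max(np.delete(F, t)) - F[t], 0.0)

def f4(z, t):   # (1/2 - F(x')_t)^+
    return max(0.5 - softmax(z)[t], 0.0)

def f6(z, t):   # (max_{i != t} Z(x')_i - Z(x')_t)^+, computed on the logits
    return max(np.max(np.delete(z, t)) - z[t], 0.0)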
How do C&W solve the third problem (box constraint x' ∈ [0,1]^n)?
Projected Gradient Descent
Clipped Gradient Descent
Change of Variables
How does Projected GD work?
clipping x’ after each update
works poorly for more complacted optimizers
example: manually moving the x’i interferes with momentum
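A minimal numpy sketch (grad_loss is a hypothetical function returning the gradient of the attack objective):

import numpy as np

def projected_gd(x, grad_loss, lr=0.01, num_steps=100):
    x_adv = x.copy()
    for _ in range(num_steps):
        x_adv = x_adv - lr * grad_loss(x_adv)  # plain gradient step on the objective
        x_adv = np.clip(x_adv, 0.0, 1.0)       # project back onto the box [0,1]^n
    return x_adv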
How does clipped GD work?
clip inside the objective function f_k
via min(max(x', 0), 1)
the authors report convergence problems
-> flat spots (zero gradient wherever the clipping is active)
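A minimal numpy sketch of why the flat spots arise (f is an assumed attack loss):

import numpy as np

def clipped_objective(f, x_adv):
    x_box = np.minimum(np.maximum(x_adv, 0.0), 1.0)  # min(max(x', 0), 1)
    # wherever the clip is active, changes in x_adv do not change x_box,
    # so the gradient there is zero -> "flat spots"
    return f(x_box)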
How does change of variables work?
replace x' by 1/2 (tanh(w) + 1), which lies in [0,1] for every real w
-> optimize over the unconstrained variable w instead
What is the final approach of C&W?
minimize w.r.t. w (the unconstrained replacement for x')
the first half is essentially ||x - x'||^2, where x' is replaced by 1/2 (tanh(w) + 1) to fulfill the box constraint
the second half is c times the sixth loss function f_6, where x' is likewise replaced by 1/2 (tanh(w) + 1)
-> the first part minimizes the distortion
-> the second part minimizes the loss w.r.t. the target class
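Putting it together, a minimal PyTorch sketch of the final objective (model is assumed to return logits for a batch; c, lr and steps are illustrative; kappa is C&W's confidence margin, 0 in the simplest case):

import torch

def cw_l2_attack(model, x, t, c, steps=1000, lr=0.01, kappa=0.0):
    # change of variables: x' = 0.5 * (tanh(w) + 1) always lies in [0,1]
    w = torch.atanh(2 * x.clamp(1e-6, 1 - 1e-6) - 1).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        z = model(x_adv)                                # logits Z(x')
        z_t = z[:, t]                                   # logit of the target class
        z_other = z.clone()
        z_other[:, t] = float("-inf")                   # exclude target from the max
        f6 = torch.clamp(z_other.max(dim=1).values - z_t, min=-kappa)
        loss = ((x_adv - x) ** 2).sum() + c * f6.sum()  # distortion + c * attack loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()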
Summary C&W?
strong attack
expensive (due to finding c empirically, i.e. binary search)
no threshold epsilon for the maximal distortion
as opposed to FGSM and iterative FGSM
distortion is instead controlled via the hyperparameter c
-> FGSM: fast
-> C&W: strong