Bayes Formula
TF-IDF
Softmax
Word2Vec
How does Word2Vec work?
learn meaning of word by company it keeps
-> for specific document / corpus / vocabulary
-> train a NN so that it predicts the most likely neighbors of a specific word
-> by maximizing the softmax of neighbors
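A minimal sketch of the softmax-over-neighbors idea, assuming a skip-gram style setup. The vocabulary, the 2-d vectors and the helper names are hypothetical and chosen only for illustration; real training would learn these vectors.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def neighbor_probs(center_vec, context_vecs):
    """P(neighbor | center): softmax over dot products with every context vector."""
    scores = [sum(c * x for c, x in zip(center_vec, ctx)) for ctx in context_vecs]
    return softmax(scores)

# Hypothetical 2-d vectors for a 3-word vocabulary (illustrative values only).
vocab = ["king", "queen", "banana"]
context_vecs = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
center = [1.0, 0.2]  # hypothetical center-word vector for "king"

probs = neighbor_probs(center, context_vecs)

# Training raises the probability of observed neighbors, i.e. it minimizes
# -log(probs[i]) for each neighbor i actually seen next to the center word.
loss_for_queen = -math.log(probs[1])
```

Words whose vectors point in a similar direction to the center vector get a higher neighbor probability.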
Control Flow Embedding (high level)
Graph embedding is sum of embedding of control blocks
How to embed control blocks?
How to calculate cosine similarity?
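A short sketch of the standard definition: dot product divided by the product of the vector lengths. The function name is an assumption, and the vectors are plain Python lists.

```python
import math

def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|); undefined if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])       # opposite directions
```

The result lies in [-1, 1] and depends only on direction, not on vector length.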
Cosine Loss?
When function differentiable?
at each point, the left and right limits of the difference quotient exist and are equal
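A rough numerical illustration, not a proof: compare the slope seen from the left with the slope seen from the right. The helper name and the step size h are assumptions.

```python
def one_sided_slopes(f, x, h=1e-6):
    """Approximate the left and right difference quotients of f at x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right

# |x| has a kink at 0: the two slopes disagree, so it is not differentiable there.
left_abs, right_abs = one_sided_slopes(abs, 0.0)

# x^2 is smooth at 0: both slopes approach the same value.
left_sq, right_sq = one_sided_slopes(lambda x: x * x, 0.0)
```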
How to do backpropagation?
Solve
With Gradients
for iterative update
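A minimal sketch of a gradient-based iterative update, assuming a one-weight model y = w * x with squared-error loss. The model, the learning rate and the function name are hypothetical; the formulas on the original cards are not reproduced here.

```python
def train(x, t, w=0.0, lr=0.1, steps=50):
    """Fit a single weight w so that w * x approaches the target t."""
    for _ in range(steps):
        y = w * x              # forward pass
        dL_dy = 2.0 * (y - t)  # derivative of the loss (y - t)^2 with respect to y
        dL_dw = dL_dy * x      # chain rule: dL/dw = dL/dy * dy/dw
        w = w - lr * dL_dw     # iterative update against the gradient
    return w

w = train(x=2.0, t=6.0)  # converges toward w = 3, since 3 * 2 = 6
```

In a deeper network the same chain-rule step is repeated layer by layer, from the output back to the input.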
Targeted attack
Untargeted attack
What is the λ used for in the loss?
adds the deviation (perturbation) δ to the loss, weighted by λ
-> forces δ to be small …
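A hedged sketch of how λ could trade off attack success against the size of δ. The squared L2 penalty, the `model_loss` stand-in and all values are assumptions; the notes do not state which norm or loss is used.

```python
def attack_objective(model_loss, x, delta, lam):
    """Objective the attacker minimizes: loss on the perturbed input plus lam * |delta|^2."""
    perturbed = [xi + di for xi, di in zip(x, delta)]
    penalty = sum(di * di for di in delta)  # squared L2 norm (assumed choice)
    return model_loss(perturbed) + lam * penalty

# Hypothetical stand-in for the attack loss on the perturbed input.
model_loss = lambda v: -sum(v)

x = [1.0, 2.0]
delta = [0.5, -0.5]
weak_penalty = attack_objective(model_loss, x, delta, lam=0.1)
strong_penalty = attack_objective(model_loss, x, delta, lam=10.0)
```

With a larger λ the same δ costs more, so the minimizer is pushed toward smaller perturbations.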
How to solve attacks in adversarial learning?
How to calc a gradient?
partial derivatives of the loss function with respect to each input
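A small sketch that approximates each partial derivative with central differences. The example loss and the step size h are assumptions; frameworks normally compute exact gradients by automatic differentiation instead.

```python
def numerical_gradient(loss, x, h=1e-6):
    """Approximate d loss / d x[i] for every input i with central differences."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((loss(up) - loss(down)) / (2.0 * h))
    return grad

# Hypothetical loss L(v) = v0^2 + 3 * v1, whose exact gradient is [2 * v0, 3].
g = numerical_gradient(lambda v: v[0] ** 2 + 3.0 * v[1], [1.0, 2.0])
```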
How is KDE used?
have kernel function (e.g. normal distribution)
-> add it at every datapoint you have
-> sum the kernel functions into a single function and normalize by the number of datapoints
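The steps above can be sketched as follows, assuming a normal kernel and a bandwidth h that scales its width. The data values and function names are made up for illustration.

```python
import math

def normal_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    """Place one kernel on every datapoint, sum them, normalize by n * h."""
    return sum(normal_kernel((x - xi) / h) for xi in data) / (len(data) * h)

data = [1.0, 1.5, 4.0]  # hypothetical sample
density_at_1 = kde(1.0, data, h=0.5)

# Riemann-sum check over [-10, 20): the estimate should integrate to about 1.
step = 0.01
area = sum(kde(-10.0 + i * step, data, 0.5) * step for i in range(3000))
```

Dividing by n * h is what keeps the total area at 1 regardless of how many datapoints there are.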
What must hold in KDE?
must integrate to 1…
must be non-negative at every point…
Why use KDE and not regular distribution fitting?
KDE can fit to arbitrary data
-> regular distribution not…
KDE estimator
Normal kernel
What is h used for in KDE estimators?
smoothness of estimator
-> h too big -> too smooth -> underfitting (high bias)
-> h too small -> too jagged -> overfitting (high variance)
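A small sketch of both failure modes on two well-separated clusters, assuming a normal kernel. The data and the bandwidth values are hypothetical.

```python
import math

def kde(x, data, h):
    """Normal-kernel density estimate at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

data = [0.0, 0.1, 5.0, 5.1]  # two clusters, around 0 and around 5

# Small h: sharp spikes at the datapoints, almost no mass in the gap between clusters.
gap_small_h = kde(2.5, data, h=0.1)
peak_small_h = kde(0.05, data, h=0.1)

# Large h: the clusters blur into one bump that peaks in the gap.
gap_large_h = kde(2.5, data, h=5.0)
cluster_large_h = kde(0.0, data, h=5.0)
```

With h = 5 the estimate is higher in the empty gap than at the actual clusters, which is the over-smoothed case.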
How to evaluate KDE estimators?
makes sense to look at log values for small C…
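One common way to score an estimator, assumed here rather than taken from the notes, is the log-likelihood of held-out points. Summing logs avoids the underflow that multiplying many small density values would cause. The data split and the candidate bandwidths are hypothetical.

```python
import math

def kde(x, data, h):
    """Normal-kernel density estimate at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

def log_likelihood(held_out, train, h):
    """Sum of log densities of held-out points; higher is better."""
    return sum(math.log(kde(x, train, h)) for x in held_out)

train = [0.0, 0.2, 0.4, 5.0, 5.2]  # hypothetical training sample
held_out = [0.1, 5.1]              # hypothetical held-out sample

scores = {h: log_likelihood(held_out, train, h) for h in (0.5, 5.0)}
```

Here the heavily smoothed estimate (h = 5.0) scores lower than h = 0.5 because it spreads density far away from the clusters.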