What is anomaly detection?
identification of rare items
-> which raise suspicion by differing significantly from the majority of the data
What is a point anomaly?
individual data instance
which can be considered anomalous
with respect to the rest of the data
-> e.g. a day with a temperature of 50 degrees Celsius
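A minimal sketch of flagging a point anomaly, assuming a simple global z-score rule. The data, the function name `point_anomalies` and the cutoff `z_max=2.5` are illustrative assumptions, not part of the notes.

```python
from statistics import mean, stdev

# Hypothetical daily temperatures in degrees Celsius; 50 is the odd one out.
temps = [18, 21, 19, 22, 20, 50, 21, 19, 20, 22]

def point_anomalies(values, z_max=2.5):
    """Return the values whose absolute z-score (w.r.t. all values) exceeds z_max."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > z_max]

flagged = point_anomalies(temps)
```

Here only the 50-degree day is compared against the rest of the data as a whole, which is what makes it a point anomaly.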
What is a contextual anomaly?
data instance is anomalous
in specific context
(but not otherwise)
-> e.g. one day with a temperature of 25 degrees Celsius in January
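A minimal sketch of the same idea in code, assuming the context is the month and that each month has its own reference history. The data, the name `is_contextual_anomaly` and the cutoff `z_max=3.0` are illustrative assumptions.

```python
from statistics import mean, stdev

# Hypothetical history of daily temperatures (degrees Celsius) per month.
history = {
    "january": [1, 3, 0, 2, 4, 1, 2, 3],
    "july": [24, 26, 25, 27, 23, 26, 25, 24],
}

def is_contextual_anomaly(month, temp, z_max=3.0):
    """Compare temp only against the reference values of its own context (month)."""
    ref = history[month]
    return abs(temp - mean(ref)) / stdev(ref) > z_max

january_flag = is_contextual_anomaly("january", 25)  # unusual for January
july_flag = is_contextual_anomaly("july", 25)        # ordinary for July
```

The same value of 25 degrees is flagged in one context and accepted in the other.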
What is a collective anomaly?
collection of related data instances is anomalous
with respect to the entire data set
-> e.g. a 10-day streak of days with 20 degrees in February
-> whereas a single day with 20 degrees could be normal (nowadays)…
Categorization matrix of anomalies?
row: single and multiple datapoints
column: global and local context
single, global -> point anomaly
single, local -> contextual
multiple, global -> collective
multiple, local -> X
What is the difference between supervised, semi-supervised and unsupervised anomaly detection?
supervised: (X,y)
-> we know what the outcome should be
semi-supervised: (X,y’)
-> we train on a single class (e.g. normal data points)
unsupervised: (X,)
-> we have only the input but no label
What preprocessing do we need to do when we expect collective anomalies?
need to aggregate features over time
=> just like my windowing approach in Bachelor Thesis / Work
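A minimal sketch of aggregating a feature over time, assuming a rolling mean with a window of 10 days. The data and the name `rolling_mean` are illustrative assumptions; other aggregates (sum, max, variance) would follow the same pattern.

```python
def rolling_mean(values, window):
    """Mean of every run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical February temperatures: one isolated warm day, then a 10-day warm streak.
feb = [5, 6, 4, 5, 20, 5, 6, 4, 5, 6, 5, 4] + [20] * 10 + [5, 4, 6]

features = rolling_mean(feb, window=10)
```

The isolated warm day barely moves its window mean, while the window covering the streak reaches a mean of 20, so the streak becomes visible as a single aggregated feature value.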
How can we model “normality”?
threshold approach
-> define normal as the feature values we have already seen
-> thus we know certain limits…
fit specific distribution to data
kernel density estimation
What is a problem with thresholding for normality?
arbitrary
yields only a binary decision…
-> would rather like probabilistic results…
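A minimal sketch of the threshold approach, assuming "normal" means lying between the smallest and largest value seen during training. The names `fit_limits` and `is_normal` and the data are illustrative assumptions.

```python
def fit_limits(train):
    """Remember the range of feature values seen so far."""
    return min(train), max(train)

def is_normal(x, limits):
    """Binary decision: inside the remembered limits or not."""
    low, high = limits
    return low <= x <= high

limits = fit_limits([18, 21, 19, 22, 20])
slightly_outside = is_normal(22.1, limits)
far_outside = is_normal(50, limits)
```

Both 22.1 and 50 get the same answer, which illustrates why a graded, probabilistic score would be preferable.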
How do we use distributions to model normality?
choose specific distribution
fit it to data
-> e.g. normal distribution…
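A minimal sketch of fitting a distribution, assuming a normal distribution and using `statistics.NormalDist` from the Python standard library. The training data is an illustrative assumption.

```python
from statistics import NormalDist

# Hypothetical training temperatures in degrees Celsius.
train = [18, 21, 19, 22, 20, 21, 19, 20]

# Estimate mean and standard deviation from the samples.
model = NormalDist.from_samples(train)

density_typical = model.pdf(20)  # density near the centre of the data
density_extreme = model.pdf(50)  # density far out in the tail
```

Unlike a hard threshold, the density gives a graded score: the further a value lies from the bulk of the data, the smaller it gets.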
When is a function a probability density function on R?
f(x) >= 0 for all x in R
integral of f over R = 1
allowed:
there exists x’ such that f(x’) > 1
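A small numerical check of these properties, assuming a normal density with a small standard deviation of 0.1 as the example. The integral is approximated by a simple Riemann sum, so it is only close to 1, not exactly 1.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0.0, sigma=0.1)

# The density value at the peak is about 3.99, i.e. greater than 1.
peak = narrow.pdf(0.0)

# Riemann sum over [-1, 1], which spans ten standard deviations on each side.
step = 0.001
area = sum(narrow.pdf(-1 + i * step) * step for i in range(2000))
```

A density value is not a probability, so values above 1 are fine as long as the total area stays 1.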
When is a function a probability?
P(X = x) = 0 for any x in R
P(A <= X <= B) = integral from A to B of f(x) dx <= 1
-> whatever region we integrate over, the result is at most 1
When is a function a cumulative distribution function?
C(x) := P(Z <= x) = integral from negative infinity to x of f(z) dz <= 1
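A short sketch relating the cumulative distribution function to interval probabilities, assuming a standard normal distribution as the example: P(A <= X <= B) = C(B) - C(A).

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of an interval as a difference of CDF values (about 0.683).
p_interval = std.cdf(1) - std.cdf(-1)

# An interval of width zero, i.e. a single point, has probability 0.
p_point = std.cdf(1) - std.cdf(1)

# The CDF approaches 1 towards the right.
c_far_right = std.cdf(10)
```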
What are two main characteristics of kernel density estimation (KDE)?
non-parametric
do not explicitly specify which probability distribution to use
density estimation
but still use a probability distribution
-> (instead of naive approaches such as remembering the normal values seen so far…)
What is the goal of KDE?
given set of observations {x_i}
drawn independently from an unknown density function f
find a density function f_hat (estimator) approximating the original f
=> also works for exotic distributions!!!
How is the estimator in KDE defined?
let (x_1, …, x_n) be n univariate samples drawn independently and identically distributed from some distribution with unknown density f
estimator:
f_hat_h(x) = 1/n SUM(i=1 to n) K_h(x - x_i) = 1/(n*h) SUM(i=1 to n) K((x - x_i) / h)
where K is a kernel and K_h(u) = (1/h) K(u/h) is the scaled kernel
and h > 0 is the “bandwidth”; a hyperparameter (tunable, e.g. via cross-validation)
What is a kernel?
non-negative function that integrates to one
-> often the density of the standard normal distribution is used:
K(x) = (1/sqrt(2*pi)) * e^(-(x^2) / 2)
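A minimal sketch of the estimator f_hat_h(x) = 1/(n*h) SUM K((x - x_i) / h) with the standard normal kernel, written in plain Python. The names `gaussian_kernel` and `kde`, the data and the bandwidth h = 1.0 are illustrative assumptions.

```python
import math

def gaussian_kernel(u):
    """Standard normal density: non-negative and integrates to one."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def kde(x, samples, h):
    """Kernel density estimate at x with bandwidth h."""
    n = len(samples)
    return sum(gaussian_kernel((x - x_i) / h) for x_i in samples) / (n * h)

# Hypothetical observations (temperatures in degrees Celsius).
samples = [18, 21, 19, 22, 20, 21, 19, 20]

density_typical = kde(20, samples, h=1.0)
density_extreme = kde(50, samples, h=1.0)

# Riemann sum over [10, 30]: the estimate is itself a density, so this is close to 1.
area = sum(kde(10 + i * 0.01, samples, h=1.0) * 0.01 for i in range(2000))
```

Each sample contributes one small bump centred at x_i, and the estimate is the average of these bumps.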
What does the bandwidth h control in our estimator?
“smoothness”
-> h too big
=> estimator becomes too “smooth” (too much bias)
=> too much bias in the sense of smoothing out anomalies
-> h too small
=> estimator becomes too jagged (too much variance)
-> overfitting
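A minimal sketch of both failure modes, assuming a Gaussian kernel and a tiny hypothetical data set. The bandwidths 0.1 and 10 are chosen only to exaggerate the effect.

```python
import math

def gaussian_kernel(u):
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def kde(x, samples, h):
    """f_hat_h(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - x_i) / h) for x_i in samples) / (len(samples) * h)

samples = [18, 19, 20, 21, 22]  # hypothetical normal observations

# h too small: the density collapses between neighbouring samples (jagged).
gap_ratio = kde(20.5, samples, h=0.1) / kde(20, samples, h=0.1)

# h too big: a value far from all samples still gets a comparable density (oversmoothed).
far_ratio = kde(30, samples, h=10) / kde(20, samples, h=10)
```

With h = 0.1 the value 20.5, which sits right between two observations, looks almost impossible, while with h = 10 the value 30 looks nearly as plausible as 20.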
How do we perform cross-validation on kernel density estimators?
for all candidate hyperparameters h,
for all k in [1, …, n]
fit the model on all x_i except x_k
evaluate the model on the held-out data as the product of f_hat_h(x_k) over the left-out points (with leave-one-out there is only one such point)
return the h whose average validation score is best…
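A minimal sketch of leave-one-out cross-validation for the bandwidth, assuming a Gaussian kernel. As an assumption that deviates slightly from the card, the score is the average log density of the held-out points instead of a raw product, which is numerically more stable. The names `loo_score` and `select_bandwidth`, the data and the candidate list are illustrative.

```python
import math

def gaussian_kernel(u):
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def kde(x, samples, h):
    """f_hat_h(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - x_i) / h) for x_i in samples) / (len(samples) * h)

def loo_score(samples, h):
    """Average log density of each held-out sample under the KDE fit on the rest."""
    total = 0.0
    for k, x_k in enumerate(samples):
        rest = samples[:k] + samples[k + 1:]  # all x_i except x_k
        total += math.log(kde(x_k, rest, h))
    return total / len(samples)

def select_bandwidth(samples, candidates):
    """Return the candidate h with the best average validation score."""
    return max(candidates, key=lambda h: loo_score(samples, h))

samples = [18, 21, 19, 22, 20, 21, 19, 20]  # hypothetical observations
best_h = select_bandwidth(samples, [0.05, 0.5, 1.0, 5.0, 50.0])
```

Very small bandwidths score badly because the held-out point falls into a gap, and very large ones score badly because the density is spread too thin.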
How can KDE be applied?
extract features from data (e.g. via window aggregation)
train KDE; possibly use cross-validation
flag a new instance as anomalous if f_hat(x_new) < threshold
=> i.e. if its estimated density is low…
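A minimal sketch of the decision step, assuming the features are already extracted and a Gaussian kernel is used. The bandwidth h = 1.0, the threshold 1e-3 and the name `is_anomalous` are illustrative assumptions; in practice both values would have to be tuned.

```python
import math

def gaussian_kernel(u):
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def kde(x, samples, h):
    """f_hat_h(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - x_i) / h) for x_i in samples) / (len(samples) * h)

def is_anomalous(x_new, train, h, threshold):
    """Flag x_new if its estimated density falls below the threshold."""
    return kde(x_new, train, h) < threshold

train = [18, 21, 19, 22, 20, 21, 19, 20]  # hypothetical training features

flag_typical = is_anomalous(20, train, h=1.0, threshold=1e-3)
flag_extreme = is_anomalous(50, train, h=1.0, threshold=1e-3)
```

The density itself can also be reported as a graded score next to the binary flag.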
How can KDE be visualized?
if e.g. the threshold is very small
-> use a logarithmic y-axis…