What are the two approaches to building spam classifiers?
hand-crafted features -> Naive Bayes
learned features -> TF-IDF; word embeddings
What is a conditional probability?
P(A|B) -> probability that A occurs given that B has already occurred
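A counting sketch of the definition above; the outcome data is made up for illustration:

```python
# Counting version of P(A|B): among the outcomes where B occurred,
# the fraction where A also occurred. Outcomes are equally likely (toy data).
outcomes = [("A", "B"), ("A", "B"), ("B",), ("B",), ("A",), (), (), ()]

b_outcomes = [o for o in outcomes if "B" in o]  # restrict to "B has occurred"
p_a_given_b = sum("A" in o for o in b_outcomes) / len(b_outcomes)
print(p_a_given_b)  # 2 of the 4 B-outcomes also contain A -> 0.5
```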
What is a joint probability?
P(A,B) -> probability that two events occur simultaneously
=> when counting -> intersection of both sets (outcomes where both are true…)
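The intersection view can be written out directly; the outcome indices below are hypothetical:

```python
# P(A,B) by counting: size of the intersection of both event sets
# over the total number of (equally likely) outcomes. Toy data.
n_outcomes = 10
A = {0, 1, 4}      # outcome indices where A is true (made up)
B = {0, 1, 2, 3}   # outcome indices where B is true (made up)

p_joint = len(A & B) / n_outcomes  # outcomes where both are true
print(p_joint)  # |{0, 1}| / 10 -> 0.2
```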
What is the taxonomy of the terms in Bayes' rule (S = spam, F = feature)?
P(F) -> evidence
P(F|S) -> likelihood
P(S) -> prior
P(S|F) -> posterior
Bayes' rule
P(A,B)
= P(A|B) P(B)
= P(B|A) P(A)
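A worked application of the rule with the spam taxonomy above; all numbers are invented for illustration:

```python
# Bayes' rule: posterior = likelihood * prior / evidence (toy numbers).
p_s = 0.3               # prior P(S): fraction of mail that is spam
p_f_given_s = 0.8       # likelihood P(F|S): feature present in spam
p_f_given_not_s = 0.1   # P(F|not S): feature present in non-spam

# Evidence P(F) via the law of total probability.
p_f = p_f_given_s * p_s + p_f_given_not_s * (1 - p_s)

# Posterior P(S|F) = P(F|S) P(S) / P(F)
p_s_given_f = p_f_given_s * p_s / p_f
print(round(p_s_given_f, 4))  # roughly 0.774
```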
How to store bayes tables?
the full table would need one entry for every combination of the features and S
-> for n binary features
2^(n+1) entries
=> infeasible…
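The exponential blow-up is easy to see numerically:

```python
# Size of the full joint table: one probability entry for every
# combination of n binary features together with the class S.
sizes = {n: 2 ** (n + 1) for n in (10, 30, 100)}
# n = 10 already needs 2048 entries; n = 100 needs ~2.5e30 -> infeasible.
print(sizes[10], sizes[30])  # 2048 2147483648
```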
What is the naive Bayes assumption?
the features are conditionally independent of each other given the class
=> we only have to store P(F_i|S) and P(F_i|not S) per feature
=> 4n entries instead of 2^(n+1)…
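A minimal naive Bayes classifier built from these per-feature tables; the probabilities are hypothetical:

```python
# Naive Bayes sketch: store only P(F_i|S) and P(F_i|not S) per feature
# instead of the full joint table. All probabilities are made up.
import math

p_s = 0.3
p_f_given_s = [0.8, 0.4, 0.1]      # P(F_i=1 | S) for n = 3 features
p_f_given_not_s = [0.1, 0.3, 0.2]  # P(F_i=1 | not S)

def score(features, prior, likelihoods):
    """Unnormalized log posterior: log P(class) + sum_i log P(F_i|class)."""
    s = math.log(prior)
    for f, p in zip(features, likelihoods):
        s += math.log(p if f else 1 - p)
    return s

x = [1, 0, 1]  # observed feature vector
spam = score(x, p_s, p_f_given_s)
ham = score(x, 1 - p_s, p_f_given_not_s)
print("spam" if spam > ham else "ham")
```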
Why does one naive assumption suffice?
the denominator P(F) is the same for P(S|…) and P(not S|…)
=> thus it is only a normalization constant…
=> when comparing the two probabilities, it can be disregarded…
=> storing it would also require another n variables…
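A small check that dropping the shared denominator does not change the decision (toy numbers):

```python
# The denominator P(F) is identical for both posteriors, so comparing
# only the numerators gives the same classification. Toy numbers.
p_s, p_not_s = 0.3, 0.7
p_f_given_s, p_f_given_not_s = 0.8, 0.1

num_spam = p_f_given_s * p_s          # P(F|S) P(S)
num_ham = p_f_given_not_s * p_not_s   # P(F|not S) P(not S)

p_f = num_spam + num_ham  # the shared normalization constant
# Dividing both sides by p_f cannot flip the comparison:
assert (num_spam > num_ham) == (num_spam / p_f > num_ham / p_f)
print(num_spam > num_ham)  # True -> classified as spam either way
```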
What are the advantages of naive Bayes over other supervised learning algorithms?
adding new training data and new features is easy…
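One way to see why adding data is cheap: the parameters are just counts, so a new labelled email only bumps a few counters. The counts below are hypothetical:

```python
# Incremental naive Bayes update: fold in one new example by
# incrementing counters (hypothetical starting counts).
counts = {"spam": 10, "ham": 40}                    # class counts
feature_counts = {"spam": [8, 2], "ham": [4, 30]}   # per-feature "=1" counts

def update(label, features):
    """Add one new labelled example by bumping the relevant counters."""
    counts[label] += 1
    for i, f in enumerate(features):
        feature_counts[label][i] += f

update("spam", [1, 0])  # one new spam mail where feature 0 is present
p_f0_given_spam = feature_counts["spam"][0] / counts["spam"]
print(p_f0_given_spam)  # 9 / 11
```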
Why not directly compute P(S|F1, F2, …, Fn)?
there exists no rule that factorizes the posterior
e.g. P(S|F1, …, Fn) ≠ P(S|F1) P(S|F2) …
=> the naive assumption factorizes the likelihood P(F|S), not the posterior
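A numeric check on a toy model: even when the features are conditionally independent given S, the product of single-feature posteriors does not equal the joint posterior (all probabilities invented):

```python
# Counterexample: P(S|F1,F2) != P(S|F1) * P(S|F2), even under the
# naive Bayes assumption. Toy probabilities.
p_s = 0.5
p_f_given_s = [0.9, 0.9]       # P(F_i=1|S), made up
p_f_given_not_s = [0.2, 0.2]   # P(F_i=1|not S), made up

def posterior(fs):
    """Normalized P(S | first len(fs) features observed)."""
    num, den = p_s, 1 - p_s
    for i, f in enumerate(fs):
        num *= p_f_given_s[i] if f else 1 - p_f_given_s[i]
        den *= p_f_given_not_s[i] if f else 1 - p_f_given_not_s[i]
    return num / (num + den)

joint = posterior([1, 1])  # correct P(S|F1=1, F2=1)
p1 = posterior([1])        # P(S|F1=1); by symmetry P(S|F2=1) is the same
print(round(joint, 3), round(p1 * p1, 3))  # 0.953 vs 0.669 -> no such rule
```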