What are the two approaches to creating spam classifiers?
create features by hand -> then use them in, e.g., Naive Bayes
learn features from the data, e.g. TF-IDF
How does the probability chain rule work?
Probability that A1, …, An all occur at the same time
-> product of the probabilities of each Ai conditioned on the other As having already occurred
-> simply expand the left side step by step
P(A,B,C) = P(A|B,C) * P(B|C) * P(C)
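Written out for n events, the same expansion gives the general form of the chain rule:
\[ P(A_1, \dots, A_n) = \prod_{i=1}^{n} P(A_i \mid A_{i+1}, \dots, A_n) = P(A_1 \mid A_2, \dots, A_n) \cdot P(A_2 \mid A_3, \dots, A_n) \cdots P(A_n) \]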
What does Bayes' theorem state?
P(A,B) = P(B,A)
-> expanding both sides with the chain rule:
P(A|B) * P(B) = P(B|A) * P(A)
-> solving for P(A|B) gives P(A|B) = P(B|A) * P(A) / P(B)
What are the different elements of Bayes' theorem called?
P(F) -> Evidence
P(F|S) -> Likelihood
P(S) -> Prior
P(S|F) -> Posterior
(here S = spam, F = the observed features)
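Putting the named pieces together for the spam setting:
\[ \underbrace{P(S \mid F)}_{\text{Posterior}} = \frac{\overbrace{P(F \mid S)}^{\text{Likelihood}} \cdot \overbrace{P(S)}^{\text{Prior}}}{\underbrace{P(F)}_{\text{Evidence}}} \]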
How much space would we need to store P(F1, F2, …, Fn | S)?
table form with all possible combinations…
-> O(2^(n+1))
-> 2^n for all combinations of the n binary features
-> times 2 for the two cases S and not S…
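As a rough illustration of the growth (assuming binary features): with n = 30 features the full table already needs 2^31 ≈ 2.1 billion entries, and every additional feature doubles that.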
What is the problem with storing all combinations? How is it solved?
infeasible for large n…
=> use the naive assumption that all Fi are conditionally independent of each other given S
How does the naive assumption work?
When, in P(F1, F2, …, Fn | S), all Fi are conditionally independent of each other given S
-> the term simply resolves to a product
Prod over all i of P(Fi | S)…
=> requires storing only 2n values (one per feature Fi, for each of S and not S; see the sketch below)
-> as P(not Fi | S) can simply be calculated as 1 - P(Fi | S)…
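A minimal sketch of how the 2n stored likelihoods are combined with the prior; the feature probabilities and variable names below are made-up illustrations, not values from the card:

```python
# Minimal Naive Bayes scoring sketch (binary features, illustrative numbers).
# Stored parameters: the prior P(S) and, per feature i, P(Fi=1 | S) and P(Fi=1 | not S).
prior_spam = 0.3
p_feat_given_spam = [0.8, 0.1, 0.6]   # P(Fi=1 | S)
p_feat_given_ham = [0.2, 0.3, 0.4]    # P(Fi=1 | not S)

def joint_score(features, prior, p_feat_given_class):
    """Unnormalized posterior: P(class) * prod_i P(Fi | class)."""
    score = prior
    for f, p in zip(features, p_feat_given_class):
        # P(Fi=0 | class) = 1 - P(Fi=1 | class), which is why only 2n values are stored
        score *= p if f else (1 - p)
    return score

x = [1, 0, 1]  # observed feature vector of an incoming mail
spam_score = joint_score(x, prior_spam, p_feat_given_spam)
ham_score = joint_score(x, 1 - prior_spam, p_feat_given_ham)
print("spam" if spam_score > ham_score else "not spam")
```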
Why don't we simply make the same assumption for the denominator of the Bayes term?
-> would need to introduce another independence assumption
-> would need to compute and store another n values
-> instead, simply use it as a normalization constant…
How do we handle the denominator as a normalization constant?
simply compare P(S|…) and P(not S|…)
-> as we only want to know which one is larger
-> thus, the denominator cancels out… (worked out below)
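Written out, the shared evidence term divides out when comparing the two posteriors:
\[ \frac{P(S \mid F_1, \dots, F_n)}{P(\neg S \mid F_1, \dots, F_n)} = \frac{P(F_1, \dots, F_n \mid S) \, P(S) \, / \, P(F_1, \dots, F_n)}{P(F_1, \dots, F_n \mid \neg S) \, P(\neg S) \, / \, P(F_1, \dots, F_n)} = \frac{P(F_1, \dots, F_n \mid S) \, P(S)}{P(F_1, \dots, F_n \mid \neg S) \, P(\neg S)} \]
If this ratio is greater than 1, the mail is classified as spam.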
Why is Naive Bayes useful compared to simply throwing a NN at the problem?
new training data can simply be incorporated
adding new features is easy
=> Naive Bayes has unique properties w.r.t. online learning… (update sketch below)
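A sketch of why new training data is cheap to incorporate when the likelihoods come from counts; the count-based estimator shown here is an assumption for illustration, the card does not specify how the probabilities are estimated:

```python
from collections import defaultdict

# Keep raw counts; probabilities are derived on demand, so one newly labelled
# mail only increments a handful of counters (online learning).
class_counts = defaultdict(int)                          # N(spam), N(ham)
feature_counts = defaultdict(lambda: defaultdict(int))   # N(Fi=1 and class)

def update(features, label):
    """Incorporate one newly labelled mail (features: list of 0/1, label: 'spam' or 'ham')."""
    class_counts[label] += 1
    for i, f in enumerate(features):
        if f:
            feature_counts[label][i] += 1

def p_feat_given_class(i, label):
    """Current estimate of P(Fi=1 | label) from the counts."""
    return feature_counts[label][i] / class_counts[label]

update([1, 0, 1], "spam")
update([0, 0, 1], "ham")
update([1, 1, 0], "spam")
print(p_feat_given_class(0, "spam"))  # 2 of 2 spam mails had F1=1 -> 1.0
```

Adding a new feature later just means starting new counters for it, which is why that is easy as well.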
Why don't we calculate P(S|F1,…,Fn) directly?
there is no rule to factorize it directly (the naive independence assumption only applies to P(F1,…,Fn | S)), so Bayes' theorem is used to rewrite it in terms we can factorize…