Explain Occam's Razor
“Other things being equal, simpler explanations are generally better than more complex ones”
“Never solve a problem in a more complicated way than necessary, because the simplest correct explanation is the best one”
Definition of the Vapnik-Chervonenkis (VC) dimension
The VC dimension h of a hypothesis class H is the maximum number of data points (from a set S) that can be arbitrarily separated (shattered) by hypotheses in H
h is a measure of the capacity of the learning system: the larger h, the better the system can learn to solve a problem
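As a small illustration (pure Python; the function and hypothesis names are made up for this sketch), shattering can be checked by enumerating all labelings of a point set. Threshold classifiers on the real line can shatter one point but not two, so their VC dimension is 1:

```python
from itertools import product

def can_shatter(points, hypotheses):
    """True iff the hypothesis class realizes every possible labeling of `points`."""
    for labels in product([0, 1], repeat=len(points)):
        if not any(all(h(x) == y for x, y in zip(points, labels)) for h in hypotheses):
            return False
    return True

# Threshold classifiers on the real line: h_t(x) = 1 if x >= t else 0.
thresholds = [lambda x, t=t: int(x >= t) for t in [-10, 0.5, 1.5, 10]]

# One point can be shattered: both labelings are realizable.
print(can_shatter([1.0], thresholds))       # True
# Two points cannot: the labeling (1, 0) with x1 < x2 is impossible.
print(can_shatter([1.0, 2.0], thresholds))  # False
```

Hence h = 1 for this class: the capacity of threshold classifiers is very limited.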
Name 3 criteria on which learning success depends.
Capacity of learning system
Optimization method
Learning instances in dataset
Draw a diagram to explain overfitting.
What are causes of overfitting, and what are possible solutions?
Causes
Model capacity is too large
Model is trained for too many iterations
Solutions
Increase number and types of instances in dataset
Steer learning with the validation error, e.g. early stopping
Decrease model capacity
Choice of optimal hypothesis
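The validation-error solution above can be sketched as follows (a minimal illustration; the error curve and the patience parameter are invented for the example):

```python
def early_stopping(val_errors, patience=2):
    """Return the epoch to stop at: stop once the validation error has not
    improved for `patience` consecutive epochs (overfitting setting in)."""
    best_err, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch, best_err

# Synthetic validation curve: improves, then rises again (overfitting).
errors = [0.9, 0.6, 0.4, 0.35, 0.4, 0.5, 0.7]
print(early_stopping(errors))  # (3, 0.35)
```

Training would then be stopped (or the model checkpoint restored) at the epoch with the best validation error.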
What goals do induction and deduction pursue?
Induction: truth-expanding (generation of new hypotheses)
Deduction: truth-preserving (derivation of new rules)
What are completeness and consistency?
What is the inductive bias, and which two examples are there?
Set of assumptions or prior knowledge that a learning system incorporates to generalize from data
Examples
Hypotheses that maximize the distance (margin) between input instances of different classes are preferred
If a simple and a complex hypothesis both minimize the loss function, the simpler one is preferred
Classification of ML learning methods
How can overfitting be avoided in decision trees?
Maximum depth
Minimum samples per node
Early stopping
Pruning
Multiple trees like random forests
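A minimal sketch of how maximum depth and minimum samples per node act as stopping criteria during tree growth (toy code on 1-D data; the median split stands in for a real information-gain split):

```python
def grow_tree(data, depth=0, max_depth=2, min_samples=2):
    """Grow a tiny decision tree on (x, label) pairs, stopping early when
    max_depth is reached or too few samples remain (pre-pruning)."""
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    # Stopping criteria that limit capacity and thus prevent overfitting:
    if depth >= max_depth or len(data) < min_samples or len(set(labels)) == 1:
        return {"leaf": majority}
    # Toy split at the median x (a real tree maximizes information gain).
    xs = sorted(x for x, _ in data)
    t = xs[len(xs) // 2]
    left = [(x, y) for x, y in data if x < t]
    right = [(x, y) for x, y in data if x >= t]
    if not left or not right:
        return {"leaf": majority}
    return {"split": t,
            "left": grow_tree(left, depth + 1, max_depth, min_samples),
            "right": grow_tree(right, depth + 1, max_depth, min_samples)}

data = [(0, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "B")]
tree = grow_tree(data, max_depth=1)
print(tree)  # {'split': 2, 'left': {'leaf': 'A'}, 'right': {'leaf': 'B'}}
```

With `max_depth=1` the tree stops after a single split; post-pruning would instead grow the full tree first and then cut back unreliable branches.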
What problem arises with bagging, and why are random forests preferred instead?
Problem of bagging: Models are highly correlated
Previously: bagging with decision trees, where splits can use all 𝑑 attributes of the drawn samples
Now: choose a random subset 𝑠 < 𝑑 of attributes for each split, and create nodes using the attribute in 𝑠 that maximizes information gain
Why: to ensure that the 𝑘 learned trees are less correlated, since each tree uses a different subset of attributes
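The attribute-subsetting step can be sketched as follows (illustrative names; a real implementation would then pick the best attribute within the subset by information gain):

```python
import random

def candidate_attributes(d, s, rng):
    """Random-forest twist on bagging: at each split, consider only a random
    subset of s < d attributes, so the k trees become less correlated."""
    return rng.sample(range(d), s)

rng = random.Random(0)
# Two different splits of the same tree see different attribute subsets:
print(candidate_attributes(10, 3, rng))
print(candidate_attributes(10, 3, rng))
```

Each tree (and each split) then works with its own attribute subset, which decorrelates the ensemble members compared to plain bagging.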