AI-Ethics principles: any ethical AI should be…
Key Ethical Issues are…
What are hidden Markov models for?
What distribution issues often lead to over-confident wrong predictions of AI models?
Out-of-Distribution (OoD): Data points that significantly deviate from the training distribution.
Interpolation: Predicting within the observed data range (although the specific input has not been seen before).
Extrapolation: Predicting beyond the observed data range.
Covariate Shift: When the distribution of the input features (covariates) changes between training and test data.
Unbalanced Data: Unequal distribution of classes in a dataset.
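A minimal sketch of why extrapolation is riskier than interpolation. The quadratic target, the straight-line model, and the two query points are assumptions chosen for illustration only: the model is fitted on inputs between 0 and 1 and then queried once inside and once far outside that range.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def true_function(x):
    # Assumed ground truth; the model never sees it outside [0, 1].
    return x ** 2


# Training data covers only the range [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [true_function(x) for x in train_x]
a, b = fit_line(train_x, train_y)


def abs_error(x):
    """Absolute error of the fitted line at input x."""
    return abs((a * x + b) - true_function(x))


interpolation_error = abs_error(0.55)  # unseen input, inside the training range
extrapolation_error = abs_error(3.0)   # unseen input, far outside the range

print("interpolation error:", interpolation_error)
print("extrapolation error:", extrapolation_error)
```

The model returns a prediction for both inputs without any warning, which is how out-of-distribution inputs can produce confident but wrong answers.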
Why data preprocessing?
• Raw data often contains unnecessarily high entropy/dimensionality (!), noise, missing values, and inconsistencies.
• Data preprocessing ensures data quality and prepares the data for efficient modeling.
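As a rough illustration of two typical cleaning operations, the sketch below fills missing values with the column mean and then standardizes the column. The function name `impute_and_standardize` and the sample values are hypothetical; real pipelines usually rely on a library and pick the imputation strategy per feature.

```python
import statistics


def impute_and_standardize(column):
    """Fill missing entries (None) with the mean, then scale to zero mean and unit variance."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    filled = [mean if v is None else v for v in column]
    std = statistics.pstdev(filled)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in filled]
    return [(v - mean) / std for v in filled]


raw = [2.0, None, 4.0, 6.0, None]  # hypothetical feature with missing values
clean = impute_and_standardize(raw)
print(clean)
```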
Name common data preprocessing steps
Compare the term “Model” in a traditional engineering context and in an AI/Data Science context.
What is the difference between Eager Learners and Lazy Learners?
Name positive and negative aspects of Eager Learners and Lazy Learners
Difference between supervised and unsupervised learning
supervised learning:
training data labels available
unsupervised learning:
training data labels are not available
algorithm infers patterns on its own
Overview of machine learning tools/models based on the availability of labeled data and the task
Name methods for supervised learning and explain them
(Bayesian) Linear Regression
Decision trees
k-nearest neighbors (look at the k nearest neighbors and pick the majority value)
support-vector machines (SVM) (draw hyperplane between groups)
Artificial Neural Networks (ANN) (collection of connected nodes)
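To make one of the listed methods concrete, here is a minimal k-nearest-neighbors classifier. The toy points, the labels, and the choice of k = 3 are assumptions for illustration, not values from these notes.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D training set with two well-separated classes.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.1, 1.0))
print(prediction)
```

Note that no training step exists here: all the work happens at prediction time, which is why k-nearest neighbors is usually given as an example of a lazy learner.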
Name methods for unsupervised learning and explain them
Clustering (e.g. k-means)
and:
What is reinforcement learning?
Why is reinforcement learning fundamentally different from
supervised or unsupervised learning?
Why different? It is active: the agent generates its own data by interacting with the environment, so you can train for 1 hour or 10^6 hours.
Name steps of a supervised ML Pipeline
and 3. is looped
Explain information leakage and
name solutions to the problem
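One common form of leakage is computing preprocessing statistics on the full dataset before splitting. The sketch below, with made-up values and a hypothetical `scale` helper, shows the usual remedy of splitting first and deriving the scaling parameters from the training part only.

```python
import statistics

values = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]  # hypothetical feature values
train, test = values[:4], values[4:]          # split BEFORE any statistics are computed

# Scaling parameters come from the training data only.
train_mean = statistics.fmean(train)
train_std = statistics.pstdev(train)


def scale(v):
    """Standardize a value using training-set statistics."""
    return (v - train_mean) / train_std


train_scaled = [scale(v) for v in train]
test_scaled = [scale(v) for v in test]  # test data is transformed, never used for fitting

# For comparison: a mean over all values would already contain test information.
leaky_mean = statistics.fmean(values)
print(train_mean, leaky_mean)
```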
Explain overfitting
What are Hyperparameters?
Settings that control the behavior of machine learning models.
e.g.:
• Learning Rate: Controls step size during gradient descent.
• Number of Hidden Layers: Affects model complexity.
• Regularization Strength: Balances bias-variance tradeoff.
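The effect of the learning rate can be seen in a small gradient descent run. The objective f(w) = (w - 3)^2 and the two learning rates are assumptions picked so that one run converges and the other diverges.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)  # derivative of (w - 3)^2
        w -= learning_rate * gradient
    return w


small_step = gradient_descent(0.1)  # steps shrink the distance to the minimum
large_step = gradient_descent(1.1)  # steps overshoot further each iteration

print("learning rate 0.1:", small_step)
print("learning rate 1.1:", large_step)
```

With the small learning rate the result ends up close to the minimum at w = 3, while the large one moves further away with every step.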
Metrics for Model Evaluation are:
What can you do if little data is available?
Data Augmentation, X-Validation, etc.
What is Data Augmentation?
Data Augmentation: artificially increases the dataset by
creating modified copies of existing data
improves robustness
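A minimal sketch of augmentation for image-like data, assuming images are stored as nested lists of pixel intensities in [0, 1]. The helper names, the noise scale, and the sample image are hypothetical.

```python
import random


def horizontal_flip(image):
    """Mirror each row of the image."""
    return [list(reversed(row)) for row in image]


def add_noise(image, scale=0.05, seed=0):
    """Add small uniform noise to every pixel and clip the result to [0, 1]."""
    rng = random.Random(seed)
    return [
        [min(1.0, max(0.0, p + rng.uniform(-scale, scale))) for p in row]
        for row in image
    ]


def augment(dataset):
    """Keep every original sample and add a flipped and a noisy copy with the same label."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((horizontal_flip(image), label))
        augmented.append((add_noise(image), label))
    return augmented


image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]  # hypothetical 2x3 grayscale image
dataset = [(image, "cat")]
augmented = augment(dataset)
print(len(augmented))
```

The labels are copied unchanged, so the transformations must be ones that do not alter the class of the sample.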
What is X-Validation / Cross-Validation?
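A sketch of the index bookkeeping behind k-fold cross-validation, one common variant: the data is split into k folds, and each fold serves once as the validation set while the remaining folds are used for training. The function name `k_fold_splits` and the sizes are assumptions for illustration.

```python
def k_fold_splits(n_samples, k):
    """Return k (training_indices, validation_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        splits.append((training, validation))
        start += size
    return splits


splits = k_fold_splits(10, 3)
for training, validation in splits:
    print("train:", training, "validate:", validation)
```

The model would be trained and evaluated once per split, and the k scores averaged, so every sample is used for validation exactly once.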
Name some structures of ANNs (artificial neural networks)
Autoencoder (for dimension reduction)
Generative Adversarial Networks (GANs) (two-player game: the generator network produces data candidates while the discriminator network evaluates them (real vs. synthetic); used for generative AI for images)
Transformer (mimics cognitive attention: enhances the
important parts of the input data and fades out
the rest; e.g. GPT)