Foundations II

Buffl

Automated Driving

by Vincent Q.

Artificial Intelligence

Definition

Agent
Environment
Sensing
Acting

What is machine learning?

A computer program is said to learn from experience with respect to some class of tasks and a performance measure if its performance at the tasks, as measured by the performance measure, improves with experience.

Machine learning problems can be subdivided into three different tasks:

Clustering: Separate data points into some groups
Regression: Fit functions to data points
Classification: Separate data points into predefined groups

What general approaches are there to make an algorithm learn from data?

Supervised Learning
Unsupervised Learning
Reinforcement Learning

Supervised Learning

Approach
Challenges

Approach: Use examples of input data and assigned output data (labels) to find an approximation of their relationship
Challenges: Where do we get the labels from?
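
The supervised approach can be sketched with the simplest possible model, a straight line fitted by least squares. This is only an illustration: the toy data and the names `fit_line`, `xs` and `ys` are assumptions, not part of the course material.

```python
# Hypothetical toy data: inputs xs with labels ys (generated by y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Closed-form least-squares fit of y ~ w * x + b from labeled examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


w, b = fit_line(xs, ys)
prediction = w * 5.0 + b  # apply the learned relationship to an unseen input
```

The labels `ys` are exactly what the challenge refers to: without them the fit cannot be computed.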

Unsupervised Learning

Approach
Challenges

Approach: Find patterns in data without knowing target values
Challenges: What patterns are we looking for? How do we know which patterns are useful?
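
One common pattern to look for is groups of similar points. A minimal sketch of k-means clustering in one dimension follows; the data, the initial centers and the function name `kmeans_1d` are assumptions chosen for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and moving centers."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = kmeans_1d(points, centers=[0.0, 6.0])
```

No target values are used anywhere; the number of groups and the distance measure encode which patterns we decided to look for.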

Reinforcement Learning

Approach
Challenges

Approach: Let the system try different approaches and reward desired states of the system
Challenges: What should the system try? What behavior led to the success? What is success? (Exploration vs. Exploitation and credit assignment problem)
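
The exploration vs. exploitation trade-off can be sketched with epsilon-greedy action selection on a tiny bandit problem. Everything here is an assumption made for illustration: the function name `run_bandit`, the deterministic rewards and the value of `epsilon`.

```python
import random


def run_bandit(rewards, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection with a fixed reward per action."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)  # estimated reward per action
    counts = [0] * len(rewards)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploitation: pick the action that currently looks best.
            action = max(range(len(rewards)), key=lambda a: values[a])
        reward = rewards[action]  # hypothetical environment, deterministic reward
        counts[action] += 1
        # Running mean of the rewards observed for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


values, counts = run_bandit(rewards=[0.2, 1.0])
best_action = max(range(len(values)), key=lambda a: values[a])
```

With `epsilon=0` the agent would stay with the first action forever, because it never learns that the second one pays more.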

What are hyperparameters?

Hyperparameters are external configuration variables that data scientists use to control the training of machine learning models.

e.g. the number of nodes and layers in a neural network

We want to know whether the model fits the data well …

Verification measures the model's performance on the data used for selecting the model's parameters

The user wants to apply a model to data that was not used when finding the model's parameters …

Validation measures the model's performance on data not used for selecting the model's parameters

The user wants an unbiased estimate of the model's performance

Testing allows computing an unbiased estimate of the model's performance on data not used in the learning problem
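
In practice the three measurements are usually taken on three disjoint parts of one data set. A minimal sketch, assuming a 60/20/20 split and a hypothetical helper `split_dataset`:

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle once, then cut into disjoint training, validation and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]               # used to select the parameters
    val = shuffled[n_train:n_train + n_val]  # used to compare models
    test = shuffled[n_train + n_val:]        # touched only for the final estimate
    return train, val, test


train, val, test = split_dataset(range(100))
```

Verification would be computed on `train`, validation on `val`, and testing on `test`.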

Graph including

Out of sample error
In sample error
Bias
Variance

6 Step Supervised Learning Process

Explain

epoch
batch

Using the complete data set once to update the weights -> epoch
Part of the data set used to calculate one adjustment of 𝑾 -> batch
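
The relation between the two terms can be sketched as two nested loops. The helper `iterate_batches` and the sizes are assumptions for illustration; the actual weight update is left out.

```python
def iterate_batches(dataset, batch_size):
    """Yield consecutive slices of the data set; the last one may be smaller."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


dataset = list(range(10))  # hypothetical data set with 10 samples
num_epochs = 3
updates = 0
for epoch in range(num_epochs):  # one epoch = one pass over all data
    for batch in iterate_batches(dataset, batch_size=4):
        updates += 1  # one adjustment of the weights per batch would happen here
```

With 10 samples and a batch size of 4, each epoch consists of three batches (4, 4 and 2 samples), so three epochs give nine weight adjustments.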

Method to minimize the loss function

(Stochastic) gradient descent

The step size along the negative gradient is called the learning rate. The choice of the learning rate is crucial because it affects whether we find a (local) minimum and at what speed.
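
A minimal sketch of plain gradient descent on a one-dimensional loss follows. The loss f(w) = (w - 3)^2, the starting point and the learning rate are assumptions chosen for illustration; in the stochastic variant the gradient would be estimated on one batch instead of the full data set.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Repeatedly step along the negative gradient, scaled by the learning rate."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


# Hypothetical loss f(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0, learning_rate=0.1, steps=100)
```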

Effect of learning rate on convergence

η < η_opt
η = η_opt
η > η_opt
η > 2η_opt
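
The four regimes listed above can be reproduced on a quadratic loss, where the optimal learning rate is known to be the inverse of the curvature. This is a sketch under that assumption; the function `descend` and the chosen rates are illustrative only.

```python
def descend(lr, curvature=2.0, w0=1.0, steps=5):
    """Gradient descent on f(w) = 0.5 * curvature * w^2; returns all iterates."""
    trajectory = [w0]
    w = w0
    for _ in range(steps):
        w = w - lr * curvature * w  # the gradient of f is curvature * w
        trajectory.append(w)
    return trajectory


lr_opt = 1.0 / 2.0  # for this quadratic: 1 / curvature

slow = descend(0.5 * lr_opt)         # below the optimum: slow, monotone convergence
one_step = descend(lr_opt)           # at the optimum: minimum reached in one step
oscillating = descend(1.5 * lr_opt)  # above the optimum: overshoots but converges
diverging = descend(2.5 * lr_opt)    # above twice the optimum: diverges
```

For general loss functions these thresholds only hold approximately, near a minimum where the loss looks quadratic.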

Which optimization methods based on (stochastic) gradient descent try to mitigate potential problems such as "skipping" minima and slow training in flat areas?

Momentum, Nesterov Accelerated Gradient:
Keep moving in the recent direction -> less dependent on noise in the batch
Adagrad, Adadelta, RMSprop, Adam:
Adapt the step size per parameter using statistics of past gradients (can be read as a rough approximation of the Hessian) -> small steps in steep areas, large steps in flat areas
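
Both ideas can be sketched in a few lines. The code below is a simplified illustration, not the exact update rule of any library: `sgd_momentum` implements classical momentum on a hypothetical loss f(w) = (w - 3)^2, and `adagrad_rates` shows how an Adagrad-style accumulator of squared gradients turns into a per-parameter step size.

```python
import math


def grad(w):
    """Gradient of the hypothetical loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgd_momentum(w, lr=0.05, beta=0.9, steps=300):
    """Momentum: the velocity is a decaying sum of past gradients."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(w)
        w = w - lr * velocity
    return w


def adagrad_rates(gradient_history, lr=0.1, eps=1e-8):
    """Adagrad-style effective step size per parameter after the given gradients."""
    n_params = len(gradient_history[0])
    accum = [0.0] * n_params
    for g in gradient_history:
        for i in range(n_params):
            accum[i] += g[i] ** 2  # accumulate squared gradients
    return [lr / (math.sqrt(a) + eps) for a in accum]


w_final = sgd_momentum(w=0.0)
# Parameter 0 lies in a steep direction, parameter 1 in a flat one.
rates = adagrad_rates([[10.0, 0.1], [10.0, 0.1]])
```

The steep parameter ends up with a much smaller effective step size than the flat one, which matches the behaviour described above.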

Explain underfitting and overfitting

Challenge: We often don't know whether the function we are trying to approximate is linear, quadratic, ...

-> Choosing a model that is too simple can lead to underfitting

-> Choosing a model that is too complex can lead to overfitting

The learned function g should capture features that are inherent to all data, not just to the training data set
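
The effect can be made visible by comparing three models of increasing flexibility on the same data. This is a toy sketch: the data (a line y = 2x with alternating noise on the training labels) is made up, and a nearest-neighbour "memorizer" stands in for an overly complex model.

```python
# Hypothetical data: training labels are y = 2x plus alternating noise of +1 / -1,
# test labels are noise-free.
train = [(0.0, 1.0), (1.0, 1.0), (2.0, 5.0), (3.0, 5.0), (4.0, 9.0), (5.0, 9.0)]
test = [(0.25, 0.5), (1.25, 2.5), (2.25, 4.5), (3.25, 6.5), (4.25, 8.5), (5.25, 10.5)]


def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def fit_constant(data):
    """Too simple: always predicts the mean label."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_line(data):
    """Matches the true structure: least-squares line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in data)
         / sum((x - mean_x) ** 2 for x, _ in data))
    b = mean_y - w * mean_x
    return lambda x: w * x + b


def fit_memorizer(data):
    """Too flexible: returns the label of the nearest training input."""
    return lambda x: min(data, key=lambda pair: abs(pair[0] - x))[1]


models = {
    "constant": fit_constant(train),
    "line": fit_line(train),
    "memorizer": fit_memorizer(train),
}
# Maps each model name to (training error, test error).
errors = {name: (mse(m, train), mse(m, test)) for name, m in models.items()}
```

The constant model has a high error on both sets (underfitting). The memorizer reaches zero training error but a clearly worse test error than the line, because it has also learned the noise (overfitting).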

Explain

Generalization error
Generalization gap
Capacity
Underfitting zone
Overfitting zone

How can we reduce the generalization gap without reducing the model capacity?

Explain

Regularization

Example for a complexity penalty function

L2 norm of a weight matrix 𝑾 (element of the real space ℝ^(m×n))
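
A minimal sketch of such a penalty, assuming the common choice of the squared entrywise (Frobenius) L2 norm scaled by a weighting factor. The names `l2_penalty`, `regularized_loss` and `lam` are assumptions, not notation from the course.

```python
def l2_penalty(weights, lam):
    """Squared entrywise L2 norm of a weight matrix, scaled by lam."""
    return lam * sum(w ** 2 for row in weights for w in row)


def regularized_loss(data_loss, weights, lam):
    """Total objective: fit to the data plus the complexity penalty."""
    return data_loss + l2_penalty(weights, lam)


W = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical 2x2 weight matrix
penalty = l2_penalty(W, lam=0.1)
total = regularized_loss(data_loss=0.5, weights=W, lam=0.1)
```

Large weights make the penalty grow, so minimizing the total objective pushes the model toward smaller weights without changing its architecture.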

Artificial Intelligence (AI)
Machine Learning (ML)
Representation Learning
Deep Learning

Key Concepts of Deep Learning

Artificial Neural Network (ANN): Computing system containing nodes and weighted connections that converts inputs to outputs
Convolutional Neural Network (CNN): Popular class of ANNs for processing images
Recurrent Neural Network (RNN): Popular class of ANNs for processing sequences of inputs (e.g. natural language processing)
Backpropagation: Method that efficiently computes the gradient of the loss function with respect to the parameters of an ANN.

Artificial Neural Network

Basic idea:
- Learned function 𝑔 is defined as a X consisting of X and X
- Nodes are structured in X
- At the beginning is an X, then a number of X and finally an X

network
vertices (nodes)
weighted edges (weighted graph)
layers
input layer
hidden layers
output layer

The basic building block of an ANN is an artificial neuron. It computes an activation 𝑎 based on some input 𝑋. The activation also depends on the learnable parameters (weights) and the choice of the activation function, which can also contain learnable parameters.
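
A single neuron can be sketched as a weighted sum followed by an activation function. The additive bias term and the function names are assumptions; the card itself only mentions weights and the activation function.

```python
def relu(z):
    """Rectified Linear Unit, used here as an example activation function."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias (assumed), passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


a = neuron(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=0.5, activation=relu)
a_linear = neuron(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=0.5, activation=lambda z: z)
```

Here the weighted sum is 0.5 - 2.0 + 0.5 = -1.0, so the ReLU neuron outputs 0.0 while the neuron without a nonlinearity outputs -1.0.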

Activation functions

Rectified Linear Unit (ReLU)
Parametric Rectified Linear Unit (PReLU)
sigmoid function
step function
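
The four functions can be written down in a few lines. This is a sketch using common textbook definitions; the value of the step function at exactly zero is a convention chosen here, and `alpha` is the learnable slope of PReLU for negative inputs.

```python
import math


def relu(z):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def prelu(z, alpha):
    """Parametric ReLU: slope alpha (a learnable parameter) for negative inputs."""
    return z if z > 0 else alpha * z


def sigmoid(z):
    """Sigmoid function: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def step(z):
    """Step function: 1 for non-negative inputs (convention chosen here), else 0."""
    return 1.0 if z >= 0 else 0.0
```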

What's the problem with Fully Connected Networks?

Not suited for problems with many input dimensions (e.g. computer vision)

Too many weights
- An image size of 256x256x3 means 196,608 connections to each neuron in the first hidden layer
- -> Model does not fit into memory
- -> Slow training
The model takes too long to compute its output (slow inference) -> problematic for automotive applications where strict requirements concerning time and cost apply
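
The arithmetic behind the weight count can be checked directly. The size of the hidden layer (1000 neurons) and the storage format (4 bytes per weight) are assumptions added for illustration.

```python
height, width, channels = 256, 256, 3
inputs_per_neuron = height * width * channels  # connections into one hidden neuron

hidden_neurons = 1000  # hypothetical size of the first hidden layer
total_weights = inputs_per_neuron * hidden_neurons

bytes_per_weight = 4  # assuming 32-bit floating point weights
first_layer_bytes = total_weights * bytes_per_weight
```

Under these assumptions the first layer alone holds about 197 million weights, or roughly 786 MB.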

Three characteristics of CNNs lead to their advantage over FCNs:

Recurrent Neural Networks

Backpropagation

The error at the output of the neural network is propagated backwards through the network to compute the gradients of the loss function w.r.t. each weight in the network
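
For a network with one hidden neuron, the backward pass can be written out by hand and checked against a numerical gradient. The network shape (tanh hidden unit, linear output), the squared-error loss and all values are assumptions chosen to keep the sketch small.

```python
import math


def forward(w1, w2, x):
    """Forward pass: input -> tanh hidden unit -> linear output."""
    h = math.tanh(w1 * x)
    y_hat = w2 * h
    return h, y_hat


def loss_fn(w1, w2, x, y):
    """Squared-error loss 0.5 * (y_hat - y)^2."""
    _, y_hat = forward(w1, w2, x)
    return 0.5 * (y_hat - y) ** 2


def backprop(w1, w2, x, y):
    """Propagate the output error backwards using the chain rule."""
    h, y_hat = forward(w1, w2, x)
    d_y_hat = y_hat - y          # error at the output
    d_w2 = d_y_hat * h           # gradient w.r.t. the output weight
    d_h = d_y_hat * w2           # error propagated back to the hidden unit
    d_z = d_h * (1.0 - h ** 2)   # through the tanh nonlinearity
    d_w1 = d_z * x               # gradient w.r.t. the hidden weight
    return d_w1, d_w2


w1, w2, x, y = 0.5, -1.5, 2.0, 1.0
d_w1, d_w2 = backprop(w1, w2, x, y)

# Numerical check with central differences.
eps = 1e-6
num_d_w1 = (loss_fn(w1 + eps, w2, x, y) - loss_fn(w1 - eps, w2, x, y)) / (2 * eps)
num_d_w2 = (loss_fn(w1, w2 + eps, x, y) - loss_fn(w1, w2 - eps, x, y)) / (2 * eps)
```

Each intermediate error term is computed once and reused for the layer before it, which is what makes the method efficient for deep networks.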

Challenges of Deep Learning and real world data

Data quantity
Data quality
Data acquisition and cost

Information

Last modified
2 years ago
