What is the ML development workflow?
problem definition
data preparation
model
hardware
training
tuning
inference
What are the questions in problem definition?
What kind of problem?
What are inputs and desired outputs?
What are the questions in data preparation?
How much data is needed?
How to get that data?
What are the questions in Model?
Neural Network?
Which layers?
What are the questions in Hardware?
Which hardware for training?
Which hardware for inference?
What are the questions in Training?
Supervised?
Which loss function?
Which optimizer?
What are the questions in Tuning?
iterative change of model, hyperparameters, etc.
What are the questions in Inference?
Deploy the model for your desired problem
What approaches are there to get data? What are problems?
public datasets
how well does this fit your problem definition (inputs, outputs, …)
can the data be processed to fit your problem? (e.g. 3d bounding boxes -> 2d bounding boxes)
create your own dataset
prototype needed?
data infrastructure (how to save data? how to handle large files?)
time consuming and expensive
How is the relation of model training and implementation to data preparation?
data preparation makes up ~80% of your project -> crucial!!!
but only sees around 1% of AI research…
What is the survivorship bias?
e.g. use statistics of where returning planes were most often hit by bullets to reinforce these areas
-> bias: only planes that survived these hits came back to be evaluated (-> should instead reinforce areas where no bullets hit…)
What is the sample bias?
unbalanced sampling from a population
e.g. tank detection
-> images with tanks only at day
-> images without tanks only at night…
What are ethical issues with bias?
e.g. datasets contain fewer images of people of color
-> e.g. man holding a thermometer
white: classified as thermometer
black: classified as gun
…
What are class imbalances? How to consider?
e.g. detect different vehicles
-> 2000 motor cycles
-> 10 cars
=> will not yield good results…
-> repeat samples with under-represented data
-> consider imbalance in loss function
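The first remedy above can be sketched in a few lines. A minimal stdlib-only illustration, using the card's hypothetical 2000-vs-10 example (`dataset` and its entries are made up):

```python
import random

# Hypothetical imbalanced dataset from the card's example:
# many motorcycles, very few cars.
dataset = [("motorcycle_img", "motorcycle")] * 2000 + [("car_img", "car")] * 10

# Count samples per class
counts = {}
for _, label in dataset:
    counts[label] = counts.get(label, 0) + 1

# Oversample: repeat samples of under-represented classes
# until every class reaches the size of the largest one
target = max(counts.values())
balanced = list(dataset)
for label, n in counts.items():
    minority = [s for s in dataset if s[1] == label]
    balanced += [random.choice(minority) for _ in range(target - n)]

print({l: sum(1 for _, x in balanced if x == l) for l in counts})
# both classes now have 2000 samples
```

The second remedy (considering the imbalance in the loss function) would instead weight each sample's loss by the inverse of its class frequency, leaving the data untouched.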
What can be problems with data annotation (e.g. output of bounding boxes)?
what to do when objects are overlapping?
-> inconsistency can be problematic…
=> different ground truths…?
How can one measure the consistency of labeling?
IoU (intersection over union) = area of overlap / area of union
=> inconsistent labels can be considered as noise…
-> smaller is worse
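The IoU formula from the card, as a short sketch for axis-aligned bounding boxes (the box coordinates are made-up examples):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # area of overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                # area of union
    return inter / union

# Two annotators labeling the same object slightly differently:
print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 -> perfectly consistent
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # ~0.33 -> less consistent
```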
What is noise in data? How to handle it?
for a given x value -> no consistent y values…
use more data to average out the noise…
or clean data to have consistent labels…
How can we increase data? (images)
Data augmentation!!!
flip images
rotate images
scale images outward or inward
crop images
translation of objects in x,y position (shifting); similar to crop…
add gaussian noise…
de facto changes all pixel values…
deep photo style transfer
fancy…
-> generate new image by new style (using generative NN…)
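A few of the simpler augmentations above, sketched on a toy nested-list "image" (no imaging library assumed; a real pipeline would operate on arrays/tensors):

```python
import random

# Toy 3x3 grayscale "image" as nested lists
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

def hflip(image):
    """Flip horizontally (mirror each row)."""
    return [row[::-1] for row in image]

def crop(image, top, left, h, w):
    """Cut out an h x w window."""
    return [row[left:left + w] for row in image[top:top + h]]

def add_gaussian_noise(image, sigma=0.1):
    """Add gaussian noise -> de facto changes every pixel value."""
    return [[px + random.gauss(0, sigma) for px in row] for row in image]

print(hflip(img))             # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
print(crop(img, 0, 0, 2, 2))  # [[1, 2], [4, 5]]
```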
How effective is data augmentation?
depending on the method, around 10 percent improvement
-> overall feeling: augmentation is useful…
What different types of biases exist?
sample bias
dataset does not actually represent the world
exclusion bias
systematic exclusion of information
measurement bias
data measurement for training differs from inference (e.g. a different camera used for training…)
recall bias
labeling similar types of data inconsistently
observer bias
effect of seeing what you expect to see or want to see in data
What NN models did we discuss?
Fully connected
convolutional neural networks
graph neural networks
recurrent neural networks
Can we create networks from different types?
yes…
What activation functions did we discuss?
step function
sigmoid
relu
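The three activation functions from this card as one-liners:

```python
import math

def step(x):
    """Step function: 0 below the threshold, 1 at or above it."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Sigmoid: squashes any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """ReLU: identity for positive inputs, zero otherwise."""
    return max(0.0, x)

print(step(-1.0), sigmoid(0.0), relu(-2.0), relu(3.0))  # 0.0 0.5 0.0 3.0
```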
List hardware that can be used for ML
CPU
solve general / wide range of tasks
GPU
designed to accelerate rendering of graphics
focus on parallelization
TPU (tensor processing unit)
Designed to accelerate deep learning tasks
TPUs from Google especially for Tensorflow
FPGA
ASIC
Difference CPU vs. GPU?
CPU: mainly serial processing of inputs
GPU: highly parallelized…
Common loss functions we discussed?
L1 (MAE)
L2 (MSE)
Binary Cross Entropy
-> BCE good for:
binary classification tasks…
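The three losses from this card, computed on small made-up prediction/target lists:

```python
import math

def l1(y_true, y_pred):
    """L1 loss = mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def l2(y_true, y_pred):
    """L2 loss = mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def bce(y_true, y_pred, eps=1e-12):
    """Binary cross entropy; eps avoids log(0)."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_pred)) / len(y_true)

print(l1([1.0, 2.0], [1.5, 2.5]))   # 0.5
print(l2([1.0, 2.0], [1.5, 2.5]))   # 0.25
print(bce([1.0, 0.0], [0.9, 0.1]))  # ~0.105
```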
What are aspects where different optimizers can differ?
computation time
local / global minima
convergence time
learning rate dependency
number of hyperparameters (e.g. momentum)
Why is weight initialization important?
necessary to have a starting point for optimization
two important dimensions
value range
distribution
best practice: start with default initialization from your ML library
In what regards is the value range important for weight initialization?
too small value lead to slow learning
too large values may lead to divergence
keep in mind vanishing and exploding gradients
In what regards is the distribution important for weight initialization?
constant initialization performs poorly -> need for randomness
xavier initialization with
mean of activations is 0
variance of activations stays the same across every layer
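A stdlib-only sketch of the uniform variant of Xavier (Glorot) initialization; the variance target `2/(fan_in + fan_out)` follows from the two properties above (zero mean, variance preserved across layers):

```python
import math
import random

def xavier_uniform(fan_in, fan_out):
    """Xavier/Glorot uniform init: draw from U(-limit, limit) so the
    weights have zero mean and variance ~ 2/(fan_in + fan_out)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[random.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

w = xavier_uniform(256, 128)
flat = [v for row in w for v in row]
mean = sum(flat) / len(flat)
var = sum((v - mean) ** 2 for v in flat) / len(flat)
print(mean, var)  # mean ~0, variance ~2/(256+128) ≈ 0.0052
```

In practice this is the kind of thing the "default initialization from your ML library" already does for you.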
What can be a variation in transfer learning?
keep weights in initial layers
=> in image classification: early layers have generic features (low level)
re-train later layers
train the actually important, complex features specific to our task….
What are methods to prevent overfitting?
increase data set / data set augmentation
reduce model size
early stopping
regularization
dropouts
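The last item, dropout, is simple enough to sketch directly (this is the common "inverted dropout" formulation; the rescaling by `1/keep` is an assumption on my part, matching how most libraries implement it):

```python
import random

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: during training, randomly zero each activation
    with probability `rate` and rescale survivors by 1/(1-rate) so the
    expected sum is unchanged; at inference, pass values through."""
    if not training:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if random.random() < keep else 0.0
            for a in activations]

random.seed(0)
out = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5)
print(out)  # some entries zeroed, survivors scaled to 2.0
```

Zeroing random activations forces the network not to rely on any single unit, which reduces overfitting.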
In what regards is randomness important in Neural Networks?
sequence of data
-> otherwise, finding the global minimum is difficult for the optimizer
initialization
optimizer (e.g. SGD)
random regularization (e.g. dropout)
calculation architecture may be non-deterministic
How can we try to keep determinism? Effects of randomness?
keep random seed
-> but not all calculations might be deterministic
=> train NN several times with same settings to account for impact of randomness
=> comparability of NN results often difficult…
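Keeping the random seed can be illustrated with a toy "training run" (the function is a stand-in, not a real training loop):

```python
import random

def train_run(seed):
    """Stand-in for one training run: fixing the seed makes the random
    parts (weight init, data shuffling, dropout, ...) reproducible."""
    random.seed(seed)
    weights = [random.gauss(0, 1) for _ in range(3)]   # "initialization"
    data_order = random.sample(range(5), 5)            # "data shuffling"
    return weights, data_order

print(train_run(42) == train_run(42))  # True: same seed -> identical run
print(train_run(42) == train_run(43))  # False: different seed -> different run
```

This only covers Python-level randomness; GPU kernels and parallel reductions can still be non-deterministic, which is exactly the caveat above.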
What are training hyperparameters?
learning rate
decay rate
batch size
number of epochs
dropout rate
What are hyperparameters of the model?
number of layers
number of parameters in layer
What is a problem with hyperparameter optimization?
lots of them
-> high dimensional optimization problem
What is a problem with relations between hyperparameters?
hyperparameters not independent of each other (-> change one influences effect of other)
=> for some, the relation is roughly explainable (e.g. learning rate and number of epochs)
=> for some, the relation is not explainable (e.g. dropout rate and batch size)
What is a problem w.r.t. hyperparameter search?
training may take very long (e.g. days)
-> exhaustive search of hyperparameter combinations not feasible…
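To see why exhaustive search explodes, and what a cheap alternative looks like: a small sketch of random search over a made-up search space (random search is not from this card, just a common baseline; `mock_validation_score` stands in for an expensive training run):

```python
import random

# Hypothetical search space; the real one depends on your model
space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64, 128],
    "dropout_rate": [0.1, 0.3, 0.5],
}

def mock_validation_score(cfg):
    """Stand-in for an expensive training run returning a validation score."""
    return -abs(cfg["learning_rate"] - 1e-3) - abs(cfg["dropout_rate"] - 0.3)

n_grid = 1
for values in space.values():
    n_grid *= len(values)
print("exhaustive grid needs", n_grid, "runs")  # 36 here; grows multiplicatively

# Random search: evaluate only a small budget of sampled combinations
random.seed(0)
best = max(
    ({k: random.choice(v) for k, v in space.items()} for _ in range(8)),
    key=mock_validation_score,
)
print("best sampled config:", best)
```

With days-long trainings, even 36 runs is painful; smarter methods like Bayesian optimization try to pick each next combination based on the results so far.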
What is a way to optimize hyperparameters?
bayesian optimization
gaussian-process based and gradient-free
not feasible for lots of hyperparameters (e.g. 4 okay, 30 not)
How can we try to overcome the black box?
feature visualization
-> plot / visualize the weights / kernels…
activation maps -> which inputs are important?
What are some post-training optimization approaches?
precision calibration
layer and tensor fusion
kernel auto-tuning
multi stream execution
dynamic tensor memory
What is an application to monitor our training and model?
TensorBoard