What is Machine Perception?
a computer acquiring the ability to interpret data related to its environment
=> sensors as input…
What can computer vision be understood as in relation to machine perception?
machine perception that focuses on the use of cameras as input…
What are advantages of cameras for machine perception?
cheapest sensor of a car
-> includes the most important information about the environment
images can be processed by classical CV algorithms and deep learning algorithms
What can be challenges (in pictures) to computer vision?
occlusion (not whole object is seen)
viewpoint variation
illumination
background clutter
deformation (e.g. a contortionist -> does not have the "natural / usual" shape)
inter-class similarities
intra-class variation
How to overcome challenges in computer vision (image wise)?
have lots and lots of labelled images…
What is the computer vision pipeline?
choose hardware (camera, lens, …)
acquire image/video stream
integrate image in SW pipeline
preprocess image for further evaluation
detect/extract features in image
localize object
classify object
What to keep in mind when acquiring images?
lighting
machine vision is not as capable as human vision
-> carefully illuminate the scene one wants to capture
=> problem for autonomous driving: light changes during the day, with weather, …
distance and area
camera allows one to choose the Field of View (FOV)
-> to take good photos, one has to know the FOV and working distance…
resolution
sensor converts light from the lens into an electrical signal
-> array of signals (pixels…)
frame grabber and software
-> frame grabber (normally) sends digitized image over bus system to computer
Pros and cons of Matlab for computer vision?
pros:
easy to use
good documentation
GPU boost
cons:
closed environment
performance
Pros and cons of OpenCV for computer vision?
pros:
free for everyone (not like Matlab, which is free only for students)
languages: C++, Python
everything you need
powerful
GPU boost
cons:
installation (especially with GPU support)
What is image processing?
method to perform operations on an image in order to
enhance the image
extract useful information
Steps of the image processing pipeline
Camera takes picture
camera calibration
digital transformation
color spaces
filtering
contrast enhancement
affine transformation
resampling / compression
save image to file
What is camera calibration? and why is it needed?
camera maps 3D world points to 2D
-> most cameras produce distortion
=> camera calibration:
find the parameters of the camera and lens that affect the imaging process
What type of parameters can cause distortion?
extrinsic camera parameters
position of camera center, camera heading
intrinsic camera parameters
focal length
image sensor format
lens distortion parameters
Difference between radial and tangential distortion?
radial:
curvature of the lens
-> edges of the image distorted
-> objects / lines appear more or less curved than they actually are
tangential:
lens not perfectly aligned with the image plane
-> image looks tilted, so that some objects appear farther away or closer than they actually are
How to perform calibration for (radial) distortion?
find and draw corners
-> find distortion coefficients and correction
correct
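A minimal OpenCV (Python) sketch of this workflow, assuming a 9x6 inner-corner chessboard and hypothetical file names:

```python
import glob
import cv2
import numpy as np

# Known 3D corner positions of a 9x6 chessboard: (0,0,0), (1,0,0), ...
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

objpoints, imgpoints = [], []  # 3D world points, 2D image points
for fname in glob.glob("calib_*.jpg"):  # hypothetical calibration images
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (9, 6))  # find corners
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# dist = (k1, k2, p1, p2, k3): k = radial, p = tangential coefficients
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# Correct a distorted image with the estimated parameters
undistorted = cv2.undistort(cv2.imread("calib_01.jpg"), mtx, dist)
```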
What is a picture in data representation?
matrix of values based on color (channels x dim_x in pixels x dim_y in pixels)
=> represents rectangular grid of evenly spaced pixels
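A small NumPy sketch of this representation (note that NumPy/OpenCV store images as height x width x channels):

```python
import numpy as np

# 480x640 RGB image: height x width x channels, 8-bit values 0..255
img = np.zeros((480, 640, 3), dtype=np.uint8)
img[100, 200] = (255, 0, 0)  # set one pixel to pure red
print(img.shape)             # (480, 640, 3)
```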
How to detect color?
filter for red, green and blue
-> at each pixel, measure the amount of light falling onto the sensor…
Why are there different color spaces? Name them
for different problems, as they have different attributes
RGB
HSV
CMYK
CIE
How many values does RGB have?
8 bits per channel
-> 0-255
-> 256^3 ≈ 16.7 million possible colors
How to understand RGB?
cube in 3D space with edge lengths 255, 255, 255
-> points in cube represent color
How are HSV colors indicated?
Hue (0° red, 120° green, 240° blue) -> in between lie the transitions between these colors…
saturation (0% gray, 100% saturated color)
value (0% dark, 100% light)
How to understand HSV?
cylinder
go around it for different colors (hue)
go up for lighter, down for darker (value)
go toward the center for gray (white when fully lit) and to the border for intense color (saturation)
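A small OpenCV sketch of switching color spaces and masking a color range in HSV (file name and the green hue range are assumptions; OpenCV stores hue as 0-179, i.e. degrees / 2):

```python
import cv2
import numpy as np

img_bgr = cv2.imread("scene.jpg")                  # hypothetical input image
img_hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)

lower = np.array([40, 50, 50])     # assumed lower bound for "green"
upper = np.array([80, 255, 255])   # assumed upper bound
mask = cv2.inRange(img_hsv, lower, upper)  # binary mask of green-ish pixels
```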
What can filtering be used for in image processing?
apply a matrix convolution to emphasize information
by transforming the image
=> convolution with kernel (filter/mask/conv matrix)
Name some filters
mean
gaussian blur
sharpen
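A sketch of these filters as convolutions in OpenCV (file name is an assumption):

```python
import cv2
import numpy as np

img = cv2.imread("scene.jpg")  # hypothetical input

# 3x3 mean filter: each output pixel is the average of its neighborhood
mean_kernel = np.ones((3, 3), np.float32) / 9.0
blurred = cv2.filter2D(img, -1, mean_kernel)

# Common sharpen kernel: boost the center, subtract the neighbors
sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], np.float32)
sharpened = cv2.filter2D(img, -1, sharpen_kernel)

gaussian = cv2.GaussianBlur(img, (5, 5), 0)  # sigma derived from kernel size
```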
What is contrast? And what is the idea to alter it?
range of difference in color and brightness
-> low: image values concentrated in a narrow range
enhancement -> change the image value distribution to cover a wider range
What is used to determine contrast?
histogram
-> cumulative distribution function should optimally be linear…
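A minimal sketch of contrast enhancement via histogram equalization in OpenCV (file name is an assumption):

```python
import cv2

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Spreads the intensity values so the cumulative distribution function
# becomes approximately linear -> values cover a wider range
equalized = cv2.equalizeHist(gray)
```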
What are affine transformations? What to keep in mind?
they preserve points, straight lines and planes…
=> parallel lines remain parallel after transformation
=> does not necessarily preserve angles or distances…
-> but preserves ratios of distances between points lying on a straight line
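A small OpenCV sketch of an affine transformation (a rotation; parameters are assumptions):

```python
import cv2

img = cv2.imread("scene.jpg")  # hypothetical input
h, w = img.shape[:2]

# An affine transform is a 2x3 matrix: rotation/scale/shear plus translation
M = cv2.getRotationMatrix2D(center=(w / 2, h / 2), angle=30, scale=1.0)
rotated = cv2.warpAffine(img, M, (w, h))  # parallel lines stay parallel
```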
What is resampling?
change resolution (number of pixels)
-> downsampling -> decrease resolution (e.g. mean/max/min filtering)
-> upsampling -> increase resolution (e.g. interpolation)
What are interpolated points in upsampling based on?
each cell in the new raster must be computed by
sampling or interpolation over some neighborhood of cells in the corresponding position in the original raster object
What are some upsampling techniques?
nearest neighbor
bilinear
bicubic
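A sketch of down- and upsampling with these techniques in OpenCV (file name is an assumption):

```python
import cv2

img = cv2.imread("scene.jpg")  # hypothetical input
h, w = img.shape[:2]

# Downsampling: INTER_AREA averages over source pixels (good for shrinking)
small = cv2.resize(img, (w // 2, h // 2), interpolation=cv2.INTER_AREA)

# Upsampling: interpolate new pixels from a neighborhood in the original
up_nn = cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)
up_bl = cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)  # bilinear
up_bc = cv2.resize(small, (w, h), interpolation=cv2.INTER_CUBIC)   # bicubic
```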
What is a feature ?
piece of information relevant for solving computational task related to certain application
What is feature detection?
includes methods for computing abstractions of image information
-> and making local decisions at every image point
=> resulting features are subsets of the image domain, like isolated points, continuous curves, connected regions
=> e.g. edges, corners, lines,…
What is feature extraction?
after detecting features
-> local image patch around feature can be extracted
=> e.g. isolation of shapes from a digitized image or video stream
What is feature extraction used for?
object detection
robot navigation
motion tracking
…
What is the goal of edge detection?
identify sudden changes (discontinuities) in an image
intuitively:
most semantic and shape info can be encoded in edges
more compact than pixels
What is the ideal of edge detection?
an artist's line drawing of an object
-> but artist has object-level knowledge …
What factors cause edges?
surface normal discontinuity (surface ends)
depth discontinuity (e.g. backside of ball not visible although continuous surface…)
surface color discontinuity
illumination discontinuity (e.g. shadow)
How can edges be detected?
-> rapid change in intensity function of image… (grayscale…)
=> first derivative -> extrema are edges
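A minimal sketch of this idea with Sobel kernels, which approximate the first derivative (file name is an assumption):

```python
import cv2

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Large gradient magnitude = rapid change of the intensity function = edge
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # derivative along x
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # derivative along y
magnitude = cv2.magnitude(gx, gy)
```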
What is probably the most widely used edge detector in computer vision?
canny edge detection
How does Canny edge detection work?
gaussian filter to smooth image
remove noise
larger kernel size decreases sensitivity to noise but increases localization error
compute intensity gradients for each pixel
non-maximum suppression -> thin multiple-pixel wide ridges down to single-pixel width -> in narrow intensity drops -> choose only maximum / minimum…
apply lower and upper threshold for edges (hysteresis)
intensities above the upper threshold are used to start an edge curve
if neighboring pixels are above the lower threshold, the edge is continued
if the intensity is below the lower threshold, it is discarded as noise
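A minimal OpenCV sketch of these steps (the 50/150 hysteresis thresholds are assumptions):

```python
import cv2

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input

blurred = cv2.GaussianBlur(gray, (5, 5), 0)  # step 1: suppress noise
# Gradients, non-maximum suppression and hysteresis run internally;
# 50/150 are the assumed lower/upper thresholds
edges = cv2.Canny(blurred, 50, 150)
```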
Advantage and remaining disadvantage of feature extraction
edge detection makes it possible to considerably reduce the amount of data in an image
-> but image still described by pixels…
=> if lines, ellipses, … could be defined by characteristic equations => amount of data reduced even more
How to reduce the amount of required data after edge detection even more?
hough transformation
How is the Hough transformation mathematically structured?
all lines in 2D are representable by y = ax + b
=> each line is represented by (a, b)
=> each line (a, b) corresponds to a single point p_i in Hough space
=> Hough space has the two axes a and b
What is a problem of representing single points in Hough space?
each image point is part of an infinite number of lines -> each point creates infinitely many points in Hough space
=> these points form a line in Hough space, determined by all (a, b) combinations of lines going through the point in image space
How to comply with the principle that a line in image space corresponds to a point in Hough space, considering that each point on the line L creates a separate line in Hough space?
take the intersection of all these lines in Hough space
=> it corresponds to the single line L in real space…
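A sketch of line detection via the Hough transform in OpenCV; note that OpenCV uses the polar form rho = x*cos(theta) + y*sin(theta) instead of (a, b), so vertical lines can also be represented (thresholds are assumptions):

```python
import cv2
import numpy as np

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input
edges = cv2.Canny(gray, 50, 150)

# Probabilistic Hough transform: returns line segments (x1, y1, x2, y2)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=5)
img = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)  # draw in red
```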
What intuition can be used for corner detection?
shifting a window in any direction should result in large changes…
=> flat region -> same intensity -> no changes when moving
=> edge -> no change when moving along the edge direction
=> corner -> significant changes in all directions
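A minimal sketch of this intuition with the Harris corner detector in OpenCV (file name and threshold are assumptions):

```python
import cv2
import numpy as np

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Harris response is large where intensity changes in all directions
response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
corners = response > 0.01 * response.max()  # boolean corner mask
```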
How does depth detection work?
have two images available
have the baseline B (distance between the cameras)
have the focal length f
calculate the difference of the same object's position w.r.t. the x-axis
Z = B * f / d
distance = baseline * focal length / disparity (x2 - x1)
=> map depth information to each pixel…
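A worked example of the formula under assumed values:

```python
# Assumed values: baseline B = 0.2 m, focal length f = 700 px, disparity d = 35 px
B, f, d = 0.2, 700.0, 35.0
Z = B * f / d   # depth in meters
print(Z)        # 4.0 -> the object is 4 m away
```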
What methods exist for image classification?
rule based
-> manual feature extraction and classification logic
machine learning
-> manual feature extraction
classifier learned automatically
deep learning
-> relevant features and classifier learned jointly
How to do video analysis?
video as sequence of frames over time
-> image data is function of space (x,y) and time t
=> optical flow is the pattern of apparent motion of objects' surfaces and edges
=> in visual scene caused by relative motion between observer and scene
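A minimal sketch of dense optical flow over a frame sequence with OpenCV's Farneback method (video file name is an assumption):

```python
import cv2

cap = cv2.VideoCapture("drive.mp4")  # hypothetical video file
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dense optical flow: one (dx, dy) motion vector per pixel between frames
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    prev = curr
cap.release()
```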
What is SLAM?
simultaneous localization and mapping
=> visual odometry -> process of determining the position and orientation of a robot by analyzing the associated camera images
Mean filter?
Gaussian Blur (3x3) filter?
Gaussian Blur (5x5) filter?
bell curve (normal distribution) on both axes…
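The common textbook kernels for these cards (a sketch; the course may use slightly different normalizations):

```python
import numpy as np

mean_3x3 = np.ones((3, 3)) / 9.0  # mean filter: uniform weights

gauss_3x3 = np.array([[1, 2, 1],          # Gaussian blur 3x3:
                      [2, 4, 2],          # binomial approximation of the
                      [1, 2, 1]]) / 16.0  # bell curve on both axes

gauss_5x5 = np.array([[1,  4,  6,  4, 1],
                      [4, 16, 24, 16, 4],
                      [6, 24, 36, 24, 6],
                      [4, 16, 24, 16, 4],
                      [1,  4,  6,  4, 1]]) / 256.0
```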
Image Intensity Function and derivative
Again, Canny: what is non-maximum suppression used for?
"Thin" the lines to create clear edges and not blurry ones…
Canny: what is the blur used for?
suppress noise
What is the pipeline for machine learning based image classification?
have input image
change color space and/or contrast (e.g. HSV is prominent, better edge detection…)
compute histogram of oriented gradients (HOGs)
Normalize contrast over overlapping spatial blocks
collect HOGs over detection window
apply machine learning
-> person / non-person classification
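A sketch of this pipeline with OpenCV's built-in HOG person detector (file name and window stride are assumptions):

```python
import cv2

img = cv2.imread("street.jpg")  # hypothetical input

# OpenCV ships a linear SVM trained on HOG features of pedestrians
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Slide the detection window over the image, classify each HOG block
rects, weights = hog.detectMultiScale(img, winStride=(8, 8))
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```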
Are HOGs the only solution?
no, one of many possible methods…
e.g. SIFT
SURF
How does HOG roughly work?
manually choose a kernel to compute gradients
calculate the orientation for each pixel (one of 8 angles \|/…)
=> use these features in an ML algorithm…
Pipeline to detect lanes?
extract image shape
transform image to grayscale
blur image
perform Canny edge detection
mask a special region in the image
detect Hough lines
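A sketch of the full lane pipeline in OpenCV (file name, region of interest and thresholds are assumptions):

```python
import cv2
import numpy as np

img = cv2.imread("road.jpg")          # hypothetical dashcam frame
h, w = img.shape[:2]                  # extract image shape

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # to grayscale
blur = cv2.GaussianBlur(gray, (5, 5), 0)      # blur
edges = cv2.Canny(blur, 50, 150)              # Canny edge detection

# Mask a triangular region of interest in front of the car (assumed shape)
mask = np.zeros_like(edges)
roi = np.array([[(0, h), (w // 2, h // 2), (w, h)]], np.int32)
cv2.fillPoly(mask, roi, 255)
masked = cv2.bitwise_and(edges, mask)

# Hough lines on the masked edge image
lines = cv2.HoughLinesP(masked, 1, np.pi / 180, 50,
                        minLineLength=40, maxLineGap=100)
```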