How does sparse reconstruction usually work?
Again, what are disadvantages of feature-based methods?
creates only a sparse map of the world
does not sample across all available image data
-> discards information around edges & weak-intensity regions
What are some motivations to use direct methods to estimate relative pose?
compared to indirect methods (two steps -> first track features, then estimate motion), these are one-step methods
create a potentially denser map
more accurate
less prone to error propagation (as there is only one step…)
What is the photometric error?
The foundation of direct methods
-> i.e. brightness consistency…
we have sparse 2D-2D correspondences (monocular…)
the image is reduced to a set of sparse keypoints matched with feature descriptors
-> the reconstruction is also sparse
In direct methods, in what order are optimal pose and correspondence obtained?
simultaneously
How do we define the photometric error for a pair of pixels?
as a least-squares error on the pixel intensities
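As a worked formula (a sketch of the usual formulation; I_1 and I_2 denote the intensity functions of the two images, p_{1,i} and p_{2,i} a corresponding pixel pair):
$$ e_i = I_1(p_{1,i}) - I_2(p_{2,i}), \qquad \min \sum_i \| e_i \|_2^2 $$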
How do we obtain the reconstructed point, R, and T in the direct method?
What is the role of depth for our photometric error?
based on the depth -> we can back-project any pixel into 3D space and then project it into the next image
How can depth information be obtained?
RGB-D cameras
binocular (stereo) camera -> pixel depth based on disparity (see the formula after this list)
monocular -> have to treat depth as unknown -> optimize it along with camera pose
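For the stereo case, the standard relation (a sketch; f is the focal length, b the baseline, d the disparity) is:
$$ Z = \frac{f\, b}{d} $$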
How are two points connected w.r.t. 3D-2D geometry in our photometric error problem?
we have two image points p1 and p2 (one in each image)
they are the perspective projections of 3D point P
we do not know P, nor do we know the rotation and translation between camera 1 and 2
=> we want to express p2 as a function of p1
we can express the point P as a “stretched” version of the (normalized) direction of p1
we can then express p2 as the projection of P onto camera frame 2
in general, K is assumed to be known; R, t, and the depth Z are assumed to be unknown (see the relation below)
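As a worked relation (a sketch of the standard pinhole model matching the description above; \tilde{p} denotes homogeneous pixel coordinates, Z_1 and Z_2 the depths in the two camera frames):
$$ P = Z_1\, K^{-1}\, \tilde{p}_1, \qquad Z_2\, \tilde{p}_2 = K\,(R\,P + t) $$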
What is the goal of using the photometric error w.r.t. the equation relating p1 and p2?
brightness constancy
-> estimate the depth, R, and t
such that the photometric error between p1 and p2 is minimized…
=> which we obtain simultaneously
What is the practical setup for our matching and estimation problem?
no feature extraction, no matching, no RANSAC needed
-> directly minimize photometric error
What is our function we want to minimize?
we want to find the optimal rotation R, translation T, and points {P_i} (i.e. depth information)
so that we minimize
the error between the intensity at p1 in the left image (i.e. image k-1 -> as we assume SLAM -> a continuous stream of frames…)
and the intensity at the pixel obtained by back-projecting p1 to the 3D point P and projecting it into image k with R, T
here, we also introduce a “robust kernel” instead of the squared distance (comes in later lecture)
-> minimize the sum of these errors over all points we track (see the sketch below)
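A minimal numerical sketch of this objective (assuming numpy images; the function and kernel names, e.g. photometric_cost and huber, are illustrative and not from the lecture):

```python
import numpy as np

def huber(e, delta=10.0):
    """Robust kernel: quadratic near 0, linear for large residuals."""
    a = np.abs(e)
    return np.where(a <= delta, 0.5 * e**2, delta * (a - 0.5 * delta))

def photometric_cost(I_prev, I_curr, K, R, t, pixels, depths):
    """pixels: (N, 2) array of (u, v) in image k-1; depths: (N,) depths Z_i."""
    K_inv = np.linalg.inv(K)
    cost = 0.0
    for (u, v), Z in zip(pixels, depths):
        P = Z * (K_inv @ np.array([u, v, 1.0]))   # back-project pixel with its depth
        P2 = R @ P + t                            # transform into camera frame k
        if P2[2] <= 0:                            # behind the camera -> skip
            continue
        u2, v2 = (K @ P2)[:2] / P2[2]             # pinhole projection into image k
        h, w = I_curr.shape
        if not (0 <= u2 < w - 1 and 0 <= v2 < h - 1):
            continue                              # projected outside image k
        # nearest-neighbour lookup for brevity (bilinear interpolation in practice)
        r = float(I_prev[int(v), int(u)]) - float(I_curr[int(round(v2)), int(round(u2))])
        cost += huber(r)
    return cost
```

In a real system this cost is fed to an iterative solver (e.g. Gauss-Newton / Levenberg-Marquardt) that optimizes R, T, and the depths simultaneously.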
What are pros and cons in our photometric error minimization approach?
pros:
all image pixels can be used
higher accuracy
higher robustness to motion blur and weak textures (i.e. weak gradients)
cons:
very sensitive to initial value
limited frame to frame motion
What is a problem with using regular depth in our photometric error approach?
some features in the environment (like clouds) are very far away
-> their distance approaches infinity…
=> can cause problems with numerical stability
How do we avoid problems with numerical stability w.r.t. distance?
use inverse depth parametrization
-> replace the depth with its inverse
-> if the norm is very large (i.e. the distance from point P to the camera center c0) -> ρ goes to 0
-> if the distance is very small -> ρ becomes large
improves numerical stability (see the formula below)
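As a formula (a sketch of the usual parametrization; \tilde{p} again denotes the homogeneous pixel):
$$ \rho = \frac{1}{Z}, \qquad P = \frac{1}{\rho}\, K^{-1}\, \tilde{p} $$
so far-away points (Z → ∞) map to the bounded value ρ → 0 instead of an unbounded depth.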
Do we actually want to track all pixels in our approach?
no -> typically not necessary as some pixels might be redundant
-> choose from one of three strategies
What different strategies to choose pixels do we have?
dense direct method
use all pixels
semi-dense direct method
track only the subset of pixels with significant gradients
sparse direct method
track sparse key points
Why not to track all pixels?
not really achievable in real time
not all pixels contribute to the solution (e.g. pixels with no obvious gradient…)
What does a representative image of tracking all pixels look like?
What is the semi-dense approach?
-> if the gradient is 0 -> the Jacobian is 0 (i.e. no contribution to the problem)
-> only use pixels with high gradients (-> discard areas where the gradient is not obvious; see the selection sketch below)
use the tracked pixels to reconstruct a semi-dense structure
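A small selection sketch (assumed names, not the lecture's code): keep only pixels whose gradient magnitude exceeds a threshold.

```python
import numpy as np

def select_high_gradient_pixels(image, threshold=20.0):
    """image: 2D float array; returns an (N, 2) array of (u, v) pixel coordinates."""
    gy, gx = np.gradient(image.astype(np.float64))   # derivatives along rows and columns
    magnitude = np.sqrt(gx**2 + gy**2)
    v, u = np.nonzero(magnitude > threshold)         # row indices = v, column indices = u
    return np.stack([u, v], axis=1)
```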
What is the sparse direct approach?
yields fewer but more reliable pixels
kind of combine indirect method with direct
-> use e.g. Harris to extract keypoints
but instead of using e.g. SIFT -> we only keep the position, not a descriptor (see the sketch below)
-> thus, faster than the indirect method, as no feature matching is needed to establish correspondences…
fastest method
but can only calculate sparse reconstruction
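A sketch of the extraction step (assuming OpenCV is available; names and parameters are illustrative): positions only, no descriptors and no matching.

```python
import cv2

def extract_keypoint_positions(gray, max_corners=500):
    """gray: uint8 grayscale image; returns an (N, 1, 2) array of corner positions (or None)."""
    corners = cv2.goodFeaturesToTrack(
        gray, maxCorners=max_corners, qualityLevel=0.01, minDistance=10,
        useHarrisDetector=True)   # Harris corner response, but no descriptor is computed
    return corners                # these positions are then tracked by minimizing the photometric error
```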
What is the general influence of the motion baseline (i.e. frame-to-frame motion) on the convergence rate of our direct method?
direct SLAM not suitable for large baselines for two reasons:
initial pose may be unreliable -> leads to local minimum
photometric consistency assumption not satisfied
-> in practice, small baselines are preferred
What are some direct and indirect SLAM methods?
indirect:
PTAM
ORB-SLAM
SVO
direct:
SVO (semi-direct -> combines direct and indirect)
LSD-SLAM
DSO
What is photometric calibration?
reduce various effects that affect our brightness constancy assumption
By what may our brightness consistency assumption be affected?
different exposure times
vignetting
response function
How does the response function affect the brightness consistency assumption?
the response function maps irradiance (light energy per unit area and time falling onto the sensor) into brightness
-> it is not linear
=> thus, to correctly calculate the difference in “brightness” -> we should use irradiance (as brightness is a non-linear function of irradiance and thus somewhat biased…)
=> thus we should calibrate the response function
How does exposure time affect brightness constancy?
longer exposure -> brighter image
in practice: a pair of images may have different exposure times (e.g. a cell phone adjusts it automatically…)
=> given we consider consistency of irradiance rather than brightness
-> we have to calibrate for the exposure time so that both images refer to the same value
How does vignetting affect brightness (irridiance) consistency?
vignetting -> reduction of image brightness towards the periphery compared to the image center
mainly caused by manufacturing flaws
=> should remove this effect before applying the photometric loss…
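Putting the three effects together, a common sketch of the image formation model (G: response function, t: exposure time, V(x): vignetting, I(x): irradiance, B(x): observed brightness) is:
$$ B(x) = G\big(t\, V(x)\, I(x)\big) \quad\Rightarrow\quad I(x) = \frac{G^{-1}(B(x))}{t\, V(x)} $$
so photometric calibration inverts G and divides out t and V(x); the photometric error is then computed on the corrected values I(x).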