What is the goal of 3D reconstruction?
recover the 3D structure by computing the intersection of corresponding rays
-> i.e. we need 2D point correspondences…
What is the assumption we make in 3D reconstruction?
we know the relative pose between cameras
What two outputs do we generate in 3D reconstruction?
output 1 from previous methods (i.e. relative poses between cameras / absolute pose between first and following cameras)
output 2: reconstructed 3D points in world frame (camera frame of first camera)
What two reconstruction types do we differentiate?
general case for sparse reconstruction
triangulation
simplified case for dense reconstruction
depth from disparity
What is the difference between triangulation and depth from disparity?
general case:
non identical cameras and not aligned
=> have to actually backproject and intersect
simplified case
identical (at least virtually identical) cameras and aligned
What do we have to do to align cameras for the simplified case?
aligning -> creating “virtual” camera by warping so that
image planes are coplanar
epipolar lines are collinear
=> similar to human eyes…
What prior information do we have in 3D reconstruction and how did we obtain it?
extrinsic parameters (relative rotation and translation)
obtained via the epipolar constraint or other methods (e.g. PnP, ICP)
intrinsic parameters:
focal length, principal point of each camera (u0,v0)
obtained by using a calibration method (e.g. Tsai, Zhang)
What is the definition of our triangulation problem?
determining the 3D position of a point
given a set of corresponding 2D points
and known camera poses
Do we actually compute the intersection of two rays?
no
-> due to numerical errors they won't meet exactly
=> we can only compute an approximation
How do we model the world frame in triangulation?
we set left camera to world frame
-> this yields the following constraints (left camera frame = world frame):
λ1 * p1 = K1 [I | 0] * P
λ2 * p2 = K2 [R | T] * P
What is the system of equations to triangulate a point?
this yields a linear system of equations
where we have two linearly independent constraints for each camera
=> then we solve for P with SVD…
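a minimal sketch of this linear (DLT) triangulation, assuming projection matrices M1 = K1 [I | 0] and M2 = K2 [R | T] built from the known calibration; all numbers below are made up for illustration:

```python
import numpy as np

def triangulate_dlt(p1, p2, M1, M2):
    """Linear triangulation: each camera with projection matrix M and pixel (u, v)
    contributes two linearly independent rows, u*(m3 . P) - (m1 . P) = 0 and
    v*(m3 . P) - (m2 . P) = 0.  Stacking them gives A P = 0, solved with SVD."""
    u1, v1 = p1
    u2, v2 = p2
    A = np.vstack([
        u1 * M1[2] - M1[0],
        v1 * M1[2] - M1[1],
        u2 * M2[2] - M2[0],
        v2 * M2[2] - M2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    P_h = Vt[-1]                 # right singular vector of the smallest singular value
    return P_h[:3] / P_h[3]      # dehomogenize

# made-up example: identical intrinsics, pure 10 cm translation along x
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])             # left camera = world frame
M2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0], [0]])])
P_true = np.array([0.2, -0.1, 2.0])
p1 = M1 @ np.append(P_true, 1); p1 = p1[:2] / p1[2]
p2 = M2 @ np.append(P_true, 1); p2 = p2[:2] / p2[2]
print(triangulate_dlt(p1, p2, M1, M2))                        # ~ [0.2, -0.1, 2.0]
```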
What is P when we solve it with SVD?
as projection rays do not exactly intersect
-> the least-squares approximation yields the midpoint of the shortest 3D line segment connecting the two projection rays
How do we optionally proceed after triangulation?
non-linear optimization
-> refine the point by minimizing the sum of squared reprojection errors in both camera frames
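a sketch of this refinement using scipy.optimize.least_squares; reproject and refine_point are placeholder names, and P0 would be the DLT estimate from above:

```python
import numpy as np
from scipy.optimize import least_squares

def reproject(P, M):
    """Project a 3D point P with a 3x4 projection matrix M into pixel coords."""
    p = M @ np.append(P, 1.0)
    return p[:2] / p[2]

def refine_point(P0, p1, p2, M1, M2):
    """Minimize the sum of squared reprojection errors in both images,
    starting from the linear (DLT) estimate P0."""
    def residuals(P):
        return np.concatenate([reproject(P, M1) - p1,
                               reproject(P, M2) - p2])
    return least_squares(residuals, P0).x

# usage (P_dlt from the linear triangulation above):
# P_refined = refine_point(P_dlt, p1, p2, M1, M2)
```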
What is disparity?
displacement vector of a point between the left and right image
-> the brain lets us perceive disparity by using the left and right "images" from the eyes…
=> depth is inversely proportional to disparity
=> enables depth perception…
What is the baseline in stereo vision?
distance between the optical centers of the two cameras
How do we calculate depth from disparity and the baseline?
essentially, the Z coordinate is:
Z = baseline * focal length / (u_left - u_right)
=> here, only the u coords are needed, as we assume aligned cameras (i.e. the epipolar lines are collinear/horizontal… -> we only need the "x" axis)
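a tiny numeric sketch of the formula; baseline, focal length and pixel coordinates below are assumed values:

```python
# Z = baseline * focal_length / (u_left - u_right), valid for rectified/aligned cameras
b = 0.12                          # baseline [m]        (assumed)
f = 700.0                         # focal length [px]   (assumed)
u_left, u_right = 412.0, 377.0    # corresponding u coords on the same scanline
Z = b * f / (u_left - u_right)    # disparity = 35 px  ->  Z = 2.4 m
print(Z)
```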
What are advantages and disadvantages of a large baseline vs small baseline?
large:
advantage: small depth error
disadvantage: difficult search problem for close objects (projection may also be outside the right image)
small:
advantage: easier search problem for close objects
disadvantage: large depth error
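why the baseline matters quantitatively: differentiating Z = b*f/d gives the first-order depth error |ΔZ| ≈ Z²/(b*f) * |Δd|, so for the same disparity error a larger baseline yields a smaller depth error; a rough numeric sketch with assumed values:

```python
# depth error for a +-0.5 px disparity error at Z = 2.4 m
f, dd = 700.0, 0.5                  # focal length [px], disparity error [px]
Z = 2.4                             # depth of the point [m]
for b in (0.06, 0.12, 0.24):        # small vs. large baseline [m]
    dZ = Z**2 / (b * f) * dd        # first-order error propagation
    print(f"b = {b:.2f} m  ->  depth error ~ {dZ * 100:.1f} cm")
```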
What is stereo rectification?
adjusting the image pairs
-> so that they satisfy the assumptions of our stereo camera model (i.e. we can use depth from disparity…)
Do we also need stereo rectification in commercial stereo cameras?
yes!
-> never perfectly aligned
What is the goal of stereo rectification?
epipolar lines are aligned to horizontal scanlines
-> makes the correspondence search very efficient!
(figure: epipolar lines raw vs. aligned)
How do we perform stereo rectification?
warp the original image planes onto (virtual) planes
-> parallel to the baseline
=> one transformation for each image!
=> resulting in epipolar lines aligned to horizontal scanlines
What do we have to keep in mind when we transform an image plane in 3D during stereo rectification?
the coordinates of the projected 2D points change accordingly
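one common way to express this: rotating the image plane by R_rect about the optical center (keeping the intrinsics K) moves the pixels by the homography H = K * R_rect * K^(-1) (conventions for the direction of R_rect vary); a small sketch with an assumed K and a 2° rotation:

```python
import numpy as np

# assumed intrinsics and a small 2-degree rotation of the image plane about y
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
a = np.deg2rad(2.0)
R_rect = np.array([[ np.cos(a), 0, np.sin(a)],
                   [ 0,         1, 0        ],
                   [-np.sin(a), 0, np.cos(a)]])

H = K @ R_rect @ np.linalg.inv(K)   # maps old pixel coords to new pixel coords

def warp_point(p, H):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

print(warp_point((400.0, 250.0), H))
```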
What is the pipeline for stereo rectification?
define two new matrices to rotate old image planes respectively around their optical centers
-> so that the new image planes become coplanar and both are parallel to the baseline
=> ensures that the epipolar lines are parallel
also ensure that the baseline is parallel to the new x-axis
ensures that the epipolar lines are horizontal and not only parallel
in addition: corresponding points must have same y-coordinate (row ID)
requirement: new virtual cameras have same intrinsic parameters
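one possible construction of the shared rectifying rotation (in the spirit of Fusiello-style rectification, not necessarily the exact method meant here): new x-axis along the baseline, new y-axis orthogonal to it and to the old optical axis, new z-axis completing the right-handed frame; the camera centers below are assumed:

```python
import numpy as np

def rectifying_rotation(C1, C2, old_z):
    """Shared rotation for both virtual cameras: x-axis along the baseline,
    y-axis orthogonal to it and to the old optical axis, z-axis completing
    a right-handed frame (rows = new axes expressed in old coordinates)."""
    x = C2 - C1
    x = x / np.linalg.norm(x)
    y = np.cross(old_z, x); y = y / np.linalg.norm(y)
    z = np.cross(x, y)
    return np.vstack([x, y, z])

# example: optical centers ~12 cm apart, old optical axis roughly along z
R_rect = rectifying_rotation(np.array([0.0, 0.0, 0.0]),
                             np.array([0.12, 0.005, 0.002]),
                             np.array([0.0, 0.0, 1.0]))
print(R_rect)
```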
What is the result of requiring rectified cameras to have the same y-coordinates for corresponding points?
displacement of a 2D point in the image
is only caused by the extrinsic parameters (and not the intrinsics… -> as the intrinsics are the same…)
What is a disparity map?
as we assume y-coords to be the same
-> only x-coords differ (i.e. disparity…)
-> create a map where we assign each point (x,y) the disparity value x - x' -> i.e. left-image x-coord minus right-image x-coord of the corresponding point
-> close objects have bigger disparity -> are shown in brighter color in disparity map
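a brute-force sketch of how such a map could be computed once the pair is rectified: for every pixel in the left image, search along the same scanline in the right image and keep the disparity with the lowest SSD cost (window size and disparity range below are arbitrary choices, not from the source):

```python
import numpy as np

def disparity_map(left, right, max_disp=64, w=5):
    """Brute-force SSD block matching along horizontal scanlines of a
    rectified grayscale pair; returns disparity u_left - u_right per pixel."""
    h, width = left.shape
    disp = np.zeros((h, width), dtype=np.float32)
    r = w // 2
    for y in range(r, h - r):
        for x in range(r + max_disp, width - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1]
            costs = [np.sum((patch - right[y - r:y + r + 1,
                                           x - d - r:x - d + r + 1]) ** 2)
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)   # best disparity in pixels
    return disp
```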
How do we create 3D points from our disparity map?
for each pixel (u,v) with disparity d:
-> compute the depth Z = baseline * focal length / d
-> back-project: X = (u - u0) * Z / f, Y = (v - v0) * Z / f
=> yields one 3D point per pixel in the left camera frame
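a minimal sketch of this back-projection, assuming square pixels and a baseline b in meters:

```python
import numpy as np

def disparity_to_points(disp, K, b):
    """Back-project every pixel with a valid disparity into the left camera
    frame: Z = b*f/d, X = (u - u0)*Z/f, Y = (v - v0)*Z/f (square pixels assumed)."""
    f, u0, v0 = K[0, 0], K[0, 2], K[1, 2]
    v, u = np.nonzero(disp > 0)          # pixel coordinates with valid disparity
    Z = b * f / disp[v, u]
    X = (u - u0) * Z / f
    Y = (v - v0) * Z / f
    return np.column_stack([X, Y, Z])    # one 3D point per valid pixel
```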