How to calculate FOV?
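a common form (sketch; assuming sensor width w and focal length f in the same units, e.g. both in pixels):
\mathrm{FOV} = 2\arctan\!\left(\frac{w}{2f}\right)
(horizontal FOV; use the sensor height instead of w for the vertical FOV)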
What is a normalized image?
virtual image plane
-> with focal length = 1 unit
origin of pixel coords at principal point
How to calculate a normalized image?
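a minimal sketch, assuming pixel coordinates (u, v) and intrinsic matrix K:
\begin{pmatrix} x_n \\ y_n \\ 1 \end{pmatrix} \simeq K^{-1} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}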
Why does it make sense to normalize an image?
unnormalized -> we simply have (u,v)
-> normalized -> we can calculate the direction ray, as the image point now becomes a direction from the camera (optical) center …
-> [u,v,1]…
=> easier to calc backprojections…
=> 3D points and normalized image points are collinear
-> as the vector from O -> [u,v,1] points in the same direction as the vector from O -> P (just with a different scale…)
also helpful for expressing certain geometric constraints (e.g. the coplanarity constraint)
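a small numpy sketch of the backprojection idea (the intrinsics, pixel and depth values are made up for illustration):
```python
import numpy as np

# assumed intrinsics (fx, fy, cx, cy are illustrative values)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

u, v = 400.0, 260.0                                  # unnormalized pixel (u, v)
p_norm = np.linalg.inv(K) @ np.array([u, v, 1.0])    # normalized point [x, y, 1]

# the normalized point is a direction ray from the camera center O:
# every 3D point P = depth * p_norm (camera frame) projects back onto (u, v)
depth = 2.5                                          # assumed depth
P_cam = depth * p_norm
```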
From our findings how can we convert between vanishing point and vanishing direction?
vanishing point as projection of point at infinity
vanishing direction as image normalization of our vanishing point
What is the geometric meaning of normalized image coordinates?
collinearity between 3D vectors in the camera frame
(see lecture C03 part 1 slide 37)
How do vanishing point and vanishing direction relate?
vanishing direction -> line through camera center and vanishing point
-> can be computed by image normalization
How to calculate a vanishing point?
consider a 3D line through P0 with unit direction D (the vanishing direction); the vanishing point is where the projections of its points converge
points on the line: P = P0 + tD
project the point and divide by t (homogeneous coordinates are scale-invariant), which yields
if we let t -> infinity, the P0/t term vanishes and the projection converges to the vanishing point
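a sketch of the limit, with K the intrinsics and ≃ meaning equality up to scale:
p(t) \simeq K(P_0 + tD) \simeq K\left(\frac{P_0}{t} + D\right) \;\xrightarrow{\;t\to\infty\;}\; v \simeq K D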
What is the inverse of our intrinsic matrix?
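assuming zero skew, the standard closed form:
K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}, \qquad K^{-1} = \begin{pmatrix} 1/f_x & 0 & -c_x/f_x \\ 0 & 1/f_y & -c_y/f_y \\ 0 & 0 & 1 \end{pmatrix}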
How do we recover the transformation parameters in 3D 3D geometry?
provided we have two point clouds
assumption: we know the 3D-3D correspondences
compute the center of mass for both
point set normalization by normalizing both 3D point sets with their respective center of mass
compute the W matrix
perform SVD on the W matrix
compute rotation and translation from the SVD
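a compact numpy sketch of these steps (function name and the reflection guard via the det term are my own choices; convention x ≈ R p + t, i.e. the p -> x transformation):
```python
import numpy as np

def align_3d_3d(p, x):
    """Estimate R, t with x_i ~= R @ p_i + t from known 3D-3D correspondences.
    p, x: (N, 3) arrays of corresponding 3D points."""
    # 1) center of mass of both point sets
    p_mean, x_mean = p.mean(axis=0), x.mean(axis=0)
    # 2) point set normalization (subtract the centroids)
    p_c, x_c = p - p_mean, x - x_mean
    # 3) W matrix: sum of outer products x'_i p'_i^T over all correspondences
    W = x_c.T @ p_c                                   # 3x3
    # 4) SVD of W
    U, _, Vt = np.linalg.svd(W)
    # 5) rotation (det term guards against a reflection) and translation
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    R = U @ D @ Vt
    t = x_mean - R @ p_mean
    return R, t
```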
What is the idea of point set normalization?
move the center of the point cloud
to the origin of our coordinate system
-> thus, we try to find the rotation and translation to align the origins in such a way
that the point correspondences match in 3D…
How do we compute the center of mass for each point set?
How do we perform point set normalization?
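in formulas (covering both cards above, with N correspondences):
\bar{p} = \frac{1}{N}\sum_{i=1}^{N} p_i, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
p_i' = p_i - \bar{p}, \qquad x_i' = x_i - \bar{x}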
How do we calculate our W matrix for SVD (3D-3D)?
assume: same cardinality for both point sets
-> for each correspondence
-> calculate the outer product x'_i p'_i^T
=> yielding a 3x3 matrix
and then sum these 3x3 matrices element-wise over all correspondences…
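as a formula:
W = \sum_{i=1}^{N} x_i' \, {p_i'}^{T} \quad (3 \times 3)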
What transformation do we obtain with our 3D-3D conclusion?
transformation from p -> x
How do we solve the W matrix for rotation and translation?
conduct SVD decomposition of W
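the standard solution (sketch; the det term enforces a proper rotation with det R = +1):
W = U \Sigma V^{T}, \qquad R = U \,\mathrm{diag}\!\left(1, 1, \det(UV^{T})\right) V^{T}, \qquad t = \bar{x} - R\,\bar{p}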
What is the difference between homogeneous coordinates and normalized coordinates?
homogeneous coordinates -> still a 2D point, but extended to allow certain calculations
normalized coordinates -> 3D direction from the camera center, collinear with the 3D point that projects onto it…
What is the essential matrix?
It encodes the geometric relationship between two calibrated cameras w.r.t. rotation and translation
it also encodes epipolar geometry
describes the relationship between points in one camera view and the corresponding epipolar lines in the other camera view
used e.g. in the 8-point method to compute the relative pose
How do we create the essential matrix?
we have to work with normalized image coordinates
we define n as the normal of the epipolar plane
then, we can express the orthogonality constraint in the right frame as
we can express the normal n in the right camera frame as the cross product of the translation vector and the direction p_1', i.e. the point direction from the left camera frame rotated into the right camera frame
we can then replace p_1' with R p_1 (the rotated left-camera direction) to bring both frames into the equation… (before, only the right camera frame was considered…)
this puts the left and right camera frames in relation (p2 right, p1 left)
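written out (p_1, p_2 in normalized image coordinates):
n = T \times p_1' = T \times (R\,p_1), \qquad p_2^{T} n = 0 \;\Rightarrow\; p_2^{T}\,(T \times R\,p_1) = 0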
What is the relation between T, p1 and p2 in our essential matrix?
they are coplanar w.r.t. the epipolar plane
How do we get from the coplanarity constraint to our final essential matrix?
use the conclusion from before (coplanarity constraint relating a point correspondence in the two camera frames)
here, we replace the cross product with the skew-symmetric matrix
combining T_x and R -> we get to our essential matrix that encodes rotation and translation as well as the coplanarity constraint (epipolar geometry) between two calibrated cameras, provided a point correspondence
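in formulas:
E = [T]_{\times} R, \qquad p_2^{T} E \, p_1 = 0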
How do we decompose an essential matrix?
i.e. used in 5/8 point method
-> use point correspondences and previous conclusions to derive the matrix
decompose with SVD
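one standard recipe (a sketch; the four R, T combinations are disambiguated by keeping the one where triangulated points lie in front of both cameras):
E = U \Sigma V^{T}, \qquad W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
R = U W V^{T} \;\text{ or }\; U W^{T} V^{T}, \qquad T \simeq \pm u_3 \;(\text{third column of } U)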
How is a skew symmetric matrix defined?
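for a vector a = (a_1, a_2, a_3)^T:
[a]_{\times} = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix}, \qquad [a]_{\times}\, b = a \times b, \qquad [a]_{\times}^{T} = -[a]_{\times}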
What is the difference between the essential and the fundamental matrix?
in the fundamental matrix
-> we do not have normalized image coordinates
-> and we do not know the intrinsic parameters
it relates to the essential matrix by:
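(the standard relation, with K_1, K_2 the intrinsic matrices of the two cameras; K_1 = K_2 = K if the same camera is used)
F = K_2^{-T} E \, K_1^{-1}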
In our essential / fundamental matrix conclusion, from which camera center to which do we compute the rotation and translation?
from left camera (c1) to right camera (c2)
What is the advantage of the fundamental matrix over the essential?
we can work directly in ordinary image plane
instead of normalized image plane
-> i.e. can be used if we do not know intrinsic parameters K and thus cannot normalize…
Which matrix is used in the eight point method, which in the five point method?
eight point:
fundamental
five point:
essential
What is a homography?
transformation of (typically 2D-2D) point correspondences
can also be used for 3D-2D, e.g. in Zhang's calibration method…
based on perspective projection (more general than an affine transformation)
encodes co-planarity information
=> encodes transformation information of points that lie on the same plane
=> between two camera views from different perspectives
How do we calculate a homography?
provided we have several points in 3D on a plane and their corresponding points in both camera frames
we can express the 3D plane w.r.t. one of the points as follows:
we then can write an equation to express the point p2 in the right camera frame depending on the point P in 3D
i.e. apply extrinsic and intrinsic parameters…
then, we can introduce our 3D plane expression by rewriting it so that it equals 1 (n^T P / d = 1)
here, everything still happens in the right camera frame (c2)
we can then use the distributive law to "extract" the point P
we then can rewrite the point P w.r.t. its projected point p1 in the left camera frame (normalized image coordinates -> thus the K^-1 is for camera 1…) while the first K is for camera 2…
here, p1 and p2 by themselves are homogeneous coordinates!
=> here, we assume same calibration as we want to use it for SLAM (i.e. used same camera…)
we can then extract this block in front of p1 to get our homography
which encodes the relative camera pose information
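written out, the standard plane-induced homography (n, d describe the plane in the left camera frame; with the same-camera assumption above, K_1 = K_2 = K):
H = K_2 \left( R + \frac{T\,n^{T}}{d} \right) K_1^{-1}, \qquad p_2 \simeq H\, p_1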
What is our plane equation?
normal n
a point P lying on the plane
and the distance d from the origin to the plane
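as an equation:
n^{T} P = d \quad\Longleftrightarrow\quad \frac{n^{T} P}{d} = 1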
How do we calculate SSD?
we iterate over all pixels in the left patch (H) and the right patch (F)
-> sum up the squares of their differences…
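as a formula (patches of equal size, summed over all pixel positions (i, j)):
\mathrm{SSD} = \sum_{(i,j)} \left( H(i,j) - F(i,j) \right)^{2}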
What is the relationship between disparity and depth?
Zp -> depth information
b -> baseline length
f -> focal length
ul -> x-coords in left image
ur -> x-coords in right image
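for a rectified stereo pair:
Z_p = \frac{b \cdot f}{u_l - u_r} \qquad (u_l - u_r \;\text{is the disparity})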
What is the definition of the photometric error for a single pixel?
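a sketch of the usual definition (with p_2 the pixel p_1 mapped into the second image):
e = I_1(p_1) - I_2(p_2)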
where e is the pixel-wise error
I1,I2 are the images (i.e. 2D matrix with intensity values)
p1, p2 are the pixel coordinates
How do we extend the photometric error to multiple pixels?
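typically (a sketch) by summing the squared pixel-wise errors over a set of pixels / a patch:
E = \sum_{i} \left( I_1(p_{1,i}) - I_2(p_{2,i}) \right)^{2}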
How can we optimize triangulation?
minimize both left and right reprojection error
How is the reprojection error defined?
left part of the reprojection error
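presumably of the form (sketch; squared distance between observed and reprojected point):
e_1 = \left\| p_1 - \pi(P, K_1, I, 0) \right\|^{2}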
here, reprojection error means that after triangulation, we project the triangulated point back and compare it to the initial point we used to triangulate it
p1 is the initial point we used for triangulation
π(P, K1, I, 0) is the projection of the triangulated point
-> i.e. K1 [I | 0] P
-> here [I | 0], as we assign the left camera frame as the world frame
the right part is
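analogously (sketch), with the full objective minimized over the 3D point P:
e_2 = \left\| p_2 - \pi(P, K_2, R, T) \right\|^{2}, \qquad \min_{P}\; e_1 + e_2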
where again p2 is the point in the right camera frame, and we project the 3D point using K2, R and T (assuming we know the pose…)