What are general limitations in image line detection?
incomplete detection due to limitations of existing algorithms (only part of the line is detected…)
truncated image line due to limited FOV (the rest of the line lies outside the current image…)
What is the projection ambiguity problem?
consider: we have several 3D lines lying on the same projection plane
each 3D line has unique Plücker coordinates
=> yet all of them project to the same image line, so the image line alone does not tell us which 3D line produced it…
What is calibration?
process to determine
extrinsic parameters (R,T) of camera
intrinsic parameters (K plus lens distortion)
What is camera localization?
for known intrinsic parameters
-> figure out extrinsic parameters
=> covered in detail in a later chapter…
When doing calibration, why are we not really interested in extrinsic parameters?
because in later tasks they change (the camera moves)
-> but the intrinsic parameters stay fixed
=> only store the intrinsic parameters for later use…
Of what does Tsai’s method consist?
measuring the 3D positions of n >= 6 control points on a 3D calibration target
and the 2D coordinates of their projections in the image
=> based on only a single image
plus prior knowledge about e.g. the size of each square…
How can we solve the calibration problem using direct linear transform?
use standard methods to solve this linear system of equations…
How can one rewrite the regular formula for calibration to solve it (Tsai’s method)?
multiply both intrinsic and extrinsic matrices
rewrite the individual elements to m11, m12, …, m34
=> write the rows of M as vectors m1^T, m2^T, m3^T; abbreviate the homogeneous world point as P
set up a homogeneous linear system
-> the denominator m3^T P is exactly lambda from the left-hand side, which is why we divide by it…
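A worked form of this rewrite, as a sketch in standard pinhole notation. The symbols lambda, (u, v), m_i^T and P follow the notes; the exact layout of the original equation is an assumption.

```latex
\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \, [R \mid T] \, P
= \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix} P,
\qquad P = \begin{bmatrix} X_w & Y_w & Z_w & 1 \end{bmatrix}^T

% third row: \lambda = m_3^T P, so dividing the first two rows by it gives
u = \frac{m_1^T P}{m_3^T P}, \qquad v = \frac{m_2^T P}{m_3^T P}

% multiplying out yields two equations that are linear in the entries of M
(m_1^T - u \, m_3^T) P = 0, \qquad (m_2^T - v \, m_3^T) P = 0
```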
What is the final linear system for Tsai’s method to solve for?
=> (reminder: P holds the world coordinates, which we know…)
for n points -> stack all these equations into a big matrix
What is the general “formula” to create the final matrix Q?
For each point, add two rows to the Q matrix:
(here split to have the image bigger…)
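The two rows per point can be sketched in Python with NumPy. This is a minimal illustration under assumptions: the function name `build_dlt_matrix` is hypothetical, M is assumed to be flattened row by row as (m11, …, m34), and each point is assumed to contribute the rows (P^T, 0, -u P^T) and (0, P^T, -v P^T).

```python
import numpy as np

def build_dlt_matrix(world_pts, image_pts):
    """Stack two rows per 3D-2D correspondence into Q (shape 2n x 12).

    Assumed ordering of the unknown vector: M flattened row by row.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        P = [X, Y, Z, 1.0]                                 # homogeneous world point
        rows.append(P + [0.0] * 4 + [-u * p for p in P])   # (m1^T - u m3^T) P = 0
        rows.append([0.0] * 4 + P + [-v * p for p in P])   # (m2^T - v m3^T) P = 0
    return np.array(rows, dtype=float)
```

For a noise-free set of correspondences, `Q @ M.reshape(-1)` should be numerically zero for the true projection matrix M.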
Does the scale of M initially matter for Tsai’s method?
no -> we have the homogeneous linear system QM = 0
=> the actual scale can be recovered later on from the last element of K (K is part of the decomposition of M), which we know must be 1…
To solve QM = 0 (Tsai), which rank must Q have?
rank 11 -> unique non-trivial solution up to scale (M spans the one-dimensional null space of Q)
M has 12 entries, so rank 12 would leave only the trivial solution M = 0
the missing scale is fixed later on…
How many point correspondences are needed for Tsai’s method?
each 3D-2D correspondence provides 2 independent equations
-> 11 independent equations needed (M has 12 entries, minus 1 for the free scale)
-> 5.5, i.e. in practice 6 point correspondences
How can we solve QM = 0 when the system is overdetermined? (n >= 6 points)
least-squares solution
-> minimize the sum of squared residuals ||QM||^2
subject to the constraint ||M||^2 = 1 (why is explained later)
=> can be solved using SVD
Why do we need the constraint ||M||^2 = 1 in Tsai’s method?
the constraint is introduced to exclude the trivial solution M = 0, which would otherwise always minimize ||QM||^2…!
How can we apply SVD to solve QM = 0 in Tsai’s method?
Q is known
M is unknown
we want to minimize the residuals
min ||QM||_2^2
as we want to find M
arg min_M ||QM||_2^2
subject to ||M||_2 = 1
use the SVD of Q
Q = U Σ V^T
the optimal solution M* is the column of V corresponding to the smallest singular value (the singular values are the diagonal entries of Σ)
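A minimal NumPy sketch of this step, assuming Q has already been built. The helper name `solve_homogeneous` is hypothetical. `np.linalg.svd` returns V^T with singular values in descending order, so the last row of V^T is the column of V belonging to the smallest singular value.

```python
import numpy as np

def solve_homogeneous(Q):
    """Return the unit-norm vector m that minimizes ||Q m||_2."""
    _, _, Vt = np.linalg.svd(Q)   # singular values come sorted, largest first
    return Vt[-1]                 # right singular vector of the smallest one
```

For Tsai's method the result would be reshaped with `m.reshape(3, 4)` to obtain M (up to scale and sign).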
When we found M, how can we recover the extrinsic and intrinsic parameters?
M = K(R|T)
M now known, K(R|T) still unknown
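use QR factorization (its orthogonal factor matches the rotation matrix, which must be orthogonal because it lies in SO(3))
-> QR (or rather RQ) decomposes the left 3x3 block of M, which equals KR, into an upper triangular matrix (i.e. K) and an orthogonal matrix (i.e. R)
=> the factorization only works for square matrices, hence only the 3x3 block; T then follows from the last column of M, which equals KT…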
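A sketch of the decomposition with NumPy only. Assumptions: the name `decompose_projection` is hypothetical, the RQ step is emulated with `np.linalg.qr` on a row-flipped, transposed matrix, the diagonal of K is taken to be positive, and the unknown sign of M is chosen so that det(R) = +1.

```python
import numpy as np

def decompose_projection(M):
    """Split a 3x4 matrix M ~ K [R | T] (known up to scale) into K, R, T."""
    A = M[:, :3]                          # left 3x3 block, equals K R up to scale
    flip = np.fliplr(np.eye(3))           # anti-identity used for the RQ trick
    Q_, R_ = np.linalg.qr((flip @ A).T)   # QR of the flipped, transposed block
    K = flip @ R_.T @ flip                # upper triangular factor
    R = flip @ Q_.T                       # orthogonal factor, K @ R == A
    # Make the diagonal of K positive; D @ D = I keeps the product unchanged.
    signs = np.sign(np.diag(K))
    signs[signs == 0] = 1.0
    D = np.diag(signs)
    K, R = K @ D, D @ R
    # M is only known up to sign: pick the sign that makes R a proper rotation.
    s = 1.0
    if np.linalg.det(R) < 0:
        R, s = -R, -1.0
    T = np.linalg.solve(K, s * M[:, 3])   # last column of M equals K T
    return K / K[2, 2], R, T              # normalize so that K[2, 2] == 1
```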
What is the practical setup of Tsai’s method? What are its advantages?
requires only a single picture
corners can be detected with very high accuracy (currently < 0.1 pixels)
use many more than 6 points (ideally more than 20), and they must be non-coplanar
What is coplanarity?
a set of points is called coplanar
if there exists a plane in 3D space on which all of these points lie
=> can fit a plane in 3D space through all points…
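One way to test this numerically, as a sketch: center the points and look at the smallest singular value, which is (close to) zero exactly when a plane fits all points. The function name `are_coplanar` and the tolerance are assumptions chosen for illustration.

```python
import numpy as np

def are_coplanar(points, tol=1e-9):
    """True if all 3D points lie (numerically) on one plane."""
    pts = np.asarray(points, dtype=float)
    if len(pts) <= 3:
        return True                          # up to 3 points always share a plane
    centered = pts - pts.mean(axis=0)        # plane must pass through the centroid
    s = np.linalg.svd(centered, compute_uv=False)
    return bool(s[-1] <= tol * max(s[0], 1.0))
```

Chessboard corners (all with Z = 0) pass this test, while the corners of a cube do not.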
Compare Tsai’s and Zhang’s methods w.r.t. their requirements
Tsai: non-coplanar points (more complex target such as a cube)
Zhang: coplanar points (e.g. chessboard)
What is the mathematical approach to Zhang’s method?
same as in Tsai’s: neglect radial distortion
but since all points are coplanar: Zw can be set to 0 (only the X and Y axes matter…)
=> choose the world frame so that its Z axis is perpendicular to the plane of the 3D points (the plane is then Zw = 0)
-> written more compactly, the terms multiplied by 0 can simply be left out, which reduces the size of the matrix…
How to rewrite Zhang’s equations considering Zw = 0?
where, similar to Tsai’s M, H represents the combined intrinsic (calibration) and extrinsic (rotation and translation) matrices
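Written out, as a sketch in standard notation: r_1, r_2, r_3 are assumed to denote the columns of R, and the exact layout of the original equation is an assumption.

```latex
\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \begin{bmatrix} r_1 & r_2 & r_3 & T \end{bmatrix}
  \begin{bmatrix} X_w \\ Y_w \\ 0 \\ 1 \end{bmatrix}
= K \begin{bmatrix} r_1 & r_2 & T \end{bmatrix}
  \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix}
= H \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix}
```

The column r_3 only ever multiplies Z_w = 0, so it drops out and H is a 3x3 matrix.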
How do we transform the matrix in Zhang’s method to be able to solve for H?
here, H is called a homography
and keep in mind -> Zhang’s method requires several images, so P belongs to one of them
-> the same procedure is then applied to the i-th image
How do we create the matrix in Zhang’s method for multiple points? (n points)
where Q on the left-hand side is known (we know the points on the target and their measured image coordinates) and H is unknown
How do we solve the linear system for Zhang’s method?
again, search for the minimal solution
-> Q (2n x 9) should have rank 8 to have a unique (up to scale) non-trivial solution
detailed explanation later when discussing homographies…
each point correspondence again provides 2 independent equations
=> thus a minimum of 4 non-collinear points is required
collinear: lying on a single line
for n >= 4 points, the solution is found as in Tsai’s method using SVD (as we again minimize the residuals…)
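A compact sketch of the homography estimate for one image. Assumptions: the name `estimate_homography` is hypothetical, H is flattened row by row, and the result is normalized by its last entry, which requires H[2, 2] to be non-zero. Coordinate normalization, which is commonly used for numerical stability, is left out.

```python
import numpy as np

def estimate_homography(plane_pts, image_pts):
    """DLT estimate of H from n >= 4 correspondences (X, Y) <-> (u, v)."""
    rows = []
    for (X, Y), (u, v) in zip(plane_pts, image_pts):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    Q = np.array(rows, dtype=float)        # shape 2n x 9
    _, _, Vt = np.linalg.svd(Q)
    H = Vt[-1].reshape(3, 3)               # singular vector of the smallest value
    return H / H[2, 2]                     # fix the free scale (assumes H[2,2] != 0)
```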
In Zhang’s method, once we have solved for H, how can we decompose it into intrinsic and extrinsic matrices?
QR not possible
=> as the extrinsic part (r1, r2, T) is not orthogonal!
=> compared to Tsai’s: one column (r3) is missing…
first focus on recovering the intrinsic matrix
and then, with it fixed, recover the extrinsic matrix from a regular linear system…
How can we recover only the intrinsic parameters?
intrinsic parameters stay the same for different camera perspectives (for the same camera)
only the extrinsic parameters change
=> find the intrinsic parameters by using multiple images from different views…
How can we specifically recover the intrinsic parameters?
each view j has a different homography H^j
and also different R^j (rotation) and T^j (translation) parameters
but K stays the same for all views… (intrinsic…)
also denoted as M
=> in Zhang’s method, we were able to recover the individual homography of each image but not yet its decomposition
=> use the regular DLT method
After getting the values of the homography in Zhang’s method, how can we start recovering the camera parameters?
determine the intrinsic matrix M from a set of known homographies
express the columns of the rotation in terms of the unknown intrinsic parameters
then enforce the constraints of the rotation matrix
What constraints w.r.t. M do we enforce in Zhang’s method?
the columns of a rotation matrix are orthogonal -> the dot product r1^T r2 must be 0…
the columns of a rotation matrix have unit length -> r1^T r1 = r2^T r2 (= 1)…
-> as the parts in the red squares are the same in both constraints, replace them with the variable B (B = M^-T M^-1)
How can we solve for the intrinsic parameters after enforcing the constraints?
once we can estimate B -> we can recover M from the equation above
as B is symmetric, we are only interested in 6 of its elements
h is known (obtained from DLT in the previous step)
-> rewrite the above equations as a linear system
How can we rewrite B as a linear system of equations?
we have two constraints per homography
-> each homography contributes two linear equations
stack the 2N equations from N views to yield the linear system Ab = 0
solve for b using SVD
=> we need at least 3 views (b has 6 entries and is defined up to scale; each view provides two constraints), in practice more
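A sketch of this linear step and of one way to get the intrinsic matrix back from B. Assumptions, none of which are stated in the notes: B equals the inverse-transpose of the intrinsic matrix times its inverse (up to scale), b is ordered as (B11, B12, B22, B13, B23, B33), the intrinsic matrix is recovered with a Cholesky factorization rather than closed-form formulas, and the names `_v` and `intrinsics_from_homographies` are hypothetical.

```python
import numpy as np

def _v(H, i, j):
    """Row vector v_ij such that h_i^T B h_j == v_ij @ b (h_i = i-th column of H)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([
        hi[0] * hj[0],
        hi[0] * hj[1] + hi[1] * hj[0],
        hi[1] * hj[1],
        hi[2] * hj[0] + hi[0] * hj[2],
        hi[2] * hj[1] + hi[1] * hj[2],
        hi[2] * hj[2],
    ])

def intrinsics_from_homographies(homographies):
    """Estimate the intrinsic matrix from >= 3 homographies of a planar target."""
    A = []
    for H in homographies:
        A.append(_v(H, 0, 1))                 # orthogonality: h1^T B h2 = 0
        A.append(_v(H, 0, 0) - _v(H, 1, 1))   # equal length: h1^T B h1 = h2^T B h2
    _, _, Vt = np.linalg.svd(np.array(A))
    b = Vt[-1]                                # solution of A b = 0 up to scale
    B = np.array([[b[0], b[1], b[3]],
                  [b[1], b[2], b[4]],
                  [b[3], b[4], b[5]]])
    if B[0, 0] < 0:                           # fix the arbitrary sign of b
        B = -B
    L = np.linalg.cholesky(B)                 # B = L L^T, L ~ inverse-transpose of K
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

With noisy homographies B may fail to be positive definite, in which case `np.linalg.cholesky` raises an error; this sketch does not handle that case.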
After obtaining the intrinsic parameters using Zhang’s method, how can we recover the extrinsic parameters?
we now know the intrinsic parameters and the homography
thus, from the previous equations, compute each column
and enforce the constraints on scale
(each column, not only r1…)
-> after this, Ri = (r1, r2, r3) might not be strictly orthogonal (but should be)
=> last step: project the result from matrix space onto the SO(3) manifold to enforce orthogonality (not covered in more detail; only keep in mind that an additional step is required after the previous one to enforce strict orthogonality…)
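A sketch of these steps for one view. Assumptions: the function name `extrinsics_from_homography` is hypothetical, the scale is taken from the length of the first column, the sign is chosen so that the target lies in front of the camera (T_z > 0), r3 is taken as the cross product of r1 and r2, and the projection onto SO(3) uses the SVD of the approximate rotation (the closest rotation in the Frobenius norm).

```python
import numpy as np

def extrinsics_from_homography(K, H):
    """Recover R and T from a homography H ~ K [r1 r2 T] with known K."""
    A = np.linalg.solve(K, H)                 # K^-1 H = [r1 r2 T] up to scale
    lam = 1.0 / np.linalg.norm(A[:, 0])       # scale so that ||r1|| = 1
    if A[2, 2] < 0:                           # assumed sign choice: T_z > 0
        lam = -lam
    r1, r2, T = lam * A[:, 0], lam * A[:, 1], lam * A[:, 2]
    r3 = np.cross(r1, r2)                     # third column completes the frame
    R_approx = np.column_stack([r1, r2, r3])  # may not be exactly orthogonal
    # Project onto SO(3): replace the singular values by 1, keep det = +1.
    U, _, Vt = np.linalg.svd(R_approx)
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    return R, T
```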
Recap: what types of distortion are there?
radial distortion: light rays bend more near the edges of the lens than they do at its optical center
tangential distortion: occurs if the lens is misaligned (not perfectly parallel to the image sensor)
How to perform joint estimation?
given object points and image points (detected chessboard corners)
-> conduct Zhang’s method
compute initial intrinsic parameters
while initially setting all distortion coefficients to 0
estimate initial extrinsic parameters as if the intrinsic parameters were already known
run gradient descent to minimize the reprojection error, jointly optimizing/estimating intrinsic, extrinsic and distortion parameters
How is the reprojection error defined?
Euclidean distance in pixels between
the observed image point and the corresponding 3D point reprojected onto the image
observed: where the point was actually detected in the image
reprojected: where it should be according to the current parameters
How is the overall cost/objective function defined, i.e. the training goal?
where, inside the sum, the left-hand term is the observed point and the right-hand term is the reprojected point
=> minimized by gradient descent
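A minimal sketch of such a cost for a single view. This is a simplification under assumptions: lens distortion is not modeled, the cost is taken as the sum of squared pixel distances, and the names `project` and `reprojection_cost` are hypothetical.

```python
import numpy as np

def project(K, R, T, world_pts):
    """Project n x 3 world points to n x 2 pixel coordinates (no distortion)."""
    cam = world_pts @ R.T + T            # world frame -> camera frame
    uvw = cam @ K.T                      # camera frame -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]      # divide by depth

def reprojection_cost(K, R, T, world_pts, observed_px):
    """Sum of squared distances between observed and reprojected points."""
    residuals = observed_px - project(K, R, T, world_pts)
    return float(np.sum(residuals ** 2))
```

In the joint estimation this cost would be summed over all views and minimized over the intrinsic, extrinsic and distortion parameters; the optimizer itself is not shown here.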