What are general limitations in image line detection?

incomplete detection due to limitations of existing algorithms (only part of the line is detected…)

partially visible image line due to limited FOV (rest of the line lies outside the current image…)

What is the projection ambiguity problem?

consider: we have several 3D lines lying on the same projection plane

each 3D line has unique Plücker coordinates

=> thus they have different scales when the projection is not taken into account…

What is calibration?

process to determine

extrinsic parameters (R,T) of camera

intrinsic parameters (K plus lens distortion)

What is camera localization?

for known intrinsic parameters

-> figure out extrinsic parameters

=> covered in detail in a later chapter…

When doing calibration, why are we not really interested in extrinsic parameters?

because they change in later tasks

-> but the intrinsic parameters stay fixed

=> only store the intrinsic parameters for later use…

Of what does Tsai’s method consist?

measuring the 3D positions of n >= 6 control points on a 3D calibration target

and the 2D coordinates of their projections in the image

=> based on only a single image

plus prior knowledge about e.g. the size of each square…

How can we solve the calibration problem using direct linear transform?

=>

use standard methods to solve this linear system of equations…

How can one rewrite the regular calibration formula to solve it (Tsai’s method)?

multiply both intrinsic and extrinsic matrices

rewrite the individual elements to m11, m12, …, m34

=> write the rows of the product as vectors m1^T, m2^T, m3^T; replace the vector of world coordinates on the right with P

Set up the homogeneous linear system

-> here, the denominator (with m3^T) appears because it equals lambda according to the left-hand side…
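
The rewriting steps can be written out as follows. This is a sketch under the usual pinhole assumptions (no lens distortion); the symbols u, v for the pixel coordinates and X_w, Y_w, Z_w for the world coordinates are assumed here, since the original equation image is not available.

```latex
\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= K \, [R \mid T] \begin{pmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{pmatrix}
= \begin{pmatrix} m_1^T \\ m_2^T \\ m_3^T \end{pmatrix} P,
\qquad \lambda = m_3^T P

u = \frac{m_1^T P}{m_3^T P}, \qquad v = \frac{m_2^T P}{m_3^T P}
\quad\Longrightarrow\quad
(m_1^T - u \, m_3^T) \, P = 0, \qquad (m_2^T - v \, m_3^T) \, P = 0
```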

What is the final linear system to solve in Tsai’s method?

=> (reminder: P are the world coordinates, which we know…)

for n points -> stack all these equations into a big matrix

What is the general “formula” to create the final matrix Q?

For each point, add two rows to the Q matrix:

(here split to have the image bigger…)
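
A minimal sketch of the two-rows-per-point rule, assuming NumPy and the row layout that follows from u (m3^T P) = m1^T P and v (m3^T P) = m2^T P. The function name build_Q and the synthetic projection matrix are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def build_Q(points_3d, points_2d):
    """Stack two rows per 3D-2D correspondence into the (2n x 12) matrix Q."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        P = np.array([X, Y, Z, 1.0])                 # homogeneous world point
        zero = np.zeros(4)
        rows.append(np.hstack([P, zero, -u * P]))    # from u * (m3^T P) = m1^T P
        rows.append(np.hstack([zero, P, -v * P]))    # from v * (m3^T P) = m2^T P
    return np.array(rows)

# Synthetic check with an assumed 3 x 4 projection matrix.
M_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 5.0]])
pts3d = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [1, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=float)
proj = (M_true @ np.hstack([pts3d, np.ones((len(pts3d), 1))]).T).T
pts2d = proj[:, :2] / proj[:, 2:3]                   # divide by lambda = m3^T P

Q = build_Q(pts3d, pts2d)
residual = Q @ M_true.reshape(-1)                    # M stacked row by row (m1, m2, m3)
```

For noise-free correspondences the residual vanishes, which is exactly the statement QM = 0.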

Does the scale of M initially matter for Tsai’s method?

no -> we have the homogeneous linear system QM = 0

=> the actual scale can be recovered later from the last element of K (which is part of the decomposition of M), since we know that this last element must be 1…

To solve QM = 0 (Tsai), which rank must Q have?

rank 11 -> unique non-zero solution up to scale

M has 12 entries, but its overall scale is free, so only 11 degrees of freedom remain

with rank 12, only the trivial solution M = 0 would satisfy QM = 0 exactly (the scale is recovered later…)

How many point correspondences are needed for Tsai’s method?

each 3D-2D correspondence provides 2 independent equations

-> 11 equations needed

-> 5.5, i.e. 6 point correspondences in practice

How can we solve QM = 0 when we have an overdetermined system? (n >= 6 points)

Least-squares solution

-> minimizing the sum of the squared residuals ||QM||^2

subject to the constraint ||M||^2 = 1 (the reason is explained later)

=> can be solved using SVD

Why do we need the constraint ||M||^2 = 1 in Tsai’s method?

Constraint introduced to avoid the trivial zero solution M = 0…!

How can we apply SVD to solve QM = 0 in Tsai’s method?

Q is known

M is unknown

we want to minimize the residuals

min ||QM||_2^2

as we want to find M

arg min_M ||QM||_2^2

subject to ||M||_2 = 1

apply the SVD to Q

Q = U Σ V^T

optimal solution M* is the column of V corresponding to the smallest singular value (the singular values are the diagonal entries of Σ)
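
The SVD step can be sketched as below, assuming NumPy. Note that np.linalg.svd returns V^T (not V) with singular values in descending order, so the sought column of V is the last row of the returned matrix. The helper name solve_homogeneous and the synthetic test matrix are assumptions for illustration.

```python
import numpy as np

def solve_homogeneous(Q):
    """Return the unit vector M minimizing ||Q M||_2, plus the singular values."""
    _, singular_values, Vt = np.linalg.svd(Q)
    # Last row of V^T = column of V belonging to the smallest singular value.
    return Vt[-1], singular_values

# Build a 14 x 12 matrix with a known one-dimensional null space.
rng = np.random.default_rng(0)
m_true = rng.normal(size=12)
m_true /= np.linalg.norm(m_true)
A = rng.normal(size=(14, 12))
Q = A - np.outer(A @ m_true, m_true)     # guarantees Q @ m_true == 0

m_est, s = solve_homogeneous(Q)          # m_est equals m_true up to sign
```

The result automatically satisfies ||M||_2 = 1 because the columns of V are unit vectors; its sign remains ambiguous.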

When we found M, how can we recover the extrinsic and intrinsic parameters?

M = K(R|T)

M now known, K(R|T) still unknown

use QR factorization (as it inherently provides the orthogonality required of our rotation matrix, which lies in SO(3))

-> QR (or rather RQ) decomposes the left 3x3 block of M (= KR) into an upper triangular matrix (i.e. K) and an orthogonal matrix R

=> only works for square matrices, hence the 3x3 block; T then follows from the last column of M (= KT)…
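
A possible sketch of this decomposition, assuming NumPy only: the RQ factorization is obtained from np.linalg.qr with a row/column flip, the signs are fixed so that K has a positive diagonal, and the unknown scale of M (including a possible negative sign) is removed using K[2,2] = 1 and det(R) = +1. The function names rq3 and decompose_projection and all numbers are illustrative assumptions.

```python
import numpy as np

def rq3(A):
    """RQ factorization of a 3x3 matrix: A = upper @ ortho."""
    flip = np.flipud(np.eye(3))                  # anti-identity, its own inverse
    q, r = np.linalg.qr((flip @ A).T)
    return flip @ r.T @ flip, flip @ q.T

def decompose_projection(M):
    """Split M ~ K [R | T] (known only up to scale) into K, R, T."""
    upper, ortho = rq3(M[:, :3])
    D = np.diag(np.sign(np.diag(upper)))         # D @ D = I, product unchanged
    upper, ortho = upper @ D, D @ ortho          # diagonal of K becomes positive
    scale = upper[2, 2]                          # last element of K must be 1
    K = upper / scale
    sign = 1.0
    if np.linalg.det(ortho) < 0:                 # M carried a negative scale
        ortho, sign = -ortho, -1.0
    T = sign * np.linalg.solve(K, M[:, 3]) / scale
    return K, ortho, T

# Synthetic check: assumed intrinsics, rotation and translation.
a, b = 0.3, -0.2
Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
Rx = np.array([[1, 0, 0], [0, np.cos(b), -np.sin(b)], [0, np.sin(b), np.cos(b)]])
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 820.0, 240.0], [0.0, 0.0, 1.0]])
R_true = Rz @ Rx
T_true = np.array([0.1, -0.2, 2.0])
M = -2.5 * K_true @ np.column_stack([R_true, T_true])   # arbitrary scale

K_est, R_est, T_est = decompose_projection(M)
```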

What is the practical setup of Tsai’s method? What are advantages?

advantage:

requires only a single picture

corners can be detected with very high accuracy (currently < 0.1 pixels)

setup:

use many more than 6 points (ideally more than 20), which must be non-coplanar

What is coplanarity?

a set of points is called coplanar

if there exists a plane in 3D space on which all of these points lie

=> can fit a plane in 3D space through all points…
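
One way to test this numerically, as a sketch assuming NumPy: after subtracting the centroid, coplanar points span at most two directions, so the smallest singular value of the centered point matrix is (close to) zero. The function name are_coplanar and the tolerance are assumptions.

```python
import numpy as np

def are_coplanar(points, tol=1e-9):
    """True if all 3D points lie on one plane (centered points have rank <= 2)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # The smallest singular value measures the spread along the plane normal.
    return bool(singular_values[-1] <= tol * max(singular_values[0], 1.0))

board = [[x, y, 0.0] for x in range(3) for y in range(3)]             # chessboard-like grid
cube = [[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)]    # cube corners

board_is_planar = are_coplanar(board)
cube_is_planar = are_coplanar(cube)
```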

What is an alternative to Tsai’s method?

Zhang’s method

On what does Zhang’s method rely?

on 3D coplanar points

Compare Tsai’s and Zhang’s methods w.r.t. their requirements

Tsai’s:

single image

non-coplanar points (more complex target such as a cube)

Zhang’s:

multi-view images

coplanar points (e.g. chessboard)

What is the mathematical approach to Zhang’s method?

same as in Tsai’s method: neglect radial distortion

but due to all points being coplanar: Zw can be set to 0 (as only the X and Y axes are important…)

=> choose the world frame so that its X-Y plane coincides with the plane of the 3D points (Z axis perpendicular to it)

-> written more compactly, the part multiplied by 0 can simply be left out, which reduces the size of the matrix…

How to rewrite Zhang’s equations considering Zw = 0?

where, similar to m in Tsai’s method, h represents the combined intrinsic (calibration) and extrinsic (rotation and translation) matrices
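
Written out, the reduction looks as follows. This is a sketch with K for the intrinsic matrix and r_1, r_2, r_3 for the columns of R; these symbol names are assumed here because the original equation image is not available.

```latex
\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= K \, [\, r_1 \;\; r_2 \;\; r_3 \;\; T \,] \begin{pmatrix} X_w \\ Y_w \\ 0 \\ 1 \end{pmatrix}
= K \, [\, r_1 \;\; r_2 \;\; T \,] \begin{pmatrix} X_w \\ Y_w \\ 1 \end{pmatrix}
= \underbrace{\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}}_{H}
\begin{pmatrix} X_w \\ Y_w \\ 1 \end{pmatrix}
```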

How do we transform the matrix in Zhang’s method to be able to solve for H?

Here, H is called Homography

And keep in mind -> Zhang requires several images thus P corresponds to one of them

-> Then applied to the i-th image

How do we create the matrix in Zhang’s method for multiple points? (n points)

where Q on the left-hand side is known (built from the known points on the plane and their image coordinates) and H is unknown

How do we solve the linear system for Zhang’s method?

Again, search for minimal solution

-> Q (2n x 9) should have rank 8 to have a unique (up to scale) non-trivial solution

detailed explanation later when discussing homographies…

each point correspondence again provides 2 independent equations

=> thus a minimum of 4 non-collinear points is required

collinear: lying on a single line

the solution for n >= 4 points is found similarly to Tsai’s method using SVD (as we again search for the minimal solution…)
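
A minimal DLT sketch for the homography, assuming NumPy. It mirrors the 12-unknown case with two rows per point, but with 9 unknowns. The function name estimate_homography and the test homography are illustrative assumptions; the final normalization assumes H[2,2] is non-zero, and for real pixel data one would typically normalize the coordinates first.

```python
import numpy as np

def estimate_homography(plane_xy, image_uv):
    """Estimate H (3x3, up to scale) from n >= 4 plane-to-image correspondences."""
    rows = []
    for (X, Y), (u, v) in zip(plane_xy, image_uv):
        P = np.array([X, Y, 1.0])                    # plane point with Zw = 0 dropped
        zero = np.zeros(3)
        rows.append(np.hstack([P, zero, -u * P]))
        rows.append(np.hstack([zero, P, -v * P]))
    Q = np.array(rows)                               # shape (2n, 9)
    _, _, Vt = np.linalg.svd(Q)
    H = Vt[-1].reshape(3, 3)                         # smallest singular value
    return H / H[2, 2]                               # fix the free scale (assumes H[2,2] != 0)

# Synthetic check with an assumed homography.
H_true = np.array([[1.2, 0.1, 30.0],
                   [-0.05, 0.9, 40.0],
                   [1e-3, 2e-3, 1.0]])
plane = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.2]], dtype=float)
proj = (H_true @ np.hstack([plane, np.ones((len(plane), 1))]).T).T
image = proj[:, :2] / proj[:, 2:3]

H_est = estimate_homography(plane, image)
```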

In Zhang’s method, once we have solved for H, how can we decompose it into intrinsic and extrinsic matrices?

QR not possible

=> as R part is not orthogonal!

=> compared to Tsai’s method: one column (r3) is missing…

approach:

first focus on recovering intrinsic matrix

and then after fixing it, recover the extrinsic matrix with a regular linear system…

How can we recover only the intrinsic parameters?

intrinsic parameters stay the same for different camera perspectives (for same camera)

only extrinsic change

=> find intrinsic by using multiple images from different views…

How can we specifically recover intrinsic parameters?

each view j has different homography H^j

also different R^j (rotation) and T^j (translation) parameters

But K stays the same for all views… (intrinsic…)

(K is also denoted as M in this part)

=> in Zhang’s method, we were able to recover the individual homography for each image but not yet its decomposition

=> use regular DLT method

After getting the values of the homography in Zhang’s method, how can we start to recover the camera parameters?

determine intrinsic matrix M from a set of known homographies

express columns of rotation by unknown intrinsic parameters

then enforce constraints of the rotation matrix

What constraints w.r.t. M do we enforce in Zhang’s method?

Constraint 1:

again due to orthogonality of the rotation columns -> the dot product r1^T r2 must be 0…

Constraint 2:

rotation matrix is orthogonal -> the length of each column is 1, so r1 and r2 have the same length…

-> As the parts in red squares are the same, replace them with variable B
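
The two constraints can be written out as follows. This is a sketch that writes K for the intrinsic matrix (denoted M in these notes) and h_1, h_2 for the first two columns of a homography; the exact layout of the original figure is not available, so the notation is an assumption.

```latex
[\, h_1 \;\; h_2 \;\; h_3 \,] = \lambda \, K \, [\, r_1 \;\; r_2 \;\; T \,]
\quad\Longrightarrow\quad
r_1 = \tfrac{1}{\lambda} K^{-1} h_1, \qquad r_2 = \tfrac{1}{\lambda} K^{-1} h_2

r_1^T r_2 = 0
\quad\Longrightarrow\quad
h_1^T K^{-T} K^{-1} h_2 = 0

\| r_1 \| = \| r_2 \| = 1
\quad\Longrightarrow\quad
h_1^T K^{-T} K^{-1} h_1 = h_2^T K^{-T} K^{-1} h_2

B := K^{-T} K^{-1}
```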

How can we solve for intrinsic parameters after enforcing the constraints?

once we can estimate B -> we can recover M from the equation above

As B is symmetric, we are only interested in 6 of its elements (collected in the vector b)

h is known (obtained after DLT in previous step)

-> rewrite above equations to linear system

How can we re-write B to a linear system of equations?

We have two constraints for each homography

-> each homography contributes two linear equations

stack 2N equations from N views to yield linear system Ab=0

solve for b using SVD

=> in general at least 3 views are needed (b has 6 elements defined up to scale, and each view provides two constraints)
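
The stacking and solving can be sketched as below, assuming NumPy, the ordering b = (B11, B12, B22, B13, B23, B33), and a Cholesky factorization to read the intrinsic matrix off B = K^-T K^-1 (one of several possible recovery routes, not necessarily the one used in the lecture). All function names and the synthetic views are assumptions; the intrinsics are given in small normalized units to keep the toy system well conditioned.

```python
import numpy as np

def v_ij(H, i, j):
    """Vector with h_i^T B h_j = v_ij . b for b = (B11, B12, B22, B13, B23, B33)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[1] * hj[1],
                     hi[2] * hj[0] + hi[0] * hj[2],
                     hi[2] * hj[1] + hi[1] * hj[2],
                     hi[2] * hj[2]])

def intrinsics_from_homographies(homographies):
    """Estimate the intrinsic matrix from N >= 3 homographies via A b = 0."""
    A = []
    for H in homographies:
        A.append(v_ij(H, 0, 1))                      # constraint 1: h1^T B h2 = 0
        A.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))      # constraint 2: h1^T B h1 = h2^T B h2
    _, _, Vt = np.linalg.svd(np.array(A))
    b11, b12, b22, b13, b23, b33 = Vt[-1]
    B = np.array([[b11, b12, b13], [b12, b22, b23], [b13, b23, b33]])
    if B[0, 0] < 0:                                  # b is only known up to sign
        B = -B
    L = np.linalg.cholesky(B)                        # B = L L^T with L ~ K^-T
    K = np.linalg.inv(L).T
    return K / K[2, 2]                               # last element of K must be 1

def rotation(ax, ay, az):
    """Rotation from three assumed Euler angles (illustrative helper)."""
    cx, sx, cy, sy, cz, sz = (np.cos(ax), np.sin(ax), np.cos(ay),
                              np.sin(ay), np.cos(az), np.sin(az))
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

K_true = np.array([[1.2, 0.0, 0.5], [0.0, 1.1, 0.4], [0.0, 0.0, 1.0]])
T = np.array([0.1, -0.1, 4.0])
views = [(2.0, (0.3, -0.2, 0.1)), (-0.7, (-0.4, 0.25, 0.5)),
         (1.0, (0.1, 0.5, -0.3)), (1.5, (0.6, 0.1, 0.2))]
homographies = []
for scale, angles in views:                          # each H is known only up to scale
    R = rotation(*angles)
    homographies.append(scale * K_true @ np.column_stack([R[:, 0], R[:, 1], T]))

K_est = intrinsics_from_homographies(homographies)
```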

After obtaining the intrinsic parameters using Zhang’s method, how can we recover the extrinsic parameters?

we now know intrinsic and homography

thus, from previous equations, compute each column

and enforce constraints on scale

(each column, not only r1…)

-> after this, Ri = (r1, r2, r3) might not be strictly orthogonal (but should be)

=> last step: project the result from matrix space onto the SO(3) manifold to enforce orthogonality (not covered in more detail; only keep in mind that after the previous step an additional step is required to enforce strict orthogonality…)
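
A sketch of this step, assuming NumPy: the columns are computed from K^-1 H, the scale is fixed via ||r1|| = 1, r3 is taken as the cross product r1 x r2, and the projection onto SO(3) is done with an SVD (nearest orthogonal matrix). The sign choice assumes the target lies in front of the camera (T_z > 0); the function name and the synthetic numbers are illustrative.

```python
import numpy as np

def extrinsics_from_homography(H, K):
    """Recover R and T from a homography H ~ K [r1 r2 T] with known K."""
    A = np.linalg.solve(K, H)                    # K^-1 [h1 h2 h3]
    lam = 1.0 / np.linalg.norm(A[:, 0])          # scale so that ||r1|| = 1
    if A[2, 2] < 0:                              # assumption: target in front of camera
        lam = -lam
    r1, r2, T = lam * A[:, 0], lam * A[:, 1], lam * A[:, 2]
    R_approx = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt = np.linalg.svd(R_approx)           # project onto SO(3)
    return U @ Vt, T

# Synthetic check with assumed camera parameters.
a, b = 0.4, -0.3
Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
Ry = np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]])
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_true = Rz @ Ry
T_true = np.array([0.2, -0.1, 3.0])
H = -1.7 * K_true @ np.column_stack([R_true[:, 0], R_true[:, 1], T_true])

R_est, T_est = extrinsics_from_homography(H, K_true)
```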

Recap, what types of distortion are there?

radial distortion

light rays bend more near edges of lens than they do at its optical center

tangential distortion

if the lens is misaligned (not perfectly parallel to the image sensor), tangential distortion occurs

How to perform joint estimation?

given object points and image points (detected chessboard corners)

-> conduct Zhang’s method

compute initial intrinsic parameters

while initially setting distortion coefficients all to 0

estimate initial extrinsic parameters as if the intrinsic parameters were already known

run gradient descent to minimize reprojection error to jointly optimize/estimate intrinsic, extrinsic and distortion parameters

How is the reprojection error defined?

euclidean distance in pixels between

observed image point and corresponding 3D point reprojected onto the image

observed: where the point was actually detected in the image

reprojected: where it should appear according to the current parameters
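
A minimal sketch of the per-point reprojection error, assuming NumPy and a distortion-free pinhole model (lens distortion would be applied before multiplying by K). The function name and the numbers are illustrative assumptions.

```python
import numpy as np

def reprojection_errors(K, R, T, points_3d, observed_uv):
    """Euclidean pixel distance between observed and reprojected points."""
    cam = points_3d @ R.T + T                    # world -> camera frame
    proj = cam @ K.T                             # apply intrinsics
    reprojected = proj[:, :2] / proj[:, 2:3]     # divide by depth
    return np.linalg.norm(observed_uv - reprojected, axis=1)

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
T = np.array([0.0, 0.0, 5.0])
points_3d = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
observed = np.array([[323.0, 244.0], [480.0, 240.0]])

errors = reprojection_errors(K, R, T, points_3d, observed)   # -> [5.0, 0.0]
```

The first point reprojects to (320, 240), so an observation at (323, 244) is off by 5 pixels; the second observation matches its reprojection exactly.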

How is the overall cost/objective function defined, i.e. the training goal?

where left hand side of equation in sum is observed and right hand side reprojected

=> gradient descent
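
A common way to write such an objective is sketched below. The symbols are assumptions, since the original equation image is not available: p_{ij} is the observed image point of 3D point P_i in view j, \pi is the projection including distortion, and D collects the distortion coefficients.

```latex
\min_{K,\, D,\, \{R_j, T_j\}} \;
\sum_{j=1}^{N} \sum_{i=1}^{n}
\left\| \, p_{ij} - \pi\!\left(K, D, R_j, T_j, P_i\right) \right\|_2^2
```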

What is a general problem in gradient descent? How can it be solved?

might end in local minimum…

-> use Tsai’s / Zhang’s method to find good initial values

-> then apply gradient descent (here in right image…)
