What different types of semantic segmentation did we talk about?
find what the objects are
differentiate the different objects from each other
instance segmentation that also considers the background
What are the general steps of the image fitting pipeline?
model selection based on type of data
-> for data fitting
-> e.g. if only rotation is needed
-> SO(3) is sufficient
-> if translation is also needed
-> requires SE(3)
parameter estimation by
minimizing the distances
-> done by computing sum of distances
=> find optimal rotation, translation (parameters) to minimize sum of correspondence distances (objective function)
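The parameter estimation step can be sketched as a least-squares rigid fit; a minimal sketch in numpy, assuming known point correspondences and using the SVD-based (Kabsch) closed-form solution (the function name `fit_rigid` is made up):

```python
import numpy as np

def fit_rigid(X, Y):
    """Estimate rotation R and translation t minimizing
    sum_i ||R @ X[i] + t - Y[i]||^2 (Kabsch algorithm)."""
    cx, cy = X.mean(axis=0), Y.mean(axis=0)
    H = (X - cx).T @ (Y - cy)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cy - R @ cx
    return R, t

# toy check: recover a known rotation and translation exactly
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
t_true = np.array([1.0, 2.0, 3.0])
Y = X @ R_true.T + t_true
R, t = fit_rigid(X, Y)
```

With noise-free correspondences the recovered (R, t) match the true parameters; with noisy data the same formula still minimizes the sum of squared correspondence distances.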
What is the response function in cameras? What is a problem with it?
maps log-exposure value (scene radiance)
to intensity levels in the input images
=> basically transforms the light signal to an electric signal…
=> is not linear, thus needs calibration
How does a pinhole camera work?
has converging lens
all rays parallel to optical axis converge at focal point
flips the image
How do we map 3D coordinates to image coordinates in pinhole cameras?
image coordinates: x
3D coordinates: X
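A minimal sketch of this mapping, assuming the standard pinhole relation x = f·X/Z, y = f·Y/Z (function name and values are illustrative):

```python
import numpy as np

def project_pinhole(X, f=1.0):
    """Central projection of a camera-frame 3D point X=(X,Y,Z)
    onto the image plane at distance f: x = f*X/Z, y = f*Y/Z."""
    X = np.asarray(X, float)
    return f * X[:2] / X[2]

near = project_pinhole([2.0, 1.0, 4.0], f=2.0)
far = project_pinhole([2.0, 1.0, 8.0], f=2.0)  # same point, twice as far
```

Doubling the depth Z halves the image coordinates, which is exactly the perspective effect described in the next card.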
What are the perspective effects of pinhole cameras?
far away objects appear smaller
intersection of parallel lines in 2D
What is meant by intersection of parallel lines in 2D?
parallel lines intersect at the “vanishing point” in the image
can fall both inside or outside of the image
the line connecting two horizontal vanishing points is the horizon
=> i.e. like in geometry at HfT -> points at infinity when the lines are actually parallel
-> or lines that are parallel in reality but no longer parallel due to perspective…
-> blue line is horizon / vanishing line
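In homogeneous 2D coordinates the vanishing point is just the cross product of the two image lines; a small sketch with two made-up lines (each written as a·u + b·v + c = 0):

```python
import numpy as np

# images of two 3D-parallel lines (homogeneous line vectors, made up)
l1 = np.array([1.0, -1.0, 0.0])   # v = u
l2 = np.array([2.0, -1.0, -1.0])  # v = 2u - 1
v = np.cross(l1, l2)              # homogeneous intersection = vanishing point
v = v / v[2]                      # dehomogenize
```

Here the two lines meet at (1, 1); the same cross-product trick also gives the horizon as the line through two such vanishing points.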
What is meant by vanishing directions?
vanishing direction is defined by connecting vanishing point and camera center
-> is parallel to a 3D dominant direction (which yields the vanishing point…)
-> more on that in later lectures…
What is the FOV in our pinhole camera?
angular portion of 3D scene seen by camera
=> the FOV covers the dimensions of the scene around the target object
-> the further we are away, the smaller the target appears within the FOV…
What is the FOV proportional to?
inversely proportional to focal length
-> large FOV -> use short focal length
-> small FOV -> use long focal length
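This relation can be sketched with the usual formula FOV = 2·atan(d / 2f) for a sensor dimension d; the numbers below are illustrative:

```python
import math

def fov_deg(sensor_size_mm, focal_length_mm):
    """Angular field of view: FOV = 2 * atan(d / (2 f)).
    Shorter focal length -> wider FOV."""
    return math.degrees(2 * math.atan(sensor_size_mm / (2 * focal_length_mm)))

wide = fov_deg(36.0, 18.0)   # short focal length
tele = fov_deg(36.0, 200.0)  # long focal length
```

A 36 mm sensor with an 18 mm lens gives a 90° field of view, while a 200 mm lens narrows it to roughly 10°.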
What notation do we have in perspective projection?
C: optical center
-> center of lens, i.e. center of projection
Xc, Yc, Zc: axes of camera frame
Zc: optical axis (principal axis)
O: principal point
-> i.e. intersection of optical axis and image plane
How do perspective projection and parallel projection differ?
perspective projection:
-> size varies inversely with distance -> looks realistic
-> parallel lines do not (in general) remain parallel
parallel projection:
-> good for exact measurements
-> parallel lines remain parallel
-> less realistic looking
How are points from camera frame (stuff in front of the lens) projected onto the image plane?
based on similar triangles
!! we use front image plane!!!
In perspective projection, how do we convert image coordinates to pixel coordinates?
ku and kv are conversion factors (between mm and pixels)
u,v are pixel coordinates
here, we can express the focal length
in mm -> ku*f; kv*f
in pixels -> au, av (replacing ku*f…)
What is the final model to transform 3D camera coordinates to 2D image coordinates?
use formulas we derived so far (camera plane -> image plane -> pixel)
where focal lengths au, av (in pixels…)
and principal point u0, v0
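Putting these pieces together, a sketch of the intrinsic matrix with focal lengths au, av in pixels and principal point (u0, v0); the helper names and numeric values are made up:

```python
import numpy as np

def make_K(au, av, u0, v0, skew=0.0):
    """Intrinsic matrix with focal lengths au, av in pixels."""
    return np.array([[au, skew, u0],
                     [0.0, av,  v0],
                     [0.0, 0.0, 1.0]])

def camera_to_pixel(K, Xc):
    """3D camera-frame point -> pixel coordinates (u, v)."""
    p = K @ np.asarray(Xc, float)   # homogeneous pixel coordinates
    return p[:2] / p[2]

K = make_K(au=800.0, av=800.0, u0=320.0, v0=240.0)
uv = camera_to_pixel(K, [0.1, -0.05, 1.0])
```

A point on the optical axis (Xc = Yc = 0) maps exactly to the principal point (u0, v0), as expected.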
What is the skew factor?
principal point (center of camera) does not completely align with center of image frame
=> due to manufacturing process…
=> nowadays very good manufacturing -> skew factor assumed to be 0…
=> au = av (i.e. square pixels…)
What are the general steps of projecting a point from world frame to pixel coordinates?
world frame -> camera frame (take extrinsic parameters into consideration)
camera frame -> pixel
What are the important coordinate systems we make use of in perspective projection?
Where do intrinsic, where extrinsic parameters matter?
camera frame -> image frame (intrinsic parameters)
world frame -> camera frame (extrinsic parameters)
-> corresponds to rotation and translation…
What are the extrinsic parameters for mapping world to camera frame?
[R|T] -> rotation and translation
=> called extrinsic parameters
How can we (using our knowledge so far) create a condensed formula for mapping world to pixel frame?
-> combine intrinsic and extrinsic parameters
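The condensed mapping P = K·[R | T] can be sketched directly; intrinsics and extrinsics below are made-up example values:

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])          # intrinsic parameters
R = np.eye(3)                                # extrinsic rotation
t = np.array([[0.0], [0.0], [2.0]])          # extrinsic translation
P = K @ np.hstack([R, t])                    # 3x4 projection matrix

Xw = np.array([0.5, 0.25, 2.0, 1.0])         # world point, homogeneous
p = P @ Xw                                   # homogeneous pixel coordinates
uv = p[:2] / p[2]
```

One matrix-vector product thus replaces the two-stage world -> camera -> pixel computation.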
What options for line projection did we discuss?
two-step computation method
one-step computation method
How does two-step line projection work?
independently project the two endpoints of the line (homogeneous)
line is then the cross product of these homogeneous endpoints
-> line is also in homogeneous representation… (as it is a 3D vector…)
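The two-step method in homogeneous coordinates (endpoint values are made up):

```python
import numpy as np

# two projected endpoints in homogeneous image coordinates
p1 = np.array([1.0, 2.0, 1.0])
p2 = np.array([3.0, 4.0, 1.0])
l = np.cross(p1, p2)   # homogeneous line through both endpoints
```

Both endpoints satisfy the incidence relation l·p = 0, confirming that they lie on the resulting line.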
How is a plane defined in homogeneous coordinates?
4D vector (A,B,C,D)
where n=[A,B,C] is the unit normal vector
D is the distance from the origin (projection from origin onto normal)
take two arbitrary points on plane to calculate the vector between them
dot product with normal must be 0 (as it is orthogonal…)
Use this to compute D
When does a point in homogeneous coordinates lie on a plane?
point P (X,Y,Z,1)
-> P^T (A,B,C,D) = 0 (dot product of point and plane vector)
=> lies on plane
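Both ideas in a small sketch (helper names are made up; sign convention here is n·x + D = 0, i.e. D = -n·p, so |D| is the distance from the origin):

```python
import numpy as np

def plane_from_normal_point(n, p):
    """Plane (A,B,C,D) with unit normal n through point p
    (sign convention: n.x + D = 0, so D = -n.p)."""
    n = np.asarray(n, float)
    n = n / np.linalg.norm(n)
    return np.append(n, -n @ np.asarray(p, float))

def on_plane(plane, X):
    """Homogeneous point X=(X,Y,Z,1) lies on plane iff plane . X = 0."""
    return bool(np.isclose(plane @ np.asarray(X, float), 0.0))

pi = plane_from_normal_point([0, 0, 1], [0, 0, 5])  # the plane z = 5
```

For example, (1, 2, 5, 1) lies on this plane while the origin does not.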
How to compute a projection plane from an image line?
camera center and line on image plane span projection plane…
P: projection matrix (i.e. combined intrinsic and extrinsic matrix -> but here transposed!)
I: line on image plane (result of 1-step or 2-step method…)
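A sketch of the back-projection pi = P^T·l, with a made-up projection matrix K[I|0]:

```python
import numpy as np

P = np.array([[800.0, 0.0, 320.0, 0.0],
              [0.0, 800.0, 240.0, 0.0],
              [0.0,   0.0,   1.0, 0.0]])   # projection matrix K[I|0]
l = np.array([0.0, 1.0, -240.0])           # image line v = 240
pi = P.T @ l                               # back-projected plane (A,B,C,D)

# any 3D point projecting onto that line lies on this plane
X = np.array([1.0, 0.0, 2.0, 1.0])         # projects to v = 240
```

Here the image row v = 240 (through the principal point) back-projects to the plane Y = 0 of the camera frame, which passes through the camera center as expected.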
How to compute the intersection between a 3D line and a 3D plane?
L: Plücker matrix of the line
=> derived from the Plücker coordinates
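A sketch of the intersection X = L·pi, where L = A·B^T - B·A^T is the Plücker matrix of the line through homogeneous points A and B (example values made up):

```python
import numpy as np

def plucker_matrix(A, B):
    """4x4 Pluecker matrix L = A B^T - B A^T of the 3D line
    through homogeneous points A and B."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    return np.outer(A, B) - np.outer(B, A)

# line through (0,0,0) and (1,1,1); plane z = 2 as (0,0,1,-2)
L = plucker_matrix([0, 0, 0, 1], [1, 1, 1, 1])
pi = np.array([0.0, 0.0, 1.0, -2.0])
X = L @ pi              # homogeneous intersection point
X = X / X[3]
```

The result (2, 2, 2) indeed lies both on the line through the origin and (1, 1, 1) and on the plane z = 2.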
Again, what is the principal point?
intersection of the optical axis with the image plane… (center (if no intrinsic deviation) of the image plane…)
What is a normalized image?
virtual image plane
focal length equal to 1
origin of pixel coordinates at principal point
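Normalizing a pixel just applies K^-1; a sketch with a made-up intrinsic matrix:

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])

def normalize(K, uv):
    """Map pixel coordinates to the normalized image plane
    (focal length 1, principal point at the origin) via K^-1."""
    p = np.append(np.asarray(uv, float), 1.0)
    n = np.linalg.inv(K) @ p
    return n[:2] / n[2]

xy = normalize(K, [400.0, 240.0])
```

After normalization the 2D point can be treated directly as a ray direction in the camera frame, which is the motivation given in the next card.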
Why to use normalized images?
in certain cases, want to do things in 3D
-> treat/extend 2D coordinates of image plane in 3D (w.r.t. camera center etc.)
=> normalized images improve ease of computation
What is meant by parallelism of ray direction?
it is a geometric constraint that
-> image frame points are parallel to camera frame points
=> lie on same line from camera center
=> only differ in scale
What is meant by parallelism of normals of projection plane?
describes a geometric constraint of lines
-> normal of a line in the image frame and in the camera frame only differ by scale
=> normals are parallel
How can we alternatively express line constraints (textually)?
3D line projection (image to camera frame) is orthogonal to the normal of the projection plane
direction defined by a 3D point lying on the 3D line and the origin is orthogonal to the normal of the projection plane
line from camera center to point on line …
Using these constraints, how can we derive rotation and translation?
given: (camera frame) 3D point o, 3D lines ex, ey
given: projection of them onto the image frames of two cameras (p, dx, dy)
-> use constraints to calculate rotation and translation matrix between them
calculate normals nx, ny between p and dx, dy
ex, ey are scaled parallel to the cross products of both normals with the translation
using this, we can calculate the rotation and translation
What is one large difference between planar and spherical projection?
spherical has a larger FOV than planar projection
-> spherical: image plane is a sphere…
How to project 3D points to spherical coordinates?
normalize 3D point to map onto sphere
convert to spherical coordinates
“up / down” using polar angle (0-pi)
“left/right” using azimuth (0-2pi)
r needed if we do not have a unit sphere (to express distance from center if not 1…)
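A sketch of this conversion, assuming the polar angle theta is measured from the +z axis and the azimuth phi lies in [0, 2*pi):

```python
import numpy as np

def to_spherical(X):
    """3D point -> (r, polar angle theta in [0, pi],
    azimuth phi in [0, 2*pi))."""
    X = np.asarray(X, float)
    r = np.linalg.norm(X)
    theta = np.arccos(X[2] / r)                  # "up/down"
    phi = np.arctan2(X[1], X[0]) % (2 * np.pi)   # "left/right"
    return r, theta, phi

r, theta, phi = to_spherical([0.0, 1.0, 0.0])
```

For a point already on the unit sphere r = 1, so only (theta, phi) are needed, which is what makes the spherical image a 2D representation.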
What is another way to express spherical images?
use cross-shaped expansion
using a cube-based representation
map sphere to different surfaces of the cube…
What property is seen when mapping two parallel lines onto a sphere?
lines map to great circles around the sphere
when mapping two lines onto the sphere
the normals of the projection planes of these lines
lie on the same great circle if the 3D lines are parallel
What is a big limitation of spherical projection?
spatial distortion due to equirectangular representation!!!
How to solve distortion problem of spheres?
use icosahedral representation
=> map to icosahedron and define triangles
map the sphere to the triangle-shaped faces
expand into a triangle image