What different types of segmentation did we talk about?

semantic segmentation

find what the objects are (assign a class label to every pixel)

instance segmentation

differentiate the different objects from each other

panoptic segmentation

instance segmentation that also considers the background

What are the general steps of the image fitting pipeline?

model selection based on type of data

-> for data fitting

-> e.g. if only rotation is needed

-> SO(3) is good

-> if translation is also needed

-> requires SE(3) (rigid motion)

-> if scaling is needed on top of that, a similarity transform Sim(3) is required

parameter estimation by

minimizing the distances

between correspondences

-> done by computing sum of distances

=> find optimal rotation, translation (parameters) to minimize sum of correspondence distances (objective function)
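The objective above can be sketched in a few lines of numpy (the point values and names are made-up for illustration): the sum of squared distances between transformed source points and their correspondences, which is 0 exactly at the true rotation and translation.

```python
import numpy as np

def alignment_objective(R, t, src, dst):
    """Sum of squared distances between transformed source points
    and their corresponding target points (the quantity to minimize)."""
    residuals = (R @ src.T).T + t - dst
    return float(np.sum(residuals ** 2))

# Toy correspondences: dst is src rotated 90 deg about z and shifted.
src = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
dst = (R_true @ src.T).T + t_true

# At the true parameters the objective is 0; elsewhere it is larger.
print(alignment_objective(R_true, t_true, src, dst))          # 0.0
print(alignment_objective(np.eye(3), np.zeros(3), src, dst))  # > 0
```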

What is the response function in cameras? What is a problem with it?

maps log-exposure value (scene radiance)

to intensity levels in the input images

=> basically transforms the light signal into an electric signal…

=> is not linear, thus needs calibration

How does a pinhole camera work?

has a converging lens

all rays parallel to optical axis converge at focal point

flips the image

How do we map 3D coordinates to image coordinates in pinhole cameras?

image coordinates: x = (x, y)

3D coordinates: X = (X, Y, Z) (camera frame)

-> x = f·X/Z; y = f·Y/Z (divide by depth Z, scale by focal length f)
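A minimal sketch of this perspective projection x = f·X/Z, y = f·Y/Z (function name and values are illustrative):

```python
def project_pinhole(X, Y, Z, f):
    """Perspective projection of a camera-frame 3D point onto the
    image plane of a pinhole camera with focal length f."""
    assert Z > 0, "point must be in front of the camera"
    return f * X / Z, f * Y / Z

x, y = project_pinhole(2.0, 1.0, 4.0, 2.0)
print(x, y)  # 1.0 0.5
```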

What are the perspective effects of pinhole cameras?

far away objects appear smaller

intersection of parallel lines in 2D

vanishing directions

What is meant by far away objects appear smaller?

size is inversely proportional to distance

What is meant by intersection of parallel lines in 2D?

parallel lines intersect at a “vanishing point” in the image

can fall both inside or outside of the image

the line connecting two horizontal vanishing points is the horizon

=> i.e. as in geometry at HfT -> points at infinity when the lines are actually parallel

-> or parallel in reality but no longer parallel due to perspective…

-> the blue line (in the lecture figure) is the horizon / vanishing line

What is meant by vanishing directions?

vanishing direction is defined by connecting vanishing point and camera center

-> is parallel to a 3D dominant direction (which yields the vanishing point…)

-> more on that in later lectures…

What is the FOV in our pinhole camera?

angular portion of 3D scene seen by camera

=> the FOV determines the dimensions of the scene / target object that are captured

-> the further away the target, the smaller the angle it subtends…

What is the FOV proportional to?

inversely proportional to focal length

-> large FOV -> use short focal length

-> small FOV -> use long focal length

What is the mathematical relation of FOV θ, image width W and focal length f?

tan(θ/2) = W / (2f) => θ = 2·arctan(W / (2f))
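A small sketch of the relation tan(θ/2) = W / (2f), with made-up example values (a 36 mm wide sensor and an 18 mm focal length):

```python
import math

def fov_deg(W, f):
    """Field of view in degrees from image width W and focal length f
    (same units), via tan(theta / 2) = W / (2 f)."""
    return math.degrees(2.0 * math.atan(W / (2.0 * f)))

print(fov_deg(36.0, 18.0))  # 90.0 (half-width equals focal length)
```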

How are points represented using homogeneous coordinates?

homogeneous:

(a,b,c)

cartesian:

(a/c, b/c)
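The conversion above as a one-liner sketch (function name is illustrative):

```python
def hom_to_cart(p):
    """Convert a homogeneous 2D point (a, b, c) to Cartesian (a/c, b/c)."""
    a, b, c = p
    assert c != 0, "points at infinity have no Cartesian representation"
    return (a / c, b / c)

print(hom_to_cart((4.0, 6.0, 2.0)))  # (2.0, 3.0)
```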

What notation do we have in perspective projection?

C: optical center

-> center of lens, i.e. center of projection

Xc, Yc, Zc: axes of camera frame

Zc: optical axis (principal axis)

O: principal point

-> i.e. intersection of optical axis and image plane

How do perspective projection and parallel projection differ?

perspective:

size varies inversely with distance -> looks realistic

parallel lines do not (in general) remain parallel

parallel:

good for exact measurements

parallel lines remain parallel

less realistic looking

How are points from camera frame (stuff in front of the lens) projected onto the image plane?

based on similar triangles

!! we use the front image plane (so the image is not flipped) !!

In perspective projection, how do we convert image coordinates to pixel coordinates?

ku and kv are conversion factors (between mm and pixels)

u, v are pixel coordinates

here, we can express the focal length

in mm -> ku·f; kv·f

in pixels -> αu, αv (replacing ku·f…)

What is the final model to transform 3D camera coordinates to 2D image coordinates?

use formulas we derived so far (camera plane -> image plane -> pixel)

where focal lengths αu, αv (in pixels…)

and principal point (u0, v0)

What is the skew factor?

accounts for the pixel axes of the sensor not being perfectly perpendicular (non-rectangular pixels)

=> due to the manufacturing process…

=> nowadays very good manufacturing -> skew factor assumed to be 0…

=> often also αu = αv (i.e. square pixels…)

What are the general steps of projecting a point from world frame to pixel coordinates?

world frame -> camera frame (take extrinsic parameters into consideration)

camera frame -> pixel

What are the important coordinate systems we make use of in perspective projection?

camera frame

image coordinates

pixel coordinates

world frame

Where do intrinsic, where extrinsic parameters matter?

intrinsic:

camera frame -> image frame

extrinsic:

world frame -> camera frame

-> corresponds to rotation and translation…

What are the extrinsic parameters for mapping world to camera frame?

[R|T] -> rotation and translation

=> called extrinsic parameters

How are extrinsic parameters applied?

Xc = R·Xw + T (rotate the world point into the camera orientation, then translate)

How can we (using our knowledge so far) create a condensed formula for mapping world to pixel frame?

-> combine intrinsic and extrinsic parameters: x ≅ K [R|T] Xw
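A numpy sketch of the combined mapping world frame -> pixel frame; the values for K, R and t are made-up example numbers, and the world frame is chosen to coincide with the camera frame for simplicity:

```python
import numpy as np

# Assumed intrinsics: focal lengths (pixels) and principal point (u0, v0).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                 # world frame == camera frame orientation
t = np.zeros((3, 1))          # no translation

P = K @ np.hstack([R, t])     # 3x4 projection matrix, P = K [R|T]

Xw = np.array([1.0, 0.5, 4.0, 1.0])  # homogeneous world point
x = P @ Xw
u, v = x[0] / x[2], x[1] / x[2]      # homogeneous -> pixel coordinates
print(u, v)  # 520.0 340.0
```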

What options for line projection did we discuss?

two-step computation method

one-step computation method

How does two-step line projection work?

independently project the two endpoints of the line (homogeneous)

line is then the cross product of these homogeneous endpoints

-> line is also in homogeneous representation… (as it is a 3D vector…)
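The two-step method in numpy (the two endpoints are made-up example points): the cross product of two homogeneous points gives the homogeneous line, and any point on that line has a zero dot product with it.

```python
import numpy as np

p1 = np.array([0.0, 0.0, 1.0])   # homogeneous image point (0, 0)
p2 = np.array([1.0, 1.0, 1.0])   # homogeneous image point (1, 1)

line = np.cross(p1, p2)          # homogeneous line through both points

# A point lies on the line iff its dot product with the line is 0.
print(np.dot(line, p1), np.dot(line, p2))  # 0.0 0.0
mid = np.array([0.5, 0.5, 1.0])  # a third point on the same line
print(np.dot(line, mid))         # 0.0
```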

How is a plane defined in homogenous coordinates?

4D vector (A,B,C,D)

where n = [A, B, C] is the unit normal vector

D is the distance from the origin (projection from origin onto normal)

take two arbitrary points on plane to calculate the vector between them

dot product with normal must be 0 (as it is orthogonal…)

Use this to compute D
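A sketch of this computation (the plane and points are made-up examples): take a known point on the plane to compute D, then check that another point on the plane satisfies the homogeneous plane equation.

```python
import numpy as np

n = np.array([0.0, 0.0, 1.0])    # unit normal of the plane z = 2
X0 = np.array([3.0, -1.0, 2.0])  # a point known to lie on the plane

D = -np.dot(n, X0)               # plane equation: n . X + D = 0
plane = np.array([*n, D])        # homogeneous plane (A, B, C, D)

# Any other point on the plane satisfies the equation as well.
X1 = np.array([5.0, 7.0, 2.0, 1.0])
print(np.dot(plane, X1))  # 0.0
```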

When does a point in homogeneous coordinates lie on a plane?

point P = (X, Y, Z, 1), plane π = (A, B, C, D)

-> π^T P = 0

=> lies on the plane

How to compute a projection plane by image line?

camera center and line on the image plane span the projection plane: π = P^T l

P: projection matrix (i.e. combined intrinsic and extrinsic matrix -> but here transposed!)

l: line on the image plane (result of the 1-step or 2-step method…)

How to compute the intersection between a 3D line and a 3D plane?

D: intersection point, D = L·π

L: Plücker matrix of the line

=> derived from the Plücker coordinates

π: the plane

Again, what is the principal point?

intersection of the optical axis with the image plane… (center (if no intrinsic deviation) of the image plane…)

What is a normalized image?

virtual image plane

focal length equal to 1

origin of pixel coordinates at the principal point

Why to use normalized images?

in certain cases, we want to do things in 3D

-> treat/extend 2D coordinates of image plane in 3D (w.r.t. camera center etc.)

=> normalized images improve ease of computation

How does one compute normalized coordinates?

invert K and multiply with the original coordinates: x_n = K^{-1}·x
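A sketch of the normalization x_n = K^{-1}·x (the intrinsic values and pixel point are made-up examples):

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

pixel = np.array([520.0, 340.0, 1.0])   # homogeneous pixel coordinates
normalized = np.linalg.inv(K) @ pixel   # f = 1, origin at principal point
print(normalized)  # [0.25  0.125 1.  ]
```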

What is meant by parallelism of ray direction?

it is a geometric constraint that

-> image frame points are parallel to camera frame points

=> lie on same line from camera center

=> only differ in scale

What is meant by parallelism of normals of projection plane?

describes a geometric constraint of lines

-> the normals of the projection plane computed from the image-frame line and from the camera-frame line differ only by scale

=> normals are parallel

How can we alternatively express line constraints (textually…)?

3D line projection (image to camera frame) is orthogonal to the normal of projection plane

direction defined by a 3D point lying on the 3D line and the origin is orthogonal to the normal of the projection plane

line from camera center to point on line …

Using these constraints, explain how we can derive rotation and translation?

given: (camera frame) 3D point o, 3D lines ex, ey

given: projection of them onto the image frames of two cameras (p, dx, dy)

-> use the constraints to calculate the rotation and translation matrix between them

calculate the normals nx, ny between p and dx, dy

ex, ey are scaled parallel to the cross products of the normals with the translation

using this, we can calculate R and T

What is one large difference between planar and spherical projection?

spherical has a larger FOV than planar projection

-> spherical: image plane is a sphere…

How to create a spherical projection?

use omnicamera (360° camera…)

How to project 3D points to spherical coordinates?

normalize 3D point to map onto sphere

convert to spherical coordinates

using angles…

“up / down” using polar angle (0-pi)

“left/right” using azimuth (0-2pi)

r needed if we do not have a unit sphere (to express the distance from the center if it is not 1…)
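A sketch of the conversion described above (function name and example point are illustrative): polar angle measured from the z-axis, azimuth measured in the x-y plane.

```python
import math

def to_spherical(X, Y, Z):
    """Map a 3D point to (r, polar, azimuth): r is the distance from the
    center, polar in [0, pi] measured from the z-axis ("up/down"),
    azimuth in [0, 2*pi) in the x-y plane ("left/right")."""
    r = math.sqrt(X * X + Y * Y + Z * Z)
    polar = math.acos(Z / r)
    azimuth = math.atan2(Y, X) % (2.0 * math.pi)
    return r, polar, azimuth

r, polar, azimuth = to_spherical(1.0, 1.0, 0.0)
print(r, polar, azimuth)  # sqrt(2), pi/2, pi/4
```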

How to map spherical representation to planar?

s defines the size of the final image

How is (in engineering and physics) the polar angle expressed (contrary to mathematics…)?

θ denotes the polar angle (measured from the z-axis) and φ the azimuth; mathematics texts often swap the two symbols

What is another way to express spherical images?

use cross-shaped expansion

using a cube-based representation

map the sphere to the different surfaces of the cube…

What property is seen when mapping two parallel lines onto a sphere?

they create great circles around the sphere

when mapping two lines onto the sphere

the normals to the projection planes of these lines

lie on the same great circle if the 3D lines are parallel

What is a big limitation of spherical projection?

spatial distortion due to equi-rectangular representation!!!

How to solve distortion problem of spheres?

use icosahedral representation

=> map to icosahedron and define triangles

map the sphere onto the triangle faces

unfold into a triangle image

What is the intrinsic matrix for line projection?

the inverse transpose of the point intrinsic matrix: a line maps as l' ≅ K^{-T}·l (points map with K, lines with K^{-T})
