02. Motion and Scene representation

Computer Vision 2

by Jensen J.

What is the world frame?

the global frame / coordinate system we use
-> allows us to give absolute positions for individual cameras / local coordinate systems,
i.e. their pose expressed in the global frame

In what are we interested when we talk of relative positions?

only in the positions of two or more cameras relative to each other
not in their position in a global context

When performing a transformation, what is fixed and what is actually transformed?

we transform local frames
-> change the coordinate systems
but usually fix the points (static)

How do the coordinate systems of two camera frames relate (w.r.t. a point p in the world frame)?

basis * coefficients -> point in space
meaning: the point stays the same, but on the right-hand side it is expressed with a new basis (and thus also new coefficients)

What do rotation matrices belong to and what properties do they have?

special orthogonal group
a = R a’
RR^T = I (rotation matrix * transposed rotation matrix = identity matrix)
det(R) = 1
a’ = R^-1 a = R^T a
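
A minimal numeric check of the properties above, assuming a rotation about the z axis as the example. The helper names (rot_z, matmul, det3, ...) are illustrative and not taken from the notes.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

R = rot_z(math.radians(30))
RRt = matmul(R, transpose(R))      # should be the identity
a_prime = [1.0, 2.0, 3.0]
a = matvec(R, a_prime)             # a = R a'
back = matvec(transpose(R), a)     # a' = R^T a, no matrix inversion needed
```

Up to floating-point rounding, RRt is the identity, det3(R) is 1 and back equals a_prime.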

How can an entire euclidean transformation be expressed?

a’ = R a + t
rotation matrix R
point a
translation t

How can rotation and translation be expressed in a more compact way? What is required?

transformation matrix T

requires a, b, c to be extended from 3 to 4 dimensions (homogeneous coordinates)…

How can we create T from R and t? How must the points be changed?
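
One common construction, sketched under the assumption that the model is a' = R a + t as above: stack R and t into the 4x4 matrix [[R, t], [0, 1]] and append a 1 to every point. The names make_T and to_homogeneous are hypothetical helpers.

```python
import math

def make_T(R, t):
    """Stack R (3x3) and t (length 3) into the 4x4 matrix [[R, t], [0, 1]]."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def to_homogeneous(p):
    """Append a 1 so that the translation can be applied by matrix multiplication."""
    return list(p) + [1.0]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]   # 90 degrees about z
t = [1.0, 0.0, 0.0]
T = make_T(R, t)

a = [1.0, 0.0, 0.0]
a_new = matvec(T, to_homogeneous(a))   # first three entries equal R a + t
```

Here R a = (0, 1, 0), so a_new is approximately (1, 1, 0, 1); the trailing 1 is carried through unchanged.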

What is the definition of the special Euclidean group (SE(3))?

rotation and translation matrices (set)
matrix multiplication (operator)

Again, what is the result of applying rotation and translation to a point?

have point p in original frame
apply translation and rotation to compute the same point
in the new frame (coordinate system)

How can one reverse rotation and translation (compute point in original frame)?

What is the transformation matrix of inverse transformation?
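
A sketch for the two questions above, assuming the forward model p' = R p + t. Solving for p gives p = R^T p' - R^T t, so the inverse transformation matrix is [[R^T, -R^T t], [0, 1]]. The function names are illustrative.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def apply_pose(R, t, p):
    """Forward model: p' = R p + t."""
    return [x + y for x, y in zip(matvec(R, p), t)]

def invert_pose(R, t):
    """Return (R^T, -R^T t), the rotation and translation of the inverse."""
    Rt = transpose(R)
    return Rt, [-x for x in matvec(Rt, t)]

R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # 90 degrees about z
t = [1.0, 2.0, 3.0]
p = [4.0, 5.0, 6.0]

p_new = apply_pose(R, t, p)                # point in the new frame
R_inv, t_inv = invert_pose(R, t)
p_back = apply_pose(R_inv, t_inv, p_new)   # recovers the original point
```

Only a transpose is needed because R is orthogonal; a general 4x4 matrix inverse would give the same result at higher cost.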

Given two absolute poses ((R1,t1), (R2,t2)), how can we compute the relative pose (R12, t12)?
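
A possible answer, under an assumed convention: each absolute pose maps camera coordinates to world coordinates (p_w = Ri p_i + ti), and the relative pose maps camera-2 coordinates into camera 1. Then R12 = R1^T R2 and t12 = R1^T (t2 - t1). With the opposite convention the formulas change, so treat this as a sketch.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def relative_pose(R1, t1, R2, t2):
    """Pose of camera 2 expressed in camera 1, for camera-to-world inputs."""
    R1t = transpose(R1)
    R12 = matmul(R1t, R2)
    t12 = matvec(R1t, [b - a for a, b in zip(t1, t2)])
    return R12, t12

R1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # camera 1: 90 degrees about z
t1 = [1.0, 0.0, 0.0]
R2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # camera 2: no rotation
t2 = [1.0, 2.0, 0.0]
R12, t12 = relative_pose(R1, t1, R2, t2)

# Consistency check: a point seen by camera 2 reaches the same world point
# directly or via camera 1.
p2 = [1.0, 1.0, 1.0]
world_direct = [x + y for x, y in zip(matvec(R2, p2), t2)]
p1 = [x + y for x, y in zip(matvec(R12, p2), t12)]
world_via_cam1 = [x + y for x, y in zip(matvec(R1, p1), t1)]
```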

How should we express the position of a camera frame in the world frame?

provide the rotation and translation
from camera to world
- not from world to camera…

How can we get the R, t from world to camera?

invert the camera-to-world transformation…
- rotation: R^-1 = R^T (transpose)
- translation: -R^T t

How many degrees of freedom does the similarity transformation have?

How many degrees of freedom does euclidean transformation have?

2D -> 3
- rotation in the plane: 1
- translation along each axis: 2
3D -> 6
- rotation about each axis: 3
- translation along each axis: 3

What is the additional dimension / degree of freedom introduced by the similarity transformation?

euclidean (6)
plus scale (1)
-> 7

What are plücker coordinates used for?

to express lines in R^3 (3D lines…)

How are projection planes defined (w.r.t. 3d lines)?

projection plane is spanned by the origin of the camera frame
and the 3D line

Of what do plücker coordinates consist?

v: direction of 3D line (typically unit vector)
n: normal of projection plane
- -> we have v
- -> we have Q (point on the line where the perpendicular to v through the origin meets it)
- -> n = Q x v

v gives direction
n gives distance and position relative to origin

What is Q w.r.t. plücker coordinates?

Q is the point in the image frame (w.r.t. coordinate system of image frame) that lies on the 3D line
- -> Q gives point, v gives direction…

How is the norm of the n part in the plücker coordinates (normal of projection plane) obtained?

use definition of cross product
- || a x b || = ||a|| ||b|| sin theta
n = Q x v
as Q and v are orthogonal -> sin (90°) = 1…
-> || n || = d * ||v||
with d = ||Q|| = ||n||/||v||
- distance from origin to the 3D line…
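
A small sketch of these relations, assuming the line is given by two points. The function names are hypothetical; the code uses the fact that p x v gives the same n for every point p on the line, and that Q can be recovered as (v x n) / ||v||^2.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def plucker_from_points(p0, p1):
    """Return (v, n) for the line through p0 and p1, with v a unit direction."""
    d = [b - a for a, b in zip(p0, p1)]
    length = norm(d)
    v = [x / length for x in d]
    n = cross(p0, v)   # any point on the line yields the same n
    return v, n

def closest_point_to_origin(v, n):
    """Q = (v x n) / ||v||^2, the point on the line perpendicular to v."""
    vv = sum(x * x for x in v)
    return [x / vv for x in cross(v, n)]

# Line parallel to the x axis at height y = 2
v, n = plucker_from_points([3.0, 2.0, 0.0], [5.0, 2.0, 0.0])
Q = closest_point_to_origin(v, n)
d = norm(n) / norm(v)   # distance from the origin to the line
```

For this line, v = (1, 0, 0), n = (0, 0, -2), Q = (0, 2, 0) and d = 2.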

In real-world scenarios, does one use 3D lines as they are?

no -> usually use segments (3D lines with start and endpoint…)

What is a general “problem” of using rotation matrices to denote rotations?

has 9 elements
but actually only 3 degrees of freedom…
- -> x,y,z axis rotation…
=> redundancy…!
=> there are better ways to express rotations

What common methods to express rotation are there? What is the best one?

rotation matrix
euler angles
angle-axis (rotation vector)
quaternion
cayley’s representation

=> no ideal rotation representation for all purposes

-> are equivalent in some sense as they express the same rotation

What axes exist in euler angles?

roll axis
pitch axis
yaw axis

What type of rotation exists in euler angles?

intrinsic rotation
extrinsic rotation

What is the difference between extrinsic and intrinsic rotation (euler angles)?

extrinsic:
- the initial axes stay fixed while we conduct the individual rotations
- -> comparable to world space
intrinsic:
- axes change with the rotation (dynamic)
- -> a rotation about one axis changes the placement of the other axes and thus the following rotations…
- -> comparable to object space / camera frame

What is a gimbal lock?

degenerate case of euler rotations
-> first two rotations (about z, y axis)
-> lead to the last rotation about the x axis being about the initial z axis
=> one degree of freedom is lost
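
A numeric illustration, assuming the convention R = Rz(yaw) Ry(pitch) Rx(roll) (one of several in use; the notes do not fix one). At pitch = 90 degrees the result depends only on roll - yaw, so different angle triples give the same rotation.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def euler_zyx(yaw, pitch, roll):
    """Assumed convention: R = Rz(yaw) Ry(pitch) Rx(roll)."""
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Same roll - yaw = 0.2 in both cases
locked = max_abs_diff(euler_zyx(0.3, math.pi / 2, 0.5),
                      euler_zyx(0.5, math.pi / 2, 0.7))
# Same angle changes away from the singularity
regular = max_abs_diff(euler_zyx(0.3, 0.2, 0.5),
                       euler_zyx(0.5, 0.2, 0.7))
```

locked is zero up to rounding, while regular is clearly non-zero: at the singular pitch, yaw and roll no longer act independently.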

What does the axis-angle representation look like?

give rotation axis e
give rotation angle theta

-> rotation vector is theta e
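
To connect this to rotation matrices, a sketch using Rodrigues' formula (not named in the notes, added here as a standard conversion): R = I + sin(theta) K + (1 - cos(theta)) K^2, where K is the skew-symmetric matrix of the unit axis e.

```python
import math

def rotation_from_axis_angle(e, theta):
    """Rodrigues' formula for a rotation by theta about the axis e."""
    length = math.sqrt(sum(x * x for x in e))
    ex, ey, ez = (x / length for x in e)   # normalise the axis
    K = [[0.0, -ez, ey], [ez, 0.0, -ex], [-ey, ex, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]

R = rotation_from_axis_angle([0.0, 0.0, 1.0], math.pi / 2)
point = [1.0, 0.0, 0.0]
rotated = [sum(R[i][k] * point[k] for k in range(3)) for i in range(3)]
```

A quarter turn about the z axis moves (1, 0, 0) to approximately (0, 1, 0).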

How are quaternions defined?

based on polar coordinates

-> similar to axis-angle representation

but: actual representation with 4-dim vector q

q = (q0,q1,q2,q3)
with q0 as the real part and (q1,q2,q3) as the vector part

How is a quaternion denoted?

(n = [x,y,z]…)

When using quaternions, how are real points denoted?

denote as quaternion with real coordinate equal to 0
p = [0, x, y, z] = [0,v]
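
A sketch of how such a point is rotated, assuming a unit quaternion built from axis e and angle theta as q = [cos(theta/2), sin(theta/2) e] and the rotation p' = q p q*, with q* the conjugate. The function names are illustrative.

```python
import math

def quat_from_axis_angle(e, theta):
    """Unit quaternion [cos(theta/2), sin(theta/2) * e] for a unit axis e."""
    half = theta / 2.0
    s = math.sin(half)
    return [math.cos(half), s * e[0], s * e[1], s * e[2]]

def quat_mul(a, b):
    """Hamilton product of quaternions stored as [real, x, y, z]."""
    return [
        a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
        a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
        a[0] * b[2] - a[1] * b[3] + a[2] * b[0] + a[3] * b[1],
        a[0] * b[3] + a[1] * b[2] - a[2] * b[1] + a[3] * b[0],
    ]

def quat_conj(q):
    return [q[0], -q[1], -q[2], -q[3]]

def rotate_point(q, point):
    """Embed the point as [0, x, y, z], compute q p q*, return the vector part."""
    p = [0.0] + list(point)
    return quat_mul(quat_mul(q, p), quat_conj(q))[1:]

q = quat_from_axis_angle([0.0, 0.0, 1.0], math.pi / 2)
rotated = rotate_point(q, [1.0, 0.0, 0.0])
```

A quarter turn about z again moves (1, 0, 0) to approximately (0, 1, 0); the real part of q p q* stays 0, so the result is a valid point.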

What are some common 3D representation methods?

Point cloud
Voxel grid
Implicit surface
Mesh

What are point clouds?

discrete set of data points in space
may represent shape or object
each point has its own set of cartesian coordinates and is independent of the others

What is a voxel grid?

similar to pixel grid but in 3D
voxel: 3D cube
=> grid of values organized into layers, rows and columns (resulting in these 3D cubes…)
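
A minimal sketch of filling such a grid from a point cloud, assuming a sparse occupancy representation (a set of integer cell indices) and a hypothetical voxel_size parameter; dense arrays or per-voxel values are equally possible.

```python
import math

def voxelize(points, voxel_size):
    """Map each 3D point to the integer index of the voxel that contains it."""
    occupied = set()
    for x, y, z in points:
        occupied.add((math.floor(x / voxel_size),
                      math.floor(y / voxel_size),
                      math.floor(z / voxel_size)))
    return occupied

points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (1.5, 0.0, 0.0), (-0.1, 0.0, 0.0)]
occupied = voxelize(points, voxel_size=1.0)
```

The first two points share a cell, so four points occupy three voxels: (0, 0, 0), (1, 0, 0) and (-1, 0, 0).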

What is a mesh?

-> collection of vertices, edges and faces

that defines the shape of a polyhedral object
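
A small illustration with a tetrahedron, assuming the common indexed layout (a vertex list plus faces as index tuples); edges are derived from the faces rather than stored. The helper name is hypothetical.

```python
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]   # indices into vertices

def edges_from_faces(faces):
    """Collect each undirected edge once, as a sorted pair of vertex indices."""
    edges = set()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)

edges = edges_from_faces(faces)
euler_characteristic = len(vertices) - len(edges) + len(faces)
```

The tetrahedron has 4 vertices, 6 edges and 4 faces, so V - E + F = 2, as expected for a closed surface without holes.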

What types of mesh are there?

triangle mesh
- preferred if geometry function is easy and less complex
- mostly for regular geometrical shapes
quad mesh
- (relatively) accurate results
- more used in complex systems

How do voxel, point cloud and polygon mesh compare?

	Voxel	Point cloud	Polygon mesh
memory efficiency	poor	not good	good
textures	not good	no	yes
for neural networks	easy	not easy	not easy

How does signed distance function differ from the previously introduced 3D representation methods?

previous: explicit
SDF: implicit

How does SDF roughly work?

create voxel grid
calculate distance of actual surface to each voxel
-> creates zero-value SDF isosurface
- voxel lies outside: distance positive
- voxel lies inside: negative value
based on this, regress a decision boundary
=> creating watertight surface

How do we actually find the surface given the voxels?

take voxel corners
determine sign of these corners (positive: outside; negative: inside)
create iso-surface crossing the edges whose corners differ in sign
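
A sketch of the sign test on a single edge, assuming a sphere as the surface and linear interpolation for the crossing point (one simple choice, not necessarily the method from the lecture). The sign convention is positive outside, negative inside.

```python
import math

def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere at the origin: negative inside, positive outside."""
    return math.sqrt(sum(c * c for c in p)) - radius

def zero_crossing(p0, p1, d0, d1):
    """Linearly interpolate the point on edge p0-p1 where the SDF changes sign."""
    s = d0 / (d0 - d1)
    return [a + s * (b - a) for a, b in zip(p0, p1)]

corner_in, corner_out = [0.5, 0.0, 0.0], [1.5, 0.0, 0.0]
d_in, d_out = sphere_sdf(corner_in), sphere_sdf(corner_out)

if d_in * d_out < 0:   # opposite signs: the surface crosses this edge
    surface_point = zero_crossing(corner_in, corner_out, d_in, d_out)
else:
    surface_point = None
```

The two corners have distances -0.5 and 0.5, so the crossing lies at (1, 0, 0), exactly on the unit sphere. Repeating this for every voxel edge gives the points through which the iso-surface passes.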

What are line clouds?

similar to point clouds
but with lines…

How can we improve 3D scene representation?

integrate information
-> e.g. parallelism and orthogonality
- man-made structures usually orthogonal and parallel
- -> enforce this in line-clouds…
-> enforce co-planarity

How can we extend semantic information from 2d to 3d? (2d image segmentation -> each pixel has semantic meaning like car, sidewalk, pedestrian…)

assign semantic meaning to
points of pointcloud

What are rigid transformations?

Rotation and translation
