What is a projection plane?
the plane defined by the origin of a coordinate system
and a 3D line (or, equivalently, its projected 2D line)
What is the problem formulation of the matching/tracking problem?
estimate the transformation W (warping)
between a template image T and the current image I
=> all inlier 2D-2D point correspondences should satisfy the same warping model
How can one reformulate the warping estimation problem?
as the correspondence finding problem
-> when one knows the correspondences -> one can (easily) estimate the required warping parameters
How do the warping estimation problem and the correspondence finding problem relate?
chicken and egg problem
-> when one has the one, the other is easy to calculate
=> in practice, we know nothing…
What types of solutions do we have to find correspondences?
indirect methods
direct methods
How do indirect methods work?
detecting and matching features
i.e. points or lines
What are advantages and disadvantages of indirect methods?
advantages:
can cope with
large frame-to-frame motions
strong illumination changes
cons:
slow due to costly
feature extraction
matching
outlier removal
e.g. RANSAC
What is the general pipeline of indirect methods?
detect and match features that are invariant to
scale
rotation
viewpoint changes
e.g. SIFT
geometric verification (RANSAC)
refine estimate by minimizing the sum of squared reprojection errors between
observed feature in current image
and the warped corresponding feature from the template
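A minimal sketch of such an indirect pipeline using OpenCV (SIFT features, brute-force matching with a ratio test, RANSAC-verified homography); the parameter values and the choice of a homography as warping model are illustrative assumptions, not prescribed here:

```python
import cv2
import numpy as np

def estimate_warp_indirect(template, image):
    """Indirect pipeline sketch: detect/match SIFT features, then RANSAC + model fit."""
    sift = cv2.SIFT_create()
    kp_t, des_t = sift.detectAndCompute(template, None)
    kp_i, des_i = sift.detectAndCompute(image, None)

    # match descriptors; Lowe's ratio test discards ambiguous matches
    matches = cv2.BFMatcher().knnMatch(des_t, des_i, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    # geometric verification: RANSAC rejects outlier correspondences,
    # the warp (here a homography) is estimated from the inliers
    pts_t = np.float32([kp_t[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    pts_i = np.float32([kp_i[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inlier_mask = cv2.findHomography(pts_t, pts_i, cv2.RANSAC, 3.0)
    return H, inlier_mask
```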
What are pros and cons of direct methods?
pros:
all information in the image can be exploited
higher accuracy
higher robustness to motion blur and weak texture
i.e. weak gradients
increasing the framerate reduces computational cost per frame (no RANSAC needed)
cons:
very sensitive to initial value
limited frame-to-frame motion
How do direct methods roughly work?
use all pixels (no individual point correspondences)
-> directly process pixel intensities (i.e. we use greyscale)
=> they estimate the warp parameters that
minimize the sum of squared distances over all pixels
of the template image
and the warped corresponding pixel of the current image
minimize the warp parameters p
so that the SSD over all pixels x of the template is minimal (written out below), where
T(x) -> pixel intensity in the template at pixel x
I(W(x,p)) -> intensity in the current image at the position of pixel x warped with parameters p
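Written out as a formula (a reconstruction from the description above, using the same symbols T, I, W and warp parameters p):

```latex
\min_{p} \sum_{x} \big[\, I\!\left(W(x;\,p)\right) - T(x) \,\big]^2
```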
What assumptions do we make for the direct method?
brightness constancy
temporal consistency
spatial coherence
spatial coherency
What is meant with brightness constancy?
as we directly compare squared distances of pixel intensities
-> intensity of pixels to track must not change much over consecutive frames
=> direct methods do not cope well with strong illumination changes
=> assume brightness is constant…
What is meant by temporal consistency?
assume that the frame-to-frame motion of object to track is small
-> around 1-2 pixels
=> direct method does not cope well with large frame-to-frame motion
can be addressed using coarse-to-fine multi-scale implementations (later; see the sketch below)
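A rough sketch of the coarse-to-fine idea, assuming OpenCV image pyramids: solve first on strongly downsampled images (where a large true motion shrinks to 1-2 pixels), then propagate and refine the estimate on the finer levels. `single_level_klt` is a placeholder for one translation-only KLT step (e.g. the one derived further below):

```python
import cv2

def coarse_to_fine_translation(patch, image, x0, y0, single_level_klt, levels=3):
    """Coarse-to-fine sketch: estimate the translation on downsampled images first,
    then refine it on the finer levels. (x0, y0) is the patch position in the
    previous frame; single_level_klt(patch, image, x, y) returns (du, dv)."""
    # build image pyramids by repeated 2x downsampling
    pyr_patch, pyr_img = [patch], [image]
    for _ in range(levels - 1):
        pyr_patch.append(cv2.pyrDown(pyr_patch[-1]))
        pyr_img.append(cv2.pyrDown(pyr_img[-1]))

    u, v = 0.0, 0.0
    for lvl in reversed(range(levels)):        # start at the coarsest level
        u, v = 2 * u, 2 * v                    # propagate the estimate to the finer level
        du, dv = single_level_klt(pyr_patch[lvl], pyr_img[lvl],
                                  int(x0 / 2**lvl + u), int(y0 / 2**lvl + v))
        u, v = u + du, v + dv
    return u, v
```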
What is meant by spatial coherence?
all pixels in template undergo same transformation
-> i.e. they lie roughly on the same 3D surface
=> i.e. if the patch to track contains two individual objects that move (transform) differently
-> hard to use direct method…
What is meant by spatial coherency?
no errors in template image boundary
only the object to track appears in template image
-> i.e. no background… (as it has different motion)
no occlusion
entire template is visible in the input image
-> e.g. a postcard to track is occluded by a hand in front of it
What is an exemplary direct method we introduced?
KLT (Kanade-Lucas-Tomasi) tracker for small motion
consists of two sub-algorithms
Does direct methods in theory and practice differ?
yes
in theory: use all pixels
in practice (at least for KLT) -> do not use all pixels, as some are unreliable…
Of what sub-algorithms does the KLT tracker consist?
Tomasi-Kanade -> how should we select features (which pixels/image patch should we track)
-> method to choose best features
Lucas-Kanade -> how should we track features from frame to frame?
-> method to align an image patch
In the KLT tracker, are the sub-algorithms applied sequentially?
no, the goal is to solve both simultaneously
What is the objective function in KLT in case of pure translation?
the SSD as a function of the translation parameters u,v (-> pixel (x,y) maps to (x+u, y+v)),
i.e. the sum over all pixels (x,y) in our image patch of
(intensity of pixel (x,y) in the patch - intensity of pixel (x+u,y+v) in the current image)^2
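As a formula (a reconstruction of the description above; I_0 denotes the template/patch image and I_1 the current image):

```latex
E(u,v) = \sum_{(x,y)\in\text{patch}} \big[\, I_0(x,y) - I_1(x+u,\, y+v) \,\big]^2
```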
How do we rewrite our cost function to be able to use gradient descent?
have to get u and v out of the I1(…)
-> approximate the intensity in current image with first order taylor expansion
=> first order taylor expansion approximates the intensity near our x,y pixel
-> for this, we create the sum of
I1(x,y) -> intensity at pixel (x,y)
directional gradient (x-direction) of current image * u (-> yields the approximate difference at the distance u from x)
same for directional gradient y…
resulting in:
How do we actually differentiate the KLT SSD formula and minimize it?
minimize -> calculate the gradient and set it to 0
-> i.e. derivatives w.r.t. u and v
first differentiate with respect to u
use the chain rule (outer derivative times inner derivative)
same for v
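Spelled out, the two optimality conditions (a reconstruction using the linearized cost from above; I_x, I_y are the image gradients, and the common factor 2 can be dropped):

```latex
\frac{\partial E}{\partial u} = \sum_{(x,y)} 2\,\big[I_1(x,y) + I_x u + I_y v - I_0(x,y)\big]\, I_x = 0,
\qquad
\frac{\partial E}{\partial v} = \sum_{(x,y)} 2\,\big[I_1(x,y) + I_x u + I_y v - I_0(x,y)\big]\, I_y = 0
```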
How can we solve the minimization?
expand the products (the constant factor from differentiating can be dropped …)
collect the Ix, Iy terms in one matrix M = Σ [Ix², Ix·Iy; Ix·Iy, Iy²]
collect u,v in a vector
collect the Σ Ix·ΔI, Σ Iy·ΔI terms in a vector on the right-hand side (with ΔI = I1 - I0)
solve the resulting 2x2 linear system M · (u,v)^T = -(Σ Ix·ΔI, Σ Iy·ΔI)^T for the u,v vector
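A minimal numpy sketch of this closed-form translation step (single iteration, no coarse-to-fine; the windowing and gradient computation are simplified assumptions):

```python
import numpy as np

def lucas_kanade_translation(patch, current, x0, y0):
    """One KLT step for pure translation: returns the sub-pixel displacement (u, v)
    of the template patch relative to position (x0, y0) in the current image."""
    h, w = patch.shape
    I0 = patch.astype(np.float64)
    I1 = current[y0:y0 + h, x0:x0 + w].astype(np.float64)

    # image gradients of the current window and the intensity difference delta I
    Iy, Ix = np.gradient(I1)
    dI = I1 - I0

    # normal equations M * (u, v)^T = -(sum Ix*dI, sum Iy*dI)^T
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * dI), np.sum(Iy * dI)])

    if abs(np.linalg.det(M)) < 1e-6:      # flat region or edge: not trackable
        return None
    u, v = np.linalg.solve(M, b)
    return u, v
```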
What do we have to look for in our M matrix?
must be invertible to solve
-> det(M) should be non-zero
-> eigenvalues should be large (i.e. not flat region, not an edge)
=> in practice: the patch should be a corner or, more generally, contain texture (else det(M) is low…)
After our findings, how can we answer the question on “how to choose patches to track”?
patches whose associated M matrix has large eigenvalues
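A sketch of this selection criterion (essentially the Shi-Tomasi "good features to track" score): compute M for each candidate patch and keep the patches whose smaller eigenvalue is large. Window handling is left out for brevity:

```python
import numpy as np

def trackability_score(patch):
    """Smaller eigenvalue of the patch's M matrix; large => corner-like, well textured."""
    Iy, Ix = np.gradient(patch.astype(np.float64))
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    return np.linalg.eigvalsh(M).min()    # eigenvalues of the symmetric matrix M
```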
After our findings, how can we answer the question on “how to track patches from frame to frame?”
use SSD to find best fit for our patch in next frame (with displacement vector u,v)
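In practice this is, for example, what OpenCV's pyramidal KLT implementation offers; a usage sketch (file names and parameter values are placeholders):

```python
import cv2

prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
next_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# select patches with a large minimum eigenvalue of M, then track them frame to frame
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=10)
new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None,
                                                winSize=(21, 21), maxLevel=3)
tracked = new_pts[status.ravel() == 1]    # keep only successfully tracked points
```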
How can we extend the discussed KLT case of simple translation to the general case (i.e. warping…)?
extend our SSD formula
where x are the individual pixels of our patch
T(x) is the intensity of the respective pixel in our template
W(x,p) is the warping of our pixel with the unknown warping parameters p (-> warping -> new position in the new image)
and I(W(…)) is the intensity of that warped pixel location in the current image
How do we solve our minimization problem in the general case?
similarity:
apply first order approximation of warping
difference:
pure translation: partial derivatives to obtain direct solution
general case:
Gauss-Newton method to minimize the SSD iteratively
(we can theoretically still use the first order optimality conditions to generate equations w.r.t. the warping parameters -> may be difficult to solve…)
How do we iteratively minimize for the general case?
assume p is known
-> incrementally update p (with a delta p) so that SSD is reduced
=> in each step, find delta p that minimizes the SSD
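A compact Gauss-Newton sketch for the general case with an affine warp (forward-additive Lucas-Kanade style); the parameterization of p, the gradient approximation on the warped image, and the stopping threshold are illustrative assumptions:

```python
import cv2
import numpy as np

def klt_affine(template, current, p, iters=20):
    """Iteratively refine the 6 affine warp parameters p so that the SSD between
    the template T and the warped current image I(W(x;p)) decreases."""
    p = np.asarray(p, dtype=np.float64).copy()
    h, w = template.shape
    T = template.astype(np.float32)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))

    for _ in range(iters):
        A = np.array([[1 + p[0], p[2], p[4]],
                      [p[1], 1 + p[3], p[5]]])
        # sample I(W(x;p)) on the template grid
        Iw = cv2.warpAffine(current.astype(np.float32), A, (w, h),
                            flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
        error = (Iw - T).ravel()                     # residual per pixel

        # steepest-descent images: (approximate) image gradient times Jacobian dW/dp
        Iy, Ix = np.gradient(Iw.astype(np.float64))
        J = np.stack([Ix * xs, Iy * xs, Ix * ys, Iy * ys, Ix, Iy], axis=-1).reshape(-1, 6)

        H = J.T @ J                                  # Gauss-Newton approximation of the Hessian
        dp = np.linalg.solve(H, -J.T @ error)        # delta p that reduces the SSD
        p += dp
        if np.linalg.norm(dp) < 1e-4:                # converged
            break
    return p
```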
What is a drawback of our method to choose suitable patches? How can it be improved?
we would have to judge all pixels in every image
-> improvement: judge all pixels only for the first image
-> in subsequent images, only consider the already tracked points…
How do we generally try to solve a chicken-and-egg problem? And what is the chicken-and-egg problem in our direct method?
we do not know the correspondences we want to track
neither do we know the warping parameters
-> generally: the problem has two sets of unknown parameters, and the parameters are mutually determined
solution: find an additional constraint to solve for the parameters
for us: e.g. brightness constancy (i.e. the SSD must be small…)
How do we generally proceed in KLT to find the best warping parameters?
incrementally update them (parameters p) to continuously reduce the value of the cost function (SSD)
-> each iteration, we assume we know p and want to find a delta p that improves (reduces) our loss
=> i.e. gradient descent