What types of descriptors did we discuss?
patch descriptors
-> i.e. a patch of raw intensity values
needs to be warped into a canonical space
census descriptors
-> a vector of integer/float values
What is patch scale search?
trying to find the correct relative scale between two patches before comparing them
-> brute force: try SSD for all (patch1) x (patch2 x scales) combinations (see the sketch below)
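A minimal sketch of that brute-force search, assuming numpy/OpenCV grayscale patches (the function names `ssd` / `best_scale_match`, the `region_r` neighborhood, and the candidate scale set are mine, purely illustrative):

```python
import numpy as np
import cv2

def ssd(a, b):
    """Sum of squared differences between two equal-size patches."""
    d = a.astype(np.float32) - b.astype(np.float32)
    return float((d * d).sum())

def best_scale_match(patch_l, region_r, scales=(0.5, 0.7, 1.0, 1.4, 2.0)):
    """Fix the left patch's scale; crop the right feature's neighborhood at every
    candidate scale, resample to the left patch's size, and keep the min-SSD scale.
    region_r must be large enough to contain the biggest crop."""
    h, w = patch_l.shape[:2]
    cy, cx = region_r.shape[0] // 2, region_r.shape[1] // 2
    best = (np.inf, None)
    for s in scales:
        hh, hw = max(1, int(h * s) // 2), max(1, int(w * s) // 2)
        crop = region_r[cy - hh:cy + hh, cx - hw:cx + hw]
        cost = ssd(patch_l, cv2.resize(crop, (w, h)))
        if cost < best[0]:
            best = (cost, s)
    return best  # (min SSD, best relative scale)
```

Running this for all N x N feature pairs gives the O(N^2 * S) cost discussed in the next card.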
What are drawbacks of patch scale search?
inefficient
-> O(N^2 * S)
where N is the number of features per image
and S is the number of scales to try (i.e. we only try different scales in one image,
as we fix the scale of the left patch)
-> cannot guarantee that this fixed scale is optimal (distinctive enough)
How do we want to improve the high complexity of patch scale search?
general idea: assume that we know the scale beforehand (a priori)
-> only O(N^2)…
goal: automatically determine the scale before (and independently of) the actual matching
=> i.e. determine the scale based on a single image
What is the goal of automatic scale determination?
we want to automatically find a scale (size)
for both images individually
=> i.e. independent of tentative matching unlike before
What methods do we have to find the rotation for our patch descriptor?
Harris detector (eigen-analysis of M)
pixel-wise gradient vectors (orientation histogram)
How does the harris detector work for derotating?
use the Harris detector, as it is rotation invariant
-> the eigenvectors of the M matrix correspond to the directions of quickest and slowest change of the SSD
=> the eigenvalues/eigenvectors define an ellipse that can rotate, but whose shape stays the same
=> easy to assign a rotation… (see the sketch below)
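A sketch of that eigen-analysis (Sobel-based gradients and the function name are my choice): build the second-moment matrix M from the patch gradients and read the orientation off its dominant eigenvector.

```python
import numpy as np
from scipy.ndimage import sobel

def harris_orientation(patch):
    """Orientation of a patch from the eigenvectors of its second-moment matrix M."""
    p = patch.astype(np.float64)
    ix, iy = sobel(p, axis=1), sobel(p, axis=0)   # image gradients
    M = np.array([[(ix * ix).sum(), (ix * iy).sum()],
                  [(ix * iy).sum(), (iy * iy).sum()]])
    evals, evecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    major = evecs[:, -1]                          # direction of quickest SSD change
    # note: eigenvector signs are arbitrary, so the angle is only defined up to pi
    return np.arctan2(major[1], major[0])
```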
How does the second method to derotate using pixel wise gradient vectors work?
compute the gradient vector at each pixel within the patch
build a histogram of gradient orientations (0-2pi), weighted by gradient magnitude (norm of the vector)
using this information -> extract local maxima above a certain threshold
=> these constitute candidate dominant directions (typically 3)
use these dominant directions to align the rotation of both patches (see the sketch below)
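A sketch of the orientation-histogram variant (36 bins and the 80%-of-max threshold follow the common SIFT convention; the function name is mine):

```python
import numpy as np
from scipy.ndimage import sobel

def dominant_orientations(patch, n_bins=36, rel_thresh=0.8):
    """Magnitude-weighted orientation histogram; peaks above rel_thresh * max
    become candidate dominant directions."""
    p = patch.astype(np.float64)
    gx, gy = sobel(p, axis=1), sobel(p, axis=0)
    ang = np.arctan2(gy, gx) % (2 * np.pi)        # orientations in [0, 2pi)
    mag = np.hypot(gx, gy)                        # gradient magnitudes as weights
    hist, edges = np.histogram(ang, bins=n_bins, range=(0, 2 * np.pi), weights=mag)
    # local maxima (with wrap-around neighbors) above the threshold
    peaks = (hist >= rel_thresh * hist.max()) \
            & (hist >= np.roll(hist, 1)) & (hist >= np.roll(hist, -1))
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[peaks]                         # candidate dominant directions
```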
Which strategy to rotate patches is preferred in practice?
the one based on gradient orientations
-> more robust
What is the aim of blob detection?
given a single image
-> detect blobs and automatically assign them “appropriate” scales
Why is the use of blobs more convenient than e.g. corners?
blobs inherently encode scale information
-> corners can hardly do this…
=> blobs can directly be resized to evaluate two patches by SSD
How is the scale mathematically expressed in blobs?
the radius of the circle around the blob denotes its scale
How do we get from blob matching to point correspondences?
for two matched blobs
-> their respective centers constitute the corresponding points
What is a drawback of using Blobs?
blob centers may not be very precise
compared with corners
How do we determine the optimal scale of a blob?
make use of the LoG (Laplacian of Gaussian)
-> defined for a fixed scale sigma
try a set of candidate scales sigma, for each of which we compute the LoG (applying the kernel)
the filtered LoG output is the so-called “response”
-> find the sigma where the (scale-normalized) response is maximal
=> this is our optimal scale (see the sketch below)
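A sketch of this per-blob scale search using scipy's `gaussian_laplace` (function name and candidate scales are mine); the sigma^2 factor is the scale normalization that makes responses comparable across scales:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def best_blob_scale(img, cy, cx, sigmas=(1, 2, 4, 8, 16)):
    """Return the candidate sigma maximizing the scale-normalized LoG response
    magnitude at the blob center (cy, cx)."""
    responses = [abs((s ** 2 * gaussian_laplace(img, s))[cy, cx]) for s in sigmas]
    return sigmas[int(np.argmax(responses))]
```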
How can we improve our Blob finding?
disregard blobs that are too close to each other and have the same scale
-> as they are probably the same…
How do we disregard “too similar / overlapping” blobs?
overlapping: centers are close, sigma is the same
=> use non-maximum suppression
=> only keep the blob with the higher response (i.e. the LoG extremum)
What is the pipeline for blob detection with scale using a single image?
build the Laplacian scale space, starting with an initial scale and iterating for n steps:
generate a (scale-normalized) LoG filter at the current scale (k^n * initial)
filter the image with the LoG kernel
save the square of the Laplacian filter response for the current level of the scale space
increase the scale by a factor of k
perform non-maximum suppression in scale space (-> find the best scale for each blob)
display the resulting circles at their characteristic scales (see the sketch below)
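A compact sketch of the whole pipeline (the threshold and defaults are arbitrary placeholders; assumes a float image scaled to [0, 1]). A 3x3x3 maximum filter implements the non-maximum suppression in scale space:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def detect_blobs(img, sigma0=2.0, k=1.25, n_levels=10, thresh=0.01):
    """Blob detection via a squared, scale-normalized Laplacian scale space."""
    sigmas = sigma0 * k ** np.arange(n_levels)
    # (n_levels, H, W) stack of squared, scale-normalized LoG responses
    space = np.stack([(s ** 2 * gaussian_laplace(img, s)) ** 2 for s in sigmas])
    # keep points that are the maximum of their 3x3x3 scale-space neighborhood
    peaks = (space == maximum_filter(space, size=3)) & (space > thresh)
    lvl, ys, xs = np.where(peaks)
    return [(y, x, sigmas[l]) for l, y, x in zip(lvl, ys, xs)]  # (row, col, scale)
```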
What scale of patch descriptor methods did we differentiate? What is the complexity?
brute-force matching with straightforward scale search (one-step method)
S * N^2
brute-force matching with automatic scale search (two-step method)
N*S + N*S + N^2 (scale detection in each image, then matching)
What are disadvantages of patch descriptor-based methods?
if the warping is not estimated accurately
-> even very small errors in rotation, scale, or viewpoint will affect the SSD-based matching score
the LoG is relatively inefficient
What is an alternative to patch descriptor based methods?
census descriptor based methods
What is a main difference between patch and census descriptor based methods?
in census descriptor based methods
-> we do not directly compare patches with SSD
-> but compare associated vector descriptors
-> less sensitive to noise
=> i.e. use a vector to describe the patch instead of pixel-wise SSD
What is an alternative to LoG to overcome its inefficiency?
use the difference of Gaussians (DoG) kernel
What is SIFT?
Scale-Invariant Feature Transform
What are the overarching steps in SIFT?
keypoint extraction based on extrema detection using DoG (instead of LoG)
census descriptor assignment via HoG
How do we replace LoG with DoG?
LoG -> requires an explicit kernel convolution
DoG -> approximates the LoG without an explicit LoG convolution
-> uses the difference of Gaussian blurs at different scales (see the relation below)
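The underlying relation (from the SIFT paper) is G(x, y, k·sigma) − G(x, y, sigma) ≈ (k − 1) · sigma^2 · ∇²G, i.e. a DoG image equals a scale-normalized LoG response up to the constant factor (k − 1). A quick numerical sanity check with scipy (random test image, purely illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

img = np.random.rand(64, 64)                      # hypothetical test image
sigma, k = 2.0, 1.6
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
log = (k - 1) * sigma ** 2 * gaussian_laplace(img, sigma)
# the two responses should be strongly correlated (close to 1)
print(np.corrcoef(dog.ravel(), log.ravel())[0, 1])
```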
How do we perform DoG?
we have a source image
compute several Gaussian blurs of it (with different std. dev.)
sigma = 1; sigma = 2, …
calculate the difference between adjacent blurred images
=> results in DoG images
=> do this for different scales of the image (so-called octaves)
detect local extrema in these DoG images (see the sketch below)
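A sketch of the octave/DoG construction (parameter names and defaults are mine; real SIFT resamples one of the blurred levels when starting a new octave):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(img, sigma0=1.6, k=2 ** 0.5, levels=5, octaves=3):
    """Per octave: Gaussian blurs at increasing sigma, then adjacent differences."""
    pyramid = []
    for _ in range(octaves):
        blurs = [gaussian_filter(img, sigma0 * k ** i) for i in range(levels)]
        dogs = [b2 - b1 for b1, b2 in zip(blurs, blurs[1:])]  # levels-1 DoG images
        pyramid.append(dogs)
        img = img[::2, ::2]          # next octave at half resolution (simplified)
    return pyramid
```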
What are the local extrema in the DoG images?
the SIFT key points
How do we detect local extrema in the DoG images?
each pixel is compared to its 26 neighbors:
8 around it (neighbors in the current image)
9 above it (adjacent upper scale)
9 below it (adjacent lower scale)
-> i.e. like a surrounding cube in the DoG pyramid
if the pixel is a local extremum -> select it as a SIFT feature (see the sketch below)
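A sketch of the 26-neighbor test on one octave's list of DoG images (function name is mine); the 3x3x3 cube contains the pixel itself, so comparing against the cube's max/min is equivalent to comparing against the 26 neighbors:

```python
import numpy as np

def is_extremum(dogs, lvl, y, x):
    """dogs: list of DoG images of one octave; (lvl, y, x) must be interior."""
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[lvl - 1:lvl + 2]])
    center = dogs[lvl][y, x]
    return center == cube.max() or center == cube.min()
```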
What is the output of the SIFT detector for each SIFT feature?
location (x, y) (the pixel that is the local extremum)
scale s (the scale in the pyramid at which the local extremum resides)
What are census descriptors also called?
histogram of oriented gradients (HOG) descriptor
How do we calculate the HoG/census descriptor?
input: de-rotated patch
divide the patch into 4x4 cells
for each cell: generate an 8-bin histogram of gradient orientations (i.e. 8 directions)
=> concatenate all histograms into a single 1D vector (see the sketch below)
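A sketch of the cell-wise histogram construction (function name is mine; the final L2 normalization is SIFT's usual step for illumination robustness):

```python
import numpy as np
from scipy.ndimage import sobel

def hog_descriptor(patch):
    """128-D census descriptor from a de-rotated, square patch (side divisible by 4)."""
    p = patch.astype(np.float64)
    gx, gy = sobel(p, axis=1), sobel(p, axis=0)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    mag = np.hypot(gx, gy)
    cell = p.shape[0] // 4
    hists = []
    for i in range(4):                            # 4x4 grid of cells
        for j in range(4):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            h, _ = np.histogram(ang[sl], bins=8, range=(0, 2 * np.pi), weights=mag[sl])
            hists.append(h)
    desc = np.concatenate(hists)                  # 4 * 4 * 8 = 128 values
    return desc / (np.linalg.norm(desc) + 1e-9)
```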
What is the dimension of our SIFT HoG/census descriptor?
4 (cells) x 4 (cells) x 8 (bins) = 128
What is the output of the SIFT algo?
location (-> pixel coordinates of the patch center): 2D vector
scale (-> higher scale -> more blur in the Gaussian pyramid…): 1 scalar value
orientation (-> dominant direction of the HoG): 1 scalar value (i.e. the angle of the patch)
descriptor (128 values…)
How does the application of HoG differ for dominant direction (rotation) determination and descriptor generation?
dominant direction:
operates at the pixel level of the whole patch
use the HoG to find out how to de-rotate
descriptor generation:
the patch is already de-rotated
then use HoG at the cell level for descriptor generation