by David

Torralba and Oliva published a paper entitled "Statistics of natural image categories". They report that, contrary to what was believed before, the

---- please select ---- power spectrum; pixel intensity histogram

of (images of) natural scenes is only

---- please select ---- non-isotropic; isotropic

if averaged across image categories; when images are analysed separately by category, there are strong correlations between

---- please select ---- the shape; the total power; phase; the complex conjugate

of the power spectrum and image categories. Typically, a density plot of the power spectrum of an image of a man-made scene is more

---- please select ---- egg-shaped; circular; triangular-shaped; star-shaped

. Based on

---- please select ---- two; a large number of; a small number of; a single

component(s) of a principal component analysis (PCA) performed on the power spectrum, Torralba & Oliva were able to correctly categorise images into animal and non-animal scenes in ___ % of the cases. Calculating the PCA of the power spectrum is a

---- please select ---- non-linear; linear

operation.

---- please select ---- However; Still

this operation could be performed

---- please select ---- only with great difficulty; already in the retina; in a feedforward manner; only using feedback

in the human brain given what is currently known about physiology. Thus Torralba & Oliva concluded that animal versus non-animal categorization is so rapid because their

---- please select ---- structural description model; summary statistic; image segmentation; view-based

approach does not require an explicit

---- please select ---- image segmentation; Fourier transformation; image alignment

step.

Torralba and Oliva published a paper entitled "Statistics of natural image categories". They report that, contrary to what was believed before, the power spectrum of (images of) natural scenes is only isotropic if averaged across image categories; when images are analysed separately by category, there are strong correlations between the shape of the power spectrum and image categories. Typically, a density plot of the power spectrum of an image of a man-made scene is more star-shaped. Based on a small number of components of a principal component analysis (PCA) performed on the power spectrum, Torralba & Oliva were able to correctly categorise images into animal and non-animal scenes in 80 % of the cases. Calculating the PCA of the power spectrum is a non-linear operation. Still, this operation could be performed in a feedforward manner in the human brain given what is currently known about physiology. Thus Torralba & Oliva concluded that animal versus non-animal categorization is so rapid because their summary statistic approach does not require an explicit image segmentation step.
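To make the "PCA on the power spectrum" idea more concrete, here is a minimal sketch in Python with NumPy. It is not Torralba & Oliva's actual pipeline: the synthetic images, the helper names (`power_spectrum`, `pca_project`, `synthetic_image`), the log compression and the choice of two components are all assumptions made for illustration. The "grid" images have their energy concentrated on the two frequency axes and stand in, very loosely, for man-made scenes; the "noise" images have a flat spectrum.

```python
import numpy as np

def power_spectrum(image):
    """Centred 2D power spectrum |F(u, v)|^2 of a grayscale image."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    # Squaring the magnitude is the non-linear step of the pipeline.
    return np.abs(spectrum) ** 2

def pca_project(features, n_components):
    """Project row-vector features onto their first principal components."""
    centred = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    # The projection itself is a linear operation.
    return centred @ vt[:n_components].T

def synthetic_image(kind, rng, size=32):
    """Hypothetical stand-in images: an axis-aligned grid or plain noise."""
    noise = rng.normal(size=(size, size))
    if kind == "grid":
        wave = np.sin(2.0 * np.pi * np.arange(size) / 8.0)
        image = wave[None, :] + wave[:, None] + 0.5 * noise
    else:
        image = noise
    # Equalise contrast so the classes differ in spectral shape, not total power.
    return image / image.std()

rng = np.random.default_rng(0)
kinds = ["grid"] * 20 + ["noise"] * 20
images = [synthetic_image(kind, rng) for kind in kinds]

# One feature vector per image: the flattened, log-compressed power spectrum.
features = np.array([np.log1p(power_spectrum(im)).ravel() for im in images])
scores = pca_project(features, n_components=2)
print(scores.shape)
```

Each image is reduced to two numbers, and a simple threshold or linear classifier on these scores could then separate the two synthetic categories. No segmentation of the image into objects is needed at any point, which is the property the summary statistic approach relies on.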

Thorpe, Fize & Marlot published a study in Nature which exerted a very strong influence on the object recognition community. In their paper they showed that

---- please select ---- human observers; monkeys; cats

could decide whether a previously unseen

---- please select ---- photograph; line drawing; painting

of a natural scene contained an animal or not. The median reaction time (RT) of the observers was around

---- please select ---- 400-500; 100-200; 200-300; 500-600; 600-700

ms with a mean percentage correct of

---- please select ---- 85-90; 90-95; 95-100; 80-85

% (note that the observers showed

---- please select ---- a slight; no; a strong

speed-accuracy trade-off). Subsequent

---- please select ---- ERP; fMRI; PET; multi-unit; single-unit

analyses showed that roughly

---- please select ---- 150; 100; 200; 250

ms after stimulus onset the measured neurophysiological correlate could already reliably signal the presence or absence of an animal in a post-hoc analysis. Thus processing of the natural scene stimulus was already completed after such a comparatively short time. According to the authors, this result provides strong evidence in favour of essentially

---- please select ---- feedforward; dynamic feedback; multi-level feedforward & feedback; feedback; deep-belief neural network

theories of visual object recognition. This, in turn, argues against object recognition theories requiring an explicit

---- please select ---- image segmentation; 2D-to-3D; Fourier transform; Wavelet transform; multi-scale image decomposition

step prior to recognition, as such a step is presumed to require

---- please select ---- time consuming; fast; computationally complex

---- please select ---- iterative; feedback; feedforward; non-linear processing; linear decomposition; fast Fourier

algorithms.

Thorpe, Fize & Marlot published a study in Nature which exerted a very strong influence on the object recognition community. In their paper they showed that human observers could decide whether a previously unseen photograph of a natural scene contained an animal or not. The median reaction time (RT) of the observers was around 400-500 ms with a mean accuracy of 90-95 % correct (note that the observers showed a slight speed-accuracy trade-off). Subsequent ERP analyses showed that roughly 150 ms after stimulus onset the measured neurophysiological correlate could already reliably signal the presence or absence of an animal in a post-hoc analysis. Thus processing of the natural scene stimulus was already completed after such a comparatively short time. According to the authors, this result provides strong evidence in favour of essentially feedforward theories of visual object recognition. This, in turn, argues against object recognition theories requiring an explicit image segmentation step prior to recognition, as such a step is presumed to require time-consuming iterative or feedback algorithms (both options appear to be accepted as correct).

Wichmann, Drewes, Rosas and Gegenfurtner published a paper

---- please select ---- casting doubt on; confirming

the conclusions made by Torralba & Oliva. The two main conclusions of the study – based on

---- please select ---- computational analysis; psychophysical experiments; neuro-imaging techniques

– were, first, that for human observers animal detection in typical photographs of natural scenes

---- please select ---- is independent of the power spectrum; relies on many PCA components of the power spectrum; depends on the power spectrum as claimed by Torralba & Oliva; is independent of the phase spectrum

. Second, the results may indicate that in typical commercial databases the statistics of the images may

---- please select ---- be as; not be as; even be more

natural as/than often presumed, because photographs typically represent a

---- please select ---- true; random sample; Gaussian sample; biased; unbiased

view of the world.

Wichmann, Drewes, Rosas and Gegenfurtner published a paper casting doubt on the conclusions made by Torralba & Oliva. The two main conclusions of the study – based on psychophysical experiments – were, first, that for human observers animal detection in typical photographs of natural scenes is independent of the power spectrum. Second, the results may indicate that in typical commercial databases the statistics of the images may not be as natural as often presumed, because photographs typically represent a biased view of the world.

The average and distribution of properties, like orientation or color, over a set of objects or a region in a scene are called the

---- please select ---- ensemble statistics; guiding features; None of the two answers is true

of the scene, and are computed by the

---- please select ---- selective; None of the two answers is true; nonselective

pathway.

The average and distribution of properties, like orientation or color, over a set of objects or a region in a scene are called the ensemble statistics of the scene, and are computed by the nonselective pathway.

What are the two pathways to scene perception?

(max 380 characters)

There are two pathways for scene perception: a selective one and a nonselective one. The selective pathway allocates attention to one or a few objects at a time and is governed by the attentional bottleneck. Because it processes objects selectively, it is responsible for visual search and binding, and it gives rise to phenomena such as the attentional blink, change blindness, and inattentional blindness. The nonselective pathway, on the other hand, processes visual scenes holistically, encoding scene gist, spatial layout, and ensemble statistics very quickly. The representations in the nonselective pathway are generated as a whole and do not include descriptions of individual objects within the scene. The nonselective pathway has connections with the selective pathway and can, for instance, guide visual search for particular objects in a scene by helping the observer restrict attention to particular locations in the scene.

What are ensemble statistics?

(50 characters)

Ensemble statistics are rapidly extracted representations of visual scenes that include the average and distribution of properties like orientation or color over a set of objects or a region of space. Ensemble statistics represent knowledge about the properties of a group of objects rather than individual objects themselves.
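As an illustration of "average and distribution of properties over a set of objects", the sketch below computes simple ensemble statistics for orientation and color in Python with NumPy. It is only a numerical example, not a model of the nonselective pathway; the function names and the sample values are hypothetical. One assumption worth stating: orientation is treated as circular with a period of 180 degrees, so the mean is computed on doubled angles rather than as a plain arithmetic mean.

```python
import numpy as np

def orientation_ensemble(orientations_deg):
    """Circular mean (degrees, in [0, 180)) and circular variance (0 to 1)."""
    # Doubling the angles maps the 180-degree period onto a full circle.
    doubled = np.deg2rad(2.0 * np.asarray(orientations_deg, dtype=float))
    resultant = np.mean(np.exp(1j * doubled))
    mean_deg = (np.rad2deg(np.angle(resultant)) / 2.0) % 180.0
    # Resultant length near 1 means the items share almost the same orientation.
    circ_var = 1.0 - np.abs(resultant)
    return mean_deg, circ_var

def colour_ensemble(rgb):
    """Per-channel mean and standard deviation over a set of RGB colors."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb.mean(axis=0), rgb.std(axis=0)

# Four nearly horizontal items whose orientations straddle the 0/180 wrap-around.
tilts = [160.0, 0.0, 170.0, 10.0]
mean_tilt, spread = orientation_ensemble(tilts)
print(mean_tilt, spread)

mean_rgb, sd_rgb = colour_ensemble([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
print(mean_rgb, sd_rgb)
```

For these four items the circular mean is 175 degrees, whereas a plain arithmetic mean would give 85 degrees, which is almost perpendicular to every item in the set. The summary describes the group as a whole and says nothing about which individual item has which orientation.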

---- please select ---- Physical organization; Spatial layout; Physical setting; Spatial organization; Setting

describes the structure of a scene without reference to the identity of specific objects in the scene.

Spatial layout describes the structure of a scene without reference to the identity of specific objects in the scene.

Last changed
a year ago
