Torralba and Oliva published a paper entitled "Statistics of natural image categories". They report that, contrary to what was believed before, the
---- please select ---- power spectrum; pixel intensity histogram
of (images of) natural scenes is only
---- please select ---- non-isotropic; isotropic
if averaged across image categories, but if analysed separately for different image categories, they found strong correlations between
---- please select ---- the shape; the total power; phase; the complex conjugate
of the power spectrum and image categories. Typically, a density plot of the power spectrum of an image of a man-made scene is more
---- please select ---- egg-shaped; circular; triangular-shaped; star-shaped
. Based on
---- please select ---- two; a large number of; a small number of; a single
component(s) of a principal component analysis (PCA) performed on the power spectrum, Torralba & Oliva were able to correctly categorise images into animal and non-animal scenes in ___ % of the cases. Calculating the PCA of the power spectrum is a
---- please select ---- non-linear; linear
operation.
---- please select ---- However; Still
this operation could be performed
---- please select ---- only with great difficulty; already in the retina; in a feedforward manner; only using feedback
in the human brain given what is currently known about physiology. Thus Torralba & Oliva concluded that animal versus non-animal categorization is so rapid because their
---- please select ---- structural description model; summary statistic; image segmentation; view-based
approach does not require an explicit
---- please select ---- image segmentation; Fourier transformation; image alignment
step.
Torralba and Oliva published a paper entitled "Statistics of natural image categories". They report that, contrary to what was believed before, the power spectrum of (images of) natural scenes is only isotropic if averaged across image categories, but if analysed separately for different image categories, they found strong correlations between the shape of the power spectrum and image categories. Typically, a density plot of the power spectrum of an image of a man-made scene is more star-shaped. Based on a small number of components of a principal component analysis (PCA) performed on the power spectrum, Torralba & Oliva were able to correctly categorise images into animal and non-animal scenes in 80 % of the cases. Calculating the PCA of the power spectrum is a non-linear operation. Still, this operation could be performed in a feedforward manner in the human brain given what is currently known about physiology. Thus Torralba & Oliva concluded that animal versus non-animal categorization is so rapid because their summary statistic approach does not require an explicit image segmentation step.
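To make the "PCA on the power spectrum" idea concrete, here is a minimal sketch in Python with NumPy. It is not Torralba & Oliva's actual pipeline: the random stand-in images, the log scaling, and the helper names `power_spectrum` and `pca_project` are assumptions made for illustration only. It shows why the overall mapping from pixels to features is non-linear (the squared modulus of the Fourier transform), even though the PCA projection itself is a linear map applied to the spectra.

```python
import numpy as np

def power_spectrum(image):
    """Return the centred 2D power spectrum |F|^2 of a grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(image))
    return np.abs(f) ** 2  # the squared modulus is the non-linear step

def pca_project(spectra, n_components):
    """Project flattened spectra onto their first principal components."""
    x = spectra.reshape(len(spectra), -1)
    x = x - x.mean(axis=0)  # centre each feature
    # SVD of the centred data: the rows of vt are the principal axes
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T

rng = np.random.default_rng(0)
images = rng.standard_normal((20, 32, 32))  # stand-in data, not real scenes
# Log scaling is an illustrative choice to compress the dynamic range
spectra = np.log1p(np.array([power_spectrum(im) for im in images]))
features = pca_project(spectra, n_components=3)
print(features.shape)
```

Each image ends up described by only three numbers; a simple classifier trained on such features would be one way to attempt the animal versus non-animal decision without any segmentation step.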
Thorpe, Fize & Marlot published a study in Nature which exerted a very strong influence on the object recognition community. In their paper they showed that
---- please select ---- human observers; monkeys; cats
could decide whether a previously unseen
---- please select ---- photograph; line drawing; painting
of a natural scene contained an animal or not. The median reaction time (RT) of the observers was around
---- please select ---- 400-500; 100-200; 200-300; 500-600; 600-700
ms with a mean accuracy of
---- please select ---- 85-90; 90-95; 95-100; 80-85
% correct (note that the observers showed
---- please select ---- a slight; no; a strong
speed-accuracy trade-off). Subsequent
---- please select ---- ERP; fMRI; PET; multi-unit; single-unit
analyses showed that roughly
---- please select ---- 150; 100; 200; 250
ms after stimulus onset the measured neurophysiological correlate could already reliably signal the presence or absence of an animal in a post-hoc analysis. Thus processing of the natural scene stimulus was already completed after such a comparatively short time. According to the authors this result provides strong evidence in favour of essentially
---- please select ---- feedforward; dynamic feedback; multi-level feedforward & feedback; feedback; deep-belief neural network
theories of visual object recognition. This, in turn, argues against object recognition theories requiring an explicit
---- please select ---- image segmentation; 2D-to-3D; Fourier transform; Wavelet transform; multi-scale image decomposition
step prior to recognition, as such a step is presumed to require
---- please select ---- time consuming; fast; computationally complex
---- please select ---- iterative; feedback; feedforward; non-linear processing; linear decomposition; fast Fourier
algorithms.
Thorpe, Fize & Marlot published a study in Nature which exerted a very strong influence on the object recognition community. In their paper they showed that human observers could decide whether a previously unseen photograph of a natural scene contained an animal or not. The median reaction time (RT) of the observers was around 400-500 ms with a mean accuracy of 90-95 % correct (note that the observers showed a slight speed-accuracy trade-off). Subsequent ERP analyses showed that roughly 150 ms after stimulus onset the measured neurophysiological correlate could already reliably signal the presence or absence of an animal in a post-hoc analysis. Thus processing of the natural scene stimulus was already completed after such a comparatively short time. According to the authors this result provides strong evidence in favour of essentially feedforward theories of visual object recognition. This, in turn, argues against object recognition theories requiring an explicit image segmentation step prior to recognition, as such a step is presumed to require time-consuming iterative or feedback algorithms (both "iterative" and "feedback" appear to be accepted as correct).
Wichmann, Drewes, Rosas and Gegenfurtner published a paper
---- please select ---- casting doubt on; confirming
the conclusions made by Torralba & Oliva. The two main conclusions of the study – based on
---- please select ---- computational analysis; psychophysical experiments; neuro-imaging techniques
– were, first, that for human observers animal detection in typical photographs of natural scenes
---- please select ---- is independent of the power spectrum; relies on many PCA components of the power spectrum; depends on the power spectrum as claimed by Torralba & Oliva; is independent of the phase spectrum
. Second, the results may indicate that in typical commercial databases the statistics of the images may
---- please select ---- be as; not be as; even be more
natural as/than often presumed, because photographs typically represent a
---- please select ---- true; random sample; Gaussian sample; biased; unbiased
view of the world.
Wichmann, Drewes, Rosas and Gegenfurtner published a paper casting doubt on the conclusions made by Torralba & Oliva. The two main conclusions of the study – based on psychophysical experiments – were, first, that for human observers animal detection in typical photographs of natural scenes is independent of the power spectrum. Second, the results may indicate that in typical commercial databases the statistics of the images may not be as natural as often presumed, because photographs typically represent a biased view of the world.
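One way to picture what "independent of the power spectrum" means is to build stimuli whose power spectra carry no category information at all. The sketch below gives every image the same (average) amplitude spectrum while keeping each image's own phase. This is an illustrative assumption about how such stimuli can be constructed, not a description of the exact procedure used in the study; the function name `equalise_amplitude` and the random stand-in images are hypothetical.

```python
import numpy as np

def equalise_amplitude(images):
    """Give all images the mean amplitude spectrum, keeping individual phase."""
    f = np.fft.fft2(images, axes=(-2, -1))
    mean_amp = np.abs(f).mean(axis=0)   # average amplitude over the image set
    phase = np.angle(f)                 # per-image phase is left untouched
    out = np.fft.ifft2(mean_amp * np.exp(1j * phase), axes=(-2, -1))
    return out.real                     # imaginary part is numerical noise only

rng = np.random.default_rng(1)
images = rng.standard_normal((8, 16, 16))  # stand-in data, not real photographs
equalised = equalise_amplitude(images)

# After equalisation the amplitude (and hence power) spectra are identical
amp0 = np.abs(np.fft.fft2(equalised[0]))
amp1 = np.abs(np.fft.fft2(equalised[1]))
print(np.allclose(amp0, amp1))
```

If observers still detect animals equally well in images treated like this, a classifier that relies only on the power spectrum cannot explain their performance.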
The average and distribution of properties, like orientation or color, over a set of objects or a region in a scene are called the
---- please select ---- ensemble statistics; guiding features; None of the two answers is true
of the scene, and is/are computed by the
---- please select ---- selective; None of the two answers is true; nonselective
pathway.
The average and distribution of properties, like orientation or color, over a set of objects or a region in a scene are called the ensemble statistics of the scene, and are computed by the nonselective pathway.
What are the two pathways to scene perception?
(max 380 characters)
There are two pathways for scene perception: a selective and a nonselective pathway. The selective pathway involves the allocation of attention to one or a few objects at a time and is governed by the attentional bottleneck. Because objects are processed selectively in this pathway, it is responsible for visual search and binding, and it gives rise to phenomena such as the attentional blink, change blindness, and inattentional blindness. The nonselective pathway, on the other hand, processes visual scenes holistically, encoding scene gist, spatial layout, and ensemble statistics very quickly. The representations in the nonselective pathway are generated as a whole and do not include descriptions of individual objects within the scene. The nonselective pathway has connections with the selective pathway and can, for instance, guide visual search for particular objects in a scene by helping the observer restrict attention to particular locations in the scene.
What are ensemble statistics?
(50 characters)
Ensemble statistics are rapidly extracted representations of visual scenes that include the average and distribution of properties like orientation or color over a set of objects or a region of space. Ensemble statistics represent knowledge about the properties of a group of objects rather than individual objects themselves.
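As a toy illustration of "average and distribution of a property over a set of objects", the following sketch summarises a set of object orientations by their mean and a normalised histogram. It is only a computational analogy, not a model of the nonselective pathway; the function name `ensemble_stats`, the bin count, and the example values are assumptions made for illustration.

```python
import numpy as np

def ensemble_stats(orientations_deg, n_bins=6):
    """Summarise orientations (degrees) by their mean and distribution."""
    angles = np.mod(np.asarray(orientations_deg, dtype=float), 180.0)
    theta = np.deg2rad(angles)
    # Orientation is axial (0 and 180 degrees are the same), so the mean is
    # taken over doubled angles and then halved again.
    mean_vec = np.mean(np.exp(2j * theta))
    mean_deg = (np.rad2deg(np.angle(mean_vec)) / 2.0) % 180.0
    hist, edges = np.histogram(angles, bins=n_bins, range=(0.0, 180.0))
    return mean_deg, hist / hist.sum(), edges

# Three hypothetical objects tilted around the vertical
mean_deg, distribution, edges = ensemble_stats([80.0, 90.0, 100.0])
print(mean_deg, distribution)
```

The summary describes the group as a whole: neither the mean nor the histogram says which individual object has which orientation.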
---- please select ---- Physical organization; Spatial layout; Physical setting; Spatial organization; Setting
describes the structure of a scene without reference to the identity of specific objects in the scene.
Spatial layout describes the structure of a scene without reference to the identity of specific objects in the scene.