
by David

Problems of structural description theories

Some researchers believe that a major problem with structural description theories of object recognition is that

Object recognition overview

Many researchers in vision science believe that object recognition is one of the most important functions of the human visual system. Thus it is perhaps not surprising that there exists a large body of research on object recognition. One dominant approach – usually referred to as ---- please select ----

3D-based; Neuroscience-based; object recognition; view based; picture based; 2D-based; global statistic; Gestalt; summary statistic; structural description; early vision based

models – postulates a "visual alphabet" made from 3D geometric primitives. The most prominent theory of this kind is called ______, published in ---- please select ----

2010; 2003; 1968; 1996; 1982; 1987; 1971; 1975; 1979

by _____ [surname only!]. An opposing theory – usually referred to as ---- please select ----

Gestalt; picture based; global statistic; structural description; 3D-based; summary statistic; view based; Neuroscience-based; object recognition; early vision based; 2D-based

models – instead holds that the human visual system recognizes objects by storing and matching "2D-images" or "snap-shots" of objects, interpolating and extrapolating between them when an object must be recognized from a novel viewpoint. One of the most well-known proponents of this theory is _________ [surname only!]. ---- please select ----

Luckily; Nicely; Convincingly; Unfortunately

both theories are ---- please select ----

not; well; intimately; necessarily

connected to the findings, theories and models of early spatial vision.

Very recently DiCarlo, Cox and colleagues have argued for a computational neural network approach to explain object recognition. This can be seen as a neuroscience-inspired computational instantiation of the ---- please select ----

view-based; structural description; Gestalt; summary statistic; global statistic; early vision based

approach to object recognition.

Many researchers in vision science believe that object recognition is one of the most important functions of the human visual system. Thus it is perhaps not surprising that there exists a large body of research on object recognition. One dominant approach – usually referred to as structural description models – postulates a "visual alphabet" made from 3D geometric primitives. The most prominent theory of this kind is called recognition-by-components, published in 1987 by Biederman. An opposing theory – usually referred to as view based models – instead holds that the human visual system recognizes objects by storing and matching "2D-images" or "snap-shots" of objects, interpolating and extrapolating between them when an object must be recognized from a novel viewpoint. One of the most well-known proponents of this theory is Tarr. Unfortunately, both theories are not connected to the findings, theories and models of early spatial vision.

Yamins et al.’s HMO-Model (PNAS 2014)

Yamins and colleagues from the DiCarlo lab at MIT published an article in 2014 in which they presented their HMO model, standing for ______. The HMO model belongs in the larger class of

---- please select ---- [DDN; NDD; NND; DNN; DND]

models, standing for _______ model. The HMO model's essential architectural characteristic is its

---- please select ---- [computational complexity; heterogeneity; harmony; homogeneity]:

There are, for example, many

---- please select ---- [bypass; feedback]

connections and different parameter settings

---- please select ---- [only at different levels of the hierarchy; even at the same level of the hierarchy].

---- please select ---- [However; In addition; Furthermore],

the basic operations performed locally are

---- please select ---- [the same; different; heterogeneous; more or less the same]

throughout the network. In the Yamins et al. (2014) article they report a large-scale modelling effort, evaluating around

---- please select ---- [5000; 100; 500; 1,000; 10,000; 100,000]

---- please select ---- [DNN parameterizations; DNN architectures; HMO model parameterizations; HMO architectures].

Yamins et al. compared their models both to the response of cells in IT cortex (roughly N =

---- please select ---- [1,000; 300; 3,000; 100; 30]

cells) as well as on how well the models categorized a set of images (roughly N =

---- please select ---- [600; 10,000; 3,000; 6,000; 1,000]

images). One central finding was that models optimized for

---- please select ---- [explained variance in IT; categorization performance; IT-cell predictivity; discrimination performance]

were also superior at

---- please select ---- [explaining variance in IT; categorization performance; IT-cell predictivity; discrimination performance].

In order to obtain a categorization performance from the HMO model, a

---- please select ---- [linear; non-linear; partially linear]

decoder was

---- please select ---- [derived from; trained on; assumed to exist; taken from]

the activity of units at the

---- please select ---- [highest; intermediate; lowest; across]

level(s) of the HMO network. Using such a procedure, the HMO model's performance was

---- please select ---- [better; only slightly worse; on par]

than that of

---- please select ---- [both computer vision and neuronally inspired; computer vision; neuronally inspired]

models of object recognition on the difficult

---- please select ---- [low; high]

variation task.

Yamins and colleagues from the DiCarlo lab at MIT published an article in 2014 in which they presented their HMO model, standing for hierarchical modular optimization. The HMO model belongs in the larger class of DNN models, standing for deep neural network model. The HMO model's essential architectural characteristic is its heterogeneity: There are, for example, many bypass connections and different parameter settings even at the same level of the hierarchy. However, the basic operations performed locally are the same throughout the network. Yamins et al. (2014) report a large-scale modelling effort, evaluating around 5000 DNN architectures. They compared their models both to the response of cells in IT cortex (roughly N = 300 or 100 cells) as well as on how well the models categorized a set of images (roughly N = 6,000 images). One central finding was that models optimized for categorization performance were also superior at explaining variance in IT.

In order to obtain a categorization performance from the HMO model, a linear decoder was trained on the activity of units at the highest level(s) of the HMO network. Using such a procedure, the HMO model's performance was better than that of both computer vision and neuronally inspired models of object recognition on the difficult high variation task.
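The idea of a linear decoder reading out category labels from top-level unit activity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "activity" is synthetic Gaussian data rather than HMO output, the function names are hypothetical, and the perceptron rule stands in for whatever linear classifier the authors actually used.

```python
import random


def train_linear_decoder(features, labels, epochs=50, lr=0.1):
    """Fit weights w and bias b of a linear readout with the perceptron rule."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):  # y is +1 or -1
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the decision boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def decode(w, b, x):
    """Return the predicted category (+1 or -1) for one activity vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def make_synthetic_activity(n_per_class, n_units, rng):
    """Hypothetical top-level activity: two categories with shifted means."""
    features, labels = [], []
    for label, mean in ((1, 1.0), (-1, -1.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(mean, 0.5) for _ in range(n_units)])
            labels.append(label)
    return features, labels


rng = random.Random(0)
train_x, train_y = make_synthetic_activity(100, 10, rng)
test_x, test_y = make_synthetic_activity(100, 10, rng)

w, b = train_linear_decoder(train_x, train_y)
accuracy = sum(decode(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The decoder itself only computes a weighted sum plus a bias, so any categorization performance it reaches must already be linearly available in the activity it reads from. That is why a linear readout is a useful probe of a representation.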

Viewpoint invariance

Viewpoint invariance refers to the idea that

Subordinate level category terms

Which of the following is a subordinate level category term?

Time required for object recognition

Tarr and his colleagues found that the amount of time needed to recognize novel objects is at least partially determined by

Fundamentals of view-based theories

What are object representations made of, according to view-based theories of object recognition?

Specificity of IT cells

A study of cells in IT cortex of a human patient showed that they responded to very specific stimuli, such as

Properties of generalized-cone components in RBC Theory (2)

The essential non-accidental properties of the generalized-cone components are:

(more than one can be true)

Superordinate level category terms

Which of the following is a superordinate level category term?

Stages of visual processing

Which of the following is a loosely defined stage of visual processing that comes after basic features have been extracted from the image, and before object recognition and scene understanding?

Entry-level categorization terms

Which of the following is an entry-level category term?

Fundamentals of RBC theory

What are object representations made of, according to the recognition-by-components (RBC) model of object recognition?

Give examples:

Superordinate level:
Basic level:
Subordinate level:

Superordinate level: Animal
Basic level: Dog
Subordinate level: Golden Retriever

Failure of object recognition

---- please select ---- [Prosopagnosia; Agnosia; Anomia; Alexia; Dyslexia] is a failure to recognize objects in spite of the ability to see them.

Agnosia

Central empirical findings about human object recognition

Typically … (choose all that are true)

Problems of view-based theories

A major problem with naïve template theories of object recognition is that

Prosopagnosia

Prosopagnosia is a neuropsychological disorder in which the patient

Properties of generalized-cone components in RBC Theory (1)

One central property of the generalized-cone components is that they have so-called ---- please select ---- [generic; non-accidental; non-generic; accidental] properties, because these properties would ---- please select ---- [rarely; often; always; never] be produced by ---- please select ---- [non-accidental; non-generic; generic; accidental] alignments of viewpoint and object features. Thus RBC theory claims that certain properties of the ---- please select ---- [3D; 1D; 2D] image are taken by the visual system as strong evidence that the edges in the ---- please select ---- [3D; 1D; 2D] world contain the same properties.

One central property of the generalized-cone components is that they have so-called non-accidental properties, because these properties would rarely be produced by accidental alignments of viewpoint and object features. Thus RBC theory claims that certain properties of the 2D image are taken by the visual system as strong evidence that the edges in the 3D world contain the same properties.
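Collinearity, one of the non-accidental properties, can serve as a small numerical illustration of this argument. The sketch below is only a toy under stated assumptions: orthographic projection, viewpoints drawn from uniformly random yaw and pitch angles (not a uniform sample of the viewing sphere), an arbitrary tolerance, and hypothetical function names. Three points on a straight 3D edge look collinear from every viewpoint, whereas three points on a bent edge look collinear only from rare, accidental viewpoints.

```python
import math
import random


def rotate(p, yaw, pitch):
    """Rotate a 3D point about the y axis (yaw), then about the x axis (pitch)."""
    x, y, z = p
    x, z = x * math.cos(yaw) + z * math.sin(yaw), -x * math.sin(yaw) + z * math.cos(yaw)
    y, z = y * math.cos(pitch) - z * math.sin(pitch), y * math.sin(pitch) + z * math.cos(pitch)
    return (x, y, z)


def project(p):
    """Orthographic projection onto the image plane: drop the depth coordinate."""
    return (p[0], p[1])


def collinear_2d(a, b, c, tol=1e-2):
    """True if the triangle spanned by the three image points is nearly flat."""
    twice_area = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
    return twice_area < tol


def fraction_collinear(points, n_views, rng):
    """Fraction of random viewpoints from which the points look collinear."""
    hits = 0
    for _ in range(n_views):
        yaw = rng.uniform(0.0, 2.0 * math.pi)
        pitch = rng.uniform(0.0, 2.0 * math.pi)
        image = [project(rotate(p, yaw, pitch)) for p in points]
        hits += collinear_2d(*image)
    return hits / n_views


straight = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]  # one straight 3D edge
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]      # an L-shaped 3D edge

straight_fraction = fraction_collinear(straight, 2000, random.Random(1))
bent_fraction = fraction_collinear(bent, 2000, random.Random(1))
print(f"straight edge looks straight in {straight_fraction:.1%} of views")
print(f"bent edge looks straight in {bent_fraction:.1%} of views")
```

Because a straight image edge almost never arises from a bent 3D edge, seeing collinearity in the 2D image is strong evidence for collinearity in the 3D world, which is the inference RBC theory attributes to the visual system.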

Anatomy of object recognition

Evidence indicates that structures in ---- please select ---- [striate; parietal; occipital; frontal; inferotemporal] cortex are especially important in end-stage object recognition processes.

inferotemporal

Author

David

Information

Last changed
a year ago
