Paper: Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision?
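Rod cells are sensitive at low light levels; cones are sensitive at high light levels and to color.
Fixation relies on the fovea, the region with a high density of photoreceptors, so eye-movement control is needed (active and passive control; top-down or bottom-up).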
Center-Surround Receptive Fields
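A center-surround receptive field is often modeled as a difference of Gaussians (DoG). The notes do not name a model, so the DoG form, the kernel size, and the two sigmas below are assumptions chosen for illustration. The sketch shows that such a unit responds strongly to a small central spot and only weakly to uniform illumination.

```python
import math

def dog_kernel(size=9, sigma_center=1.0, sigma_surround=2.0):
    """ON-center difference-of-Gaussians kernel (illustrative parameters)."""
    half = size // 2

    def gauss(x, y, sigma):
        return math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

    return [[gauss(x, y, sigma_center) - gauss(x, y, sigma_surround)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def respond(kernel, patch):
    """Linear response: dot product of the kernel with an image patch."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

kernel = dog_kernel()
size = len(kernel)
mid = size // 2

# Uniform illumination: center and surround largely cancel.
uniform = [[1.0] * size for _ in range(size)]
# A 3x3 bright spot on the center: excites the center, barely touches the surround.
spot = [[1.0 if abs(x - mid) <= 1 and abs(y - mid) <= 1 else 0.0 for x in range(size)]
        for y in range(size)]

print("uniform:", respond(kernel, uniform))
print("spot:   ", respond(kernel, spot))
```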
V1: Double-Opponent Cells
5-10 percent of V1 cells are dedicated color-coding cells
Features: edges, bars, and gratings, i.e., linear oriented patterns.
simple and complex cells.
Gabor wavelets are a reasonable approximation of simple cells
Gabor wavelets have also been very successful in applications: image compression, image retrieval, and face recognition.
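As a minimal sketch of the Gabor approximation of a simple cell, the code below builds a Gabor kernel (a Gaussian envelope times a cosine carrier) and takes its dot product with a grating patch. All parameter values (size, wavelength, sigma) and the function names are illustrative assumptions, not taken from the paper. The matched orientation gives a large response, the orthogonal one a response close to zero.

```python
import math

def gabor_kernel(size=15, wavelength=6.0, theta=0.0, sigma=3.0, phase=0.0, gamma=1.0):
    """Gabor kernel: Gaussian envelope multiplied by a cosine carrier along theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def grating(size=15, wavelength=6.0, theta=0.0):
    """Cosine grating whose luminance varies along direction theta."""
    half = size // 2
    return [[math.cos(2 * math.pi * (x * math.cos(theta) + y * math.sin(theta)) / wavelength)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def simple_cell_response(kernel, patch):
    """Linear simple-cell model: dot product of kernel and patch."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

kernel = gabor_kernel(theta=0.0)
matched = simple_cell_response(kernel, grating(theta=0.0))
orthogonal = simple_cell_response(kernel, grating(theta=math.pi / 2))
print("matched orientation:   ", matched)
print("orthogonal orientation:", orthogonal)
```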
Some cells are sensitive to the end of a bar or edge, or to the border of a grating. Applications: pose estimation, object recognition, stereo, structure from motion.
line endings, motion, color
Binocular disparity. Applications: depth, gaze control, object grasping, and object recognition.
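On the computer vision side, the link between disparity and depth is usually written as Z = f * B / d for a rectified pinhole stereo pair. This relation is standard stereo geometry rather than something stated in the notes, and the function name and example values below are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a rectified pinhole stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

# Assumed example values: 700 px focal length, 6 cm baseline.
near = depth_from_disparity(disparity_px=20.0, focal_length_px=700.0, baseline_m=0.06)
far = depth_from_disparity(disparity_px=10.0, focal_length_px=700.0, baseline_m=0.06)
print(near, far)
```

Halving the disparity doubles the estimated depth, which is why small disparity errors matter most for distant points.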
Note that spatiotemporal features such as motion have been demonstrated to be the first features developmentally present in humans for recognizing objects (even sooner than color and orientation); spatiotemporal feature detection develops earliest.
V2: orientation, color, and disparity (relative disparity in V2; absolute disparity in V1).
New features of V2: more sophisticated contour representation, including texture-defined contours, illusory contours, and contours with border ownership.
Zhou et al. (2000) found that 18 percent of the cells in V1 and more than 50 percent of the cells in V2 and V4 (along the ventral pathway) respond or code according to the direction of the owner of the boundary.
V4 seems to combine input from the M as well as the P pathway
integrating lower level into higher level responses and increasing invariances
V4 cells respond to contours defined by differences in speed and/or direction of motion with an orientation selectivity that matches the selectivity to luminance-defined contours
hue is invariant to luminance
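A quick way to see this property in a standard color model: in HSV, scaling all three RGB channels by the same factor changes the value (brightness) but not the hue. The sketch uses Python's `colorsys` module; HSV is only a convenient stand-in here and is not claimed to be the code used by V4.

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue angle in degrees from RGB values in [0, 1]."""
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0

bright = (0.8, 0.4, 0.2)
dim = tuple(0.5 * c for c in bright)  # same color at half the luminance

print("hue bright:", hue_degrees(*bright))
print("hue dim:   ", hue_degrees(*dim))
print("value bright/dim:", colorsys.rgb_to_hsv(*bright)[2], colorsys.rgb_to_hsv(*dim)[2])
```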
mid-level representation of motion depth
eye movement control
MT Receptive fields are about 10 times larger than in V1
MT is retinotopically organized with motion and depth columns similar to orientation and ocular dominance columns in V1
MT cells are selective to higher order features of motion such as motion gradients, motion-defined edges, locally opposite motions, and motion-defined shapes
TEO is responsible for medium complexity features and it integrates information about the shapes and relative positions of multiple contour elements
10 Recognition 2: TE
A system capable of object recognition has to fulfill two seemingly conflicting requirements, i.e., selectivity and invariance.
On the one hand, neurons have to distinguish between different objects to provide information about object identity
On the other hand, this system also has to treat highly dissimilar retinal images of the same object as equivalent, and must therefore be insensitive to transformations in the retinal image that occur in natural vision (e.g., changes in position, illumination, retinal size, and so on).
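TE shows size invariance, cue invariance, position invariance, and occlusion invariance: an object is recognized regardless of its apparent size at different distances; seeing only a part of it is enough to recognize it; it is recognized across different poses and viewing angles; and it is recognized even when occluded.
11 Motion and action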
areas located in the dorsal stream are functionally related to different effectors:
LIP is involved in eye movements,
Medial Intraparietal Area (MIP) in arm movements,
AIP in hand movements (grasping), and
MST and VIP in body movements (self-motion)
Area MST is concerned with self-motion, both for movement of the head (or body) in space and movement of the eye in the head
The combination of retino-centric receptive fields and eye position modulation provides a population code in LIP that can represent the location of a stimulus in head-centric coordinates.
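The coordinate relation that such a population code has to capture can be sketched very simply: the head-centric location is the retinal location plus the eye-in-head position. This is only the geometric relation (small-angle, per-axis addition is an assumption); it does not model the neural gain modulation itself, and the function name is hypothetical.

```python
def head_centered(retinal_deg, eye_position_deg):
    """Add eye-in-head position to retinal location, per axis (degrees)."""
    return tuple(r + e for r, e in zip(retinal_deg, eye_position_deg))

# The same head-centric target seen under two different fixations:
fixation_a = head_centered(retinal_deg=(10.0, 0.0), eye_position_deg=(-5.0, 2.0))
fixation_b = head_centered(retinal_deg=(-5.0, 4.0), eye_position_deg=(10.0, -2.0))
print(fixation_a, fixation_b)
```

The retinal coordinates differ between the two fixations, yet the recovered head-centric location is the same.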
VIP is likely to be involved in self-motion, control of head movements, and the encoding of near-extrapersonal (head-centered) space, which links tactile and visual fields.
Activity of MIP neurons mainly reflects the movement plan toward the target and not merely the location of the target or visual attention evoked by the target appearance.
Some AIP neurons respond during object fixation and grasping, but not during grasping in the dark (visual-dominant neurons);
other AIP neurons do not respond during object fixation but only when the object is grasped, even in the dark (motor-dominant neurons);
a third class of AIP neurons responds during object fixation and grasping and during grasping in the dark (visuo-motor neurons).
AIP neurons are sensitive to the 2D and 3D features of the object and the shape of the hand (in a light or dark environment) relevant for grasping.
single-opponent cells in LGN establish the two color axes red-green and blue-yellow, thereby sharpening the wavelength tuning and achieving some invariance to luminance.
Double-opponent cells provide the means to take nearby colors into account for color contrast.
In V4, hue is encoded, which spans the full color space.
The final step is IT, where there exists an association of color with form.
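The first stage of this color pathway, the red-green and blue-yellow opponent axes, can be illustrated with a toy linear transform of RGB values. The specific weights below are a common textbook simplification and an assumption here, not the cone weights reported in the paper. Adding the same amount of light to all three channels changes the luminance channel but leaves both opponent channels unchanged, which is one simple form of luminance invariance.

```python
def opponent_channels(r, g, b):
    """Toy opponent transform: returns (luminance, red-green, blue-yellow)."""
    luminance = (r + g + b) / 3.0
    red_green = r - g
    blue_yellow = b - (r + g) / 2.0
    return luminance, red_green, blue_yellow

base = opponent_channels(0.6, 0.3, 0.1)
brighter = opponent_channels(0.8, 0.5, 0.3)  # same color with +0.2 added to every channel

print("base:    ", base)
print("brighter:", brighter)
```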
Dynamics, complexity, and the levels of abstraction of the various invariant features
Edges are the primary features used to represent objects, it seems.
In V1, they are defined as boundaries between dark and light or between different color hues;
in V2, contours may also be defined by texture boundaries and these cells respond to illusory contours;
in V4, contours may even be defined by differences in motion
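Possible solutions to the binding problem (which lines belong to the same object?) are tuning of cells to conjunctions of features, spatial attention, and temporal synchronization.
3D information: levels of depth processing: absolute and relative disparity; zero-, first-, and second-order disparity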
selectivity for zero-, first-, and second-order disparities can be measured
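One way to read the three orders is as the disparity value itself (zero-order, position in depth), its spatial gradient (first-order, slant or tilt), and the change of that gradient (second-order, curvature). The sketch below computes them with finite differences on a 1D disparity profile; the profiles and the function name are made up for illustration.

```python
def disparity_orders(profile):
    """Return (zero-, first-, second-order) terms of a 1D disparity profile."""
    first = [b - a for a, b in zip(profile, profile[1:])]   # disparity gradient
    second = [b - a for a, b in zip(first, first[1:])]      # change of the gradient
    return profile, first, second

flat = [2.0] * 5                              # fronto-parallel surface
slanted = [0.5 * x for x in range(5)]         # planar, slanted surface
curved = [0.25 * x * x for x in range(5)]     # curved surface

for name, profile in (("flat", flat), ("slanted", slanted), ("curved", curved)):
    zero, first, second = disparity_orders(profile)
    print(name, "first:", first, "second:", second)
```

Only the curved profile has a nonzero second-order term, and only the flat one has a zero first-order term.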
The coding of 3D shape in AIP is
faster (shorter latencies),
coarser (less sensitivity to discontinuities in the surfaces),
less categorical, and
more boundary based (less influence of the surface information)
compared to IT (3D shape coding in IT shows the opposite properties).
Analysis of motion in the primate visual system proceeds in a hierarchy from V1 (local spatiotemporal filtering) to MT (2D motion) to MST (self-motion, motion in world coordinates).
The representation shifts from one of motion in the visual field (V1, MT) to one of motion in the world and motion of oneself in the world (MST)
motion processing is combined with disparity (MT, MST), eye movement information (MST), and vestibular signals
Object recognition goes beyond simple 2D-shape perception in several aspects:
integration of different cues and modalities,
invariance to in-depth rotation and articulated movement, use of context (objects are recognized under different rotations, viewing angles, and during motion).
It is also important to distinguish between-class discrimination (object categorization) and within-class discrimination of objects.
edges can be defined by luminance in V1, by textures in V2, and by differences in motion in V4
A small fraction of IT neurons can exhibit some rotation invariance, and the speed of recognition of familiar objects does not depend on the rotation angle.
A particular case is face-sensitive neurons, which can show a rather large invariance to rotations in depth.
Representations of the same object under different angles are presumably combined into a rotation-invariant representation, much as simple cell responses might be combined into a complex cell response.
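A minimal sketch of this idea: a set of view-tuned units, each with Gaussian tuning around one preferred viewing angle, is pooled with a max operation. The max pooling, the Gaussian tuning, and the 30-degree spacing and width are all assumptions in the spirit of simple-to-complex cell pooling, not details given in the notes.

```python
import math

def view_tuned_response(view_deg, preferred_deg, width_deg=30.0):
    """Gaussian tuning around one preferred view, with circular angle difference."""
    diff = (view_deg - preferred_deg + 180.0) % 360.0 - 180.0
    return math.exp(-diff * diff / (2 * width_deg ** 2))

def pooled_response(view_deg, preferred_views):
    """Rotation-tolerant unit: max over the view-tuned units."""
    return max(view_tuned_response(view_deg, p) for p in preferred_views)

preferred_views = range(0, 360, 30)  # one view-tuned unit every 30 degrees

for view in (0, 45, 90, 180):
    single = view_tuned_response(view, 0.0)
    pooled = pooled_response(view, preferred_views)
    print(f"view {view:3d} deg  single unit: {single:.3f}  pooled: {pooled:.3f}")
```

The single unit responds only near its preferred view, while the pooled response stays high at every angle.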
Context plays a major role in object recognition. It can be of different nature (semantic, spatial configuration, or pose) and is, at least partially, provided by higher areas beyond IT.
Objects also help to recognize the context, and context may be defined on a crude statistical level.
visual information is combined with vestibular (in MST, VIP), auditory (in LIP), somatosensory (in VIP), and proprioceptive or motor feedback signals (MST and VIP for smooth eye movements, LIP for saccades, MST/VIP/7A/MIP for eye position)
LIP represents salience in the visual scene as a target signal for eye movements, MIP and AIP provide information for reaching (target signals) and grasping (shape signals). LIP and VIP provide information for the control of self-motion.