Binocular disparity



Binocular disparity refers to the difference in image location of an object seen by the left and right eyes, resulting from the eyes' horizontal separation. The brain uses binocular disparity to extract depth information from the two-dimensional retinal images in stereopsis. In computer vision, binocular disparity refers to the same difference seen by two different cameras instead of eyes.

Definition
Human eyes are horizontally separated by about 50-75 mm (the interpupillary distance), depending on the individual. Thus, each eye has a slightly different view of the world. This can easily be seen by alternately closing one eye while looking at a vertical edge: the edge appears to shift horizontally between the two views.

Fixating an object means moving the eyes so that the image of the object falls on the fovea, the spot of sharpest vision, in both eyes. This is called a vergence movement. After the vergence movement, the fixated object is said to be projected to corresponding points on the two retinae. Because the left and right eyes observe different views, objects that are not fixated (lying in front of or behind the point of fixation) do not fall on corresponding retinal points.

In visual neuroscience, binocular disparity is defined as the difference between the positions of the corresponding point and the actual point of projection in one of the eyes, and is usually expressed in degrees of visual angle.

Fig. 1: The full black circle is the point of fixation. The blue object lies nearer to the observer and therefore has a "near" disparity dn. Objects lying farther away (green) correspondingly have a "far" disparity df. Binocular disparity is the angle between two lines of projection in one eye. One is the real projection from the object to the actual point of projection. The other is the imaginary projection running through the focal point of the lens of that eye to the point corresponding to the actual point of projection in the other eye. For simplicity, both objects here lie on the line of fixation of one eye, so that the imaginary projection ends directly on the fovea of the other eye; in general, the fovea acts at most as a reference. Note that far disparities are smaller than near disparities for objects at the same distance from the fixation point.
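The closing note of the caption can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not part of the figure: the fixation point and the object both lie on the midline between the eyes, and disparity is taken as the difference between the vergence angle needed for the object and the one needed for the fixation point. The function names, the 63 mm interpupillary distance and the viewing distances are illustrative choices only.

```python
import math

def vergence_angle_deg(distance_mm, ipd_mm=63.0):
    """Angle between the two lines of sight when both eyes point at a
    target on the midline at the given distance (assumed geometry)."""
    return math.degrees(2.0 * math.atan(ipd_mm / (2.0 * distance_mm)))

def disparity_deg(object_mm, fixation_mm, ipd_mm=63.0):
    """Disparity of an object relative to the fixation point.
    Positive for nearer objects, negative for farther ones."""
    return (vergence_angle_deg(object_mm, ipd_mm)
            - vergence_angle_deg(fixation_mm, ipd_mm))

# Fixation at 500 mm; one object 100 mm nearer, one 100 mm farther.
near = disparity_deg(400.0, 500.0)
far = disparity_deg(600.0, 500.0)
print(f"near: {near:.2f} deg, far: {far:.2f} deg")
```

With these assumed values the near disparity is roughly 1.8 degrees and the far disparity roughly -1.2 degrees, so the far disparity is smaller in magnitude even though both objects are 100 mm from the fixation point.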

In computer stereo vision, there is no interpupillary distance. Instead, there is a variable distance between the two cameras, called the baseline. Disparity increases as the baseline increases, because the views of the two cameras differ more and more.

Tricking neurons with 2D images
Brain cells (neurons) in a part of the brain responsible for processing visual information coming from the retinae (the primary visual cortex) can detect the existence of disparity in their input from the eyes. Specifically, these neurons are active if an object with "their" preferred disparity lies within the part of the visual field to which they have access (their receptive field).



Researchers investigating the precise properties of these neurons with respect to disparity present visual stimuli with different disparities to the cells and observe whether they are active or not. One way to present stimuli with different disparities is to place objects at varying depths in front of the eyes, but this has many drawbacks and is not precise enough. Instead, neuroscientists use a trick, as schematised in Fig. 2.

Fig. 2: The disparity of an object at a different depth from the fixation point can alternatively be produced by presenting an image of the object to one eye and a laterally shifted version of the same image to the other eye. The full black circle is the point of fixation. Objects at varying depths are placed along the line of fixation of the left eye. The same disparity produced by a shift in depth of an object (filled coloured circles) can also be produced by laterally shifting the object at constant depth in the picture one eye sees (black circles with coloured margin). Note that for near disparities the lateral shift has to be larger than for far disparities to correspond to the same depth. This is what neuroscientists usually do with random dot stimuli to study the disparity selectivity of neurons, but the principle has also been applied in magic eye illusions.

Computing disparity using digital stereo images
Disparity computation is commonly done after the process of image rectification. Rectified stereo images have disparity only in the horizontal direction (i.e. there is no disparity in the y image coordinates). This property can also be achieved by precise alignment of the stereo cameras (see epipolar geometry). It is important to note that disparity is usually computed as the shift to the left of an image feature when viewed in the right image. For example, a single point that appears at the x coordinate t (measured in pixels) in the left image may be present at the x coordinate t - 3 in the right image. In this case, the disparity at that location in the right image would be 3 pixels.

After rectification, a simple computational measure such as the sum of absolute differences (SAD) can be used to compute disparities at each pixel in the right image. This is achieved by taking a "patch" (often square) of pixels in the left image and finding the corresponding patch at each valid disparity in the right image. For example, for a disparity of 0, the two patches are at exactly the same location in both images. Similarly, for a disparity of 1, the patch in the right image is moved 1 pixel to the left of its position for a disparity of 0. The absolute difference is then computed for corresponding pixels in each patch (e.g. the pixel corresponding to the top-left pixel in the left patch is simply the top-left pixel in the right patch). These absolute differences are then summed to give the final SAD score. After this SAD score has been computed for all valid disparities (normally a pre-determined range, e.g. 0-25), the disparity that yielded the lowest SAD score is taken to be the disparity at that location in the right image.
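The search described above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: images are assumed to be rectified grayscale images stored as lists of rows, the names sad and disparity_at are hypothetical, and the result is reported for the position of the patch in the left image, which is an indexing choice of this sketch. The caller is assumed to pick (x, y) far enough from the image border for the left patch to fit.

```python
def sad(left, right, x, y, d, half):
    """Sum of absolute differences between the patch centred at (x, y) in
    the left image and the patch centred at (x - d, y) in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x - d + dx])
    return total

def disparity_at(left, right, x, y, max_disp, half=1):
    """Return the disparity in 0..max_disp with the lowest SAD score."""
    best_d, best_score = 0, None
    for d in range(max_disp + 1):
        if x - d - half < 0:  # shifted patch would leave the right image
            break
        score = sad(left, right, x, y, d, half)
        if best_score is None or score < best_score:
            best_d, best_score = d, score
    return best_d

# Synthetic test pair: the left image is the right image shifted 3 pixels
# to the right, so a feature at x in the left image sits at x - 3 in the right.
width, height, shift = 16, 8, 3
right = [[(7 * x * x + 13 * y) % 256 for x in range(width)] for y in range(height)]
left = [[right[y][x - shift] if x >= shift else 0 for x in range(width)]
        for y in range(height)]

print(disparity_at(left, right, x=10, y=4, max_disp=6))  # prints 3
```

The cost of this direct approach grows with the number of pixels, the number of candidate disparities and the patch area, since every patch is summed from scratch.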

The method described above is a brute-force search algorithm. With large patch and/or image sizes, this technique can be very slow. However, because the patches at disparity d and disparity d + 1 overlap significantly, much of the work from computing the previous SAD score can be reused. Techniques that save this intermediate information can greatly improve the efficiency of the search.
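One common way to reuse work, shown here as an illustrative sketch and not necessarily the technique the text has in mind, is to compute the per-pixel absolute differences once for each disparity and then obtain every patch sum from a summed-area table (integral image) in constant time. Overlapping patches then share the same per-pixel differences instead of recomputing them. All function names are hypothetical, and images are again assumed to be rectified grayscale images stored as lists of rows.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def box_sum(table, x0, y0, x1, y1):
    """Sum of img[y][x] for x0 <= x <= x1 and y0 <= y <= y1."""
    return (table[y1 + 1][x1 + 1] - table[y0][x1 + 1]
            - table[y1 + 1][x0] + table[y0][x0])

def disparity_map(left, right, max_disp, half=1):
    """Lowest-SAD disparity per left-image pixel (None where no patch fits)."""
    h, w = len(left), len(left[0])
    best_score = [[None] * w for _ in range(h)]
    best_disp = [[None] * w for _ in range(h)]
    for d in range(max_disp + 1):
        # Absolute differences for this disparity, computed once per pixel.
        # Columns x < d have no counterpart in the right image; they are
        # filled with 0 and never read by the loop below.
        diff = [[abs(left[y][x] - right[y][x - d]) if x >= d else 0
                 for x in range(w)] for y in range(h)]
        table = integral_image(diff)
        for y in range(half, h - half):
            for x in range(d + half, w - half):
                score = box_sum(table, x - half, y - half, x + half, y + half)
                if best_score[y][x] is None or score < best_score[y][x]:
                    best_score[y][x] = score
                    best_disp[y][x] = d
    return best_disp

# Synthetic pair: the left image is the right image shifted 3 pixels right.
width, height, shift = 16, 8, 3
right = [[(7 * x * x + 13 * y) % 256 for x in range(width)] for y in range(height)]
left = [[right[y][x - shift] if x >= shift else 0 for x in range(width)]
        for y in range(height)]

disp = disparity_map(left, right, max_disp=6)
print(disp[4][10])  # prints 3
```

With this arrangement the cost per pixel and disparity no longer depends on the patch size, at the price of storing one difference image and one table per disparity step.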

Uses of disparity in computer vision
Knowledge of disparity can be used to extract further information from stereo images. Disparity and distance from the camera are inversely related: as the distance from the camera increases, the disparity decreases. This allows for depth perception in stereo images. Using geometry and algebra, the points that appear in the 2D stereo images can be mapped to coordinates in 3D space.
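For a rectified pair of identical pinhole cameras, the usual form of this mapping is Z = f * B / d, where f is the focal length in pixels, B the baseline and d the disparity in pixels. The sketch below applies that relation; the function name, the camera parameters and the principal point (cx, cy) are assumed example values, not taken from any particular camera or library.

```python
def point_from_disparity(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Map pixel (u, v) with a given disparity to 3D coordinates (metres)
    in the frame of the reference camera, assuming a rectified pinhole pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to recover a finite depth")
    z = focal_px * baseline_m / disparity_px  # depth falls as disparity grows
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z

# Assumed example camera: 700 px focal length, 12 cm baseline,
# principal point at the centre of a 640 x 480 image.
params = dict(focal_px=700.0, baseline_m=0.12, cx=320.0, cy=240.0)

near_point = point_from_disparity(320.0, 240.0, 42.0, **params)
far_point = point_from_disparity(320.0, 240.0, 21.0, **params)
print(near_point[2], far_point[2])  # depths of about 2 m and 4 m
```

Halving the disparity doubles the recovered depth, which is the inverse relation described above. The same formula also shows why a larger baseline produces a larger disparity for an object at a fixed distance.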