Abstract
Currently, sharp discontinuities in depth and occlusions in
multiview imaging systems pose serious challenges for many dense
correspondence algorithms. However, it is important for 3D
reconstruction methods to preserve depth edges as they correspond to
important shape features like silhouettes which are critical for
understanding the structure of a scene. In this paper we show how active illumination
algorithms can produce a rich set of feature maps that
are useful in dense 3D reconstruction.
We start by showing how to compute a qualitative depth map from a single camera, which encodes
relative object distances and serves as a useful prior for stereo. In a multiview setup, we show that, along with depth edges,
binocular half-occluded pixels can also be explicitly and reliably labeled.
To demonstrate the
usefulness of these feature maps, we show how they can be used in
two different algorithms for dense stereo correspondence.
Our experimental results show that our enhanced stereo algorithms
extract high-quality, discontinuity-preserving
correspondence maps from scenes that are extremely challenging for
conventional stereo methods.


Depth Edges with Multi-Flash
Small-baseline multi-flash illumination allows reliable and efficient depth edge detection. The key observation is that when a flash illuminates a scene during
image capture, thin slivers of cast shadow are created at depth
discontinuities. Thus, by shooting a sequence of images
in which different light sources illuminate the subject from
various positions, we can use the shadows in each image to assemble
a depth edge map.
See the NPR camera website for more details.
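The detection step above can be sketched as follows. This is a minimal illustration, not the full method: the `directions` argument (a per-flash pixel step pointing away from the light, i.e. the direction its shadow falls) and the ratio threshold are assumptions of this sketch.

```python
import numpy as np

def depth_edges(images, directions, shadow_thresh=0.8):
    """Sketch of multi-flash depth edge detection.

    images: list of HxW arrays, one per flash position.
    directions: per-image (dy, dx) step pointing away from that flash,
    i.e. the direction in which its cast shadow falls.
    """
    # the per-pixel max over all flash images is (nearly) shadow-free
    imax = np.maximum.reduce(images) + 1e-8
    edges = np.zeros(images[0].shape, bool)
    for img, (dy, dx) in zip(images, directions):
        ratio = img / imax            # ~1 where lit, low inside a shadow
        shadow = ratio < shadow_thresh
        # a depth edge sits where, stepping along the shadow direction,
        # a lit pixel transitions into shadow
        shifted = np.roll(shadow, (dy, dx), axis=(0, 1))
        edges |= shadow & ~shifted    # shadow pixel whose upstream neighbor is lit
    return edges
```

Each flash only reveals edges whose shadow falls away from that flash, which is why shadows from several flash positions are combined.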



Generating the Normal Map
We used our NPR output obtained with multi-flash imaging as a height field to create a normal map. The main advantage of our approach is that the NPR image captures bumpy features such as hair, wrinkles, and beard, allowing us to create fine-detail 3D models and NPR illustrations automatically.
To create the normal map, we first negate the NPR texture image so that darker regions of the height field are lower and lighter regions are higher. Then we compute the normals from the partial derivatives of the height-field surface, exactly as demonstrated in the CG book [2], page 203.
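The two steps above (negation, then normals from partial derivatives) can be sketched as follows; the input is assumed to be a grayscale image in [0, 1]:

```python
import numpy as np

def normal_map(npr_image):
    """Treat the negated NPR image as a height field and derive
    per-pixel unit normals from its partial derivatives."""
    height = 1.0 - npr_image          # darker regions -> lower height
    dhdy, dhdx = np.gradient(height)  # partial derivatives of the height field
    # the surface (x, y, h(x, y)) has un-normalized normal (-dh/dx, -dh/dy, 1)
    n = np.dstack((-dhdx, -dhdy, np.ones_like(height)))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n
```

A flat input yields normals pointing straight out of the image plane; a brightness ramp tilts them away from the brighter (higher) side.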
Tangent-space Bump Mapping
Since we are dealing with arbitrary geometry, we need to align the coordinate system of the normal vectors in the normal map (tangent space) with the coordinate system of the light vectors. This is done by creating a rotation matrix for each vertex whose columns are the corresponding tangent, binormal, and normal vectors (see the CG book [2], page 225, for how to compute these vectors). We transform both the light vector and the normal-map vectors into the same consistent eye-space coordinate system.
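The per-vertex rotation described above can be sketched as below: with the tangent, binormal, and normal as columns, the matrix maps tangent-space vectors (e.g. normals fetched from the normal map) into the space the basis vectors are expressed in.

```python
import numpy as np

def tbn_matrix(tangent, binormal, normal):
    """Rotation matrix with columns T, B, N: maps tangent-space
    vectors into the coordinate system of the basis vectors."""
    return np.column_stack((tangent, binormal, normal))

# usage sketch: an unperturbed normal-map entry points "straight up"
# in tangent space and must map onto the vertex normal
T = np.array([1.0, 0.0, 0.0])
B = np.array([0.0, 1.0, 0.0])
N = np.array([0.0, 0.0, 1.0])
up = np.array([0.0, 0.0, 1.0])   # flat normal-map texel
eye_space_normal = tbn_matrix(T, B, N) @ up
```

Because the matrix is a rotation (orthonormal columns), its transpose performs the inverse mapping, which is how a light vector can instead be brought into tangent space.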
Results
From left to right: 3D model, bump mapping with small scale, bump mapping with large scale.
From left to right: Texture-mapped 3D model, fine-detail 3D model with bumpy features created automatically, non-photorealistic illustration with large-scale bump mapping.

Occlusion Map
Binocular half-occlusion points are those that are visible in only one of the two views provided by a
binocular imaging system.
They are a major source of errors in stereo matching algorithms, because half-occluded points have no correspondence
in the other view, leading to false disparity estimates. By placing the light sources close to the center of projection of each camera,
we can use the length of the shadows created by the lights surrounding the other camera to bound the half-occluded regions. This allows us to segment occlusions in both textured and non-textured regions, without solving the correspondence problem.
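A conservative version of this idea can be sketched as follows. The sketch assumes a shadow-free reference image (e.g. the max over all flash images) and labels as half-occluded only those pixels shadowed in every image lit from around the other camera; the threshold is an assumption of this illustration.

```python
import numpy as np

def half_occlusion_map(flash_images, reference, threshold=0.8):
    """Sketch: with flashes placed near the other camera's center of
    projection, their cast shadows cover the half-occluded region.

    flash_images: images lit by the flashes surrounding the other camera.
    reference: shadow-free image (e.g. per-pixel max over all flashes).
    """
    shadows = [img / (reference + 1e-8) < threshold for img in flash_images]
    # intersection of the shadows: a conservative bound on the
    # half-occluded region, requiring no correspondence search
    return np.logical_and.reduce(shadows)
```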


Depth Edge Preserving Stereo
We now demonstrate the usefulness of these feature maps by incorporating them into two different dense stereo
correspondence algorithms, one based on local search and the other
on belief propagation.
Enhanced Local Stereo. We perform local correlation with a sliding window that varies in shape and size according to depth edges and occlusions.
To determine the size and shape of the window for each pixel,
we find the set of pixels that have approximately the same disparity as the center pixel
of the window.
This is achieved by a region-growing algorithm (starting at the center pixel)
that uses depth edges and half-occluded points
as boundaries. Only this set of pixels is then used for matching in the other view; the remaining pixels in the window are disregarded, since they correspond to a different disparity.
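The region-growing step above can be sketched as a flood fill bounded by the feature maps; the window half-size and 4-connectivity are assumptions of this sketch.

```python
from collections import deque
import numpy as np

def support_window(depth_edges, occluded, center, half=5):
    """Grow the set of pixels assumed to share the center pixel's
    disparity: flood-fill from the window center, stopping at depth
    edges, half-occluded pixels, and the window border."""
    cy, cx = center
    h, w = depth_edges.shape
    mask = np.zeros((h, w), bool)
    mask[cy, cx] = True
    q = deque([center])
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (abs(ny - cy) <= half and abs(nx - cx) <= half
                    and 0 <= ny < h and 0 <= nx < w
                    and not mask[ny, nx]
                    and not depth_edges[ny, nx]   # stop at depth edges
                    and not occluded[ny, nx]):    # and at half-occlusions
                mask[ny, nx] = True
                q.append((ny, nx))
    return mask
```

Only the pixels in the returned mask contribute to the correlation score, so support never straddles a depth discontinuity.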
From left to right: left view, ground truth, standard local stereo, our local approach.








From left to right: standard global stereo, our global approach.
Enhanced Global Stereo. The best results in stereo matching thus far are achieved by global methods, particularly
those based on belief propagation and graph cuts. We use the qualitative depth and occlusion maps as prior information for these methods, so that smoothness constraints are stopped at object boundaries, and neighboring pixels along depth edges are encouraged to take disparity values consistent with the depth differences in the qualitative map.




Multi-Flash Stereo Datasets
We encourage researchers to develop novel stereo methods that take advantage of small-baseline multi-flash illumination. We are currently collecting new datasets with different camera-flash configurations.
Tripod Scene (tripod.zip, 4.7 MB): a challenging scene with ambiguous patterns, textureless regions, thin structures, and a geometrically complex object.

Phase Functions
When modeling scattering within the layer, we use phase functions to describe the result of light interacting with particles in the layer. A phase function describes the angular distribution of light scattered after a ray hits a particle. We use the Henyey-Greenstein phase function (see [3], page 55), which depends on the incident and outgoing directions and takes an asymmetry parameter g that ranges from -1 to 1, spanning the range from strong retro-reflection (backscattering) to strong forward scattering.
The phase function, together with the scattering albedo, determines the BRDF that describes single scattering from a medium (see [3], page 56). Multiple scattering is empirically approximated by adding together three single-scattering terms with different values of the asymmetry parameter g.
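The Henyey-Greenstein phase function itself is compact enough to state directly; it is normalized so that it integrates to 1 over the sphere of directions:

```python
import math

def henyey_greenstein(cos_theta, g):
    """Henyey-Greenstein phase function.

    g in (-1, 1): g < 0 favors backscattering (retro-reflection),
    g > 0 forward scattering, g = 0 is isotropic (1 / 4pi)."""
    denom = 1.0 + g * g - 2.0 * g * cos_theta
    return (1.0 - g * g) / (4.0 * math.pi * denom ** 1.5)
```

A sum of three such lobes with different g values, as described above, gives a cheap empirical stand-in for multiple scattering.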
Fresnel Effect
We need to account for the Fresnel effect, which occurs when a light ray enters and exits the surface. This is important for determining the incoming and outgoing directions (and also the intensity) of the light rays inside the medium, so that the BRDF/scattering is properly computed.
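A common choice for this reflectance term is Schlick's approximation, sketched below; the original text does not specify the exact Fresnel model used, and the index of refraction of 1.4 (a typical value assumed for skin) is an assumption of this sketch.

```python
def schlick_fresnel(cos_theta, n1=1.0, n2=1.4):
    """Schlick's approximation to the Fresnel reflectance for a ray
    hitting the interface between media with indices n1 and n2.

    cos_theta: cosine of the angle between the ray and the normal.
    Returns the reflected fraction; 1 minus it enters the medium."""
    r0 = ((n1 - n2) / (n1 + n2)) ** 2   # reflectance at normal incidence
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5
```

The transmitted fraction (1 minus the returned value) scales the energy that actually enters the layer and undergoes scattering.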
Results
From left to right: subsurface scattering with mostly backscattering (note the glow effect), the same with bump mapping, subsurface scattering with mostly forward scattering.
Subsurface scattering tends to smooth the lighting effects. We assume a constant surface thickness for the face; modeling thickness properly would produce reddish effects (due to light interaction with blood) along thin facial features such as ears and nostrils.

