Neural Kaleidoscopic Space Sculpting

Byeongjoo Ahn, Michael De Zeeuw, Ioannis Gkioulekas, and Aswin C. Sankaranarayanan


Abstract

We introduce a method that recovers full-surround 3D reconstructions from a single kaleidoscopic image using a neural surface representation. Full-surround 3D reconstruction is critical for many applications, such as augmented and virtual reality. A kaleidoscope, which uses a single camera and multiple mirrors, is a convenient way of achieving full-surround coverage, as it redistributes light directions and thus captures multiple viewpoints in a single image. This enables single-shot and dynamic full-surround 3D reconstruction. However, using a kaleidoscopic image for multi-view stereo is challenging, as we need to decompose the image into multi-view images by identifying which pixel corresponds to which virtual camera, a process we call labeling. To address this challenge, our approach avoids the need to explicitly estimate labels, but instead "sculpts" a neural surface representation through the careful use of silhouette, background, foreground, and texture information present in the kaleidoscopic image. We demonstrate the advantages of our method in a range of simulated and real experiments, on both static and dynamic scenes.


Full-Surround 3D Reconstruction from a Single Kaleidoscopic Image

We propose a technique for full-surround 3D reconstruction from a single kaleidoscopic image. Our key insight is that a single pixel in a kaleidoscopic image is equivalent to multiple pixels in its multi-camera counterpart. Armed with this insight into the information encoded in a kaleidoscopic image, we propose a technique that we call kaleidoscopic space sculpting. Sculpting sets up an optimization problem that updates a neural implicit surface using a collection of cost functions that encode background information (to remove regions) and foreground information (to add regions), as well as the texture of the object. Interestingly, our technique never explicitly computes label information; despite this, it provides robust single-shot full-surround 3D reconstructions. For dynamic objects, we apply our technique separately to each frame of a kaleidoscopic video to obtain full-surround 3D videos. The figure below shows a gallery of objects placed beside their 3D-printed counterparts, obtained using neural kaleidoscopic space sculpting.

3D printing of shape reconstructions
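
To make the multi-view nature of a kaleidoscopic pixel concrete, the sketch below unfolds a single camera ray into its sequence of mirror-bounce segments; each segment behaves like a ray from a distinct virtual camera. This is a minimal illustrative sketch, not the paper's implementation: the mirror representation (a point and unit normal per planar mirror), the `unfold_ray` helper, and the bounce count are all assumptions.

```python
import numpy as np

def reflect(d, n):
    """Reflect direction d about a mirror with unit normal n."""
    return d - 2.0 * np.dot(d, n) * n

def unfold_ray(origin, direction, mirrors, num_bounces=3):
    """Trace a camera ray through planar mirrors.

    Each mirror is a (point_on_plane, unit_normal) pair. Returns a list of
    (segment_origin, segment_direction) pairs; each segment corresponds to
    one virtual viewpoint of the kaleidoscope. Geometry is illustrative only.
    """
    d = direction / np.linalg.norm(direction)
    o = origin
    segments = [(o, d)]
    for _ in range(num_bounces):
        # Find the nearest mirror the current segment hits (if any).
        best_t, best_mirror = np.inf, None
        for p, n in mirrors:
            denom = np.dot(d, n)
            if abs(denom) < 1e-8:
                continue  # segment parallel to this mirror
            t = np.dot(p - o, n) / denom
            if 1e-6 < t < best_t:
                best_t, best_mirror = t, (p, n)
        if best_mirror is None:
            break  # ray escapes the kaleidoscope
        p, n = best_mirror
        o = o + best_t * d   # bounce point on the mirror
        d = reflect(d, n)    # reflected direction for the next segment
        segments.append((o, d))
    return segments
```

Each returned segment corresponds to a different virtual viewpoint, which is why a single kaleidoscopic image can provide full-surround coverage.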


Key Idea: Neural Kaleidoscopic Space Sculpting

To solve labeling and geometry reconstruction jointly from a kaleidoscopic image, we propose a method that we call kaleidoscopic space sculpting, which updates the shape in both additive and subtractive ways without requiring labels. Our method relies on two observations that follow from the definitions of background and foreground pixels.

Observation 1. Background rays do not intersect with the object at any bounce.

Observation 2. Foreground rays intersect with the object for at least one bounce.

From these observations, we can update the 3D shape by carving away the points on background rays, since they never intersect the object, and by modeling a point on each foreground ray, since there is at least one intersection with the object.

Point selection and labeling: Point selection for carving is straightforward, as every bounce of a background ray can be used for carving. Selection is harder for foreground rays, however, as we do not know which of the possible bounces intersects the object. To select the modeling points, we pick, at each iteration, the foreground bounce with the minimum distance value under the current surface estimate. This turns out to be a simple yet effective approach.

Point selection for sculpting
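
This selection rule can be written compactly against a neural signed-distance field. The following PyTorch-style sketch is illustrative only: the name `sdf_net`, the sample layouts, and the margin are assumptions, and the texture (color) term used in the full method is omitted here.

```python
import torch

def sculpting_losses(sdf_net, bg_points, fg_points, margin=0.01):
    """Carve background samples and model one foreground sample per ray.

    bg_points: (Nb, 3) sample points along background rays (all bounces).
    fg_points: (Nf, B, 3) sample points along foreground rays, one row per
               ray with B candidate bounce points.
    sdf_net:   maps 3D points to signed distances (negative = inside).
    """
    # Carving: every background sample must lie outside the surface, so
    # penalize signed distances that fall below a small positive margin.
    bg_sdf = sdf_net(bg_points)                  # (Nb,)
    loss_carve = torch.relu(margin - bg_sdf).mean()

    # Modeling: each foreground ray must touch the surface at some bounce.
    # Pick the bounce with the minimum signed distance under the current
    # surface estimate and pull that point onto the surface.
    fg_sdf = sdf_net(fg_points.reshape(-1, 3)).reshape(fg_points.shape[:2])  # (Nf, B)
    min_sdf, _ = fg_sdf.min(dim=1)               # (Nf,)
    loss_model = torch.abs(min_sdf).mean()

    return loss_carve + loss_model
```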


Results

Real object reconstructions from neural kaleidoscopic space sculpting


More Details

For an in-depth description of the technology behind this work, please refer to our paper.

Byeongjoo Ahn, Michael De Zeeuw, Ioannis Gkioulekas, and Aswin C. Sankaranarayanan, "Neural Kaleidoscopic Space Sculpting", IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Supplemental video


Code and Data

Code and data are available at the following GitHub repository.


Acknowledgements

We thank Giljoo Nam for helpful discussions. This work was supported by the National Science Foundation (NSF) under awards 1652569, 1900849, and 2008464, a gift from AWS Cloud Credits for Research, as well as a Sloan Research Fellowship for Ioannis Gkioulekas.

Copyright © 2023 Byeongjoo Ahn