Light-Field

Welcome to our light-field website!

This is the webpage for light-field-related research in Prof. Ravi Ramamoorthi's lab, which is affiliated with both UC San Diego and UC Berkeley.
It includes all light-field papers (e.g. depth estimation) published in recent top conferences/journals.
If you compare against any of these algorithms and/or use the datasets, please also cite the appropriate papers.

***For decoding the Lytro raw input, we recommend using Lytro's official software.
Donald's decoder is also very useful and does not require any registration.***

2021

	Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Kai-En Lin, Lei Xiao, Feng Liu, Guowei Yang, Ravi Ramamoorthi International Conference on Computer Vision (ICCV), 2021 paper \| video \| abstract \| bibtex We develop a new algorithm, Deep 3D Mask Volume, which enables temporally stable view extrapolation from binocular videos of dynamic scenes, captured by static cameras. Our algorithm addresses the temporal inconsistency of disocclusions by identifying the error-prone areas with a 3D mask volume, and replaces them with static background observed throughout the video. @inproceedings {lin2021deep, title = {Deep 3D Mask Volume for View Synthesis of Dynamic Scenes}, author = {Kai-En Lin and Lei Xiao and Feng Liu and Guowei Yang and Ravi Ramamoorthi}, booktitle = {ICCV}, year = {2021}, }
	NeLF: Neural Light-transport Field for Portrait View Synthesis and Relighting Tiancheng Sun, Kai-En Lin, Sai Bi, Zexiang Xu, Ravi Ramamoorthi Eurographics Symposium on Rendering (EGSR), 2021 paper \| video \| abstract \| bibtex We present a system for portrait view synthesis and relighting: given multiple portraits, we use a neural network to predict the light-transport field in 3D space, and from the predicted Neural Light-transport Field (NeLF) produce a portrait from a new camera view under a new environmental lighting. @inproceedings {sun2021nelf, booktitle = {Eurographics Symposium on Rendering}, title = {NeLF: Neural Light-transport Field for Portrait View Synthesis and Relighting}, author = {Sun, Tiancheng and Lin, Kai-En and Bi, Sai and Xu, Zexiang and Ramamoorthi, Ravi}, year = {2021}, }
	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Ben Mildenhall, Pratul Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng Communications of the ACM (CACM)*, 2021 paper \| video \| abstract \| bibtex In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. @article{mildenhall2020nerf, title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis}, author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng}, journal={Communications of the ACM (CACM)}, year={2021} }
	Neural Light Transport for Relighting and View Synthesis Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman ACM Transactions on Graphics (SIGGRAPH), 2021 paper \| video \| abstract \| bibtex We propose a semi-parametric approach for learning a neural representation of the light transport of a scene. The light transport is embedded in a texture atlas of known but possibly rough geometry. We model all non-diffuse and global light transport as residuals added to a physically-based diffuse base rendering. @article{zhang2021neural, title={Neural light transport for relighting and view synthesis}, author={Xiuming Zhang and Sean Fanello and Yun-Ta Tsai and Tiancheng Sun and Tianfan Xue and Rohit Pandey and Sergio Orts-Escolano and Philip Davidson and Christoph Rhemann and Paul Debevec and Jonathan T. Barron and Ravi Ramamoorthi and William T. Freeman}, journal={ACM Transactions on Graphics (TOG)}, year={2021}, }

2020

	Neural Reflectance Fields for Appearance Acquisition Sai Bi, Zexiang Xu, Pratul Srinivasan, Ben Mildenhall, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi arxiv preprint, 2020 paper \| abstract \| bibtex We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene using a fully-connected neural network. @article{bi2020neural, title={Neural reflectance fields for appearance acquisition}, author={Bi, Sai and Xu, Zexiang and Srinivasan, Pratul and Mildenhall, Ben and Sunkavalli, Kalyan and Ha{\v{s}}an, Milo{\v{s}} and Hold-Geoffroy, Yannick and Kriegman, David and Ramamoorthi, Ravi}, journal={arXiv preprint arXiv:2008.03824}, year={2020} }
	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Ben Mildenhall, Pratul Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng European Conference on Computer Vision (ECCV)*, 2020 paper \| video \| abstract \| bibtex In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. @inproceedings{mildenhall2020nerf, title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis}, author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020} }
	Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2020 paper \| video \| abstract \| bibtex We develop a novel volumetric scene representation for reconstruction from unstructured images. Our representation consists of opacity, surface normal and reflectance voxel grids. We present a novel physically-based differentiable volume ray marching framework to render these scene volumes under arbitrary viewpoint and lighting. @inproceedings{bi2020drv, title={Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images}, author={Sai Bi and Zexiang Xu and Kalyan Sunkavalli and Miloš Hašan and Yannick Hold-Geoffroy and David Kriegman and Ravi Ramamoorthi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020}, }
	Deep Multi Depth Panoramas for View Synthesis Kai-En Lin, Zexiang Xu, Ben Mildenhall, Pratul Srinivasan, Yannick Hold-Geoffroy, Stephen DiVerdi, Qi Sun, Kalyan Sunkavalli, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2020 paper \| video \| abstract \| bibtex We propose a learning-based approach for novel view synthesis for multi-camera 360 degree panorama capture rigs. We present a novel scene representation, Multi Depth Panorama (MDP), that consists of multiple RGBD alpha panoramas that represent both scene geometry and appearance. @inproceedings{lin2020deep, title={Deep Multi Depth Panoramas for View Synthesis}, author={Kai-En Lin and Zexiang Xu and Ben Mildenhall and Pratul P. Srinivasan and Yannick Hold-Geoffroy and Stephen DiVerdi and Qi Sun and Kalyan Sunkavalli and Ravi Ramamoorthi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020}, }
	Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images Sai Bi, Zexiang Xu, Kalyan Sunkavalli, David Kriegman, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 paper \| video \| abstract \| bibtex We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object from a sparse set of only six images captured by wide-baseline cameras under collocated point lighting. We construct high-quality geometry and per-vertex BRDFs. @inproceedings{bi2020deep3d, title={Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images}, author={Bi, Sai and Xu, Zexiang and Sunkavalli, Kalyan and Kriegman, David and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, pages={5960--5969}, year={2020} }

2019

	Deep Recurrent Network for Fast and Full-Resolution Light Field Deblurring Jonathan Samuel Lumentut, Tae Hyun Kim, Ravi Ramamoorthi, In Kyu Park IEEE Signal Processing Letters, 2019 paper \| abstract \| bibtex We propose a novel light field recurrent deblurring network that is trained under a 6 degree-of-freedom camera motion-blur model. By combining the real light field captured using Lytro Illum and synthetic light field rendering of 3D scenes from UnrealCV, we provide a large-scale blurry light field dataset to train the network. @article{lumentut2019lf, author={J. S. {Lumentut} and T. H. {Kim} and R. {Ramamoorthi} and I. K. {Park}}, journal={IEEE Signal Processing Letters}, title={Deep Recurrent Network for Fast and Full-Resolution Light Field Deblurring}, year={2019}, volume={26}, number={12}, pages={1788--1792} }
	Deep View Synthesis from Sparse Photometric Images Zexiang Xu, Sai Bi, Kalyan Sunkavalli, Sunil Hadap, Hao Su, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2019 paper \| supplementary \| video \| abstract \| bibtex \| In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60 degree cone) from a sparse set of just six viewing directions. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view per-depth plane attention map prediction network to effectively aggregate multi-view appearance. @article{xu2019deepviewsyn, author = {Zexiang Xu and Sai Bi and Kalyan Sunkavalli and Sunil Hadap and Hao Su and Ravi Ramamoorthi}, title = {Deep View Synthesis from Sparse Photometric Images}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {38}, number = {4}, year = {2019} }
	Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines Ben Mildenhall, Pratul Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar ACM Transactions on Graphics (SIGGRAPH), 2019 paper \| YouTube \| project page \| abstract \| bibtex \| We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. We propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local lightfields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm. @article{mildenhall2019llff, title={Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines}, author={Ben Mildenhall and Pratul P. Srinivasan and Rodrigo Ortiz-Cayon and Nima Khademi Kalantari and Ravi Ramamoorthi and Ren Ng and Abhishek Kar}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {38}, number = {4}, year={2019} }
	Pushing the Boundaries of View Extrapolation with Multiplane Images Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, Noah Snavely IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 paper \| with appendices \| YouTube \| video \| abstract \| bibtex We present a theoretical analysis showing how the range of views that can be rendered from a multi-plane image (MPI) increases linearly with the MPI disparity sampling frequency, as well as a novel MPI prediction procedure that theoretically enables view extrapolations of up to 4x the lateral viewpoint movement allowed by prior work.
	Deep HDR Video from Sequences with Alternating Exposures Nima Khademi Kalantari, Ravi Ramamoorthi EUROGRAPHICS, 2019 paper \| video \| abstract \| bibtex A practical way to generate a high dynamic range (HDR) video using off-the-shelf cameras is to capture a sequence with alternating exposures and reconstruct the missing content at each frame. Unfortunately, existing approaches are typically slow and are not able to handle challenging cases. In this paper, we propose a learning-based approach to address this difficult problem. To do this, we use two sequential convolutional neural networks (CNN) to model the entire HDR video reconstruction process.

2017

	Learning to Synthesize a 4D RGBD Light Field from a Single Image Pratul P. Srinivasan, Tongzhou Wang, Ashwin Sreelal, Ravi Ramamoorthi, Ren Ng International Conference on Computer Vision (ICCV), 2017 paper \| supplementary \| video \| abstract \| bibtex \| We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN) that estimates scene geometry, a stage that renders a Lambertian light field using that geometry, and a second CNN that predicts occluded rays and non-Lambertian effects. Our algorithm builds on recent view synthesis methods, but is unique in predicting RGBD for each light field ray and improving unsupervised single image depth estimation by enforcing consistency of ray depths that should intersect the same scene point. @inproceedings{pratul2017lightField, author = {Pratul P. Srinivasan and Tongzhou Wang and Ashwin Sreelal and Ravi Ramamoorthi and Ren Ng}, title = {Learning to Synthesize a 4D RGBD Light Field from a Single Image}, booktitle = {International Conference on Computer Vision (ICCV)}, year = {2017} }
	Depth and Image Restoration from Light Field in a Scattering Medium Jiandong Tian, Zak Murez, Tong Cui, Zhen Zhang, David Kriegman, Ravi Ramamoorthi International Conference on Computer Vision (ICCV), 2017 paper \| abstract \| bibtex \| Traditional imaging methods and computer vision algorithms are often ineffective when images are acquired in scattering media, such as underwater, fog, and biological tissue. Here, we explore the use of light field imaging and algorithms for image restoration and depth estimation that address the image degradation from the medium. Towards this end, we make the following three contributions. First, we present a new single image restoration algorithm which removes backscatter and attenuation from images better than existing methods do, and apply it to each view in the light field. Second, we combine a novel transmission based depth cue with existing correspondence and defocus cues to improve light field depth estimation. In densely scattering media, our transmission depth cue is critical for depth estimation since the images have low signal to noise ratios which significantly degrades the performance of the correspondence and defocus cues. Finally, we propose shearing and refocusing multiple views of the light field to recover a single image of higher quality than what is possible from a single view. We demonstrate the benefits of our method through extensive experimental results in a water tank. @inproceedings{tian2017light, author = {Jiandong Tian and Zak Murez and Tong Cui and Zhen Zhang and David Kriegman and Ravi Ramamoorthi}, title = {Depth and Image Restoration from Light Field in a Scattering Medium}, booktitle = {International Conference on Computer Vision (ICCV)}, year = {2017} }
	Deep High Dynamic Range Imaging of Dynamic Scenes Nima Khademi Kalantari, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2017 paper \| abstract \| bibtex \| project page Producing a high dynamic range (HDR) image from a set of images with different exposures is a challenging process for dynamic scenes. A category of existing techniques first register the input images to a reference image and then merge the aligned images into an HDR image. However, the artifacts of the registration usually appear as ghosting and tearing in the final HDR images. In this paper, we propose a learning-based approach to address this problem for dynamic scenes. We use a convolutional neural network (CNN) as our learning model and present and compare three different system architectures to model the HDR merge process. Furthermore, we create a large dataset of input LDR images and their corresponding ground truth HDR images to train our system. We demonstrate the performance of our system by producing high-quality HDR images from a set of three LDR images. Experimental results show that our method consistently produces better results than several state-of-the-art approaches on challenging scenes. @article{kalantari2017hdr, author = {Nima Khademi Kalantari and Ravi Ramamoorthi}, title = {Deep High Dynamic Range Imaging of Dynamic Scenes}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {36}, number = {4}, year = {2017}, }
	Light Field Video Capture Using a Learning-Based Hybrid Imaging System Ting-Chun Wang, Jun-Yan Zhu, Nima Khademi Kalantari, Alexei Efros, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2017 paper \| lo-res pdf \| abstract \| bibtex \| project page Capturing light fields requires a huge bandwidth to record the data: a modern light field camera can only take three images per second. Temporal interpolation at such extreme scale is infeasible as too much information will be entirely missing between adjacent frames. Instead, we develop a hybrid imaging system, adding another standard video camera to capture the temporal information. Given a 3 fps light field sequence and a standard 30 fps 2D video, our system can then generate a full light field video at 30 fps. We adopt a learning-based approach, which can be decomposed into two steps: spatio-temporal flow estimation and appearance estimation. The flow estimation propagates the angular information from the light field sequence to the 2D video, so we can warp input images to the target view. The appearance estimation then combines these warped images to output the final pixels. The whole process is trained end-to-end using convolutional neural networks. @article{wang2017light, author = {Ting-Chun Wang and Jun-Yan Zhu and Nima Khademi Kalantari and Alexei A. Efros and Ravi Ramamoorthi}, title = {Light Field Video Capture Using a Learning-Based Hybrid Imaging System}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {36}, number = {4}, year = {2017}, }
	SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017 paper \| abstract \| bibtex In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera. @article{wang2017svbrdf, title={{SVBRDF}-Invariant Shape and Reflectance Estimation from Light-Field Cameras}, author={Wang, Ting-Chun and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2017}, }
	Light Field Blind Motion Deblurring Pratul P. Srinivasan, Ren Ng, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 paper \| abstract \| bibtex We study the problem of deblurring light fields of general 3D scenes captured under 3D camera motion and present both theoretical and practical contributions. By analyzing the motion-blurred light field in the primal and Fourier domains, we develop intuition into the effects of camera motion on the light field, show the advantages of capturing a 4D light field instead of a conventional 2D image for motion deblurring, and derive simple methods of motion deblurring in certain cases. We then present an algorithm to blindly deblur light fields of general scenes without any estimation of scene geometry, and demonstrate that we can recover both the sharp light field and the 3D camera motion path of real and synthetically-blurred light fields.
	Robust Energy Minimization for BRDF-Invariant Shape from Light Fields Zhengqin Li, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 paper \| abstract \| bibtex \| supplementary \| code Highly effective optimization frameworks have been developed for traditional multiview stereo relying on Lambertian photoconsistency. However, they do not account for complex material properties. On the other hand, recent works have explored PDE invariants for shape recovery with complex BRDFs, but they have not been incorporated into robust numerical optimization frameworks. We present a variational energy minimization framework for robust recovery of shape in multiview stereo with complex, unknown BRDFs. While our formulation is general, we demonstrate its efficacy on shape recovery using a single light field image, where the microlens array may be considered as a realization of a purely translational multiview stereo setup. Our formulation automatically balances contributions from texture gradients, traditional Lambertian photoconsistency, an appropriate BRDF-invariant PDE and a smoothness prior. Unlike prior works, our energy function inherently handles spatially-varying BRDFs and albedos. Extensive experiments with synthetic and real data show that our optimization framework consistently achieves errors lower than Lambertian baselines and further, is more robust than prior BRDF-invariant reconstruction methods.

2016

	Learning-Based View Synthesis for Light Field Cameras Nima Khademi Kalantari, Ting-Chun Wang, Ravi Ramamoorthi ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 2016 paper \| abstract \| bibtex \| project page With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between the angular and spatial resolution, and thus, these cameras often sparsely sample in either spatial or angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break down the process into disparity and color estimation components. We use two sequential convolutional neural networks to model these two components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We show the performance of our approach using only four corner sub-aperture views from the light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images that are superior to the state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could potentially decrease the required angular resolution of consumer light field cameras, which allows their spatial resolution to increase. @article{LearningViewSynthesis, author = {Nima Khademi Kalantari and Ting-Chun Wang and Ravi Ramamoorthi}, title = {Learning-Based View Synthesis for Light Field Cameras}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2016)}, volume = {35}, number = {6}, year = {2016}, }
	A 4D Light-Field Dataset and CNN Architectures for Material Recognition Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2016 paper \| abstract \| HTML comparison \| bibtex \| dataset (2D thumbnail) full dataset (15.9G) We introduce a new light-field dataset of materials, and take advantage of the recent success of deep learning to perform material recognition on the 4D light-field. Our dataset contains 12 material categories, each with 100 images taken with a Lytro Illum, from which we extract about 30,000 patches in total. Since recognition networks have not been trained on 4D images before, we propose and compare several novel CNN architectures to train on light-field images. In our experiments, the best performing CNN architecture achieves a 7% boost compared with 2D image classification (70% to 77%). @inproceedings{wang2016dataset, title={A {4D} light-field dataset and {CNN} architectures for material recognition}, author={Wang, Ting-Chun and Zhu, Jun-Yan and Hiroaki, Ebi and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2016} }
	Shape Estimation from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence Michael Tao, Pratul Srinivasan, Sunil Hadap, Szymon Rusinkiewicz, Jitendra Malik, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016 paper \| abstract \| bibtex Light-field cameras are quickly becoming commodity items, with consumer and industrial applications. They capture many nearby views simultaneously using a single image with a micro-lens array, thereby providing a wealth of cues for depth recovery: defocus, correspondence, and shading. In particular, apart from conventional image shading, one can refocus images after acquisition, and shift one’s viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. We present a principled algorithm for dense depth estimation that combines defocus and correspondence metrics. We then extend our analysis to the additional cue of shading, using it to refine fine details in the shape. By exploiting an all-in-focus image, in which pixels are expected to exhibit angular coherence, we define an optimization framework that integrates photo consistency, depth consistency, and shading consistency. We show that combining all three sources of information: defocus, correspondence, and shading, outperforms state-of-the-art light-field depth estimation algorithms in multiple scenarios. @article{tao2016shape, title={Shape Estimation from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence}, author={Tao, Michael and Srinivasan, Pratul and Hadap, Sunil and Rusinkiewicz, Szymon and Malik, Jitendra and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2016}, }
	SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (oral presentation) paper \| abstract \| supplementary \| HTML comparison \| bibtex In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera. @inproceedings{wang2016svbrdf, title={SVBRDF-invariant shape and reflectance estimation from light-field cameras}, author={Wang, Ting-Chun and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2016} }
	Depth from Semi-Calibrated Stereo and Defocus Ting-Chun Wang, Manohar Srikanth, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 paper \| abstract \| HTML comparison \| bibtex In this work, we propose a multi-camera system where we combine a main high-quality camera with two low-res auxiliary cameras. The auxiliary cameras are well calibrated and act as a passive depth sensor by generating disparity maps. The main camera has an interchangeable lens and can produce good quality images at high resolution. Our goal is, given the low-res depth map from the auxiliary cameras, generate a depth map from the viewpoint of the main camera. The advantage of our system, compared to other systems such as light-field cameras or RGBD sensors, is the ability to generate a high-resolution color image with a complete depth map, without sacrificing resolution and with minimal auxiliary hardware. @inproceedings{wang2016semi, title={Depth from semi-calibrated stereo and defocus}, author={Wang, Ting-Chun and Srikanth, Manohar and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2016} }
	Depth Estimation with Occlusion Modeling Using Light-field Cameras Ting-Chun Wang, Alexei Efros, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016 paper \| abstract \| bibtex In this paper, an occlusion-aware depth estimation algorithm is developed; the method also enables identification of occlusion edges, which may be useful in other applications. It can be shown that although photo-consistency is not preserved for pixels at occlusions, it still holds in approximately half the viewpoints. Moreover, the line separating the two view regions (occluded object vs. occluder) has the same orientation as that of the occlusion edge in the spatial domain. By ensuring photo-consistency in only the occluded view region, depth estimation can be improved. @article{wang2016depth, title={Depth estimation with occlusion modeling using light-field cameras}, author={Wang, Ting-Chun and Efros, Alexei and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2016}, }

2015

	Occlusion-aware depth estimation using light-field cameras Ting-Chun Wang, Alexei Efros, Ravi Ramamoorthi International Conference on Computer Vision (ICCV), 2015 paper \| abstract \| bibtex \| supp code \| dataset (3.3GB) In this paper, we develop a depth estimation algorithm for light field cameras that treats occlusion explicitly; the method also enables identification of occlusion edges, which may be useful in other applications. We show that, although pixels at occlusions do not preserve photo-consistency in general, they are still consistent in approximately half the viewpoints. @inproceedings{wang2015occlusion, title={Occlusion-aware depth estimation using light-field cameras}, author={Wang, Ting-Chun and Efros, Alexei and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)}, year={2015} }
	Oriented Light-Field Windows for Scene Flow Pratul Srinivasan, Michael Tao, Ren Ng, Ravi Ramamoorthi International Conference on Computer Vision (ICCV), 2015 paper \| abstract \| bibtex \| code (152MB) For Lambertian surfaces focused to the correct depth, the 2D distribution of angular rays from a pixel remains consistent. We build on this idea to develop an oriented 4D light-field window that accounts for shearing (depth), translation (matching), and windowing. Our main application is to scene flow, a generalization of optical flow to the 3D vector field describing the motion of each point in the scene. @inproceedings{srinivasan2015oriented, title={Oriented Light-Field Windows for Scene Flow}, author={Srinivasan, Pratul and Tao, Michael and Ng, Ren and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)}, year={2015} }
	Depth from Shading, Defocus, and Correspondence using Light-field Angular Coherence Michael Tao, Pratul Srinivasan, Jitendra Malik, Szymon Rusinkiewicz, Ravi Ramamoorthi Conference on Computer Vision and Pattern Recognition (CVPR), 2015 paper \| abstract \| bibtex \| code (72MB) Using shading information is essential to improve shape estimation from light field cameras. We develop an improved technique for local shape estimation from defocus and correspondence cues, and show how shading can be used to further refine the depth. We show that the angular pixels have angular coherence, which exhibits three properties: photoconsistency, depth consistency, and shading consistency. @inproceedings{tao2015shading, title={Depth from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence}, author={Tao, Michael W and Srinivasan, Pratul P and Malik, Jitendra and Rusinkiewicz, Szymon and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2015} }
	A Light Transport Framework for Lenslet Light Field Cameras Chia-Kai Liang, Ravi Ramamoorthi ACM Transactions on Graphics (TOG), 2015 paper \| abstract \| bibtex It is often stated that there is a fundamental tradeoff between spatial and angular resolution of lenslet light field cameras, but there has been limited understanding of this tradeoff theoretically or numerically. In this paper, we develop a light transport framework for understanding the fundamental limits of light field camera resolution. @article{liang2015light, title={A light transport framework for lenslet light field cameras}, author={Liang, Chia-Kai and Ramamoorthi, Ravi}, journal={ACM Transactions on Graphics (TOG)}, volume={34}, number={2}, pages={16}, year={2015} }
	Depth estimation and specular removal for glossy surfaces using point and line consistency with light-field cameras Michael Tao, Jong-Chyi Su, Ting-Chun Wang, Jitendra Malik, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015 paper \| abstract \| bibtex \| code (5.2MB) \| dataset (1.1GB) Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. However, current light-field depth estimation methods are designed for Lambertian objects and fail or degrade for glossy or specular surfaces. In this paper, we present a novel theory of the relationship between light-field data and reflectance from the dichromatic model. @article{tao2015specular, title={Depth Estimation and Specular Removal for Glossy Surfaces Using Point and Line Consistency with Light-Field Cameras}, author={Tao, Michael and Su, Jong-Chyi and Wang, Ting-Chun and Malik, Jitendra and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2015}, }
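As a rough illustration of the point and line consistency named in the title above, the following Python sketch uses the standard dichromatic model, in which an observed color is a fixed diffuse color plus a view-dependent amount of the light-source color. Under that model the colors of a glossy point seen from different views spread along a line in RGB space rather than collapsing to a single point. The function name, the SVD-based line fit, and the synthetic colors are assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def point_and_line_consistency(colors):
    """colors: (n_views, 3) RGB samples of one scene point across views.
    Returns (point_residual, line_residual, line_direction)."""
    centered = colors - colors.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    energy = s ** 2
    point_residual = energy.sum()      # spread around a single color
    line_residual = energy[1:].sum()   # spread around the best-fit line
    return point_residual, line_residual, vt[0]

# Synthetic glossy point: constant diffuse color plus a specular term
# whose strength changes with the viewpoint (hypothetical values).
diffuse = np.array([0.6, 0.3, 0.2])
light = np.array([1.0, 1.0, 0.9])
strength = np.linspace(0.0, 0.5, 9)[:, None]
colors = diffuse + strength * light

point_res, line_res, direction = point_and_line_consistency(colors)
```

For this synthetic point the samples are not point-consistent but are line-consistent, and the fitted line direction matches the light-source color up to sign and scale.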

2014

Depth estimation for glossy surfaces with light-field cameras
Michael Tao, Ting-Chun Wang, Jitendra Malik, Ravi Ramamoorthi
ECCV Workshop on Light Fields for Computer Vision (L4CV), 2014

paper | abstract | bibtex
open source decoder for Lytro Illum (44MB)

Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. In this paper, we develop an iterative approach to use the benefits of light-field data to estimate and remove the specular component, improving the depth estimation. The approach enables light-field data depth estimation to support both specular and diffuse scenes.

@inproceedings{tao2014glossy,
   title={Depth estimation for glossy surfaces with 
   light-field cameras},
   author={Tao, Michael W and Wang, Ting-Chun 
   and Malik, Jitendra and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE European 
   Conference on Computer Vision Workshops (ECCVW)},
   year={2014},
}

2013

Depth from Combining Defocus and Correspondence Using Light-Field Cameras
Michael Tao, Sunil Hadap, Jitendra Malik, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2013

Light-field cameras have recently become available to the consumer market. An array of micro-lenses captures enough information that one can refocus images after acquisition, as well as shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. Thus, depth cues from both defocus and correspondence are available simultaneously in a single capture, and we show how to exploit both by analyzing the EPI.

@inproceedings{tao2013depth,
   author={Tao, Michael and Hadap, Sunil
   and Malik, Jitendra and Ramamoorthi, Ravi},
   title={Depth from combining defocus and 
   correspondence using light-field cameras},	
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV)},
   year={2013},
}
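
The two cues mentioned in the abstract above can be sketched on a 2D epipolar plane image (EPI) in Python. This is a simplified illustration under stated assumptions: a single scan line, integer disparities, circular shifts via `np.roll`, mean gradient magnitude of the refocused line as the defocus response, and angular variance as the correspondence response. The function names and the synthetic texture are hypothetical and not taken from the paper's code.

```python
import numpy as np

def shear_epi(epi, d):
    """Shear an EPI (rows: angular views u, columns: spatial x) so that
    scene points with disparity d line up vertically."""
    n_u = epi.shape[0]
    center = n_u // 2
    return np.stack([np.roll(epi[u], -d * (u - center)) for u in range(n_u)])

def depth_cues(epi, disparities):
    """Return the defocus and correspondence responses per candidate disparity."""
    defocus, correspondence = [], []
    for d in disparities:
        sheared = shear_epi(epi, d)
        refocused = sheared.mean(axis=0)                   # integrate over views
        defocus.append(np.abs(np.diff(refocused)).mean())  # high when in focus
        correspondence.append(sheared.var(axis=0).mean())  # low when views agree
    return np.array(defocus), np.array(correspondence)

# Synthetic EPI: one textured fronto-parallel surface with disparity 2.
rng = np.random.default_rng(0)
texture = rng.random(256)
true_d, n_views = 2, 7
epi = np.stack([np.roll(texture, true_d * (u - n_views // 2))
                for u in range(n_views)])

disparities = np.arange(-3, 4)
defocus, correspondence = depth_cues(epi, disparities)
d_defocus = disparities[np.argmax(defocus)]
d_corr = disparities[np.argmin(correspondence)]
```

On this synthetic input both cues select the true disparity; on real data the two responses behave differently, which is the motivation for combining them.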

External Mask Based Depth and Light Field Camera
Dikpal Reddy, Jiamin Bai, Ravi Ramamoorthi
ICCV Workshop on Consumer Depth Cameras for Vision, 2013

paper | abstract | bibtex | video (97MB)

We present a method to convert a digital single-lens reflex (DSLR) camera into a high-resolution consumer depth and light-field camera by affixing an external aperture mask to the main lens. Compared to the existing consumer depth and light field cameras, our camera is easy to construct with minimal additional costs, and our design is camera and lens agnostic. The main advantage of our design is the ease of switching between an SLR camera and a native resolution depth/light field camera. We also do not need to modify the internals of the camera or the lens.

@inproceedings{reddy2013external,
   author={Reddy, Dikpal and Bai, Jiamin and Ramamoorthi, Ravi},
   title={External mask based depth and light field camera},	
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV) Workshops},
   year={2013},
}

	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Ben Mildenhall, Pratul Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng Communications of the ACM (CACM)*, 2021 paper \| video \| abstract \| bibtex In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. @article{mildenhall2020nerf, title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis}, author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng}, journal={Communications of the ACM (CACM)}, year={2021} }
	Neural Light Transport for Relighting and View Synthesis Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman ACM Transactions on Graphics (SIGGRAPH), 2021 paper \| video \| abstract \| bibtex We propose a semi-parametric approach for learning a neural representation of the light transport of a scene. The light transport is embedded in a texture atlas of known but possibly rough geometry. We model all non-diffuse and global light transport as residuals added to a physically-based diffuse base rendering. @article{zhang2021neural, title={Neural light transport for relighting and view synthesis}, author={Xiuming Zhang and Sean Fanello and Yun-Ta Tsai and Tiancheng Sun and Tianfan Xue and Rohit Pandey and Sergio Orts-Escolano and Philip Davidson and Christoph Rhemann and Paul Debevec and Jonathan T. Barron and Ravi Ramamoorthi and William T. Freeman}, journal={ACM Transactions on Graphics (TOG)}, year={2021}, }

	Neural Reflectance Fields for Appearance Acquisition Sai Bi, Zexiang Xu, Pratul Srinivasan, Ben Mildenhall, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi arxiv preprint, 2020 paper \| abstract \| bibtex We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene using a fully-connected neural network. @article{bi2020neural, title={Neural reflectance fields for appearance acquisition}, author={Bi, Sai and Xu, Zexiang and Srinivasan, Pratul and Mildenhall, Ben and Sunkavalli, Kalyan and Ha{\v{s}}an, Milo{\v{s}} and Hold-Geoffroy, Yannick and Kriegman, David and Ramamoorthi, Ravi}, journal={arXiv preprint arXiv:2008.03824}, year={2020} }
	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Ben Mildenhall, Pratul Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng European Conference on Computer Vision (ECCV)*, 2020 paper \| video \| abstract \| bibtex In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. @inproceedings{mildenhall2020nerf, title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis}, author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020} }
	Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2020 paper \| video \| abstract \| bibtex We develop a novel volumetric scene representation for reconstruction from unstructured images. Our representation consists of opacity, surface normal and reflectance voxel grids. We present a novel physically-based differentiable volume ray marching framework to render these scene volumes under arbitrary viewpoint and lighting. @inproceedings{bi2020drv, title={Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images}, author={Sai Bi and Zexiang Xu and Kalyan Sunkavalli and Miloš Hašan and Yannick Hold-Geoffroy and David Kriegman and Ravi Ramamoorthi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020}, }
	Deep Multi Depth Panoramas for View Synthesis Kai-En Lin, Zexiang Xu, Ben Mildenhall, Pratul Srinivasan, Yannick Hold-Geoffroy, Stephen DiVerdi, Qi Sun, Kalyan Sunkavalli, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2020 paper \| video \| abstract \| bibtex We propose a learning-based approach for novel view synthesis for multi-camera 360 degree panorama capture rigs. We present a novel scene representation, Multi Depth Panorama (MDP), that consists of multiple RGBD alpha panoramas that represent both scene geometry and appearance. @inproceedings{lin2020deep, title={Deep Multi Depth Panoramas for View Synthesis}, author={Kai-En Lin and Zexiang Xu and Ben Mildenhall and Pratul P. Srinivasan and Yannick Hold-Geoffroy and Stephen DiVerdi and Qi Sun and Kalyan Sunkavalli and Ravi Ramamoorthi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2020}, }
	Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images Sai Bi, Zexiang Xu, Kalyan Sunkavalli, David Kriegman, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 paper \| video \| abstract \| bibtex We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object from a sparse set of only six images captured by wide-baseline cameras under collocated point lighting. We construct high-quality geometry and per-vertex BRDFs. @inproceedings{bi2020deep3d, title={Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images}, author={Bi, Sai and Xu, Zexiang and Sunkavalli, Kalyan and Kriegman, David and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, pages={5960--5969}, year={2020} }

	Deep Recurrent Network for Fast and Full-Resolution Light Field Deblurring Jonathan Samuel Lumentut, Tae Hyun Kim, Ravi Ramamoorthi, In Kyu Park IEEE Signal Processing Letters, 2019 paper \| abstract \| bibtex We propose a novel light field recurrent deblurring network that is trained under a 6-degree-of-freedom camera motion-blur model. By combining the real light field captured using Lytro Illum and synthetic light field rendering of 3D scenes from UnrealCV, we provide a large-scale blurry light field dataset to train the network. @article{lumentut2019lf, author={J. S. {Lumentut} and T. H. {Kim} and R. {Ramamoorthi} and I. K. {Park}}, journal={IEEE Signal Processing Letters}, title={Deep Recurrent Network for Fast and Full-Resolution Light Field Deblurring}, year={2019}, volume={26}, number={12}, pages={1788--1792} }
	Deep View Synthesis from Sparse Photometric Images Zexiang Xu, Sai Bi, Kalyan Sunkavalli, Sunil Hadap, Hao Su, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2019 paper \| supplementary \| video \| abstract \| bibtex In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60 degree cone) from a sparse set of just six viewing directions. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view per-depth plane attention map prediction network to effectively aggregate multi-view appearance. @article{xu2019deepviewsyn, author = {Zexiang Xu and Sai Bi and Kalyan Sunkavalli and Sunil Hadap and Hao Su and Ravi Ramamoorthi}, title = {Deep View Synthesis from Sparse Photometric Images}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {38}, number = {4}, year = {2019} }
	Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines Ben Mildenhall, Pratul Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar ACM Transactions on Graphics (SIGGRAPH), 2019 paper \| YouTube \| project page \| abstract \| bibtex We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. We propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm. @article{mildenhall2019llff, title={Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines}, author={Ben Mildenhall and Pratul P. Srinivasan and Rodrigo Ortiz-Cayon and Nima Khademi Kalantari and Ravi Ramamoorthi and Ren Ng and Abhishek Kar}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {38}, number = {4}, year={2019} }
	Pushing the Boundaries of View Extrapolation with Multiplane Images Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, Noah Snavely IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 paper \| with appendices \| YouTube \| video \| abstract \| bibtex We present a theoretical analysis showing how the range of views that can be rendered from a multi-plane image (MPI) increases linearly with the MPI disparity sampling frequency, as well as a novel MPI prediction procedure that theoretically enables view extrapolations of up to 4x the lateral viewpoint movement allowed by prior work.
	Deep HDR Video from Sequences with Alternating Exposures Nima Khademi Kalantari, Ravi Ramamoorthi EUROGRAPHICS, 2019 paper \| video \| abstract \| bibtex A practical way to generate a high dynamic range (HDR) video using off-the-shelf cameras is to capture a sequence with alternating exposures and reconstruct the missing content at each frame. Unfortunately, existing approaches are typically slow and are not able to handle challenging cases. In this paper, we propose a learning-based approach to address this difficult problem. To do this, we use two sequential convolutional neural networks (CNN) to model the entire HDR video reconstruction process.
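The multiplane image (MPI) representation mentioned in the view-synthesis entries above is usually rendered by alpha-compositing its planes from back to front. The Python sketch below shows only that standard "over" compositing step and assumes the planes have already been warped into the target view; the function name, array layout, and two-plane example are illustrative assumptions rather than code from either paper.

```python
import numpy as np

def composite_mpi(rgb, alpha):
    """Back-to-front 'over' compositing of MPI planes.
    rgb:   (D, H, W, 3) plane colors, ordered back to front.
    alpha: (D, H, W) plane opacities in [0, 1]."""
    out = np.zeros(rgb.shape[1:])
    for color, a in zip(rgb, alpha):
        a = a[..., None]                  # broadcast opacity over RGB
        out = color * a + out * (1.0 - a)
    return out

# Two planes, one pixel: an opaque blue back plane behind a
# half-transparent red front plane.
rgb = np.zeros((2, 1, 1, 3))
rgb[0] = [0.0, 0.0, 1.0]
rgb[1] = [1.0, 0.0, 0.0]
alpha = np.array([1.0, 0.5]).reshape(2, 1, 1)

pixel = composite_mpi(rgb, alpha)[0, 0]
```

Here the front plane contributes half of its color and lets half of the back plane show through, so the composited pixel is an even mix of red and blue.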

	Learning to Synthesize a 4D RGBD Light Field from a Single Image Pratul P. Srinivasan, Tongzhou Wang, Ashwin Sreelal, Ravi Ramamoorthi, Ren Ng International Conference on Computer Vision (ICCV), 2017 paper \| supplementary \| video \| abstract \| bibtex We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN) that estimates scene geometry, a stage that renders a Lambertian light field using that geometry, and a second CNN that predicts occluded rays and non-Lambertian effects. Our algorithm builds on recent view synthesis methods, but is unique in predicting RGBD for each light field ray and improving unsupervised single image depth estimation by enforcing consistency of ray depths that should intersect the same scene point. @inproceedings{pratul2017lightField, author = {Pratul P. Srinivasan and Tongzhou Wang and Ashwin Sreelal and Ravi Ramamoorthi and Ren Ng}, title = {Learning to Synthesize a 4D RGBD Light Field from a Single Image}, booktitle = {International Conference on Computer Vision (ICCV)}, year = {2017} }
	Depth and Image Restoration from Light Field in a Scattering Medium Jiandong Tian, Zak Murez, Tong Cui, Zhen Zhang, David Kriegman, Ravi Ramamoorthi International Conference on Computer Vision (ICCV), 2017 paper \| abstract \| bibtex Traditional imaging methods and computer vision algorithms are often ineffective when images are acquired in scattering media, such as underwater, fog, and biological tissue. Here, we explore the use of light field imaging and algorithms for image restoration and depth estimation that address the image degradation from the medium. Towards this end, we make the following three contributions. First, we present a new single image restoration algorithm which removes backscatter and attenuation from images better than existing methods do, and apply it to each view in the light field. Second, we combine a novel transmission based depth cue with existing correspondence and defocus cues to improve light field depth estimation. In densely scattering media, our transmission depth cue is critical for depth estimation since the images have low signal to noise ratios which significantly degrades the performance of the correspondence and defocus cues. Finally, we propose shearing and refocusing multiple views of the light field to recover a single image of higher quality than what is possible from a single view. We demonstrate the benefits of our method through extensive experimental results in a water tank. @inproceedings{tian2017light, author = {Jiandong Tian and Zak Murez and Tong Cui and Zhen Zhang and David Kriegman and Ravi Ramamoorthi}, title = {Depth and Image Restoration from Light Field in a Scattering Medium}, booktitle = {International Conference on Computer Vision (ICCV)}, year = {2017} }
	Deep High Dynamic Range Imaging of Dynamic Scenes Nima Khademi Kalantari, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2017 paper \| abstract \| bibtex \| project page Producing a high dynamic range (HDR) image from a set of images with different exposures is a challenging process for dynamic scenes. A category of existing techniques first register the input images to a reference image and then merge the aligned images into an HDR image. However, the artifacts of the registration usually appear as ghosting and tearing in the final HDR images. In this paper, we propose a learning-based approach to address this problem for dynamic scenes. We use a convolutional neural network (CNN) as our learning model and present and compare three different system architectures to model the HDR merge process. Furthermore, we create a large dataset of input LDR images and their corresponding ground truth HDR images to train our system. We demonstrate the performance of our system by producing high-quality HDR images from a set of three LDR images. Experimental results show that our method consistently produces better results than several state-of-the-art approaches on challenging scenes. @article{kalantari2017hdr, author = {Nima Khademi Kalantari and Ravi Ramamoorthi}, title = {Deep High Dynamic Range Imaging of Dynamic Scenes}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {36}, number = {4}, year = {2017}, }
	Light Field Video Capture Using a Learning-Based Hybrid Imaging System Ting-Chun Wang, Jun-Yan Zhu, Nima Khademi Kalantari, Alexei Efros, Ravi Ramamoorthi ACM Transactions on Graphics (SIGGRAPH), 2017 paper \| lo-res pdf \| abstract \| bibtex \| project page Capturing light fields requires a huge bandwidth to record the data: a modern light field camera can only take three images per second. Temporal interpolation at such extreme scale is infeasible as too much information will be entirely missing between adjacent frames. Instead, we develop a hybrid imaging system, adding another standard video camera to capture the temporal information. Given a 3 fps light field sequence and a standard 30 fps 2D video, our system can then generate a full light field video at 30 fps. We adopt a learning-based approach, which can be decomposed into two steps: spatio-temporal flow estimation and appearance estimation. The flow estimation propagates the angular information from the light field sequence to the 2D video, so we can warp input images to the target view. The appearance estimation then combines these warped images to output the final pixels. The whole process is trained end-to-end using convolutional neural networks. @article{wang2017light, author = {Ting-Chun Wang and Jun-Yan Zhu and Nima Khademi Kalantari and Alexei A. Efros and Ravi Ramamoorthi}, title = {Light Field Video Capture Using a Learning-Based Hybrid Imaging System}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)}, volume = {36}, number = {4}, year = {2017}, }
	SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017 paper \| abstract \| bibtex In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera. @article{wang2017svbrdf, title={{SVBRDF}-Invariant Shape and Reflectance Estimation from Light-Field Cameras}, author={Wang, Ting-Chun and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2017}, }
	Light Field Blind Motion Deblurring Pratul P. Srinivasan, Ren Ng, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 paper \| abstract \| bibtex We study the problem of deblurring light fields of general 3D scenes captured under 3D camera motion and present both theoretical and practical contributions. By analyzing the motion-blurred light field in the primal and Fourier domains, we develop intuition into the effects of camera motion on the light field, show the advantages of capturing a 4D light field instead of a conventional 2D image for motion deblurring, and derive simple methods of motion deblurring in certain cases. We then present an algorithm to blindly deblur light fields of general scenes without any estimation of scene geometry, and demonstrate that we can recover both the sharp light field and the 3D camera motion path of real and synthetically-blurred light fields.
	Robust Energy Minimization for BRDF-Invariant Shape from Light Fields Zhengqin Li, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 paper \| abstract \| bibtex \| supplementary \| code Highly effective optimization frameworks have been developed for traditional multiview stereo relying on Lambertian photoconsistency. However, they do not account for complex material properties. On the other hand, recent works have explored PDE invariants for shape recovery with complex BRDFs, but they have not been incorporated into robust numerical optimization frameworks. We present a variational energy minimization framework for robust recovery of shape in multiview stereo with complex, unknown BRDFs. While our formulation is general, we demonstrate its efficacy on shape recovery using a single light field image, where the microlens array may be considered as a realization of a purely translational multiview stereo setup. Our formulation automatically balances contributions from texture gradients, traditional Lambertian photoconsistency, an appropriate BRDF-invariant PDE and a smoothness prior. Unlike prior works, our energy function inherently handles spatially-varying BRDFs and albedos. Extensive experiments with synthetic and real data show that our optimization framework consistently achieves errors lower than Lambertian baselines and further, is more robust than prior BRDF-invariant reconstruction methods.

	Learning-Based View Synthesis for Light Field Cameras Nima Khademi Kalantari, Ting-Chun Wang, Ravi Ramamoorthi ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 2016 paper \| abstract \| bibtex \| project page With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between the angular and spatial resolution, and thus, these cameras often sparsely sample in either spatial or angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break down the process into disparity and color estimation components. We use two sequential convolutional neural networks to model these two components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We show the performance of our approach using only four corner sub-aperture views from the light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images that are superior to the state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could potentially decrease the required angular resolution of consumer light field cameras, which allows their spatial resolution to increase. @article{LearningViewSynthesis, author = {Nima Khademi Kalantari and Ting-Chun Wang and Ravi Ramamoorthi}, title = {Learning-Based View Synthesis for Light Field Cameras}, journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2016)}, volume = {35}, number = {6}, year = {2016}, }
	A 4D Light-Field Dataset and CNN Architectures for Material Recognition Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi European Conference on Computer Vision (ECCV), 2016 paper \| abstract \| HTML comparison \| bibtex \| dataset (2D thumbnail) \| full dataset (15.9G) We introduce a new light-field dataset of materials, and take advantage of the recent success of deep learning to perform material recognition on the 4D light-field. Our dataset contains 12 material categories, each with 100 images taken with a Lytro Illum, from which we extract about 30,000 patches in total. Since recognition networks have not been trained on 4D images before, we propose and compare several novel CNN architectures to train on light-field images. In our experiments, the best performing CNN architecture achieves a 7% boost compared with 2D image classification (70% to 77%). @inproceedings{wang2016dataset, title={A {4D} light-field dataset and {CNN} architectures for material recognition}, author={Wang, Ting-Chun and Zhu, Jun-Yan and Hiroaki, Ebi and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, booktitle={Proceedings of European Conference on Computer Vision (ECCV)}, year={2016} }
	Shape Estimation from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence Michael Tao, Pratul Srinivasan, Sunil Hadap, Szymon Rusinkiewicz, Jitendra Malik, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016 paper \| abstract \| bibtex Light-field cameras are quickly becoming commodity items, with consumer and industrial applications. They capture many nearby views simultaneously using a single image with a micro-lens array, thereby providing a wealth of cues for depth recovery: defocus, correspondence, and shading. In particular, apart from conventional image shading, one can refocus images after acquisition, and shift one’s viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. We present a principled algorithm for dense depth estimation that combines defocus and correspondence metrics. We then extend our analysis to the additional cue of shading, using it to refine fine details in the shape. By exploiting an all-in-focus image, in which pixels are expected to exhibit angular coherence, we define an optimization framework that integrates photo consistency, depth consistency, and shading consistency. We show that combining all three sources of information: defocus, correspondence, and shading, outperforms state-of-the-art light-field depth estimation algorithms in multiple scenarios. @article{tao2016shape, title={Shape Estimation from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence}, author={Tao, Michael and Srinivasan, Pratul and Hadap, Sunil and Rusinkiewicz, Szymon and Malik, Jitendra and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2016}, }
	SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (oral presentation) paper \| abstract \| supplementary \| HTML comparison \| bibtex In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera. @inproceedings{wang2016svbrdf, title={SVBRDF-invariant shape and reflectance estimation from light-field cameras}, author={Wang, Ting-Chun and Chandraker, Manmohan and Efros, Alexei and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2016} }
	Depth from Semi-Calibrated Stereo and Defocus Ting-Chun Wang, Manohar Srikanth, Ravi Ramamoorthi IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 paper \| abstract \| HTML comparison \| bibtex In this work, we propose a multi-camera system where we combine a main high-quality camera with two low-res auxiliary cameras. The auxiliary cameras are well calibrated and act as a passive depth sensor by generating disparity maps. The main camera has an interchangeable lens and can produce good quality images at high resolution. Our goal is, given the low-res depth map from the auxiliary cameras, to generate a depth map from the viewpoint of the main camera. The advantage of our system, compared to other systems such as light-field cameras or RGBD sensors, is the ability to generate a high-resolution color image with a complete depth map, without sacrificing resolution and with minimal auxiliary hardware. @inproceedings{wang2016semi, title={Depth from semi-calibrated stereo and defocus}, author={Wang, Ting-Chun and Srikanth, Manohar and Ramamoorthi, Ravi}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2016} }
	Depth Estimation with Occlusion Modeling Using Light-field Cameras Ting-Chun Wang, Alexei Efros, Ravi Ramamoorthi Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016 paper \| abstract \| bibtex In this paper, an occlusion-aware depth estimation algorithm is developed; the method also enables identification of occlusion edges, which may be useful in other applications. It can be shown that although photo-consistency is not preserved for pixels at occlusions, it still holds in approximately half the viewpoints. Moreover, the line separating the two view regions (occluded object vs. occluder) has the same orientation as that of the occlusion edge in the spatial domain. By ensuring photo-consistency in only the occluded view region, depth estimation can be improved. @article{wang2016depth, title={Depth estimation with occlusion modeling using light-field cameras}, author={Wang, Ting-Chun and Efros, Alexei and Ramamoorthi, Ravi}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, year={2016}, }