Welcome to our light-field website!

This is the webpage for light-field related research in Prof. Ravi Ramamoorthi's lab, which is affiliated with both UC San Diego and UC Berkeley.
It includes all of the lab's light-field papers (e.g., depth estimation) published in recent top conferences and journals.
If you compare against any of these algorithms and/or use the datasets, please also cite the appropriate papers.

***For decoding the Lytro raw input, we recommend using Lytro's official software.
Donald's decoder is also very useful and does not require any registration.***

2021



Deep 3D Mask Volume for View Synthesis of Dynamic Scenes
Kai-En Lin, Lei Xiao, Feng Liu, Guowei Yang, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2021

paper | video | abstract | bibtex

We develop a new algorithm, Deep 3D Mask Volume, which enables temporally stable view extrapolation from binocular videos of dynamic scenes, captured by static cameras. Our algorithm addresses the temporal inconsistency of disocclusions by identifying the error-prone areas with a 3D mask volume, and replaces them with static background observed throughout the video.

@inproceedings {lin2021deep,
      title = {Deep 3D Mask Volume for View Synthesis of Dynamic Scenes},
      author = {Kai-En Lin and Lei Xiao and Feng Liu and Guowei Yang and Ravi Ramamoorthi},
      booktitle = {ICCV},
      year = {2021},
}
                                          

NeLF: Neural Light-transport Field for Portrait View Synthesis and Relighting
Tiancheng Sun*, Kai-En Lin*, Sai Bi, Zexiang Xu, Ravi Ramamoorthi
Eurographics Symposium on Rendering (EGSR), 2021

paper | video | abstract | bibtex

We present a system for portrait view synthesis and relighting: given multiple portraits, we use a neural network to predict the light-transport field in 3D space, and from the predicted Neural Light-transport Field (NeLF) produce a portrait from a new camera view under a new environmental lighting.

@inproceedings {sun2021nelf,
      booktitle = {Eurographics Symposium on Rendering},
      title = {NeLF: Neural Light-transport Field for Portrait View Synthesis and Relighting},
      author = {Sun, Tiancheng and Lin, Kai-En and Bi, Sai and Xu, Zexiang and Ramamoorthi, Ravi},
      year = {2021},
}
                                          

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall*, Pratul Srinivasan*, Matthew Tancik*, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
Communications of the ACM (CACM), 2021

paper | video | abstract | bibtex

In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.

@article{mildenhall2021nerf,
      title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
      author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik 
            and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
      journal={Communications of the ACM (CACM)},
      year={2021}
}
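
At the core of NeRF is numerical volume rendering: an MLP maps a 3D position and viewing direction to a volume density and color, and a ray's color is the transmittance-weighted sum of the sampled colors. Below is a minimal numpy sketch of that compositing step only; the MLP, the sampling strategy, and all variable names are illustrative assumptions of this sketch, not the authors' released code.

import numpy as np

def composite_ray(sigmas, colors, deltas):
    # sigmas: (N,) densities, colors: (N, 3) RGB, deltas: (N,) sample spacings along one ray
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # transmittance to each sample
    weights = trans * alphas                                         # compositing weights
    return (weights[:, None] * colors).sum(axis=0)                   # expected ray color

# toy usage with 64 random samples along a ray
rgb = composite_ray(np.random.rand(64), np.random.rand(64, 3), np.full(64, 0.05))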
                                                

Neural Light Transport for Relighting and View Synthesis
Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman
ACM Transactions on Graphics (SIGGRAPH), 2021

paper | video | abstract | bibtex

We propose a semi-parametric approach for learning a neural representation of the light transport of a scene. The light transport is embedded in a texture atlas of known but possibly rough geometry. We model all non-diffuse and global light transport as residuals added to a physically-based diffuse base rendering.

  @article{zhang2021neural,
    title={Neural light transport for relighting and view synthesis},
    author={Xiuming Zhang and Sean Fanello and Yun-Ta Tsai
            and Tiancheng Sun and Tianfan Xue and Rohit Pandey
            and Sergio Orts-Escolano and Philip Davidson
            and Christoph Rhemann and Paul Debevec and Jonathan T. Barron
            and Ravi Ramamoorthi and William T. Freeman},
    journal={ACM Transactions on Graphics (TOG)},
    year={2021},
  }
                                                

2020



Neural Reflectance Fields for Appearance Acquisition
Sai Bi*, Zexiang Xu*, Pratul Srinivasan, Ben Mildenhall, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi
arxiv preprint, 2020

paper | abstract | bibtex

We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene using a fully-connected neural network.

@article{bi2020neural,
      title={Neural reflectance fields for appearance acquisition},
      author={Bi, Sai and Xu, Zexiang and Srinivasan, Pratul
      and Mildenhall, Ben and Sunkavalli, Kalyan 
      and Ha{\v{s}}an, Milo{\v{s}} and Hold-Geoffroy, Yannick 
      and Kriegman, David and Ramamoorthi, Ravi},
      journal={arXiv preprint arXiv:2008.03824},
      year={2020}
}
                                          

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall*, Pratul Srinivasan*, Matthew Tancik*, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
European Conference on Computer Vision (ECCV), 2020

paper | video | abstract | bibtex

In this paper, we present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.

@inproceedings{mildenhall2020nerf,
      title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
      author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik 
            and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
      booktitle={Proceedings of European Conference on 
            Computer Vision (ECCV)},
      year={2020}
}
                                                

Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images
Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi
European Conference on Computer Vision (ECCV), 2020

paper | video | abstract | bibtex

We develop a novel volumetric scene representation for reconstruction from unstructured images. Our representation consists of opacity, surface normal and reflectance voxel grids. We present a novel physically-based differentiable volume ray marching framework to render these scene volumes under arbitrary viewpoint and lighting.

@inproceedings{bi2020drv,
      title={Deep Reflectance Volumes: Relightable Reconstructions from 
            Multi-View Photometric Images},
      author={Sai Bi and Zexiang Xu and Kalyan Sunkavalli and Miloš Hašan 
            and Yannick Hold-Geoffroy and David Kriegman and Ravi Ramamoorthi},
      booktitle={Proceedings of European Conference on 
            Computer Vision (ECCV)},
      year={2020},
}
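
The renderer behind this representation is volume ray marching through the opacity, normal, and reflectance grids, compositing shading front to back. The numpy sketch below is a deliberately simplified illustration of that loop, using nearest-neighbor voxel lookups, a single directional light, and a purely diffuse shading term (all simplifications of this sketch rather than the paper's full physically-based model):

import numpy as np

def march_ray(origin, direction, alpha_grid, normal_grid, albedo_grid,
              light_dir, n_steps=128, step=0.01):
    # Grids live in the unit cube: alpha_grid (X, Y, Z), normal_grid / albedo_grid (X, Y, Z, 3).
    color, trans = np.zeros(3), 1.0
    for i in range(n_steps):
        p = origin + (i + 0.5) * step * direction
        if np.any(p < 0.0) or np.any(p >= 1.0):
            continue                                               # sample falls outside the volume
        idx = tuple((p * np.array(alpha_grid.shape)).astype(int))  # nearest-neighbor voxel lookup
        a = float(alpha_grid[idx])                                 # per-sample opacity
        diffuse = albedo_grid[idx] * max(0.0, float(np.dot(normal_grid[idx], light_dir)))
        color += trans * a * diffuse                               # front-to-back compositing
        trans *= 1.0 - a
    return color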
            
                              

Deep Multi Depth Panoramas for View Synthesis
Kai-En Lin, Zexiang Xu, Ben Mildenhall, Pratul Srinivasan, Yannick Hold-Geoffroy, Stephen DiVerdi, Qi Sun, Kalyan Sunkavalli, Ravi Ramamoorthi
European Conference on Computer Vision (ECCV), 2020

paper | video | abstract | bibtex

We propose a learning-based approach for novel view synthesis for multi-camera 360 degree panorama capture rigs. We present a novel scene representation, Multi Depth Panorama (MDP), that consists of multiple RGBD alpha panoramas that represent both scene geometry and appearance.

@inproceedings{lin2020deep,
      title={Deep Multi Depth Panoramas for View Synthesis},
      author={Kai-En Lin and Zexiang Xu and Ben Mildenhall and Pratul P. Srinivasan 
            and Yannick Hold-Geoffroy and Stephen DiVerdi and Qi Sun 
            and Kalyan Sunkavalli and Ravi Ramamoorthi},
      booktitle={Proceedings of European Conference on 
            Computer Vision (ECCV)},
      year={2020},
}
            
                              

Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images
Sai Bi, Zexiang Xu, Kalyan Sunkavalli, David Kriegman, Ravi Ramamoorthi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

paper | video | abstract | bibtex

We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object from a sparse set of only six images captured by wide-baseline cameras under collocated point lighting. We construct high-quality geometry and per-vertex BRDFs.

@inproceedings{bi2020deep3d,
      title={Deep 3D Capture: Geometry and Reflectance from Sparse 
            Multi-View Images},
      author={Bi, Sai and Xu, Zexiang and Sunkavalli, Kalyan 
            and Kriegman, David and Ramamoorthi, Ravi},
      booktitle={Proceedings of the IEEE/CVF Conference on 
            Computer Vision and Pattern Recognition},
      pages={5960--5969},
      year={2020}
}
                              

2019



Deep Recurrent Network for Fast and Full-Resolution Light Field Deblurring
Jonathan Samuel Lumentut, Tae Hyun Kim, Ravi Ramamoorthi, In Kyu Park
IEEE Signal Processing Letters, 2019

paper | abstract | bibtex

We propose a novel light field recurrent deblurring network that is trained under 6 degree-of-freedom camera motion-blur model. By combining the real light field captured using Lytro Illum and synthetic light field rendering of 3D scenes from UnrealCV, we provide a large-scale blurry light field dataset to train the network.

@article{lumentut2019lf,
      author={J. S. {Lumentut} and T. H. {Kim} and R. {Ramamoorthi} 
            and I. K. {Park}},
      journal={IEEE Signal Processing Letters}, 
      title={Deep Recurrent Network for Fast and Full-Resolution 
            Light Field Deblurring}, 
      year={2019},
      volume={26},
      number={12},
      pages={1788-1792}
}
                              

Deep View Synthesis from Sparse Photometric Images
Zexiang Xu, Sai Bi, Kalyan Sunkavalli, Sunil Hadap, Hao Su, Ravi Ramamoorthi
ACM Transactions on Graphics (SIGGRAPH), 2019

paper | supplementary | video | abstract | bibtex

In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60 degree cone) from a sparse set of just six viewing directions. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view per-depth plane attention map prediction network to effectively aggregate multi-view appearance.

@article{xu2019deepviewsyn,
   author  = {Zexiang Xu and Sai Bi and Kalyan Sunkavalli and Sunil Hadap and Hao Su
              and Ravi Ramamoorthi},
   title   = {Deep View Synthesis from Sparse Photometric Images},
   journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)},
   volume  = {38},
   number  = {4},
   year    = {2019}
}
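
The 3D convolutions above operate on a plane sweep volume: every input photograph is warped onto a set of fronto-parallel depth planes of the reference view via a plane-induced homography before the network aggregates them. A small sketch of that warping step using OpenCV is below; the camera convention (that [R|t] maps reference-camera coordinates to source-camera coordinates, and that both views share a resolution) is an assumption of this sketch, not taken from the paper's code.

import numpy as np
import cv2

def plane_sweep_volume(src_img, K_ref, K_src, R, t, depths):
    # For a point on the plane z_ref = d:  x_src = R x_ref + t = (R + t n^T / d) x_ref,  n = (0, 0, 1)^T.
    h, w = src_img.shape[:2]
    n = np.array([[0.0, 0.0, 1.0]])
    planes = []
    for d in depths:
        H = K_src @ (R + (t.reshape(3, 1) @ n) / d) @ np.linalg.inv(K_ref)
        # H maps reference pixels to source pixels, so sample the source with WARP_INVERSE_MAP.
        planes.append(cv2.warpPerspective(src_img, H, (w, h), flags=cv2.WARP_INVERSE_MAP))
    return np.stack(planes)   # (num_depths, H, W, 3), one warped view per depth hypothesis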
      

Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines
Ben Mildenhall*, Pratul Srinivasan*, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar
ACM Transactions on Graphics (SIGGRAPH), 2019

paper | YouTube | project page | abstract | bibtex

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. We propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm.

        @article{mildenhall2019llff,
          title={Local Light Field Fusion: Practical View Synthesis with 
          Prescriptive Sampling Guidelines},
          author={Ben Mildenhall and Pratul P. Srinivasan and Rodrigo Ortiz-Cayon and 
          Nima Khademi Kalantari and Ravi Ramamoorthi and Ren Ng and Abhishek Kar},
          journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)},
           volume  = {38},
           number  = {4},
          year={2019}
        }
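
Each local light field above is stored as a multiplane image: a stack of fronto-parallel RGBA planes that is rendered into a nearby viewpoint by reprojecting every plane (one homography per plane) and alpha-compositing back to front. A minimal numpy sketch of the compositing half of that renderer follows; the reprojection step is omitted, and the far-to-near plane ordering is an assumed convention of this sketch.

import numpy as np

def render_mpi(rgba_planes):
    # rgba_planes: (D, H, W, 4) planes ordered far to near, values in [0, 1].
    out = np.zeros(rgba_planes.shape[1:3] + (3,))
    for plane in rgba_planes:                  # far -> near
        rgb, a = plane[..., :3], plane[..., 3:4]
        out = rgb * a + out * (1.0 - a)        # standard "over" compositing
    return out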
      

Pushing the Boundaries of View Extrapolation with Multiplane Images
Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, Noah Snavely
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

paper | with appendices | YouTube | video | abstract | bibtex

We present a theoretical analysis showing how the range of views that can be rendered from a multi-plane image (MPI) increases linearly with the MPI disparity sampling frequency, as well as a novel MPI prediction procedure that theoretically enables view extrapolations of up to 4x the lateral viewpoint movement allowed by prior work.


      

Deep HDR Video from Sequences with Alternating Exposures
Nima Khademi Kalantari, Ravi Ramamoorthi
EUROGRAPHICS, 2019

paper | video | abstract | bibtex

A practical way to generate a high dynamic range (HDR) video using off-the-shelf cameras is to capture a sequence with alternating exposures and reconstruct the missing content at each frame. Unfortunately, existing approaches are typically slow and are not able to handle challenging cases. In this paper, we propose a learning-based approach to address this difficult problem. To do this, we use two sequential convolutional neural networks (CNN) to model the entire HDR video reconstruction process.


      

2017



Learning to Synthesize a 4D RGBD Light Field from a Single Image
Pratul P. Srinivasan, Tongzhou Wang, Ashwin Sreelal, Ravi Ramamoorthi, Ren Ng
International Conference on Computer Vision (ICCV), 2017

paper | supplementary | video | abstract | bibtex

We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN) that estimates scene geometry, a stage that renders a Lambertian light field using that geometry, and a second CNN that predicts occluded rays and non-Lambertian effects. Our algorithm builds on recent view synthesis methods, but is unique in predicting RGBD for each light field ray and improving unsupervised single image depth estimation by enforcing consistency of ray depths that should intersect the same scene point.

@inproceedings{pratul2017lightField,
   author  = {Pratul P. Srinivasan and Tongzhou Wang and Ashwin Sreelal
              and Ravi Ramamoorthi and Ren Ng},
   title   = {Learning to Synthesize a 4D RGBD Light Field from a Single Image},
   booktitle = {International Conference on Computer Vision (ICCV)},
   year    = {2017}
}
      

Depth and Image Restoration from Light Field in a Scattering Medium
Jiandong Tian, Zak Murez, Tong Cui, Zhen Zhang, David Kriegman, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2017

paper | abstract | bibtex

Traditional imaging methods and computer vision algorithms are often ineffective when images are acquired in scattering media, such as underwater, fog, and biological tissue. Here, we explore the use of light field imaging and algorithms for image restoration and depth estimation that address the image degradation from the medium. Towards this end, we make the following three contributions. First, we present a new single image restoration algorithm which removes backscatter and attenuation from images better than existing methods do, and apply it to each view in the light field. Second, we combine a novel transmission based depth cue with existing correspondence and defocus cues to improve light field depth estimation. In densely scattering media, our transmission depth cue is critical for depth estimation since the images have low signal to noise ratios which significantly degrades the performance of the correspondence and defocus cues. Finally, we propose shearing and refocusing multiple views of the light field to recover a single image of higher quality than what is possible from a single view. We demonstrate the benefits of our method through extensive experimental results in a water tank.

@inproceedings{tian2017light,
   author  = {Jiandong Tian and Zak Murez and Tong Cui and Zhen Zhang
              and David Kriegman and Ravi Ramamoorthi},
   title   = {Depth and Image Restoration from Light Field in a Scattering Medium},
   booktitle = {International Conference on Computer Vision (ICCV)},
   year    = {2017}
}
      

Deep High Dynamic Range Imaging of Dynamic Scenes
Nima Khademi Kalantari, Ravi Ramamoorthi
ACM Transactions on Graphics (SIGGRAPH), 2017

paper | abstract | bibtex | project page

Producing a high dynamic range (HDR) image from a set of images with different exposures is a challenging process for dynamic scenes. A category of existing techniques first register the input images to a reference image and then merge the aligned images into an HDR image. However, the artifacts of the registration usually appear as ghosting and tearing in the final HDR images. In this paper, we propose a learning-based approach to address this problem for dynamic scenes. We use a convolutional neural network (CNN) as our learning model and present and compare three different system architectures to model the HDR merge process. Furthermore, we create a large dataset of input LDR images and their corresponding ground truth HDR images to train our system. We demonstrate the performance of our system by producing high-quality HDR images from a set of three LDR images. Experimental results show that our method consistently produces better results than several state-of-the-art approaches on challenging scenes.

@article{kalantari2017hdr,
   author  = {Nima Khademi Kalantari and Ravi Ramamoorthi},
   title   = {Deep High Dynamic Range Imaging of Dynamic Scenes},
   journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)},
   volume  = {36},
   number  = {4},
   year    = {2017},
}
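
For context, the classical merge that this learned system replaces maps each aligned LDR image into linear radiance using its exposure time and then takes a well-exposedness-weighted average. A sketch of that baseline is below; the gamma-2.2 linearization and the triangle weighting are assumptions of this sketch, not the paper's exact choices.

import numpy as np

def merge_ldr_to_hdr(ldr_images, exposure_times, gamma=2.2):
    # ldr_images: list of aligned (H, W, 3) arrays in [0, 1]; exposure_times: matching list of floats.
    num = np.zeros_like(ldr_images[0], dtype=np.float64)
    den = np.zeros_like(num)
    for ldr, t in zip(ldr_images, exposure_times):
        radiance = (ldr ** gamma) / t              # linearize and normalize by exposure time
        weight = 1.0 - np.abs(2.0 * ldr - 1.0)     # trust well-exposed (mid-tone) pixels most
        num += weight * radiance
        den += weight
    return num / np.maximum(den, 1e-8)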
      

Light Field Video Capture Using a Learning-Based Hybrid Imaging System
Ting-Chun Wang, Jun-Yan Zhu, Nima Khademi Kalantari, Alexei Efros, Ravi Ramamoorthi
ACM Transactions on Graphics (SIGGRAPH), 2017

paper | lo-res pdf | abstract | bibtex | project page

Capturing light fields requires a huge bandwidth to record the data: a modern light field camera can only take three images per second. Temporal interpolation at such extreme scale is infeasible as too much information will be entirely missing between adjacent frames. Instead, we develop a hybrid imaging system, adding another standard video camera to capture the temporal information. Given a 3 fps light field sequence and a standard 30 fps 2D video, our system can then generate a full light field video at 30 fps. We adopt a learning-based approach, which can be decomposed into two steps: spatio-temporal flow estimation and appearance estimation. The flow estimation propagates the angular information from the light field sequence to the 2D video, so we can warp input images to the target view. The appearance estimation then combines these warped images to output the final pixels. The whole process is trained end-to-end using convolutional neural networks.

@article{wang2017light,
   author  = {Ting-Chun Wang and Jun-Yan Zhu and Nima Khademi Kalantari 
              and Alexei A. Efros and Ravi Ramamoorthi},
   title   = {Light Field Video Capture Using a Learning-Based Hybrid 
              Imaging System},
   journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH)},
   volume  = {36},
   number  = {4},
   year    = {2017},
}
      

SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras
Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017

paper | abstract | bibtex

In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera.

@article{wang2017svbrdf,
   title={{SVBRDF}-Invariant Shape and Reflectance 
   Estimation from Light-Field Cameras},
   author={Wang, Ting-Chun and Chandraker, Manmohan
   and Efros, Alexei and Ramamoorthi, Ravi},
   journal={IEEE Transactions on Pattern 
   Analysis and Machine Intelligence (TPAMI)},
   year={2017},
}
      

Light Field Blind Motion Deblurring
Pratul P. Srinivasan, Ren Ng, Ravi Ramamoorthi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

paper | abstract | bibtex

We study the problem of deblurring light fields of general 3D scenes captured under 3D camera motion and present both theoretical and practical contributions. By analyzing the motion-blurred light field in the primal and Fourier domains, we develop intuition into the effects of camera motion on the light field, show the advantages of capturing a 4D light field instead of a conventional 2D image for motion deblurring, and derive simple methods of motion deblurring in certain cases. We then present an algorithm to blindly deblur light fields of general scenes without any estimation of scene geometry, and demonstrate that we can recover both the sharp light field and the 3D camera motion path of real and synthetically-blurred light fields.


      

Robust Energy Minimization for BRDF-Invariant Shape from Light Fields
Zhengqin Li, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

paper | abstract | bibtex | supplementary | code

Highly effective optimization frameworks have been developed for traditional multiview stereo relying on Lambertian photoconsistency. However, they do not account for complex material properties. On the other hand, recent works have explored PDE invariants for shape recovery with complex BRDFs, but they have not been incorporated into robust numerical optimization frameworks. We present a variational energy minimization framework for robust recovery of shape in multiview stereo with complex, unknown BRDFs. While our formulation is general, we demonstrate its efficacy on shape recovery using a single light field image, where the microlens array may be considered as a realization of a purely translational multiview stereo setup. Our formulation automatically balances contributions from texture gradients, traditional Lambertian photoconsistency, an appropriate BRDF-invariant PDE and a smoothness prior. Unlike prior works, our energy function inherently handles spatially-varying BRDFs and albedos. Extensive experiments with synthetic and real data show that our optimization framework consistently achieves errors lower than Lambertian baselines and further, is more robust than prior BRDF-invariant reconstruction methods.

      

2016



Learning-Based View Synthesis for Light Field Cameras
Nima Khademi Kalantari, Ting-Chun Wang, Ravi Ramamoorthi
ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 2016

paper | abstract | bibtex | project page

With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between the angular and spatial resolution, and thus, these cameras often sparsely sample in either spatial or angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break down the process into disparity and color estimation components. We use two sequential convolutional neural networks to model these two components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We show the performance of our approach using only four corner sub-aperture views from the light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images that are superior to the state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could potentially decrease the required angular resolution of consumer light field cameras, which allows their spatial resolution to increase.

@article{LearningViewSynthesis,
   author  = {Nima Khademi Kalantari and 
   Ting-Chun Wang and Ravi Ramamoorthi},
   title   = {Learning-Based View Synthesis 
   for Light Field Cameras},
   journal = {ACM Transactions on Graphics 
   (Proceedings of SIGGRAPH Asia 2016)},
   volume  = {35},
   number  = {6},
   year    = {2016},
}
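
Concretely, the first (disparity) network predicts a disparity map at the novel view, which is used to backward-warp each of the four corner sub-aperture images before the second (color) network blends them. A rough numpy/scipy sketch of that warping step follows; the sign convention of the angular offset and the bilinear sampling choice are assumptions of this sketch.

import numpy as np
from scipy.ndimage import map_coordinates

def warp_corner_view(img, disparity, du, dv):
    # img: (H, W, 3) corner sub-aperture image; disparity: (H, W) estimated at the novel view;
    # (du, dv): angular offset from the novel view to this corner, in sub-aperture units.
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    src_y = ys + dv * disparity                # where each novel-view pixel lands in the corner view
    src_x = xs + du * disparity
    return np.stack([map_coordinates(img[..., c], [src_y, src_x], order=1, mode='nearest')
                     for c in range(3)], axis=-1)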
      

A 4D Light-Field Dataset and CNN Architectures for Material Recognition
Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi
European Conference on Computer Vision (ECCV), 2016

paper | abstract | HTML comparison | bibtex | dataset (2D thumbnail)
full dataset (15.9G)

We introduce a new light-field dataset of materials, and take advantage of the recent success of deep learning to perform material recognition on the 4D light-field. Our dataset contains 12 material categories, each with 100 images taken with a Lytro Illum, from which we extract about 30,000 patches in total. Since recognition networks have not been trained on 4D images before, we propose and compare several novel CNN architectures to train on light-field images. In our experiments, the best performing CNN architecture achieves a 7% boost compared with 2D image classification (70% to 77%).

@inproceedings{wang2016dataset,
   title={A {4D} light-field dataset and {CNN} 
   architectures for material recognition},
   author={Wang, Ting-Chun and Zhu, Jun-Yan 
   and Hiroaki, Ebi and Chandraker, Manmohan 
   and Efros, Alexei and Ramamoorthi, Ravi},
   booktitle={Proceedings of European Conference on 
   Computer Vision (ECCV)},
   year={2016}
}
      

Shape Estimation from Shading, Defocus, and Correspondence Using Light-Field Angular Coherence
Michael Tao, Pratul Srinivasan, Sunil Hadap, Szymon Rusinkiewicz, Jitendra Malik, Ravi Ramamoorthi
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016

paper | abstract | bibtex

Light-field cameras are quickly becoming commodity items, with consumer and industrial applications. They capture many nearby views simultaneously using a single image with a micro-lens array, thereby providing a wealth of cues for depth recovery: defocus, correspondence, and shading. In particular, apart from conventional image shading, one can refocus images after acquisition, and shift one’s viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. We present a principled algorithm for dense depth estimation that combines defocus and correspondence metrics. We then extend our analysis to the additional cue of shading, using it to refine fine details in the shape. By exploiting an all-in-focus image, in which pixels are expected to exhibit angular coherence, we define an optimization framework that integrates photo consistency, depth consistency, and shading consistency. We show that combining all three sources of information: defocus, correspondence, and shading, outperforms state-of-the-art light-field depth estimation algorithms in multiple scenarios.

@article{tao2016shape,
   title={Shape Estimation from Shading, Defocus, and 
   Correspondence Using Light-Field Angular Coherence},
   author={Tao, Michael and Srinivasan, Pratul 
   and Hadap, Sunil and Rusinkiewicz, Szymon 
   and Malik, Jitendra and Ramamoorthi, Ravi},
   journal={IEEE Transactions on Pattern 
   Analysis and Machine Intelligence (TPAMI)},
   year={2016},
}
      

SVBRDF-Invariant Shape and Reflectance Estimation from Light-Field Cameras
Ting-Chun Wang, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
(oral presentation)

paper | abstract | supplementary | HTML comparison | bibtex

In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera.

@inproceedings{wang2016svbrdf,
   title={SVBRDF-invariant shape and reflectance 
   estimation from light-field cameras},
   author={Wang, Ting-Chun and Chandraker, Manmohan 
   and Efros, Alexei and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE Conference on 
   Computer Vision and Pattern Recognition (CVPR)},
   year={2016}
}
      

Depth from Semi-Calibrated Stereo and Defocus
Ting-Chun Wang, Manohar Srikanth, Ravi Ramamoorthi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

paper | abstract | HTML comparison | bibtex

In this work, we propose a multi-camera system where we combine a main high-quality camera with two low-res auxiliary cameras. The auxiliary cameras are well calibrated and act as a passive depth sensor by generating disparity maps. The main camera has an interchangeable lens and can produce good quality images at high resolution. Our goal is, given the low-res depth map from the auxiliary cameras, generate a depth map from the viewpoint of the main camera. The advantage of our system, compared to other systems such as light-field cameras or RGBD sensors, is the ability to generate a high-resolution color image with a complete depth map, without sacrificing resolution and with minimal auxiliary hardware.

@inproceedings{wang2016semi,
   title={Depth from semi-calibrated stereo and defocus},
   author={Wang, Ting-Chun and Srikanth, Manohar
   and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE Conference on 
   Computer Vision and Pattern Recognition (CVPR)},
   year={2016}
}
      

Depth Estimation with Occlusion Modeling Using Light-field Cameras
Ting-Chun Wang, Alexei Efros, Ravi Ramamoorthi
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016

paper | abstract | bibtex

In this paper, an occlusion-aware depth estimation algorithm is developed; the method also enables identification of occlusion edges, which may be useful in other applications. It can be shown that although photo-consistency is not preserved for pixels at occlusions, it still holds in approximately half the viewpoints. Moreover, the line separating the two view regions (occluded object vs. occluder) has the same orientation as that of the occlusion edge in the spatial domain. By ensuring photo-consistency in only the occluded view region, depth estimation can be improved.

@article{wang2016depth,
   title={Depth estimation with occlusion modeling 
   using light-field cameras},
   author={Wang, Ting-Chun and Efros, Alexei and 
   Ramamoorthi, Ravi},
   journal={IEEE Transactions on Pattern 
   Analysis and Machine Intelligence (TPAMI)},
   year={2016},
}
      

2015



Occlusion-aware depth estimation using light-field cameras
Ting-Chun Wang, Alexei Efros, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2015

paper | abstract | bibtex | supp
code | dataset (3.3GB)

In this paper, we develop a depth estimation algorithm for light field cameras that treats occlusion explicitly; the method also enables identification of occlusion edges, which may be useful in other applications. We show that, although pixels at occlusions do not preserve photo-consistency in general, they are still consistent in approximately half the viewpoints.

@inproceedings{wang2015occlusion,
   title={Occlusion-aware depth estimation using 
   light-field cameras},
   author={Wang, Ting-Chun and Efros, Alexei and 
   Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV)},
   year={2015}
}
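
The key observation can be turned into a data cost directly: at a candidate depth, gather the colors a pixel receives from every viewpoint, split that angular patch along a line with the same orientation as the spatial occlusion edge, and trust whichever half is more photo-consistent. A simplified numpy illustration of that cost is below; the patch layout and edge-orientation convention are assumptions of this sketch.

import numpy as np

def occlusion_aware_cost(angular_patch, edge_theta):
    # angular_patch: (U, V, 3) colors gathered for one pixel at a candidate depth;
    # edge_theta: orientation (radians) of the occlusion edge in the spatial domain.
    U, V, _ = angular_patch.shape
    u, v = np.mgrid[0:U, 0:V].astype(np.float64)
    u, v = u - (U - 1) / 2.0, v - (V - 1) / 2.0
    side = (np.cos(edge_theta) * v - np.sin(edge_theta) * u) >= 0     # split the views into two halves
    costs = [angular_patch[mask].var(axis=0).mean() for mask in (side, ~side) if mask.any()]
    return min(costs)    # the unoccluded half should remain photo-consistent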
      

Oriented Light-Field Windows for Scene Flow
Pratul Srinivasan, Michael Tao, Ren Ng, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2015

paper | abstract | bibtex | code (152MB)

For Lambertian surfaces focused to the correct depth, the 2D distribution of angular rays from a pixel remains consistent. We build on this idea to develop an oriented 4D light-field window that accounts for shearing (depth), translation (matching), and windowing. Our main application is to scene flow, a generalization of optical flow to the 3D vector field describing the motion of each point in the scene.

@inproceedings{srinivasan2015oriented,
   title={Oriented Light-Field Windows for Scene Flow},
   author={Srinivasan, Pratul and Tao, Michael 
   and Ng, Ren and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV)},
   year={2015}
}
      

Depth from Shading, Defocus, and Correspondence using Light-field Angular Coherence
Michael Tao, Pratul Srinivasan, Jitendra Malik, Szymon Rusinkiewicz, Ravi Ramamoorthi
Conference on Computer Vision and Pattern Recognition (CVPR), 2015

paper | abstract | bibtex | code (72MB)

Using shading information is essential to improve shape estimation from light field cameras. We develop an improved technique for local shape estimation from defocus and correspondence cues, and show how shading can be used to further refine the depth. We show that the angular pixels have angular coherence, which exhibits three properties: photoconsistency, depth consistency, and shading consistency.

@inproceedings{tao2015shading,
   title={Depth from Shading, Defocus, and 
   Correspondence Using Light-Field Angular Coherence},
   author={Tao, Michael W and Srinivasan, Pratul P 
   and Malik, Jitendra and Rusinkiewicz, Szymon 
   and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE Conference on 
   Computer Vision and Pattern Recognition (CVPR)},
   year={2015}
}
      

A Light Transport Framework for Lenslet Light Field Cameras
Chia-Kai Liang, Ravi Ramamoorthi
ACM Transactions on Graphics (TOG), 2015

paper | abstract | bibtex

It is often stated that there is a fundamental tradeoff between spatial and angular resolution of lenslet light field cameras, but there has been limited understanding of this tradeoff theoretically or numerically. In this paper, we develop a light transport framework for understanding the fundamental limits of light field camera resolution.

@article{liang2015light,
  title={A light transport framework for lenslet light field cameras},
  author={Liang, Chia-Kai and Ramamoorthi, Ravi},
  journal={ACM Transactions on Graphics (TOG)},
  volume={34},
  number={2},
  pages={16},
  year={2015}
}
      

Depth estimation and specular removal for glossy surfaces using point and line consistency with light-field cameras
Michael Tao, Jong-Chyi Su, Ting-Chun Wang, Jitendra Malik, Ravi Ramamoorthi
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015

paper | abstract | bibtex
code (5.2MB) | dataset (1.1GB)

Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. However, current light-field depth estimation methods are designed for Lambertian objects and fail or degrade for glossy or specular surfaces. In this paper, we present a novel theory of the relationship between light-field data and reflectance from the dichromatic model.

@article{tao2015specular,
title={Depth Estimation and Specular Removal for 
   Glossy Surfaces Using Point and Line Consistency 
   with Light-Field Cameras},
   author={Tao, Michael and Su, Jong-Chyi and 
   Wang, Ting-Chun and Malik, Jitendra 
   and Ramamoorthi, Ravi},
   journal={IEEE Transactions on Pattern 
   Analysis and Machine Intelligence (TPAMI)},
   year={2015},
}
      

2014



Depth estimation for glossy surfaces with light-field cameras
Michael Tao, Ting-Chun Wang, Jitendra Malik, Ravi Ramamoorthi
ECCV Workshop on Light Fields for Computer Vision (L4CV), 2014

paper | abstract | bibtex
open source decoder for Lytro Illum (44MB)

Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. In this paper, we develop an iterative approach to use the benefits of light-field data to estimate and remove the specular component, improving the depth estimation. The approach enables light-field data depth estimation to support both specular and diffuse scenes.

@inproceedings{tao2014glossy,
   title={Depth estimation for glossy surfaces with 
   light-field cameras},
   author={Tao, Michael W and Wang, Ting-Chun 
   and Malik, Jitendra and Ramamoorthi, Ravi},
   booktitle={Proceedings of the IEEE European 
   Conference on Computer Vision Workshops (ECCVW)},
   year={2014},
}
      

2013



Depth from Combining Defocus and Correspondence Using Light-Field Cameras
Michael Tao, Sunil Hadap, Jitendra Malik, Ravi Ramamoorthi
International Conference on Computer Vision (ICCV), 2013

paper | abstract | bibtex | supp | video (38MB)
code (8.8MB) | dataset (83MB)

Light-field cameras have recently become available to the consumer market. An array of micro-lenses captures enough information that one can refocus images after acquisition, as well as shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. Thus, depth cues from both defocus and correspondence are available simultaneously in a single capture, and we show how to exploit both by analyzing the EPI.

@inproceedings{tao2013depth,
   author={Tao, Michael and Hadap, Sunil
   and Malik, Jitendra and Ramamoorthi, Ravi},
   title={Depth from combining defocus and 
   correspondence using light-field cameras},	
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV)},
   year={2013},
}
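
Both cues come from shearing the 4D light field for each candidate depth: averaging the sheared sub-aperture views gives a refocused image whose local contrast is the defocus cue, while the variance across the sheared views is the correspondence cue. A simplified numpy/scipy sketch under the two-plane parameterization follows; the shift convention and the gradient-based contrast measure are choices of this sketch, not the paper's exact operators.

import numpy as np
from scipy.ndimage import shift as nd_shift

def depth_cues(lightfield, alpha):
    # lightfield: (U, V, H, W) grayscale sub-aperture views; alpha: shear amount for one candidate depth.
    U, V, H, W = lightfield.shape
    u0, v0 = (U - 1) / 2.0, (V - 1) / 2.0
    sheared = np.empty_like(lightfield, dtype=np.float64)
    for u in range(U):
        for v in range(V):
            sheared[u, v] = nd_shift(lightfield[u, v].astype(np.float64),
                                     ((u - u0) * alpha, (v - v0) * alpha), order=1)
    refocused = sheared.mean(axis=(0, 1))                       # shear-and-average refocusing
    gy, gx = np.gradient(refocused)
    defocus_cue = np.abs(gy) + np.abs(gx)                       # high contrast => in focus at this depth
    correspondence_cue = sheared.var(axis=(0, 1))               # low variance => views agree at this depth
    return defocus_cue, correspondence_cue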
      

External Mask Based Depth and Light Field Camera
Dikpal Reddy, Jiamin Bai, Ravi Ramamoorthi
ICCV Workshop on Consumer Depth Cameras for Vision, 2013

paper | abstract | bibtex | video (97MB)

We present a method to convert a digital single-lens reflex (DSLR) camera into a high-resolution consumer depth and light-field camera by affixing an external aperture mask to the main lens. Compared to the existing consumer depth and light field cameras, our camera is easy to construct with minimal additional costs, and our design is camera and lens agnostic. The main advantage of our design is the ease of switching between an SLR camera and a native resolution depth/light field camera. We also do not need to modify the internals of the camera or the lens.

@inproceedings{reddy2013external,
author={Reddy, Dikpal and Bai, Jiamin and Ramamoorthi, Ravi},
   title={External mask based depth and light field camera},	
   booktitle={Proceedings of the IEEE International 
   Conference on Computer Vision (ICCV) Workshops},
   year={2013},
}