Research

Light fields can provide immersive experiences with a level of realism unsurpassed by any other imaging technology. They are used in augmented reality, virtual reality, and autostereoscopic displays. Since light rays are captured from every direction through every point in a volume of space, we can recreate the correct perspective of the scene from any viewpoint inside this volume. This enables a variety of post-capture processing capabilities such as refocusing, changing the depth of field, extracting depth/disparity information, and 3D modeling. Light fields also find applications in view synthesis, super-resolution, object recognition, material identification, depth estimation, and medical imaging.

Figure: Digital refocusing, changing the depth of field by digitally modifying the image aperture.
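As a rough illustration of how such digital refocusing works, the sketch below uses the classic shift-and-add approach: each sub-aperture view is shifted in proportion to its angular offset from the central view, and the shifted views are averaged. The array layout, function name, and slope parameterization are assumptions for illustration, not a description of any particular implementation.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift

def refocus_shift_and_add(light_field, slope):
    """Refocus a 4D light field via shift-and-add.

    light_field : array of shape (U, V, H, W[, C]) holding a U x V grid of
                  sub-aperture views with H x W pixels each (assumed layout).
    slope       : per-view shift in pixels per unit of angular offset;
                  varying it moves the synthetic focal plane.
    """
    U, V = light_field.shape[:2]
    uc, vc = (U - 1) / 2.0, (V - 1) / 2.0          # angular centre of the view grid
    out = np.zeros(light_field.shape[2:], dtype=np.float64)
    for u in range(U):
        for v in range(V):
            # Shift each view in proportion to its offset from the centre view.
            dy, dx = slope * (u - uc), slope * (v - vc)
            shift_vec = (dy, dx) + (0,) * (light_field.ndim - 4)
            out += subpixel_shift(light_field[u, v].astype(np.float64),
                                  shift_vec, order=1, mode='nearest')
    return out / (U * V)                           # average the shifted views

# Example: a synthetic 5x5 grid of 64x64 grayscale views, refocused at slope 0.5.
lf = np.random.rand(5, 5, 64, 64)
img = refocus_shift_and_add(lf, slope=0.5)
```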

Due to recent advances in optics, sensor manufacturing, and available transmission bandwidth, as well as heavy investment by major technology companies in this area, light field transmission systems are expected to become available to both consumers and professionals in the near future. High-dimensional light fields offer powerful capabilities for scene understanding and post-processing. However, they also raise challenges in data capture and compression. The raw image captured by a light field camera is very large. For instance, the Lytro Illum produces raw data at 7728x5368 resolution with 10 bits-per-pixel (bpp) precision, i.e., 51,854,880 bytes per shot/frame. New approaches are therefore required to compress light field data for efficient storage, transmission, and display. This must be done in accordance with JPEG Pleno, a standard framework for representing light field modalities that also offers backward compatibility with 2D and 3D displays.
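For reference, the raw-size figure quoted above follows from a simple back-of-the-envelope calculation (the variable names below are only illustrative):

```python
# One Lytro Illum raw frame: 7728 x 5368 sensor pixels at 10 bits per pixel.
width, height, bits_per_pixel = 7728, 5368, 10
raw_bytes = width * height * bits_per_pixel // 8
print(raw_bytes)                 # 51854880 bytes per shot/frame
print(raw_bytes / (1024 ** 2))   # roughly 49.5 MiB
```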

A light field exhibits spatial, angular, and temporal correlations. Existing image and video coding standards such as JPEG, MPEG, and HEVC do not fully exploit the redundancies present in light field content. Effective light field compression must exploit the intra-view, inter-view, statistical, and perception-based correlations among the captured viewpoint images (one common strategy is sketched below). Both lossy and lossless compression schemes exist for light fields, and hybrid coding techniques can also be employed. By exploiting these spatio-temporal properties, light fields hold great promise for multi-view autostereoscopic displays; such displays require optimizations in terms of multi-layer decomposition and real-time streaming of light field data. Designing display-invariant schemes is also critical for practical 3D display applications. There is strong demand for algorithms that balance bitrate reduction, computational complexity, and reconstruction quality; achieving substantial bitrate reduction while preserving the light field structure is challenging.
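One widely used strategy for exploiting inter-view redundancy, shown here only as an illustrative sketch and not as the specific approach pursued in this work, is to reorder the grid of sub-aperture views into a pseudo-video sequence (e.g., with a serpentine scan) so that a standard video codec's inter-frame prediction can absorb much of the inter-view correlation. The function name and array layout below are assumptions for illustration.

```python
import numpy as np

def serpentine_order(light_field):
    """Flatten a (U, V, H, W[, C]) grid of sub-aperture views into a
    pseudo-video sequence using a serpentine (boustrophedon) scan of the
    view grid, so that consecutive frames are neighbouring views."""
    U, V = light_field.shape[:2]
    frames = []
    for u in range(U):
        # Alternate the column direction on each row to keep neighbours adjacent.
        cols = range(V) if u % 2 == 0 else range(V - 1, -1, -1)
        for v in cols:
            frames.append(light_field[u, v])
    return np.stack(frames)   # shape (U*V, H, W[, C]), ready for a video encoder

# Example: 9x9 views of 32x32 pixels become an 81-frame pseudo-sequence.
lf = np.random.rand(9, 9, 32, 32)
seq = serpentine_order(lf)
assert seq.shape == (81, 32, 32)
```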

The representation of light fields must also be compact and general enough to support a range of compression bitrates. Interoperability, scalability, and adaptability of the scheme must be considered, since different multi-baseline geometries are used for multi-view capture. Regarding 3D display adaptation, the light field representation and coding must be scalable and suited to different viewing conditions (screen size and distance, e.g., from cinema to TV to phones). The format must be invariant to the display type and allow the depth impression to be easily adjusted to best meet viewers' preferences for visual comfort.

My current research focuses on the analysis, representation, interpretation, and compression of light field data, as well as the generation of optimal data formats.