Decomposing Single Images for Layered Photo Retouching

University College London    

Eurographics Symposium on Rendering 2017

Appearance manipulation of a single photograph (top images) using off-the-shelf software like Photoshop directly (left arrow) and using the same software in combination with our new layering (right arrow). For the car example, the image was decomposed into layers (albedo, irradiance, specular, and ambient occlusion), which were then manipulated individually: specular highlights were strengthened and blurred; irradiance and ambient occlusion were darkened and given more contrast; the albedo color was changed. While the image generated without our decomposition took much more effort (selections, curve adjustments, and feathered image areas), the result is still inferior. For the statue example, a different decomposition was used that splits the original image into light directions. The light coming from the left was made more blue, while the light coming from the right was made more red. A similar effect is hard to achieve in Photoshop, even with an order of magnitude more effort.


Abstract

Photographers routinely compose multiple manipulated photos of the same scene into a single image, producing a fidelity difficult to achieve using any individual photo. Alternatively, 3D artists set up rendering systems to produce layered images that isolate individual aspects of the light transport, which are composed into the final result in post-production. Regrettably, these approaches either take considerable time and effort to capture, or remain limited to synthetic scenes. In this paper, we suggest a method to decompose a single image into multiple layers that approximate effects such as shadow, diffuse illumination, albedo, and specular shading. To this end, we extend the idea of intrinsic images along two axes: first, by complementing shading and reflectance with specularity and occlusion, and second, by introducing directional dependence. We do so by training a convolutional neural network (CNN) with synthetic data. The resulting decompositions can then be manipulated in any off-the-shelf image manipulation software and composited back. We demonstrate the effectiveness of our decomposition on synthetic (i.e., rendered) and real data (i.e., photographs), and use it for photo manipulations that are otherwise impossible to perform on single images. We provide comparisons with state-of-the-art methods and also evaluate the quality of our decompositions via a user study measuring the effectiveness of the resulting photo retouching setup.


Model

The decomposition has two main steps: (i) producing synthetic training data, and (ii) training a convolutional neural network (CNN) to decompose single images into editable layers.

The components of our two imaging models. The first row is the intrinsic axis, the second row the directional axis, and the third row shows how one directional element can subsequently also be decomposed into its intrinsics.
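As a concrete illustration of the intrinsic axis, the following minimal sketch recombines the four layers into an image. The multiplicative diffuse term, the additive specular term, and the exact role of occlusion are assumptions made for illustration here, not necessarily the paper's exact compositing model:

    import numpy as np

    def composite_intrinsic(albedo, irradiance, specular, occlusion):
        # All inputs: float arrays in [0, 1]; albedo/irradiance/specular
        # are (H, W, 3), occlusion may be (H, W, 1) and broadcasts.
        diffuse = albedo * irradiance          # diffuse reflection
        return np.clip(occlusion * diffuse + specular, 0.0, 1.0)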

The decomposition is performed by a CNN that consumes a photo and outputs all of its layers. Its architecture is summarized in the following figure.
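The exact architecture is the one shown in the figure; purely as a hedged sketch of the general idea, an encoder-decoder CNN with skip connections that maps an RGB photo to a stack of per-layer outputs could look as follows (PyTorch; the depth, channel widths, and the 10-channel output split are illustrative assumptions, not the published network):

    import torch
    import torch.nn as nn

    class DecompositionNet(nn.Module):
        """Illustrative encoder-decoder with skip connections mapping an
        RGB image to a stack of decomposition layers (assumed here:
        3-channel albedo, irradiance, and specular, plus 1-channel
        occlusion = 10 output channels)."""

        def __init__(self, out_channels=10):
            super().__init__()
            self.enc1 = self._block(3, 32)
            self.enc2 = self._block(32, 64)
            self.enc3 = self._block(64, 128)
            self.pool = nn.MaxPool2d(2)
            self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                                  align_corners=False)
            self.dec2 = self._block(128 + 64, 64)
            self.dec1 = self._block(64 + 32, 32)
            self.head = nn.Conv2d(32, out_channels, kernel_size=1)

        @staticmethod
        def _block(in_ch, out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )

        def forward(self, x):
            e1 = self.enc1(x)              # full resolution
            e2 = self.enc2(self.pool(e1))  # 1/2 resolution
            e3 = self.enc3(self.pool(e2))  # 1/4 resolution
            d2 = self.dec2(torch.cat([self.up(e3), e2], dim=1))
            d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))
            return torch.sigmoid(self.head(d1))  # layers in [0, 1]

Running DecompositionNet()(torch.rand(1, 3, 256, 256)) yields a (1, 10, 256, 256) tensor that can be split channel-wise into the individual layers.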

Results
Decomposition

Decomposition of input images into light transport layers.

Directional Decomposition

Decomposition of input images into the six directional layers for different objects.

Light transport-based Edits

Below is a list of example edits performed using our intrinsic decomposition. All edits are compared to an equivalent edit produced, on a same-effort basis, from the single input image. A minimal code sketch of such a layered edit follows the examples below.

In this example, specular highlights were reduced and the brightness of the albedo was enhanced; dark regions were also made more evident. Without access to our decomposition, the two operations, brightening the albedo and reducing the specular highlights, clash with each other.

For this car, specular highlights were boosted and the parts of the car directly lit by light sources were emphasized. The single-channel edit shows unwanted artifacts on the side, fails to properly boost the front of the car, and does not darken the side homogeneously.

The appearance of the statue was adjusted to make the material look more like marble and copper, respectively. The freedom given by our decomposition allows us to obtain results that are both more convincing and quicker to produce.
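To make the workflow concrete, here is a minimal sketch of a layered edit in the spirit of the car example above, assuming the same compositing model as in the earlier sketch; all gains and color values are arbitrary illustrations:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def edit_car(albedo, irradiance, specular, occlusion):
        # Strengthen and blur the highlights (2D blur per color channel).
        specular = gaussian_filter(np.clip(1.5 * specular, 0.0, 1.0),
                                   sigma=(2.0, 2.0, 0.0))
        # Darken the shading and add contrast around a mid-grey pivot.
        irradiance = np.clip((irradiance - 0.5) * 1.3 + 0.45, 0.0, 1.0)
        # Deepen the occluded regions.
        occlusion = occlusion ** 1.5
        # Recolor the albedo (push it toward red).
        albedo = albedo * np.array([1.1, 0.5, 0.4])
        # Recomposite with the assumed model from the earlier sketch.
        return np.clip(occlusion * (albedo * irradiance) + specular, 0.0, 1.0)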

Direction-based Edits

Below is a list of example edits performed using our directional decomposition. All edits are compared to an equivalent edit produced, on a same-effort basis, from the single input image. A minimal code sketch of a directional tinting edit follows the examples below.

This example uses our 14-way edit. The specular color of the light was changed based on directionality (blue from the left, red from the right). Note how the teapot without a lid is fully blue in the single-channel edit, even on its back side, while with our decomposition it is red on the proper side, even though another object to its right shows blue on its left side.

Directionality is used here to simulate the effect of sun exposure (from the right and the left side, respectively), causing the regions most exposed to that direction to lose their tint. In the single-channel edit, the shape of the statue is not taken into account, leading to a flat-looking result.

The light coming from below was disabled, and the contribution of all other light directions, except the one coming from the top, was reduced. Once again, the single-channel edit ends up looking flat because it does not take the geometry into account.
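As with the intrinsic edits, a minimal sketch of a directional tint, assuming the directional layers sum to the original image and using hypothetical direction names:

    import numpy as np

    def tint_directional(layers):
        # layers: dict mapping a direction name to its (H, W, 3) float
        # contribution; the names and the additive recombination are
        # assumptions for illustration.
        tints = {
            'left':  np.array([0.7, 0.8, 1.3]),  # push the left light toward blue
            'right': np.array([1.3, 0.8, 0.7]),  # push the right light toward red
        }
        out = np.zeros_like(next(iter(layers.values())))
        for direction, layer in layers.items():
            out += layer * tints.get(direction, 1.0)
        return np.clip(out, 0.0, 1.0)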

Bibtex
    
    @article{innamorati17decomposing,
        author = {Innamorati, Carlo and Ritschel, Tobias and Weyrich, Tim and Mitra, Niloy J.},
        title = {Decomposing Single Images for Layered Photo Retouching},
        journal = {Computer Graphics Forum (Proc. Eurogr. Symp. on Rendering)},
        volume = 36,
        number = 4,
        pages = {15--25},
        month = jul,
        year = 2017,
        publisher = {The Eurographics Association and John Wiley \& Sons Ltd.},
        ISSN = {1467-8659},
        doi = {10.1111/cgf.13220}
    }
    
Acknowledgements

We thank our reviewers for their detailed and insightful comments and the user study participants for their time and feedback. We also thank Paul Guerrero, James Hennessey, Moos Hueting, Aron Monszpart and Tuanfeng Yang Wang for their help, comments and ideas. This work was partially funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 642841, by the ERC Starting Grant SmartGeometry (StG-2013-335373), and by the UK Engineering and Physical Sciences Research Council (grant EP/K023578/1).
