Encoding Color Difference Signals for High Dynamic Range and Wide Gamut Imagery

Abstract

High dynamic range and wide color gamut are currently being introduced to television and cinema. This extended information requires not only more efficient signal encodings but also improved color spaces. Due to the increasing variation in display capabilities, it is desirable to have a color signal encoding that is suitable not only for efficient quantization but also for color volume mapping. While an efficient method for high dynamic range luminance encoding has been put forward, a similar encoding scheme for color difference signals is not yet available. We address this with a novel color space representation that can be used both for efficient encoding of high dynamic range and wide gamut color difference signals and for color volume mapping. We compare its performance, robustness and complexity against those of other color spaces in a variety of usage scenarios.

Introduction

Luminance encoding for high dynamic range images has been studied extensively [1, 2], and there are also multiple proposals for high dynamic range color image encoding [3, 4, 5, 6, 7]. Common to all of these approaches is a focus on encoding efficiency; they are designed to encode images using a minimum number of code values without introducing visible quantization artifacts or loss of image detail. A modern image encoding scheme, however, should not only be optimized for quantization but should also facilitate downstream steps such as color volume mapping, to avoid computationally expensive color space conversions at each step. Figure 1 depicts the full video distribution pipeline.

Figure 1. High dynamic range and wide color gamut distribution pipeline.

Thus, it is desirable to transmit video signals in an encoding color space that is suitable not only for efficient image encoding but also for tone and gamut mapping, jointly referred to as color volume mapping.
Unlike print media, digital cinema and television (TV) distribution typically did not require color volume mapping in the past, because most display devices had a gamut and dynamic range that matched the encoded signal. A notable exception were early LCD displays that did not cover the complete Rec.709 [8] gamut. Future displays will vary far more in the dynamic range and gamut they cover. One reason for this is the trend toward mobile viewing, which extends the potential ambient illumination from dark cinemas and dim living rooms to bright sunlight outdoors. A second reason is that new display technologies have extended capabilities. Laser-illuminated projectors [9] are now becoming available and extend the gamut from DCI-P3 [10] to Rec.2020 [11]. At the same time, major TV networks are investigating how to distribute signals in the Rec.2020 color space [12] to leverage the wider color gamut of emerging display technologies such as OLED [13], quantum-dot displays [14] and already existing multi-primary displays [15, 16]. Most of these new displays and cinema projectors will not cover the full Rec.2020 gamut or reach the peak luminance of high dynamic range mastering displays. Thus, there will likely be more variation in device capabilities than ever before, and mapping between different color volumes will become critically important for a consistent, best-possible reproduction of TV and cinema imagery.

In this paper, we co-optimize a color space for both high dynamic range (HDR) and wide color gamut (WCG) encoding efficiency and color volume mapping performance. To verify the efficiency of our color space, we compare it to state-of-the-art HDR color encodings. Finally, we identify challenges associated with our approach and discuss further research opportunities.
Requirements

The major goal of image encoding is to minimize color distortions when images are represented with a given number of digital code words, and to find the number of code values needed to prevent visible quantization artifacts. The best encoding performance is typically achieved when the quantization error is distributed perceptually evenly over the color space. To fully avoid visible quantization errors, the step of one code value should always be below the detection threshold of one 'just noticeable difference' (JND). Thus, the more uniform in size the JND ellipsoids are throughout the color space, the more efficient the encoding is, as fewer code values are wasted encoding sub-JND steps in areas of the color space where JND ellipsoids are larger [17]. We will denote this requirement as 'JND-uniformity'.

© 2015 Society for Imaging Science and Technology

In addition to JND-uniformity, a color space for video encoding should decorrelate the achromatic axis from the chromatic axes to enable chroma subsampling, which exploits the lower contrast sensitivity of the human visual system for high-frequency chroma detail. Furthermore, a color space used for color volume mapping should be as hue-linear as possible, as observers perceive changes in hue as more objectionable than changes in lightness or chroma; consequently, most gamut mapping algorithms either completely avoid or heavily penalize hue changes. Thus, when mapping is performed toward the achromatic axis, or when intensity is changed, the color space should not introduce any hue shifts.

As well as maintaining uniformity inside the gamut volume, it is important that the bounds of the space are large enough to encode the necessary gamut and dynamic range. The Rec.2020 gamut is the design goal for the next generation of displays and is therefore the minimum requirement for a modern video encoding space.
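The JND-uniformity requirement above can be illustrated with a small numerical sketch of our own (not from the paper): quantizing luminance with equal linear steps versus equal logarithmic steps over an HDR range, and comparing the worst-case relative step size. The 10-bit code length and the use of a Weber fraction as a proxy for a JND are assumptions made only for this illustration.

```python
import numpy as np

# Illustrative sketch: compare 10-bit linear vs. 10-bit logarithmic
# quantization of luminance over an assumed HDR range.
lo, hi, bits = 0.005, 10000.0, 10          # cd/m^2 range, code word length
codes = 2 ** bits

lin = np.linspace(lo, hi, codes)           # equal steps in luminance
log = np.geomspace(lo, hi, codes)          # equal steps in log-luminance

# Relative (Weber-style) step size between adjacent code values.
lin_step = np.diff(lin) / lin[:-1]
log_step = np.diff(log) / log[:-1]

# The worst-case step determines whether banding can become visible.
print(f"max relative step, linear: {lin_step.max():.1f}")
print(f"max relative step, log:    {log_step.max():.4f}")
```

The logarithmic encoding keeps the relative step near 1.4% everywhere, while the linear encoding wastes finely spaced codes in the highlights yet produces huge relative steps in the shadows. An encoding whose steps are uniform in JNDs spreads the error evenly and therefore needs far fewer code values overall.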
Short-term adaptive processes can expand the dynamic range required for entertainment imaging beyond the steady-state adaptation of the human visual system [18]. A consumer study using an HDR research display in a dark viewing environment identified a dynamic range of 0.005–10,000 cd/m² as required to satisfy 90% of viewers [19].

Finally, the computational complexity of the transformation from the encoding color space to device RGB should be as low as possible to allow for mass deployment in a wide range of devices. Specifically, the transformation should minimize computations in linear light and allow separable operations (i.e., functions of a single component rather than of multiple components).

Prior Work

One of the challenges in designing an efficient HDR color encoding is that, in contrast to color appearance model expectations, neither the surround luminance nor the observer's adaptation state is known for HDR entertainment imaging scenarios. As a result, a static color difference formula designed for standard dynamic range, such as Delta-E 2000 [20], cannot be used to accurately predict JNDs. Instead, the quantization in any part of an HDR color space should be determined by the adaptation parameters that result in the smallest detection step in that area, to ensure that visible quantization artifacts are avoided for any content on any display in any viewing environment.

The perceptual quantizer (PQ) curve [21] follows exactly this approach for luminance encoding. The PQ curve was derived as a constant minimum-detectability curve from the Barten contrast sensitivity function (CSF) model [22] and is further predictable from a local cone model [23]. It always quantizes below the minimum detectable contrast for any adaptation state at any luminance. Figure 2 illustrates the concept of the PQ curve by comparing a range of theoretical cumulative JND curves for fixed steady-state adaptation with the PQ curve.

Figure 2.
Minimum quantization for steady-state adaptation (red, green and blue lines) versus quantization determined by the minimum discriminability over all possible adaptation states (PQ, black dashed line).

Methods

As our goal is a color difference encoding model that can be mass deployed, we chose an existing encoding scheme and searched for the optimal parameters of this model, rather than starting from scratch without any constraints. The methods introduced here can also be used to optimize other models. In the following, we introduce the color space model and discuss the test and training sets as well as the cost functions we used to optimize the model parameters.

Color Space Model

We chose a color space model that follows the processing steps of broadcast video Y'CbCr and IPT [24]. As depicted in Formulas 1–3, this model consists of a linear transformation of device-dependent linear RGB, followed by a nonlinear encoding function and a second linear transformation that decorrelates the nonlinearly coded components into one achromatic channel and two color difference channels carrying the chroma information.

\begin{bmatrix} L \\ M \\ S \end{bmatrix} = M_1 \cdot \begin{bmatrix} R_{device} \\ G_{device} \\ B_{device} \end{bmatrix} \qquad (1)

\begin{bmatrix} L' \\ M' \\ S' \end{bmatrix} = f_{NL}\!\left(\begin{bmatrix} L \\ M \\ S \end{bmatrix}\right) \qquad (2)

\begin{bmatrix} I \\ C_1 \\ C_2 \end{bmatrix} = M_2 \cdot \begin{bmatrix} L' \\ M' \\ S' \end{bmatrix} \qquad (3)

23rd Color and Imaging Conference Final Program and Proceedings

This model is easily invertible and already implemented in several hardware devices. In addition, IPT is known for its excellent hue linearity in standard dynamic range scenarios. The difference between Y'CbCr models and IPT is that in IPT the nonlinearity is applied to LMS cone fundamentals rather than to RGB primaries, and that IPT uses a different color-differencing matrix than Y'CbCr. Both models have in common that intensity I and luma Y' are not exactly weighted according to the V(λ) luminosity function for all colors. Having chosen a specific model, we next search for the individual parameters of the model.
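The three-step structure of Formulas 1–3 can be sketched with the published IPT transform (Ebner & Fairchild) as a concrete instance. Note two assumptions in this sketch: IPT starts from CIE XYZ (D65) rather than device RGB, so for device input M1 would additionally fold in the device's RGB-to-XYZ matrix; and IPT uses a 0.43 power nonlinearity, where the model optimized in this paper substitutes a different f_NL.

```python
import numpy as np

# Sketch of the Formula 1-3 pipeline, instantiated with the IPT transform.
M1 = np.array([[ 0.4002, 0.7075, -0.0807],   # XYZ (D65) -> LMS
               [-0.2280, 1.1500,  0.0612],
               [ 0.0000, 0.0000,  0.9184]])

M2 = np.array([[0.4000,  0.4000,  0.2000],   # L'M'S' -> I, C1, C2 (IPT)
               [4.4550, -4.8510,  0.3960],
               [0.8056,  0.3572, -1.1628]])

def f_nl(x, gamma=0.43):
    """IPT nonlinearity: sign-preserving power, applied per channel."""
    return np.sign(x) * np.abs(x) ** gamma

def encode(xyz):
    lms = M1 @ xyz            # Formula 1: first linear transform
    lms_p = f_nl(lms)         # Formula 2: per-channel nonlinearity
    return M2 @ lms_p         # Formula 3: decorrelation into I, C1, C2

# D65 white maps to the achromatic axis: I close to 1, C1 and C2 close to 0.
print(encode(np.array([0.9505, 1.0, 1.0891])))
```

Because the decorrelation rows of M2 for C1 and C2 each sum to zero, any input with L' = M' = S' lands exactly on the achromatic axis, which is the property exploited in the next section.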
At first, we determine the nonlinearity function (f_NL) by looking only at the achromatic axis. For Y'CbCr and IPT, the achromatic case holds if all nonlinear components have the same value before the decorrelation matrix (M_2) is applied. From f_{NL}(I) = I' = L' = M' = S' we can deduce that if the optimal nonlinear encoding for I is known, this exact function also needs to be applied to L, M and S for quantization along the achromatic axis to be maximally efficient. The PQ curve shown in Formula 4 satisfies this requirement.

f_{NL} = f_{PQ}: \quad f_{PQ}(L) = \left(\frac{c_1 + c_2\,(L/10000)^{m_1}}{1 + c_3\,(L/10000)^{m_1}}\right)^{m_2} \qquad (4)
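A minimal numerical sketch of Formula 4 and its inverse, using the constant values standardized in SMPTE ST 2084 (an assumption here; the paper itself only cites the PQ curve):

```python
# PQ curve (Formula 4) with the SMPTE ST 2084 constants.
# Luminance L is in cd/m^2, normalized by the 10,000 cd/m^2 peak;
# the output is a normalized code value in [0, 1].
m1 = 2610 / 16384            # 0.1593017578125
m2 = 2523 / 4096 * 128       # 78.84375
c1 = 3424 / 4096             # 0.8359375
c2 = 2413 / 4096 * 32        # 18.8515625
c3 = 2392 / 4096 * 32        # 18.6875

def pq_encode(L):
    y = (L / 10000.0) ** m1
    return ((c1 + c2 * y) / (1 + c3 * y)) ** m2

def pq_decode(v):
    p = v ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

print(pq_encode(10000.0))    # peak luminance maps to 1.0
print(pq_decode(pq_encode(100.0)))
```

Since c1 + c2 = 1 + c3, the curve reaches exactly 1.0 at the 10,000 cd/m² peak, and the rational form inverts in closed form, which keeps the decoder cheap, in line with the complexity requirement stated earlier.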

References

[1] D. L. MacAdam, "Visual Sensitivities to Color Differences in Daylight," 1942.
[2] G. Ward Larson et al., "LogLuv Encoding for Full-Gamut, High-Dynamic Range Images," J. Graphics, GPU, & Game Tools, 1998.
[3] M. Pedzisz, "Beyond BT.709," 2013.
[4] M. Luo et al., "The development of the CIE 2000 Colour Difference Formula," 2001.
[5] C. Poynton et al., "Deploying wide colour gamut and high dynamic range in HD and UHD," 2014.
[6] G. J. Ward et al., "The RADIANCE lighting simulation and rendering system," SIGGRAPH, 1994.
[7] P. G. J. Barten, "Formula for the contrast sensitivity of the human eye," IS&T/SPIE Electronic Imaging, 2003.
[8] R. Berns et al., "Determination of constant Hue Loci for a CRT gamut and their predictions using color appearance spaces," 1995.
[9] E. Reinhard et al., "Face-based luminance matching for perceptual colormap generation," IEEE Visualization, 2002.
[10] T. D. Cradduck, "National Electrical Manufacturers Association," Journal of the A.I.E.E., 1983.
[11] B. D. Silverstein et al., "25.4: A Laser-Based Digital Cinema Projector," 2011.
[12] E. Reinhard et al., "A reassessment of the simultaneous dynamic range of the human visual system," APGV, 2010.
[13] S. M. Wuerger et al., "Towards a spatio-chromatic standard observer for detection," IS&T/SPIE Electronic Imaging, 2002.
[14] I. Lissner et al., "How Perceptually Uniform Can a Hue Linear Color Space Be?," CIC, 2010.
[15] S. Miller et al., "Color signal encoding for high dynamic range and wide color gamut based on human perception," Electronic Imaging, 2014.
[16] T. Kunkel et al., "Preference limits of the visual dynamic range for ultra high quality and aesthetic conveyance," Electronic Imaging, 2013.
[17] P. V. Johnson et al., "240 Hz OLED technology properties that can enable improved image quality," 2014.
[18] S. Miller et al., "Perceptual Signal Coding for More Efficient Usage of Bit Codes," 2012.
[19] S. J. Daly et al., "Use of a local cone model to predict essential CSF light adaptation behavior used in the design of luminance quantization nonlinearities," Electronic Imaging, 2015.
[20] W. D. Wright, "The sensitivity of the eye to small colour differences," 1941.
[21] M. D. Fairchild et al., "Development and Testing of a Color Space (IPT) with Improved Hue Uniformity," CIC, 1998.
[22] R. Mantiuk et al., "Measurements of achromatic and chromatic contrast sensitivity functions for an extended range of adaptation luminance," Electronic Imaging, 2013.
[23] T. Kunkel et al., "HDR and wide gamut appearance-based color encoding and its quantification," Picture Coding Symposium (PCS), 2013.
[24] K. Nakamura et al., "62.1: Five-Primary-Color 60-Inch LCD with Novel Wide Color Gamut and Wide Viewing Angle," 2009.
[25] O. S. Pianykh, "Digital Imaging and Communications in Medicine (DICOM)," 2017.