Evaluating the Performance of Polynomial Regression Method with Different Parameters during Color Characterization
The polynomial regression method is employed to model the relationship between device color space and CIE color space for color characterization, and the performance of different polynomial expressions with specific parameters is evaluated. First, the polynomial equation for color conversion is established and the computation of the polynomial coefficients is analysed. Then different forms of polynomial equations are used to calculate the CIE color values of RGB and CMYK colors, and the corresponding color errors are compared. Finally, an optimal polynomial expression is obtained by analysing several related parameters of the color conversion, including the number of polynomial terms, the degree of the polynomial terms, the selection of CIE color space, and linearization.
Because color devices differ in imaging characteristics such as imaging mechanism, color space, device capability, and material properties, a color image usually looks different in some details when it is output by different devices. For example, the same image displayed on a monitor looks brighter and more colorful than when it is printed on paper. Even the same image displayed on two different monitors sometimes produces different visual effects. Thus, for a color signal processing system with several color devices, the color signal transmission between devices must be precise enough to maintain the color consistency of the images. In most color signal processing systems, a device-connection space is used, such as the CIE color spaces CIEXYZ and CIELAB. If the CIELAB space, for example, is chosen as the standard connection space, the color transmission process can be divided into two parts, device-to-CIELAB and CIELAB-to-device. Therefore, the color signal processing precision depends heavily on the color conversion algorithms between device colors and CIELAB colors.
Many models can be used for converting color signals, such as the Neugebauer model, neural networks, interpolation methods, and polynomial regression. The regression model is widely used in color signal processing systems, since it can produce high accuracy from relatively few samples and can be used in both the device-to-CIELAB and the CIELAB-to-device directions. However, some problems remain unresolved for this model in color signal processing; for example, (1) the calculation precision of this model depends heavily on the number and the degree of the polynomial terms, so it is important to obtain the optimal polynomial expression for a specific signal processing task; (2) the selection of the device-connection space, such as the CIEXYZ or CIELAB space, may influence the signal processing precision [5, 6], so it still needs to be analyzed and tested for polynomial regression models; (3) in some cases the RGB signals are linearized before being related to CIE colors, but in other cases linearization is not applied. Hence, for both RGB and CMYK signals, the effect of linearization should be analyzed and tested [7, 8], which may reveal whether or not it should be applied for specific color devices and polynomials.
In this paper, the issues above are analyzed and tested in corresponding experiments. The different polynomial expressions, the different device-connection color spaces, and the influence of linearization on signal processing are all tested on RGB and CMYK devices. Finally, for the specific RGB and CMYK color signal processing systems, the optimal parameters are obtained with detailed analysis.
2. Polynomial Regression Model for Color Signal Processing
Polynomial regression is a form of linear regression in which the relationship between the independent variable $x$ and the dependent variable $y$ is modeled as an $n$th-degree polynomial. Polynomial regression fits a nonlinear relationship between the value of $x$ and the corresponding conditional mean of $y$, denoted by $E(y \mid x)$, and has been used to describe nonlinear phenomena, such as the growth rate of tissues, the distribution of carbon isotopes in lake sediments, and the progression of disease epidemics.
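Within the polynomial regression model, if the dependent variable $y$ and multiple independent variables $x_1, x_2, \ldots, x_m$ have a linear relationship and there are $N$ groups of sample data $(x_{i1}, x_{i2}, \ldots, x_{im}, y_i)$, $i = 1, 2, \ldots, N$, then the relationship between them can be described as
$$y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \cdots + b_m x_{im} + \varepsilon_i, \quad i = 1, 2, \ldots, N,$$
where $b_0, b_1, \ldots, b_m$ are the coefficients to be determined and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N$ are independent random variables.
The system of expressions above can be represented in matrix form as
$$Y = XB + \varepsilon,$$
where
$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1m} \\ 1 & x_{21} & \cdots & x_{2m} \\ \vdots & \vdots & & \vdots \\ 1 & x_{N1} & \cdots & x_{Nm} \end{bmatrix}, \quad B = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_m \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}.$$
If $\hat{b}_0, \hat{b}_1, \ldots, \hat{b}_m$ are the least squares estimates of the parameters $b_0, b_1, \ldots, b_m$, then the regression equation is
$$\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2 + \cdots + \hat{b}_m x_m.$$
By using the least squares method, the coefficients can be resolved as follows:
$$\hat{B} = (X^{T} X)^{-1} X^{T} Y.$$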
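As an illustration of the least squares step, the following minimal sketch solves the normal equations for a small noise-free data set. The names `X`, `Y`, `B`, `solve`, and `least_squares`, as well as the toy data, are assumptions made for this example and are not taken from the paper; a production system would more likely call a numerical library than a hand-written solver.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def least_squares(x_rows, y_rows):
    """Return the B that minimizes ||X B - Y||^2 via the normal equations."""
    xt = transpose(x_rows)
    return solve(matmul(xt, x_rows), matmul(xt, y_rows))


# Toy data (assumed for illustration) generated from y = 1 + 2*x1 - 3*x2.
samples = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
X = [[1.0, x1, x2] for x1, x2 in samples]          # leading 1 is the intercept
Y = [[1 + 2 * x1 - 3 * x2] for x1, x2 in samples]
B = least_squares(X, Y)
coefficients = [row[0] for row in B]
print(coefficients)
```

Because the toy data contain no noise, the recovered coefficients match the generating ones up to rounding. With measured color data the same call would return the best fit in the least squares sense.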
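In addition, the polynomial regression method can also be used to describe nonlinear problems, in which the dependent variable is modeled as an $n$th-degree polynomial of the independent variables, so this model is well suited to color signal processing systems. Taking the RGB monitor as an example, with CIEXYZ as the device-connection space, the relationship between RGB and CIEXYZ can be expressed as
$$X = \sum_{i+j+k \le n} a_{x,ijk} R^{i} G^{j} B^{k}, \quad Y = \sum_{i+j+k \le n} a_{y,ijk} R^{i} G^{j} B^{k}, \quad Z = \sum_{i+j+k \le n} a_{z,ijk} R^{i} G^{j} B^{k},$$
where $a_{x,ijk}$, $a_{y,ijk}$, and $a_{z,ijk}$ are the polynomial coefficients, $n$ is the degree of the polynomial, and $i$, $j$, $k$ are nonnegative integers with $i + j + k \le n$. The expression above can also be represented in matrix form as
$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = A \cdot P,$$
where $A$ is the $3 \times t$ coefficient matrix and $P$ is the $t \times 1$ matrix of polynomial terms, while $t$ represents the number of polynomial terms.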
Thus, a polynomial matrix can be written out for each choice of degree and number of terms, from the first-degree polynomial up to the fourth-degree polynomial. Some other forms of polynomials, which keep only part of the terms of a given degree, are also used in color signal processing.
Once the degree of the polynomial and the number of terms are fixed, the coefficient matrix can be solved by the least squares method from the sample color data, which consist of device colors and the corresponding CIEXYZ colors.
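To make the term structure concrete, the sketch below enumerates the terms of a full polynomial in R, G, and B up to a given degree and counts them. It assumes complete polynomials that keep every term of total degree at most the chosen degree; the other polynomial forms mentioned above, which use fewer terms, are not reproduced. The function names `exponents` and `expand` are hypothetical.

```python
from itertools import product
from math import comb


def exponents(degree):
    """All exponent triples (i, j, k) with i + j + k <= degree.

    Triples are ordered by total degree; for degree 1 this gives the
    familiar term order 1, R, G, B.
    """
    triples = [t for t in product(range(degree + 1), repeat=3)
               if sum(t) <= degree]
    return sorted(triples, key=lambda t: (sum(t), tuple(-e for e in t)))


def expand(r, g, b, degree):
    """Evaluate every polynomial term R^i * G^j * B^k for one color."""
    return [r ** i * g ** j * b ** k for i, j, k in exponents(degree)]


# Number of terms in a complete polynomial of degree 1 to 4.
term_counts = {n: len(exponents(n)) for n in (1, 2, 3, 4)}
print(term_counts)

# The count agrees with the closed form C(n + 3, 3) for three variables.
assert all(term_counts[n] == comb(n + 3, 3) for n in term_counts)

# First-degree expansion of a single (hypothetical) color.
print(expand(2, 3, 5, 1))
```

Stacking `expand(...)` for every training patch yields the design matrix to which the least squares solution is applied, one output channel at a time.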
3. Study on the Key Parameters during Color Signal Processing
To determine the key parameters within the polynomial regression model, a color signal processing system with several color devices is introduced in the experiment. As the additive primary colors are RGB and the subtractive primary colors are CMYK, an RGB monitor and a CMYK printer are chosen as the typical test devices. Within the color signal processing system, the device-connection space is CIEXYZ or CIELAB, so the color processing is mainly based on four color spaces.
To obtain the relationship between the device colors and the device-connection colors, the training sample data must be gathered in advance. For the IBM monitor, the three primary channels Red, Green, and Blue are each sampled at 9 levels, so each of the R, G, and B values is taken from [0 32 64 96 128 160 192 224 255]. When all these patches are displayed on the monitor, the corresponding CIEXYZ and CIELAB colors are measured with the Spectrophotometer X-Rite DTP94. These RGB values and the corresponding CIEXYZ or CIELAB colors form the training sample data. To verify the accuracy of the polynomial regression model, testing sample data are also collected. Similar to the training sample data, the testing data consist of color patches with each channel taken from [16 48 80 112 144 176 208 240].
For the CMYK Epson 9880 printer, each channel is sampled at 11 levels with an interval of 10, so every color channel is taken from [0 10 20 30 40 50 60 70 80 90 100]. Because the subtractive primary colors are Cyan, Magenta, and Yellow (CMY) and Black can be seen as the replacement of a certain amount of CMY, in the experiment the device color CMYK is treated as CMY. Thus, when all the CMY color patches are printed out, the corresponding CIEXYZ and CIELAB colors are measured with the Spectrophotometer X-Rite 528, and all these CMY values and the corresponding CIE colors form the training sample data. In addition, the testing data consist of 216 patches with each channel taken from [5 25 45 65 85 95].
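The patch sets described above can be sketched as follows, assuming that every combination of the listed channel levels is used (a full factorial grid). That assumption is consistent with the 216 CMY testing patches reported later in the text, but it is not stated explicitly for the other sets. All variable names are hypothetical.

```python
from itertools import product

# Channel levels as listed in the text.
rgb_train_levels = [0, 32, 64, 96, 128, 160, 192, 224, 255]
rgb_test_levels = [16, 48, 80, 112, 144, 176, 208, 240]
cmy_train_levels = list(range(0, 101, 10))   # 0, 10, ..., 100
cmy_test_levels = [5, 25, 45, 65, 85, 95]


def full_grid(levels):
    """Every combination of the given levels over three channels."""
    return list(product(levels, repeat=3))


rgb_train = full_grid(rgb_train_levels)
rgb_test = full_grid(rgb_test_levels)
cmy_train = full_grid(cmy_train_levels)
cmy_test = full_grid(cmy_test_levels)

for name, patches in [("RGB training", rgb_train), ("RGB testing", rgb_test),
                      ("CMY training", cmy_train), ("CMY testing", cmy_test)]:
    print(name, len(patches))

# The testing levels never coincide with training levels, so no testing
# patch is also a training patch.
assert not set(rgb_train) & set(rgb_test)
assert not set(cmy_train) & set(cmy_test)
```

Under this assumption the testing patches fall between the training nodes, so they probe how well a fitted polynomial interpolates rather than how well it reproduces the data it was fitted to.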
In general, the regression errors for the training sample data decrease as the number of polynomial terms or the degree of the polynomial increases. However, for color data covering the entire range, the regression errors will increase once the number of polynomial terms exceeds a certain value. Therefore, it is important to find the polynomial expression that produces the least error, which is determined mainly by the number of polynomial terms and the degree of the polynomial. In the experiment, to evaluate the different polynomial expressions employed in color signal processing, the error is computed with the CIE76 color difference formula:
$$\Delta E^{*}_{ab} = \sqrt{(L^{*}_{1} - L^{*}_{2})^{2} + (a^{*}_{1} - a^{*}_{2})^{2} + (b^{*}_{1} - b^{*}_{2})^{2}},$$
where $(L^{*}_{1}, a^{*}_{1}, b^{*}_{1})$ is the regression color and $(L^{*}_{2}, a^{*}_{2}, b^{*}_{2})$ is the measured color.
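The CIE76 color difference is the Euclidean distance between two CIELAB colors, so the error statistics used in the evaluation can be sketched in a few lines. The sample colors below are invented for illustration, and the use of the population standard deviation and variance (rather than the sample versions) is an assumption, since the text does not say which was used.

```python
import math
import statistics


def delta_e76(lab_regression, lab_measured):
    """CIE76 color difference: Euclidean distance in CIELAB."""
    return math.dist(lab_regression, lab_measured)


# Hypothetical (L*, a*, b*) values, not measurements from the paper.
regression = [(50.0, 0.0, 0.0), (60.0, 10.0, -10.0), (70.0, -5.0, 5.0)]
measured = [(50.0, 3.0, 4.0), (60.0, 10.0, -10.0), (72.0, -3.0, 6.0)]

errors = [delta_e76(p, m) for p, m in zip(regression, measured)]

summary = {
    "mean": statistics.fmean(errors),
    "max": max(errors),
    "std": statistics.pstdev(errors),        # population form (assumed)
    "variance": statistics.pvariance(errors),
}
print(errors)
print(summary)
```

Reporting the maximum and the spread alongside the mean matters because a polynomial can have a small average error and still produce a few visually unacceptable patches.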
3.1. Number and the Degree of Polynomial Terms
To find the appropriate polynomial expressions, the RGB monitor is tested and the signal processing errors of different polynomial expressions are compared. First, the training sample data are used to obtain the polynomial coefficients between the RGB and CIELAB signals. Second, for all the RGB colors, the CIELAB values are simulated using the obtained coefficients. Third, to analyze the regression precision, the measured CIELAB color values from the training data are used to compute the errors of the different polynomial expressions. Finally, the errors are expressed as color differences between the measured and simulated CIELAB values, as listed in Table 1.
From the above result, it can be seen that the regression precision clearly improves as the number of polynomial terms increases, and once the number of terms reaches a certain value the mean error becomes small enough. For the first-degree polynomial, for example, the average error exceeds the reproduction error threshold [17, 18], and the maximal error is visually unacceptable. In addition, its standard deviation and variance indicate that the distribution of errors is unsatisfactory.
In general, the regression precision can be evaluated mainly from the average errors of the different polynomials shown in Figure 1. The regression errors of four of the polynomials exceed the acceptable threshold, while the color differences of the other three polynomials are all acceptable. The figure shows that the most accurate polynomial is only slightly more precise than the next best one. In addition, too many polynomial terms make the coefficient-solving process more difficult, so the polynomial with fewer terms of these two should be the most suitable for RGB color signal processing.
Using the regression coefficients from the training data, the relationship between the RGB and CIELAB signals can be described. To verify the precision of the different polynomials over the whole range of signals, the testing sample data are used. For the testing data, which are not part of the training data, the simulated CIELAB values corresponding to the RGB values are computed using the relationship obtained above, and the errors are calculated by comparison with the measured values. For the different polynomials and the testing data, the errors are listed in Table 2.
From Table 2, for the RGB and CIELAB color signals, the signal processing precision of the different polynomials on the testing data is similar to that on the training data shown in Table 1, and three of the polynomial forms give acceptable errors.
To test the performance of the different polynomials for CMYK signals, the training sample data of the Epson 9880 printer are used to obtain the relationship between CMYK and CIELAB. Because the errors are too large for the polynomials with the fewest terms, only the five polynomials with more terms are tested for CMYK signals. With the polynomial coefficients solved from the training data, the color differences for the 216 patches of the testing sample data are shown in Table 3.
It can be seen that, for the CMYK devices, most of the polynomials perform well, with low average errors. This is mainly because the color-rendering properties of CMYK printers, such as the regularity of the color gamut and the color consistency, surpass those of RGB monitors [1, 19]. On the whole, the fourth-degree polynomial, which has the largest number of terms, has the smallest color difference, and its error distribution is also ideal. The performance of the third-degree polynomial is close to that of the fourth-degree polynomial, which indicates that the preferred polynomial for CMYK and CIELAB signal processing should be the third-degree or the fourth-degree polynomial.
3.2. Selection of CIE Color Spaces during Color Signal Processing
During the signal processing for different color devices, the use of two CIE color spaces, CIEXYZ and CIELAB spaces, often brings in different color errors. Hence, it is important to find the appropriate CIE color space for the specified color devices and polynomials.
In the experiment, the CIEXYZ and CIELAB color spaces are tested on RGB monitors and on CMYK printers to compare their color errors. For the IBM monitor, when CIEXYZ is selected as the device-connection space, the errors of the different polynomials are shown in Table 4. Together with the color errors listed in Table 2, where CIELAB is the device-connection space, the mean color differences corresponding to these two CIE spaces are compared in Figure 2.
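When the regression is carried out in CIEXYZ, the predicted colors presumably have to be expressed in CIELAB before a CIE76 color difference can be computed. The sketch below uses the standard CIE conversion; the D65 reference white is an assumption chosen for illustration, since the white point used in the experiment is not given in the text, and the function names are hypothetical.

```python
# Assumed reference white (D65, 2-degree observer, Y normalized to 100).
D65_WHITE = (95.047, 100.0, 108.883)


def _f(t):
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta ** 2) + 4 / 29


def xyz_to_lab(xyz, white=D65_WHITE):
    """Convert a CIEXYZ color to CIELAB relative to the given white."""
    fx, fy, fz = (_f(v / w) for v, w in zip(xyz, white))
    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return (lightness, a_star, b_star)


# The reference white itself maps to L* = 100 with a* = b* = 0.
print(xyz_to_lab(D65_WHITE))
# A mid-gray with the same chromaticity as the white keeps a* and b* near 0.
print(xyz_to_lab(tuple(0.2 * w for w in D65_WHITE)))
```

The cube root in this conversion is one plausible reason the two connection spaces behave differently for low-degree polynomials: the mapping from device values to CIELAB and the mapping to CIEXYZ differ in how nonlinear they are.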
As the figure shows, for the RGB monitors, the polynomials perform somewhat differently with the CIEXYZ and CIELAB color spaces. For the low-degree polynomials, the color error of signal processing with the CIEXYZ space is greater than with the CIELAB space, and as the number of polynomial terms increases, the influence of the device-connection color space becomes very small. So when polynomials of fewer than 10 terms are used in signal processing, the CIELAB color space is recommended as the device-connection space, while for polynomials of between 10 and 20 terms, the CIEXYZ space is suggested.
To test the influence of the CIE color space for CMYK devices, the errors with the CIEXYZ and CIELAB color spaces are also compared on the Epson 9880 printer. The CMYK signal processing errors with the CIELAB space have been listed in Table 3; for the polynomials of 8 to 35 terms, the color errors with the CIEXYZ space are recorded in Table 5.
It can be seen that, for the two second-degree polynomials, the third-degree polynomial, and the fourth-degree polynomial, all the color errors are acceptable. Taking account of precision and computing efficiency, the third-degree polynomial is the most suitable model for CMYK devices using the CIEXYZ color space. In addition, similar to the comparison made for the RGB monitor, the CMYK signal processing using the CIELAB and CIEXYZ spaces is compared in Figure 3.
It can be seen that, within the CMYK signal processing based on polynomial regression method, the precision of color conversion using CIELAB color space is higher than that of using CIEXYZ space for a majority of the second, third, and fourth degree of polynomials. Therefore, in CMYK signal processing, the CIELAB color space is preferred as the device-connection space for the polynomial regression model.
3.3. Linearization during Color Signal Processing
For RGB devices, linearization is often applied to calibrate the device's gray balance: the RGB signals are first converted into lightness signals before the color conversion process. In the experiment, each of the R, G, and B signals is converted into its lightness signal by a per-channel linearization function with its own coefficients.
Within the color signal processing, the device RGB colors are first linearized into lightness signals, then the relationship between the linearized signals and CIELAB is obtained by the polynomial regression model, and finally the color error is calculated using the measured colors in the testing data. For the IBM monitor in the experiment, the color differences of the different linearized polynomials are shown in Table 6.
To test the influence of linearization on the color signal processing precision, the two groups of color errors in Tables 2 and 6 are compared. As shown in Figure 4, the errors of the two processes are very close, so it is concluded that for RGB devices linearization has little impact on the signal processing precision, especially for polynomials of degree three or higher.
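The order of operations in this pipeline (linearize each channel first, then build the polynomial terms) can be sketched as below. The linearization functions of the experiment are not reproduced here; a simple power-law (gamma) curve with an arbitrary exponent of 2.2 is swapped in purely as a stand-in, and the function names are hypothetical.

```python
def linearize(value, gamma=2.2, max_value=255.0):
    """Map an 8-bit channel value to a normalized signal in [0, 1].

    The power-law form and the exponent are assumptions for illustration,
    not the linearization functions used in the experiment.
    """
    return (value / max_value) ** gamma


def linearized_terms(r, g, b, gamma=2.2):
    """Linearize each channel, then form first-degree terms [1, R', G', B']."""
    r_lin, g_lin, b_lin = (linearize(v, gamma) for v in (r, g, b))
    return [1.0, r_lin, g_lin, b_lin]


# End points of the range are preserved and the mapping is increasing.
print(linearize(0), linearize(128), linearize(255))
print(linearized_terms(255, 0, 255))
```

Higher-degree terms would be built from the linearized channels in the same way as from raw device values; only the inputs to the polynomial change, and the least squares step is unaffected.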
Similarly, to test the linearization for CMYK signal processing, the color errors of testing sample of EPSON printer are recorded in Table 7, and the comparison with the signal processing without linearization in Table 3 is described in Figure 5.
It can be seen that, for the CMYK printers, the precision of signal processing with linearization is lower than that without linearization for most polynomial models. In some cases the linearization process does not improve the CMYK signal processing precision, so it is not advisable for CMYK devices calibration.
In this paper, the polynomial regression model is used for RGB and CMYK color signal processing. To improve the color signal processing precision, the number of terms and the degree of the polynomials are tested. By comparing the color errors within the color signal processing, the appropriate polynomial expressions for RGB and CMYK color devices are obtained. In addition, the parameters of device-connection color space and linearization are tested. In general, the findings for RGB and CMYK color signal processing with the polynomial regression method can be summarized as follows.
(1) During RGB and CMYK color signal processing with polynomial regression, it is advised to use the third-degree or the fourth-degree polynomial. Taking into account the coefficient-solving process and the color errors in the experiment, the third-degree polynomial is the most appropriate model.
(2) When a CIE color space is used as the device-connection space, for RGB devices the signal processing precision is higher with the CIEXYZ space than with the CIELAB space for polynomials of 10 to 20 terms, while in other cases CIELAB is more precise. For CMYK devices, in most cases the CIELAB color space performs better than the CIEXYZ space.
(3) When linearization is added to the color signal processing, the precision improves somewhat for some of the polynomials for RGB devices, while for CMYK devices the addition of linearization reduces the signal processing accuracy, so linearization is not recommended for CMYK signal processing.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work is supported by the National Science Foundation of China (no. 61174101), Doctor Foundation of Xi’an University of Technology (no. 104-211302), and “13115” Creative Foundation of Science and Technology (no. 2009GDGC-06), Shaanxi province of China.
X. Yanfang, L. Wenyao, and Z. Kunlonget, “Characterization of color scanners,” Optics and Precision Engineering, vol. 12, no. 1, pp. 15–20, 2003.
Y. Wang and H. Xu, “Colorimetric characterization for scanner based on polynomial regression models,” Acta Optica Sinica, vol. 27, no. 6, pp. 1135–1138, 2007.
X. Han, J. Shi, and X. Huang, “Study of characterizing color printers with the polynomial regression model,” Optical Technique, vol. 37, no. 1, pp. 25–30, 2011.