Abstract

Grey prediction models are widely used to solve problems characterized by “small samples and poor information.” However, the modeling objects of existing grey prediction models are limited to homogeneous data sequences, which contain only one data type. This paper studies how to build a prediction model for a grey heterogeneous data sequence consisting of interval grey numbers and one real parameter. First, the position of the real parameter in the interval grey number sequence is discussed, and the real number is expanded into an interval grey number by means of grey generation. On this basis, a prediction model of interval grey numbers with a real parameter is derived. Finally, the model is applied to forecast the concentration of the organic pollutant DDT in the atmosphere. The results extend the object of grey prediction from homogeneous data sequences to grey heterogeneous data sequences and thereby enrich and improve the theory of grey prediction models.

1. Introduction

Grey system theory, established by Julong Deng in 1982 on the basis of the “grey box” concept, is a methodology that focuses on problems involving small samples and poor information [1, 2]. It deals with partially known information by generating, excavating, and extracting useful information from what is available [3, 4], so that the operational behaviors of systems and their laws of evolution can be correctly described and effectively monitored [5]. The grey prediction model is one of the most important parts of grey system theory, and it has been employed in many fields, such as industry [6, 7], agriculture [8], environment [9], and military [10]. To meet the needs of practical problems, scholars have carried out a great deal of research on the extension and optimization of grey prediction models. The findings are mainly concentrated in the following five aspects.
(a) Preprocessing methods for modeling sequences. These mainly include improving the smoothness of a sequence by function transformations [11] and weakening the effect of shock disturbances on the model structure through buffer operators [12]. Such measures can improve the accuracy of models.
(b) Optimization methods for the parameters of grey prediction models. These mainly use mathematical methods to optimize the initial values [13] and background values [14] of grey models, which improves the estimated model parameters.
(c) Modeling mechanisms of grey prediction models. These mainly include research on methods of grey generation of the modeling sequence [15], model stability and ill-conditioned modeling conditions [16], and the application ranges of grey prediction models [17].
(d) Novel grey prediction models, such as the discrete grey prediction model (DGM model) [18] and the nonequidistance grey prediction model (NGM model) [19].
(e) Other related studies. These include combining grey prediction models with other models or methods [20], for example, the Markov model or the support vector machine [21, 22], to improve the accuracy of grey models, as well as studying error check methods for grey prediction models.

Early research on and applications of grey prediction models took real numbers as the modeling objects. However, because of the complexity of the objects studied, the limits of technical means, and the limits of human cognition, real-world information often takes the form of grey numbers. For example, environmental monitoring data may be an interval grey number lying within a specific range, or a discrete grey number that takes one of several possible values. In such situations, traditional modeling methods based on real numbers cannot be used to build grey prediction models from data with grey uncertainty. To tackle the modeling of grey uncertain information, novel concepts such as the grey band, grey layer, and standard discrete grey number were proposed in [23, 24]. That research develops separate modeling methods for interval and discrete grey numbers, leading to a methodology that extends the modeling objects from real numbers to grey numbers.

From this point of view, grey prediction modeling objects can be categorized as (i) sequences of real numbers, (ii) sequences of interval grey numbers, and (iii) sequences of discrete grey numbers. Each category contains only one data type. In practical applications, however, the elements of a modeling sequence may be of mixed data types. For instance, the first element of a sequence may be a discrete grey number, the second a real number, the third an interval grey number, and so forth. Such sequences are called heterogeneous data sequences. How to build a grey prediction model based on heterogeneous data sequences therefore requires further research.

This paper studies the prediction model of interval grey numbers with a real parameter. The modeling object is a grey heterogeneous data sequence that contains several interval grey numbers and one real number. In the modeling process, the real number is expanded into an interval grey number either by a traditional grey prediction model based on real numbers or by nonadjacent neighbor mean generation. Subsequently, grey prediction models of the interval grey number sequence and of its kernel sequence are built on the basis of the authors’ earlier research findings. These models forecast the future ranges and kernels of the interval grey numbers.

Existing grey system models can simulate or forecast only homogeneous data sequences; that is, all data in the modeling sequence must be of the same data type. Consequently, the existing grey models are invalid when the modeling sequence contains two or more data types (i.e., a grey heterogeneous data sequence). For this reason, a novel prediction model of interval grey numbers with a real parameter is proposed in this paper. The present work extends the modeling object from homogeneous data sequences to heterogeneous ones, which expands the application scope of grey prediction models and enriches their theory.

The rest of the paper is organized as follows. Section 2 introduces the fundamentals of the DGM and IGM models. Section 3 studies the modeling method of the novel grey prediction model of interval grey numbers with a real parameter, IGRM for short. Section 4 analyzes the relationships between the IGRM model and other models. Section 5 employs the IGRM model to forecast the trend of the concentration of DDT in the atmosphere. Section 6 discusses related issues. Finally, Section 7 concludes the paper and outlines future work.

2. Preliminaries

2.1. DGM Model

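Definition 1. Assume that $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is a sequence, where $x^{(0)}(k) \ge 0$, $k = 1, 2, \ldots, n$, and $X^{(1)} = (x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n))$ is the 1-AGO sequence of $X^{(0)}$, where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$. Then the formula

$x^{(1)}(k+1) = \beta_1 x^{(1)}(k) + \beta_2$ (4)

is known as a DGM model or the discrete form of a GM model [18].

Theorem 2. If $\hat{\beta} = (\beta_1, \beta_2)^{T}$ is the parameter sequence and

$Y = (x^{(1)}(2), x^{(1)}(3), \ldots, x^{(1)}(n))^{T}$, $B = \begin{pmatrix} x^{(1)}(1) & 1 \\ x^{(1)}(2) & 1 \\ \vdots & \vdots \\ x^{(1)}(n-1) & 1 \end{pmatrix}$,

then the least squares estimate of the parameter sequence of grey differential formula (4) satisfies $\hat{\beta} = (B^{T}B)^{-1}B^{T}Y$.

Theorem 3. If $B$, $Y$, and $\hat{\beta}$ are as stated in Theorem 2 and $\hat{\beta} = (\beta_1, \beta_2)^{T} = (B^{T}B)^{-1}B^{T}Y$, then we can obtain the following formulas.
(1) Let $\hat{x}^{(1)}(1) = x^{(0)}(1)$; then $\hat{x}^{(1)}(k+1) = \beta_1^{k} x^{(0)}(1) + \dfrac{1 - \beta_1^{k}}{1 - \beta_1}\,\beta_2$, $k = 1, 2, \ldots, n-1$.
(2) The restored values can be calculated as follows: $\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$, $k = 1, 2, \ldots, n-1$.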

Note. The proofs of Theorems 2 and 3 are omitted here; details can be found in [18].
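The fitting and restoring steps of the DGM model can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the function names `fit_dgm` and `simulate_dgm` are hypothetical, and the least squares solution is written in the closed form of a two-parameter linear regression, which is assumed here to be equivalent to the matrix estimate of Theorem 2.

```python
from itertools import accumulate


def fit_dgm(x0):
    """Estimate (beta1, beta2) of x1(k+1) = beta1 * x1(k) + beta2 by least squares.

    x0 is the original nonnegative sequence; x1 is its 1-AGO sequence.
    """
    x1 = list(accumulate(x0))          # 1-AGO: running sums of x0
    xs, ys = x1[:-1], x1[1:]           # regress x1(k+1) on x1(k)
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    beta1 = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    beta2 = (sy - beta1 * sx) / m
    return beta1, beta2


def simulate_dgm(x0, steps_ahead=0):
    """Return restored values for k = 1..n + steps_ahead, with x1_hat(1) = x0(1)."""
    beta1, beta2 = fit_dgm(x0)
    x1_hat = [x0[0]]
    for _ in range(len(x0) + steps_ahead - 1):
        x1_hat.append(beta1 * x1_hat[-1] + beta2)
    # Inverse accumulation: x0_hat(k+1) = x1_hat(k+1) - x1_hat(k)
    return [x1_hat[0]] + [b - a for a, b in zip(x1_hat, x1_hat[1:])]
```

For a purely geometric sequence such as `[2, 4, 8, 16, 32]`, the 1-AGO sequence satisfies formula (4) exactly, so the sketch reproduces the data and extrapolates the same growth; real monitoring data would only be fitted approximately.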

2.2. Prediction Model of Interval Grey Number (IGM Model)

Definition 4. A grey number with both a lower limit and an upper limit is called an interval grey number, denoted as , where . A sequence which consists of interval grey number    is called an interval grey number sequence, denoted as , .

Definition 5. A sequence which consists of all those lower limits of interval grey numbers in is called the lower limit sequence of , denoted as , where is an interval grey number defined in Definition 4. Similarly, one uses to denote the upper limit sequence of .

Assume that is an interval grey number sequence, that is where According to literature [25], can be equivalently transformed into two real number sequences and ; that is, where

It has been proved in [25] that an interval grey number sequence and its transformed sequences (that is, and ) contain the same amount of information. This paper adopts this result and omits the proof. By building DGM models of sequences and , respectively, the grey prediction model based on the interval grey number sequence is developed as follows: where

The detailed derivation and the meaning of parameters , , , and can be found in [25].

3. Modeling Process

Definition 6. If an interval grey number sequence as defined in Definition 4 has one and only one element , where satisfies the constraint , then is called an interval grey number sequence with a real number, and is called the real parameter of sequence , denoted as .

Definition 7. An interval grey number sequence with a real number can be divided into three types according to the location of in as follows:
(i) when , is named the first element real parameter of interval grey number sequence, as shown in Figure 1(a);
(ii) when , is named the last element real parameter of interval grey number sequence, as shown in Figure 1(b);
(iii) when , is named the middle element real parameter of interval grey number sequence, as shown in Figure 1(c).

According to the preliminaries in Section 2, an interval grey number sequence can be transformed into two real number sequences and only when the elements of are all interval grey numbers. A grey prediction model of interval grey numbers based on the DGM model can then be developed. However, when the modeling object is an interval grey number sequence with a real number, it cannot be transformed into the two real number sequences and , so a grey prediction model cannot be built directly from formula (12).

To take advantage of the existing modeling method for the prediction model of interval grey numbers proposed in [25], it is necessary to expand the real parameter in Definition 6 into an interval grey number. In this way, a homogeneous data sequence with only one type of element is obtained from the heterogeneous data sequence with the real parameter . The modeling methods in [25] can then be applied to build a novel grey prediction model. The modeling process is shown in Figure 2.

There are three steps in the modeling process shown in Figure 2. Step 2 establishes an interval grey number prediction model based on the DGM model, a method already discussed in [25]; because it is straightforward, it is not described or analyzed further in this paper. The remainder of this section focuses on Steps 1 and 3.

3.1. Boundary Expansion of Real Parameters

Assume that the real parameter is also an interval grey number with upper limit and lower limit . The expansion of the real number into the corresponding interval grey number can be achieved by simulating the values of and . We use to denote the corresponding interval grey number. As defined in Definitions 4 and 5, when , the upper limit sequence of is ; meanwhile, the lower limit sequence is . Therefore, to expand the real number , one approach is to build antitone (reverse) grey prediction models of sequences and , so that the upper limit and lower limit of the interval grey number can be simulated. Given that sequence is the antitone sequence of , we have By building the DGM model of sequence , we have Let Then, formula (15) can be simplified to formula (17) as follows: According to formula (17), when , we can get the value of ; that is, where is the simulated value of the upper limit of interval grey number ; that is, .

Similarly, build the DGM model of sequence as follows: Let Then formula (19) can be simplified to formula (21) as follows: where is the simulated value of lower limit of interval grey number ; that is, .

According to formulas (19) and (21), we can get the following conclusion:

Hence, when , the expanded upper and lower sequences , are

When , the grey prediction models of sequences and can be built directly, which are used for simulating the upper limit and lower limit . Next, real parameter can then be converted to . Consider At the same time, provided , the expanded upper and lower sequences , are

When , the real number is located in the middle of , and the upper and lower limit sequences and are divided into four shorter subsequences by (as shown in Figure 3). Grey prediction models are built on small samples, but the sample size must be no less than four (i.e., ), and the number of elements in a modeling sequence is often small. Therefore, the four subsequences of , divided by , shown as , , , and in Figure 3, may not satisfy the modeling requirements of the grey prediction model. Consequently, the upper and lower limits of the real parameter cannot be expanded with the method used when or . In this paper, nonadjacent neighbor mean generation is adopted to tackle this task.

Assume that and are the simulated values of the upper and lower limit of real parameter , respectively; then the upper sequence and lower sequence can be written as follows: Then, a simple way to fill vacant data is by adopting the nonadjacent neighbor mean generation method. The calculation expressions of and are as follows:

Through the above steps, we obtain a sequence whose elements are all interval grey numbers (as shown in Figure 4). This lays the foundation for building a novel grey prediction model based on DGM in Section 3.2 of this paper.
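The three expansion cases of this section can be illustrated with a short Python sketch applied to one limit sequence at a time. It rests on several assumptions that are not taken from the paper: the missing bound is marked with `None`, the helper names `dgm_next` and `fill_missing_bound` are hypothetical, the first-element case is handled by fitting a DGM model to the reversed (antitone) remainder of the sequence and forecasting one step, and the nonadjacent neighbor mean is taken as the average of the two neighbors of the vacancy.

```python
from itertools import accumulate


def dgm_next(seq):
    """One-step-ahead DGM forecast for a positive real sequence (len >= 4 advised)."""
    x1 = list(accumulate(seq))                 # 1-AGO sequence
    xs, ys = x1[:-1], x1[1:]
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    beta1 = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    beta2 = (sy - beta1 * sx) / m
    x1_hat = seq[0]                            # simulate from x1_hat(1) = seq[0]
    for _ in range(len(seq) - 1):
        x1_hat = beta1 * x1_hat + beta2
    return (beta1 * x1_hat + beta2) - x1_hat   # restored value at position n + 1


def fill_missing_bound(bounds):
    """Fill the single None in a limit sequence (upper or lower) of the interval grey numbers."""
    r = bounds.index(None)
    if r == 0:                                 # first element: forecast on the reversed sequence
        return dgm_next(bounds[:0:-1])
    if r == len(bounds) - 1:                   # last element: ordinary one-step forecast
        return dgm_next(bounds[:-1])
    return 0.5 * (bounds[r - 1] + bounds[r + 1])   # middle element: mean of the two neighbors
```

Calling `fill_missing_bound` once on the upper limit sequence and once on the lower limit sequence yields the two simulated bounds of the real parameter. The sketch does not check that the resulting interval actually contains the original real number; that consistency check is left to the caller.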

3.2. DGM Model of Interval Grey Number Sequence

As discussed in the previous section, the expanded sequences take the following forms.

When ,

When ,

When ,

The conversion from and to the sequences and can be accomplished by adopting formula (11). Then, we employ formula (12) to build the DGM model of interval grey number sequence . The derivation is similar to that in Section 2 and is omitted here.

3.3. Prediction Model of Interval Grey Number with a Real Parameter

The purpose of expanding the boundary of the real parameter is to form complete upper and lower limit sequences, from which a prediction model can be built to forecast the boundaries of future interval grey numbers. However, converting a real number into an interval grey number turns known information into uncertain information and thus increases the inaccuracy of the forecast. Hence, to improve the simulation and prediction performance of the model, it is necessary to forecast the greatest possible value of an interval grey number in addition to its forecasted boundaries. In this section, we study this issue through the “kernel” of the interval grey number.

The kernel is an important concept in grey system theory and one of the basic attributes of a grey number. It is the real number most likely to represent the whitenization value of a grey number, given full consideration of the known information [26]. According to [26], when the distribution of the values of an interval grey number $\otimes \in [a, b]$ is unknown, its kernel is calculated as follows: $\tilde{\otimes} = \frac{1}{2}(a + b)$. (31) Since a real number can be seen as an interval grey number whose upper bound equals its lower bound, the kernel of a real number is the number itself.
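As a small illustration of the kernel calculation, the following Python sketch converts a mixed sequence into its kernel sequence. It assumes that the kernel of an interval grey number with unknown value distribution is the midpoint of its bounds; representing an interval grey number as a `(lower, upper)` tuple and a real parameter as a plain number is a choice made here for illustration, and the function names are hypothetical.

```python
def kernel(element):
    """Kernel of one element: midpoint for an interval, the value itself for a real number."""
    if isinstance(element, tuple):
        lower, upper = element
        return 0.5 * (lower + upper)
    return float(element)


def kernel_sequence(sequence):
    """Map an interval grey number sequence with a real parameter to a real sequence."""
    return [kernel(e) for e in sequence]


# Hypothetical data: two interval grey numbers around one real parameter
print(kernel_sequence([(1.0, 3.0), 2.5, (2.0, 4.0)]))  # prints [2.0, 2.5, 3.0]
```

The resulting real sequence can be modeled with an ordinary DGM model, so no boundary expansion is needed for the kernel forecast.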

Based on the above analysis, the interval grey number sequence with a real parameter can be converted into a real number sequence based on kernels according to formula (31), denoted as ; that is,

By building a DGM model of the kernel sequence, the greatest possible value of an interval grey number can be obtained as follows:

According to formulas (12) and (33), the final expression of the novel grey prediction model is as follows: and .

When the position of the real parameter in is different, the calculation methods of and in formula (34) vary accordingly, as discussed below.

When , according to formula (11), Then, When , the calculation methods of and are the same as before; that is, According to the above discussion, the final form of the novel model is described as follows: Formula (38) is called the prediction model of interval grey number with a real number based on DGM model, IGRM model for short.

4. Relations between IGRM Model and Other Models

In this section, the relations among the IGRM, IGM (short for “interval grey number prediction model”), and DGM models are discussed.

The modeling object of DGM is a real number sequence, whereas the modeling object of IGRM is an interval grey number sequence with a real number. As supplementary information about the grey system grows, every interval grey number in IGRM may become a real number; in other words, the lower limit and the corresponding upper limit of each interval grey number coincide at one point. In that case, the model can only be built from the kernel sequence of the area sequence based on formula (33), instead of being deduced from formula (19), and it is in fact a traditional DGM model. Therefore, IGRM is an extension and further development of the DGM model, and the relationship between them is that of the general case to the special case.

Similarly, when the real parameter in IGRM is expanded into an interval grey number, IGRM can be seen as a standard interval grey number prediction model, which has the same structure as the model in formula (12). The relationships among them are shown in Figure 5.

5. Case Study

Persistent organic pollutants (POPs) are toxic chemicals that adversely affect human health and the environment worldwide. They remain in the environment for long periods of time and can accumulate and pass from one species to the next through the food chain [27]. Currently, twenty-two POPs have been included in the “Stockholm Convention on POPs,” and twelve of them are organochlorine pesticides (OCPs). In recent years, OCPs have become an active research topic in environmental chemistry. In particular, dichlorodiphenyltrichloroethane (DDT) is one of the common OCP pollutants, with a relatively long history of usage and a high accumulated production in China. In this paper, the IGRM model is used to forecast the trend of the concentration of DDT. Table 1 lists the assumed concentration values of DDT in the atmosphere of a certain city in southern China in different time intervals.

Data Specification. The monitoring data in Table 1 are not recorded at single time points but over several continuous time periods, each of which contains several monitoring time points. Because the DDT concentration in the atmosphere may differ between monitoring time points, and we cannot determine which monitoring value is more accurate, the DDT concentration in a time period can only be described by the range between the lower and upper limits of the collected monitoring data. The DDT concentration is therefore an interval grey number (as shown in TR-1~7 in Table 1). If the monitoring data collected at all monitoring time points in a time period take a single real value, that value is taken as the monitoring value for the period (as shown in TR-8 in Table 1).

Data in Table 1 can be described as an interval grey number sequence with a real number; that is,

A dynamic grey prediction model of the concentration of DDT in the atmosphere is built by adopting IGRM model, and the modeling processes are described in detail in the following steps.

Step 1 (boundary expansion of real parameter). The lower limit sequence and upper limit sequence of are as follows according to Table 1, respectively: Then, we can build DGM models of sequences and , correspondingly, as follows: ,   , the average relative error %, %, and the prediction values and are as follows: Using above equations, we get . Hence, the expanded interval grey number sequence is as follows.

Step 2 (building prediction model of interval grey number sequence ). Sequences and can be computed according to formula (11) and sequence as follows: By building DGM of sequences and , the corresponding parameters and relative errors can be obtained as follows.
Parameters of DGM of sequence : and .
Parameters of DGM of sequence : and .
Then, parameters and of IGRM can be computed as follows: Substituting and into formula (38), we can get the following results: Formula (46) is the interval grey number prediction model of the concentration of DDT in the atmosphere.

Step 3 (IGRM model of the concentration of DDT in the atmosphere). First, build the DGM model of “kernel” (which is the greatest possible value of interval grey number in its range) sequence. The expanded interval grey number sequence is as follows: According to formula (31), the kernel sequence of is as follows: Building the DGM model of , one has where the parameters of DGM of are and %. Consider Further, by combining formulas (46) and (50), we have
Formula (51) is the IGRM model of the concentration of DDT in the atmosphere. It is a dynamic prediction model. Based on the value of , we can forecast the range and greatest possible value of the concentration of DDT.

Step 4 (forecasting the range and greatest possible value of the concentration of DDT when ). Forecasting infers information about the future by analyzing and summarizing the historical laws of a system’s development, on the assumption that the system continues to develop according to those laws. The quality of a prediction model therefore depends on whether it captures the development law of the system effectively. Consequently, the simulation precision of a model must be evaluated before the model is used to predict the development trend, and the model can be employed only if it passes the test. In general, the average relative error check is the most common test method for grey prediction models.
In this paper, the errors of the model simulation are analyzed from three different perspectives, presented in formula (51). They are lower limit sequence, upper limit sequence, and kernel sequence, respectively. The evaluation result is shown in Tables 2, 3 and 4.
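The average relative error check mentioned above can be sketched as follows. This is an illustrative helper, not the paper's exact procedure: the function name and the `skip_first` option are assumptions, the latter added because the first simulated value of a DGM model equals the first observation by construction and is sometimes excluded from the average.

```python
def mean_relative_error(actual, simulated, skip_first=False):
    """Average relative simulation error in percent over paired observations."""
    pairs = list(zip(actual, simulated))
    if skip_first:
        pairs = pairs[1:]          # drop the first point, which a DGM fit reproduces exactly
    errors = [abs(a - s) / abs(a) for a, s in pairs]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical values: relative errors of 10% and 5% average to about 7.5%
print(mean_relative_error([100.0, 200.0], [90.0, 210.0]))
```

The same helper can be applied separately to the lower limit, upper limit, and kernel sequences to obtain the three error figures that are then combined into a synthetic error.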

The comparison between original and simulated values is shown in Figure 6.

Finally, the synthetic simulated error of IGRM model of the concentration of DDT in the atmosphere can be calculated as follows:

According to the accuracy test reference table [28], the precision of the IGRM model lies between the first and second classes, so the model can be used for short- and medium-term prediction. Prediction values are shown in Table 5 for .

6. Discussion

Although this paper studies interval grey number sequences with only one real number, the modeling method also serves as a practical reference when the number of real numbers is greater than 1 (). In this case, the real numbers can be expanded by adjacent (or nonadjacent) neighbor mean generation, and the corresponding grey prediction model can then be built according to formula (38). The simulation or prediction precision of the IGRM model may be improved by various measures according to the practical situation: (1) optimizing the initial and background values of the IGRM model and improving the smoothness of the modeling sequence, (2) combining different methods, and so forth.

In practical applications, when the simulation error of the IGRM model is too large to meet the accuracy requirement stipulated in [28], the IGRM model should not be used for forecasting, and other approaches need to be investigated.

7. Conclusions

Currently, the modeling objects of grey prediction models are mainly time series whose elements share the same data type. The existing models become impractical when the elements of the modeling sequence are of different data types (a heterogeneous data sequence). This paper proposes a new prediction modeling approach for this case. First, the position of the real number in the interval grey number sequence is identified. Then, depending on that position, the real number is expanded into an interval grey number by either a DGM model or nonadjacent neighbor mean generation. Next, the grey prediction model of the interval grey number sequence and the grey prediction model of its kernel sequence are built, and the two are combined into a new model, IGRM. Finally, the modeling method is employed to forecast the concentration of DDT in the atmosphere, and the evaluation of the results verifies the effectiveness and practicability of the model.

The research findings of this paper extend the modeling objects of grey prediction models from homogeneous data to heterogeneous data, which enriches and improves the theory of grey prediction modeling. The next research target is to build a grey prediction model for grey heterogeneous data sequences that contain both interval grey numbers and discrete grey numbers.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by The National Natural Science Foundation, China (71271226 and 51375517), The Humanistic and Social Science Youth Foundation of Ministry of Education of China (11YJC630273 and 14YJAZH033), Program for Chongqing Innovation Team in University (KJTD201313), and Chongqing Frontier and Applied Basic Research Project (cstc2014jcyjA00024 and cstc2014jcyjA00037). The authors thank the anonymous referees for their constructive remarks that helped to improve the clarity and the completeness of this paper.