Abstract

This paper introduces an approach to parameter identification for a statistical predictive model using the available individual data. The unknown parameters are separated into two groups: those specifying the average trend over a large set of individuals and those describing the details of a specific person. To calculate the vector of unknown parameters, a multidimensional constrained optimization problem is solved that minimizes the discrepancy between the real data and the model prediction over the set of feasible solutions. Both the individual retrospective data and the factors influencing the individual dynamics are taken into account. The application of the method to predicting the locomotor development of a patient with congenital motility disorders is considered.

1. Introduction

Multidimensional statistical methods, including prediction and dependency analysis, are required to study complex systems with many random factors [1–4]. To obtain a more reliable individual prediction, patient-specific information needs to be taken into account to correct the parameters of the statistical model.

Modeling of biomedical systems is a topical application area for mathematical methods and algorithms. This paper studies the problem of predicting locomotor development for people with congenital motility abnormalities. The authors propose an approach to identifying the parameters of a statistical predictive model that takes into account the available set of individual data.

2. Current State

Mathematical methods and information technologies are important tools in modern biomedical research. Together with physicomechanical modeling of physiological systems [5–7], the study of the statistical properties of disease courses in patients with similar diagnoses is also of interest. Statistical predictive modeling in medicine covers a wide range of areas, including cardiology [8–10], pulmonology [11–13], neurology [14, 15], and others.

One of the disadvantages of statistical predictive models is that they are only applicable to an average individual. To get a more accurate prediction, it is necessary to identify the parameters of the statistical predictive model using data for a specific person. In medical science, this approach, known as the “individual prediction” method [16, 17], is of major interest for modern interdisciplinary research.

3. Materials and Methods

3.1. Initial Data Analysis

The motility index quantifies the level of locomotor development. To calculate the index, an expert evaluates 12 different groups of locomotor skills. The expert estimates are scored on a five-point scale, and the index is defined as the sum of the scores. By construction, the index lies within the interval [0, 60]. The best locomotor development corresponds to the upper bound of the index and the worst locomotor development corresponds to the lower bound.
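The index construction can be sketched in a few lines. This is an illustration only: the function name `motility_index`, the example scores, and the per-group score range of 0 to 5 (inferred from 12 groups and an overall range of 0 to 60) are assumptions, not details confirmed by the text.

```python
def motility_index(scores, max_score=5):
    """Sum the expert scores over the 12 locomotor skill groups.

    Assumption: each group is scored from 0 to max_score, so that
    12 groups give an index between 0 and 60.
    """
    if len(scores) != 12:
        raise ValueError("expected scores for exactly 12 skill groups")
    if any(s < 0 or s > max_score for s in scores):
        raise ValueError("each score must lie between 0 and max_score")
    return sum(scores)


# Hypothetical expert scores for one visit (not data from the study).
example_scores = [3, 4, 2, 5, 1, 0, 3, 3, 4, 2, 5, 1]
index = motility_index(example_scores)
```

With these hypothetical scores the index is simply the total of the twelve entries; the two extreme score vectors reproduce the bounds of the interval.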

The dependence of the motility index on age for a group of patients is treated as a random process $M(t)$, where $t$ is the patient’s age. A realization $M_i(t)$ of the process corresponds to each patient $i$ in the group.

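To analyze the structure of the stochastic process, the following normalized correlation function is used:

$$K(t_1, t_2) = \frac{E\left[\mathring{M}(t_1)\,\mathring{M}(t_2)\right]}{\sqrt{D(t_1)\,D(t_2)}},$$

where $\mathring{M}(t_1)$ and $\mathring{M}(t_2)$ are time slices of the centred random process at the times $t_1$ and $t_2$, $D(t_1)$ and $D(t_2)$ are the variances of the process at the moments $t_1$ and $t_2$, and $E$ is the expectation operator.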
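As a minimal sketch, the normalized correlation between two time slices (the mean product of the centred slices divided by the square root of the product of their variances) can be estimated from a set of realizations as follows. The function name, the data layout (one list per patient, all sampled at the same ages), and the numbers are hypothetical.

```python
from math import sqrt


def correlation_function(realizations, i, j):
    """Estimate the normalized correlation between time slices i and j.

    realizations: list of per-patient lists, all sampled at the same ages.
    """
    n = len(realizations)
    slice_i = [r[i] for r in realizations]
    slice_j = [r[j] for r in realizations]
    mean_i = sum(slice_i) / n
    mean_j = sum(slice_j) / n
    # Centre both time slices around their sample means.
    centred_i = [x - mean_i for x in slice_i]
    centred_j = [x - mean_j for x in slice_j]
    covariance = sum(a * b for a, b in zip(centred_i, centred_j)) / n
    variance_i = sum(a * a for a in centred_i) / n
    variance_j = sum(b * b for b in centred_j) / n
    return covariance / sqrt(variance_i * variance_j)


# Hypothetical motility indices of four patients at four common ages.
data = [
    [10, 18, 25, 30],
    [12, 20, 28, 34],
    [5, 11, 16, 20],
    [20, 30, 38, 44],
]
k = correlation_function(data, 0, 3)
```

A value of `k` near one for widely separated ages indicates that the realizations keep their relative ordering over time, which is the similarity property the method relies on.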

The available statistics database contains 157 observations of cerebral palsy patients. The research sample was based on occasional patient visits to a rehabilitation centre during their first nine years of life. Experts determined each patient’s motility index during their visits. Detailed information was collected on each patient: medical status of parents and close relatives prior to the individual’s conception and characteristics of prenatal and perinatal periods (prenatal and intranatal factors).

Despite the relatively large number of observations, only 5 patients were observed continuously throughout the entire research period. Given this data limitation, the study of locomotor skills development was conducted on a group of 5 people, and the complete initial sample of 157 observations was used for calculating the average parameters.

A scatter diagram of the random process $M(t)$ is shown in Figure 1, where $t$ is the age (in months). Figure 2 shows the graph of the corresponding correlation function for fixed age (months).

When the correlation function is close to one, the process is characterized by a strong dependence between the time slices. This indicates that the process realizations are similar. This similarity is a necessary condition for applying the proposed method.

3.2. General Algorithm for Individual Prediction

Individual prediction of motility index is based on the following algorithm.

The first step determines the age dependence of the average motility index. The method of sequential selection of exponential terms [16] is used for this purpose. According to this method, the average motility index trend is represented as follows: which includes the following unknown parameters: .

The individual motility index trend is constructed as a monotonically increasing function that is bounded from above:

This paper studies only the process of accumulating locomotor skills. The loss of such skills is not taken into account, so the assumption of monotonic increase can be used. The motility index bounds are defined by its calculation method (from 0 to 60).

The generalized factor of prenatal and intranatal conditions is introduced as follows: where is the risk factors during pregnancy and birth that have the highest influence on the motility index, and is the correlation between and the motility index.

It is assumed that these factors are responsible for developmental delay and, as such, affect the value of the absolute term.

The motility index value at time 0, with no prenatal and intranatal risk factors taken into account, is calculated by solving the following convex optimization problem with inequality constraints.

Find and such that: where ) are known values calculated during the average trend identification, is the number of real motility index values in the interval of the initial individual observation is the ’s patient real motility index value at the moment of time , is the prediction horizon, and is the possible highest value of motility index.

The first expression in system (5) is the minimum condition for the sum of square deviations from real motility index values. The second expression is the motility monotonic growth condition. The third one is the motility index restriction condition.

Problem (5) can be solved either numerically or analytically using the Kuhn-Tucker theorem [18].
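The Kuhn-Tucker approach can be illustrated on a deliberately simplified stand-in for problem (5). The one-term trend `a0 + a1 * exp(-b * t)` with a known rate `b`, the function names, and the three linear constraints below (non-decreasing trend, upper bound `m_max`, non-negative initial value) are assumptions of this sketch and not the exact form of system (5). For a convex least-squares problem with linear inequality constraints, the optimum coincides with the equality-constrained minimizer on its set of active constraints, so with two unknowns it is enough to enumerate the possible active sets and keep the best feasible candidate.

```python
from itertools import combinations
from math import exp


def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule; None if singular."""
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        return None
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


def fit_constrained(ages, values, b, m_max=60.0):
    """Least-squares fit of a0 + a1*exp(-b*t) under linear constraints.

    Hypothetical constraints, each written as g . (a0, a1) <= h:
      a1 <= 0        (trend does not decrease with age, since b > 0)
      a0 <= m_max    (trend stays below the highest possible index)
      -a0 - a1 <= 0  (initial value is non-negative)
    """
    e = [exp(-b * t) for t in ages]

    def sse(a):
        return sum((a[0] + a[1] * ei - y) ** 2 for ei, y in zip(e, values))

    cons = [((0.0, 1.0), 0.0), ((1.0, 0.0), m_max), ((-1.0, -1.0), 0.0)]
    candidates = []

    # No active constraint: ordinary normal equations.
    n, se = len(ages), sum(e)
    see = sum(x * x for x in e)
    sy = sum(values)
    sey = sum(x * y for x, y in zip(e, values))
    candidates.append(solve2(n, se, se, see, sy, sey))

    # One active constraint: minimize along the line g . a = h.
    for g, h in cons:
        gg = g[0] ** 2 + g[1] ** 2
        p0 = (g[0] * h / gg, g[1] * h / gg)  # a point on the line
        d = (-g[1], g[0])                    # direction of the line
        num = sum((d[0] + d[1] * ei) * (p0[0] + p0[1] * ei - y)
                  for ei, y in zip(e, values))
        den = sum((d[0] + d[1] * ei) ** 2 for ei in e)
        if den > 1e-12:
            s = -num / den
            candidates.append((p0[0] + s * d[0], p0[1] + s * d[1]))

    # Two active constraints: the intersection point of two lines.
    for (g1, h1), (g2, h2) in combinations(cons, 2):
        candidates.append(solve2(g1[0], g1[1], g2[0], g2[1], h1, h2))

    feasible = [a for a in candidates if a is not None
                and all(g[0] * a[0] + g[1] * a[1] <= h + 1e-9
                        for g, h in cons)]
    return min(feasible, key=sse)


# Hypothetical base-period data generated from a0 = 40, a1 = -30, b = 0.03.
ages = [0, 12, 24, 36, 48]
values = [40 - 30 * exp(-0.03 * t) for t in ages]
a0, a1 = fit_constrained(ages, values, b=0.03)
```

When the data are consistent with the constraints, as in this example, the unconstrained candidate is already feasible and is returned. When the observations decrease with age, the monotonicity constraint becomes active and the fit degenerates to a constant equal to the mean of the observations.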

The slope coefficient (which determines how strongly the factors influence the initial motility index) is calculated by a linear approximation of the dependence of the real initial motility index on the generalized prenatal and intranatal factor for a group of patients. The calculated coefficient is used for adjustment taking the influence of prenatal and intranatal factors into account: . The other individual coefficients are calculated by solving the optimization problem (5) with the adjusted coefficient obtained during the previous step.
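The slope estimation and the subsequent adjustment can be sketched as below. The least-squares line is standard; the additive form of the adjustment (risk-free initial value plus slope times the patient's generalized factor), the function name, and all numbers are assumptions made for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical group data: generalized factor vs. initial motility index.
group_factor = [0.0, 0.5, 1.0, 1.5]
group_initial_index = [30.0, 22.0, 14.0, 6.0]
slope, intercept = least_squares_line(group_factor, group_initial_index)

# Assumed additive adjustment of the risk-free initial value.
initial_without_risk = 28.0  # hypothetical output of the optimization step
patient_factor = 0.5         # hypothetical generalized factor of the patient
adjusted_initial = initial_without_risk + slope * patient_factor
```

A negative slope means that a larger generalized factor lowers the adjusted initial motility index, which matches the interpretation of the factors as a source of developmental delay.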

4. Results and Discussion

The practical applicability of this method is illustrated for a patient . The person is a cerebral palsy patient and was observed at the Perm Center of Complex Rehabilitation for People with Disabilities during the first nine years of life.

The entire observation period is divided into two intervals: the base period (first four years) and the prediction period (next five years). The base-period data are used to develop the motility index prediction model, while the prediction-period data serve as a test of the prediction results (Table 1).

The initial locomotor development dynamics of the observed patient (black dots in Figure 3) differ significantly from the age dependence of the average motility index for a group of patients with a similar diagnosis (dotted line in Figure 3). Therefore, a prediction based on the average trend alone will not be acceptably accurate. At the same time, a prediction based on the individual data alone will also be inaccurate, since the base period is shorter than the prediction period. The proposed method, which corrects the average trend parameters using individual patient data, is therefore applied to increase the prediction accuracy.

Using the method of sequential selection of exponential terms [16] results in the expression for the average motility index trend:

Correlation analysis is applied to identify the set of prenatal and intranatal conditions (Table 2) that significantly influence the motility index for a group of patients with cerebral palsy. The sample of 157 observations was used to calculate the correlation coefficients [17].

The individual values of prenatal and intranatal factors are shown in Table 3: “0” means “no”, “1” means “yes”.

The generalized factor of prenatal and intranatal conditions with the individual data and the correlations shown in Tables 2 and 3 being taken into account has the value .

Solving the system (5) gives the motility index at the initial time in the absence of prenatal and intranatal risk factors: .

The coefficient obtained by the least-squares method [19] for the group of patients is −16.68. The corresponding adjusted initial motility index is calculated as: . The remaining coefficients are , .

The identified individual motility index trend is shown in Figure 4 (solid line). Compared to the average trend the individual predictive model is much closer to the control data (gray dots). The determination coefficient is .
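For reference, a common definition of the determination coefficient is one minus the ratio of the residual sum of squares to the total sum of squares of the control data. The sketch below uses that definition, which is assumed here rather than taken from the text, and the numbers are hypothetical rather than the patient's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical control data and model predictions for the same ages.
observed = [30.0, 36.0, 40.0, 43.0, 45.0]
predicted = [31.0, 35.0, 41.0, 42.0, 46.0]
r2 = r_squared(observed, predicted)
```

A value close to one means the predicted trend explains most of the variation in the control data, while a prediction no better than the mean of the control data gives zero.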

5. Conclusions

The proposed method of parameter identification for a statistical predictive model based on a set of individual patient data can be applied successfully provided that the correlation function is close to one. This method is based on multidimensional statistical analysis [1–4] and the individual prediction approach [16, 17].

The application of the method to statistical modeling of medical and social systems resulted in an algorithm which does not require modeling of physical and mechanical processes. The obtained algorithm is based on statistical analysis only. The application of the algorithm significantly improves the prediction accuracy compared to the predictions based on average statistics. The improvement has been successfully demonstrated in an example of individual locomotor development prediction for a patient with congenital motility disorders.

The results show the advantages of the proposed method when the predicted index varies significantly within the group and the prediction interval is longer than the base period.

Acknowledgment

This work was supported by the Russian Foundation for Basic Research (Grant 10-04-96096-r_ural_a).