Abstract

Background. Reference interval (RI) research is to make it a concise, effective, and practical diagnostic tool. This study aimed to establish sex- and age-specific RI for myocardial enzyme activity in population aged 1–<18 years old in Changchun, China. Methods. Healthy subjects (n = 6,322, 1–<18 years old) were recruited from communities and schools. Aspartate aminotransferase (AST), lactate dehydrogenase (LDH), creatine kinase (CK), and creatine kinase isoenzyme (CKMB) were measured using an automatic biochemical analyzer. Fisher’s optimal segmentation method was used to partition by including percentiles as impact factors, aiming at minimizing the sum of the squares of the total dispersion into groups as splitting sequence of ordered data. Results. AST decreased gradually and was partitioned as 1, 2∼<10 and 10∼<18 years old. LDH presented disparate descending rate among 1∼<4, 4∼<12, and 12∼<18 years old. CK stood quite stable with the same RI in all ages. CKMB began to differ at 6 years of age sexually and then remained stable during 6∼<14 years old for male while it continued to decline in female. Cardiac development was partitioned as 1∼<6, 6∼<13, and 13∼<18 years old using multiple percentiles from massive data that reflect characteristics of totality as impact factors. Conclusions. Fisher’s optimal segmentation method excelled for multidimensionality, continuity, and loop calculating as dealing with RIs for myocardial enzymes activity and cardiac development process despite limitations. In future, impact of partition on the overall interval should be delved into.

1. Introduction

Reference interval (RI) is currently a hot spot in laboratory medicine. It is defined as the range between 2.5th percentile and 97.5th percentile of a certain indicator in healthy population [1]. As an important part of clinical laboratory and modern medicine, RI could provide valuable information for patients about diagnosis, progression, treatment, and outcome [2].

Studies have shown that RI is related to ethnicity [3], environment [4], diet [5], and others. Therefore, the establishment work must be carefully determined, taking into account the potential impact factors [6], such as enrolled individuals [7] and methodology [8]. Many research groups such as CALIPER (Canadian Laboratory Initiative in Pediatric Reference Intervals) [9] and KiGGS (German Health Interview and Examination Survey for Children and Adolescents) [10] have made great progress in this field [11].

RI partition is to divide the indicator into several stages according to the changes and distinct differences during growth and development, providing reliable and practical information for effective supervision, prevention, and diagnosis as well as management of public health. At present, the commonly applied partitioning method is to determine the changing point visually, and then use Z test, nonparametric method, or robust method to prove classification statistically based on actual data. Thus, the partition points are subjectively decided [12]. There could be too many groupings leading to inconvenience for clinical application, or the grouping is so little that they weaken the rationality.

We consider that partition problem be regarded as the cluster analysis of multidimensional ordered series, that is, the partition point with the largest difference between groups is found in massive data. Fisher’s optimal segmentation is a method for clustering ordered samples, showing characteristics of multifactor, continuous time series and the best classification by loop calculation. This study was to investigate myocardial enzymes partition: aspartate aminotransferase (AST), lactate dehydrogenase (LDH), creatine kinase (CK), and creatine kinase isoenzyme (CKMB) in healthy children and adolescents aged 1–<18 years old from Jilin Province, China and establish a RI of cardiac development combining the 4 indicators into the model.

2. Material and Methods

Fisher’s optimal segmentation, as cluster analysis of ordered samples, is based on the sum of the squares of the total deviations of the classifications, achieving minimal internal differences and maximum differences among groups.

Using , for representing n ordered samples into k classes, this segmentation can be expressed as follows:

The subpoint is (that is, ).

Talking about permutation and combination, division has S ways in total:

Among these ways, one or several must be more optimal with the smallest sum of squares of the total deviations of each classification.

If there are n ordered samples, each one is an m-dimensional vector, then the correlation matrix X can be built:

If the dimensions of characteristic values of indicators are different, it is necessary to perform nondimensionlessization with the following formula:

is the characteristic value after nondimensionlessization and is the maximum in the column of the j indicator.

Suppose P class contains samples recorded as . Mean of the class is

is used to indicate the diameter of class, and it can be recorded as

Essence of defining the optimal segmentation of objective function is to find a certain set of points that the sum of squares of the total dispersion of each classification is the smallest. Thus, the objective function is defined as

The smaller the objective function value, the smaller the internal difference and the larger the differences among classes. Segmentation that minimizes the value of the objective function is the optimal one, i.e.,

is the optimal k segment of n ordered samples. The theorem is as follows: the optimal k segment of ordered samples series must be completed by adding a segment after the optimal k-1 segment of one of its truncated sections. Therefore, the optimal two-division error formula can be obtained:

Then, the recursion formula of the optimal k segment is obtained:

If the number of is known, the segmentation method that minimizes is as follows.

Find the segment point ik at first so that is minimized, i.e.,

So, the k class is . Then, search for , making it

Thus, we obtained k-1 class as

The rest may be deduced by analogy, all the classifications can be obtained, which is the classification result of the optimal k classification. This is the result of the optimal k classification. Then, the curve of objective function with the number of k segment is plotted, and the k value corresponding to the turning point of the curve is the optimal segmentation number. Calculate the absolute value of the slope of the curve at each segmentation point:

Draw the -k curve. The larger is, the better the k classification is than the k−1 segmentation. When is close to 0, there is no need to continue the subdivision. Generally, k that corresponds to the maximum of is taken as the most optimal number of classifications.

CLSI (Clinical Laboratory Standard Institute) EP28-A3c [12] guideline does not set inclusion and exclusion criteria for reference individuals. This study refers to standard document for the Chinese adult reference interval [13]. Finally, children aged 1–<7 years old from communities and health centers as well as adolescents aged 7–<18 years old from primary schools, junior middle schools, and high schools in Jilin Province, who were apparently healthy, were targeted.

Selection process could be through into 3 steps: (1) questionnaire; (2) physician evaluation; and (3) laboratory screening. Experiment personnel issued the questionnaires at designated institutions and required guardians to fill it out strictly according to the facts, including height, weight, diet, health status, family history, medical conditions, diseases, recent infection, history of surgery within 6 months, blood donation or transfusion within 4 months, and pharmacy history within 2 weeks.

Questionnaires were collected and reviewed. There would be a pediatrician assessing health status of subjects at a certain day every week. After that the subjects were informed to draw blood at a designated place every week as required.

Laboratory exclusion criteria were as follows: HBsAg positive, HCV positive, and HIV antibody positive; serum creatinine (male) > 97 μmol/L; serum creatinine (female) > 73 μmol/L; serum uric acid > 475 μmol/L; fasting plasma glucose > 7.0 mmol/L; serum albumin < 35.0 g/L; C reactive protein > 10.0 mg/L; serum creatine kinase > 500 U/L; hemoglobin (male) < 120 g/L; hemoglobin (female) < 110 g/L; and white blood cell count < 3.0 × 109/L, or > 12.0 × 109/L.

This study was approved by the institutional ethics committee of the First Hospital of Jilin University (2016–306). Subject and his/her guardian signed up written consent. All methods/experiments were carried out in accordance with relevant guidelines and regulations (Declaration of Helsinki).

EP28-A3c (13) stipulates that there should be no less than 120 reference individuals each partition. Since partition points remain unknown, enrollment and sample collection work lasted continuously from September 2017 to December 2018 to ensure that every 1 year gap and each gender are included over 120 individuals. Recruitment was from 5 administrative regions (9 in total) of Jilin Province.

Subjects were guaranteed regular diet and exercise 3 days ago and fasted for >8 hours (age under 3 was suggested 3–6 hours) before blood collection. Four millimeters of venous blood were collected in plastic vacutainers, placed at room temperature for 30 minutes and then centrifuged at 3,000 rpm for 10 minutes. Hemolyzed, lipemia, or jaundice specimens were removed. Serum in Changchun city should be transported to the First Hospital of Jilin University within 2 hours after separation for analyzed, while serum beyond Changchun city required 8 hours by cold chain trucks.

Ortho VITROS 5600 automatic biochemical analyzer and reagents were applied to detect serum AST (multipoint rate method, reagent containing pyridoxal 5-phosphate), LDH (multipoint rate method), CK (multipoint rate method), and CKMB (multipoint rate method). Calibrators and quality control materials were also supported by Ortho Clinical Diagnostics.

Database was reviewed and miscellaneous data were eliminated. Data processing is carried out based on EP28-A3c (13). Outliers were removed using the Dixon method and re-evaluated by box plots. Kolmogorov–Smirnov test was conducted to determine if the data followed Gaussian distribution. If it was satisfied, percentiles of each index were calculated as ; if not, calculated after Box–Cox normality transformation with Mintlab. Percentiles were substituted as impact factors into the model, and then an ordered sample matrix X with a capacity of 17 and each one being a 20-dimensional vector were built. After obtaining the optimal classification number and points, Z test was to verdict whether there was statistical difference between classes. If Z > Z, classes were combined; otherwise, RI should be established, respectively. Gender difference was performed in the same way. Average or median of each indicator per 1 year old was calculated according to the normality, and a matrix with capacity of 17 and each one being a 4-dimensional vector were built by substituting the calculation into the model. Trends of indicators were assessed by scatter plots, and the partition results were verified.

Statistical analysis and figures were completed using LMS, Excel, Medcalc, SPSS, Matlab, and Mintlab.

3. Results

3.1. Baseline Information and RIs of Myocardial Enzyme

There were 6,322 healthy children and adolescents enrolled, including 2,998 subjects from Baishan, 2,193 subjects from Changchun, 500 subjects from Songyuan, 466 subjects from Yanbian, and 165 subjects from the city of Jilin. For 3,119 males and 3,203 females, sex ratio was 1 : 1.03. RI was obtained by performing Z test after the age segmentation using Fisher’s method (Table 1).

Taking CKMB as an example (including males and females), objective function -k and the nonnegative slope -k curve were plotted (Figure 1). It was shown that the error function -k decreased as the number of segments increased. At k = 3, -k curve became the steepest and made a bend, with -k curve reaching its peak. Hence, the optimal classification number k was 3, and specific endpoints were presented by code results. With -k and -k, the optimal classification numbers k for AST, LDH, or CK were also the same as that of CKMB. However, reference interval of CK does not need partition after Z test.

3.2. Age and Sex Related Trends

From the scatter plot of each indicator (Figure 2), it was shown that AST decreased and the speed slowed gradually, and males’ values overwhelmed females’ values. LDH presented disparate descending rate among age partitions among 1∼<4, 4∼<12, and 12∼<18 years old. CK stood quite stable, sharing the same RI in population aged 1–<18 years. CKMB differed between sexes at 6 years age; for male, CKMB remained stable during 6∼<14 years old and began to decline thereafter; for female, CKMB continued to decline with age.

3.3. Determination of Cardiac Development in Population Aged 1–<18 Years

Mean or median of each indicator from every year was included in this model; then, cardiac development was partitioned as 1∼<6, 6∼<13, and 13∼<18 years old. The partitions for male and female were calculated as 1∼<6, 6∼<14, and 14∼<18 and 1∼<6, 6∼<12, and 12∼<18 years old, respectively. It can be speculated that adolescent females’ heart development started to change earlier.

4. Discussion

4.1. Status of Pediatric RI Research

Pediatric RI study is the focus of clinical laboratory medicine [13]. Although many details are not quite clear yet, such as reference individual criteria, partition method, and efficacy verification [12], this project is of great significance. There is no feasible pediatric RI in China nowadays. RIs from various sources are applied in clinical laboratories at all levels, such as local databases, manuals, textbooks, or literature, of which data are outdated and credibility is quite doubtful [1]. Therefore, diagnosis of many pediatric diseases relies on rich experience of pediatricians largely, and it is often the case that explanation of results is inconsistent with diagnosis due to improper RIs. Establishment of an accurate and reasonable pediatric RI is critical for monitor and treatment of illness.

Many indicators change constantly due to children and adolescents’ growth and development [14, 15]. For example, alkaline phosphatase in neonates is slightly higher than adult level within 3 months after birth and turns out to be 2-3 times higher during puberty [16]. N-terminal B-type natriuretic peptide concentration in pediatric population reached 260% of adults’ [17]. It is obvious that simply following adult RI is likely to cause disturbance in some situations, which seriously affects clinical decision-making. In addition, the whole process of variation is continuous but staged owing to time-phased hormones and external stimuli [14]. Summing up the above, underlying idea of EP28-A3c [12] is “Let data speak”, leading to investigation of how to properly present and apply the information data contains. There is a still lot of work to do on this field.

4.2. Fisher’s Optimal Segmentation

The goal of RI research is to make it a concise, effective, and practical diagnostic tool. Age is an ordered variable apparently. For instance, partitions of indicators as AST and LDH that steadily rise or decline are judged from scatter plot subjectively. Changing point refers to a certain moment when the sudden alternation of an ordered sequence happens. Fisher’s optimal segmentation includes multiple percentiles from massive data that reflect characteristics of totality as impact factors into the model. Fisher’s optimal segmentation method is used as a clustering method for ordered samples. It splits a sequence of ordered samples according to principle that distinguishes among partitions being the most distinct as well as internal differences being smallest. The optimal solution is to minimize the sum of the squares of the total dispersion into groups, while all possible classifications maintain the time continuity. It has no strict requirements of data form with selection of percentiles being less affected by extreme values although certain information is lost.

The results are in agreement with some clinical cardiovascular studies. As for AST and LDH, the downward trends might be due to increased liver size, muscle mass, and fat-muscle distribution changing with age [18]; CK and CKMB are greatly affected by physical activities. We found the partitions are quite consistent with phases of elementary school (6–12 years old), junior middle school (12–15 years old), and high school (15–18 years old) in China, when physical education and sports in the peer group could exert a great impact.

Combination of multimarkers is commonly applied to assess myocardial damage. Trend of percentiles 75th, 90th, 95th, and 99th of LVMI 2.7 (body height to the allometric power of 2.7) increases gradually over 10–12 years old, stating that a fixed cut-off point would be theoretically inappropriate and generates a pattern of age-dependent subdiagnosis [19]. We had found that cardiac development was partitioned at 11 years of age in females as well as 12 years of age in males. Khoury et al [20] states that end-diastolic and end-systolic volumes of both ventricles indexed to weight exhibits a slight decrease from childhood to adolescence where a plateau shows at around 14 years old among 99 subjects of 8–20 years old, close to our results that cardiac development partitioned at 12 years old or so. Recent work has used steady-state free-processing (cine-SSFP) protocols to yield pediatric cardiac reference values in 60 children, divided into age groups of 8–11 years, 12–14 years, and 15–17 years old [21], which is different from our results for preadolescent period as 1–5 years and 6–12 years.

Though only being significant in the older group, sex differences are noted in both age groups when analyzing pediatric patients (8–15 years) versus adolescents/young adults (16–20 years) [22]. Besides, data shows that females have a lower prevalence of left ventricular hypertrophy (LVH) than men under any given level of blood pressure [23]. It might be the reason that most of the indicators begin to present higher levels in males in puberty. Goble et al. reports that body size, and in particular lean body mass, explains much of the variability in cardiac growth seen in children [24]. Thus, late bloomer as boys may not start to appear cardiac growth until fully growing stages as adolescence.

5. Limitation

First, this model cannot include age and sex into the model at the same time, as seen when the 4 indicators were combined to reflect cardiac development. Hence, it is not quite accurate for identifying the exact age when sex difference begins to show. Second, results of this study are a general trend to determine the optimal segmentation of cardiac development in children and adolescents aged 1∼<18 years old. Partition may vary slightly if age coverage changes.

6. Conclusion

Fisher's optimal segmentation method was used to establish RIs for myocardial enzyme activity in healthy children and adolescents aged 1–<18 years in Jilin Province, China. To describe cardiac development process, multiple percentiles were selected as impact factors included in the model for partition investigation. The method presented multidimensionality, continuity, and loop calculating as dealing with such problem. However, it also has some shortcomings such as inability to assess both age and sex at the same time. In future research, impact of partition on the overall interval should be delved into.

Data Availability

The testing data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare there are no conflicts of interest.

Authors’ Contributions

W. G. and Y. J. carried out studies and were involved in protocol development, gaining ethical approval, patient recruitment, and data analysis. J. X. wrote the final draft of the manuscript. W. G. and Y. J. were involved in patient recruitment and wrote the first draft of the manuscript. J. X. and Q. Z. wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.

Acknowledgments

The data cited from the laboratories of authors were supported in part by grants from the National Science Foundation of China (no. 81501839, to Dr. Qi Zhou), Scientific and Technological “13th Five-Year Plan” Project of Jilin Provincial Department of Education (no. JJKH20180214 KJ, to Dr. Qi Zhou.), Jilin Province Health and Technology Innovation Development Program (no. 2017J071, to Dr. Jiancheng Xu), Jilin Science and Technology Development Program (no. 20170623092TC-09, to Dr. Jiancheng Xu; no. 20160101091JC, to Dr. Jiancheng Xu; no. 20150414039GH, to Dr. Jiancheng Xu; and no. 20190304110YY, to Dr. Jiancheng Xu), The First Hospital Translational Funding for Scientific &Technological Achievements (no.JDYYZH-1902002, to Dr. Jiancheng Xu), and Norman Bethune Program of Jilin University (no. 2012223, to Dr. Jiancheng Xu).