Purpose. To analyze the effects of dosimetric parameters and clinical characteristics on overall survival (OS) using machine learning algorithms. Methods and Materials. A total of 128 patients with cervical cancer were treated with definitive pelvic radiotherapy with or without chemotherapy, followed by image-guided brachytherapy. Elastic-net models integrating dose-volume histogram (DVH) parameters and baseline clinical factors, using only DVH parameters, and using only baseline clinical factors were constructed with 5-fold cross-validation over 100 bootstrapping iterations and were then compared using the concordance index (C-index). Finally, the selected important factors were used to build multivariable Cox-pH models for OS, which were also presented as nomograms for clinical use. Results. The median OS was 25.78 months, with 25 (19.53%) deaths. The elastic-net models integrating clinical and DVH factors had the best prediction performance (C-index 0.76 in the train set and 0.74 in the test set). Three important factors were selected: baseline hemoglobin level as a protective factor, and primary gross tumor volume (GTV_P volume) and body V5 as risk factors. The final multivariable Cox-pH model built from these factors had a C-index of 0.78 (95%CI: 0.73–0.81). Conclusions. This is the first attempt to establish elastic-net models to study the contributions of DVH parameters to predicting OS in patients with cervical cancer. These results can facilitate individualized tailoring of radiation treatment in cervical cancer patients.

1. Introduction

Cervical cancer is the fourth most frequently diagnosed cancer and the fourth leading cause of cancer death in women, with an estimated 604,000 new cases and 342,000 deaths worldwide in 2020 [1]. Multidisciplinary management planning based on the tumor size and extension made by a multidisciplinary tumor board before the start of any treatment is recommended by European Society for Medical Oncology (ESMO) guideline of cervical cancer [2]. For International Federation of Gynecology and Obstetrics (FIGO) stage IA1 to IB1, surgery is the main treatment with adjuvant radiotherapy (RT) ± chemotherapy in case of risk factors, and for the FIGO stage IB2–IVA, concurrent chemoradiotherapy (CCRT) represents the standard [2]. Definitive RT ± chemotherapy can also be used for patients with the FIGO stage IVB with oligometastasis [3] or who are not candidates for hysterectomy. Neoadjuvant chemotherapy remains controversial for locally advanced cervical cancer [2]. With regard to immunotherapy, in addition to its applications in recurrent or metastatic cervical cancer [4, 5], ongoing trials are investigating the combination of immunotherapy with RT or CCRT in locally advanced cervical cancer [5, 6]. Despite modern advances in various treatment modalities, the mortality of cervical cancer still remains high, with 5-year overall survival (OS) of about 65% after CCRT [7]. Therefore, it is crucial to identify prognostic factors to tailor personalized management strategies for patients with different risk levels.

The FIGO staging system has been the most commonly used method to classify the prognosis of cervical cancer patients. The 5-year OS rates of FIGO stage I–IV cervical cancer were 83%–100%, 70%–80%, 42%, and 32%, respectively [8, 9]. Over the years, researchers have made a tremendous effort to identify other clinical prognostic factors for OS [7]. High body mass index (BMI > 25) at the time of cancer diagnosis was found to be positively associated with 2-year and 5-year survival rates [10]. Biological parameters, including pretreatment levels of hemoglobin, leucocytes, and platelets, were identified as prognostic factors for locally advanced cervical cancer [11]. A previously developed nomogram with a C-index of 0.713 identified several prognostic factors associated with OS, including squamous cell carcinoma antigen (SCC-Ag), BMI, tumor size, pelvic wall involvement, and para-aortic lymph node metastasis [12]. Concurrent chemotherapy (≥4 cycles) [13], monocyte count [14], age [15], and performance status were also found to be prognostic for survival. Another nomogram showed that tumor size, grading, and parametrial status affected 5-year OS in locally advanced cervical cancer primarily treated with neoadjuvant chemotherapy followed by radical surgery [16]. Nevertheless, the abovementioned prognostic factors mainly describe clinical features rather than radiotherapy parameters. Since radiotherapy forms the backbone of cervical cancer treatment, it is reasonable to presume that dose-volume histogram (DVH) parameters may have an impact on OS. An analysis of the dose-effect relationship between DVH parameters and prognosis in cervical cancer patients suggested that the doses received by 100%, 98%, and 90% of the high-risk clinical target volume (HR-CTV D100, HR-CTV D98, and HR-CTV D90) were independent factors affecting OS [17]. A retrospective DVH analysis showed that the equivalent dose in 2 Gy fractions (EQD2) of HR-CTV D90 was a significant determinant of OS in patients with uterine cervical cancer [13].
In addition to target volume, the prognostic impact of DVH parameters of organs at risk (OARs) has been studied in a range of cancer types. Multivariate analyses showed that lung V20 (volume covered by radiation dose of ≥ 20 Gy) and lung V5 (volume covered by radiation dose of ≥ 5 Gy) were associated with OS in patients with esophageal cancer treated with neoadjuvant chemoradiotherapy when adjusting for surgical margin and pathological treatment response [18]. To the best of our knowledge, there are no studies on the effect of DVH parameters of both tumor and OARs on OS during external beam radiotherapy (EBRT) of cervical cancer.

Different approaches can be employed to identify clinical and dosimetric parameters that affect the patient’s outcome. A multidimensional nomogram has been developed for predicting progression-free survival (PFS) in patients with locoregionally advanced nasopharyngeal carcinoma [19]. The random survival forest model identified D99 (the dose that covered 99% of the volume) as an important variable associated with survival in high-grade glioma [20]. Besides the random survival forest model, the elastic-net model, as a machine learning method, has yielded higher discriminative performance for (chemo)radiotherapy outcomes than other studied classifiers [21]. Therefore, in this study, we employed an elastic-net model to determine the key clinical and DVH parameters for predicting the survival outcome of cervical cancer patients.

2. Materials and Methods

2.1. Description of Cohorts

The local institutional ethics committee approved the study (reference number (2019) 049). All patients provided written informed consent for the use of personal medical records for academic purposes before treatment, and the requirement for a consent form specific to this study was waived.

A cohort of patients diagnosed with cervical cancer in a single institute in China from January 2015 to February 2021 was selected for this study. All patients were treated with definitive radiotherapy. Eligible patients met the criteria: ≥18 years old; previously untreated, pathologically confirmed cervical carcinoma; stage IB-IVB using FIGO (2018) (only stage IVB with oligo-metastasis scheduled for radical chemoradiotherapy included); main treatment was EBRT with or without chemotherapy followed by image-guided brachytherapy. Key exclusion criteria were the following: small cell carcinoma of the cervix, acquired immune deficiency syndrome, concomitant secondary primary malignancies, radiotherapy in adjuvant or recurrent settings, or patients who did not complete planned radiotherapy. The last follow-up time was 30 April 2021.

2.2. Radiation Therapy

All patients received EBRT using RapidArc or three-dimensional conformal radiotherapy (3D-CRT) techniques. EBRT was delivered on a 6 MV linear accelerator. For RapidArc, GTV_P and GTV_N were defined as the primary gross tumor volume and the locoregional pathological lymph nodes, respectively, detected by physical examination, pelvic magnetic resonance imaging (MRI), or positron emission tomography (PET)/CT. PTV_4500 and PTV_5500 were the planning target volumes of the pelvis and the metastatic lymph nodes, which received prescribed doses of 45 Gy and 55 Gy, respectively: the prescription dose was 45 Gy in 25 fractions to PTV_4500 with a simultaneous integrated boost of 55 Gy to PTV_5500. For 3D-CRT, two sequential phases were used (45 Gy/25 fractions to the pelvis for phase I; a boost to the pelvic wall of 16 Gy/8 fractions for FIGO IIIB or 10 Gy/5 fractions for other stages for phase II). All EBRT was delivered daily, 5 fractions per week. CT- or MRI-guided brachytherapy was performed 3–4 weeks after initiation of EBRT with a high-dose-rate 192Ir (iridium) source, once a week for a total of 4 times. Cumulative doses of >84 Gy (EQD2) for stage IB-IIIA and >90 Gy (EQD2) for ≥ stage IIIB were set for the cervical tumor. DVH parameters during EBRT were obtained from the Varian Eclipse treatment planning system (version 15.0).
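The EQD2 values quoted above rest on the linear-quadratic model. As a minimal sketch, the EBRT prescriptions can be converted to EQD2 as follows; the function name `eqd2` and the tumor alpha/beta ratio of 10 Gy are assumptions made for illustration and are not stated in this paper. Brachytherapy fraction doses are not reported here, so only the EBRT component is shown.

```python
def eqd2(total_dose_gy, n_fractions, alpha_beta_gy=10.0):
    """Equivalent dose in 2 Gy fractions under the linear-quadratic model.

    alpha_beta_gy = 10 Gy is a commonly used tumor value and is an
    assumption here, not a parameter reported in the study.
    """
    dose_per_fraction = total_dose_gy / n_fractions
    return total_dose_gy * (dose_per_fraction + alpha_beta_gy) / (2.0 + alpha_beta_gy)


# EBRT prescriptions described in Section 2.2
pelvis_eqd2 = eqd2(45.0, 25)  # PTV_4500: 1.8 Gy per fraction
boost_eqd2 = eqd2(55.0, 25)   # PTV_5500: 2.2 Gy per fraction

print(f"PTV_4500 EQD2: {pelvis_eqd2:.2f} Gy")  # 44.25 Gy
print(f"PTV_5500 EQD2: {boost_eqd2:.2f} Gy")   # 55.92 Gy
```

Under these assumptions, the cumulative tumor targets of >84 Gy or >90 Gy would be reached by adding the EQD2 of each brachytherapy fraction to the EBRT contribution.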

2.3. Chemotherapy

Concurrent cisplatin at 40 mg/m2 was given weekly during EBRT. Carboplatin (area under the curve (AUC) = 2 mg/ml/min) weekly was used as an alternative if creatinine clearance ≤50 ml/min. In cases involving long radiotherapy waiting time, induction chemotherapy with paclitaxel plus carboplatin was given. Chemotherapy was not recommended to patients aged over 70 or FIGO stage IB1.

2.4. Follow Up

In the first 2 years of follow-up, all the patients had regular assessment every 3 months, then every 6 months in the third to fifth year, and yearly after the fifth year. OS was the time from the start of EBRT to the date of death from any cause or the last confirmed date of survival.

2.5. Univariate Analysis and Multivariable Analysis

Univariate Cox-pH analysis was conducted to generate hazard ratios (HRs) with confidence intervals (CIs) for each single risk factor’s contribution to OS. The factors extracted by the elastic-net models were used to build the final multivariable Cox-pH model. The concordance index (C-index) was then used to assess the performance of the final multivariable Cox-pH model. The final multivariable Cox-pH models for predicting OS were used to construct nomograms.

2.6. Elastic-Net Modeling

Elastic-net regression is a type of penalized regression [22, 23]. Elastic-net applies both an L1-norm penalty and an L2-norm penalty to the regression coefficients, with a mixing parameter (alpha) that sets the proportion of the penalty assigned to the L1 norm versus the L2 norm. Taken together, the elastic-net regression method allows retention of correlated covariates while regularizing the model predictors in a manner that improves prediction performance.
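For readers who want the penalty in symbols, one common way to write the elastic-net objective for a Cox model is sketched below. The notation is an assumption introduced here: \ell(\beta) is the Cox partial log-likelihood, n the number of patients, \lambda the overall penalty strength, and \alpha the mixing parameter. The 2/n scaling follows the convention we understand the glmnet package to use and may differ in other software.

```latex
\hat{\beta} = \arg\min_{\beta}
\left\{
  -\frac{2}{n}\,\ell(\beta)
  + \lambda \left[ \alpha \,\lVert \beta \rVert_{1}
  + \frac{1-\alpha}{2}\,\lVert \beta \rVert_{2}^{2} \right]
\right\}
```

With \alpha = 1 the penalty reduces to the lasso and with \alpha = 0 to ridge regression; intermediate values keep groups of correlated covariates while still shrinking some coefficients to exactly zero.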

Elastic-net models were constructed for the prediction of OS using 5-fold cross-validation within 100 bootstrapping iterations, to approximate the models’ generalization abilities in the absence of an external validation dataset [21, 24]. To determine the important features for OS, we selected the best alpha and lambda in the elastic-net model according to the C-index. The features with significant coefficients in the elastic-net models and high selection frequencies across bootstrapping iterations were selected as important factors.
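Because the C-index drives both tuning and model comparison, a minimal sketch of how it can be computed may be helpful. This is an illustrative pure-Python version of Harrell's C-index, not the code used in the study; the function name, the toy data, and the simplification of skipping pairs with tied follow-up times are all assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up
    had an event. The pair is concordant when that patient also has
    the higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times or earlier patient censored: skip
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored
times = [5, 10, 12, 20, 25]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.1, 0.2, 0.3]

print(round(concordance_index(times, events, risk), 3))  # 0.571
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk.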

2.7. Statistical Considerations

All continuous features were normalized using log10(x + 1). All statistical analyses were performed with R software (version 4.0.2, R Development Core Team, Vienna, Austria). The R package glmnet was used to implement elastic-net modeling. A P value less than 0.05 was considered statistically significant.
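The log10(x + 1) normalization can be illustrated with a short sketch. The helper name and the sample values are hypothetical; the study itself was carried out in R.

```python
import math


def log10_plus_one(x):
    """Apply the log10(x + 1) transform to a non-negative value."""
    if x < 0:
        raise ValueError("log10(x + 1) is applied to non-negative values only")
    return math.log10(x + 1.0)


# Hypothetical raw values: 0 maps to 0, and each tenfold increase
# in (x + 1) adds one unit on the transformed scale.
raw_values = [0.0, 9.0, 99.0]
transformed = [log10_plus_one(v) for v in raw_values]
print(transformed)  # [0.0, 1.0, 2.0]
```

If the Cox models were fitted on the transformed features, which the text implies but does not state outright, a hazard ratio would describe the effect of a one-unit change on this scale, that is, roughly a tenfold change in the raw value. This may help explain why some reported hazard ratios are very large or very small.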

3. Results

3.1. Patient Characteristics

A total of 128 patients were assessed as eligible for inclusion in this study. Table 1 lists detailed characteristics of the study population. The median OS was 25.78 (interquartile range, IQR: 14.26–41.57) months with 25 (19.53%) deaths. The median age was 53 (IQR: 46–63) years. Overall, 78.91% of patients were treated with RapidArc and the others were treated with 3D-CRT. In addition, 20.31% of patients had induction chemotherapy before EBRT and 84.38% had concurrent chemotherapy during radiotherapy. The results of the univariate Cox-pH analysis of clinical factors influencing OS are also summarized in Table 1. Patients with a higher BMI or higher baseline hemoglobin level had longer OS (HR: 1.06e-3, 95%CI: 1.75e-6–0.65, P value = 0.04; HR: 8.96e-4, 95%CI: 4.34e-6–0.18, P value <0.01, respectively), while patients with FIGO 2018 stage IV had poor survival (HR: 5.82, 95%CI: 1.44–23.48, P value = 0.01).

3.2. DVH Parameters

In this study, 20 DVH features were extracted, including dmax, dmean, and volume of tumor targets (GTV_P, GTV_N, PTV_4500, and PTV_5500), and dmax, dmean, V5, V45, and volume of OARs (body and bones) (Table 2). As summarized in Table 2, the median dmean of GTV_P was 47.1 (IQR: 46.6–49.21) Gy and the median dmax was 53.2 (IQR: 48.6–57.8) Gy. The median dmean of the whole body was 12.11 (IQR: 10.67–13.9) Gy and that of the bones was 29.22 (IQR: 27.84–31.88) Gy.
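To make the DVH metrics concrete, the sketch below derives dmean, dmax, and Vx values from a list of voxel doses. It assumes equal voxel volumes and reports Vx as a percentage of the structure volume; whether the study used relative or absolute volumes is not stated, and the dose values are hypothetical. In the study itself these parameters were read from the treatment planning system.

```python
def v_dose(voxel_doses_gy, threshold_gy):
    """Percentage of a structure receiving at least threshold_gy.

    Assumes every voxel has the same volume, so the volume fraction
    equals the voxel fraction.
    """
    if not voxel_doses_gy:
        raise ValueError("structure has no voxels")
    n_covered = sum(1 for d in voxel_doses_gy if d >= threshold_gy)
    return 100.0 * n_covered / len(voxel_doses_gy)


# Hypothetical voxel doses (Gy) for a body contour
body_doses = [0.5, 2.0, 4.9, 5.0, 12.0, 30.0, 46.0, 47.5]

d_mean = sum(body_doses) / len(body_doses)
d_max = max(body_doses)
v5 = v_dose(body_doses, 5.0)    # 5 of 8 voxels receive >= 5 Gy
v45 = v_dose(body_doses, 45.0)  # 2 of 8 voxels receive >= 45 Gy

print(f"dmean={d_mean:.2f} Gy, dmax={d_max:.1f} Gy, V5={v5:.1f}%, V45={v45:.1f}%")
```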

Univariate Cox-pH analysis results of DVH parameters for OS are also presented in Table 2. Patients with poor survival had significantly higher tumor-related metrics (GTV_P volume HR: 9.8, 95%CI: 2.67–35.91, P value <0.01; PTV_4500 volume HR: 181.2, 95%CI: 1.98–1.66e+04, P value = 0.02; GTV_N dmean low vs. none HR: 2.69, 95%CI: 1.06–6.8, P value = 0.04). Furthermore, patients with poor survival had a higher total body dose (dmean HR: 544.54, 95%CI: 7.9–3.75e+04, P value = 3.53e-03; V5 HR: 991, 95%CI: 8.97–1.1e+05, P value = 4.06e-03; V45 HR: 11.34, 95%CI: 1.13–114.19, P value = 0.04).

Pearson’s correlations between all DVH parameters are shown in Figure 1. Relatively strong positive correlations were found among the dosimetric parameters of the tumor (Pearson’s correlation > 0.5). In contrast, there was little correlation between clinical characteristics and DVH parameters, or between tumor dosimetry and OAR dosimetry.

3.3. Prediction Performances of Elastic-Net Models

To study the risk factors for survival, three kinds of elastic-net models were established: a model integrating clinical factors and DVH parameters, a model with only clinical factors, and a model with only DVH parameters. These three models had the best prediction performance when the alpha parameter was equal to 0.8, 0.7, and 0.5, respectively. The C-index was used to evaluate and compare the performance of the three models in the train set and the test set, as shown in Figure 2.

In the train set, models integrating clinical and DVH features had the best performance (C-index: 0.76, 1st–3rd quartile: 0.74–0.77). In the test sets as well, the models integrating clinical and DVH features (C-index: 0.74, 1st–3rd quartile: 0.68–0.8) performed much better than models based on clinical features only (C-index: 0.67, 1st–3rd quartile: 0.58–0.72) and slightly better than models with DVH parameters only (C-index: 0.72, 1st–3rd quartile: 0.62–0.78). These results indicated that DVH parameters contributed to survival prediction and, furthermore, that DVH parameters supplied information complementary to clinical factors.

3.4. Important Factors in Elastic-Net Models

The performances of all factors in the models integrating clinical and DVH parameters were summarized, including the mean and P value of their coefficients in the elastic-net models and their selection frequencies in 100 iterations, as shown in Figure 3 and Supplemental Table 1. Among clinical factors, the hemoglobin level at baseline was an important protective factor against death (mean coefficient: 0.47, 95%CI: 0.38–0.57, P value <0.01, frequency: 72%). Among DVH parameters, GTV_P volume and body V5 were the strongest risk factors for death (mean coefficient: 1.26, 95%CI: 1.21–1.32, P value <0.01, frequency: 92%; mean coefficient: 2.54, 95%CI: 2.1–3.09, P value <0.01, frequency: 90%, respectively).

3.5. The Final Multivariable Cox-pH Model

To allow for clinical use, the final multivariable Cox-pH model integrating the key clinical characteristic (hemoglobin at baseline) and DVH parameters (GTV_P volume and body V5) was constructed as shown in Figure 4(a), with a C-index of 0.78 (95%CI: 0.73–0.81). The corresponding nomogram for survival prediction was developed for clinical use, as shown in Figure 4(b).

4. Discussion

The present study analyzed the effects of dosimetric parameters and clinical characteristics on OS using machine learning algorithms. The results showed that elastic-net models integrating clinical and DVH factors had the best prediction performance (C-index 0.76 in the train set and 0.74 in the test set). Three important factors were selected: baseline hemoglobin level, primary gross tumor volume (GTV_P volume), and body V5. The final multivariable Cox-pH model constructed using these important factors had better prediction performance (C-index: 0.78, 95%CI: 0.73–0.81) than previous studies [25–27]. This indicated that the addition of DVH parameters to clinical factors improved the model's ability to predict OS. At the same time, the final multivariable Cox-pH model and the nomogram rely on only three indicators that are readily available in practice, which makes them feasible for clinical application.

Among clinical factors, our study found that the hemoglobin level at baseline was an important protective factor against death, which is widely acknowledged. Many other studies have reached similar conclusions. Pretreatment hemoglobin was found to be a potential biomarker for survival prognosis not only in early cervical cancer [28] but also in locally advanced cervical carcinoma [12]. The first international expert consensus guideline endorsed a minimum hemoglobin transfusion target of 90 g/L to balance tumor radiosensitivity with appropriate use of a scarce resource for patients with cervical cancer undergoing EBRT and brachytherapy [29]. A hemoglobin level above 90 g/L at presentation was positively associated with the 5-year OS rate [30]. A new score identified that hemoglobin <120 g/L at the time of diagnosis impacted disease-free survival (DFS) and OS [11].

Among DVH parameters, GTV_P volume and body V5 were the strongest risk factors for death. The finding that a larger GTV_P volume is associated with worse survival is consistent with the conclusions of other studies. The 5-year survival rate of cervical cancer patients with tumor volume <40 cm3 was significantly better than that of patients with tumor volume >40 cm3 [31]. The total metabolic tumor volume was an independent prognostic factor for recurrence-free survival in patients undergoing radical radiotherapy and chemotherapy for cervical cancer [32]. Research on other tumors also supports this conclusion. GTV_P volume ≥5 cm3 was associated with significantly worse OS in patients with sinonasal mucosal melanoma [33]. Another finding suggested that a pathological tumor volume of ≥18 cm3 was significantly correlated with shorter OS in oral squamous cell carcinoma [34]. Similar conclusions were also reached in rectal cancer [35], nasopharyngeal carcinoma [36], supraglottic carcinoma [37], and glioblastoma [38].

Body V5, in particular, emerged as an important DVH risk parameter for survival, one that has received little attention in radiation therapy so far. There are two types of radiation health effects: acute and late-onset disorders. Clinical symptoms of the acute disorder begin with a decrease in lymphocytes, followed, with increasing radiation dose, by symptoms such as alopecia, skin erythema, hematopoietic damage, gastrointestinal damage, and central nervous system damage [39]. Body radiation can potentially result in both acute and long-lasting adverse effects, particularly on hematopoietic and immune cells [40]. Studies have shown that radiation-induced lymphocytopenia is associated with poor prognosis in solid tumors [41], such as cervical cancer [42] and non-small cell lung cancer [43]. Regarding the late-onset disorder, the predominant health effects are cancer [44–46], non-cancer disease [47, 48], and genetic effects [49–51]. In addition, it should be noted that with the development of modern radiotherapy techniques, such as intensity-modulated radiotherapy (IMRT), patients receive a larger volume of low-dose radiation. Body dose-volume distributions may influence the risk of second primary cancer [52]. Moreover, radiation-induced normal tissue damage and repair also show a dose-volume effect [53].

There are some limitations in this study. First, this is a retrospective study, and a prospective study is needed to collect more complete data. Second, since the international cervical cancer staging system does not include prognostic biomarkers, and current treatment recommendations are mainly based on staging, we did not include nonanatomical prognostic biomarkers, such as human papillomavirus (HPV) infection data and SCC-Ag values. Third, the median follow-up for our analysis was 26.4 months, and longer follow-up is needed to fully assess long-term survival benefits. Lastly, although 5-fold cross-validation within 100 bootstrapping iterations was used to estimate the models’ generalization abilities, external validation is needed in future studies. Nonetheless, the findings of our study provide valuable data to guide clinical practice and future research.

In conclusion, this is the first attempt to establish elastic-net models to evaluate the roles of DVH parameters in predicting OS in patients with cervical cancer. In addition to clinical factors, DVH parameters such as GTV_P volume and body V5 appear to be important predictors of survival outcome. These results can facilitate individualized tailoring of treatment and patient counseling in the holistic management of cervical cancer.

Data Availability

The datasets analyzed during the current study are not publicly available because the data are strictly confidential, but are available from the corresponding author upon a reasonable request.

Ethical Approval

The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All patients signed, at hospital admission, consent for the use of their data for retrospective and scientific investigation. The study was performed in accordance with the Declaration of Helsinki and was approved by the local ethics committee.

The authors affirm that human research participants provided informed consent for publication.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Conception and design were performed by Zhiyuan Xu, Li Yang, Longhua Chen, and Hao Yu. Administrative support was provided by Zhiyuan Xu and Li Yang. Provision of study materials or patients was performed by Zhiyuan Xu, Li Yang, and Qin Liu. Collection and assembly of data were performed by Longhua Chen, Hao Yu, Zhiyuan Xu, Li Yang, and Qin Liu. Data analysis and interpretation were performed by Hao Yu, Longhua Chen, Zhiyuan Xu, and Li Yang. Manuscript writing was performed by all the authors. Final approval of the manuscript was given by all the authors. Zhiyuan Xu and Li Yang contributed equally.


Acknowledgments

The authors thank the nurses, therapists, physicists, and physicians who participated in this study. This study was supported by the Health Commission of Guangdong Province, China (No. B2020100); Shenzhen Science and Technology Program (Nos. KQTD20180411185028798 and JCYJ20210324114600002); High Level-Hospital Program, Health Commission of Guangdong Province, China (Nos. HKUSZH201902031, HKUSZH201901017, and HKUSZH201901038); Shenzhen Fundamental Research Program (No. JCYJ2020109150427184); and Shenzhen Key Medical Discipline Construction Fund (No. SZXK014).

Supplementary Materials

Supplemental Table 1: The factors in the elastic-net models integrating clinical and dosimetric factors, constructed with 5-fold cross-validation and 100 bootstrapping iterations. (Supplementary Materials)