Abstract

Background. To establish two nomograms to quantify the diagnostic factors of lung metastasis (LM) and their role in assessing prognosis in young osteosarcoma patients with LM. Methods. A total of 618 young osteosarcoma patients diagnosed from 2010 to 2015 were included from the Surveillance, Epidemiology, and End Results (SEER) database. Another 131 patients with osteosarcoma from local hospitals were collected as an external validation set. The SEER patients were randomized into a training set (n = 434) and an internal validation set (n = 184) at a ratio of 7:3. Univariate and multivariate logistic regression analyses were used to identify the risk factors for LM, which were then used to construct a nomogram. Risk variables for the overall survival of patients with LM were evaluated by Cox regression, and a second nomogram was constructed to predict survival rates. The results were validated using bootstrap resampling and a retrospective cohort of 131 young osteosarcoma patients treated from 2010 to 2019 at three local hospitals. Results. A total of 114 patients (18.45%) had LM at initial diagnosis. The multivariate logistic regression analysis suggested that T stage, N stage, and bone metastasis were independent risk factors for LM in newly diagnosed young osteosarcoma patients. The ROC analysis revealed that the area under the curve (AUC) values were 0.751, 0.821, and 0.735 in the training set, internal validation set, and external validation set, respectively, indicating good predictive discrimination. The multivariate Cox proportional hazard regression analysis suggested that age, surgery, chemotherapy, primary site, and bone metastasis were prognostic factors for young osteosarcoma patients with LM. The time-dependent ROC curves showed that the AUCs for predicting 1-year, 2-year, and 3-year survival rates were 0.817, 0.792, and 0.815 in the training set and 0.772, 0.807, and 0.804 in the internal validation set, respectively. In the external validation set, the AUCs for predicting 1-year, 2-year, and 3-year survival rates were 0.787, 0.818, and 0.717. Conclusions. The nomograms can help clinicians make more personalized decisions and may help improve the prognosis of osteosarcoma patients.

1. Background

Osteosarcoma is the most frequent primary malignant bone tumor [1]. It originates from primitive mesenchymal cells and occurs mostly in bone and rarely in soft tissue [2]. The incidence of osteosarcoma is around one to three cases annually per million individuals worldwide, and the highest incidence is in children and young adults [3]. The lung is the most common site of metastasis in patients with osteosarcoma, and lung metastasis is the main cause of death in these patients [4, 5]. Studies have shown that the prognosis of patients with lung metastasis is very poor, with a 5-year survival rate of about 25% [6]. According to previous studies, about 15–20% of patients have visible evidence of metastasis at the time of diagnosis, almost 90% of which occurs in the lung, while the remaining 80–90% of patients may have micrometastases that are still subclinical or undetectable [5, 7, 8]. Furthermore, 30–40% of patients with localized tumors have a local or distant recurrence in the first 2–3 years, and approximately 90% of relapses are LM [9, 10]. Therefore, predicting the occurrence of LM in osteosarcoma and providing more personalized treatment advice for patients with LM have important clinical significance for improving prognosis.

Although systematic treatment was developed after 1970, polychemotherapy and surgery still remain insufficient. In the last 50 years, the survival rates of osteosarcoma patients have not improved significantly, regardless of whether metastasis occurred [11]. The Musculoskeletal Tumor Society and the American Joint Committee on Cancer systems are the most widely used staging systems. However, these systems are still limited in predicting the occurrence of LM and in guiding the prognosis of patients with LM. The nomogram has been widely used in predicting the prognosis of cancer patients as a tool that can combine different kinds of variables [12]. By combining important variables, the nomogram can individually estimate the probability of events such as overall survival more accurately than the traditional staging systems [12–14]. In addition, the development of data science makes it possible to use big data from medical databases for statistical analysis. Data mining technology has become the frontier of medical research due to its good performance in assessing patient risk and helping to build disease prediction models for clinical decision-making [15, 16]. Considering that no studies have focused on establishing predictive models for the diagnosis and prognosis of LM in osteosarcoma, especially for children and young patients, the aim of the present study was to develop two nomograms to predict the probability of LM and the survival rates of young osteosarcoma patients, respectively.

2. Methods

2.1. Patient Selection

The data contained in this retrospective study were downloaded from the SEER database (version 8.3.6). Patients with a diagnosis of osteosarcoma between 2010 and 2015 were included in this study. Patients were randomized into training sets and internal validation sets with a ratio of 7:3. In addition, we retrospectively collected data for young osteosarcoma patients from three local hospitals as an external validation set. The training sets were used to establish the nomograms, while the validation sets were used to validate the established nomograms.

The inclusion criteria were as follows: (1) osteosarcoma was the primary tumor; (2) the age at diagnosis was 24 years or less; (3) the diagnosis was histologically confirmed; (4) complete clinicopathological features, demographic information, and follow-up information were available. Exclusion criteria were as follows: (1) patients whose diagnosis was made only at death; (2) patients for whom the aforementioned information was missing; (3) patients with a survival time <1 month.

Analysis of anonymous data from the SEER database is exempt from medical ethics review and does not need informed consent. The content of this retrospective research did not involve human subjects or personal privacy. Hence, informed consent from patients was not required in this study.

2.2. Study Variables

We selected 8 variables based on the patient-specific information from the SEER database for the study of risk factors of LM in osteosarcoma, including age, sex, race, tumor size, primary site, T stage, N stage, and bone metastasis. Patients’ ages were divided into three groups: 0–8 years, 9–16 years, and 17–24 years. Tumor size was classified into three grades: 0 to 100 mm, 101 to 200 mm, and greater than 200 mm. Subsequently, information on grade, histological type, surgery, radiotherapy, and chemotherapy was included in the analysis of prognostic factors of young osteosarcoma patients with LM. In the present study, overall survival was defined as the time from diagnosis to death from any cause.
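The age and tumor size groupings described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names `age_group` and `tumor_size_group` are hypothetical, and the treatment of non-integer values at the boundaries is an assumption, since the text only gives integer ranges.

```python
def age_group(age_years):
    """Map age at diagnosis (years) to the three groups used in the text."""
    if not 0 <= age_years <= 24:
        # The inclusion criteria restrict the cohort to patients aged 24 or less.
        raise ValueError("age must be between 0 and 24 years")
    if age_years <= 8:
        return "0-8"
    if age_years <= 16:
        return "9-16"
    return "17-24"


def tumor_size_group(size_mm):
    """Map tumor size (mm) to the three grades used in the text."""
    if size_mm < 0:
        raise ValueError("tumor size cannot be negative")
    if size_mm <= 100:
        return "0-100"
    if size_mm <= 200:
        return "101-200"
    return ">200"


print(age_group(12), tumor_size_group(150))
```

Encoding the cut points in one place keeps the training and validation sets on identical category definitions.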

2.3. Construction and Validation of Nomograms

First, all patients included in this study formed the first cohort, which was used to study the risk factors of LM in osteosarcoma. Those diagnosed with LM then formed a second cohort for survival analysis. Patients in each cohort were randomized into training and internal validation sets with a ratio of 7:3. Data from local hospitals were used as external validation sets. The training sets were used to establish the nomograms, while the validation sets were used to validate the established nomograms.
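A 7:3 random split of this kind can be sketched as follows. The text does not state the splitting procedure or random seed, so the function name `split_cohort`, the seed, and the rounding rule below are all assumptions; the resulting set sizes can therefore differ slightly from the reported 434 and 184.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient identifiers and split them into training and validation lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers 0..617 stand in for the 618 SEER patients.
train_ids, valid_ids = split_cohort(range(618))
print(len(train_ids), len(valid_ids))
```

Fixing the seed makes the partition repeatable, which matters when the same split must be reused for model fitting and validation.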

In the study of risk factors for LM, univariate and multivariate logistic regression analyses were applied and the risk factors were chosen to construct a nomogram. The univariate and multivariate Cox proportional hazard regression analyses were performed to identify the independent prognostic factors in the survival analysis, and a prognostic nomogram was constructed based on the prognostic factors. Receiver operating characteristic (ROC) curves or time-dependent ROC curves for each nomogram were established, and the corresponding area under the curve (AUC) was used to evaluate the discrimination of nomograms. Furthermore, the calibration curves and decision curve analysis (DCA) for each nomogram were also established to estimate the clinical application value. Finally, to further verify the value of the prognostic nomogram, we divided the patients into two risk levels according to the cut-off values of total nomogram points. The Kaplan–Meier (K-M) survival curves with a log-rank test were generated, and we established the scatter diagram to make it more visual.
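To make the discrimination measure concrete, the AUC of a binary predictor can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses this rank-based definition on toy data; the function name `auc` and the example scores are illustrative assumptions, not values from the study.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predicted risks and observed outcomes (1 = LM present), for illustration only.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values should be read.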

2.4. Statistical Analysis

This study used SPSS 25.0, X-tile (version 3.6.1), and R software (version 4.0.1) for all statistical analyses. Age and tumor size were converted into categorical variables and expressed as frequencies (proportions). The chi-square test and the rank-sum test were used for categorical data. R packages including “rms” and “regplot” were employed to draw graphics. All P values were two-sided, and P values <0.05 were considered statistically significant.

3. Results

3.1. The Characteristics of the Population

In the present study, a total of 618 young osteosarcoma patients from the SEER database were included. Among them, 114 (18.45%) patients were diagnosed as LM at initial diagnosis. Furthermore, 434 (70.00%) and 184 (30.00%) patients were randomly divided into the training set and the internal validation set in the first cohort. In addition, data of 131 patients from our local hospitals were collected as an external validation set. There were no significant differences between the training and two validation sets, except for race and LM (Table 1).

3.2. Risk Factors of LM in Osteosarcoma Patients

To identify the risk factors of LM in young osteosarcoma patients, a univariate logistic analysis was first performed. The results indicated that four variables were related to LM in young osteosarcoma patients: T stage (T2: OR = 4.166, 95% CI: 2.283–7.603; T3: OR = 11.067, 95% CI: 2.876–42.583; TX: OR = 38.733, 95% CI: 7.380–203.290), N stage (N1: OR = 26.106, 95% CI: 5.663–120.346; NX: OR = 10.680, 95% CI: 3.200–35.645), tumor size (101–200 mm: OR = 2.005, 95% CI: 1.183–3.401; >200 mm: OR = 6.406, 95% CI: 3.221–12.742), and bone metastasis (OR = 18.889, 95% CI: 5.256–67.887). These variables were then included in the multivariate logistic regression analysis. The results showed that N stage (N1: OR = 15.855, 95% CI: 2.932–85.742; NX: OR = 9.123, 95% CI: 2.440–34.107), T stage (T2: OR = 3.871, 95% CI: 2.052–7.300; T3: OR = 4.739, 95% CI: 0.876–25.653; TX: OR = 18.915, 95% CI: 3.023–118.342), and bone metastasis (OR = 13.966, 95% CI: 3.423–56.976) were independent risk factors for LM in newly diagnosed young osteosarcoma patients (Table 2).
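The odds ratios and confidence intervals above follow the usual logistic regression relationship OR = exp(β) with 95% CI = exp(β ± 1.96 × SE). The sketch below shows that relationship and, as a consistency check, recovers the implied coefficient and standard error from the univariate T2 interval (2.283–7.603) reported in the text. The function names are hypothetical, and the Wald-type interval is an assumption, since the text does not state how the intervals were computed.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_beta_se(ci_low, ci_high, z=1.96):
    """Back-calculate the coefficient and standard error from a reported Wald CI."""
    beta = (math.log(ci_low) + math.log(ci_high)) / 2
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Univariate T2 interval reported in the text: 95% CI 2.283-7.603 (OR = 4.166).
beta, se = implied_beta_se(2.283, 7.603)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

The recovered odds ratio agrees with the reported 4.166 to within rounding, which suggests the interval is symmetric on the log scale, as a Wald interval would be.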

3.3. Development and Validation of the Nomogram for Prediction of LM

A nomogram was established based on the results of the multivariate logistic regression (Figure 1). The ROC curves of each set were constructed, and the corresponding AUC values were 0.751, 0.821, and 0.735 in the training set, internal validation set, and external validation set, respectively (Figures 2(a)–2(c)). Furthermore, ROC curves were constructed for each independent factor. The results suggested that the nomogram predicted LM more accurately than any single variable. As shown in Figures 2(d)–2(f), the AUC of the nomogram was higher than that of each independent risk factor in both the training set and the validation sets. Because T and N stage were risk factors in the diagnostic model, its DCA curve was compared directly with them; T and N stage were not risk factors in the prognostic model, so no such comparison was made there. We therefore additionally compared our models with a model based on TN staging (Supplementary Figures 1 and 2). In the internal validation set, the one-year DCA curve could not be shown because the sample was too small and the net benefit of TN staging was too low. The remaining panels show that our model predicts better. The calibration curves of each set showed a robust calibration of the nomogram (Figures 3(a)–3(c)), and the DCA curves of each set indicated that the nomogram had higher net benefits than any single independent risk factor (Figures 3(d)–3(f)).
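The net benefit plotted in a decision curve can be illustrated with its standard definition: at threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), where patients with predicted risk at or above p_t are treated as positive. The sketch below applies this to toy data; the function names and the toy values are assumptions for illustration and do not come from the study.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    pairs = list(zip(probabilities, labels))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient as positive."""
    return net_benefit([1.0] * len(labels), labels, threshold)


# Toy predicted risks and outcomes, for illustration only.
toy_probabilities = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
model_nb = net_benefit(toy_probabilities, toy_labels, 0.25)
treat_all_nb = net_benefit_treat_all(toy_labels, 0.25)
print(round(model_nb, 3), round(treat_all_nb, 3))
```

A decision curve repeats this calculation over a range of thresholds; a model is considered useful where its curve lies above both the treat-all and treat-none (net benefit of zero) strategies.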

3.4. Survival Analysis for Patients with LM

A total of 114 young osteosarcoma patients with LM from the SEER database and 32 patients from local hospitals were included in the survival analysis. The baseline characteristics of these patients are summarized in Table 3. The univariate Cox proportional hazard regression analysis showed that age, race, primary site, N stage, surgery, chemotherapy, and bone metastasis were associated with overall survival. The multivariate Cox proportional hazard regression analysis suggested that age (9–16: HR = 0.264, 95% CI: 0.081–0.861; 17–24: HR = 0.621, 95% CI: 0.196–1.971), primary site (other: HR = 6.866, 95% CI: 1.538–30.655; spine/pelvis: HR = 2.126, 95% CI: 0.664–6.810; upper limb: HR = 2.138, 95% CI: 0.998–4.580), surgery (HR = 0.400, 95% CI: 0.188–0.855), chemotherapy (HR = 0.123, 95% CI: 0.037–0.411), and bone metastasis (HR = 3.981, 95% CI: 1.695–9.350) were prognostic factors for young osteosarcoma patients with LM (Table 4).

3.4.1. Development and Validation of the Prognostic Nomogram for Assessing Survival

Based on the Cox regression analysis, a prognostic nomogram was constructed (Figure 4(a)). The time-dependent ROC curves showed that the AUCs for predicting 1-year, 2-year, and 3-year survival rates were 0.817, 0.792, and 0.815 in the training set; 0.772, 0.807, and 0.804 in the internal validation set; and 0.787, 0.818, and 0.717 in the external validation set, respectively (Figures 4(b)–4(d)). Time-dependent ROC curves comparing the prognostic nomogram with individual factors were also constructed in the training set and the two validation sets. The AUC of the nomogram was higher than that of age, primary site, surgery, chemotherapy, and bone metastasis at 1, 2, and 3 years (Figures 5(a)–5(i)). In addition, the calibration curves indicated good consistency between the nomogram-predicted and the actual overall survival rates at 1, 2, and 3 years in each set (Figures 6(a)–6(i)). DCA was used to evaluate clinical utility, and in each set the nomogram showed a higher net benefit than any single independent prognostic factor (Figures 7(a)–7(i)). Furthermore, patients were divided into two risk groups according to a cut-off point for the total nomogram score, which was determined by X-tile software. Patients with a score of less than 196 were assigned to the low-risk group, and those with a score of more than 196 to the high-risk group. The K-M survival curves for each set were generated and suggested that patients in the high-risk group had a worse prognosis than those in the low-risk group (Figures 8(a)–8(c)). Finally, three scatter diagrams were generated to show the difference between the risk groups. As the risk score increased, the survival rate of patients declined and the survival time decreased (Figures 8(d)–8(f)).
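The risk stratification and K-M estimation described above can be sketched as follows. This is a minimal product-limit estimator on toy data, not the authors' analysis: the function names are hypothetical, the follow-up times are invented for illustration, and assigning a score of exactly 196 to the high-risk group is an assumption, because the text only specifies scores below and above the cut-off.

```python
def risk_group(total_points, cutoff=196):
    """Assign a risk group from the total nomogram score (cut-off from the text)."""
    # Assumption: a score exactly equal to the cut-off is treated as high risk.
    return "low" if total_points < cutoff else "high"


def kaplan_meier(times, events):
    """Product-limit survival estimate; events are 1 for death, 0 for censoring."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((current_time, survival))
        n_at_risk -= leaving
    return curve


# Toy follow-up times in months and event indicators, for illustration only.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for month, probability in kaplan_meier(toy_times, toy_events):
    print(month, round(probability, 3))
```

In practice one curve is estimated per risk group and the groups are compared with a log-rank test, as the text describes.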

4. Discussion

Despite being the most common primary malignant bone tumor in children and young adults, osteosarcoma can still be considered a very rare disease. Approximately 400 new cases are diagnosed annually in children and young adults in the USA [17]. Cohort studies of patients with osteosarcoma have been difficult due to the scarcity of patients and the high heterogeneity at the genetic level [18]. Considering that older patients with osteosarcoma may differ substantially from younger patients, this study focused only on children and young adults. In the present study, we reviewed and analyzed data from the SEER database, and the results showed that 114/618 (18.45%) of young patients had LM at the time of diagnosis. This percentage is slightly higher than the values reported in previous studies, which may reflect the development and wider use of more sensitive diagnostic tests [19].

In the study of the diagnostic factors of LM, T stage, bone metastasis, and N stage were identified as the most meaningful factors. T stage as an independent risk factor for predicting distant metastasis has been reported in previous studies [13, 14]. Bone metastasis and N stage were also independent risk factors. Tumor metastasis requires the support of the corresponding microenvironment, and these factors may indicate that the microenvironment is disordered and suitable for osteosarcoma metastasis [20–22]. Wang et al. [23] reported that the monocyte ratio and neutrophil/lymphocyte ratio could predict metastasis in osteosarcoma patients. When the diagnostic model was compared with single factors by ROC curves and DCA curves, the nomogram had better predictive performance. There have been some previous studies on the modeling of LM in osteosarcoma [13, 14, 24]. However, none of these models have been validated with local data, which may make it difficult to apply their conclusions to other patients. In addition, the patient cohorts in these studies included older patients, which may weaken their conclusions for young patients. Osteosarcoma in patients over 60 is often associated with Paget’s disease and probably represents a distinct biological process [6]. To our knowledge, this is the first multivariate model to predict the risk of LM in young osteosarcoma patients, and it showed good predictive performance across several validation methods.

Osteosarcoma patients with LM or relapse have an unquestionably worse prognosis, with a 1-year survival rate of only 60%, compared with nearly 90% for localized tumors. The 2-year and 3-year survival rates of osteosarcoma with LM were less than 50%, and the gap between the survival rates of LM and localized osteosarcoma widened further [17]. Today, the treatment of osteosarcoma is no longer limited to a single method, and multidisciplinary treatment has contributed significantly to improving patients’ survival time and quality of life [5, 6]. Since 1970, almost all patients with osteosarcoma have been recommended to receive neoadjuvant chemotherapy or chemotherapy. At present, doxorubicin, cisplatin, high-dose methotrexate, and ifosfamide are considered the most active agents against osteosarcoma, but the ideal combination remains to be defined [25–28]. It is worth noting that the side effects of chemotherapy drugs may also have an adverse effect on the prognosis of patients. Serotonin antagonists alone or in combination with dexamethasone have been widely used to reduce chemotherapy-induced emesis [29, 30]. Lewis et al. reported that the use of G-CSF can help increase the dose of treatment and improve histological response [31]. The surgical removal of all evident disease has been shown to be effective in improving the prognosis of patients with osteosarcoma. A cohort study involving 202 patients with metastasis reported that surgery could significantly improve prognosis. In addition, patients with unresectable macroscopic tumors had a five-fold higher risk of dying than patients who underwent a complete surgical resection of all tumor deposits [32]. More advanced biomedical engineering combined with preoperative chemotherapy has gradually shifted practice from traditional amputation to limb salvage surgery. 
However, surgery of the axial skeleton remains particularly challenging because of the higher risk of recurrence and the common complications associated with reconstruction [33, 34]. Potential microscopic metastasis must be explored carefully. Due to the limited imaging resolution, CT scans may underestimate the number of LM. Pastorino et al. reported that bilateral exploration by open thoracotomy found 16% more microscopic nodules than CT, and most of these nodules were confirmed as metastatic tumors on postoperative pathological examination [35]. As a minimally invasive technique popularized in recent years, video-assisted thoracoscopy can significantly shorten the length of hospital stay and reduce the perioperative risk [36]. However, thoracoscopy can miss targets because of the limited intraoperative field of view, and deep nodules are difficult to remove with the available equipment. Therefore, the choice of surgery requires a detailed preoperative assessment. With the discovery of more biomarkers related to osteosarcoma, immunotherapy and targeted therapy provide a new direction for the treatment of osteosarcoma, but the data on these new methods have not been encouraging [4, 37]. Osteosarcoma was long considered a radioresistant tumor. However, a study suggested that radiotherapy may be useful for patients who cannot undergo complete resection or who have microscopic residual tumor nodules [33]. In recent years, several new radiotherapy techniques have made it possible to deliver a high radiation dose to the lesion while protecting other important organs [38, 39]. Unfortunately, radiotherapy was not an independent prognostic factor in our study of patients with LM. Age and more complex metastatic states, such as concurrent bone metastasis, have also been suggested to be associated with prognosis in patients with osteosarcoma [8, 32, 40, 41]. 
The age group of 0–8 years had the highest risk of death, which may be due to poorer immune function and lower tolerance of treatments such as surgery compared with other age groups. The risk was lower in the 9–16-year group than in the 17–24-year group, which may be due to the better physical repair ability of this group. In addition, since the incidence was highest in the 9–16-year group, this may lead to more timely examination and more mature treatment. Compared with similar previous studies, the proposed model had better predictive performance because it was more targeted [13, 42]. In addition, we used local patient data for external validation, which suggests that our model has better external adaptability. As with the diagnostic model, we further compared the prognostic model with single risk factors by ROC curves and DCA curves. The AUCs of the nomogram were higher than those of all the corresponding single factors. In the comparison of DCA curves, the model had a higher net benefit than any single factor in each cohort.

There are still some limitations in the present study. First, selection bias is inevitable in retrospective studies. Second, some relevant biomarkers and more detailed treatment information are missing from the SEER database, such as TP53 break-apart translocation status and whether patients received immunotherapy. The current database does not include surgical information on metastatic lesions, so the model needs to be further validated in a cohort with more complete information. Third, the SEER database lacks records of patients who developed LM after diagnosis, causing the incidence of LM in young osteosarcoma patients to be underestimated. Despite these limitations, we believe that this study can help clinicians provide more accurate treatment advice for osteosarcoma patients with LM.

5. Conclusion

The present study showed that T stage, N stage, and bone metastasis were independent diagnostic factors of LM for children and young adult osteosarcoma patients. In addition, chemotherapy, primary site, surgery, age, and bone metastasis were associated with prognosis. Two nomograms were established, and considerable predictive performance was obtained.

Abbreviations

AUC: Area under the curve
LM: Lung metastasis
SEER: Surveillance, Epidemiology, and End Results.

Data Availability

The datasets generated and/or analyzed during the current study are available in the SEER database (https://seer.cancer.gov/).

Ethical Approval

The authors received permission to access the research data file in the SEER program from the National Cancer Institute, US (reference number 15260-Nov2018). Approval was waived by the local ethics committee, as SEER data are publicly available and deidentified.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

ZL conceptualized the study, conducted the initial analyses, drafted the manuscript, and revised the manuscript. GL conducted the initial analyses and revised the manuscript. HL collected the data and performed statistical analysis. JZ collected the data and performed statistical analysis. DW supervised the direction of the manuscript, conducted design of the work and interpretation of the data, and revised the manuscript. All authors have read and approved the final manuscript.

Acknowledgments

This research was supported by the Jilin Provincial Science and Technology Development Program (No. 20200404036YY).

Supplementary Materials

Supplementary Figure 1: comparison of decision curve analysis between the diagnostic nomogram and the AJCC stage in the training set (a), internal validation set (b), and external validation set (c). Supplementary Figure 2: comparison of decision curve analysis between the prognostic nomogram and the AJCC stage. 12-month, 24-month, and 36-month survival rates in the training set (a-c), internal validation set (d-f), and external validation set (g-i).