Abstract

The time of the triggering event in neoplasms can be estimated using incubation period modeling techniques. We applied these techniques to primary osteosarcoma and Ewing’s sarcoma of bone using all cases recorded in the Surveillance Epidemiology and End Results database from 1993 through 2010. Secondary neoplasms were excluded. The age at diagnosis, gender, ethnicity, and anatomic location were collected. The time of the insult/triggering event was calculated using the best fit frequency distribution of age at diagnosis. There were 4,356 patients with osteosarcoma and 1,832 patients with Ewing’s sarcoma. The Pearson IV distribution was the best fit for both osteosarcoma (r² = 0.987) and Ewing’s sarcoma (r² = 0.99). For these distributions the estimated time of the triggering event is −0.7 years of age (4 weeks after conception) for Ewing’s sarcoma, 0.45 years for long bone osteosarcoma, and 10.4 years for parosteal osteosarcoma. This confirms the genetic etiology of Ewing’s sarcoma, since the estimated time is 4 weeks after conception. Long bone osteosarcoma is not entirely genetic, as the estimated time was 0.45 years for conventional osteosarcoma and 10.4 years for parosteal osteosarcoma. The etiologies of these two types of osteosarcoma thus differ.

1. Introduction

Osteosarcoma and Ewing’s sarcoma are the most common primary osseous malignancies, especially in children and young adults [1–4]. The etiology of both is becoming better defined. Nearly all cases of Ewing’s sarcoma (>90%) are associated with a t(11;22)(q24;q12) translocation that encodes the EWS/FLI oncoprotein [4–11]. The genetic etiology of osteosarcoma is less apparent, although it has been associated with various genetic abnormalities [12–18]. Those postulated include loss of chromosome 13 (whose long arm, 13q, contains the RB1 tumor-suppressor gene) [12]; amplification of the MDM2 gene and aberrant expression or loss of the tumor suppressor genes Rb and p53 [4, 15]; and other potential associations [16–18], with certain populations being at risk [18]. Definitive oncogenes or tumor suppressor genes are still unknown [16].

The time of the triggering event that starts the development of these neoplasms is unknown. This time can be estimated using incubation period modeling techniques. Such modeling has been applied to malignancies [19–22] but not to those involving long bones. The purpose of this study was to apply incubation period modeling to osteosarcoma and Ewing’s sarcoma of bone and determine the age of “exposure” to the triggering event.

2. Methods

2.1. Data

The Surveillance Epidemiology and End Results (SEER) Program database of the National Cancer Institute was queried for all patients with osteosarcoma or Ewing’s sarcoma diagnosed from 1993 through 2010 [38]. SEER collects and publishes cancer incidence data from population-based cancer registries covering approximately 28% of the US population [39]. We collected the age at diagnosis, gender, ethnicity, tumor type (e.g., conventional osteosarcoma, parosteal osteosarcoma, Ewing’s sarcoma of bone or soft tissue), state of origin, anatomic location of the sarcoma (e.g., long bone, jaw, vertebra), and whether the malignancy was primary or secondary. Ethnicity was classified according to Eveleth and Tanner [40] as White, Black, Amerindian (Hispanic and Native American), Indo-Malay (Asian origins), Indo-Mediterranean, and Native Australian/Polynesian. The US state of origin was classified into four regions (Northeast, South, Midwest, and West) using the criteria of the 2000 Healthcare Cost and Utilization Project Kids’ Inpatient Database (KID) [41]. Patients with a secondary tumor (e.g., pagetic, previous malignancy) were excluded. This study was determined to be exempt by the local Institutional Review Board. Further details regarding the acquisition of the SEER data can be found at http://www.seer.cancer.gov/.

2.2. Statistical Analysis

Discrete data are reported as frequencies and percentages and continuous data as the mean ± standard deviation. Because the age data were asymmetrically distributed, comparisons of continuous data between groups were performed using nonparametric statistics (Mann-Whitney test for 2 groups; Kruskal-Wallis test for ≥3 groups). Differences between groups of discrete data were analyzed by the Pearson χ² test. Statistical analyses for standard demographic data were performed with Systat 10 software (2000, SPSS Inc., Chicago, IL).

Incubation period modeling was used to determine the time of the insult/triggering event [19–21, 42–48]. The ideal best fit distribution should have a high r². In addition to the r² values, plots of the model and the data were examined subjectively for any peculiarity of the fit that would visually make a model less suitable. Only asymmetric distributions, including the log normal distribution, were evaluated. Statistical analyses for incubation period modeling were performed using TableCurve 2D software v5.0 (2002, Systat Software, Richmond, CA). The details and theory behind incubation period modeling are given in the appendix.
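The goodness of fit values quoted in the Results (for example 0.987 and 0.97) are assumed here to be the usual coefficient of determination; the text does not spell out the formula used by the fitting software. Under that assumption, a minimal sketch of the calculation is given below. The function name `r_squared` and its arguments are hypothetical.

```python
def r_squared(observed, fitted):
    """Coefficient of determination between observed case counts and the
    counts predicted by a fitted frequency distribution.

    Assumption: r^2 = 1 - SS_res / SS_tot. This is the conventional
    definition, not one confirmed by the source text.
    """
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: distance between data and model.
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    # Total sum of squares: spread of the data around its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A value near 1 means the fitted curve reproduces nearly all of the variation in the observed counts per age; the sketch does not handle the degenerate case in which every observed count is identical.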

3. Results and Discussion

3.1. Osteosarcoma

Because the epidemiology and demographics of osteosarcoma are well known to differ by subtype and location [49], subgroups were created: conventional osteosarcoma (osteosarcoma not otherwise specified; chondroblastic, fibroblastic, and telangiectatic osteosarcoma), separated into long and short bones; central and small cell medullary osteosarcoma; parosteal osteosarcoma; and osteosarcoma involving the mandible, skull, pelvis, and rib/sternum/clavicle.

There were 4,356 patients with osteosarcoma; 3,998 (91.8%) were conventional and 225 (5.2%) were parosteal. The remaining types accounted for 3%. There was a slight male predominance (55.6% male, 44.4% female). Statistically significant differences between the conventional and parosteal types in age, gender, race, and anatomic location were noted (Table 1). Within the conventional group there were statistically significant differences among anatomic locations in age, race, and gender (Table 2).

Incubation period modeling was performed separately for each subgroup of osteosarcoma. The Pearson IV and log normal distributions were excellent fits for long bone osteosarcoma (Table 3) (r² of 0.987 and 0.97, respectively) (Figure 1). These yield an estimated time of the triggering event at the 99 and 99.9% levels of 2.2 and 0.45 years for the Pearson IV distribution and 2.95 and 0.3 years for the log normal distribution. Thus the triggering event for conventional long bone osteosarcoma occurs under 3 years of age but is not prenatal, since the estimated time was not ≤0. Parosteal long bone osteosarcoma demonstrated a moderate fit, with an estimated time of 8.6 and 7.65 years at the 99 and 99.9% points, respectively. On review of the data (Figures 2(a) and 2(b)), there were 200 cases spread across 1-year increments, creating significant scatter. The age data were therefore collapsed into groups of 3 years (1 to 3, 4 to 6, etc.); subsequent analysis demonstrated equally excellent fits with both the Pearson IV and log normal distributions (r² of 0.96 for both), with an estimated time of 10.5 and 10.4 years at the 99 and 99.9% points, respectively (Figures 2(c) and 2(d)). There were no significant fits for any of the other osteosarcoma types, even when the ages were collapsed into groups of 3 years.
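The collapsing of yearly ages into 3-year groups (1 to 3, 4 to 6, etc.) can be illustrated with a short sketch. This is not the authors' code: the function name `collapse_ages`, its parameters, and the choice to leave out ages below the first group (the text does not say how age 0 was handled) are assumptions made for illustration.

```python
def collapse_ages(counts_by_age, width=3, start=1):
    """Sum yearly case counts into consecutive groups of `width` years.

    `counts_by_age` maps an integer age at diagnosis to a case count.
    The result maps (first_age, last_age) tuples, e.g. (1, 3) and (4, 6),
    to the summed count for that group.
    """
    groups = {}
    for age, count in counts_by_age.items():
        if age < start:
            # Assumption: ages before the first group are left out here.
            continue
        first = start + ((age - start) // width) * width
        key = (first, first + width - 1)
        groups[key] = groups.get(key, 0) + count
    return groups
```

Grouping trades age resolution for smoother counts, which is why a sparse subgroup can fit better after collapsing.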

3.2. Ewing’s Sarcoma

There were 1,832 patients with primary osseous Ewing’s sarcoma; the average age at diagnosis was years. There were 1,158 males (63.2%) and 674 females (36.8%); ethnicity was 1,683 White, 49 Black, 43 Indo-Malay, 17 Amerindian, 14 Indo-Mediterranean, and 18 Polynesian. Incubation period modeling again demonstrated that the Pearson IV and log normal distributions were excellent fits (r² of 0.99 and 0.95, respectively) (Figure 3). The Pearson IV distribution yields an estimated time of the triggering event of 0.4 and −0.7 years at the 99 and 99.9% points, respectively. The log normal distribution is not defined for values ≤0 and so must be excluded. Thus the triggering event for Ewing’s sarcoma of bone is prenatal, at −0.7 years (36 weeks before birth or 4 weeks after conception).
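The conversion of −0.7 years into weeks can be checked with simple arithmetic. The sketch below assumes 52 weeks per year and a 40-week gestation; neither figure is stated in the text, but together they reproduce the quoted values.

```latex
\begin{align*}
0.7\ \text{years} \times 52\ \tfrac{\text{weeks}}{\text{year}} &= 36.4\ \text{weeks} \approx 36\ \text{weeks before birth},\\
40\ \text{weeks} - 36\ \text{weeks} &= 4\ \text{weeks after conception}.
\end{align*}
```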

3.3. Discussion

This is the largest series to date of the demographics for these two primarily pediatric osseous malignancies. Our results are similar to those of others (Table 4), confirming the slight male predominance for both neoplasms [2, 25, 30, 31, 33, 50] as well as the rarity of Ewing’s sarcoma in African/Black people [2, 3, 25–27, 35]. Ewing’s sarcoma was also noted to be rare in Amerindians (Hispanics/Native Americans) (22/2321, 0.94%) and Asians (Indo-Malay) (75/2321, 3.2%). Because Asian ethnic subgroups are rare in this country, and thus in the SEER database, we cannot comment on the differences previously noted among Asian populations [23, 51].

Ewing’s sarcoma is nearly always associated with a t(11;22)(q24;q12) translocation [4–11]. Such a translocation may be due to many different mechanisms (spontaneous mutation, environmental exposures to parents, etc.). Regardless of the mechanism, we hypothesized that incubation period modeling would place the time of “insult” before or at birth. We found that the estimated time of the triggering event for Ewing’s sarcoma of bone at the 99.9% point was −0.7 years (36 weeks before birth, or ~4 weeks after conception). This result is in complete agreement with the genetic data; any other result would be in stark contrast to the well-known genetic translocation. This agreement confirms the accuracy of the modeling, which can thus be used for other neoplasms.

Osteosarcoma has been associated with various genetic abnormalities [12–15], as well as environmental exposures [7, 14, 15, 29]. If osteosarcoma were a purely genetic defect, similar to Ewing’s sarcoma, then incubation period modeling should find an “insult” time that is prenatal or at birth. This study found that the estimated time of the triggering event at the 99.9% level for long bone osteosarcoma is very early in life, at ~0.4 years, but not prenatal. The exact insult is still unknown and will require further investigation. Other types of osteosarcoma demonstrated no good fit with incubation period modeling, indicating that these subtypes are heterogeneous with no actual time of “insult.” The only exception might be parosteal osteosarcoma, which demonstrated an estimated time of 10.4 years at the 99.9% point when the ages were collapsed into 3-year “buckets.” This may indicate that there are two factors in the etiology of these tumors, both genetic and environmental exposure. Unfortunately, our incubation period modeling technique cannot separate out a double hit hypothesis.

A weakness of this study is that the histologic specimens used to make the diagnosis were not reviewed by the same pathologist, which could introduce bias. No studies address this question for osteosarcoma, although studies have addressed it for other malignancies [52–55]. With ovarian carcinoma, there is fair agreement in grade [52]; with non-Hodgkin’s lymphomas the diagnosis was very accurate but reliability for the subtype was poor (59%) [53]; with Hodgkin lymphoma subtype agreement was good overall; and with lung cancer the histologic type was reasonably reliable, but independent review was necessary for precise histologic typing. Thus the overall diagnosis appears to be reliable, although the subtype classification is likely less reliable. Since this question has not been addressed for osteosarcoma, the potential impact on our results is unknown.

4. Conclusion

Incubation period modeling confirms the genetic etiology of Ewing’s sarcoma. Long bone osteosarcoma is not completely genetic in etiology, as the estimated time of the triggering event was 0.45 years for conventional long bone osteosarcoma and 10.4 years for parosteal osteosarcoma. The etiologies of these two types of osteosarcoma are thus likely different, given their different estimated times. It has been recognized that osteosarcoma of the mandible is a “different animal” [56], and the marked differences in age at diagnosis and the variability with incubation period modeling among the subtypes of osteosarcoma indicate that, although they are similar pathologically, they differ etiologically. Further research will be needed to elucidate the etiologies of these various osteosarcoma types.

Appendix

Incubation period modeling originated with infectious diseases [45, 46]. The theory behind this modeling is that a pathogen infects an organism and the organism develops clinical symptoms and is subsequently diagnosed with the disease. The incubation period is the time from exposure to clinical manifestation of the disease. Different members of a population afflicted with the same disease demonstrate different incubation periods. The natural variation in incubation periods can be characterized by a frequency distribution, with the x-axis representing the age at diagnosis and the y-axis the number/proportion of organisms infected at that particular age/time. This modeling has been expanded to other chronic diseases [44, 57] including malignancies [19–22], such as malignancies arising from radiation exposure.

The mechanics of such modeling is to first find a frequency distribution that provides an excellent fit to the data for the entire population afflicted with the disease in question. When y = 0 (the frequency distribution is zero), no cases of the disease have yet occurred. The x that corresponds to y = 0 represents the moment in time when the host was exposed to the infectious/triggering/causative agent(s), that is, the time of the triggering event. Many frequency distributions are mathematically undefined at this point because of a particular mathematical expression (e.g., division by zero or logarithm of 0). In such instances the x at which only 0.1–1% of the population has been diagnosed with the disease is used as the time of the triggering event. This is determined from the area (integral) under the frequency distribution from x to the maximum age, which denotes the proportion of the entire population spanning that portion of the x-axis. The x values are then determined at which the integral areas under the curve are 99% and 99.9%. The fitting of many complex distributions can now be performed easily with personal computer software packages, which were not available for the first incubation period modeling studies in the 1950s.
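The tail-area step described above can be sketched numerically. The following is a minimal illustration, not the procedure implemented in the fitting software: the function names, the integration bounds, the trapezoid rule, and the log normal parameters (a median age of 15 years and a shape value of 0.4) are all hypothetical, chosen only to make the example runnable.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Textbook log normal density; returns 0 for x <= 0, where ln(x) is undefined."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def age_at_upper_area(curve, lo, hi, fraction, steps=20000):
    """Return the age x at which the area under `curve` from x to `hi`
    equals `fraction` of the total area between `lo` and `hi`."""
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    ys = [curve(x) for x in xs]
    # Cumulative trapezoid areas, accumulated from the left edge.
    cum = [0.0]
    for i in range(steps):
        cum.append(cum[-1] + 0.5 * (ys[i] + ys[i + 1]) * h)
    # Area that must lie to the left of x.
    target = (1.0 - fraction) * cum[-1]
    for i in range(1, steps + 1):
        if cum[i] >= target:
            # Linear interpolation inside the bracketing interval.
            span = cum[i] - cum[i - 1]
            t = 0.0 if span == 0.0 else (target - cum[i - 1]) / span
            return xs[i - 1] + t * h
    return hi


if __name__ == "__main__":
    # Hypothetical fitted curve: median age 15 years, shape 0.4.
    fitted = lambda x: lognormal_pdf(x, mu=math.log(15.0), sigma=0.4)
    for frac in (0.99, 0.999):
        print(frac, round(age_at_upper_area(fitted, 0.0, 100.0, frac), 2))
```

Any fitted curve can be passed as `curve`. For a distribution defined at negative ages, such as the Pearson IV, `lo` would need to be set well below zero so that a prenatal estimate can be returned.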

The distribution first used historically for incubation period modeling of infectious diseases was the log normal distribution [45, 46]; however, it cannot be used when the time of the triggering event is ≤0 (e.g., prenatal or genetic events). More recent studies fit many different distributions to find the most representative fit [44, 58, 59]. Many populations are better evaluated with asymmetric distributions because they are not normally distributed [46, 47] or because they require the time of the triggering event to be <0 (prenatal exposure or genetic transmission, occurring at conception/before birth).

The mathematical equations for these distributions are as follows.

(1) The log normal distribution is mathematically expressed in terms of the natural logarithm of x (the logarithm to base e).

Note that, for any x ≤ 0, the distribution is undefined, rendering it unsatisfactory for any model needing a time of the triggering event ≤0, that is, birth or younger (i.e., prenatal exposure and/or genetic disorder).

(2) The Pearson IV distribution is mathematically expressed in terms of four fitted parameters unique to the data set analyzed. This distribution can be defined for any x.
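For orientation, the textbook forms of the two densities are sketched below. The symbols μ, σ, m, ν, a, λ, and k are assumptions introduced for illustration and need not match the parameterization used by the fitting software.

```latex
% Log normal density: defined only for x > 0 because of the ln x term.
f_{\mathrm{LN}}(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\;
  e^{-\frac{(\ln x - \mu)^{2}}{2\sigma^{2}}}, \qquad x > 0

% Pearson type IV density: defined for every real x.
% Four parameters (m > 1/2, \nu, a > 0, \lambda); k is the normalizing constant.
f_{\mathrm{IV}}(x) = k \left[ 1 + \left( \frac{x - \lambda}{a} \right)^{2} \right]^{-m}
  e^{-\nu \arctan\left( \frac{x - \lambda}{a} \right)}, \qquad -\infty < x < \infty
```

The logarithm in the first expression is what excludes ages at or before birth. The second expression contains no term that fails for x ≤ 0, which is why a prenatal estimate can be read from it.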

Conflict of Interests

The authors declare that they have no conflict of interests.

Acknowledgments

This study was supported in part by the Garceau Professorship Endowment, Department of Orthopaedic Surgery, Indiana University School of Medicine, and the George Rapp Pediatric Orthopaedic Research Endowment, Riley Children’s Foundation, Riley Children’s Hospital, Indianapolis, Indiana.