Background. The International Classification of Diseases, ninth revision (ICD-9) is designed to code diseases into categories, which are recorded in administrative databases. These databases have been used for epidemiological studies. However, the categories used in the ICD-9 codes are not always the most effective for evaluating specific diseases or their outcomes, such as the outcomes of cancer treatment. Therefore, a reclassification of the ICD-9 codes into new categories specific to cancer outcomes is needed. Methods. An expert panel composed of two physicians created broad categories that would be most useful to researchers investigating outcomes and morbidities associated with the treatment of cancer. A senior data coordinator with expertise in ICD-9 coding then joined this panel, and each code was reclassified into the new categories. Results. Consensus was achieved to expand the 17 categories in ICD-9 to 39 categories. The ICD-9 codes were placed into the new categories, and subcategories were also created for more specific outcomes. The results of this reclassification are available in tabular form. Conclusions. ICD-9 codes were reclassified by group consensus into categories designed for oncology survivorship research. The novel reclassification system can be used by those involved in cancer survivorship research.

1. Background

A classification system for grouping causes of morbidity or mortality has long been known to be crucial for the study of disease. The first attempt to classify disease systematically has been attributed to Francois Bossier de Lacroix (1706–1777), better known as Sauvages [1], in his treatise Nosologia Methodica, written in the 18th century. Subsequently, many groups have attempted to create their own classification systems to compile quantitative data about various diseases within different population groups. In these systems, individual code categories are assigned to conditions that occur frequently and are associated with significant morbidity; others are grouped together, often by anatomical site or physiologic system [2]. Since the early 1900s, international collaborations have attempted to revise and update these classification systems, which has led to the development of the International Classification of Diseases, now under the direction of the World Health Organization. The first version of the International Classification of Diseases was adopted in 1900. The ninth version, known as ICD-9, was published in 1975 and uses a five-digit coding system in which the categories are meaningful at the 3-digit level [3].

The ICD-9 has become a useful tool for health researchers, as the use of administrative databases in the study of diseases has flourished over the last decade. Administrative databases provide a quick and efficient method of eliciting clinical information regarding hospitalization, as compared to the historically used gold standard of chart review. These administrative databases were not intended for research but rather to collect information regarding resource utilization. However, studies have shown that clinical data extracted from hospital databases in Canada provide reliable data when compared to manual chart review [4]. There are limitations to these databases; it has been suggested that comorbidities in these databases may be underreported for certain codes [5].

A reorganization of ICD-9 codes has been completed for four major chronic conditions (coronary artery disease, congestive heart failure, asthma, and chronic obstructive pulmonary disease) by a group of researchers for the purpose of creating a consistent research tool for the study of these health problems [6]. These researchers used the consensus of experts in the field and followed the recommendations made by Fink et al. [7]. Their recommendations stated that a group consensus should focus on a carefully defined problem that could be investigated in a timely and economical way, that consensus panel members should be representative of their profession, and that decisions on important issues should be justified by available empirically derived data as well as by judgments and experience.

The Childhood/Adolescent/Young Adult Cancer Survivor Program (CAYACS) is a research program investigating late outcomes in survivors of pediatric and young adult cancer through the linkage of administrative databases. One of the major aims of this program was to analyze hospitalizations in survivors of childhood and adolescent cancer occurring 5 years or more after the date of diagnosis. ICD-9 codes reported on the hospital separation forms of 5-year cancer survivors can be linked and compared with those of controls who did not have childhood cancer. In reviewing the ICD-9 coding book, it became clear that the categories used in this book were not ideally suited for research into cancer survivorship. Therefore, a reclassification of ICD-9 coding specific to cancer survivorship issues was needed. The purpose of this paper is to develop a reclassification of the ICD-9 codes that can be used by all researchers in cancer survivorship. Specifically, this reclassification system can be used by researchers interested in iatrogenic late effects due to therapies given to patients with cancer. It can also be used to study the association of cancer with other diseases that may share etiologic determinants. Finally, it can be used in health services research investigating the rates of hospitalization or medical services use in those previously treated for cancer.

2. Methods

The first step was to review the categories used in the ICD-9 and then to decide what categories would be useful for oncology outcome research. Two investigators (SRR and KG) decided which major categories should be included. These categories included both main categories and a few subcategories as required. It was decided to use a category called “other” to group together all codes which were not easily identifiable or did not seem as important for oncology research.

The second step was then to create an expert panel which included a radiation oncologist, a pediatric oncologist, and a data coordinator with extensive knowledge in ICD-9 coding (KG, SRR, LL). All 3 members of the panel had experience in survivorship research and were involved in a study using administrative databases to look at long-term outcomes in children treated for cancer (the CAYACS program). This panel then systematically reviewed each code in the ICD-9 coding book and placed each code into its new category in an Excel database.

The final step was the transformation of this spreadsheet (performed by ML) into a program that reads ICD-9 codes from a data file and assigns the correct category using R code (reference), so that this new database could be easily used in future studies.
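The authors' R program is not reproduced in this paper. As a rough illustration of the step described above, the following minimal Python sketch (not the authors' code) reads a code-to-category table and assigns each ICD-9 code to a category. The table contents, the category labels, and the function names normalize, load_lookup, and assign_category are hypothetical and do not reproduce Table 2; the fallback to a 3-digit parent code and the "Other" default are assumptions based on the description of ICD-9 and of the "other" category given earlier.

```python
import csv
import io

# Hypothetical excerpt of the reclassification spreadsheet: one row per
# ICD-9 code or code prefix. The category labels are placeholders and do
# not reproduce Table 2.
LOOKUP_CSV = """code,category
410,Cardiovascular: myocardial infarction
428,Cardiovascular: heart failure
486,Infection
"""


def normalize(code):
    """Strip whitespace and the decimal point so '428.0' and '4280' match."""
    return code.strip().replace(".", "").upper()


def load_lookup(handle):
    """Read the code-to-category table into a dict keyed by normalized code."""
    return {normalize(row["code"]): row["category"]
            for row in csv.DictReader(handle)}


def assign_category(code, lookup, default="Other"):
    """Return the category of the longest listed prefix of `code`.

    ICD-9 categories are meaningful at the 3-digit level, so a 5-digit
    code falls back to its 4-digit and then its 3-digit parent. Codes
    with no listed parent go to the assumed default category.
    """
    key = normalize(code)
    while len(key) >= 3:
        if key in lookup:
            return lookup[key]
        key = key[:-1]
    return default


lookup = load_lookup(io.StringIO(LOOKUP_CSV))
for code in ["410.71", "428.0", "486", "V70.0"]:
    print(code, "->", assign_category(code, lookup))
```

In practice the in-memory table would be replaced by the exported spreadsheet file, and the data file of hospital separation codes would be read in the same way.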

3. Results

The categories decided upon by the panel are shown in Table 1. This changed the number of major categories from the 17 found in ICD-9 to 39. The ICD-9 categories first served as a backbone, and new categories were then created to encompass groups of conditions that would be of interest to those involved in survivorship research. After long discussion, the 2 clinicians involved in the study determined that these were the categories of choice.

The reclassification of the ICD-9 codes into the new categories is shown in Table 2. All the codes from the ICD-9 book were incorporated into the new classification groups. The majority of codes were easy to place into the new categories, and although many codes did not fit easily into a specific category, the group achieved full consensus on all reclassification choices.

4. Discussion

The development of the ICD-9 codes has enabled health administrators and policy makers to investigate the frequency and causes of hospitalizations across jurisdictions. This coding system organizes its codes under 17 major groupings. There has been recent interest in the use of hospital administrative databases to test epidemiological hypotheses. However, as the coding system is generalized to the entire spectrum of health conditions, it is not ideal for specific groups of interest. This became evident to our CAYACS program when we were attempting to use ICD-9 codes to analyze causes of hospitalizations in cancer survivors. The existing numerical groupings were not ideal for survivorship research. For instance, causes of infections were scattered throughout the ICD-9 coding groupings despite infection being a major grouping. The hospital’s data coordinator could code an infection based on the pathogen (codes 001-139.8) or based on the system affected by the infection (codes scattered throughout the range). The ICD-9 code groupings are therefore poorly suited to a clinical researcher who is interested in all infections in a group of individuals with a specific health condition. This becomes even more important when considering a very specific area of research, such as the treatment of cancer and its late outcomes. The purpose of this study was to reclassify the ICD-9 codes into practical groupings that can be used by a health researcher specifically for cancer follow-up outcomes.
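The infection example can be made concrete with a small Python sketch, offered only as an illustration under stated assumptions. It pools the pathogen-based range cited above with a short list of system-based infection codes that ICD-9 files in other chapters (320 bacterial meningitis, 481 and 486 pneumonia, 590 infections of kidney). That short list and the function name is_infection are assumptions for this example and are not the panel's actual assignments from Table 2.

```python
# Pathogen-based infection chapter cited in the text (codes 001-139.8),
# expressed as a range of 3-digit categories.
PATHOGEN_CHAPTER = (1, 139)

# Illustrative system-based infection categories that ICD-9 places outside
# that chapter. This list is an assumption for the example only and is far
# from complete.
SYSTEM_BASED_INFECTIONS = {"320", "481", "486", "590"}


def three_digit_category(code):
    """Return the 3-character category of an ICD-9 code, e.g. '038.9' -> '038'."""
    return code.strip().replace(".", "")[:3]


def is_infection(code):
    """Flag a code as an infection whether it is coded by pathogen or by system."""
    category = three_digit_category(code)
    low, high = PATHOGEN_CHAPTER
    if category.isdigit() and low <= int(category) <= high:
        return True
    return category in SYSTEM_BASED_INFECTIONS


# Hypothetical discharge codes for one patient.
discharge_codes = ["038.9", "486", "410.71", "590.10"]
infections = [code for code in discharge_codes if is_infection(code)]
print(infections)
```

A single pooled category of this kind lets a researcher count all infection-related hospitalizations without searching each ICD-9 chapter separately.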

This study has therefore reclassified the ICD-9 codes into categories that are useful to those involved in oncology research using administrative databases. This reclassification system can be used by all groups looking at causes of hospitalization in those diagnosed with cancer, whether these patients are on active treatment or are in posttherapy surveillance as long-term survivors. All the codes in ICD-9 are accounted for and have been placed into specific categories. Subcategories were created to help distinguish areas of interest within larger groups. For instance, within the cardiovascular system it is important to distinguish hypertension, myocardial infarction, arrhythmias, valvular disease, and cardiomyopathy from each other, as each subcategory would likely have differing attributable factors and risks. By separating out these different conditions, we can study the long-term risk of hospitalization associated with different initial childhood cancer diagnoses and therapies. We can, for example, measure the risk of hospitalization for different cardiac conditions in long-term survivors of childhood Hodgkin lymphoma treated with mantle radiotherapy.

A strength of this study is that consensus was easily achieved for all ICD-9 codes among the 3 members of the panel. The inclusion of a senior data coordinator who has extensive experience and expertise in coding hospital discharges gave insight into the practicality of coding. As all 3 members of the panel are involved in survivorship research, the new classification scheme was based on experience with data derived from ICD-9 coding.

The main limitation of this study is that it represents the opinion of only one group of clinicians. Certainly others may have a few subtle changes they would suggest to the classifications or the categories in general.

5. Conclusions

To our knowledge, this is the first reclassification of the ICD-9 codes into new diagnostic groupings that are more useful for the clinical researcher. Moreover, this new classification system is designed for oncology-specific outcomes and can therefore be used by all researchers in the study of cancer treatment and survivorship.

Conflict of Interests

The authors declare that they have no conflict of interests.

Authors' Contributions

S. R. Rassekh conceived the study, participated in the design, was on the expert panel that performed the reclassification, and drafted the manuscript. M. Lorenzi helped design the study, created all the tables, and helped draft the manuscript. L. Lee helped design the study and was on the expert panel that performed the reclassification and helped draft the manuscript. S. Devji helped in the design of the study and in drafting the manuscript. M. McBride helped design the study, is the primary investigator of the CAYACS project which helped fund this study, and helped draft the manuscript. K. Goddard helped to conceive the study, participated in the design, was on the expert panel that performed the reclassification, and helped draft the manuscript. All authors read and approved the final manuscript.


This project was jointly funded by the Canadian Institutes of Health Research (#MOP49563) and the Canadian Cancer Society (PPG#016001) as part of their support of the CAYACS Research Program (Childhood, Adolescent, Young Adult Cancer Survivorship Program).