Abstract

Aims. To assess whether replacing CA125 with HE4 in the classical formulas of risk of malignancy indices (RMIs) can improve diagnostic performance. Methods. For each of 312 patients with an adnexal mass, classical RMIs 1–4 were computed based on ultrasound score, menopausal status, and serum CA125 levels. Additionally, modified RMIs (mRMIs) 1–4 were calculated by replacing CA125 with HE4. Results. Malignant pathology was diagnosed in 52 patients (16.67%). There was no significant difference in diagnostic performance (area under the receiver operating characteristic curve [AUC]) between each classical RMI and its corresponding mRMI. In the entire sample, the AUC was 0.899, 0.900, 0.895, and 0.908 for classical RMIs 1–4 compared to 0.903, 0.929, 0.930, and 0.931 for mRMIs 1–4. In premenopausal patients, the AUC was 0.818, 0.798, 0.795, and 0.802 for classical RMIs 1–4 compared to 0.839, 0.875, 0.876, and 0.856 for mRMIs 1–4. In postmenopausal patients, the AUC was 0.906, 0.895, 0.896, and 0.906 for classical RMIs 1–4 compared to 0.907, 0.923, 0.924, and 0.930 for mRMIs 1–4. Conclusions. Use of HE4 instead of CA125 did not significantly improve diagnostic performance of RMIs 1–4 in patients with an adnexal mass.

1. Introduction

Risk of malignancy indices (RMIs) are multimodal scoring systems used for the presurgical differentiation of adnexal tumors. RMI 1, originally proposed by Jacobs et al. in 1990, is calculated based on ultrasound findings, serum levels of tumor marker cancer antigen 125 (CA125), and menopausal status [1]. In 1996, Tingulstad et al. introduced RMI 2, which modified RMI 1 by replacing the values of the parameters; in 1999, the same authors proposed RMI 3, which further modified the values of the RMI 1 parameters [2, 3]. Finally, in 2009, Yamamoto et al. proposed RMI 4 by including an additional ultrasound parameter in the RMI 1 formula [4].

Currently, many national guidelines for the management of malignancies emphasize the role of RMIs in the preoperative assessment of adnexal tumors [5, 6]. However, a recent meta-analysis by Meys et al. [7] found that subjective assessment is superior to these scoring systems. Specifically, the 47 articles analyzed by Meys et al. described the outcomes of 19,674 adnexal masses assessed via subjective assessment, simple rules, logistic regression models (LR2), and RMIs 1–3. The authors concluded that simple rules with subjective assessment by experienced ultrasound examiners (for inconclusive masses) yielded the best results. When an expert is not available, LR2 can be applied instead of subjective assessment [7]. These findings justify the need for the development of tools based only on ultrasound assessment, eliminating the need for blood sampling for tumor marker measurement. Until such tools become available for clinical use, further improvement of current RMIs is desirable.

In 1991, Kirchhoff et al. published the first report describing the presence of human epididymis protein (HE4) in the distal part of the epididymis [8]. In 2003, Hellström et al. were the first to highlight the potential role of HE4 as a serum marker of ovarian cancer [9]. Then, in 2008, Moore et al. concluded that, as a single marker, HE4 had the highest sensitivity for detecting ovarian cancer, especially stage I, among patients with adnexal masses [10]. Similarly, in 2011, Escudero et al. showed that HE4 had higher specificity than CA125 in patients with benign gynecologic diseases, with abnormal concentrations of HE4 and CA125 noted in 1.3% and 33.2% of patients, respectively, and a significantly higher area under the receiver operating characteristic (AUC-ROC) curve for HE4 than for CA125 when differentiating benign from malignant diseases [11]. These findings regarding the high specificity of HE4 in patients with adnexal masses may justify the use of HE4 instead of CA125 in the formulas of RMIs for the presurgical assessment of adnexal masses.

With the above in mind, the aim of the present study was to perform a comparative evaluation of the diagnostic performance of classical RMIs 1–4 against modified RMIs (mRMIs) 1–4 (where CA125 is replaced with HE4) in the preoperative differentiation of malignant from nonmalignant adnexal tumors in premenopausal and postmenopausal patients.

2. Materials and Methods

This prospective study included 312 patients admitted to our clinic between October 2012 and May 2015 and scheduled to undergo surgery for adnexal tumors. The inclusion criteria were age ≥ 18 years, ultrasound assessment of adnexal mass and measurement of tumor markers CA125 and HE4 within 5 days before surgical intervention, and informed consent. The exclusion criteria were renal disease, history of malignancy, chemotherapy, radiotherapy, presence of fibroids > 5 cm, and lack of histological assessment of the mass.

Each patient underwent transvaginal ultrasound. Transabdominal ultrasound was performed in patients who were virgins at the time of treatment, when the mass could not be visualized entirely by transvaginal ultrasound, and to detect metastases in the abdominal organs when there was suspicion of malignancy. Examination was performed using the ultrasound apparatus Philips iU22. The following ultrasound findings were considered in the examination: multilocular cyst, solid areas, bilateral lesions, ascites, and metastases [14]. The definitions of multilocular cyst, solid areas, and ascites were consistent with the terms and definitions established by the International Ovarian Tumors Analysis (IOTA) group [12]. Distant metastases were defined as focal lesions in renal, splenic, and hepatic parenchyma or as omental cake. If bilateral lesions were noted, only data regarding the lesion with a more complex structure were included in the statistical analysis.

One point was assigned for each ultrasound finding and the sum was used to determine the value for the RMI ultrasound score (U). The maximum diameter of the lesion (S) was considered as an additional ultrasound parameter in RMI 4 [4]. Menopause was defined as the absence of menstruation for at least 1 year. Serum CA125 and HE4 levels were measured via electrochemiluminescence immunoassay performed using a Cobas 8000 e602 apparatus. The formulas of the classical RMIs 1–3 used the ultrasound score (U), menopausal status (M), and serum levels of CA125 (RMIs 1–3 = U × M × CA125), while RMI 4 also included the maximum diameter of the lesion (S) (RMI 4 = U × M × S × CA125) [14].

For RMI 1, a U-value of 0, 1, and 3 was assigned when the total ultrasound scores were 0, 1, and ≥2 points, respectively. For RMIs 2 and 4, a U-value of 1 and 4 was assigned when the total ultrasound scores were ≤1 and ≥2 points, respectively. For RMI 3, a U-value of 1 and 3 was assigned when the total ultrasound scores were ≤1 and ≥2 points, respectively. Menopausal status (M) had a value of 1 in premenopausal patients and a value of 3 or 4 in postmenopausal patients depending on whether M was included in RMIs 1 and 3 or RMIs 2 and 4, respectively. The serum levels of CA125 were used directly in the formulas of classical RMIs 1–4. The S-value was 1 and 2 for maximal lesion diameters of <70 mm and ≥70 mm, respectively [14]. The corresponding modified RMIs 1–4 (mRMIs 1–4) were calculated using the same formulas after replacing the serum levels of CA125 by those of HE4 as follows: mRMIs 1–3 = U × M × HE4 and mRMI 4 = U × M × S × HE4.
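The scoring rules above can be summarized in a short Python sketch. It is only an illustration of the published rules as described in this section, not the software used in the study; the function names (`ultrasound_points`, `rmi`) and argument names are hypothetical. Here U denotes the ultrasound score, M the menopausal status score, and S the tumor size score; passing the HE4 level instead of the CA125 level as `marker` yields the corresponding mRMI.

```python
def ultrasound_points(multilocular, solid_areas, bilateral, ascites, metastases):
    """Count one point for each ultrasound finding that is present."""
    findings = (multilocular, solid_areas, bilateral, ascites, metastases)
    return sum(bool(f) for f in findings)


def rmi(variant, points, postmenopausal, marker, diameter_mm=None):
    """Return RMI 1-4 (marker = CA125) or mRMI 1-4 (marker = HE4).

    variant        -- 1, 2, 3, or 4
    points         -- total ultrasound points (0-5)
    postmenopausal -- True if postmenopausal
    marker         -- serum level of CA125 or HE4, used directly
    diameter_mm    -- maximum lesion diameter, required for variant 4
    """
    if variant not in (1, 2, 3, 4):
        raise ValueError("variant must be 1, 2, 3, or 4")

    # Ultrasound score U
    if variant == 1:
        u = 0 if points == 0 else (1 if points == 1 else 3)
    elif variant == 3:
        u = 1 if points <= 1 else 3
    else:  # variants 2 and 4
        u = 1 if points <= 1 else 4

    # Menopausal score M: 3 for RMIs 1 and 3, 4 for RMIs 2 and 4
    if postmenopausal:
        m = 3 if variant in (1, 3) else 4
    else:
        m = 1

    score = u * m * marker

    # Size score S applies to RMI 4 only
    if variant == 4:
        if diameter_mm is None:
            raise ValueError("RMI 4 requires the maximum lesion diameter")
        s = 1 if diameter_mm < 70 else 2
        score *= s

    return score
```

For a hypothetical postmenopausal patient with two ultrasound findings, a marker level of 50, and an 80 mm lesion, `rmi(1, 2, True, 50.0)` gives 3 × 3 × 50 and `rmi(4, 2, True, 50.0, diameter_mm=80)` gives 4 × 4 × 2 × 50.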

The final diagnosis of adnexal masses was based on histopathological examination of the excised masses. The staging of malignant masses was based on the guidelines of Fédération Internationale de Gynécologie et d’Obstétrique (FIGO). In the statistical analysis, borderline tumors were considered as malignant. All data of tumor markers, menopausal status, and ultrasound features were collected prospectively. Both classical and modified indices were calculated and documented in a prospective manner. The final decision about the method of management was taken by at least two gynecologists depending on classical RMIs, serum CA125 levels, and subjective assessment of adnexal tumors. The operators were unaware of the results of mRMIs. At the end of the study, the final analysis was performed. A cut-off of 200 and 450 was set for RMIs 1–3 and RMI 4, respectively, as suggested by the original proponents of each index [14]. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), diagnostic accuracy, positive likelihood ratio (LR+), negative likelihood ratio (LR−), and diagnostic odds ratio (DOR) of the classical RMIs and mRMIs were calculated. Sensitivity is the proportion of subjects with positive test results among all subjects with the disease. Specificity is the proportion of subjects with negative test results among all subjects without the disease. PPV is the proportion of subjects with the disease among all subjects with positive test results. NPV is the proportion of subjects without the disease among all subjects with negative test results. LR+ is calculated according to the following formula: LR+ = sensitivity/(1 − specificity). LR− is calculated according to the following formula: LR− = (1 − sensitivity)/specificity. Diagnostic accuracy is expressed as the proportion of correctly classified subjects among all subjects. DOR of a test is the ratio of the odds of positivity in subjects with disease relative to the odds in subjects without disease [13]. Measures of diagnostic accuracy were calculated for the differentiation of malignant from nonmalignant adnexal masses in the whole sample (premenopausal and postmenopausal patients). These measures were also used for the differentiation of malignant stage I (FIGO) adnexal tumors from nonmalignant adnexal masses. The Mann–Whitney U test was used to assess the differences in the distribution of CA125 and HE4 levels according to the malignancy status of the adnexal mass. The chi-square test was used to assess the differences in the distribution of age, menopausal status, and ultrasound score according to malignancy status. ROC curves were constructed and the area under the ROC curve (AUC) was calculated for both classical RMIs and mRMIs 1–4. The Hanley and McNeil test was used to assess the difference between the AUCs of classical RMIs and mRMIs. A p value < 0.05 was considered to indicate statistical significance. Diagnostic measures for mRMIs were calculated using the optimal cut-off levels derived from the ROC curves. The study protocol was approved by the local Ethical Committee (approval number KB/192/2012).
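As a minimal sketch of the definitions given above, the following Python function derives each measure from the four cells of a 2 × 2 table (test result against histopathology). The function name and the counts in the usage note are hypothetical and are not study data. The sketch assumes that no cell needed as a denominator is zero; otherwise LR+, LR−, or DOR is undefined.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute diagnostic measures from a 2 x 2 table.

    tp -- diseased subjects with a positive test
    fp -- non-diseased subjects with a positive test
    fn -- diseased subjects with a negative test
    tn -- non-diseased subjects with a negative test
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        # DOR = LR+ / LR-, which reduces to (tp * tn) / (fp * fn)
        "dor": (tp * tn) / (fp * fn),
    }
```

With made-up counts of 40 true positives, 20 false positives, 10 false negatives, and 180 true negatives, `diagnostic_measures(40, 20, 10, 180)` gives a sensitivity of 0.8, a specificity of 0.9, and a DOR of 36.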

3. Results

A total of 312 patients were included in the study. Patient age ranged from 18 to 85 years with a mean of 48.5 years (standard deviation, 16.8 years). Malignant pathology was diagnosed in 52 patients (16.67%). A total of 117 (37.50%) patients were postmenopausal at the time of assessment. The distribution of the nonmalignant and malignant pathologies is displayed in Tables 1 and 2, respectively, whereas the distribution of FIGO stages of malignant adnexal pathologies is shown in Table 3.

The differences among nonmalignant and malignant adnexal masses in terms of age, menopausal status, ultrasound score, and tumor diameter are shown in Table 4. The descriptive statistics of the distribution of biomarker serum levels (CA125 and HE4) are shown in Table 5.

The diagnostic performance of classical RMIs 1–4 is shown in Table 6 for the entire study sample and for groups defined in terms of menopausal status. The diagnostic performance of these indices for the differentiation of malignant stage I (FIGO) from nonmalignant adnexal tumors is shown in Table 7.

ROC curves were constructed for classical RMIs and mRMIs 1–4 for the entire study sample, as well as for premenopausal and postmenopausal patients (Figures 1, 2, and 3, resp.). ROC curves were also constructed for these diagnostic tools for the differentiation of malignant stage I (FIGO) from nonmalignant adnexal masses (Figure 4). The DOR and AUC of the classical and modified indices for the whole sample with corresponding values are shown in Table 8. The DOR and AUC of the diagnostic indices for the differentiation of malignant stage I (FIGO) from nonmalignant adnexal tumors are shown in Table 9. The optimal cut-off levels of mRMIs obtained from the ROC analysis were used for calculating DOR. Measures of diagnostic accuracy of mRMIs for the differentiation of all malignant tumors and of only stage I (FIGO) malignant tumors from nonmalignant adnexal tumors are displayed in Tables 10 and 11, respectively.

All classical RMIs and mRMIs were able to differentiate malignant from nonmalignant adnexal tumors. In the entire study sample of 312 patients, the AUCs of all mRMIs were higher than those of the corresponding classical RMIs. Nonetheless, according to the Hanley and McNeil test, the AUCs of corresponding classical RMIs and mRMIs 1–4 were not significantly different (Table 12) in the entire study sample, in either group defined in terms of menopausal status (i.e., premenopausal or postmenopausal patients), or in the differentiation of malignant stage I (FIGO) from nonmalignant adnexal masses. The optimal cut-off levels for mRMIs 1–4 for the whole population were 103.2, 250, 188, and 380, respectively. Lower and higher optimal cut-off levels were found in premenopausal and postmenopausal patients, respectively. Comparisons of predictive accuracy between classical RMIs and mRMIs 1–4 for the differentiation of malignant stage I (FIGO) from nonmalignant adnexal tumors are presented in Table 12.
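The comparison of two AUCs derived from the same patients can be sketched as follows. This is an illustrative Python implementation of the standard error and z statistic described by Hanley and McNeil, not the software used in the study. The correlation `r` between the two AUCs is not reported here and is therefore an assumed input (in the original method it is obtained from a lookup table), and the function names are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley and McNeil approximation).

    n_pos -- number of subjects with the disease
    n_neg -- number of subjects without the disease
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_aucs(auc1, auc2, n_pos, n_neg, r=0.0):
    """Return (z, two-sided p) for two AUCs measured in the same subjects.

    r -- assumed correlation between the two AUCs (r = 0 treats them
         as independent, which is conservative for paired data)
    """
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    z = (auc1 - auc2) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p
```

For example, `compare_aucs(0.903, 0.899, n_pos=52, n_neg=260, r=0.5)` compares the AUCs of mRMI 1 and classical RMI 1 in the entire sample (52 malignant and 260 nonmalignant masses); the value `r=0.5` is a placeholder and the resulting p value should not be read as a study result.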

4. Discussion

With the development of 3D ultrasound, Doppler ultrasound, and novel tumor markers, it may be possible to further improve the accuracy of available RMIs. Wang et al. developed a binary logistic regression model to improve RMIs by incorporating tumor-specific growth factor and Doppler blood-flow parameters and found that, compared to RMI 1, the new RMI provided better predictions, especially in the diagnosis of ovarian germ cell tumors and other early-stage adnexal tumors [14].

In the present study, we considered cut-off values of 200 for RMIs 1–3 and 450 for RMI 4 according to the thresholds suggested by the original proponents of these indices. Some authors, however, have reported other values for optimal RMI thresholds when discriminating malignancy from nonmalignancy [15–19]. The discrepancies likely originate from differences in population composition, sample size, proportion of malignant pathologies, incidence of advanced malignant disease, proportion of menopausal patients, experience of the ultrasound examiner, type of medical center (oncologic or otherwise), and laboratory method for detecting CA125 [15–21]. It might be difficult to determine a cut-off value for RMIs with worldwide acceptance [22].

To our knowledge, this is the first study to analyze four variants of RMI in the Polish population. Our results revealed that all classical RMI variants could differentiate malignant from nonmalignant adnexal pathologies. Although our results agree with most previous studies in other populations, care should be taken when applying our findings in clinical practice involving other populations. For example, Ong et al. found that RMIs 1–4 could not appropriately distinguish between malignant and nonmalignant cases in an Asian population, although such findings may be attributed to the retrospective design of the study and the fact that endometriotic cysts accounted for the majority (76.8%) of nonmalignant lesions in the study population [23].

The RMI scoring systems combine several diagnostic parameters in order to improve on the diagnostic performance of each individual parameter [24]. In our study, statistical analysis revealed that the incidence of malignancy increased significantly with increasing age, ultrasound score, and tumor diameter. Moreover, malignant disease was significantly more common among postmenopausal patients than among premenopausal patients. These findings are compatible with those of Yamamoto et al. [4]. Other mathematical models incorporating clinical data, ultrasound findings, and tumor markers have been proposed, such as the logistic regression models developed by the IOTA group (LR1, LR2) and the ADNEX model [25, 26]. However, predictions based on the logistic regression models and ADNEX refer to the probability of malignancy, whereas RMI-based predictions classify a mass as malignant if the RMI is above a certain cut-off point.

Ovarian cancer is one of the diagnostic dilemmas in gynecological oncology due to a lack of diagnostic tools for the early recognition of the disease [24]. If the diagnostic tool is highly sensitive, a higher proportion of malignant cases will be captured, which will eventually be managed appropriately (treatment by oncological gynecologists). The present study revealed relatively low sensitivities for the diagnostic tools, which might be related to the relatively high proportion of stage I (FIGO) adnexal malignancies (30%), where ultrasound features and tumor markers may not be differentiated from those of nonmalignant masses. However, high specificities emphasize the ability of the test to recognize subjects without the disease. The relatively low PPV and high NPV in the present study should be interpreted with caution and should not be transferred to other settings with a different prevalence of the disease, as the prevalence affects both of these values and PPV in particular [13]. Positive and negative LRs and DORs were useful in this study as they provided additional estimates of the diagnostic value. DOR depends on sensitivity and specificity but not on the prevalence of the disease [13]. The sensitivities of mRMIs were higher than those of the corresponding classical RMIs, especially in the premenopausal group. The higher sensitivity of mRMIs allows them to detect more malignant cases. However, lower specificity for mRMIs was found compared to the corresponding classical RMIs. Modifying RMIs resulted in a decreased positive likelihood ratio. Table 8 shows a comparison of the AUC and DOR. When looking at the whole sample, classical RMI 4 and mRMI 4 had the highest AUC, while classical RMI 1 and mRMI 4 had the highest DOR. In premenopausal patients, classical RMI 1 and mRMI 3 had the highest AUC and DOR. In the postmenopausal group, classical RMI 1 had the highest AUC and DOR, mRMI 4 had the highest AUC, and mRMI 3 had the highest DOR.
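The dependence of PPV and NPV on prevalence noted above can be illustrated with a short Python sketch based on Bayes' theorem. The sensitivity and specificity values in the usage note are hypothetical and are not results of this study; the function name is likewise an assumption made for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test applied at two prevalences: the proportion of
# malignant masses observed in this study (16.67%) and an assumed
# low-prevalence setting (1%).
for prevalence in (0.1667, 0.01):
    ppv, npv = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence={prevalence:.4f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV falls and NPV rises as prevalence decreases, which is why these two measures should not be carried over to settings with a different case mix.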

For the differentiation of malignant stage I (FIGO) from nonmalignant adnexal tumors, mRMIs have higher sensitivities than classical RMIs, enabling proper detection of the early stages of malignancy. The AUC of mRMIs was higher than that of the classical RMIs but there was no significant difference between each mRMI and its corresponding classical RMI. In addition, the AUCs of classical RMIs and mRMIs for the differentiation of early stages of malignant adnexal tumors were lower than the values obtained for the whole population. Our results should be interpreted with caution, since there were a small number of patients with malignant stage I adnexal masses.

Menopausal status is one of the key parameters in the RMI formula. The present study revealed that malignant pathologies occur more commonly in postmenopausal patients. In the Polish population, 80% of ovarian cancer cases are noted in patients older than 50 years. However, ovarian cancer represents 4% of all malignant cases in patients aged ≤19 years and more than 6% in patients aged 20–44 years [27]. In the present study, the sensitivities of the four variants of classical RMIs and mRMIs were lower in premenopausal than in postmenopausal patients. The AUCs of the classical and modified variants of RMI were also lower in premenopausal patients. Overall, the AUCs of mRMIs and corresponding classical RMIs were not significantly different regardless of menopausal status.

Yenen et al. retrospectively compared 50 ovarian borderline tumors and 50 controls, reporting that RMI 4 was the best predictive RMI variant for the preoperative discrimination of such tumors, with a cut-off value of 200 [28]. It is difficult to interpret these results in the context of the present study because of the small sample size. Further prospective studies with higher numbers of patients are needed for more detailed evaluations.

The assessment of diagnostic performance using risk of ovarian malignancy algorithm (ROMA) and other predicting models was outside the scope of this study. One limitation of the present study was that the RMI-based predictions were not compared against those of other mathematical models or systemic scores. Anton et al. reported no differences in the accuracies of CA125, HE4, ROMA, and RMI for the differentiation of malignant from nonmalignant adnexal tumors [29]. Similarly, Karlsen et al. analyzed 1,218 patients with pelvic masses and concluded that ROMA and RMI perform equally well [30]. Another limitation of the present study is that it was conducted in a combined gynecology and oncology unit, where it is expected that patients with suspicion of malignant tumors are preferentially referred, potentially increasing the percentage of malignant adnexal pathologies in our study population. However, the proportion of malignant cases in our study population was low (16.67%), and 59.61% of the malignancies were diagnosed as stage III or IV, which are easily distinguished on ultrasound. Furthermore, we did not follow up the patients who were indicated for conservative management. Finally, the fact that we excluded patients with other diseases may have reduced the rate of false results. The present results should be verified by multicenter prospective studies to help in the selection of patients at the tertiary level for the most suitable surgical intervention. Our results are not representative of the whole population at the primary level, where the prevalence of the disease is low. However, this study was prospective in design, which represents its main strength, along with the fact that both tumor markers were measured at the same time using the same apparatus.

We found that replacing CA125 with HE4 did not improve the overall performance of RMIs. Previous studies showed that ultrasound parameters are superior to tumor markers [31, 32]. Taken together, these observations suggest that it might be more beneficial to modify RMI formulas to incorporate only ultrasound and clinical data without the use of tumor markers. Further prospective studies are needed to confirm this hypothesis and improve the diagnostic performance of RMIs.

5. Conclusions

Classical RMIs and mRMIs 1–4 are useful for the presurgical differentiation of malignant and nonmalignant adnexal tumors. Replacing CA125 with HE4 in the classical RMI formulas did not improve the diagnostic performance of these indices.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This study was funded by the Medical University of Warsaw through a grant for research and scientific work aimed at the scientific development of young doctors and Ph.D. students (no. 2WA/PM21D/13).