Kijja Jearwattanakanok, Sirikan Yamada, Watcharin Suntornlimsiri, Waratsuda Smuthtai, Jayanton Patumanond, "Validation of the Diagnostic Score for Acute Lower Abdominal Pain in Women of Reproductive Age", Emergency Medicine International, vol. 2014, Article ID 320926, 6 pages, 2014. https://doi.org/10.1155/2014/320926
Validation of the Diagnostic Score for Acute Lower Abdominal Pain in Women of Reproductive Age
Background. The differential diagnosis of acute appendicitis, obstetrics and gynecological conditions (OB-GYNc), or nonspecific abdominal pain in young adult females with lower abdominal pain is clinically challenging. The present study aimed to validate the recently developed clinical score for the diagnosis of acute lower abdominal pain in females of reproductive age. Method. Medical records of women of reproductive age (15–50 years) who were admitted for acute lower abdominal pain were collected. Validation data were obtained from patients admitted during a different period from the development data. Result. There were 302 patients in the validation cohort. For appendicitis, the score had a sensitivity of 91.9%, a specificity of 79.0%, and a positive likelihood ratio of 4.39. The sensitivity, specificity, and positive likelihood ratio in the diagnosis of OB-GYNc were 73.0%, 91.6%, and 8.73, respectively. The areas under the receiver operating characteristic (ROC) curves and the positive likelihood ratios for appendicitis and OB-GYNc in the validation data were not significantly different from those in the development data, implying similar performance. Conclusion. The clinical score developed for the diagnosis of acute lower abdominal pain in females of reproductive age may be applied to guide differential diagnoses in these patients.
Abdominal pain is one of the most common chief complaints of emergency department patients. It was the main symptom in 12.1% to 20.4% of noninjury visits to emergency departments in the USA, and 16.8% to 17.8% of these patients were in severe condition [1]. The cause of abdominal pain is difficult to diagnose in some patients. Diagnosis of acute appendicitis, for example, was less accurate in young adult females than in males. The accuracy of diagnosis of acute appendicitis in young adult females was 71.7% to 75.3%, while the accuracy in males was 88.6% to 90.0% [2]. Diagnosis of acute lower abdominal pain in young adult females was particularly difficult because the symptoms of obstetrics and gynecological conditions overlap with those of acute appendicitis. Negative appendectomies mostly resulted from missed diagnoses of obstetrics and gynecological conditions [3].
CT scan improved accuracy in diagnosing appendicitis and can detect other causes of abdominal pain in female patients [4]. The use of CT scan can reduce negative appendectomies [5]. However, the universal use of CT scan for diagnosing appendicitis may not be cost-effective under global budget reimbursement schemes for healthcare [6].
Although ultrasound is not as accurate as CT scan, it also showed benefit in diagnosing acute lower abdominal pain [7, 8], especially for pregnant women and children, in whom radiation is relatively contraindicated. However, ultrasound alone had low sensitivity in the diagnosis of appendicitis. Its sensitivity was no higher than that of unaided clinical judgment [9].
Clinical prediction rules, through which clinical findings are systematically applied to predict difficult clinical conditions [10], may be another approach for the diagnosis of acute lower abdominal pain in females of reproductive age. Alvarado’s score, although originally developed for early diagnosis of acute appendicitis [11], has been studied as an admission criterion [12] and as a criterion for CT scan [13]. However, appendicitis scores were not adequately applicable to abdominal pain in females of reproductive age, because they could not detect obstetrics and gynecological causes. We, therefore, developed a clinical scoring system for the diagnosis of acute lower abdominal pain in these particular patients [14]. In this study, we aimed to validate our clinical scoring system with patients from a different time period.
2.1. The Scoring System
The score comprises simple clinical findings, laboratory results, and a constant. Item scores were assigned for guarding or rebound tenderness, pregnancy (determined either clinically or by urine pregnancy test), tenderness at the right lower quadrant of the abdomen, tenderness at the left lower quadrant of the abdomen, leukocytosis (white cell count ≥10,000/μL), and neutrophil predominance (≥75%) in the complete blood count, together with a constant. The assigned scores and the algorithm for diagnostic prediction are shown in Table 1. The item scores had both positive and negative values, which reflected an increase or a decrease in the probabilities of the corresponding diagnoses when a patient presented with those clinical findings.
RLQ: right lower quadrant; LLQ: left lower quadrant.
2.2. Validation Data
The study setting was Nakornping Hospital, a tertiary care hospital in Chiang Mai, Thailand. Validation data were extracted from the medical records of female patients aged 15–50 years who were admitted to the surgical department or the obstetrics and gynecology department between January and July 2009 with a chief complaint of acute lower abdominal pain with onset within the preceding 14 days. Patients were classified into three groups according to their final professional diagnoses, which were (1) acute appendicitis (ICD-10 code K35); (2) obstetrics and gynecological conditions (OB-GYNc), including ectopic pregnancy (ICD-10 code O00), pelvic inflammatory disease (ICD-10 code N70), and complicated ovarian cyst (ICD-10 code N83); and (3) nonspecific abdominal pain (NSAP) (ICD-10 codes A09, K57, and R10 or other causes of abdominal pain). Study variables were age, marital status, duration of pain, presence of shifting of pain, nausea and vomiting, pregnancy, abnormal vaginal bleeding, presence of fever, systolic blood pressure, site of abdominal pain, presence of guarding or rebound tenderness on abdominal examination, result of complete blood count, and urine pregnancy test. Item scores were calculated and diagnostic prediction was performed for each patient. Final professional diagnoses in the medical records were considered the reference standard for testing the accuracy of the score.
2.3. Statistical Analysis
Patients’ characteristics in the development data and the validation data were summarized. The score-predicted diagnosis of each patient was compared with the final professional diagnosis. Diagnostic indices were calculated in the validation data. The abilities to discriminate appendicitis and OB-GYNc, in terms of areas under the receiver operating characteristic (ROC) curves of the two data sets, were compared with the test for equality of two ROC curves. The positive likelihood ratios for the diagnosis of appendicitis and OB-GYNc in the development data and the validation data were compared with the chi-squared test for homogeneity. The probability curves of the appendicitis score and the OB-GYN score were estimated with a logistic regression postestimation function from the actual rates of appendicitis and OB-GYNc in the development data and the validation data.
The study was approved by the Ethical Committee of Nakornping Hospital and the Ethical Committee of the Faculty of Medicine, Chiang Mai University.
The patients’ characteristics in the development data and the validation data were similar (Table 2). Appendicitis was the most common diagnosis in both data sets (70.5% in the development data and 65.2% in the validation data). The final diagnoses of patients are shown in Table 3.
When the score-predicted diagnoses were compared with the final professional diagnoses in patients from the validation data, the score correctly diagnosed 24 of 33 NSAP patients (72.7%), 181 of 203 appendicitis patients (89.2%), and 46 of 66 OB-GYNc patients (69.7%). The overall accuracy of the score was 83.1% (251/302) (Table 4). For the diagnosis of appendicitis, the score had a sensitivity of 91.9%, a specificity of 79.0%, and a positive likelihood ratio of 4.39. For the diagnosis of OB-GYNc, the score had a sensitivity of 73.0%, a specificity of 91.6%, and a positive likelihood ratio of 8.73. The diagnostic indices and their 95% confidence intervals are displayed in Table 5.
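The proportions above can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the counts and the rounded sensitivity and specificity values are taken from the text, while the variable and function names are illustrative assumptions. It uses the standard definition of the positive likelihood ratio, sensitivity / (1 − specificity).

```python
# Counts reported in the text for the validation cohort:
# correctly classified patients and group totals, by final diagnosis.
correct = {"NSAP": 24, "appendicitis": 181, "OB-GYNc": 46}
total = {"NSAP": 33, "appendicitis": 203, "OB-GYNc": 66}


def percent_correct(group: str) -> float:
    """Percentage of patients in a diagnostic group classified correctly."""
    return 100.0 * correct[group] / total[group]


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Standard definition: LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


# Overall accuracy: all correct classifications over all patients.
overall_accuracy = 100.0 * sum(correct.values()) / sum(total.values())

# LR+ recomputed from the rounded sensitivity and specificity in the text.
lr_appendicitis = positive_likelihood_ratio(0.919, 0.790)
lr_obgyn = positive_likelihood_ratio(0.730, 0.916)

if __name__ == "__main__":
    for group in total:
        print(f"{group}: {percent_correct(group):.1f}% correct")
    print(f"Overall accuracy: {overall_accuracy:.1f}%")
    print(f"LR+ appendicitis: {lr_appendicitis:.2f}")
    print(f"LR+ OB-GYNc: {lr_obgyn:.2f}")
```

Because the inputs are rounded to one decimal place, the recomputed likelihood ratios (about 4.38 and 8.69) are close to, but not identical with, the reported 4.39 and 8.73.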
When the criteria in Table 1 were used for the prediction of diagnoses, the performance of the score in discriminating appendicitis, in terms of ROC analysis and positive likelihood ratio, was not significantly different between the validation data and the development data. The area under the ROC curve for the discrimination of appendicitis from “nonappendicitis” was 0.855 in the validation data and 0.796 in the development data. The positive likelihood ratios for the diagnosis of appendicitis in the validation data and the development data were 4.39 and 2.97, respectively. The areas under the ROC curves for the discrimination of OB-GYNc from “non-OB-GYNc” were not different between the validation data and the development data (0.823 and 0.808, respectively). The ROC areas of the development data reported in this study differ from those reported in our previous study because the previous study reported the ROC areas of the individual scores (appendicitis score for appendicitis and OB-GYN score for OB-GYNc), not of the whole algorithm as in this study. Similarly, the positive likelihood ratios for the diagnosis of OB-GYNc were not significantly different between the validation data and the development data (8.73 and 12.94, respectively) (Table 6). The probability curves estimated from the actual rates in the development data and the validation data, for appendicitis from the appendicitis score and for OB-GYNc from the OB-GYN score, are shown in Figure 1.
The present study was the second part of our previous study on a clinical prediction rule for the diagnosis of acute lower abdominal pain in females of reproductive age [14]. In general, clinical prediction rule studies comprise derivation, validation, and impact studies, with the level of evidence increasing in each phase [15]. A validation study is important before such a clinical prediction rule is applied in clinical practice, because the results of prediction may not be reproducible in other settings or in other time periods [16]. In this validation study, we found no significant differences in the prediction of diagnoses between the validation data and the development data. This could be explained simply by the fact that we conducted the study in the same setting as the development of the diagnostic score; patients’ characteristics and patterns of clinical practice were unlikely to differ over time.
Clinical scoring for the diagnosis of abdominal pain has been extensively studied for appendicitis [17–22]. There were relatively fewer studies for obstetrics and gynecological conditions [23–25]. However, those scores were each applied to the diagnosis of only a single disease (appendicitis, ectopic pregnancy, pelvic inflammatory disease, or adnexal torsion). The present diagnostic score has the advantage of supporting a differential diagnosis across more than one condition, resembling the routine clinical approach to patients. The main advantage of this score is triaging. It can guide emergency room physicians on whether to admit a patient and which specialties to consult. For example, in a patient whose appendicitis score and OB-GYN score are both equal to or less than zero, a diagnosis of NSAP is likely; a patient with mild symptoms can probably be admitted to the observation room, or discharged from the emergency room with a follow-up appointment within the next 24 hours. The probability of appendicitis in this case would be approximately 20% or less, and the probability of OB-GYNc is very low (Figure 1). In addition, the score-predicted probabilities in Figure 1 can also be applied for selective management. Patients with an appendicitis score of 0–2 or an OB-GYN score of 2–4, whose probabilities of appendicitis or OB-GYNc are approximately 20% to 60%, would be appropriate candidates for further investigations, such as ultrasound or CT, prior to admission. With triaging and selective management, the time spent in the emergency department is expected to decrease.
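The triage ranges described above can be written down as a small decision function. This is only an illustrative sketch of the thresholds quoted in this paragraph, not the Table 1 algorithm and not a clinical tool: the function name, the category labels, and the order in which the rules are checked are assumptions. In particular, the text's ranges overlap at an appendicitis score of exactly zero, and the sketch assumes that the "both scores ≤ 0" rule takes precedence.

```python
# Hypothetical category labels, introduced only for this illustration.
NSAP_LIKELY = "NSAP likely: observation or discharge with 24-hour follow-up"
INTERMEDIATE = "intermediate probability: consider ultrasound or CT"
OTHER = "outside the ranges discussed here: apply the Table 1 algorithm"


def triage(appendicitis_score: float, obgyn_score: float) -> str:
    """Map the two scores to the management suggestions quoted in the text.

    Assumption: rules are checked in this order, so a patient with both
    scores <= 0 is treated as 'NSAP likely' even when the appendicitis
    score is exactly 0 (which also falls in the 0-2 range).
    """
    # Both scores at or below zero: NSAP is likely.
    if appendicitis_score <= 0 and obgyn_score <= 0:
        return NSAP_LIKELY
    # Appendicitis score 0-2 or OB-GYN score 2-4: roughly 20% to 60%
    # probability, so further investigation is suggested.
    if 0 <= appendicitis_score <= 2 or 2 <= obgyn_score <= 4:
        return INTERMEDIATE
    # Anything else is not covered by the examples in this paragraph.
    return OTHER


if __name__ == "__main__":
    for scores in [(-1, -2), (1, 0), (-1, 3), (5, 6)]:
        print(scores, "->", triage(*scores))
```

The sketch applies the "or" in the text literally, so a patient with one score in the intermediate range is flagged for further investigation regardless of the other score; a real implementation would need the full algorithm and item weights from Table 1.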
This study has several limitations. The most obvious is its retrospective design. Clinical signs and symptoms that were not documented may have been either absent or not evaluated. Different observers may interpret physical examination findings differently, and clinical signs that change over time may not be well recorded.
The use of final professional diagnoses as the reference standard is another limitation. Differences in follow-up times and in clinical judgment among doctors can also lead to misclassification. These limitations could be reduced by a prospective validation study of the diagnostic scoring system that assesses interobserver agreement of measurements and uses standardized criteria for diagnostic indicators, objective criteria for the final diagnosis of each condition, and a standardized follow-up time.
The results of this study should be used with caution. Patients in our setting were mainly referred from smaller hospitals in Chiang Mai. Most of them needed to be admitted to either the general surgery department or the obstetrics and gynecology department. Different patient characteristics and different patient flows in other settings would affect the accuracy of the scoring system. For example, complications of myoma uteri such as necrosis or torsion were rare in our setting. In hospitals where such complications are major causes of acute lower abdominal pain, this diagnostic score may not be suitable. Applying this scoring system in settings with different patterns of patient flow could lead to misdiagnosis of some conditions. External validation should be performed before the score is adopted into clinical practice in other settings. Further impact studies should be conducted to assess the effects of the score on multiple dimensions of clinical practice, such as time spent in the emergency department, additional diagnostic value on top of the unaided judgment of junior physicians, and the time and cost of diagnosis.
The clinical diagnostic score can triage appendicitis, OB-GYNc, and NSAP in female patients with acute lower abdominal pain. The diagnostic score can guide emergency department physicians toward appropriate admissions and selective management.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
1. F. A. Bhuiya, S. R. Pitts, and L. F. McCaig, “Emergency department visits for chest pain and abdominal pain: United States, 1999–2008,” NCHS Data Brief, no. 43, pp. 1–8, 2010.
2. S. W. Wen and C. D. Naylor, “Diagnostic accuracy and short-term surgical outcomes in cases of suspected acute appendicitis,” Canadian Medical Association Journal, vol. 152, no. 10, pp. 1617–1626, 1995.
3. S. A. Seetahal, O. B. Bolorunduro, T. C. Sookdeo et al., “Negative appendectomy: a 10-year review of a nationally representative sample,” American Journal of Surgery, vol. 201, no. 4, pp. 433–437, 2011.
4. E. K. Paulson, M. F. Kalady, and T. N. Pappas, “Clinical practice. Suspected appendicitis,” The New England Journal of Medicine, vol. 348, no. 3, pp. 236–242, 2003.
5. B. C. Morse, R. H. Roettger, C. A. Kalbaugh, D. W. Blackhurst, and W. B. Hines Jr., “Abdominal CT scanning in reproductive-age women with right lower quadrant abdominal pain: does its use reduce negative appendectomy rates and healthcare costs?” American Surgeon, vol. 73, no. 6, pp. 580–584, 2007.
6. K.-H. Lin, W.-S. Leung, C.-P. Wang, and W.-K. Chen, “Cost analysis of management in acute appendicitis with CT scanning under a hospital global budgeting scheme,” Emergency Medicine Journal, vol. 25, no. 3, pp. 149–152, 2008.
7. K. Soda, K. Nemoto, S. Yoshizawa, T. Hibiki, K. Shizuya, and F. Konishi, “Detection of pinpoint tenderness on the appendix under ultrasonography is useful to confirm acute appendicitis,” Archives of Surgery, vol. 136, no. 10, pp. 1136–1140, 2001.
8. W. B. Schwerk, B. Wichtrup, M. Rothmund, and J. Ruschoff, “Ultrasonography in the diagnosis of acute appendicitis: a prospective study,” Gastroenterology, vol. 97, no. 3, pp. 630–639, 1989.
9. H. Jahn, F. K. Mathiesen, K. Neckelmann, C. P. Hovendal, T. Bellstrom, and F. Gottrup, “Comparison of clinical judgment and diagnostic ultrasonography in the diagnosis of acute appendicitis: experience with a score-aided diagnosis,” European Journal of Surgery, vol. 163, no. 6, pp. 433–443, 1997.
10. P. Beattie and R. Nelson, “Clinical prediction rules: what are they and what do they tell us?” Australian Journal of Physiotherapy, vol. 52, no. 3, pp. 157–163, 2006.
11. A. Alvarado, “A practical score for the early diagnosis of acute appendicitis,” Annals of Emergency Medicine, vol. 15, no. 5, pp. 557–564, 1986.
12. M. Y. P. Chan, C. Tan, M. T. Chiu, and Y. Y. Ng, “Alvarado score: an admission criterion in patients with right iliac fossa pain,” Surgeon, vol. 1, no. 1, pp. 39–41, 2003.
13. R. McKay and J. Shepherd, “The use of the clinical scoring system by Alvarado in the decision to perform computed tomography for acute appendicitis in the ED,” American Journal of Emergency Medicine, vol. 25, no. 5, pp. 489–493, 2007.
14. K. Jearwattanakanok, S. Yamada, W. Suntornlimsiri, W. Smuthtai, and J. Patumanond, “Clinical scoring for diagnosis of acute lower abdominal pain in female of reproductive age,” Emergency Medicine International, vol. 2013, Article ID 730167, 6 pages, 2013.
15. T. G. McGinn, G. H. Guyatt, P. C. Wyer, C. D. Naylor, I. G. Stiell, and W. S. Richardson, “Users' guides to the medical literature XXII: how to use articles about clinical decision rules. Evidence-Based Medicine Working Group,” Journal of the American Medical Association, vol. 284, no. 1, pp. 79–84, 2000.
16. T. Fahey and J. van der Lei, “Producing and using clinical prediction rules,” in The Evidence Base of Clinical Diagnosis: Theory and Methods of Diagnostic Research, J. A. Knottnerus and F. Buntinx, Eds., pp. 213–236, Blackwell Publishing, Oxford, UK, 2nd edition, 2009.
17. H. Sitter, S. Hoffmann, I. Hassan, and A. Zielke, “Diagnostic score in appendicitis: validation of a diagnostic score (Eskelinen score) in patients in whom acute appendicitis is suspected,” Langenbeck's Archives of Surgery, vol. 389, no. 3, pp. 213–218, 2004.
18. D. M. Kulik, E. M. Uleryk, and J. L. Maguire, “Does this child have appendicitis? A systematic review of clinical prediction rules for children with acute abdominal pain,” Journal of Clinical Epidemiology, vol. 66, no. 1, pp. 95–104, 2013.
19. S. M. M. de Castro, Ç. Ünlü, E. P. Steller, B. A. van Wagensveld, and B. C. Vrouenraets, “Evaluation of the appendicitis inflammatory response score for patients with acute appendicitis,” World Journal of Surgery, vol. 36, no. 7, pp. 1540–1545, 2012.
20. C. Ohmann, Q. Yang, and C. Franke, “Diagnostic scores for acute appendicitis,” European Journal of Surgery, vol. 161, no. 4, pp. 273–281, 1995.
21. R. Ohle, F. O'Reilly, K. K. O'Brien, T. Fahey, and B. D. Dimitrov, “The Alvarado score for predicting acute appendicitis: a systematic review,” BMC Medicine, vol. 9, article 139, 2011.
22. C. Ohmann, C. Franke, and Q. Yang, “Clinical benefit of a diagnostic score for appendicitis: results of a prospective interventional study. German Study Group of Acute Abdominal Pain,” Archives of Surgery, vol. 134, no. 9, pp. 993–996, 1999.
23. C. Huchon, P. Panel, G. Kayem et al., “Is a standardized questionnaire useful for tubal rupture screening in patients with ectopic pregnancy?” Academic Emergency Medicine, vol. 19, no. 1, pp. 24–30, 2012.
24. K. Morishita, M. Gushimiyagi, M. Hashiguchi, G. H. Stein, and Y. Tokuda, “Clinical prediction rule to distinguish pelvic inflammatory disease from acute appendicitis in women of childbearing age,” American Journal of Emergency Medicine, vol. 25, no. 2, pp. 152–157, 2007.
25. C. Huchon, S. Staraci, and A. Fauconnier, “Adnexal torsion: a predictive score for pre-operative diagnosis,” Human Reproduction, vol. 25, no. 9, pp. 2276–2280, 2010.
Copyright © 2014 Kijja Jearwattanakanok et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.