Conference Issue: Big Data for Biomedical Research
Significant Physical and Exercise-Related Variables for Exercise-Centred Lifestyle: Big Data Analysis for Gynaecological Cancer Patients
This study investigated the characteristics of gynaecological cancers and aimed to identify significant risk variables using the National Health Insurance Sharing Service database to develop practical interventions for affected patients. Data regarding patients with uterine and ovarian cancer from the National Health Insurance Sharing Service database were collected and analysed using Student’s t-test, logistic regression, and receiver operating characteristic curve analyses. Student’s t-test analyses revealed that age, body mass index, blood pressure, and waist circumference differed significantly between patients with uterine cancer and controls. Gamma-glutamyl transpeptidase levels were higher in patients with ovarian cancer than in patients with uterine cancer. Physical fitness function tests reflected the status of patients with cancer. Moreover, physical disability was associated with an increased incidence of ovarian cancer. Intensive exercise for 20 min more than 1 time per week must be avoided to prevent uterine cancer. Receiver operating characteristic curve analyses showed that the optimal cutoff value for one-leg standing time, a prognostic and preventive factor in ovarian cancer, was 9.50 s (sensitivity, 94.9%; specificity, 96.9%). Controlling significant variables for each gynaecological cancer type in an individualised and optimised manner is recommended, including by maintenance of an adjusted exercise-centred lifestyle.
According to the 2016 gynaecological cancer-related report of Statistics Korea (http://kostat.go.kr/portal/eng/index.action), 44,367 and 3,538 individuals had uterine and ovarian cancers (6th and 17th most common cancers in Korea), respectively. Careful diagnosis should be performed for gynaecological cancers, which show tell-tale symptoms (i.e., postmenopausal bleeding or irregular bleeding) and are commonly diagnosed in the early stages. Gynaecological cancers have high recurrence rates and low survival rates. Moreover, specific aetiological factors contributing to gynaecological cancers are unclear, except for human papillomavirus infection, which causes cervical cancer. Although the survival rate has increased due to medical treatments, including surgery, postoperative complications such as anaemia, malnutrition, and depression often occur [2, 3]. Nevertheless, there has been gradual progress in identifying modifiable variables that can severely affect the incidence of cancer, as well as other dependent factors that vary among cancer types. Among these variables, physical activity considerably improved both cancer symptoms and postoperative recovery. Prescribed exercise is adjusted based on the cancer target and involves consideration of critical disease-related factors; however, specific prescribed exercise has not been well tailored to patients with gynaecological cancer.
Korean individuals are required to register their regular health check-up data, which enables the pooling of massive health-related datasets. The Korean government arranged electronic data updates beginning with the 2002 dataset pool, which is held within the National Health Insurance Sharing Service (NHISS) Database (DB) (https://www.nhis.or.kr). This DB allows researchers to perform prospective, retrospective, cross-sectional, and longitudinal studies. Studies using this massive DB have been performed to provide practical guidance for evidence-based interventions (e.g., exercise prescriptions) for target diseases.
This study aimed to determine the characteristics of gynaecological cancers (specifically uterine and ovarian cancer) and identify significant NHISS DB-derived risk variables among patients with these cancer types to provide beneficial information for the development of practical interventions. This study evaluated the following hypotheses:
(1) There are significant gynaecological cancer-dependent risk variables that can be related to clinical benefits during targeted cancer treatment (i.e., risk factors for developing cancer or cancer prognosis)
(2) Exercise prescription-orienting markers and physical function-related variables can be identified using logistic regression analyses
(3) By screening risk variables using logistic regression analyses, optimal threshold-related cutoff values can be obtained using receiver operating characteristic (ROC) curve analyses
2. Materials and Methods
2.1. The Data Source for Gynaecological Cancer Study and Its Demographic Characteristics
All Korean-sourced medical records since 2002, collected during regular medical health check-ups, were gathered in electronic format in the NHISS DB (https://nhiss.nhis.or.kr/bd/ab/bdaba000eng.do). The NHISS DB used in the study included records of 514,866 randomly selected individuals aged >40 years. The DB included data regarding sex, age, medical insurance fee, resident region, and death, and all data were used for patient stratification. Of the 514,866 patients registered in the NHISS DB, only female patients were included in this study (Figure 1). The demographic and categorical characteristics of the patients included in this study are shown in Tables 1 and 2, respectively.
The study protocol was approved by the Institutional Review Board of the Seoul National University Bundang Hospital (X-1707-411-903). All participants were anonymously registered using their ID numbers to secure their personal information. The IRB determined that the NHISS data were exempt from the requirement for consent.
2.2. Study Design
Data of patients with uterine and ovarian cancers used in this cohort study were obtained from the NHISS-derived longitudinal DB. Using the Korean Standard Classification of Disease and Cause of Death (http://kssc.kostat.go.kr/ksscNew_web/index.jsp) to select codes for gynaecological cancers, 1,434 patients with uterine cancer and 1,338 patients with ovarian cancer were extracted from this database. The following codes were used for uterine cancer: D06, carcinoma in situ of the cervix uteri; D06.0, carcinoma in situ of the endocervix; D06.1, carcinoma in situ of the exocervix; D06.7, carcinoma in situ of the other parts of the cervix; D06.9, carcinoma in situ of the cervix, unspecified; and D07.0, carcinoma in situ of the endometrium. The following codes were used for ovarian cancer: C56, malignant neoplasm of the ovary; C56.0, malignant neoplasm of the ovary, right; C56.1, malignant neoplasm of the ovary, left; and C56.9, malignant neoplasm of the ovary, unspecified side. The codes for the control groups were as follows: D26, other benign neoplasms of the uterus and D27, benign neoplasm of the ovary. The variables necessary to analyse the detailed statuses of patients regarding carcinogenesis were selected and subsequently analysed using Student’s t-tests, logistic regression analyses, and ROC curve analyses (Figure 1), while also considering benign diseases of each corresponding cancer group.
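The code-based group assignment described above can be sketched as follows. This is a minimal illustration in Python, not the authors' extraction procedure: the group names, the function names, and the assumption that each record carries a single diagnosis code string are all hypothetical. Only the code values themselves come from the text.

```python
from typing import Optional

# Diagnosis codes as listed in the study design; group names are hypothetical.
GROUP_CODES = {
    "uterine_cancer": ("D06", "D07.0"),
    "ovarian_cancer": ("C56",),
    "uterine_control": ("D26",),
    "ovarian_control": ("D27",),
}


def matches(code: str, base: str) -> bool:
    """True if code equals base or is a subcode of it (e.g. D06.1 under D06)."""
    return code == base or code.startswith(base + ".")


def classify(code: str) -> Optional[str]:
    """Return the study group for a diagnosis code, or None if it is not used."""
    code = code.strip().upper()
    for group, bases in GROUP_CODES.items():
        if any(matches(code, base) for base in bases):
            return group
    return None


if __name__ == "__main__":
    for example in ("D06.1", "D07.0", "D07.1", "C56.9", "D27"):
        print(example, classify(example))
```

Matching on the base code plus a dot keeps D07.0 from also capturing other D07 subcodes, which the study did not list.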
2.3. Variables Used in This Study
The following variables were used in this study: AGE, age (years); ANNUAL_HEALTH_CHECKUP, health check-up year (year); BMI, body mass index (kg/m²); SBP, systolic blood pressure (mmHg); DBP, diastolic blood pressure (mmHg); TC, total cholesterol (mg/dL); HMG, haemoglobin (g/dL); AST, aspartate aminotransferase, also called serum glutamic oxaloacetic transaminase (U/L); ALT, alanine aminotransferase, also called serum glutamic pyruvic transaminase (U/L); γ-glutamyl transpeptidase, gamma-glutamyl transpeptidase (U/L); EXERCI, frequency of moderate-intensity exercise per week (1, no; 2, 1–2 times; 3, 3–4 times; 4, 5–6 times; and 5, every day); WC, waist circumference (cm); VPA, frequency of high-intensity exercise for >20 min (1, 0 days/week; 2, 1 day/week; 3, 2 days/week; 4, 3 days/week; 5, 4 days/week; 6, 5 days/week; 7, 6 days/week; and 8, 7 days/week); MPA, frequency of moderate-intensity exercise for >30 min (1, 0 days/week; 2, 1 day/week; 3, 2 days/week; 4, 3 days/week; 5, 4 days/week; 6, 5 days/week; 7, 6 days/week; and 8, 7 days/week); LPA, frequency of light-intensity exercise (1, 0 days/week; 2, 1 day/week; 3, 2 days/week; 4, 3 days/week; 5, 4 days/week; 6, 5 days/week; 7, 6 days/week; and 8, 7 days/week); TUGT, timed up and go test, the time in seconds for standing up from a chair, walking 3 m, and returning to the same chair, used to measure basic mobility; GAIT, gait disability (1, yes and 2, no); UST, unipedal stance test, the standing time on one leg in seconds, used to assess static postural and balance control for monitoring neurological and musculoskeletal status and for managing fall risk; mixed UST, standing on one leg (1, with eyes closed and 2, with eyes open); and FALL, experiencing a fall within 6 months (1, yes and 2, no).
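Because several of the variables above are stored as integer codes rather than direct quantities, it may help to see the coding written out. The sketch below is illustrative only; the function and dictionary names are hypothetical and are not NHISS field names. It assumes the coding exactly as listed in the text.

```python
# Illustrative decoding of the coded questionnaire variables listed above.
# All names here are hypothetical; only the code-to-meaning mapping is from the text.

EXERCI_LABELS = {
    1: "none",
    2: "1-2 times/week",
    3: "3-4 times/week",
    4: "5-6 times/week",
    5: "every day",
}


def days_per_week(code: int) -> int:
    """Map a VPA, MPA, or LPA code (1-8) to days per week (0-7)."""
    if code not in range(1, 9):
        raise ValueError(f"unexpected frequency code: {code!r}")
    return code - 1


def is_yes(code: int) -> bool:
    """Map a GAIT or FALL code to a boolean (1 = yes, 2 = no)."""
    if code not in (1, 2):
        raise ValueError(f"unexpected yes/no code: {code!r}")
    return code == 1


if __name__ == "__main__":
    print(days_per_week(2))   # code 2 corresponds to 1 day/week
    print(is_yes(2))          # code 2 corresponds to "no"
    print(EXERCI_LABELS[3])
```

Note that the exercise codes start at 1 for "0 days/week", so a code of 2 in VPA corresponds to the "1 day per week" category discussed in the results.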
2.4. Statistical Analyses
All data are presented as the . Student’s t-tests and logistic regression analyses were used to evaluate differences between the gynaecological cancer and noncancer groups. The P values were not adjusted for multiple comparisons. ROC curve analyses were used to determine the optimal cutoff values for the significant variables identified via logistic regression analyses. SAS software (version 9.4; SAS Institute, Cary, NC, USA) was used for all statistical analyses. Statistical significance was set at .
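For readers unfamiliar with the group comparison named above, the following is a minimal sketch of the two-sample Student's t statistic with pooled variance. It is written in Python for illustration and is not the SAS procedure used in the study; the function name and the toy values are hypothetical, and the P value lookup is omitted.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic (equal variances assumed).

    Returns (t, degrees_of_freedom). Each group needs at least two values.
    """
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df


if __name__ == "__main__":
    # Toy values for illustration only, not study data.
    cases = [1.0, 2.0, 3.0]
    controls = [2.0, 3.0, 4.0]
    t, df = student_t(cases, controls)
    print(round(t, 4), df)
```

The statistic would then be compared against the t distribution with the returned degrees of freedom to obtain a P value.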
3. Results
3.1. Characteristics of Gynaecological Cancer and Nongynaecological Cancer Groups
Age, BMI, systolic and diastolic blood pressures, and waist circumference were significantly different between patients with uterine cancer and the control group. In contrast, gamma (γ)-glutamyl transpeptidase levels were significantly higher in patients with ovarian cancer than in patients with benign ovarian lesions. Significant differences in physical function-related variables (e.g., unipedal stance test and timed up and go test) were observed among patients with cancer (Table 3).
3.2. Logistic Regression Analyses of Nonphysical Activity-Related Variables
Logistic regression analyses showed cancer type-specific trends for age, BMI, aspartate aminotransferase, alanine aminotransferase, γ-glutamyl transpeptidase, and waist circumference.
Increased times in the timed up and go test and unipedal stance test were associated with a greater incidence of ovarian cancer. Significant differences in mixed unipedal stance test results were observed in patients with ovarian cancer (odds ratio (OR), 0.55; 95% confidence interval (CI), 0.40–0.78). Moreover, a significant difference was observed between patients with ovarian cancer and those with uterine cancer in terms of experiencing falls within 6 months. Patients with no falls during the preceding 6 months had a lower incidence of ovarian cancer (OR, 0.70) (Table 3).
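The odds ratios and confidence intervals reported here follow the usual relationship with a logistic regression coefficient: the OR is the exponential of the coefficient, and the 95% CI is the exponential of the coefficient plus or minus 1.96 standard errors. The sketch below illustrates that relationship in Python. The function names are hypothetical, the standard error used in the example is an assumed value rather than one reported by the study, and the Wald-type interval is an assumption about how the intervals were formed.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Convert a logistic regression coefficient and its standard error
    into (odds_ratio, lower_limit, upper_limit) using a Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Recover the standard error on the log-odds scale from CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


if __name__ == "__main__":
    # beta chosen so that OR = 0.55; se = 0.17 is an assumed value.
    or_, lo, hi = odds_ratio_ci(math.log(0.55), 0.17)
    print(f"OR {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the mixed unipedal stance test result, corresponds to a coefficient whose Wald interval excludes 0 on the log-odds scale.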
3.3. Results of Logistic Regression Analyses on Exercise-Dependent Modalities
Patients were asked about various exercise modalities with different intensities, durations, and frequencies (Table 4). Numbers next to the variables show a significant difference in the selected questionnaire. For example, the frequency of moderate-intensity exercise per week (1) indicates that 1 and 4 patients with uterine and ovarian cancers, respectively, responded to the questionnaire. Regarding the frequency of high-intensity exercise for >20 min in patients with uterine cancer, engagement in this exercise for 1 day per week demonstrated significant differences (OR, 1.55; 95% CI, 1.14–2.12). Regarding moderate-intensity exercise for >30 min in patients with uterine cancer, engagement in this exercise for 1 day per week demonstrated significant differences (OR, 1.76; 95% CI, 1.30–2.37). In patients with ovarian cancer, engagement in light-intensity exercise for 5 days per week was the only significant difference (OR, 1.61; 95% CI, 1.08–2.39).
3.4. ROC Curve Analyses of Gynaecological Cancer
For ovarian cancer, the cutoff value of γ-glutamyl transpeptidase was 14.50 U/L, with a sensitivity of 70.7% and specificity of 71%. The UST results showed an optimal cutoff value of 9.50 s, with a sensitivity of 94.9% and specificity of 96.9% (Table 5). The ROC curve analysis of the UST results for ovarian cancer is shown in Figure 2.
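One common way to obtain an "optimal" ROC cutoff of this kind is to maximise the Youden index, J = sensitivity + specificity - 1, over candidate thresholds. The sketch below illustrates that approach in Python. It is an assumption that a criterion of this type was used, since the text does not name one; the function name, the direction flag, and the toy data are hypothetical. Candidate thresholds are taken as midpoints between adjacent observed values, which is one way values such as 9.50 can arise from whole-number measurements.

```python
def youden_cutoff(values, labels, higher_is_positive=True):
    """Return (cutoff, sensitivity, specificity) maximising the Youden index.

    values: measurements; labels: 1 for cases, 0 for controls.
    higher_is_positive: whether values above the cutoff are classed as cases.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    unique = sorted(set(values))
    if len(unique) < 2:
        raise ValueError("need at least two distinct values")
    # Midpoints between adjacent observed values serve as candidate cutoffs.
    candidates = [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    best = None
    for cutoff in candidates:
        tp = tn = 0
        for value, label in zip(values, labels):
            predicted_case = (value > cutoff) == higher_is_positive
            if label == 1 and predicted_case:
                tp += 1
            elif label == 0 and not predicted_case:
                tn += 1
        sensitivity, specificity = tp / n_pos, tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Toy standing times in seconds, for illustration only (not study data).
    times = [5, 7, 8, 9, 10, 12, 14, 15]
    is_case = [1, 1, 1, 1, 0, 0, 0, 0]
    print(youden_cutoff(times, is_case, higher_is_positive=False))
```

With perfectly separated toy data the sketch returns the midpoint between the two groups; real data would give sensitivity and specificity below 100%, as in Table 5.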
4. Discussion
In this study, 1,434 patients with uterine cancer and 1,338 patients with ovarian cancer were enrolled from the retrospective cohort of 514,866 individuals registered in the NHISS DB. The regression model in this study was originally used to compare patients with and without a target cancer. Significant differences among the variables associated with cancer incidence were identified between uterine and ovarian cancers. Specifically, patients with ovarian cancer had higher levels of γ-glutamyl transpeptidase. Moreover, physical fitness functional test results reflected the inferior status of patients with cancer, compared with patients who did not have cancer. Physical disabilities increased the incidence of ovarian cancer (OR, 1.01). In contrast, patients with no falls during the preceding 6 months had a lower incidence of ovarian cancer (OR, 0.70). Furthermore, a 20 min intensive exercise session more than 1 time per week was associated with an increased incidence of uterine cancer. ROC curve analysis showed that the optimal cutoff value for one-leg standing time, a prognostic and preventive factor in ovarian cancer, was 9.50 s (sensitivity, 94.9%; specificity, 96.9%). These findings can aid in the design of preventive and care exercise interventions for gynaecological cancer.
4.1. Patterns Observed from the Obtained Results
The measurement of continuous variables revealed two patterns (Table 3). Age in uterine cancer and γ-glutamyl transpeptidase in ovarian cancer were identified as significant influencing factors in logistic regression analyses. Interestingly, γ-glutamyl transpeptidase levels were elevated in ovarian cancer patients who may be exposed to chemotherapy agents that affect liver function in the long term. Another possibility is that active cancer in the liver increased the levels of γ-glutamyl transpeptidase. For age, ROC curve analyses showed a significant cutoff value of 41.5 years for patients with uterine cancer. The other variables did not exhibit significant differences, as shown in Table 3. Logistic regression analyses in an important previous study showed that diastolic blood pressure and frequency of moderate-intensity exercise per week 2 (1–2 times per week) were significant influencing factors in patients with colorectal cancer. In general, obesity-related variables were found to be significant influencing factors. In our clinical experience, endometrioma markedly differs from ovarian cancer in terms of physical function, because patients with endometrioma tend to have obesity and decreased physical activity.
4.2. Obesity-Related Variables Identified in Previous Studies
The current study findings were consistent with a previous report regarding cancer levels classified according to the effects of obesity. In patients with cancer, leukaemia, pancreatic carcinoma, uterine carcinoma, and colorectal carcinoma contribute to gynaecological cancer, and obesity (BMI) is the presenting sign. The results reported by McTiernan et al. were consistent with the findings of the current study, such that overweight and obesity statuses were associated with the development of gynaecological cancer and elevated patient mortality. Among patients with ovarian cancer, 30% were overweight, and 12% were obese. According to an aetiological study at the molecular level, elevated levels of inflammatory factors (e.g., tumour necrosis factor-alpha, interleukins 1 and 6, and C-reactive protein) were observed in patients with cancer and were linked to reduced survival [13–15]. Although patients can survive cancer that is aggravated by severe obesity, they continue to experience obesity-related comorbidities such as hypertension, diabetes, osteoarthritis, and cardiopathy after cancer treatment [16, 17]. Patients who are overweight and obese are also likely to develop physical disabilities, which may affect their risk status.
4.3. Effects of Exercise-Related Interventions on Cancer Status
Concerning the effect of exercise on cancer status, few studies have investigated the appropriate intensity of exercise intervention for patients with cancer. The results of this study suggest that the modalities (e.g., intensity, duration, and frequency) of exercise intervention should depend on the cancer type. Notably, among patients with uterine cancer, engagement in high-intensity exercise for >20 min on 1 day per week had an OR of 1.55. Hence, 20 min of high-intensity exercise more than 1 time per week must be avoided to prevent uterine cancer. Overall, the findings suggest that >2 days of 30 min moderate-intensity exercise may be ideal for patients with uterine cancer (Table 4).
Furthermore, the UST could be a more appropriate measure for patients with ovarian cancer than for patients with uterine cancer (Tables 3 and 5). Falls in the preceding 6 months were much less common in patients with ovarian cancer (OR, 0.70; 95% CI, 0.55–0.89) than in patients with uterine cancer (OR, 1.49; 95% CI, 1.12–1.98). This may be attributable to many factors such as age, carcinogenesis causes, disease symptoms, and detailed body locations and functions, although the diseases are both classified as gynaecological cancers. The results of the physical function-related analyses were consistent in patients with ovarian cancer (Table 5). Moreover, the lack of physical capacity, as a cancer symptom, might increase the risk of falls, although this depends on the type of cancer.
In our clinical experience, patients with ovarian cancer receive comparatively more chemotherapy than other patients with cancer, contributing to reduced sensations in the feet. This is consistent with changes in walking-related variables, such as the TUGT and UST results. Patients with gynaecological cancer are trapped in a vicious cycle, whereby decreased sensation of balance during walking is caused by neuromuscular impairment, which then induces core muscle weakening, resulting in less participation in rehabilitation programs. Moreover, studies in the past decade have shown that the socioeconomic cost of injury due to falls is approximately 343,000,000,000 Korean won, and 32% of older patients experience falls [18–20].
Finally, ROC curve analyses (Table 5) showed that UST results had a significant optimal cutoff value of 9.5 s (sensitivity, 94.9%; specificity, 96.9%), which can be used to design exercise intervention programs for patients with ovarian cancer.
The limitations of this study include its cross-sectional design, which led to a greater emphasis on associations between significant variables (e.g., exercise duration and type) and cancer symptoms (carcinogenetic repression), rather than on aetiological understanding. However, aetiological associations can be identified using logistic regression analyses, as in this study. The significant findings from the regression models used in this study should be considered for practical applications.
The following novel findings were obtained from the NHISS DB analysis of patients with gynaecological cancer:
(1) Physical disabilities increased the incidence of ovarian cancer (OR, 1.01). Patients with no falls during the preceding 6 months had a lower incidence of ovarian cancer (OR, 0.70). Examination of four exercise modalities showed that a 20 min intensive exercise session (1 time per week) was associated with an increased incidence of uterine cancer
(2) ROC curve analyses showed that the optimal cutoff value for one-leg standing time, a prognostic and preventive factor in ovarian cancer, was 9.50 s (sensitivity, 94.9%; specificity, 96.9%)
Moreover, significant variables (e.g., age, waist circumference, BMI, and hepatic- and blood pressure-related variables) varied according to cancer type. Similarly, some variables exhibited a pattern such as those related to physical fitness function. Therefore, exercise interventions must be tailored to target cancer symptoms, with careful consideration of the significant variables identified in this large-scale NHISS DB study.
Publicly available datasets were analysed in this study. These data can be found at https://nhiss.nhis.or.kr.
The study protocol was approved by the Institutional Review Board of the Seoul National University Bundang Hospital (X-1707-411-903). All participants were anonymized (registered as their ID numbers to completely secure their personal information). The IRB determined that the NHISS data were exempt from the requirement for consent.
Conflicts of Interest
The authors declare no conflict of interest.
This work was supported by a 2020 Yeungnam University Research Grant and the SNUBH Research Fund (13-2017-017).
[1] N. Muñoz, F. X. Bosch, S. de Sanjosé et al., “Epidemiologic classification of human papillomavirus types associated with cervical cancer,” New England Journal of Medicine, vol. 348, no. 6, pp. 518–527, 2003.
[2] L. J. Moulton, P. G. Rose, and H. Mahdi, “Adverse post-operative outcomes in Jehovah's witnesses with gynecologic cancer within 30 days of surgery: A single institution review of 36 cases,” Gynecologic Oncology Case Reports, vol. 22, pp. 32–36, 2017.
[3] C. H. A. Lee, J. C. Kong, H. Ismail, B. Riedel, and A. Heriot, “Systematic review and meta-analysis of objective assessment of physical fitness in patients undergoing colorectal cancer surgery,” Diseases of the Colon and Rectum, vol. 61, no. 3, pp. 400–409, 2018.
[4] S. Kaul, J. C. Avila, D. Jupiter, A. M. Rodriguez, A. C. Kirchhoff, and Y. F. Kuo, “Modifiable health-related factors (smoking, physical activity and body mass index) and health care use and costs among adult cancer survivors,” Journal of Cancer Research and Clinical Oncology, vol. 143, no. 12, pp. 2469–2480, 2017.
[5] H. Jee, J. E. Chang, and E. J. Yang, “Positive prehabilitative effect of intense treadmill exercise for ameliorating cancer cachexia symptoms in a mouse model,” Journal of Cancer, vol. 7, no. 15, pp. 2378–2387, 2016.
[6] Y. Choi, J. H. Kim, K. B. Yoo et al., “The effect of cost-sharing in private health insurance on the utilization of health care services between private insurance purchasers and non-purchasers: a study of the Korean health panel survey (2008–2012),” BMC Health Services Research, vol. 15, no. 1, p. 489, 2015.
[7] H. Jee, H. D. Lee, and S. Y. Lee, “Evidence-based cutoff threshold values from receiver operating characteristic curve analysis for knee osteoarthritis in the 50-year-old Korean population: analysis of big data from the national health insurance sharing service,” BioMed Research International, vol. 2018, Article ID 2013671, 2018.
[8] J. B. Choi, E. J. Lee, K. D. Han, S. H. Hong, and U. S. Ha, “Estimating the impact of body mass index on bladder cancer risk: stratification by smoking status,” Scientific Reports, vol. 8, no. 1, p. 947, 2018.
[9] H. Jee and J. H. Kim, “Gender difference in colorectal cancer indicators for exercise interventions: the National Health Insurance Sharing Service-derived big data analysis,” Journal of Exercise Rehabilitation, vol. 15, pp. 811–818, 2018.
[10] E. K. Choi, H. B. Park, K. H. Lee et al., “Body mass index and 20 specific cancers: re-analyses of dose-response meta-analyses of observational studies,” Annals of Oncology, vol. 29, no. 3, pp. 749–757, 2018.
[11] A. McTiernan, M. Irwin, and V. Vongruenigen, “Weight, physical activity, diet, and prognosis in breast and gynecologic cancers,” Journal of Clinical Oncology, vol. 28, no. 26, pp. 4074–4080, 2010.
[12] S. V. Barrett, J. Paul, A. Hay, P. A. Vasey, S. B. Kaye, and R. M. Glasspool, “Does body mass index affect progression-free or overall survival in patients with ovarian cancer? Results from SCOTROC I trial,” Annals of Oncology, vol. 19, no. 5, pp. 898–902, 2008.
[13] R. T. Chlebowski, E. Aiello, and A. McTiernan, “Weight loss in breast cancer patient management,” Journal of Clinical Oncology, vol. 20, no. 4, pp. 1128–1143, 2002.
[14] B. L. Pierce, M. L. Neuhouser, M. H. Wener et al., “Correlates of circulating C-reactive protein and serum amyloid A concentrations in breast cancer survivors,” Breast Cancer Research and Treatment, vol. 114, no. 1, pp. 155–167, 2009.
[15] B. L. Pierce, R. Ballard-Barbash, L. Bernstein et al., “Elevated biomarkers of inflammation are associated with reduced survival among breast cancer patients,” Journal of Clinical Oncology, vol. 27, no. 21, pp. 3437–3444, 2009.
[16] E. Everett, H. Tamimi, B. Greer et al., “The effect of body mass index on clinical/pathologic features, surgical morbidity, and outcome in patients with endometrial cancer,” Gynecologic Oncology, vol. 90, no. 1, pp. 150–157, 2003.
[17] V. E. von Gruenigen, K. S. Courneya, H. E. Gibbons, M. B. Kavanagh, S. E. Waggoner, and E. Lerner, “Feasibility and effectiveness of a lifestyle intervention program in obese endometrial cancer patients: a randomized trial,” Gynecologic Oncology, vol. 109, no. 1, pp. 19–26, 2008.
[18] K. I. Kim, H. K. Jung, C. O. Kim et al., “Evidence-based guidelines for fall prevention in Korea,” Korean Journal of Internal Medicine, vol. 32, no. 1, pp. 199–210, 2017.
[19] J. Y. Lim, W. B. Park, M. K. Oh, E. K. Kang, and N. J. Paik, “Falls in a proportional region population in Korean elderly: incidence, consequences, and risk factors,” Journal of Korean Geriatric Society, vol. 14, no. 1, pp. 8–17, 2010.
[20] S. G. Lee and S. Kam, “Incidence and estimation of socioeconomic costs of falls in the rural elderly population,” Journal of Korean Geriatric Society, vol. 15, no. 1, pp. 8–19, 2011.