Objectives. To analyze the clinical features, outcomes, and risk factors of patients with diffuse large B-cell lymphoma (DLBCL) in China, with the aim of establishing a new prognostic model based on these risk factors. Methods. Clinical features and outcomes of 564 patients newly diagnosed with DLBCL from Jan 2009 to May 2017 were analyzed retrospectively. Variables were screened by LASSO regression, and a nomogram was constructed. Results. The 5-year overall survival (OS) of the cohort was 75%. The 5-year OS of patients stratified by International Prognostic Index (IPI) score was 90% (score 0–2), 73% (score 3), and 51% (score 4-5), respectively. Age > 60, Eastern Cooperative Oncology Group (ECOG) score > 1, Ann Arbor stage III-IV, bone marrow involvement, low albumin (ALB), and low lymphocyte/monocyte ratio (LMR) were independent predictors of OS. The predictive model was developed based on age, bone marrow involvement, LMR, ALB, and ECOG score. The predictive ability of the model (AUC, 0.77) was better than that of the IPI (AUC, 0.74) and NCCN-IPI (AUC, 0.69). The 5-year OS of patients in the low-, intermediate-, and high-risk groups identified by the new predictive model was 89%, 70%, and 33%, respectively. Conclusions. The new prediction model had better predictive performance and could better identify high-risk patients.

1. Introduction

Diffuse large B-cell lymphoma (DLBCL) is the most common histological subtype of non-Hodgkin lymphoma (NHL), accounting for approximately 25% of NHL cases [1]. The disease is aggressive and requires aggressive medical intervention after diagnosis. The International Prognostic Index (IPI) [2] has played an important role in determining the prognosis of patients with DLBCL over the past two decades. With the addition of rituximab to the CHOP or CHOP-like regimen, the prognosis of patients in each risk group according to IPI improved. New prognostic scoring systems, such as R-IPI [3] and NCCN-IPI [4], have emerged to better discriminate the survival of patients with DLBCL.

With the innovation of immunohistochemistry and molecular examination techniques and the optimization of treatment strategies, we need more accurate prognostic models to identify very high-risk DLBCL patients or biological heterogeneity to guide individualized treatment. The application of Lasso regression [5] facilitated the selection of variables, and the use of nomograms [6] proved to be of better predictive value. In this study, we analyzed the clinical characteristics of patients with DLBCL and explored the factors influencing survival, screened variables through Lasso regression, and constructed a nomogram to stratify the prognosis of patients with DLBCL.

2. Patients and Methods

2.1. Patients

From Jan 2009 to May 2017, a total of 564 patients newly diagnosed with DLBCL according to the WHO classification [7] by three specialized pathologists were included. Patients who did not have complete clinical and immunohistochemical data or who were diagnosed but not treated in our hospital were excluded (n = 65). Because primary central nervous system lymphomas are highly heterogeneous entities, we also excluded them from our study (n = 41). The baseline data collected included gender, age, B symptoms, performance status, number of extranodal sites involved, presence of bulky disease (≥7.5 cm), Ann Arbor stage, cell of origin, lactate dehydrogenase (LDH), albumin (Alb), white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), absolute lymphocyte count (ALC), absolute monocyte count (AMC), the ratio of lymphocytes to monocytes (LMR), and D-dimer. All patients in the cohort were routinely evaluated by lumbar puncture before treatment. Bone marrow examination was performed before treatment to determine whether there was bone marrow involvement, and efficacy was assessed by whole-body computed tomography (CT) or positron emission tomography/computed tomography (PET-CT).

2.2. Treatment, Follow-Up, and Outcome

The first-line therapy for DLBCL patients was the R-CHOP or an R-CHOP-like regimen [8]. Intravenous methotrexate (1 g/m2, four times) [9] was added as central nervous system (CNS) prophylaxis in patients at high risk of CNS involvement. Patients older than 70 years were divided into three groups (fit, unfit, and frail) according to the comprehensive geriatric assessment (CGA) [10] and were treated with R-CHOP, R-mini-CHOP [11], and R2 (rituximab 375 mg/m2 on day 1, lenalidomide 25 mg on days 1–14) regimens, respectively. Follow-up was conducted by telephone, by consulting medical records, or through our electronic follow-up system. By the final follow-up date of May 1, 2022, 61 (10.8%) patients had been lost to follow-up. Overall survival (OS) was calculated from the time of diagnosis to death from any cause or the last follow-up.

2.3. Model Foundation and Validation

Of the original dataset, 66.7% (n = 376) was randomly selected as the training cohort and the remainder (n = 188) as the validation cohort. Univariate and multivariate Cox regression analyses were performed to screen potential variables associated with OS. The selected significant variables (P value <0.5) were then entered into the least absolute shrinkage and selection operator (LASSO) regression algorithm, and a predictive nomogram was constructed. The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of the model. Calibration curves were drawn to determine whether the predicted and actual survival probabilities were consistent. The total score for each patient in the external validation cohort was calculated using the nomogram and used as an independent factor in Cox regression analysis.
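The random 2:1 allocation described above can be illustrated with a minimal sketch. The function name `split_cohort`, the use of integer patient identifiers, and the fixed random seed are assumptions made for illustration; the software actually used for the allocation is not stated in this section.

```python
import random


def split_cohort(patient_ids, train_fraction=2 / 3, seed=0):
    """Randomly split patient IDs into training and validation lists."""
    ids = list(patient_ids)
    # A fixed seed is an assumption here; it only makes the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..563 stand in for the 564 patients.
train_ids, validation_ids = split_cohort(range(564))
print(len(train_ids), len(validation_ids))  # 376 188
```

With 564 patients and a training fraction of two thirds, this reproduces the cohort sizes reported above (376 and 188).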

2.4. Statistical Analysis

Patient survival was analyzed using Kaplan–Meier survival curves, and differences between groups were compared by the log-rank test. GraphPad Prism 9.0 and R statistical software 4.1.3 (https://www.r-project.org/) were used to perform the statistical analyses. A P value <0.05 was considered statistically significant.
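For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. This is an illustrative implementation on hypothetical toy data, not the GraphPad Prism or R code used in the study; the function name `kaplan_meier` and the toy follow-up times are assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    survival = 1.0
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months; not taken from the study cohort.
toy_times = [6, 12, 12, 20, 30, 45, 60, 60]
toy_events = [1, 1, 0, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"{t:>3} months: S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why loss to follow-up affects the precision of the late part of the estimate.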

3. Results

3.1. Clinical Characteristics

The baseline clinical characteristics of the entire cohort are shown in Table 1. Of the patients included in this study, 282 (50.0%) were male. The median age of all patients was 58 years (range: 15–90). The age distribution at diagnosis is shown in Figure 1(a). More patients had the non-GCB subtype (50.7%) and advanced-stage disease (72.0%). Fewer patients had bulky disease (13.7%), bone marrow involvement (12.6%), or CNS involvement (6.4%). The gastrointestinal tract (GIT) was the most common site of primary extranodal DLBCL, accounting for 15.8% (89/564) of DLBCL cases (9.2% stomach and 6.6% intestine). Primary breast involvement ranked second (14, 2.5%), followed by the thyroid gland (12, 2.1%), testis (11, 2.0%), female genital system (9, 1.6%), bone (9, 1.6%), and others. Figure 1(b) shows the site distribution of extranodal lymphomas. No significant difference in clinical characteristics was found between the training and validation cohorts.

3.2. Outcome

At the last follow-up, 18.6% (105/564) of patients had been lost to follow-up and 22.5% (127/564) had died. The 5-year OS of patients with DLBCL was 75% (Figure 2(a)). Patients with the GCB subtype (Figure 2(b)), Ann Arbor stage I-II (Figure 2(c)), ECOG score 0-1 (Figure 2(d)), and fewer extranodal involvement sites (Figure 2(g)) had better clinical outcomes. The overall survival of patients with bone marrow (Figure 2(e)) or CNS (Figure 2(f)) involvement was significantly lower than that of patients without. Patients with lower Alb or LMR levels had lower 5-year OS rates (59% vs. 82% and 67% vs. 81%, respectively) (Figures 2(h) and 2(i)).

3.3. Risk Factors

The results of the univariate and multivariate analyses for patients with DLBCL are presented in Table 2. In the univariate analysis, gender, age, ECOG, B symptoms, Ann Arbor stage, IPI score, cell of origin, BM involvement, CNS involvement, extranodal sites, LDH level, LMR, and Alb were significantly associated with survival. However, in the multivariate analysis, age > 60 (HR: 2.086, 95% CI: 1.371–3.175), ECOG > 1 (HR: 2.666, 95% CI: 1.790–3.970), Ann Arbor stage III-IV (HR: 1.857, 95% CI: 1.035–3.333), BM involvement (HR: 2.024, 95% CI: 1.413–2.898), low LMR (HR: 1.605, 95% CI: 1.128–2.283), and low Alb (HR: 1.548, 95% CI: 1.088–2.202) were independent risk factors for OS in patients with DLBCL.
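The hazard ratios and confidence intervals above can be related to each other with a short worked calculation. The sketch below assumes that the intervals are Wald-type intervals, symmetric on the log scale (the usual output of Cox regression software, but not stated explicitly in the text); the function name `log_hr_summary` is hypothetical.

```python
import math


def log_hr_summary(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log hazard ratio, its standard error, and the Wald z statistic.

    Assumes the 95% CI was computed as exp(log(HR) +/- 1.96 * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return {
        "log_hr": log_hr,
        "se": se,
        "z": log_hr / se,
        # Under the assumption above, the HR is the geometric mean of the bounds.
        "ci_midpoint_hr": math.sqrt(ci_low * ci_high),
    }


# Age > 60 from the multivariate analysis: HR 2.086, 95% CI 1.371-3.175.
age_summary = log_hr_summary(2.086, 1.371, 3.175)
print({name: round(value, 3) for name, value in age_summary.items()})
```

For age > 60, the geometric mean of the interval bounds agrees with the reported hazard ratio to within rounding, which is consistent with the Wald-type assumption.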

3.4. Parameter Selection

A total of 16 candidate parameters (age, gender, cell of origin, Ki-67, bulky disease, BM involvement, CNS involvement, B symptoms, LDH, D-dimer, Alb, LMR, ECOG, Ann Arbor stage, extranodal sites, and HBsAg) in the training cohort were screened using LASSO Cox regression (Figures 3(a) and 3(b)). Finally, five factors (age, BM involvement, LMR, Alb, and ECOG performance status) were independently associated with the prognosis of patients with DLBCL and were included in the subsequent nomogram.

3.5. Construction and Validation of the Predictive Nomogram

The predictive model (Figure 3(c)) was constructed from the 5 factors identified by LASSO regression. In this model, ECOG performance status ≥2 was assigned the highest score of 100, and age > 60 y, bone marrow involvement, low LMR, and low Alb were assigned 86, 67, 58, and 28 points, respectively. The AUC of the nomogram was 0.77 (95% CI: 0.70–0.82), and the calibration curves of the nomogram showed good agreement between the predicted OS rates and the observed outcomes (Figures 3(d) and 3(e)).
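The point assignments above translate directly into a total score per patient. The following minimal sketch uses the five point values given in the text; the key names and the function `nomogram_total` are hypothetical, and the risk-group cutoffs are deliberately omitted because they are not reported numerically in this section.

```python
# Points per risk factor, as reported for the nomogram in the text.
NOMOGRAM_POINTS = {
    "ecog_ge_2": 100,       # ECOG performance status >= 2
    "age_gt_60": 86,        # age > 60 years
    "bm_involvement": 67,   # bone marrow involvement
    "low_lmr": 58,          # low lymphocyte/monocyte ratio
    "low_alb": 28,          # low albumin
}


def nomogram_total(**risk_factors):
    """Sum the points for every risk factor that is present (True)."""
    unknown = set(risk_factors) - set(NOMOGRAM_POINTS)
    if unknown:
        raise ValueError(f"unknown risk factors: {sorted(unknown)}")
    return sum(
        points
        for name, points in NOMOGRAM_POINTS.items()
        if risk_factors.get(name, False)
    )


# Hypothetical patient: older than 60 with low albumin, no other risk factors.
example_total = nomogram_total(age_gt_60=True, low_alb=True)
print(example_total)  # 114
```

A patient with all five risk factors would score 339 points, the maximum under these assignments.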

3.6. Comparison with Current Prognostic Scoring Systems

The nomogram showed better accuracy in predicting 5-year survival in both cohorts than the IPI, NCCN-IPI, NPI [12], and Kyoto-index [13] (Figures 4(a) and 4(b)). The AUC of the predictive model (0.77, 95% CI: 0.70–0.82) in the training cohort was higher than that of the IPI (0.74, 95% CI: 0.67–0.80), NCCN-IPI (0.69, 95% CI: 0.62–0.76), NPI (0.70, 95% CI: 0.63–0.76), and Kyoto-index (0.69, 95% CI: 0.62–0.76). Likewise, the AUCs of the IPI (0.72, 95% CI: 0.65–0.82) and NCCN-IPI (0.73, 95% CI: 0.65–0.82) in the validation cohort were lower than that of the predictive model.
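The AUC values compared above can be read as the probability that a randomly chosen patient who died has a higher risk score than a randomly chosen survivor. The sketch below computes this quantity for a binary outcome on hypothetical toy data; it ignores censoring, so it is a simplification and not necessarily the estimator used in the study. The function name `auc` and the toy scores are assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of concordant (event, non-event) pairs.

    scores : risk score per patient (higher means higher predicted risk)
    labels : 1 if the event (death) occurred, 0 otherwise
    Ties in score count as half-concordant.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical nomogram totals and outcomes; not taken from the study cohort.
toy_scores = [0, 28, 86, 114, 158, 186, 253, 339]
toy_died = [0, 0, 0, 1, 0, 1, 1, 1]
toy_auc = auc(toy_scores, toy_died)
print(toy_auc)  # 0.9375
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect ranking, which gives a scale for the differences between 0.77, 0.74, and 0.69 reported above.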

We then classified all patients into low-, intermediate-, and high-risk groups based on the scores generated by the nomogram. The cutoff values were determined by X-tile software (Figure 5(a)). The 5-year OS of patients stratified by International Prognostic Index (IPI) score was 90% (score 0–2), 73% (score 3), and 51% (score 4-5), respectively (Figure 5(c)). The 5-year OS of patients in the low-, intermediate-, and high-risk groups identified by the new predictive model was 89%, 70%, and 33%, respectively (Figure 5(b)).

To illustrate the relationship between IPI scores and the new model's risk classification of DLBCL patients, a Sankey diagram was constructed (Figure 6). Using the new model, we further divided the patients classified as high risk by the IPI (score 4-5) into two subgroups: 117 patients in the non-high-risk group and 44 in the high-risk group. The baseline clinical characteristics of the two subgroups are shown in Table 3.

4. Discussion

To our knowledge, our cohort had the best clinical outcome among the reported studies with the same sample size of patients in general hospitals in China.

The median age of patients with DLBCL in our study was 58 years, which was consistent with the data reported by other research centers in Asia [14, 15], but lower than those reported in other continents [16, 17]. Compared with other studies [13, 18–27], especially in cancer hospitals in China [6, 12], our cohort had a higher proportion of patients with advanced stage and combined B symptoms, which indicated that patients in our center have a heavier burden of disease. Primary extranodal DLBCL can originate from almost any part of the body, and the most common site of involvement in our cohort was the gastrointestinal tract. In addition, the involvement of the mammary gland, thyroid gland, and testis also occupied a large portion, which was consistent with previously reported data [28–30].

Multicenter data showed that the 5-year OS of DLBCL is about 64% in the rituximab era [30–32], while the survival of our cohort was better. This may be due to the availability of more new drugs, improvements in supportive care, and appropriate adjustments in treatment regimens. Previous prediction models [2, 4] have shown that age, stage, ECOG PS, bone marrow involvement, and number of extranodal sites are important prognostic factors, and the data in our study were consistent with this. In addition, the non-GCB [33] pathological subtype was a predictor of poor prognosis (5-year OS 73%).

Albumin levels are commonly used in lymphoma studies. Decreased albumin levels indicate poor nutritional status of the patient or tumor-related consumption. However, studies have shown that low Alb may be driven more by proinflammatory status [34] or increased cytokine release [35] than by nutritional status. Biccler et al. [20] considered Alb a predictor of poor clinical outcomes for patients with DLBCL. Similarly, our data suggested that patients with low Alb have significantly worse survival than other patients. In addition, McMillan et al. [36] considered albumin levels a good predictor of disease progression. Patients with low albumin levels, especially older patients, were more prone to coinfection, which was also associated with a worse prognosis [37].

The role of LMR, an easily available biomarker, in predicting the survival of patients with DLBCL has been increasingly emphasized. Absolute monocyte count is positively associated with the number of tumor-associated macrophages, which in turn is associated with a worse prognosis of DLBCL [38]. A low absolute lymphocyte count suggests poor immune status and is associated with poor prognosis in patients with DLBCL. Therefore, a lower LMR predicts worse clinical outcomes [39]. However, there is no uniform standard for the optimal cutoff value of LMR, and a meta-analysis of patients with DLBCL showed that LMR cutoff values ranged from 1.6 to 4 [40]. In our study, the cutoff determined by ROC curve analysis was 2.5.
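Because LMR is simply the ratio of two routine blood counts, the classification used in this study can be sketched as follows. The cutoff of 2.5 is taken from the text; treating values strictly below the cutoff as "low", the function names, and the example blood counts are assumptions made for illustration.

```python
LMR_CUTOFF = 2.5  # cutoff reported in the text (determined by ROC curve)


def lymphocyte_monocyte_ratio(alc, amc):
    """Ratio of absolute lymphocyte count to absolute monocyte count.

    Both counts must be in the same unit (for example, 10^9 cells/L).
    """
    if amc <= 0:
        raise ValueError("absolute monocyte count must be positive")
    return alc / amc


def is_low_lmr(alc, amc, cutoff=LMR_CUTOFF):
    # Assumption: "low" means strictly below the cutoff; the text does not
    # say how a value exactly equal to the cutoff is classified.
    return lymphocyte_monocyte_ratio(alc, amc) < cutoff


# Hypothetical blood counts in 10^9 cells/L.
print(is_low_lmr(alc=1.2, amc=0.6))  # ratio 2.0, classified as low
print(is_low_lmr(alc=1.8, amc=0.5))  # ratio 3.6, not classified as low
```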

Survival of patients with DLBCL has greatly improved in the last 20 years, but survival in high-risk patients remains poor. We developed a new prediction model to better distinguish high-risk patients. Starting from the original IPI and NCCN-IPI, we removed some variables and added Alb and LMR. After verification with internal and external data, the predictive model we developed proved to have good predictive performance, and it had better predictive power than the IPI and NCCN-IPI. High-risk patients identified by our model had a worse prognosis.

The main limitation of our study was that it is a single-center retrospective study and its results may not be fully applicable to all patients with DLBCL. In addition, selection bias was difficult to avoid. The model we developed also needs to be validated by larger samples and external study cohorts.

In summary, we analyzed the clinical features of patients with DLBCL in our center and found better survival than that reported in previous multicenter data. We then constructed a new model with better predictive performance by identifying prognostic risk factors, which may help clinicians to better predict clinical outcomes for patients in the rituximab era.

Data Availability

The data supporting the findings of this study are included within the article.

Ethical Approval

All procedures followed were in accordance with the Helsinki Declaration. The study was approved by the Institutional Review Board of Peking Union Medical College Hospital.

Informed consent was obtained from all patients for being included in the study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Wei Zhang and Dao-bin Zhou conceptualized the study. Yan Zhang and Wei Wang performed data curation. Jinrong Zhao performed formal analysis, investigated, validated, and visualized the study, developed methodology, collected resources, developed software, and wrote the original draft of the manuscript. Dao-bin Zhou acquired funding and supervised the study. Wei Zhang reviewed and edited the article.


Acknowledgments

The authors thank the patients and their families, and Ms. Zhao Feiyan for her help in using the software. This work was supported by the National Natural Science Foundation of China (81970188).