Background and Aims. Men who have sex with men (MSM) are at high risk of HIV infection. The nonhomologous end joining (NHEJ) pathway is the main mechanism of double-stranded DNA break (DSB) repair in higher eukaryotes and can repair DSBs promptly at any phase of the cell cycle. The NHEJ pathway has also been linked to HIV-1 infection, because DSBs arise in host genomic DNA during HIV-1 integration. The aim of the present investigation was to evaluate associations of single-nucleotide polymorphisms (SNPs) in NHEJ pathway genes with susceptibility to HIV-1 infection and AIDS progression among MSM residing in northern China. Methods. A total of 481 HIV-1 seropositive men and 493 HIV-1 seronegative men were included in this case-control study. Genotyping of 22 SNPs in NHEJ pathway genes was performed using the SNPscan™ Kit. Results. Positive associations were observed between XRCC6 rs132770 and XRCC4 rs1056503 genotypes and susceptibility to HIV-1 infection. In gene-gene interaction analysis, significant SNP-SNP interactions between XRCC6 and XRCC4 variants were found, suggesting a potential role in the risk of HIV-1 infection. In stratified analysis, XRCC5 rs16855458 was significantly associated with CD4+ T cell counts in AIDS patients, whereas LIG4 rs1805388 was linked to the clinical phase of AIDS patients. Conclusions. NHEJ gene polymorphisms can be considered risk factors for HIV-1 infection and AIDS progression in the northern Chinese MSM population.

1. Introduction

Acquired immune deficiency syndrome (AIDS), caused by infection with human immunodeficiency virus (HIV), is a chronic infectious disease and remains a major global public health issue. By the end of 2020, approximately 37.7 million people globally and 1.045 million people in China were living with HIV [1]. The proportion of infections transmitted through sexual behavior among men who have sex with men (MSM) has increased significantly, and this is now the dominant route of HIV infection. Differences among individuals in susceptibility to HIV infection and in clinical disease progression arise from differences in host genetic background [2]. The identification of AIDS-related genes carrying single-nucleotide polymorphisms (SNPs) is an important advance that can help to explore the role of host genetic background in HIV infection, reveal the pathogenesis of AIDS, predict the disease course, and develop new drugs and vaccines [3].

Double-stranded DNA breaks (DSBs) are one of the main causes of gene mutation and chromosome breakage and play an important role in tumorigenesis and tumor progression [4]. The nonhomologous end joining (NHEJ) pathway is the main mechanism of DSB repair (DSBR) in higher eukaryotes and can repair DSBs promptly at any phase of the cell cycle [5, 6]. There are five core genes in the NHEJ pathway (XRCC7, XRCC6, XRCC5, XRCC4, and LIG4), which encode five proteins (DNA-PK, Ku70, Ku80, XRCC4, and LIG4), respectively. Studies have shown that NHEJ gene polymorphisms are associated with susceptibility to, and progression of, a wide variety of cancers. For instance, XRCC7 gene polymorphisms play an important role in prostate cancer [7], bladder cancer [8], liver cancer [9], thyroid cancer [10], and lung cancer [11]. Polymorphisms in the other genes, XRCC4, XRCC5, XRCC6, and LIG4, are also associated with many different types of cancer [12-15].

It has been indicated that the NHEJ pathway is associated with HIV-1 infection because the DSB in host genome DNA occurs in the process of HIV-1 integration [16]. However, the role of SNPs in NHEJ genes and their importance in HIV-1 infection and AIDS progression remain unclear. In this study, we conducted a case-control study in the northern Han Chinese population to investigate associations of 22 SNPs in XRCC7, XRCC6, XRCC5, XRCC4, and LIG4 genes with the risk of HIV-1 infection and the progression of AIDS. Furthermore, a gene-gene interaction analysis was conducted to explore the role of combined effects of SNPs in the risk of HIV-1 infection.

2. Materials and Methods

2.1. Subjects

A total of 481 HIV-1 seropositive men and 493 healthy controls were selected for this study. The study participants were all of Han descent, and their families had lived in Harbin, Heilongjiang Province, in North China for at least three generations. No participants were genetically related within three generations.

481 HIV-1 seropositive men were recruited from Heilongjiang Center for Disease Control and Prevention. The age of the subjects ranged from 16 to 75 years (mean , ), and the average CD4+ T lymphocyte count at that time point was 335 cells/μl (range, 3-1038 cells/μl). All subjects had acquired HIV-1 infection through male-male homosexual transmission. These patients were categorized as category 1 (T cells/μl) or category 2 (T cells/μl) by the CD4+ T lymphocyte count and as category A (clinical phase III+IV) or category B (clinical phase I+II) by the clinical stage.

493 HIV-1 seronegative men age-matched to the HIV-1 patients were randomly selected as the control group from the comprehensive medical examination population of the Second Affiliated Hospital of Harbin Medical University. The age of the uninfected controls ranged from 16 to 75 years (mean , ). All participants provided informed consent approved by local ethics review board.

2.2. SNP Selection and Genotyping

22 candidate SNPs in NHEJ pathway genes were included in the present study. Among them, two SNPs (rs7830743 and rs7003908) were from XRCC7, four SNPs (rs132770, rs5751129, rs2267437, and rs132774) were from XRCC6, eight SNPs (rs828907, rs705649, rs16855458, rs3770502, rs9288516, rs3835, rs1051677, and rs2440) were from XRCC5, six SNPs (rs1056503, rs6869366, rs2075685, rs10040363, rs963248, and rs35268) were from XRCC4, and two SNPs (rs1805388 and rs1805389) were from LIG4.

Genomic DNA was extracted from 200 μl of peripheral blood of all participants using the QIAamp blood kit (Qiagen, Germany) according to the manufacturer’s protocol. All 22 SNPs were genotyped in 481 HIV-1-infected and 493 HIV-1-uninfected individuals using a custom-designed 48-Plex SNPscan™ Kit (supplied by Genesky Bio-technologies Inc., Shanghai, China), according to the method of high-throughput SNP genotyping utilizing double ligation and multiplex fluorescence PCR. For quality control, a 5% random sample of cases and controls was genotyped twice to verify the genotyping accuracy; the reproducibility was 100%.

2.3. Statistical Analysis

Genotype and allele frequencies were calculated by direct counting after the genotypes of the cases and controls had been determined. A chi-square test was used to examine deviation from Hardy-Weinberg equilibrium (HWE) for all SNPs in the control group, the association between genotype frequencies and susceptibility to HIV-1 infection, and the association between genotype frequencies and clinical features (such as the CD4+ T lymphocyte count and clinical stage) in the case group. Odds ratios (ORs) and 95% confidence intervals (95% CIs) were calculated to estimate the relative risk associated with SNPs. The generalized multifactor dimensionality reduction (GMDR) software [17] was applied to assess SNP-SNP interactions. SPSS 23.0 software (IBM-SPSS, Inc., Chicago, IL, USA) was used for all statistical analyses. The analyses of linkage disequilibrium (LD) and haplotype frequencies were performed using the HaploView software [18]. Differences with a P value less than 0.05 were considered statistically significant.
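The two basic calculations described above, the chi-square test for HWE and the OR with its 95% CI, can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the CI uses the Woolf (log-based) approximation, which is an assumption since the interval method is not stated here, and the counts in the usage lines are invented rather than taken from the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Assumes both alleles are present (no zero expected counts).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf 95% CI for a 2x2 table.

    a: exposed cases, b: unexposed cases,
    c: exposed controls, d: unexposed controls (all assumed > 0).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented counts, for illustration only.
chi2, p_value = hwe_chi_square(25, 50, 25)
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

An SNP would be treated as consistent with HWE when the returned P value is at or above the 0.05 threshold used in this study.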

3. Results

3.1. Hardy-Weinberg Equilibrium Test

The success rates of genotyping were >98% for all SNPs. As shown in Table 1, none of the 22 SNPs deviated from Hardy-Weinberg equilibrium in the control group (P > 0.05).

3.2. Associations of NHEJ Gene Polymorphisms with HIV-1 Infection

To explore the possible associations, the genotype distribution of 22 SNPs was investigated. Then, differences of genotype frequencies between cases and controls were analyzed under three genetic models (codominant model, dominant model, and recessive model). As shown in Figure 1, a significant association was found for XRCC6 rs132770 under codominant (, , 95% CI: 2.000-55.251) and recessive (, , 95% CI: 1.986-54.933) genetic models. In addition, the genotype TT of XRCC4 rs1056503 showed significant association with increased susceptibility of HIV-1 infection in the codominant model (TT vs. GG, , , 95% CI: 1.037-2.779) and recessive model (TT vs. TG+GG, , , 95% CI: 1.060-2.750). However, no association with HIV-1 infection was observed in any genetic model for the remaining 20 SNPs ().
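The three genetic models differ only in how the three genotype counts are collapsed into a 2x2 comparison. The sketch below illustrates that collapsing step under stated assumptions: the function names are hypothetical, the codominant comparison shown is variant homozygote versus reference homozygote (as in "TT vs. GG"), and the counts in the usage lines are invented rather than taken from this study.

```python
def collapse(counts, model):
    """Collapse genotype counts into (exposed, unexposed) for one model.

    counts = (reference homozygotes, heterozygotes, variant homozygotes),
    e.g. (GG, TG, TT) for a G>T SNP.
    """
    ref_hom, het, var_hom = counts
    if model == "dominant":       # TG+TT vs. GG
        return het + var_hom, ref_hom
    if model == "recessive":      # TT vs. TG+GG
        return var_hom, het + ref_hom
    if model == "codominant":     # TT vs. GG (heterozygotes compared separately)
        return var_hom, ref_hom
    raise ValueError(f"unknown genetic model: {model}")

def model_odds_ratio(case_counts, control_counts, model):
    """Odds ratio for the chosen genetic model (all cells assumed > 0)."""
    a, b = collapse(case_counts, model)      # exposed / unexposed cases
    c, d = collapse(control_counts, model)   # exposed / unexposed controls
    return (a * d) / (b * c)

# Invented counts in the order (GG, TG, TT), for illustration only.
cases = (50, 30, 20)
controls = (60, 30, 10)
for model in ("codominant", "dominant", "recessive"):
    print(model, round(model_odds_ratio(cases, controls, model), 3))
```

An SNP whose effect appears only in the codominant and recessive comparisons, as reported for rs132770 and rs1056503, is one where risk is concentrated in the variant homozygotes.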

3.3. Analysis of the SNP-SNP Interaction

The GMDR method was used to study the association of 10 SNPs in XRCC6 and XRCC4 genes with high-order interactions on HIV-1 infection. Through a 10-fold cross-validation, the best four-locus model involving XRCC6 (rs2267437) and XRCC4 (rs10040363, rs963248, and rs1056503) was identified (Figure 2). In order to obtain the ORs for joint effects of the four SNPs on HIV-1 infection, traditional statistical methods were applied to this four-locus model to aid in interpretation, which identified three significant genotype combinations from all possible high-risk genotype combinations. In this four-locus (rs1056503-rs2267437-rs10040363-rs963248) model, the ORs for three significant high-risk genotype combinations (TT)-(CC)-(AG/GG)-(TC/CC), (TT)-(CC)-(AA)-(TC/CC), and (TT)-(CC)-(AA)-(TT) were 6.667 (), 7.333 (), and 6.667 (), respectively (Table 2).
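The core idea behind this kind of analysis can be sketched with the classic multifactor dimensionality reduction step, in which each multilocus genotype combination is pooled into a high-risk or low-risk group. This is a simplified stand-in, not the GMDR software used in the study: GMDR works on score-based statistics and the analysis here used 10-fold cross-validation, both of which are omitted. The function names and the toy data are hypothetical.

```python
from collections import Counter

def mdr_high_risk_cells(genotypes, labels):
    """Label multilocus genotype combinations as high-risk.

    genotypes: list of tuples, one multilocus genotype per subject.
    labels: list of 1 (case) or 0 (control); both groups must be present.
    A combination is high-risk when its case:control ratio exceeds
    the overall case:control ratio of the sample.
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    high_risk = set()
    for cell in set(cases) | set(controls):
        n_ca, n_co = cases[cell], controls[cell]
        # A cell observed only in cases is treated as high-risk.
        if n_co == 0 or n_ca / n_co > threshold:
            high_risk.add(cell)
    return high_risk

def balanced_accuracy(genotypes, labels, high_risk):
    """Mean of sensitivity and specificity of the high/low-risk rule."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    tp = sum(1 for g, y in zip(genotypes, labels) if y == 1 and g in high_risk)
    tn = sum(1 for g, y in zip(genotypes, labels) if y == 0 and g not in high_risk)
    return 0.5 * (tp / n_cases + tn / n_controls)

# Toy two-locus data, invented for illustration only.
toy_genotypes = [("TT", "CC")] * 3 + [("GG", "GC")] + [("TT", "CC")] + [("GG", "GC")] * 3
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_high_risk = mdr_high_risk_cells(toy_genotypes, toy_labels)
toy_accuracy = balanced_accuracy(toy_genotypes, toy_labels, toy_high_risk)
```

In a full analysis the labeling would be fitted on training folds and the accuracy evaluated on held-out folds, with the best model chosen by testing accuracy and cross-validation consistency.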

3.4. Analysis of Haplotype Associations

LD between SNPs in NHEJ genes was analyzed using HaploView software. There was strong LD among four SNPs in XRCC6 gene, eight SNPs in XRCC5 gene, six SNPs in XRCC4 gene, and two SNPs in LIG4 gene, respectively. There were no significant differences in frequencies of all haplotypes between HIV-1-infected cases and healthy controls (). Table 3 shows all blocks and haplotypes identified and the frequencies of these haplotypes.
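For reference, the pairwise LD measures usually reported in this kind of analysis, D' and r², can be computed from haplotype and allele frequencies as sketched below. The sketch assumes the two-SNP haplotype frequency is already known; HaploView estimates it from unphased genotypes, a step not reproduced here. The function name and the example frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2.
    p_a, p_b: frequencies of alleles A and B (both SNPs assumed polymorphic).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared

# Invented frequencies, for illustration only.
print(ld_measures(0.5, 0.5, 0.5))    # complete LD
print(ld_measures(0.25, 0.5, 0.5))   # linkage equilibrium
```

Values of D' close to 1 between neighboring SNPs are what define the strong-LD blocks within which haplotypes are compared between cases and controls.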

3.5. Associations of NHEJ Gene SNPs with CD4+ T Cell Count and Clinical Phase in AIDS Patients

To investigate the relationship between NHEJ gene polymorphisms and AIDS progression, differences in allele frequencies were analyzed between subgroups of HIV-1-infected cases which were divided using CD4+ T lymphocyte count and clinical stage as indicators, respectively. The CD4+ T cell counts of HIV-1-infected cases ranged from 3 to 1038 cells/μl (, ). The associations between SNPs and CD4+ T cell counts were used to assess the influence of gene polymorphisms on the immunity status of patients. As shown in Table 4, there were significant differences in genotype frequencies between different subgroups of cases for XRCC5 rs16855458 and LIG4 rs1805388 (). In detail, the subjects with AA or AC of rs16855458 have a significantly lower CD4+ T lymphocyte count, compared to subjects with CC genotype (, , 95% CI: 1.054-2.243). The subjects with AA or AG of rs1805388 have a later clinical stage of AIDS, compared to subjects with GG genotype (, , 95% CI: 1.027-2.209). However, other SNPs were not associated with the CD4+ T lymphocyte count and clinical stages (). These results suggested that rs16855458 and rs1805388 were associated with the clinical features and progression of AIDS in the northern Chinese population.

4. Discussion

According to the molecular mechanism of HIV-1 infection, viral DNA is inserted into the host genomic DNA during HIV-1 integration. Integration effectively introduces DSBs into the genomic DNA of host cells, and the resulting damage-repair signal activates the NHEJ pathway. For example, the DNA-PK protein interacts with HIV-1 Tat to regulate HIV-1 replication and transcription [19, 20]. We therefore hypothesized that the NHEJ genes are involved in HIV-1 infection and disease progression. To the best of our knowledge, this is the first study to systematically evaluate the association between polymorphisms in NHEJ genes and both susceptibility to HIV-1 infection and the progression of AIDS.

In this study, differences in the genotype frequencies of XRCC6 rs132770 and XRCC4 rs1056503 were found between cases and controls under different genetic models. Our results imply a positive association of SNPs in NHEJ genes with susceptibility to HIV-1 infection in the northern Chinese MSM population. The XRCC6 gene encodes the Ku70 protein, which functions as a single-stranded DNA- and ATP-dependent helicase and may be involved in the repair of nonhomologous DNA ends, such as that required for DSB repair. The Ku70 protein also interacts with HIV-1 integrase and facilitates viral integration and replication during HIV-1 infection [21, 22]. Given that rs132770 is located in the XRCC6 promoter, close to the translation start site, one possible explanation for the positive association is that rs132770 affects the expression of Ku70 mRNA; alternatively, rs132770 may be in strong linkage with functional variants that contribute to the etiology of HIV-1 infection. Similar to our findings, it has been reported that different XRCC6 genotypes may contribute to susceptibility to another disease related to viral infection, namely, hepatocellular carcinoma (HCC) [23-25].

The XRCC4 gene encodes the XRCC4 protein, which can activate and enhance the activity of the LIG4 protein and plays an important role in the NHEJ repair pathway [26]. Recently, XRCC4 SNPs have been reported to be associated with the risk of a variety of diseases. For example, one study found that XRCC4 mutations may cause microcephalic primordial dwarfism [27]. Several other studies have shown that SNPs in the XRCC4 gene could affect the susceptibility to and progression of virus-related HCC [28-30]. Our study indicated that XRCC4 rs1056503 was associated with HIV-1 infection, which is consistent with the above reports. rs1056503 is located in the 5′ regulatory region of the XRCC4 gene and may therefore alter the mRNA expression level and XRCC4 protein function. Functional changes in the XRCC4 protein may in turn affect NHEJ processes in DSBR. Further experimental assays should be performed to test these hypotheses. In addition, in the analysis of SNP-SNP interaction, our results provide evidence for a four-locus interaction between XRCC6 and XRCC4 variants in the risk of HIV-1 infection and further highlight the role of multilocus effects in the genetic component of HIV-1 infection.

As an indicator of AIDS clinical characteristics, the CD4+ T cell count reflects the number of immune cells in patients. According to the World Health Organization (WHO), AIDS patients with a CD4+ T cell count below 350 cells/μl should be given antiretroviral therapy or other treatments [31-33]. In the present study, we found a significant difference in the frequencies of XRCC5 rs16855458 genotypes between the two subgroups of cases, where genotypes AA and AC were associated with lower numbers of CD4+ T cells. These results suggest that XRCC5 rs16855458 is involved in the progression of AIDS. The XRCC5 gene encodes the Ku80 protein, which forms the Ku heterodimer with the Ku70 protein. Functional studies showed that changes in the expression level of the Ku80 protein are the main reason for tumor development and can be used as a predictor of patient survival as well as treatment outcome [34, 35]. During HIV-1 infection, the XRCC5 gene is closely related to HIV-1 integration and translation [36-38]. We propose that rs16855458, located in an XRCC5 intron, may regulate the transcription and expression of XRCC5 through alternative splicing; the gene product interacts with HIV-1 to promote its integration and translation, leading to a decrease in the CD4+ T lymphocyte count and accelerated progression of AIDS. Similar to our findings, polymorphisms of the XRCC5 gene have also been reported to be associated with virus-related HCC [24].

In this study, the HIV-1 seropositive cases were divided into two subgroups based on clinical stage, which is a clinical feature of AIDS and directly reflects disease progression. Patients in phases I and II have mild clinical symptoms and may show only HIV-1 antibody positivity. In contrast, patients in phases III and IV have serious clinical symptoms caused by the loss of immune function, such as nervous system lesions, continuous fever and diarrhea, sepsis, and various kinds of tumors, and should promptly be given antiretroviral therapy or other treatments. Our results revealed a significant difference in the genotype frequencies of LIG4 rs1805388 between MSM cases in clinical phase I+II and those in clinical phase III+IV, with the AA/AG genotypes associated with a later clinical stage of AIDS. The LIG4 gene encodes the LIG4 protein, which joins the DSB ends and completes NHEJ repair. Previous studies have shown that LIG4 gene polymorphisms are associated with many clinical features of lung and ovarian cancer, such as treatment outcome, progression-free survival, and overall survival [39, 40]. Mutations in the LIG4 gene can lead to immune defects and can also cause severe combined immunodeficiency disease [41]. rs1805388 is located in an exon of the LIG4 gene and is a missense variant involving a threonine-to-isoleucine substitution. Here, we propose that this association arises from functional changes in the LIG4 protein caused by the genetic variant, which directly affect the clinical stage of AIDS.

Several limitations of this study should be considered. First, there is a lack of information on critical factors in MSM cases, including history of injection drug use, clinical data on viral loads, and other clinical manifestations. Second, cases and controls were not exposed to the same conditions, because we could not collect samples of healthy MSM controls due to privacy regulations.

For future studies, we recommend that the findings of this study should be expanded to other ethnic groups in different regions in the world, beyond the northern Chinese Han population.

5. Conclusions

The study confirmed that NHEJ gene polymorphisms played an important role in HIV-1 infection and AIDS progression among MSM populations in northern China. Our study opens a new field for further investigation of underlying functional mechanisms of the association between NHEJ gene polymorphisms and HIV-1/AIDS.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval

This study was reviewed and approved by the Ethics Committee of the Harbin Medical University (No. HMUIRB20180019), and all experimental procedures complied with the Declaration of Helsinki.

All participants gave written informed consent to take part in the present study after the nature of study had been fully explained.


This article is based on a previously available preprint: “Associations of the Polymorphisms of the NHEJ Pathway Genes with HIV-1 Infection and Aids Progression among Men Who Have Sex with Men in Northern China” [42].

Conflicts of Interest

The authors declare no conflict of interest.

Authors’ Contributions

Xuelong Zhang and Xi Wang contributed equally to this work.


This work was funded by the National Natural Science Foundation of China (grant number 81373220) and the Postdoctoral Foundation of Hei Long Jiang Province (grant numbers LRB 08-340 and LBH-Q11029). We gratefully acknowledge the numerous sample donors for making this work possible.