Computational and Mathematical Methods in Medicine

Research Article | Open Access

Volume 2016 | Article ID 2460184 | 10 pages | https://doi.org/10.1155/2016/2460184

A Canonical Correlation Analysis of AIDS Restriction Genes and Metabolic Pathways Identifies Purine Metabolism as a Key Cooperator

Academic Editor: Maria N. D. S. Cordeiro
Received 05 Mar 2016
Accepted 08 Jun 2016
Published 04 Jul 2016

Abstract

Human immunodeficiency virus (HIV) causes a severe disease in humans, referred to as acquired immune deficiency syndrome (AIDS). Studies on the interaction between host genetic factors and the virus have revealed dozens of genes that impact diverse processes in AIDS. To resolve more genetic factors related to AIDS, a canonical correlation analysis was used to determine the correlation between AIDS restriction gene and metabolic pathway gene expression. The results show that HIV-1 postentry cellular viral cofactors from AIDS restriction genes are coexpressed in human transcriptome microarray datasets. Further, the purine metabolism pathway comprises novel host factors that are coexpressed with AIDS restriction genes. Canonical correlation analysis of expression data is a reliable approach to exploring the mechanism underlying AIDS.

1. Introduction

Human immunodeficiency virus (HIV) is the basis for acquired immune deficiency syndrome (AIDS) pathogenesis and destroys the lymphoid system through prodigious replication, which reduces a patient’s ability to survive. Since HIV was identified in the 1980s, this pathogen has taken more than 10 million lives throughout the world. Over the past few decades, researchers have accumulated considerable information on HIV immunology, virology, host genetics, and treatment.

Human genetics research on HIV infection has progressed considerably since the initiation of the Human Genome Project (HGP), which set out to sequence and map the entire human genome, both physically and functionally [1]. Many host genetic factors that influence AIDS epidemiological heterogeneity have been characterized [2–4]. From the HIV entry receptor on lymphoid cells to oncogenes in human glioblastomas, AIDS restriction genes (ARGs) are widely involved in biological pathways, and nearly 40 ARGs have been studied in depth through functional analyses [5–12]. Host genomic analysis is a key approach to studying AIDS epidemiology [13].

Further, genome, transcriptome, proteome, and metabolome biodatasets related to HIV have grown exponentially due to advanced sequencing technology. However, an integrative study on these datasets is limited in terms of understanding the complicated biological network.

Recent studies have revealed that metabolic pathways exert certain effects on the control of AIDS disease progression [14]. For example, the oxygen concentration can modulate T-cell differentiation through controlling metabolic status [15]. Metabolizing ATP to adenosine inhibits HIV-specific effector cells. Further, HIV infection is affected by dNTP hydrolysis. Efficient HIV-1 infection of CD4(+) lymphocytes requires sufficient glucose uptake via the Glut1 glucose transporter [16]. Tryptophan and phenylalanine metabolism also play an important role in HIV because HIV pathophysiology is associated with inflammatory stress due to dysregulated amino acid metabolism [17]. The HIV protein NEF impacts lipid-related metabolism through impairing cholesterol metabolism in both infected and bystander cells [18, 19]. This evidence suggests that cross talk between AIDS and the host metabolism is an important research topic that is necessary to resolve the disease mechanism and aid in therapy. Integrating biodatasets with an in-depth analysis of host AIDS restriction genes and metabolic pathways is imperative.

In the transcriptome, gene coexpression is a model for understanding how individual genes are correlated under certain conditions [20, 21]. Based on advances in this field, researchers hypothesize that the coexpression of genes in two pathways indicates an integrative correlation between those molecular pathways. Complete gene sets for the metabolic pathways are available for the human genome. Identifying correlations between a group of metabolic pathway genes and ARGs is therefore a more comprehensive means of understanding integrative biodatasets. However, traditional methods using a Pearson or partial correlation are only suitable for comparing single genes. A canonical correlation analysis (CCA) is an efficient and powerful approach for measuring coexpression between two sets of genes. A Childhood Asthma Management Program (CAMP) study using a CCA successfully detected genetic regulatory variants [22]. Using the CCA, the glioblastoma transcriptomes of 45 patients were thoroughly analyzed to identify the glioma pathway genes [23].

In this paper, we used a CCA to analyze coexpression between ARGs and metabolic pathways from KEGG. We discuss the most important metabolic pathways coexpressed with the ARGs, which may imply strategies for AIDS diagnosis and therapy.

2. Methods

2.1. Datasets

Human genome expression datasets were downloaded from the COXPRESdb website (http://coxpresdb.jp/), which contains approximately 4000 experiments and expression data on 20,000 human genes. Metabolic pathway genes were downloaded from KEGG (http://www.kegg.jp/); this dataset includes 129 typical metabolic pathways with predicted genes. The ARGs were collected from published literature. Two expression datasets were generated, one containing ARG expression data and one containing metabolic pathway gene expression data (Tables 1 and 2).


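Table 1

| Gene symbol | Gene ID | Effect |
|---|---|---|
| APOBEC3B | 9582 | Increase infection |
| APOBEC3G | 60489 | Accelerates AIDS |
| CCL11 | 6356 | |
| CCL17 | 6361 | |
| CCL18 | 6362 | |
| CCL2 | 6347 | |
| CCL4 | 6351 | |
| CCL5 | 6352 | |
| CUL5 | 8065 | Accelerates CD4 loss |
| CXCR1 | 3577 | |
| CXCR6 | 10663 | Accelerates AIDS |
| DC-SIGN | 30835 | Decreases infection |
| DEFB1 | 1672 | |
| GML | 2765 | |
| HCP5 | 10866 | HIV set point |
| HLA-A | 3105 | Delays AIDS |
| HLA-B | 3106 | Delays AIDS |
| HLA-C | 3107 | Delays AIDS |
| IDH1 | 3417 | Prevents infection |
| IFENG | 3458 | Accelerates AIDS |
| IL10 | 3586 | Accelerates AIDS |
| IL4 | 3565 | |
| IRF1 | 3659 | |
| KIR | 2669 | Delays AIDS |
| LY6D | 8581 | |
| MYH9 | 4627 | End stage renal disease |
| NCOR2 | 9612 | Increase infection |
| PECI/ECI2 | 10455 | Accelerates AIDS |
| PPIA/CypA | 5478 | Accelerates AIDS |
| PROX1 | 5629 | Delays AIDS progression |
| SDF1/CXCL12 | 6387 | Delays AIDS |
| Slurp1 | 57152 | |
| Slurp2/Ly6 | 6004 | |
| TLR4 | 7099 | |
| TLR8 | 51311 | |
| TLR9 | 54106 | |
| TRIM5a | 85363 | Increase infection |
| TSG101 | 7251 | Accelerates AIDS |
| ZNRD1 | 30834 | |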


Table 2

| Pathway name | KEGG ID | Class of metabolism pathway | Gene number |
|---|---|---|---|
| Glycolysis/gluconeogenesis | 10 | Carbohydrate metabolism | 67 |
| Citrate cycle (TCA cycle) | 20 | Carbohydrate metabolism | 31 |
| Pentose phosphate pathway | 30 | Carbohydrate metabolism | 29 |
| Pentose and glucuronate interconversions | 40 | Carbohydrate metabolism | 34 |
| Fructose and mannose metabolism | 51 | Carbohydrate metabolism | 36 |
| Galactose metabolism | 52 | Carbohydrate metabolism | 30 |
| Ascorbate and aldarate metabolism | 53 | Carbohydrate metabolism | 27 |
| Starch and sucrose metabolism | 500 | Carbohydrate metabolism | 56 |
| Amino sugar and nucleotide sugar | 520 | Carbohydrate metabolism | 49 |
| Pyruvate metabolism | 620 | Carbohydrate metabolism | 42 |
| Glyoxylate and dicarboxylate metabolism | 630 | Carbohydrate metabolism | 24 |
| Propanoate metabolism | 640 | Carbohydrate metabolism | 32 |
| Butanoate metabolism | 650 | Carbohydrate metabolism | 29 |
| Inositol phosphate metabolism | 562 | Carbohydrate metabolism | 61 |
| Oxidative phosphorylation | 190 | Energy metabolism | 133 |
| Nitrogen metabolism | 910 | Energy metabolism | 27 |
| Sulfur metabolism | 920 | Energy metabolism | 18 |
| Fatty acid biosynthesis | 61 | Lipid metabolism | 6 |
| Fatty acid elongation | 62 | Lipid metabolism | 23 |
| Fatty acid metabolism | 71 | Lipid metabolism | 44 |
| Ketone bodies | 72 | Lipid metabolism | 9 |
| Steroid biosynthesis | 100 | Lipid metabolism | 18 |
| Primary bile acid biosynthesis | 120 | Lipid metabolism | 17 |
| Steroid hormone biosynthesis | 140 | Lipid metabolism | 56 |
| Glycerolipid metabolism | 561 | Lipid metabolism | 55 |
| Glycerophospholipid metabolism | 564 | Lipid metabolism | 91 |
| Ether lipid metabolism | 565 | Lipid metabolism | 42 |
| Sphingolipid metabolism | 600 | Lipid metabolism | 47 |
| Arachidonic acid metabolism | 590 | Lipid metabolism | 68 |
| Linoleic acid metabolism | 591 | Lipid metabolism | 33 |
| Alpha-linolenic acid metabolism | 592 | Lipid metabolism | 25 |
| Biosynthesis of unsaturated fatty acids | 1040 | Lipid metabolism | 21 |
| Purine metabolism | 230 | Nucleotide metabolism | 173 |
| Pyrimidine metabolism | 240 | Nucleotide metabolism | 107 |
| Alanine, aspartate, and glutamate metabolism | 250 | Amino acid metabolism | 32 |
| Glycine, serine, and threonine metabolism | 260 | Amino acid metabolism | 37 |
| Cysteine and methionine metabolism | 270 | Amino acid metabolism | 34 |
| Valine, leucine, and isoleucine degradation | 280 | Amino acid metabolism | 44 |
| Valine, leucine, and isoleucine biosynthesis | 290 | Amino acid metabolism | 2 |
| Lysine biosynthesis | 300 | Amino acid metabolism | 2 |
| Lysine degradation | 310 | Amino acid metabolism | 49 |
| Arginine and proline metabolism | 330 | Amino acid metabolism | 57 |
| Histidine metabolism | 340 | Amino acid metabolism | 28 |
| Tyrosine metabolism | 350 | Amino acid metabolism | 39 |
| Phenylalanine metabolism | 360 | Amino acid metabolism | 18 |
| Tryptophan metabolism | 380 | Amino acid metabolism | 40 |
| Phenylalanine, tyrosine, and tryptophan biosynthesis | 400 | Amino acid metabolism | 5 |
| Beta-alanine metabolism | 410 | Metabolism of other amino acids | 29 |
| Taurine and hypotaurine metabolism | 430 | Metabolism of other amino acids | 10 |
| Selenocompound metabolism | 450 | Metabolism of other amino acids | 17 |
| Cyanoamino acid metabolism | 460 | Metabolism of other amino acids | 7 |
| D-Glutamine and D-glutamate metabolism | 471 | Metabolism of other amino acids | 4 |
| D-Arginine and D-ornithine metabolism | 472 | Metabolism of other amino acids | 1 |
| Glutathione metabolism | 480 | Metabolism of other amino acids | 51 |
| N-Glycan biosynthesis | 510 | Glycan biosynthesis and metabolism | 49 |
| Mucin type O-glycan biosynthesis | 512 | Glycan biosynthesis and metabolism | 31 |
| Other types of O-glycan biosynthesis | 514 | Glycan biosynthesis and metabolism | 30 |
| Glycosaminoglycan biosynthesis, chondroitin sulfate/dermatan sulfate | 532 | Glycan biosynthesis and metabolism | 20 |
| Glycosaminoglycan biosynthesis, heparan sulfate/heparin | 534 | Glycan biosynthesis and metabolism | 24 |
| Glycosaminoglycan biosynthesis, keratan sulfate | 533 | Glycan biosynthesis and metabolism | 15 |
| Glycosaminoglycan degradation | 531 | Glycan biosynthesis and metabolism | 19 |
| Glycosylphosphatidylinositol- (GPI-) anchor biosynthesis | 563 | Glycan biosynthesis and metabolism | 25 |
| Glycosphingolipid biosynthesis, lacto- and neolactoseries | 601 | Glycan biosynthesis and metabolism | 26 |
| Glycosphingolipid biosynthesis, globoseries | 603 | Glycan biosynthesis and metabolism | 14 |
| Glycosphingolipid biosynthesis, ganglioseries | 604 | Glycan biosynthesis and metabolism | 15 |
| Other glycan degradation | 511 | Glycan biosynthesis and metabolism | 18 |
| Thiamine metabolism | 730 | Metabolism of cofactors and vitamins | 4 |
| Riboflavin metabolism | 740 | Metabolism of cofactors and vitamins | 13 |
| Vitamin B6 metabolism | 750 | Metabolism of cofactors and vitamins | 6 |
| Nicotinate and nicotinamide metabolism | 760 | Metabolism of cofactors and vitamins | 28 |
| Pantothenate and CoA biosynthesis | 770 | Metabolism of cofactors and vitamins | 17 |
| Biotin metabolism | 780 | Metabolism of cofactors and vitamins | 3 |
| Lipoic acid metabolism | 785 | Metabolism of cofactors and vitamins | 3 |
| Folate biosynthesis | 790 | Metabolism of cofactors and vitamins | 14 |
| One carbon pool by folate | 670 | Metabolism of cofactors and vitamins | 20 |
| Retinol metabolism | 830 | Metabolism of cofactors and vitamins | 68 |
| Porphyrin and chlorophyll metabolism | 860 | Metabolism of cofactors and vitamins | 43 |
| Ubiquinone and other terpenoid-quinone biosynthesis | 130 | Metabolism of cofactors and vitamins | 10 |
| Terpenoid backbone biosynthesis | 900 | Metabolism of terpenoids and polyketides | 21 |
| Caffeine metabolism | 232 | Biosynthesis of other secondary metabolites | 7 |
| Butirosin and neomycin biosynthesis | 524 | Biosynthesis of other secondary metabolites | 5 |
| Metabolism of xenobiotics by cytochrome P450 | 980 | Xenobiotics biodegradation and metabolism | 80 |
| Drug metabolism, cytochrome P450 | 982 | Xenobiotics biodegradation and metabolism | 74 |
| Drug metabolism, other enzymes | 983 | Xenobiotics biodegradation and metabolism | 51 |

2.2. Canonical Correlation Analysis

To analyze the correlations between ARG and metabolic pathway gene expression, we used a CCA, which integrates multiple correlations into a few significant correlations. This statistical method calculates the correlation between two sets of variables and generates statistically independent pairs of new variables, which are referred to as canonical variables. The linear combination of the variables creates a component of the canonical variable pair in each group of the original variables.

In this study, the variables were defined as follows: ARG expression is described by a vector of the expression values of the ARGs, and metabolic pathway gene expression is described by a vector of the expression values of the metabolic pathway genes. The respective sets of canonical variables, C and P, result from linear combinations of ARG and metabolic pathway gene expression. Each ARG canonical variable C is the linear combination of the original ARG expression vector with a vector of canonical coefficients. Likewise, each metabolic pathway canonical variable P is the linear combination of the original metabolic pathway gene expression vector with its own canonical coefficient vector. The ARG and metabolic pathway gene variance-covariance matrices can be used to estimate the canonical correlation coefficients.

The magnitude of the correlation between each pair of canonical variables is given by the eigenvalues. The canonical coefficients are contained in the corresponding eigenvectors and can be used to compute the canonical variables. The within-group variance-covariance matrices contain the variances and covariances among the ARGs and among the metabolic pathway genes, respectively. The covariances between variables were calculated from the variance-covariance matrices.
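For reference, the standard CCA formulation that this description follows can be written out explicitly. The symbols $X$ (ARG expression vector), $Y$ (metabolic pathway gene expression vector), $a_i$, $b_i$ (canonical coefficient vectors), and $\Sigma$ (variance-covariance blocks) are assumed notation introduced here for illustration; only the names $C_i$ and $P_i$ for the canonical variables follow the pair labels used in the Results.

```latex
C_i = a_i^{\top} X, \qquad P_i = b_i^{\top} Y, \qquad
\rho_i = \operatorname{corr}(C_i, P_i)
       = \frac{a_i^{\top}\Sigma_{XY}\,b_i}
              {\sqrt{a_i^{\top}\Sigma_{XX}\,a_i}\;\sqrt{b_i^{\top}\Sigma_{YY}\,b_i}}

\Sigma_{XX}^{-1}\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}\,a_i = \rho_i^{2}\,a_i, \qquad
\Sigma_{YY}^{-1}\Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\,b_i = \rho_i^{2}\,b_i
```

In this textbook form, the eigenvalues are the squared canonical correlations $\rho_i^2$ and the eigenvectors are the coefficient vectors; whether the authors used exactly this parameterization is not stated in the text.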

2.3. The Study Design and Software Tools

The canonical correlation analysis was performed using the R platform (http://www.r-project.org/). After the canonical variables were generated from the expression datasets composed of ARGs and metabolic pathway genes, we set an absolute correlation of 0.15 as the threshold for selecting ARGs correlated with the canonical variables. To select metabolic pathway genes correlated with the canonical variables, we sorted the genes by the absolute value of their correlation, and the top 50 were selected for further enrichment analyses. Functional annotations were generated and enrichment analyses were performed for the metabolic pathway genes using the web-based DAVID tool (http://david.abcc.ncifcrf.gov/). For the pathway enrichment analyses, the “KEGG_PATHWAY” category was selected. Pathways with a P value < 0.01 were considered significant.
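The two selection rules described above can be sketched in a few lines of Python. This is only an illustration of the thresholding and ranking logic, not the authors' R code; the function names are hypothetical, and the cutoff is treated as inclusive because genes reported at exactly 0.15 are listed as selected in the Results.

```python
def select_args(correlations, threshold=0.15):
    """Keep ARGs whose absolute correlation with a canonical variable reaches the threshold."""
    # Inclusive comparison is an assumption: values printed as 0.15 are reported as selected.
    return {gene: r for gene, r in correlations.items() if abs(r) >= threshold}


def top_genes(correlations, n=50):
    """Rank genes by absolute correlation and return the n strongest gene names."""
    ranked = sorted(correlations, key=lambda g: abs(correlations[g]), reverse=True)
    return ranked[:n]


# A few C1 correlations taken from Section 3.2.1 and Table 3 (illustrative subset only).
c1 = {"PPIA": 0.42, "ZNRD1": 0.37, "NCOR2": -0.31, "GML": -0.17, "CUL5": 0.15, "IL10": -0.02}

selected = select_args(c1)      # drops IL10, keeps the other five
strongest = top_genes(c1, n=2)  # the two largest absolute correlations
```

The same `top_genes` helper would apply to the metabolic pathway genes with `n=50`, after which the selected gene list is passed to an enrichment tool.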

3. Results

3.1. The ARGs and Metabolic Pathway Genes
3.1.1. The General CCA Results

Using the CCA, eight significant canonical correlations (Wilks’ lambda test) were discerned between the ARG and metabolic pathway gene transcriptomes. The ARG canonical variables explained 60% of the total ARG expression variance, and the significant metabolic pathway canonical variables explained 38% of the metabolic gene transcriptome variation. Thus, ARG-metabolic pathway associations accounted for a substantial proportion of the total variance. The first pair of canonical variables had a correlation of 0.99, while the second pair of canonical variables had a correlation of 0.98.
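As a rough sketch of how a Wilks' lambda test on canonical correlations is usually set up, the statistic for the k-th test is the product of (1 − r²) over the k-th and all later correlations, and Bartlett's chi-square approximation is one common way to turn it into a test. The text does not say which approximation was used, so the approximation, the function names, and the example numbers below are all assumptions.

```python
import math


def wilks_lambda(canonical_corrs):
    """Return one Wilks' lambda per sequential test: product of (1 - r^2) over correlations k..s."""
    lambdas = []
    for k in range(len(canonical_corrs)):
        lam = 1.0
        for r in canonical_corrs[k:]:
            lam *= 1.0 - r * r
        lambdas.append(lam)
    return lambdas


def bartlett_chi2(lam, n_samples, p, q, k):
    """Bartlett's chi-square approximation; k is the number of correlations already removed."""
    stat = -(n_samples - 1 - (p + q + 1) / 2.0) * math.log(lam)
    dof = (p - k) * (q - k)
    return stat, dof


# Hypothetical example: two canonical correlations, 100 samples, 2 and 3 variables per set.
lams = wilks_lambda([0.9, 0.5])
stat, dof = bartlett_chi2(lams[0], n_samples=100, p=2, q=3, k=0)
```

A P value would then come from the upper tail of a chi-square distribution with `dof` degrees of freedom, for example via `scipy.stats.chi2.sf(stat, dof)` if SciPy is available.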

3.2. Relationships between the Canonical Variables and Original Genes
3.2.1. Pair 1 (C1, P1)

As shown in Table 3, the canonical variable C1 explains 2.4% of the variability in the original ARG expression variables. C1 correlated positively (absolute value > 0.15) with PPIA (0.42), ZNRD1 (0.37), MYH9 (0.36), TSG101 (0.31), IDH1 (0.28), TRIM5a (0.17), and CUL5 (0.15), and negatively with GML (−0.17) and NCOR2 (−0.31). The greatest positive correlation was observed between C1 and PPIA. In contrast, the greatest negative correlation was observed between C1 and NCOR2. Of the seven ARGs with positive correlations, four (PPIA, TSG101, TRIM5a, and CUL5) are postentry cellular viral cofactors.


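Table 3

| Gene symbol | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 |
|---|---|---|---|---|---|---|---|---|
| DEFB1 | 0.01 | 0.01 | 0.02 | 0.15 | 0.06 | −0.02 | −0.14 | −0.04 |
| KIR | 0.08 | −0.02 | −0.05 | −0.01 | −0.14 | 0.08 | 0.17 | −0.12 |
| GML | 0.17 | 0.16 | 0.12 | −0.06 | 0.21 | −0.06 | 0.03 | 0.07 |
| HLA-A | 0.10 | −0.14 | −0.01 | 0.00 | −0.03 | 0.22 | −0.13 | −0.09 |
| HLA-B | 0.10 | 0.09 | 0.12 | 0.12 | 0.41 | 0.25 | 0.07 | 0.31 |
| HLA-C | 0.07 | 0.26 | −0.04 | −0.08 | 0.21 | 0.33 | 0.22 | 0.21 |
| IDH1 | 0.28 | 0.17 | 0.22 | 0.17 | 0.60 | 0.20 | 1.12 | 0.63 |
| IFENG | 0.00 | 0.03 | 0.08 | 0.07 | 0.05 | −0.09 | −0.12 | −0.07 |
| IL4 | −0.12 | 0.18 | 0.08 | 0.05 | 0.01 | −0.10 | 0.15 | 0.30 |
| CXCR1 | −0.07 | 0.25 | 0.20 | 0.00 | 0.17 | 0.40 | −0.14 | 0.08 |
| IL10 | −0.02 | −0.05 | 0.02 | 0.13 | 0.05 | −0.04 | −0.05 | −0.01 |
| IRF1 | 0.07 | −0.09 | 0.08 | 0.10 | 0.23 | 0.24 | −0.14 | 0.24 |
| MYH9 | 0.36 | 0.17 | 0.21 | −0.14 | 0.49 | 0.50 | 0.14 | 0.19 |
| PPIA/CypA | 0.42 | 0.92 | 1.88 | 0.58 | 0.54 | 1.11 | 0.00 | 1.12 |
| PROX1 | −0.14 | 0.07 | 0.03 | 0.16 | 0.23 | 0.02 | 0.55 | 0.66 |
| Slurp2/Ly6 | −0.04 | −0.03 | 0.00 | 0.10 | −0.13 | 0.12 | 0.09 | 0.02 |
| CCL2 | 0.02 | −0.03 | −0.04 | 0.00 | −0.05 | 0.12 | 0.08 | 0.00 |
| CCL4 | 0.02 | −0.06 | 0.02 | 0.09 | 0.02 | 0.05 | 0.04 | −0.06 |
| CCL5 | 0.03 | −0.06 | 0.01 | 0.03 | 0.00 | 0.02 | −0.14 | 0.05 |
| CCL11 | 0.02 | −0.05 | 0.01 | 0.02 | 0.09 | −0.01 | 0.25 | −0.05 |
| CCL17 | 0.00 | −0.06 | 0.05 | 0.07 | 0.06 | −0.02 | −0.19 | −0.06 |
| CCL18 | 0.03 | −0.04 | 0.00 | 0.06 | 0.05 | 0.14 | 0.09 | 0.05 |
| SDF1/CXCL12 | 0.09 | −0.10 | 0.17 | 0.18 | 0.26 | 0.14 | 0.09 | 0.22 |
| TLR4 | 0.06 | −0.05 | −0.02 | 0.24 | 0.15 | 0.26 | 0.00 | 0.02 |
| TSG101 | 0.31 | 0.48 | 0.25 | −0.05 | 0.17 | 0.49 | 0.54 | 1.03 |
| CUL5 | 0.15 | 0.51 | 0.87 | 0.23 | 0.19 | 0.40 | 0.80 | −0.04 |
| LY6D | 0.01 | −0.05 | 0.10 | 0.24 | 0.10 | 0.13 | 0.01 | −0.08 |
| APOBEC3B | 0.04 | 0.03 | 0.04 | 0.03 | 0.15 | −0.12 | 0.02 | 0.14 |
| NCOR2 | 0.31 | 0.28 | 0.37 | 0.27 | 1.10 | 0.24 | 0.38 | 0.52 |
| PECI/ECI2 | 0.09 | 0.15 | 0.24 | 0.01 | −0.10 | 0.06 | 0.35 | 0.25 |
| CXCR6 | −0.06 | −0.10 | 0.02 | −0.06 | 0.03 | 0.18 | −0.07 | 0.09 |
| HCP5 | −0.04 | 0.02 | −0.01 | −0.01 | 0.32 | 0.01 | 0.02 | −0.01 |
| ZNRD1 | 0.37 | 0.14 | 0.28 | −0.03 | 0.32 | 0.91 | −0.04 | 0.10 |
| DC-SIGN | −0.04 | 0.29 | 0.13 | 0.00 | 0.17 | 0.03 | 0.22 | 0.33 |
| TLR8 | 0.13 | 0.36 | −0.03 | 0.30 | 0.33 | 0.70 | −0.15 | 0.17 |
| TLR9 | −0.03 | 0.18 | 0.11 | −0.06 | −0.02 | 0.17 | −0.24 | 0.36 |
| Slurp1 | −0.03 | −0.10 | 0.19 | 0.61 | 0.21 | 0.32 | 0.04 | −0.06 |
| APOBEC3G | 0.06 | 0.17 | −0.14 | 0.11 | 0.19 | 0.07 | 0.44 | 0.39 |
| TRIM5a | 0.17 | −0.13 | 0.15 | 0.00 | 0.26 | 0.30 | 0.22 | 0.53 |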
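One conventional way to obtain the "variability explained" by a canonical variable is the mean of the squared correlations between that variable and each original gene. The sketch below applies this definition to the second column of Table 3 as printed; it gives roughly 5%, in line with the 5.3% reported for the second ARG canonical variable. That the authors computed the figure this way is an assumption, agreement for other columns may be looser, and the function name is hypothetical.

```python
def variance_explained(loadings):
    """Mean squared loading: share of standardized variance captured by one canonical variable."""
    return sum(r * r for r in loadings) / len(loadings)


# Second column of Table 3, one value per ARG, in table order (39 genes).
c2_column = [
    0.01, -0.02, 0.16, -0.14, 0.09, 0.26, 0.17, 0.03, 0.18, 0.25,
    -0.05, -0.09, 0.17, 0.92, 0.07, -0.03, -0.03, -0.06, -0.06, -0.05,
    -0.06, -0.04, -0.10, -0.05, 0.48, 0.51, -0.05, 0.03, 0.28, 0.15,
    -0.10, 0.02, 0.14, 0.29, 0.36, 0.18, -0.10, 0.17, -0.13,
]

share_c2 = variance_explained(c2_column)  # a fraction between 0 and 1
```

Because the values are squared, the sign of each table entry does not affect the result.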

As shown in Table 4, the canonical variable P1 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that correlated with P1 were enriched for purine metabolism; these genes include phosphodiesterase 4C (5143), polymerase (RNA) III (DNA directed) polypeptide K (51728), and primase (5558).


Table 4

| Component | Term | Count | Pop hits | P value | Genes |
|---|---|---|---|---|---|
| 1+ | Purine metabolism | 3 | 153 | | 5143, 51728, 5558 |
| 3+ | Glycolysis/gluconeogenesis | 3 | 60 | | 5223, 2597, 57818 |
| 3+ | Pyrimidine metabolism | 3 | 95 | | 5425, 51727, 7372 |
| 4− | Purine metabolism | 4 | 153 | | 1716, 51728, 55703, 5313 |
| 4+ | Purine metabolism | 5 | 153 | | 55811, 5147, 5425, 5432, 8654 |
| 5− | Inositol phosphate metabolism | 3 | 54 | | 8871, 5330, 3707 |
| 6− | Pyrimidine metabolism | 3 | 95 | | 5435, 51727, 84172 |
| 6+ | Pyruvate metabolism | 3 | 40 | | 5162, 4191, 38 |
| 6+ | Terpenoid backbone biosynthesis | 2 | 15 | | 2224, 38 |
| 7− | Pyrimidine metabolism | 3 | 95 | | 54963, 5435, 5430 |
| 7+ | Methane metabolism | 2 | 6 | | 128, 4524 |
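To show how the "Count" and "Pop hits" columns of Table 4 feed into an enrichment P value, here is a minimal sketch using a plain upper-tail hypergeometric test. DAVID itself reports a modified, more conservative Fisher exact score (the EASE score), so this is a simplified stand-in. The list size and background size used in the example are hypothetical placeholders, since neither is given in the text.

```python
from math import comb, isclose


def enrichment_p(count, list_total, pop_hits, pop_total):
    """Upper-tail hypergeometric P value: chance of >= count pathway genes in the gene list."""
    upper = min(list_total, pop_hits)
    favourable = sum(
        comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
        for i in range(count, upper + 1)
    )
    return favourable / comb(pop_total, list_total)


# Purine metabolism row for component 4+: Count = 5, Pop hits = 153 (from Table 4).
# list_total=50 and pop_total=5000 are hypothetical values chosen only for illustration.
p_purine = enrichment_p(count=5, list_total=50, pop_hits=153, pop_total=5000)
```

A smaller background or a longer gene list raises the expected overlap and therefore the P value, which is why the choice of background matters when comparing results across tools.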

3.2.2. Pair 2 (C2, P2)

As shown in Table 3, the canonical variable C2 explains 5.3% of the variability in the original ARG expression variables. C2 correlated positively with the ARGs PPIA (0.92), CUL5 (0.51), TSG101 (0.48), IDH1 (0.17), and PECI (0.15), and negatively with GML (−0.16), APOBEC3G (−0.17), MYH9 (−0.17), IL4 (−0.18), TLR9 (−0.18), CXCR1 (−0.25), HLA-C (−0.26), NCOR2 (−0.28), DC-SIGN (−0.29), and TLR8 (−0.36). The greatest positive correlation was observed between C2 and PPIA. However, the greatest negative correlation was observed between C2 and TLR8. Among the ARGs with large correlations, PPIA, TSG101, CUL5, and APOBEC3G are postentry cellular viral cofactors. Among the ARGs with negative correlations, CXCR1 and IL4 are related to cytokines. DC-SIGN is related to chemokines, which play an important role in HIV entry through chemokine receptors.

As shown in Table 4, the canonical variable P2 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P2 are not enriched in any particular pathway.

3.2.3. Pair 3 (C3, P3)

As shown in Table 3, the canonical variable C3 explains 12.7% of the variability in the original ARG expression variables. C3 positively correlated (absolute value > 0.15) with PPIA (1.88), NCOR2 (0.37), ZNRD1 (0.28), MYH9 (0.21), CXCR1 (0.20), and Slurp1 (0.19); in contrast, it negatively correlated with TRIM5a (−0.15), SDF1 (−0.17), IDH1 (−0.22), PECI (−0.24), TSG101 (−0.25), and CUL5 (−0.87). The greatest positive correlation was observed between C3 and PPIA. However, the greatest negative correlation was observed between C3 and CUL5. Among the ARGs that highly correlated with C3, PPIA, TSG101, TRIM5a, and CUL5 are postentry cellular viral cofactors. However, only PPIA positively correlated with C3.

As shown in Table 4, the canonical variable P3 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P3 are enriched in glycolysis and pyrimidine metabolism. The glycolysis genes include phosphoglycerate mutase 1 (5223), glyceraldehyde-3-phosphate dehydrogenase (2597), and glucose-6-phosphatase (57818). The pyrimidine metabolism genes include polymerase (DNA directed), delta 2 (5425), cytidine monophosphate (UMP-CMP) kinase 1 (51727), and uridine monophosphate synthetase (7372).

3.2.4. Pair 4 (C4, P4)

As shown in Table 3, the canonical variable C4 explains 3.3% of the variability in the original ARG expression variables. C4 correlated positively (absolute value > 0.15) with PPIA (0.58), TLR8 (0.30), TLR4 (0.24), and PROX1 (0.16), and negatively with DEFB1 (−0.15), IDH1 (−0.17), SDF1 (−0.18), CUL5 (−0.23), LY6D (−0.24), NCOR2 (−0.27), and Slurp1 (−0.61). The greatest positive correlation was observed between C4 and PPIA. However, the greatest negative correlation was observed between C4 and Slurp1. Among the ARGs that highly correlated with C4, only PPIA and CUL5 are postentry cellular viral cofactors.

As shown in Table 4, the canonical variable P4 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P4 are enriched in purine metabolism. These genes include deoxyguanosine kinase (1716), polymerase (RNA) III (DNA directed) polypeptide K (51728), polymerase (RNA) III (DNA directed) polypeptide B (55703), pyruvate kinase (5313), adenylate cyclase 10 (55811), phosphodiesterase 6D (5147), polymerase (DNA directed), delta 2 (5425), polymerase (RNA) II (DNA directed) polypeptide C (5432), and phosphodiesterase 5A (8654).

3.2.5. Pair 5 (C5, P5)

As shown in Table 3, the canonical variable C5 explains 8.3% of the variability in the original ARG expression variables. C5 highly correlated (absolute value > 0.15) with IDH1 (0.60), TLR8 (0.33), ZNRD1 (0.32), TRIM5a (0.26), IRF1 (0.23), PROX1 (0.23), Slurp1 (0.21), HLA-C (0.21), GML (0.21), CUL5 (0.19), CXCR1 (0.17), TSG101 (0.17), APOBEC3B (0.15), TLR4 (−0.15), DC-SIGN (−0.17), SDF1 (−0.26), HLA-B (−0.41), MYH9 (−0.49), PPIA (−0.54), and NCOR2 (−1.10). The greatest positive correlation was observed between C5 and IDH1. However, the greatest negative correlation was observed between C5 and NCOR2. Among the ARGs that highly correlated with C5, PPIA, TSG101, APOBEC3B, TRIM5a, and CUL5 are postentry cellular viral cofactors. HLA-C and HLA-B are members of the HLA system. DC-SIGN and SDF1 are related to chemokines. CXCR1 is related to the cytokine pathway.

As shown in Table 4, the canonical variable P5 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P5 are enriched in inositol phosphate metabolism; these genes include synaptojanin 2 (8871), phospholipase C beta 2 (5330), and inositol-trisphosphate 3-kinase B (3707).

3.2.6. Pair 6 (C6, P6)

As shown in Table 3, the canonical variable C6 explains 10.8% of the variability in the original ARG expression variables. C6 highly correlated (absolute value > 0.15) with PPIA (1.11), TLR8 (0.70), TSG101 (0.49), CUL5 (0.40), Slurp1 (0.32), TLR4 (0.26), HLA-B (0.25), CXCR6 (0.18), TLR9 (−0.17), IDH1 (−0.20), HLA-A (−0.22), IRF1 (−0.24), NCOR2 (−0.24), TRIM5a (−0.30), HLA-C (−0.33), CXCR1 (−0.40), MYH9 (−0.50), and ZNRD1 (−0.91). The greatest positive correlation was observed between C6 and PPIA. However, the greatest negative correlation was observed between C6 and ZNRD1. Among the ARGs that highly correlated with C6, PPIA, TSG101, TRIM5a, and CUL5 are postentry cellular viral cofactors. HLA-A, HLA-C, and HLA-B are members of the HLA system. CXCR6 is related to chemokine receptors. IRF1 and CXCR1 are related to cytokines.

As shown in Table 4, the canonical variable P6 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P6 are enriched in pyrimidine metabolism and terpenoid backbone biosynthesis. These genes include polymerase (RNA) II (DNA directed) polypeptide F (5435), cytidine monophosphate (UMP-CMP) kinase 1 (51727), polymerase (RNA) I polypeptide B (84172), farnesyl diphosphate synthase (2224), and acetyl-CoA acetyltransferase 1 (38).

3.2.7. Pair 7 (C7, P7)

As shown in Table 3, the canonical variable C7 explains 9% of the variability in the original ARG expression variables. C7 highly correlated (absolute value > 0.15) with IDH1 (1.12), PROX1 (0.55), CCL11 (0.25), DC-SIGN (0.22), TRIM5a (0.22), KIR (0.17), IL4 (−0.15), TLR8 (−0.15), HLA-C (−0.22), TLR9 (−0.24), PECI (−0.35), NCOR2 (−0.38), APOBEC3G (−0.44), TSG101 (−0.54), and CUL5 (−0.80). The greatest positive correlation was observed between C7 and IDH1. However, the greatest negative correlation was observed between C7 and CUL5. Among the ARGs that highly correlated with C7, TSG101, APOBEC3G, TRIM5a, and CUL5 are postentry cellular viral cofactors. KIR and HLA-C are in the HLA system. DC-SIGN and CCL11 are related to chemokine receptors. IL4 is related to cytokines.

As shown in Table 4, the canonical variable P7 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P7 are enriched in pyrimidine metabolism and methane metabolism. These genes include uridine-cytidine kinase 1-like 1 (54963), polymerase (RNA) II (DNA directed) polypeptide F (5435), polymerase (RNA) II (DNA directed) polypeptide A (5430), alcohol dehydrogenase 5 (class III) (128), and methylenetetrahydrofolate reductase (4524).

3.2.8. Pair 8 (C8, P8)

As shown in Table 3, the canonical variable C8 explains 12% of the variability in the original ARG expression variables. C8 highly correlated (absolute value > 0.15) with PPIA (1.12), IDH1 (0.63), TRIM5a (0.53), NCOR2 (0.52), APOBEC3G (0.39), TLR9 (0.36), DC-SIGN (0.33), IL4 (0.30), PECI (0.25), SDF1 (0.22), TLR8 (0.17), MYH9 (−0.19), HLA-C (−0.21), IRF1 (−0.24), HLA-B (−0.31), PROX1 (−0.66), and TSG101 (−1.03). The greatest positive correlation was observed between C8 and PPIA. However, the greatest negative correlation was observed between C8 and TSG101. Among the ARGs that highly correlated with C8, TSG101, APOBEC3G, TRIM5a, and PPIA are postentry cellular viral cofactors. HLA-C and HLA-B are in the HLA system. DC-SIGN and SDF1 are related to chemokine receptors. IL4 and IRF1 are related to cytokines.

As shown in Table 4, the canonical variable P8 accounts for part of the variability in the original metabolic pathway gene expression data. The metabolic pathway genes that highly correlated with P8 are not enriched in any metabolic pathway.

4. Discussion

Researchers have used numerous approaches to identify host genes related to AIDS [5–13]. Most studies use genomic information without integrating the genome and transcriptome. However, most SNPs at ARGs impact AIDS through changing host gene transcription [7–10]. This study features novel analyses that focus on ARG cooperation at the transcription level and extends the correlation analysis between ARGs and metabolic pathway genes to discover novel host genes related to AIDS.

For each variable in the canonical correlation analysis, HIV-1 postentry cellular viral cofactors highly cooperated at the transcription level. PPIA, TSG101, TRIM5a, APOBEC3G, and CUL5 frequently appeared together to correlate with the canonical variables. PPIA functions in cyclosporin A-mediated immunosuppression by encoding a member of the peptidyl-prolyl cis-trans isomerase (PPIase) family [24]. Formation of HIV virions requires an interaction between PPIA and HIV viral proteins. TSG101 negatively regulates cell growth and differentiation by producing a protein that interacts with stathmin [25]. TRIM5a is an E3 ubiquitin-ligase, and its ubiquitination function is involved in retroviral restriction [26]. These genes encode HIV-1 postentry cellular viral cofactors involved in different biological processes. Thus, the high correlation between these genes and canonical variables demonstrates that these genes are coordinated at the transcriptional level. These data suggest that a potential transcriptional regulator for these genes may be a key host factor related to AIDS.

The high-frequency ARGs that correlated with canonical variables include PPIA, TSG101, CUL5, NCOR2, IDH1, and MYH9. PPIA, TSG101, and CUL5 are discussed above. NCOR2 with histone deacetylases is a nuclear receptor corepressor [27]. IDH1 encodes isocitrate dehydrogenases involved in cytoplasmic NADPH production and pyruvate metabolism [28]. MYH9 aids in maintaining cell shape, cell motility, and cytokinesis as a conventional nonmuscle myosin [29]. These ARGs are not enriched in a certain biological process. However, many host genetic factors have not been studied.

The low-frequency ARGs that correlated with canonical variables include DEFB1 with C4, KIR with C7, HLA-A with C6, CCL11 with C7, LY6D with C4, APOBEC3B with C5, and CXCR6 with C6. DEFB1 is a defensin and is implicated in cystic fibrosis pathogenesis [30]. HLA-A is a major histocompatibility complex class I heavy chain paralogue; these paralogues are expressed in nearly all cells [31]. CCL11 is chemokine (C-C motif) ligand 11 and is implicated in immunoregulatory and inflammatory processes [32]. CXCR6 is chemokine (C-X-C motif) receptor 6 [33]. LY6D is a member of the lymphocyte antigen 6 complex [34]. APOBEC3B is a member of the cytidine deaminase gene family. Recent studies have revealed that these ARGs may be RNA-editing enzymes that control the cell cycle [35]. Further, each of these genes correlated with only one canonical variable, which suggests that the specificity of the correlation may determine which metabolic pathway a canonical variable is associated with.

The most significant metabolic pathway in our analysis is purine metabolism, which featured correlations with two canonical variables and the lowest P values. Recent studies analyzed purine codon patterns in variable and constant regions of HIV-1 and showed that HIV-1 RNA exhibits extreme enrichment in the purine A compared with most organisms [36]. These data suggest that a potential therapeutic agent against HIV-1 may involve novel purine derivatives [37]. Studies have elucidated twenty-four purine derivatives that act as HIV-1 Tat-TAR interaction inhibitors [38]. More recently, research revealed that host cells with a modified purine biosynthesis pathway exhibit increased tenofovir activity against sensitive and drug-resistant HIV-1 [39]. In this study, we show a high correlation between ARG and purine metabolism gene expression. These data imply that purine metabolism genes are significant candidates for studying the host genomic or transcriptomic influence on AIDS.

5. Conclusions

In this study, we used a CCA to analyze the correlations between ARG and metabolic pathway gene expression. The results show that HIV-1 postentry cellular viral cofactors are highly coexpressed, which suggests that regulating this group of host genes may be a key factor in studies to understand the AIDS-host interaction mechanism. Furthermore, we show that purine metabolism pathway genes coordinate with ARGs; this novel discovery supports future studies on AIDS therapy using purine derivatives. Both coexpressed ARGs and metabolic pathway genes also provide a new marker for AIDS diagnosis.

Competing Interests

The authors declare no financial interest related to this work.

Authors’ Contributions

Hanhui Ye and Jinjin Yuan contributed equally to this work.

Acknowledgments

The study was supported by the Medical Innovation Project of Fujian Health Department (Grant no. 2015-CXB-28), the Scientific Foundation of Fuzhou City (Grant no. 2015-S-143-6), and the Key Clinical Specialty Discipline Construction Program of Fuzhou, Fujian, China (Grant no. 201510301).

References

  1. E. S. Lander, L. M. Linton, B. Birren et al., “Initial sequencing and analysis of the human genome,” Nature, vol. 409, pp. 860–921, 2001.
  2. P. An and C. A. Winkler, “Host genes associated with HIV/AIDS: advances in gene discovery,” Trends in Genetics, vol. 26, no. 3, pp. 119–131, 2010.
  3. S. J. O'Brien and S. L. Hendrickson, “Host genomic influences on HIV/AIDS,” Genome Biology, vol. 14, article 201, 2013.
  4. S. J. O'Brien and G. W. Nelson, “Human genes that limit AIDS,” Nature Genetics, vol. 36, no. 6, pp. 565–574, 2004.
  5. M. Dean, M. Carrington, C. Winkler et al., “Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Hemophilia Growth and Development Study, Multicenter AIDS Cohort Study, Multicenter Hemophilia Cohort Study, San Francisco City Cohort, ALIVE Study,” Science, vol. 273, no. 5283, pp. 1856–1862, 1996.
  6. C. Winkler, W. Modi, M. W. Smith et al., “Genetic restriction of AIDS pathogenesis by an SDF-1 chemokine gene variant. ALIVE Study, Hemophilia Growth and Development Study (HGDS), Multicenter AIDS Cohort Study (MACS), Multicenter Hemophilia Cohort Study (MHCS), San Francisco City Cohort (SFCC),” Science, vol. 279, no. 5349, pp. 389–393, 1998.
  7. M. Carrington, G. W. Nelson, M. P. Martin et al., “HLA and HIV-1: heterozygote advantage and B35-Cw04 disadvantage,” Science, vol. 283, no. 5408, pp. 1748–1752, 1999.
  8. H. D. Shin, C. Winkler, J. C. Stephens et al., “Genetic restriction of HIV-1 pathogenesis to AIDS by promoter alleles of IL10,” Proceedings of the National Academy of Sciences of the United States of America, vol. 97, no. 26, pp. 14467–14472, 2000.
  9. X. Gao, G. W. Nelson, P. Karacki et al., “Effect of a single amino acid change in MHC class I molecules on the rate of progression to AIDS,” The New England Journal of Medicine, vol. 344, no. 22, pp. 1668–1675, 2001.
  10. M. P. Martin, X. Gao, J.-H. Lee et al., “Epistatic interaction between KIR3DS1 and HLA-B delays the progression to AIDS,” Nature Genetics, vol. 31, no. 4, pp. 429–434, 2002.
  11. M. Carrington and S. J. O'Brien, “The influence of HLA genotype on AIDS,” Annual Review of Medicine, vol. 54, pp. 535–551, 2003.
  12. P. An, D. Vlahov, J. B. Margolick et al., “A tumor necrosis factor-α-inducible promoter variant of interferon-γ accelerates CD4+ T cell depletion in human immunodeficiency virus-1-infected individuals,” The Journal of Infectious Diseases, vol. 188, no. 2, pp. 228–231, 2003.
  13. J. Fellay, K. V. Shianna, D. Ge et al., “A whole-genome association study of major determinants for host control of HIV-1,” Science, vol. 317, no. 5840, pp. 944–947, 2007.
  14. M. Craveiro, I. Clerc, M. Sitbon, and N. Taylor, “Metabolic pathways as regulators of HIV infection,” Current Opinion in HIV and AIDS, vol. 8, no. 3, pp. 182–189, 2013.
  15. C. S. Palmer, M. Ostrowski, B. Balderson, N. Christian, and S. M. Crowe, “Glucose metabolism regulates T cell activation, differentiation, and functions,” Frontiers in Immunology, vol. 6, article 1, Article ID 00001, 2015.
  16. R. Moore, H. Adler, V. Jackson et al., “Impaired glucose metabolism in HIV-infected pregnant women: a retrospective analysis,” International Journal of STD & AIDS, vol. 27, no. 7, pp. 581–585, 2016.
  17. J. M. Gostner, K. Becker, K. Kurz, and D. Fuchs, “Disturbed amino acid metabolism in HIV: association with neuropsychiatric symptoms,” Frontiers in Psychiatry, vol. 6, article 97, 2015.
  18. H. Low, L. Cheng, M.-S. Di Yacovo et al., “Lipid metabolism in patients infected with Nef-deficient HIV-1 strain,” Atherosclerosis, vol. 244, pp. 22–28, 2016.
  19. D. Podzamczer, “Lipid metabolism and cardiovascular risk in HIV infection: new perspectives and the role of nevirapine,” AIDS Reviews, vol. 15, no. 4, pp. 195–203, 2013.
  20. Y. Okamura, Y. Aoki, T. Obayashi et al., “COXPRESdb in 2015: coexpression database for animal species by DNA-microarray and RNAseq-based expression data with multiple quality assessment systems,” Nucleic Acids Research, vol. 43, no. 1, pp. D82–D86, 2015.
  21. C. Chen, T. K. Hyun, X. Han et al., “Coexpression within integrated mitochondrial pathways reveals different networks in normal and chemically treated transcriptomes,” International Journal of Genomics, vol. 2014, Article ID 452891, 10 pages, 2014.
  22. M. G. Naylor, X. Lin, S. T. Weiss, B. A. Raby, and C. Lange, “Using canonical correlation analysis to discover genetic regulatory variants,” PLoS ONE, vol. 5, no. 5, Article ID e10395, 2010.
  23. S. Waaijenborg and A. H. Zwinderman, “Sparse canonical correlation analysis for identifying, connecting and completing gene-expression networks,” BMC Bioinformatics, vol. 10, article 315, 2009.
  24. C. Camilloni, A. B. Sahakyan, M. J. Holliday et al., “Cyclophilin A catalyzes proline isomerization by an electrostatic handle mechanism,” Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 28, pp. 10203–10208, 2014.
  25. J. Lu, Z. Han, Y. Liu et al., “A host-oriented inhibitor of Junin Argentine hemorrhagic fever virus egress,” Journal of Virology, vol. 88, no. 9, pp. 4736–4743, 2014.
  26. S. B. Kutluay, D. Perez-Caballero, and P. D. Bieniasz, “Fates of retroviral core components during unrestricted and TRIM5-restricted infection,” PLoS Pathogens, vol. 9, no. 3, Article ID e1003214, 2013.
  27. L. Zhang, C. Gong, S. L. Y. Lau et al., “SpliceArray profiling of breast cancer reveals a novel variant of NCOR2/SMRT that is associated with tamoxifen resistance and control of ERα transcriptional activity,” Cancer Research, vol. 73, no. 1, pp. 246–255, 2013.
  28. J. L. Izquierdo-Garcia, P. Viswanath, P. Eriksson et al., “IDH1 mutation induces reprogramming of pyruvate metabolism,” Cancer Research, vol. 75, no. 15, pp. 2999–3009, 2015.
  29. H. Elliott, R. S. Fischer, K. A. Myers et al., “Myosin II controls cellular branching morphogenesis and migration in three dimensions by minimizing cell-surface curvature,” Nature Cell Biology, vol. 17, no. 2, pp. 137–147, 2015.
  30. J. A. Estrada-Aguirre, I. Osuna-Ramírez, E. Prado Montes de Oca et al., “DEFB1 5′UTR polymorphisms modulate the risk of HIV-1 infection in Mexican women,” Current HIV Research, vol. 12, no. 3, pp. 220–226, 2014.
  31. R. Srivastava, A. A. Khan, D. Spencer et al., “HLA-A02:01-restricted epitopes identified from the herpes simplex virus tegument protein VP11/12 preferentially recall polyfunctional effector memory CD8+ T cells from seropositive asymptomatic individuals and protect humanized HLA-A02:01 transgenic mice against ocular herpes,” The Journal of Immunology, vol. 194, no. 5, pp. 2232–2248, 2015.
  32. F. Zhu, P. Liu, J. Li, and Y. Zhang, “Eotaxin-1 promotes prostate cancer cell invasion via activation of the CCR3-ERK pathway and upregulation of MMP-3 expression,” Oncology Reports, vol. 31, no. 5, pp. 2049–2054, 2014.
  33. A. J. Morgan, C. Guillen, F. A. Symon, S. S. Birring, J. J. Campbell, and A. J. Wardlaw, “CXCR6 identifies a putative population of retained human lung T cells characterised by co-expression of activation markers,” Immunobiology, vol. 213, no. 7, pp. 599–608, 2008.
  34. R. H. Brakenhoff, M. Gerretsen, E. M. C. Knippels et al., “The human E48 antigen, highly homologous to the murine Ly-6 antigen ThB, is a GPI-anchored molecule apparently involved in keratinocyte cell-cell adhesion,” The Journal of Cell Biology, vol. 129, no. 6, pp. 1677–1689, 1995.
  35. E. Y. Kim, R. Lorenzo-Redondo, S. J. Little et al., “Human APOBEC3 induced mutation of human immunodeficiency virus type-1 contributes to adaptation and evolution in natural infection,” PLoS Pathogens, vol. 10, no. 7, Article ID e1004281, 2014.
  36. D. R. Forsdyke, “Implications of HIV RNA structure for recombination, speciation, and the neutralism-selectionism controversy,” Microbes and Infection, vol. 16, no. 2, pp. 96–103, 2014.
  37. D. Kang, Z. Fang, B. Huang et al., “Synthesis and preliminary antiviral activities of piperidine-substituted purines against HIV and influenza A/H1N1 infections,” Chemical Biology & Drug Design, vol. 86, no. 4, pp. 568–577, 2015.
  38. R. Pang, C. Zhang, D. Yuan, and M. Yang, “Design and SAR of new substituted purines bearing aryl groups at N9 position as HIV-1 Tat–TAR interaction inhibitors,” Bioorganic & Medicinal Chemistry, vol. 16, no. 17, pp. 8178–8186, 2008.
  39. A. Heredia, C. E. Davis, M. S. Reitz et al., “Targeting of the purine biosynthesis host cell pathway enhances the activity of tenofovir against sensitive and drug-resistant HIV-1,” The Journal of Infectious Diseases, vol. 208, no. 12, pp. 2085–2094, 2013.

Copyright © 2016 Hanhui Ye et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

