Abstract
Much effort has been spent recently in identifying host factors required for HIV-1 to effectively replicate in cultured human cells. However, much less is known about the genetic factors in vivo that impact viral replication in lymphatic tissue, the primary anatomical site of virus–host interactions where the bulk of viral replication and pathogenesis occurs. To identify genetic determinants in lymphatic tissue that critically affect HIV-1 replication, we used microarrays to transcriptionally profile host genes expressed in inguinal lymph nodes and to identify those associated with viral load. Strikingly, ∼95% of the transcripts (558) in this data set (592 transcripts total) were negatively associated with HIV-1 replication. Genes in this subset 1) inhibit cellular activation/proliferation (e.g., TCFL5, SOCS5 and SOCS7, KLF10), 2) promote heterochromatin formation (e.g., HIC2, CREBZF, ZNF148/ZBP-89), 3) increase collagen synthesis (e.g., PLOD2, POSTN, CRTAP), and 4) reduce cellular transcription and translation. Potential anti–HIV-1 restriction factors were also identified (e.g., NR3C1, HNRNPU, PACT). Only ∼5% of the transcripts (34) were positively associated with HIV-1 replication. Paradoxically, nearly all of these genes function in innate and adaptive immunity, particularly highlighting heightened gene expression in the IFN system. We conclude that this conventional host response cannot contain HIV-1 replication and, in fact, could well contribute to increased replication through immune activation. More importantly, genes that have a negative association with virus replication point to target cell availability and potentially new viral restriction factors as principal determinants of viral load.
Over the last decade, systems biology has taken on an increasingly important role in investigating microbial diseases, delineating salient features of the host-pathogen relationship, and identifying potential host genes that are critical determinants of microbial replication and pathogenesis. In the case of HIV-1, which like any obligate intracellular pathogen relies on the transcriptional and translational machinery of the host cell to complete its life cycle (1–3), these studies have revealed components of host gene expression that establish a favorable intracellular environment for efficient virus replication. For example, genomics-based approaches have, to date, documented changes in gene expression in cultured cells during HIV-1 infection (4), and more recently, small interfering RNA technology has identified hundreds of host genes seemingly indispensable for HIV-1 replication in vitro (5–8).
In contrast, much less is known about host genes that play important roles in viral replication in vivo in which HIV-1 replicates in the complex environment of lymphatic tissue (LT) in the context of a host responding to infection. In previous microarray studies of HIV-1 infection in LT, we have shown that infection massively perturbs host gene expression and that this transcriptional profile is highly dependent on stage of disease (9). In this work, we report studies that go beyond this initial identification of stage-specific features of the host response in LT to now identify genes that play important roles in viral replication in vivo. We now show the following: 1) there is little overlap between genes in vivo compared with genes in vitro that correlate with viral replication; 2) paradoxically, host immune responses correlate with high viral loads; and 3) ∼95% of the correlations are inverse correlations that point to the importance of target cell availability, cellular activation, transcriptional factors, and new inhibitors as determinants of viral load in vivo.
Materials and Methods
Ethics statement
This study was conducted according to the principles expressed in the Declaration of Helsinki. This study was approved by the Institutional Review Board of the University of Minnesota. All patients provided written informed consent for the collection of samples and subsequent analysis.
Lymph node biopsy specimens
Inguinal lymph node biopsies from 22 untreated HIV-1–infected individuals at different clinical stages were obtained for this University of Minnesota Institutional Review Board-approved microarray study. Viral load measurements were obtained the same day as biopsies. Each lymph node biopsy was placed into a Falcon tube and snap frozen by dropping it into liquid nitrogen.
RNA extraction, synthesis of biotin-labeled cRNA probes, and microarray hybridization
Frozen lymph nodes were homogenized with a power homogenizer (Heat Systems Ultrasonic, Farmingdale, NY) in TRIzol (Invitrogen, Carlsbad, CA) without thawing. Total RNA was isolated, according to the manufacturer’s protocol, and further purified with an RNeasy mini kit (Qiagen, Valencia, CA). Double-stranded cDNA and biotin-labeled cRNA probes were synthesized from 5 μg total RNA with a MessageAmp II aRNA kit (Ambion, Austin, TX). The cRNA probes were column purified and fragmented with a fragmentation kit (Ambion).
Fifteen micrograms of fragmented cRNA was hybridized to an Affymetrix Human Genome U133 Plus 2.0 array (Santa Clara, CA). After hybridization, chips were washed, stained with streptavidin-PE, and scanned with GeneChip Operating Software at the Biomedical Genomics Center at the University of Minnesota. The experiments from each RNA sample were duplicated in the preparation of each cRNA probe and microarray hybridization.
Microarray data analysis
The .cel files produced by the Affymetrix data analysis platform were uploaded into the Expressionist program (Genedata, Pro version 4.5, Basel, Switzerland), and the expression level for each of the ∼56,000 probe sets on the arrays was quantified using the robust multi-chip average algorithm. The mean expression level from duplicate chips from the same individual’s RNA was computed and used in the subsequent analysis. The robust multi-chip average algorithm produces a summary of gene expression that is of log scale; we used this log scale for all analyses.
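The duplicate-chip averaging step described above can be illustrated with a short sketch. This is not the Expressionist workflow itself; it assumes the robust multi-chip average values are already available on the log scale, and the subject identifiers, probe names, and the function `average_duplicates` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_duplicates(chips):
    """Average log-scale expression across replicate chips of one subject.

    chips: list of (subject_id, {probe_set: log_expression}) tuples,
    one tuple per hybridized chip (hypothetical input layout).
    """
    by_subject = defaultdict(list)
    for subject_id, values in chips:
        by_subject[subject_id].append(values)

    averaged = {}
    for subject_id, replicates in by_subject.items():
        probes = replicates[0].keys()
        # Mean of the log-scale values, taken probe set by probe set.
        averaged[subject_id] = {
            p: mean(r[p] for r in replicates) for p in probes
        }
    return averaged


# Toy data: two subjects, two chips each (values are invented).
chips = [
    ("S1", {"probe_a": 8.0, "probe_b": 5.0}),
    ("S1", {"probe_a": 9.0, "probe_b": 5.5}),
    ("S2", {"probe_a": 7.0, "probe_b": 6.0}),
    ("S2", {"probe_a": 7.5, "probe_b": 6.5}),
]
averaged = average_duplicates(chips)
```

Averaging on the log scale corresponds to a geometric mean of the raw intensities, which matches the statement that the log scale was used for all analyses.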
To examine the relationship between gene expression and viral load, a linear regression model was fit with gene expression as the response variable and disease stage and the log of viral load as predictor variables. We controlled for disease stage when examining the relationship between viral load and gene expression because we previously found an association between gene expression and disease stage for HIV-1 infection (9). The p values for the null hypothesis (no association between viral load and gene expression) were then transformed to q values using the qvalue package in the statistical software program R. By varying the threshold for the q values, one can obtain a set of genes that are found to be associated with viral load at a given false discovery rate. Because the number of genes found to be associated with viral load increases dramatically as this threshold rises above ∼7%, we used the conservative cutoff value of 7% for the false discovery rate. To summarize the association between log of viral load and gene expression, we computed the partial correlation between these two quantities for all genes while controlling for the effect of disease stage.
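A minimal sketch of the two quantities described above may help. It assumes disease stage enters as a categorical variable, so that controlling for stage amounts to subtracting each stage's mean from both variables before correlating them. The multiple-testing step uses the Benjamini-Hochberg adjustment as a stand-in; it is not the q value procedure used in the study, which additionally estimates the proportion of true null hypotheses. All data values and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def residualize_on_stage(values, stages):
    """Residuals from regressing values on stage indicator variables."""
    groups = defaultdict(list)
    for v, s in zip(values, stages):
        groups[s].append(v)
    stage_mean = {s: sum(vs) / len(vs) for s, vs in groups.items()}
    return [v - stage_mean[s] for v, s in zip(values, stages)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(expression, log_viral_load, stages):
    """Correlation of expression and log viral load, controlling for stage."""
    return pearson(
        residualize_on_stage(expression, stages),
        residualize_on_stage(log_viral_load, stages),
    )


def benjamini_hochberg(p_values):
    """BH-adjusted p values (a stand-in for q values, see the note above)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data for one gene in six subjects (invented values).
stages = ["acute", "acute", "acute", "chronic", "chronic", "chronic"]
log_vl = [5.0, 5.5, 6.0, 4.0, 4.5, 5.0]
expression = [9.0, 8.5, 8.0, 7.0, 6.5, 6.0]

raw_r = pearson(expression, log_vl)
partial_r = partial_correlation(expression, log_vl, stages)

# Toy p values for four genes and the 7% cutoff from the text.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
selected = [i for i, q in enumerate(adjusted) if q <= 0.07]
```

In this toy example the raw correlation is positive while the stage-adjusted partial correlation is negative, which illustrates why the adjustment for disease stage can change the sign of an apparent association.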
Immunofluorescence
Immunofluorescence was performed using a biotin-free detection system on 5-μm tissue sections mounted on glass slides. Tissues were deparaffinized and rehydrated in deionized water. Heat-induced epitope retrieval was performed using a water bath (95–98°C for 10–20 min) in DiVA Decloaker (Biocare Medical, Concord, CA) retrieval buffer, followed by cooling to room temperature. Tissue sections were then treated with Proteinase K (4 μg/ml) for 20 min at 37°C, followed by blocking with SNIPER blocking reagent (Biocare Medical) for 15 min at room temperature. A primary Ab specific for collagen type 1 (Sigma-Aldrich; clone Col-1, catalog C2456) was diluted 1:100 in Tris-NaCl-blocking buffer (0.1 M Tris-HCl [pH 7.5]; 0.15 M NaCl; 0.05% Tween 20 with DuPont [Wilmington, DE] blocking buffer) and incubated overnight at 4°C. After the primary Ab incubation, sections were washed with PBS and then incubated with fluorophore-conjugated secondary Abs (Alexa Fluor dyes; Invitrogen) in 5% nonfat milk for 2 h at room temperature. These sections were washed and mounted using Aqua Poly/Mount (Polysciences, Warrington, PA). Immunofluorescent micrographs were taken using an Olympus BX61 Fluoview confocal microscope with the following objectives: ×20 (0.75 NA), ×40 (0.75 NA), and ×60 (1.42 NA); images were acquired using Olympus Fluoview software (Melville, NY; version 1.7a). Isotype-matched IgG- or IgM-negative control Abs in all instances yielded negative staining results.
Microarray data accession number
All microarray results have been deposited in the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/; accession number GSE21589).
Results
Predominance of negative correlations with viral replication
To identify genes within LT related to viral replication, we examined host gene expression and its association with viral load in inguinal lymph nodes from 22 HIV-1–infected individuals (Table I). We identified 592 transcripts significantly associated with viral load (partial correlation coefficient < −0.6 or > 0.6; 7% false discovery rate) (Fig. 1A). Strikingly, ∼95% of the transcripts (558) in this data set are negatively associated with HIV-1 replication, whereas only ∼5% (34 transcripts) have a positive association (Fig. 1B). Based on gene ontology/annotations from the NetAffx Analysis Center, Ingenuity Pathways Analysis, and extensive examination of published literature, we classified ∼60% of all altered transcripts into functional categories, resulting in a list of 345 genes (the remaining transcripts have, as of yet, no identification or functional information available) (Figs. 2, 3, Supplemental Table 1).
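The selection rule and the reported proportions can be checked with a small sketch. It assumes that both criteria (partial correlation beyond ±0.6 and a q value at or below 7%) are applied jointly; the transcript identifiers, toy values, and the function `split_by_direction` are hypothetical, whereas the counts 558 and 34 are taken from the text.

```python
def split_by_direction(results, r_cutoff=0.6, fdr_cutoff=0.07):
    """Split significant transcripts by the sign of the partial correlation.

    results: {transcript_id: (partial_r, q_value)} (hypothetical layout).
    """
    negative = [
        t for t, (r, q) in results.items()
        if q <= fdr_cutoff and r < -r_cutoff
    ]
    positive = [
        t for t, (r, q) in results.items()
        if q <= fdr_cutoff and r > r_cutoff
    ]
    return negative, positive


# Toy results: t3 fails the correlation cutoff, t4 fails the FDR cutoff.
toy = {
    "t1": (-0.72, 0.03),
    "t2": (0.65, 0.05),
    "t3": (-0.40, 0.02),
    "t4": (-0.81, 0.20),
}
negative, positive = split_by_direction(toy)

# Proportions implied by the counts reported in the text.
n_negative, n_positive = 558, 34
total = n_negative + n_positive
pct_negative = 100 * n_negative / total
pct_positive = 100 * n_positive / total
```

With the reported counts, the negative and positive fractions come to about 94.3% and 5.7%, consistent with the rounded figures of ∼95% and ∼5%.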
Antiviral host response positively correlates with virus replication
Surprisingly, only a small subset of genes was positively associated with viral replication (32 genes + 2 transcripts of unknown function), with most of these genes (∼70%) paradoxically coding for proteins involved in innate and adaptive defenses, which might have been expected to decrease viral load (Fig. 2). IFN-stimulated genes were a prominent component of this response (16 of the 22 genes ascribed to immune defenses), but there were also genes encoding proteins involved in cell-mediated cytotoxicity (granzyme H, perforin, CD8, and NKG7), antiviral chemokines (RANTES and MIP-1β), and ligands for the chemokine receptor CXCR3 (CXCL9, CXCL10, and CXCL11). Finally, STAT1, a master regulator of transcription of numerous immune defense genes, was also positively associated with HIV-1 replication. In sum, this analysis reflects a robust antiviral host response that parallels virus replication, but is apparently associated with higher viral loads rather than containment of HIV-1 replication.
In a previous analysis of LT comparing uninfected with HIV-1–infected individuals (9), we found signatures of gene expression dependent on stage of disease, with the most striking signature in early infection in which expression of genes that control immune activation and innate and adaptive immune defenses was highly upregulated. Because immune activation is now widely believed to be a factor in overall immune dysfunction and negative prognosticator of disease progression (10, 11), we used a chronic immune activation (CIA) index based on gene expression, developed by Rempel et al. (12), to characterize cellular immune dysfunction during HIV-1 infection. This index is comprised of genes related to immune activation that are elevated during infection and can be computed for each infected individual by taking the mean fluorescence intensity of specified genes for each sample and calculated as follows:
$$\mathrm{CIA\ index} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{MFI}_i$$
where n equals the number of selected genes. In this work, we explored the relationship between CIA and viral replication in our study cohort. We used 18 genes for the CIA index, genes that had the highest increase in expression in LT during HIV-1 infection (9) (Fig. 4A). The CIA values and viral load measurements were plotted for each individual, revealing a significant association between CIA and viral load (r = 0.6288; p < 0.0001) (Fig. 4B). Additionally, these LT genes with the highest upregulation upon infection also have the strongest positive correlation to viral load (Fig. 4A). Overall, this analysis provides a glimpse into the intricate relationship between immune defenses, immune activation, and viral replication, whereby individuals with a higher CIA index are more likely to have higher viral loads.
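As a rough illustration of the CIA analysis, the sketch below computes an index per individual and correlates it with viral load. It assumes the index is the arithmetic mean of the fluorescence intensities of the n selected genes, which is how the description reads, and it assumes viral load is log10-transformed before correlation, which the text does not state explicitly. Gene names, intensities, and viral loads are invented.

```python
from math import log10, sqrt


def cia_index(mfi_by_gene, selected_genes):
    """Mean fluorescence intensity across the selected genes (assumed form)."""
    return sum(mfi_by_gene[g] for g in selected_genes) / len(selected_genes)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical three-gene panel; the study used 18 genes.
selected = ["GENE_A", "GENE_B", "GENE_C"]

# Per subject: (intensity per gene, viral load in copies/ml), all invented.
subjects = {
    "P1": ({"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 300.0}, 1e3),
    "P2": ({"GENE_A": 200.0, "GENE_B": 400.0, "GENE_C": 600.0}, 1e4),
    "P3": ({"GENE_A": 300.0, "GENE_B": 500.0, "GENE_C": 700.0}, 1e5),
}

cia = [cia_index(mfi, selected) for mfi, _ in subjects.values()]
log_vl = [log10(vl) for _, vl in subjects.values()]
r = pearson(cia, log_vl)
```

With 22 individuals, as in the study cohort, the same calculation would yield one index per person and a single correlation coefficient comparable to the r reported for Fig. 4B.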
Negative correlations: immune activation and transcriptional accessibility important for HIV-1 replication
Strikingly, the vast majority of host genes expressed in LT during HIV-1 infection were negatively associated with virus replication (313 genes + 245 transcripts of unknown function). A large proportion of genes in this data set encode proteins involved in immune activation, target cell availability, and transcriptional/translational metabolism (Fig. 3), highlighting the importance of activation status and cellular transcriptional/translational machinery in virus replication. In this list of genes negatively correlated with viral load (Supplemental Table 2), there are genes involved in inhibiting cellular activation and proliferation, such as the suppressors of cytokine signaling, SOCS5 and SOCS7; TCFL5/Cha, a transcription factor implicated in the maintenance of a resting state in T cells; CCNG2, an unconventional cyclin that blocks cell cycle entry; KLF10, a transcription factor involved in inhibiting cellular proliferation through the TGF-β signaling pathway; and GPR83, a G protein-coupled receptor involved in the generation of Foxp3+ regulatory T (TReg) cells.
We also identified genes that regulate the intracellular environment for virus replication, such as genes involved in suppressing cellular transcription, particularly through the modification of DNA histones (Supplemental Table 3). Examples include genes whose products recruit histone deacetylases, such as CREBZF and ZNF148/ZBP-89, and genes whose products interact/recruit histone methyltransferases, such as HP1BP3, HIC2, and DPF3. This analysis suggests that these gene products adversely affect virus replication by modifying DNA histones, promoting heterochromatin formation, and suppressing cellular and viral transcription.
Negative correlations: collagen deposition and virus replication
One pathological hallmark of HIV-1 infection is the aberrant deposition of collagen and consequent fibrotic damage in both gut and LT (13), a process that may have adverse consequences for the maintenance and preservation of CD4+ T cells and overall immune function in these anatomical niches. In this study, we show that genes related to collagen synthesis/deposition are negatively associated with virus replication (Supplemental Table 4). Examples include PLOD2, an enzyme responsible for the irreversible cross-linking of collagen fibers; POSTN, a regulator of collagen fibrillogenesis; CRTAP, a scaffolding protein linking collagen-synthesizing enzymes to their precursor substrates; CTHRC1, a regulator of TGF-β responsiveness and of TGF-β target genes such as collagens; and COL4 and COL13, collagen members important for the structure of the basement membrane. The products of these genes most likely increase overall collagen deposition and fibrosis in LT and negatively affect virus replication by potentially decreasing the viability/availability of target immune cells for HIV-1.
To examine this in more detail, we explored the relationship between mRNA expression of a collagen-synthesizing gene and the resulting end product of this process, protein levels of collagen deposited in LT. One gene we looked at was PLOD2, a telopeptide lysyl hydroxylase primarily responsible for increasing pyridinoline cross-links within collagen molecules during fibrillogenesis, a process that leads to the irreversible accumulation of collagen in fibrotic tissues (14). We used immunofluorescence to visualize collagen type I deposition in inguinal lymph nodes from HIV-1–infected individuals in our study cohort. We initially compared subject WU59 with GL38 because of their divergent PLOD2 mRNA expression (Fig. 5A) and found a relationship between PLOD2 mRNA expression and collagen type I deposition, whereby greater levels of fibrosis were detected in the lymph node of subject GL38 coincident with higher levels of PLOD2 mRNA (compare Fig. 5B, 5C). Consistent with previous reports that collagen deposition is increased during HIV-1 infection (15), much less collagen deposition was seen in the lymph node of an uninfected individual (Fig. 5D). There was also a negative correlation between collagen deposition in LT and numbers of peripheral blood CD4+ T cells in a group of our study subjects (Fig. 5E), in agreement with previous reports (15, 16), suggesting that increased fibrosis during HIV-1 infection may negatively impact viral replication by adversely affecting CD4+ T cell viability and reducing access to and availability of target cells for viral replication.
Candidates for new host restriction factors
In our list of genes negatively associated with viral load, we identified genes encoding potential anti–HIV-1 restriction factors (Supplemental Table 5). One such gene encodes an intracellular glucocorticoid receptor known as nuclear receptor subfamily 3, group C, member 1 (NR3C1). This receptor has been shown to impede proviral integration in PBMCs in the absence of steroidal ligands (17); this block is alleviated in the presence of NR3C1’s ligand and requires the HIV-1 protein, Vpr. Another candidate restriction factor is CUG triplet repeat, RNA-binding protein 1 (CUGBP1), a previously unrecognized downstream effector of IFN-β signaling in primary macrophages that induces a transcriptional inhibitory protein that works to suppress HIV/SIV replication (18). Heterogeneous nuclear ribonucleoprotein U (HNRNPU) is another candidate identified in our analysis, a ribonucleoprotein that has been shown to specifically target the 3′ long terminal repeat in the viral mRNA and block the cytoplasmic accumulation of HIV-1 mRNAs (19).
Discussion
The major and unexpected finding of this study is the unique transcriptome profile in LT during HIV-1 infection related to viral replication, whereby the vast majority of changes in mRNA expression are negatively associated with HIV-1 replication, whereas only a small subset of genes (∼5%) is positively associated with virus levels. Surprisingly, most of the genes in this small subset mediate innate and adaptive defenses mounted by the host to contain HIV-1, a host response that would have been expected to negatively correlate with viral load. Although counterintuitive, this LT analysis is in agreement with a transcriptome analysis of primary, peripheral blood CD4+ T cells from HIV-1–infected individuals (20), in which expression of a large proportion of IFN-related genes also correlated with increasing levels of virus (Supplemental Table 6).
In addition to a heightened IFN-system response, key genes important for cell-mediated immunity (e.g., perforin, granzyme H, CD8, NKG7) (Fig. 2) and recruitment of immune cells to sites of infection (e.g., CXCL9, CXCL10, CXCL11) (21) were also positively associated with HIV-1 replication. Interestingly, granzyme H, in addition to its proapoptotic function as a serine protease highly expressed in NK cells and closely related to granzyme B (22, 23), has recently been shown to mediate antiviral activity through direct cleavage of intracellular viral substrates (24, 25). Genes related to chemotaxis may provide some clues regarding the surprising association between host defense genes and high viral loads—the chemotactic response may actually be part of a double-edged sword, on the one hand, serving to recruit HIV-1–specific cytotoxic CD8+ T cells to eliminate virus-infected cells (26) but also serving to fuel the infection further by recruiting susceptible, activated CD4+ T cells to enhance viral dissemination (27).
Surprisingly, we did not identify classical activation markers such as Ki-67, HLA-DR, CD69, and CD38 in our data set, genes predicted to be positively correlated with virus replication due, in part, to their role in cellular activation as a means to provide the virus with additional target cells for infection. One interpretation would be that these gene products actually represent general activation of the immune system as a whole and are not specifically indicative of activation in the subsets of cells that are specific targets of the virus. In other words, these classical activation markers may not be the primary determinants of a suitable intracellular environment for permissiveness in target cells. The genes we did detect with a positive correlation to viral load may actually be more representative of cellular activation and permissiveness in individual target cells. The significant association between the CIA index and viral load suggests that this may indeed be the case (Fig. 4).
How then to explain the paradoxical, positive association between host defenses and viral load, and even more strikingly, the negative correlations that dominate the data set? One explanation could be that HIV-1 itself is driving host gene expression; for example, as virus replication increases, the host responds comparably by increasing expression of genes controlling immune defenses. However, due to the complex relationship between immune defenses, immune activation, and virus replication, another explanation could be a model in which the level of viral replication is determined by a balance between factors that provide virus access to the largest number of susceptible host cells and increase replication, and factors that decrease target cell availability/permissiveness and inhibit viral replication (Fig. 6). This schematic is consistent with a previous model proposed from transcriptional profiling of the response to highly active antiretroviral therapy (28), which aimed to explain the slow progress of HIV-1 infection. On one side of the balance are the supportive determinants of viral replication, such as target cell availability, the activation state of CD4+ T cells and other susceptible cells, and intrinsic intracellular factors that support replication. On the other side of the balance are innate and adaptive defenses and host restriction factors that suppress HIV-1 replication. In this model, host defenses most likely inhibit viral replication to some extent, yet, because they are inseparable from immune activation, related proinflammatory cytokines, and recruitment of activated target cells, on balance host defenses actually may contribute to increased viral load.
In this model, many of the negatively associated genes that dominate the data set reflect the importance of target cell availability and the ability of host intracellular machinery to support viral replication. Collagen deposition in this model negatively impacts viral load through fibrotic damage of the lymph node niche and its documented effects of decreasing CD4+ T cells (Fig. 5E) (15, 16), essentially decreasing access and opportunities for virus to interact with target cells.
There may also be a complex relationship between collagen deposition, TReg cells, and HIV-1 replication, suggested by findings in the rhesus macaque model of SIV infection that TGF-β1–producing TReg cells are primarily responsible for inducing collagen synthesis/deposition in LT (29). In HIV-1 infection, TReg cells might dampen cellular activation in bystander immune cells, a classical feature of TReg cells (30), thus decreasing overall viral output in HIV-infected cells. This would explain why GPR83, a G protein-coupled receptor reported to be involved in the peripheral generation of TReg cells in vivo (31, 32), along with mediators of the TGF-β signaling pathway (e.g., ITGB8, SMAD5, PEG10, GDF10, KLF10), are all negatively associated with HIV-1 replication.
Beyond the principal hypothesis of target cell availability and permissiveness as the key determinant of viral load, there may be new host restriction factors that also play an important role. By identifying genes that are both negatively associated with virus replication and code for proteins that display antiviral properties, we found several candidate genes that fit into this category (Supplemental Table 5). One gene in this list, PACT, warrants additional comment. PACT encodes a protein activator that acts upstream of the important antiviral, sentinel-like molecule, dsRNA-dependent protein kinase (PKR) (33). PACT has been shown to serve as a cellular activator of PKR in the absence of viral RNA (34), but has also recently been demonstrated to possess a role in type I IFN production during viral infection, specifically bypassing PKR activation during amplification of the IFN response (35). Thus, we have a gene that acts upstream of the IFN-response pathway and is negatively associated with viral replication in a data set in which all other IFN-responsive genes are positively associated with HIV-1 replication. This may indicate that PACT is acting outside the IFN-response pathway and adversely affecting HIV-1 replication through its other functions, such as inhibiting cellular translation and inducing apoptosis (36) or amplifying levels of micro-RNAs (37) that may serve to inhibit HIV-1 replication (38, 39). We are currently investigating PACT and other candidate genes for their potential role as novel anti–HIV-1 restriction factors.
We have previously documented changes in LT gene expression during HIV-1 infection and found that infection substantially alters the transcriptional profile compared with uninfected individuals (9) and that these transcriptional changes are dependent on stage of disease. A comparison between the present analysis and our previous microarray findings is challenging because the previous study stratified changes in gene expression based on disease stage, whereas our current analysis stratifies gene expression based strictly on virus levels, variables that do not share high concordance (virus levels versus disease stage). Nevertheless, when comparing the present analysis with our previous microarray findings, we find that the majority of genes positively associated with virus levels are significantly upregulated early in infection (acute stage of disease), overlapping data illustrating a shared relationship between immune defense genes and viral levels in early disease. In contrast, most of the genes negatively associated with virus levels do not overlap with our previous data set, illustrating the overall discordance between disease stage and virus levels. One would expect to observe greater concordance between data sets if viral levels were indeed correlated with disease stage, but even in our limited study cohort, viral levels vary appreciably across stage. Thus, this comparative analysis highlights two sets of genes—one set of genes may be directly influencing virus replication (current study) and another set of genes that are stage dependent and, on a global scale, may impact virus replication through direct or indirect means (9).
In this microarray study, we used whole lymph nodes for processing and generation of the template RNA for microarray chip hybridization as a means to capture the sum of all interactions in an important microenvironment (lymph node) where the bulk of HIV-1 replication and pathogenesis occurs. As such, cells were not first separated into individual subpopulations (e.g., CD4+ T cells) for RNA preparation, which likely explains why there was very little overlap (<3%) with recent small-interfering RNA-knockdown screens (5–8) designed to identify host cell factors required for HIV-1 infection in vitro (Supplemental Table 7). The advantage of this type of tissue-population microarray study is that RNA from all the essential cell populations residing within the lymph node microenvironment is present during chip hybridization and subsequent analysis. Genes identified in this study are a good starting point for further investigations of their roles in HIV-1 replication and pathogenesis using in situ technologies to identify the types of cells in which gene expression has been altered and immunohistochemistry/immunofluorescence to identify potential changes in protein expression. For example, we used immunofluorescence to identify the cellular expression of SP100, an IFN-responsive gene that codes for one of the major components of a nuclear transcriptional complex known as the promyelocytic leukemia nuclear body (40) and has been reported to play an important role in innate antiviral defense against a number of different DNA and RNA viruses (41, 42). We found that the size and number of SP100-containing nuclear bodies increased in LT CD3+ T cells and CD163+ macrophages during HIV-1 infection compared with uninfected individuals (Supplemental Fig. 1).
In sum, we believe this global, tissue microarray methodology is an important first step in a systems-biology approach to understand HIV-1 infection that provides a future framework for focused investigations examining the roles of specific genes impacting HIV-1 replication and pathogenesis.
In summary, this work yields key insights into systems biology of HIV-1 infection in LT, generates fruitful starting points for additional investigations into candidate genes that may play an important role in aiding or inhibiting HIV-1 infection, and may identify adjunctive approaches to improve therapies and immune reconstitution.
Acknowledgements
We thank all of the donor participants in this study.
Disclosures
The authors have no financial conflicts of interest.
Footnotes
This work was supported by National Institutes of Health Grant R01 AI056997 (to A.T.H.).
The sequences presented in this article have been submitted to the Gene Expression Omnibus Web site under accession number GSE21589.
The online version of this article contains supplemental material.