One key barrier to curative therapies for HIV is the limited understanding of HIV persistence. HIV provirus integration sites (ISs) within BACH2 are common, and almost all sites mapped to date are located upstream of the start codon in the same transcriptional orientation as the gene. These unique features suggest the possibility of insertional mutagenesis at this location. Using CRISPR/Cas9-based homology-directed repair in primary human CD4+ T cells, we directly modeled the effects of HIV integration within BACH2. Integration of the HIV long terminal repeat (LTR) and major splice donor increased BACH2 mRNA and protein levels, altered gene expression, and promoted selective outgrowth of an activated, proliferative, and T regulatory–like cell population. In contrast, introduction of the HIV-LTR alone or an HIV-LTR-major splice donor construct into STAT5B, a second common HIV IS, had no functional impact. Thus, HIV LTR-driven BACH2 expression modulates T cell programming and leads to cellular outgrowth and unique phenotypic changes, findings that support a direct role for IS-dependent HIV-1 persistence.

This article is featured in Top Reads, p.

The replication cycle of retroviruses, including HIV, poses a unique problem for therapeutic intervention because of integration into host chromosomal DNA. Although antiretroviral therapy (ART) can suppress viral replication and person-to-person spread, infected cells persist for the life of the infected individual. Thus, understanding the cellular and viral determinants that govern HIV persistence is critical for the design of interventions for HIV cure.

HIV integrates in a semirandom fashion and is enriched in transcriptional units (1, 2) and near superenhancer regions (3). Strikingly, HIV integration site (IS) distribution studies have revealed unusually frequent integrations within the transcriptional unit of a few genes, most notably BACH2 and STAT5B (4–6). Importantly, the vast majority of BACH2 ISs have been reported in the forward direction relative to gene polarity, and nearly all cluster in the intron immediately preceding the first coding exon (4–8). In contrast, although ISs in the STAT5B locus are also predominantly located upstream of the coding sequences, they show no clear bias toward the forward orientation (4–8).

Long terminal repeats (LTRs) contain promoter, enhancer, and transcription terminating elements (9). HIV integration into specific chromosomal loci has been associated with cell proliferation in vivo (4, 5). Based on the unique high frequency of detection of HIV ISs in BACH2 and STAT5B in individuals with undetectable viral loads on ART, Cesana et al. (10) explored the possibility that HIV LTRs direct expression of the coding sequences of these genes. When integrated upstream of the first coding exon, expression of a cellular protein can potentially be initiated under control of the HIV promoter (LTR) by splicing from HIV splice donor sequences into the splice acceptors of the downstream genes. The major splice donor (MSD) of HIV, immediately downstream of the 5′ LTR, is predicted to be the most efficient means of forming chimeric transcripts with the capacity to encode these cellular proteins. To date, proviral genome structures have been reported for only four integration events within BACH2. In all cases, however, the integrated proviruses retained an intact 5′ LTR and MSD as would be required to generate chimeric transcripts (8). Consistent with the concept, Cesana et al. (10) detected LTR-driven chimeric transcripts at BACH2 in 9% and at STAT5B in 31% of a cohort of HIV-infected individuals, including individuals receiving ART. Further, using a comprehensive, longitudinal RNA analysis of 44 subjects treated with suppressive ART early after HIV infection, we identified T cells bearing BACH2 hybrid transcripts in >40% of subjects, findings consistent with clonal expansion of T cells bearing this IS (Lisa M. Frenkel and James I. Mullins, unpublished observations). Together, these observations support the concept that retroviral insertion in BACH2, STAT5B, and potentially other key sites may directly alter T cells in a manner that promotes progressive clonal expansion via cell proliferation and/or survival (11). 
However, this concept and the functional impact of candidate novel transcriptional products in primary human T cells have not been directly tested.

In this study, we used gene editing to directly recapitulate the formation of hybrid transcripts reported in patient cells and thereby allow for interrogation of impacts at common HIV IS hotspots, including BACH2 and STAT5B (10). To achieve this goal, we used a dual-delivery system of CRISPR/Cas9 and recombinant adeno-associated virus (AAV) donor templates to mediate homology-directed repair (HDR)–based knock-in of alternative LTR cassettes within both BACH2 and STAT5B in primary CD4+ T cells. While seeking to uncover the mechanisms behind the long-term persistence of cells latently infected with HIV, our novel HIV-mediated gene expression system revealed subtle nuances of T cell biology driven by the transcription factor BACH2. Our combined observations demonstrate that HDR editing of BACH2 using the LTR-MSD leads to increased BACH2 expression and generation of a unique T cell population that exhibits features of T regulatory (Treg) cells and manifests enhanced proliferation and selective outgrowth in comparison with control edited populations.

Gibson Assembly or T4 DNA Ligase (New England Biolabs) was used to clone gene-synthesized or PCR-amplified fragments (Integrated DNA Technologies [IDT, Newark, NJ] or Twist Bioscience). For each genomic target, plasmids were created to have 600-bp homology arms homologous to the respective cut sites. All inserted nucleotide sequences and ligation products were confirmed by Sanger sequencing. Stellar competent cells (TaKaRa) were transformed with the finalized plasmids to generate stocks for AAV production.

All donor templates designed for editing experiments were cloned into AAV plasmid backbones as previously described (12, 13). Recombinant AAV6 stocks were produced in HEK293T cells as previously described (13). Viral titers were determined by qPCR using AAV-specific primers and probe as described previously (12).

Cryopreserved PBMCs isolated from volunteers were purchased from the Fred Hutchinson Cancer Research Center. Upon thawing, CD4+ cells were isolated by negative selection (EasySep Human CD4+ Enrichment Kit; STEMCELL Technologies) and cultured in RPMI 1640 media with 20% FBS, 1× GlutaMAX (Life Technologies), 10 nM HEPES (Life Technologies), and 1× 2-ME (Life Technologies). Cells were cultured in 50 ng/ml human IL-2 (PeproTech) at between 0.5 and 1 × 10⁶ cells/ml in flat-bottom culture plates at 37°C with 5% CO2.

Single guide RNAs (sgRNAs) were designed to target genomic sites at the BACH2, STAT5B, and AAVS1 loci using the CCTop online tool (14) and were synthesized by Synthego. Guides were reconstituted in buffer TE (Synthego) with a final concentration of 50 pmol/μl, aliquoted, and stored at −80°C. Alt-R HiFi Cas9 was purchased from IDT.

After negative CD4+ selection, cells were activated for 3 d with Dynabeads Human T-Activator CD3/CD28 for T Cell Expansion and Activation (Life Technologies) according to the manufacturer’s instructions. After activation, beads were magnetically removed, and cells were cultured overnight. Cells were then washed twice in PBS and resuspended in LONZA buffer P3 prior to nucleofection. Cas9 nuclease and sgRNAs were complexed at a 1:2.5 ratio and delivered to cells by the LONZA 4-D Nucleofection System according to the manufacturer’s instructions at a concentration of 2 μM per 1.0 × 10⁶ cells. Cells were then transferred into prewarmed media formulated as described earlier but supplemented with 2.5% FBS. AAV6 vectors carrying the corresponding repair templates were used to transduce edited cells at 30% culture volume. Twenty-four hours later, standard media supplemented with 20% FBS were added to all culture wells. Cells were maintained at 0.5–1.0 × 10⁶ cells/ml for the duration of the experiments. Media including fresh IL-2 were replaced every 2–3 d. The following sgRNA sequences were used in editing experiments: BACH2 G1, 5′-AACTGCTTGAGCCCAAAAGG-3′; BACH2 G3, 5′-CCAGCAGTAAGTCTGTTGTA-3′; STAT5B G3, 5′-GAGGCTACCACCTCACCTAG-3′; and AAVS1 P1, 5′-ATTCCCAGGGCCGGTTAATG-3′.

Genomic DNA was extracted from all experimental samples (DNeasy Blood and Tissue; Qiagen) and used in digital droplet PCR (ddPCR) to assess HDR or in conventional PCR to quantify nonhomologous end-joining (NHEJ). To assess NHEJ, we amplified gDNA by PCR using PrimeSTAR GXL Polymerase (Clontech) and custom primers sitting 200–300 bp flanking the predicted cut site. Amplicons were then Sanger sequenced (Genewiz), and the resulting .ab1 sequence files served as input for Inference of CRISPR Edits Analysis developed by Synthego. Primer sequences are available on request.

Each sgRNA generated in this study was submitted to CCTop (https://crispr.cos.uni-heidelberg.de/) to examine putative off-target CRISPR/Cas9 localization to the human genome (hg38) with the additional parameters: 20-bp target site length, 12-bp core length, and four total potential mismatches (maximum of two core mismatches). The top 8 predicted cut sites were investigated by sequencing as described earlier for evidence of off-target cutting (Supplemental Fig. 1), and the primer sequences are available on request.
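The mismatch criteria above can be illustrated with a minimal Python sketch. It assumes that the 12-bp core is the PAM-proximal (3′) end of the 20-bp target, which is how CCTop is generally described; the candidate sites are hypothetical sequences made up for illustration, and the function names are not part of CCTop.

```python
def count_mismatches(guide: str, site: str, core_len: int = 12) -> tuple[int, int]:
    """Return (total mismatches, mismatches inside the PAM-proximal core)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    # Assumption: the core is the last `core_len` bases (3' end, next to the PAM).
    core_start = len(guide) - core_len
    total = core = 0
    for i, (g, s) in enumerate(zip(guide.upper(), site.upper())):
        if g != s:
            total += 1
            if i >= core_start:
                core += 1
    return total, core


def passes_offtarget_filter(guide: str, site: str, max_total: int = 4,
                            max_core: int = 2, core_len: int = 12) -> bool:
    """True if the site would be retained as a putative off-target."""
    total, core = count_mismatches(guide, site, core_len)
    return total <= max_total and core <= max_core


# BACH2 G3 protospacer from the Methods; the two sites below are hypothetical.
BACH2_G3 = "CCAGCAGTAAGTCTGTTGTA"
distal_site = "GGTGCAGTAAGTCTGTTGTA"  # 3 mismatches, all outside the core
core_site = "CCAGCAGTAAGTCTGTTCAT"    # 3 mismatches, all inside the core
```

Under these settings the first hypothetical site is retained as a candidate off-target, whereas the second is rejected because it exceeds the two-mismatch limit within the core.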

RNA was extracted from 3–5 × 10⁵ cells from all experimental samples using the RNeasy Micro Kit (Qiagen). A total of 100–200 ng of RNA was used to synthesize cDNA using the iScript Advanced cDNA Synthesis Kit for RT-qPCR (Bio-Rad Laboratories) according to the manufacturer’s instructions. Synthesized cDNA was used as the template for hybrid/endogenous transcript ddPCR, as well as sequencing reactions.

Flow cytometry was performed on an LSR II (BD Biosciences) flow cytometer, and data were analyzed using FlowJo software (Tree Star). Cells were labeled with fluorescent Abs and the viability stain Alexa Fluor 350 carboxyl acid NHS ester (Invitrogen). Surface proteins were stained by incubation with Abs for 30 min at 4°C. In cases where intracellular protein expression was assessed, cells were fixed and permeabilized following the surface stain. For FOXP3, CTLA-4, and HELIOS, fixation/permeabilization was carried out with the True-Nuclear Transcription Factor Buffer Set (BioLegend) according to the manufacturer’s instructions. For intracellular cytokine production, cells were cultured in media containing 50 ng/ml PMA and 1 μg/ml Ionomycin (Millipore Sigma) and 1 μg/ml GolgiStop (BD Biosciences) for 5 h at 37°C. Cells were then fixed and permeabilized with Cytofix/Cytoperm (BD Biosciences) and incubated with cytokine-specific Abs. To assess responsiveness to IL-2, we performed stimulation and staining for pStat5 as described previously (15), using 5 ng/ml IL-2 (PeproTech) to stimulate the cells. To assess cellular proliferation by flow cytometry, we used the CellTrace Far Red Cell Proliferation Kit (Invitrogen) according to the manufacturer’s instructions. Briefly, cells were incubated in PBS with the appropriate concentration of CellTrace stain for 20 min at 37°C. Five times the staining volume of media was then added to all stained wells for 5 min to remove free dye, and cells were pelleted by centrifugation and resuspended in fresh media. Cells were cultured for 96 h before proliferation was assessed by flow cytometry.

Abs used in this study include (from BioLegend) IFNg (catalog number [cat.] 502516), TNFa (cat. 502944), CTLA-4 (cat. 349908), FOXP3 (cat. 320208), CD4 (cat. 300530), LAG3 (cat. 369314), ICOS (cat. 313528), PD-1 (cat. 329922 and 329920), CD69 (cat. 310910), CD44 (cat. 103028); (from BD Biosciences) p-STAT5 (cat. 612599), CD3 (cat. 349201), CD25 (cat. 335789), CD127 (cat. 563086), CD62L (cat. 562720), CD8 (cat. 560662); (from eBioscience) HELIOS (cat. 501124891/61988342) and TIGIT (cat. 5011268/12950042); and (from Life Technologies) IL-2 (cat. 25-7029-42). All Abs were used at a 1:100 dilution except for p-STAT5 (1:10) and TIGIT (1:50).

ddPCR was used to assess targeting efficiency in edited samples (Bio-Rad Laboratories). All primers and probes were designed in Primer-Blast (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) using the Homo sapiens genome as the reference database and were purchased from IDT. Briefly, a forward primer and HEX probe were placed on the LTR region of the inserted cassette, and a reverse primer was located downstream of the 3′ homology arm on the endogenous locus. The same forward primer and probe were used to assess editing at all genomic loci, while the reverse primer was specific to the gene targeted. The same primer and probe sets were used to assess editing events targeted by both the Solo-LTR and LTR-MSD constructs. A single control set of primers and a probe were designed to amplify a region of the STAT5B gene to assess the total number of alleles interrogated in each ddPCR. All ddPCRs were duplexed with the insert set of primers and probes, as well as the control set of primers and probes, to assess the allelic frequency of HDR. Targeting efficiency was expressed as the ratio of the concentration of targeted alleles (positive for LTR insert) to that of the total alleles (positive for STAT5B control) present. Primer and probe sequences are available on request. ddPCR was performed according to the manufacturers’ instructions.
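The targeting-efficiency ratio described above can be sketched as follows. This is an illustrative calculation rather than the analysis software used in the study: it assumes the standard Poisson correction for droplet occupancy, and the droplet counts in the example are hypothetical. Because both assays are duplexed in the same droplets, the droplet volume cancels out of the ratio.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1 - positive / total)


def targeting_efficiency(insert_positive: int, control_positive: int,
                         total_droplets: int) -> float:
    """Ratio of LTR-insert alleles to total alleles (STAT5B control assay)."""
    insert = copies_per_droplet(insert_positive, total_droplets)
    control = copies_per_droplet(control_positive, total_droplets)
    if control == 0:
        raise ValueError("control assay detected no alleles")
    return insert / control


# Hypothetical droplet counts for a single duplexed well.
example = targeting_efficiency(insert_positive=1200, control_positive=4500,
                               total_droplets=15000)
print(f"HDR allele frequency: {example:.1%}")
```

Working with Poisson-corrected copy numbers rather than raw positive-droplet counts matters most for the control assay, where many droplets contain more than one allele.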

Expression of hybrid HIV/BACH2 and HIV/STAT5B transcripts in edited samples was quantified using custom ddPCR primers and probes designed by Cesana et al. (10), and transcript identities were confirmed by Sanger sequencing. Expression of BACH2, STAT5B, IL10, TGFB1, and HPRT was quantified using TaqMan Gene Expression Assays (Applied Biosystems): BACH2 (Hs00222364_m1, exon boundary 8–9), STAT5B (Hs00273500_m1, exon boundary 16–17), IL10 (Hs00961622_m1, exon boundary 4–5), TGFB1 (Hs00998133_m1, exon boundary 6–7), and HPRT (Hs99999909_m1, exon boundary 6–7). The concentration of hybrid or endogenous transcripts was normalized to the concentration of HPRT transcripts duplexed in the same reaction.
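As a small worked illustration of this normalization, the sketch below divides each transcript concentration by the HPRT concentration from the same reaction and then forms a hybrid-to-endogenous ratio. It assumes that the hybrid and endogenous assays were run in separate wells, each duplexed with HPRT; the function names and the concentrations used in the example are hypothetical.

```python
def normalized_level(target_conc: float, hprt_conc: float) -> float:
    """Transcript concentration relative to HPRT measured in the same reaction."""
    if hprt_conc <= 0:
        raise ValueError("HPRT concentration must be positive")
    return target_conc / hprt_conc


def hybrid_to_endogenous_ratio(hybrid_conc: float, hprt_in_hybrid_well: float,
                               endogenous_conc: float,
                               hprt_in_endogenous_well: float) -> float:
    """Ratio of HPRT-normalized hybrid to HPRT-normalized endogenous transcript."""
    endogenous = normalized_level(endogenous_conc, hprt_in_endogenous_well)
    if endogenous == 0:
        raise ValueError("endogenous transcript was not detected")
    return normalized_level(hybrid_conc, hprt_in_hybrid_well) / endogenous


# Hypothetical concentrations in copies/ul.
ratio = hybrid_to_endogenous_ratio(50.0, 1000.0, 200.0, 800.0)
```

Normalizing each well to its own HPRT signal corrects for differences in cDNA input between reactions before the two transcripts are compared.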

Chemiluminescent Western blot images and Ponceau S stain images were captured on the ChemiDoc MP imaging system. Quantification was performed using the ImageJ program. The Ponceau S stain band was used for quantification and normalization: BACH2 molecular mass, 130 kDa; ACTB (b-actin) molecular mass, 42 kDa.

Approximately 1 × 10⁶ CD4+ T cells from a late culture time point (day 48 postediting) were collected for RNA extraction (Qiagen) and poly-A selected mRNA library preparation (TruSeq RNA Library Prep Kit v2; Illumina). Cells were collected from two editing experiments. A total of 33–41 million raw paired-end reads per sample were generated. Reads were mapped to a custom hg38 human reference genome that divided the BACH2 locus into two regions: BACH2_5p reads mapping upstream of the target site (hg38 gene position: 115,649) and BACH2_3p reads mapping downstream of the target site. Additionally, the LTR-MSD sequence was appended to hg38. STAR 2.7 was used to filter and map reads (16). HTSeq was used to count the overlap of reads with genes (17). Count normalization and differential gene expression analysis were performed using DESeq2 (18). The data produced by this experiment were deposited in the NCBI BioSample database for public access with the accession number PRJNA733175 (https://www.ncbi.nlm.nih.gov/bioproject/733175).
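For readers unfamiliar with the count normalization step, the following Python sketch illustrates the median-of-ratios idea behind DESeq2 size factors. It is a simplified teaching example and not the DESeq2 implementation (an R package); the gene labels and counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts: dict[str, list[int]]) -> list[float]:
    """Median-of-ratios size factor for each sample (columns of `counts`)."""
    n_samples = len(next(iter(counts.values())))
    log_ratios: list[list[float]] = [[] for _ in range(n_samples)]
    for gene_counts in counts.values():
        # Genes with a zero count in any sample are skipped, because the
        # log geometric mean is undefined for them.
        if any(c <= 0 for c in gene_counts):
            continue
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for j, c in enumerate(gene_counts):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has nonzero counts in every sample")
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts: dict[str, list[int]],
              factors: list[float]) -> dict[str, list[float]]:
    """Divide each sample's raw counts by that sample's size factor."""
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [5, 10], "geneD": [0, 3]}
factors = size_factors(raw)
normalized = normalize(raw, factors)
```

In this toy example the second sample receives a size factor twice that of the first, so genes whose raw counts differ only because of sequencing depth end up with equal normalized values.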

Gene lists determined by differential expression were passed through the Gene Ontology (GO) (19, 20) biological process complete list and analyzed by PANTHER overrepresentation test (released April 7, 2020) (21) using the Fisher’s exact test with false discovery rate (FDR) correction. An FDR p < 0.05 was used as the cutoff. The GO annotation version and release date were GO Ontology database DOI: 10.5281/zenodo.3727280 and March 23, 2020, respectively.
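The statistics behind an overrepresentation test can be sketched in a few lines of Python. This is an illustrative reimplementation and not the PANTHER code: it assumes a one-sided Fisher's exact test (the hypergeometric upper tail) and assumes that the FDR correction is the Benjamini-Hochberg procedure. The numbers in the example are toy values.

```python
from math import comb


def overrepresentation_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    k: genes from the input list annotated to the GO term
    n: size of the input list
    K: genes annotated to the GO term in the reference set
    N: size of the reference set
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 2 of 3 listed genes fall in a term covering 4 of 10 reference genes.
p_toy = overrepresentation_p(k=2, n=3, K=4, N=10)
fdr_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A GO term is then called overrepresented when its adjusted p value falls below the chosen cutoff, which was 0.05 in this study.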

Statistical analyses were performed in GraphPad Prism 7 (GraphPad) and R Software package. The analyses performed are indicated in the figure legends. Combined experimental values are shown as mean ± SD (p values: NS, p ≥ 0.05, *p = 0.01–0.05, **p = 0.001–0.01, ***p = 0.0001–0.001, ****p ≤ 0.00001).

To directly assess the impact of HIV integration into the BACH2 locus, we used HDR-based gene editing to insert different HIV component constructs into primary human CD4+ T cells. Designer nucleases and rAAV6 donor templates were codelivered using a highly efficient HDR-editing platform previously developed in our laboratory (12, 13, 22–24). The CRISPR/Cas9 system was used to achieve efficient nuclease-based DNA targeting using synthetic sgRNAs targeting a region within intron 5 of BACH2 (BACH2 G1 and G3) and within intron 1 of STAT5B (STAT5B G1) (Fig. 1A, Supplemental Fig. 1), regions previously found to have multiple distinct HIV integrations in patient cohorts (4).

FIGURE 1.

Gene editing using AAV-CRISPR/Cas9 for targeted HIV-LTR insertion at the BACH2 intron 5 locus. (A) Schematic of the CRISPR/Cas9-AAV HDR-based editing approach for HIV-LTR insertion within intron 5 of the BACH2 locus. The AAV donor constructs consist of a single HIV-LTR element (Solo-LTR) or an LTR followed by viral sequence leading up to the MSD (LTR-MSD), flanked by 600-bp homology arms to BACH2 and the AAV inverted terminal repeats (AAV ITRs) required for AAV packaging and viral production. The CRISPR/Cas9 target site for guide 3 (G3) is indicated by a black arrow; relative location of a known HIV insertion site at BACH2 intron 5 is indicated by red arrow (5). (B) Experimental protocol and timeline. (C) Frequency of indels generated by BACH2 and STAT5B guides as measured by Inference of CRISPR Edits (ICE). For BACH2 G1, n = 6; BACH2 G3, n = 5; and STAT5B G1, n = 4. (D) Guide-specific HDR-editing efficiency of AAV HIV-LTR constructs as measured by ddPCR in different PBMC donors. For BACH2 G1, n = 3; BACH2 G3 (n = 4 displayed) from total, n = 7; and STAT5B G1, n = 4. All data are represented as mean, and error bars indicate SD.

Minor limitations in sgRNA design, because of oligo specificity and the 5′-NGG-3′ Cas9 PAM site requirement, restricted targeting to 2 bp upstream (hg38 BACH2 gene position: 115,649) of an in vivo recovered BACH2 IS (6) (Fig. 1A). Similarly, targeting was directed to intron 1, the intron preceding the start of the coding sequence of STAT5B (hg38 STAT5B gene position: 29,125) 1 bp downstream of an in vivo recovered IS (6).

As shown in Fig. 1B, CD4+ T cells were activated, electroporated for delivery of Cas9 protein and guide RNA as ribonucleoprotein (RNP) complexes, and maintained in culture for up to 7 wk for analyses. Initial Cas9 cutting efficiencies at the BACH2 and STAT5B loci were assessed by NHEJ repair readout, a measure of the NHEJ-mediated insertions and deletions (indels) that manifest after repair of double-strand DNA breaks. An average of 97% (±2.4 SD) of indels was found at the intended BACH2 target site and 96% (±1.2) of indels at the STAT5B target site, indicating highly efficient on-target cutting at both loci (Fig. 1C). Off-target cleavage was also measured for BACH2 G3 and STAT5B G1 based on in silico predictions and found to be minimal (Supplemental Fig. 1). Additional BACH2 sgRNAs were assessed for both on-target and off-target NHEJ repair based on in silico predicted guide design; however, appreciable off-target cleavage was observed for BACH2 G1, and this guide was therefore excluded from further testing (Supplemental Fig. 1).

FIGURE 2.

Quantification of integrated HIV-LTR cassettes showing specific expansion of BACH2 LTR-MSD–targeted primary CD4+ T cells over time. (A) Representative outgrowth of Solo-LTR and LTR-MSD–edited populations from a single experiment targeting BACH2 over time, measured by ddPCR. Error bars represent SD between two biological replicates in each experiment. (B) Fold expansion of edited populations targeting BACH2 over time in different PBMC donor cells. Representative (n = 3 displayed) from total, n = 7. (C) Proliferation of CD4+ T cell samples from BACH2 targeting experiments assessed over time. Representative (n = 3 displayed) from total, n = 14. (D) Representative outgrowth of Solo-LTR and LTR-MSD–edited populations from a single experiment targeting STAT5B over time, measured by ddPCR. Error bars represent SD between two biological replicates in each experiment. (E) Fold expansion of edited populations targeting STAT5B over time in different PBMC donor cells. Representative (n = 3) is shown, n = 4. (F) Proliferation of CD4+ T cell samples from STAT5B-targeting experiments assessed over time. Representative (n = 3) is shown, n = 5. All data are represented as mean and error bars indicate SD.

Next, to recapitulate HIV ISs that faithfully mirror those identified in relevant patient populations, four alternative AAV donor cassettes were generated. These donors contained either a single HIV-LTR (Solo-LTR) or a 5′ HIV-LTR followed by the downstream MSD sequence (LTR-MSD) and were designed to target either BACH2 or STAT5B. As shown in Fig. 1B, CD4+ T cells were initially activated and cultured for 4 d prior to electroporation for RNP delivery and subsequently transduced with rAAV6 donor viruses. HDR-mediated targeting of the AAV cassettes was measured by ddPCR, spanning the 600-bp donor template homology arms. HDR rates at the BACH2 locus were 33.5% (±1.2%) for Solo-LTR and 23.4% (±2.1%) for LTR-MSD. Higher targeting rates of 49.5% (±6.8%) and 37.7% (±7.5%), respectively, were obtained within the STAT5B locus (Fig. 1D). Regardless of initial editing efficiency or targeted locus, cellular viability was minimally impacted by RNP and rAAV codelivery or the resulting genomic manipulations (Supplemental Figs. 1, 2).

To determine the effects of HIV LTR insertions at specific cellular loci, we cultured HDR-edited and control primary CD4+ T cell populations for up to 7 wk postediting. Strikingly, the proportion of BACH2 LTR-MSD–edited cells progressively increased over time, comprising ∼70% of the culture at 7 wk (Fig. 2A). When normalized to initial targeting frequency, BACH2 LTR-MSD–edited cells expanded by nearly 3-fold (Fig. 2B). These growth characteristics were dependent on the presence of the HIV MSD, because Solo-LTR–edited cells at the BACH2 locus did not expand (Fig. 2A, 2B). Absolute cell counts showed that BACH2 LTR-MSD–edited cells continued to expand over 40 d of culture, while Solo-LTR–edited cell numbers, similar to controls, plateaued after ∼14 d (Fig. 2C). In contrast, STAT5B-edited cells did not exhibit a proliferative phenotype with either construct (Fig. 2D, 2E). Consistent with ddPCR-based HDR tracking, STAT5B-edited cells, targeted with either the LTR-MSD or the Solo-LTR AAV donor, did not differ in cell counts when compared with controls (Fig. 2F). As an additional control, we used HDR-based editing of AAVS1, a “safe harbor” chromosomal location not associated with HIV IS or hybrid transcripts (25). Similar to control T cells and STAT5B-edited cells, AAVS1 LTR-edited cells did not exhibit differences in T cell expansion during 4 wk of culture (Supplemental Fig. 2). To control for an idiosyncratic impact of the specific CRISPR/Cas9 target site within intron 5 of BACH2, we used an alternative CRISPR guide (BACH2 G1) and AAV targeting cassette to introduce the LTR-MSD cassette into an alternative location in intron 5. This led to a similar selective expansion of BACH2 LTR-MSD–edited T cells, further validating these findings (Supplemental Fig. 1). Together, these results demonstrate that targeting of an HIV LTR-MSD within intron 5 of BACH2 uniquely promotes the competitive outgrowth of HDR-targeted primary CD4+ T cells.
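The normalization to initial targeting frequency amounts to a simple ratio, sketched below in Python. The example combines the mean initial HDR rate for BACH2 LTR-MSD (23.4%) with the ∼70% frequency reported at 7 wk in a single experiment, so it is an approximate illustration rather than a reanalysis of the data; the function name is hypothetical.

```python
def fold_expansion(initial_fraction: float, later_fraction: float) -> float:
    """Edited-allele frequency at a later time point relative to the initial one."""
    if not 0 < initial_fraction <= 1:
        raise ValueError("initial_fraction must be in (0, 1]")
    if not 0 <= later_fraction <= 1:
        raise ValueError("later_fraction must be in [0, 1]")
    return later_fraction / initial_fraction


# Approximate values taken from the text: 23.4% initially, about 70% at 7 wk.
bach2_ltr_msd = fold_expansion(0.234, 0.70)
print(f"BACH2 LTR-MSD fold expansion: {bach2_ltr_msd:.2f}")
```

A value near 1 indicates that edited cells kept pace with unedited cells in the same culture, as reported for the Solo-LTR and STAT5B conditions, whereas values above 1 indicate selective outgrowth.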

Next, we investigated whether LTR hybrid transcripts were produced by the targeted cell populations. Primers were positioned in the LTR trans-cassette and in either exon 6 of BACH2 (Fig. 3A) or exon 2 of STAT5B (Fig. 3B). As expected, only the BACH2- and STAT5B-edited cells targeted with the LTR-MSD cassette exhibited qualitative levels of LTR hybrid transcripts. The identities of these hybrid transcripts were confirmed by DNA sequencing (data not shown).

FIGURE 3.

BACH2 LTR-MSD editing promotes increased expression of LTR-hybrid mRNA and BACH2 protein. (A and B) Schematic of BACH2/STAT5B endogenous and LTR-hybrid mRNA; white and gray boxes indicate noncoding and coding exons (Ex), respectively. The integrated LTR-MSD cassette is shown in black. Arrows represent primer sets for RT-ddPCR capturing hybrid transcripts (hF1/hR1). Bars indicate amplified cDNA products from the endogenous or hybrid primer sets with dashed lines showing spliced regions. (C) Ratio of hybrid BACH2 or STAT5B to endogenous BACH2 or STAT5B transcripts, respectively, standardized to HPRT transcripts, generated by LTR-MSD–targeted samples at early (day 7) and late (day 35) time points. n = 6 for BACH2 and n = 4 for STAT5B. (D and E) Fold change of BACH2 and STAT5B endogenous transcripts standardized to HPRT transcripts relative to the average expression across all sample groups within an experiment. Levels at early and late time points measured by ddPCR, n = 6 (D) or 3 (E). (F) Western blot analysis of BACH2 (upper panel) and STAT5B (lower panel) protein expression from a representative experiment with two biological replicates for each HDR-targeted sample. β-Actin expression was assessed in parallel as a loading control. The representative Western blots were performed twice independently. (G) Normalized and averaged signal from the Western blots in (F). Significance was determined by one-way ANOVAs with Tukey’s multiple comparison tests (D, E, and G), as well as Student two-tailed t tests (C). All data are represented as mean, and error bars indicate SD. *p = 0.01–0.05, **p = 0.001–0.01, ****p ≤ 0.00001.

To determine whether the selective outgrowth of BACH2 LTR-MSD–edited T cells correlated with production of hybrid transcripts, we quantified the relative amount of hybrid versus endogenous transcript in BACH2 or STAT5B HDR-edited T cell populations using RT-ddPCR (Fig. 3A, 3B) at early (day 7) and late (day 35) cell culture time points. In all assays, transcript levels were standardized to those of a housekeeping gene, hypoxanthine guanine phosphoribosyl transferase (HPRT). In BACH2 LTR-MSD–edited cells, the level of hybrid transcripts, as a fraction of total BACH2 transcripts, increased 2.2-fold from early to late culture time points (p < 0.01) (Fig. 3C). In contrast, no change in hybrid transcript was observed in the STAT5B-edited cells (Fig. 3C). The levels of endogenous BACH2 and STAT5B transcripts were also evaluated over time, and interestingly, BACH2 transcript levels increased modestly in BACH2 LTR-MSD–edited cells (1.3-fold ± 0.2; p < 0.05) (Fig. 3D). This increase was not observed in BACH2 Solo-LTR–edited samples, suggesting that generation of hybrid BACH2 transcripts may facilitate this change. In contrast, there was no significant change in endogenous STAT5B transcripts in LTR-MSD–edited cells (1.2-fold ± 0.2; p > 0.05) compared with control cells (Fig. 3E). Consistent with these findings, BACH2 protein levels increased by up to 2.5-fold by day 21 in LTR-MSD–edited cells (p < 0.01), although there was no significant change in STAT5B LTR-MSD–edited cells (Fig. 3F, 3G). Together, these data demonstrate that expression of BACH2 can be driven by transcription initiated at the HDR-integrated LTR and is likely responsible, at least in part, for the increased proliferation and/or selective survival of HDR-edited CD4+ T cells.

We hypothesized that the alterations in BACH2 gene expression in edited cells would alter the phenotype of these populations. We performed RNA gene expression analysis in HDR-edited populations at ∼7 wk postediting. Insertion of the LTR-MSD resulted in differential expression of 452 genes relative to cells edited with Solo-LTR (adjusted [adj.] p < 0.05): 309 upregulated and 143 downregulated in LTR-MSD. Established gene sets were characterized, including (1) Treg cell markers, (2) T cell exhaustion markers, and (3) cell-cycle-associated genes (Fig. 4A and data not shown). IL7R (CD127) was 7.5-fold lower (adj. p = 1.1E−7) and IL2RA (CD25) was 5.9-fold higher (adj. p = 7.6E−4), consistent with a Treg phenotype (26). Markers for T cell exhaustion, LAG-3 and TIGIT, were also more abundantly expressed in the LTR-MSD population (11.3-fold, adj. p = 3.1E−4 and 3.7-fold, adj. p = 0.0051, respectively). Finally, at least 72 genes associated with cell-cycle processes (GO:0022402) were differentially expressed in LTR-MSD–edited cells, including MKI67 (3.7-fold, adj. p = 1.8E−5), TOP2A (2.9-fold, adj. p = 0.012), FOS (35.6-fold, adj. p = 3.5E−4), ASPM (3.1-fold, adj. p = 4.5E−4), and H2AFX (2.4-fold, adj. p = 6.1E−4) (Fig. 4A and data not shown). An unbiased GO biological processes overrepresentation test for the 452 differentially expressed genes in LTR-MSD–edited cells revealed a list of 246 gene sets with an FDR < 0.02 (data are available on request). The top 30 overrepresented biological processes gene sets with fold enrichment are shown, with the majority of these indicating genes enriched in cell-cycle processes (Fig. 4B). An additional comparison between LTR-MSD and unedited cells revealed 154 differentially expressed genes (adj. p < 0.05): 82 upregulated and 72 downregulated in LTR-MSD (Fig. 4B).

Next, differential expression at the BACH2 locus (Fig. 4C) was interrogated. Reads mapping to the inserted LTR cassette region (designated gene name: LTR) were 16.3-fold higher in LTR-MSD–edited cells (adj. p = 6.9E−4). This was associated with a 1.3-fold increase in transcript levels downstream of the insertion site (designated gene name: BACH2_3p; adj. p = 0.73), mirroring the RT-ddPCR results (Fig. 4C). Interestingly, reads mapped to the region upstream of the BACH2 insertion site (designated gene name: BACH2_5p) were 4.1-fold lower (adj. p = 3.7E–3) in LTR-MSD–edited cells, implying that LTR promoter-initiated transcription dominates over the host BACH2 promoter (Fig. 4C).

FIGURE 4.

RNA-seq analysis of edited populations showing unique gene expression profile and hybrid transcripts in BACH2 LTR-MSD–targeted T cells. (A) Volcano plot showing the 452 differentially expressed genes between BACH2 LTR-MSD– and Solo-LTR–edited CD4 T cell populations at 48 d of culture (309 genes upregulated and 143 downregulated in LTR-MSD compared with Solo-LTR). Each dot represents a gene, with gray not reaching significance and less than an absolute log2 fold change of 1, green not reaching significance but an absolute log2 fold change > 1, blue having an adj. p < 0.05 but less than an absolute log2 fold change of 1, and red having an adj. p < 0.05 and an absolute log2 fold change > 1. Log2 fold change > 0 means higher expression in LTR-MSD and vice versa. A select number of genes are annotated. (B) Shown are the top 30 alphabetically sorted gene sets, fold-enrichment scores, and FDR determined from a PANTHER (Protein ANalysis THrough Evolutionary Relationships) overrepresentation test of Homo sapiens GO biological processes. (C) Coverage map of RNA-seq reads mapped to the hg38 chr6 BACH2 locus. Gene polarity is displayed right to left as on chromosome 6 for mock-edited control cells, Solo-LTR–edited, and LTR-MSD–edited cells. Accumulation of reads is shown as histograms scaled to read count (y-axis). Exons 1–9 are shown as orange rectangles with the LTR insertion into intron 5 indicated as a red bar.
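The color assignment described for panel (A) amounts to a two-threshold decision rule, sketched below. The function names are illustrative and are not taken from the study's analysis code. The caption does not say how values exactly at a threshold are handled, so treating them as below the threshold is an assumption here.

```python
from math import log2


def signed_log2(fold: float, higher_in_ltr_msd: bool) -> float:
    """Convert an 'x-fold higher/lower' statement into a signed log2 fold change."""
    return log2(fold) if higher_in_ltr_msd else -log2(fold)


def volcano_color(log2_fc: float, adj_p: float,
                  p_cutoff: float = 0.05, lfc_cutoff: float = 1.0) -> str:
    """Assign the volcano-plot color used in the caption.

    Values exactly equal to a cutoff are treated as not passing it (assumption).
    """
    significant = adj_p < p_cutoff
    large_change = abs(log2_fc) > lfc_cutoff
    if significant and large_change:
        return "red"
    if significant:
        return "blue"
    if large_change:
        return "green"
    return "gray"


# IL7R and H2AFX values are quoted from the text; the last two are hypothetical.
print(volcano_color(signed_log2(7.5, higher_in_ltr_msd=False), 1.1e-7))  # IL7R
print(volcano_color(signed_log2(2.4, higher_in_ltr_msd=True), 6.1e-4))   # H2AFX
print(volcano_color(0.5, 0.01))   # hypothetical gene: significant, small change
print(volcano_color(2.0, 0.20))   # hypothetical gene: large change, not significant
```

A 7.5-fold decrease corresponds to a log2 fold change of about −2.9, so IL7R falls in the significant, large-change class.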

To further investigate this altered cell phenotype, we used multicolor flow cytometry to evaluate the expression of proteins associated with T cell activation, exhaustion, and differentiation. BACH2 LTR-MSD–edited cells showed increased expression of multiple surface markers, including PD-1, CD69, LAG-3, TIGIT, and CD25 (Fig. 5A, 5B). BACH2 LTR-MSD–edited cells also exhibited a shift from a predominantly naive (CD44−/CD62L+) or memory (CD44+/CD62L−) phenotype to a predominantly T effector (CD44−/CD62L−) phenotype (Fig. 5A).

FIGURE 5.

BACH2 LTR-MSD–targeted T cells exhibit a unique proliferative, activated, Treg-like cell phenotype. (A) Representative flow plots and gating strategy of edited and control samples for the indicated cell surface and intracellular markers. (B) Pooled data of markers in (A) with n = 6 (PD-1, CD69, CD25, CD127), n = 5 (LAG-3, TIGIT), and n = 3 (FOXP3, CTLA-4, HELIOS). Data are shown as fold change compared with average mean MFI from all sample groups within an independent experiment. (C) Combined IL-10 and TGFB transcript levels from BACH2-edited samples standardized to HPRT transcripts at the late time point, n = 5. Data are shown as fold change compared with average mean expression from all sample groups within an independent experiment. (D) Pooled cytokine production from four experiments, standardized as in (B). Briefly, cells were stimulated with PMA, Ionomycin, and GolgiStop before fixing/perming and staining for cytokine production. (E) Left: representative flow plots tracking CD4 expression postediting. Right: combined data showing fold change in CD4 expression from experiments using four independent donors. (F) Left: representative flow plots of CellTrace (CT) proliferation dye staining after 96-h incubation. Briefly, cells were washed and incubated with CT dye for 20 min, then quenched with fresh media for 5 min. Cells were then spun down and cultured in full cytokine media for 96 h, at which point they were flowed for CT dilution. Right: pooled data of CT stain from three independent experiments. Data are represented as mean, and error bars indicate SD.
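The within-experiment standardization described for panels (B) through (D) can be sketched as follows, assuming that each sample group's MFI is divided by the mean MFI of all sample groups in the same independent experiment. The function name, group labels, and MFI values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def fold_change_vs_experiment_mean(mfi_by_group: dict[str, float]) -> dict[str, float]:
    """Divide each group's MFI by the mean MFI across all groups in one experiment.

    This removes experiment-to-experiment differences in overall staining
    intensity so that independent experiments can be pooled.
    """
    experiment_mean = mean(mfi_by_group.values())
    return {group: mfi / experiment_mean for group, mfi in mfi_by_group.items()}


# Hypothetical MFI values from a single experiment.
experiment = {"mock": 800.0, "Solo-LTR": 1000.0, "LTR-MSD": 1200.0}
for group, value in fold_change_vs_experiment_mean(experiment).items():
    print(f"{group}: {value:.2f}")
```

With these made-up values the experiment mean is 1000, so the three groups are reported as 0.80, 1.00, and 1.20.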

Notably, expression of proteins associated with a Treg phenotype was also impacted by LTR-driven BACH2 expression: LTR-MSD BACH2-edited cells showed an increase in the proportion of CD25high/CD127low cells compared with controls (1.4-fold ± 0.21, p < 0.001) (Fig. 5A, 5B). Strikingly, expression of intracellular proteins associated with Treg programming, including the key transcription factors FOXP3 and HELIOS, as well as the cytoplasmic pool of the inhibitory receptor CTLA-4, was increased in LTR-MSD–edited cells (Fig. 5A). FOXP3 and HELIOS were increased by 1.5-fold (±0.3, p < 0.05) and 1.6-fold (±0.28, p < 0.01), respectively, while CTLA-4 expression was increased by 2.1-fold (±0.32, p < 0.001) compared with controls (Fig. 5B). To further characterize this Treg-like phenotype, we assessed edited cells for production of suppressive cytokines, including IL-10 and TGF-β, which are commonly expressed by multiple Treg subsets (27). IL-10 transcript levels were significantly increased in the LTR-MSD BACH2-edited cells (1.8-fold ± 0.29, p < 0.001), while TGF-β1 transcripts were decreased (1.3-fold lower ± 0.16, p < 0.001) compared with controls (Fig. 5C).

The multiple surface and transcriptional phenotypic changes observed in LTR-MSD BACH2-edited cells prompted assessment of the capacity to express other key cytokines. Conventional Treg cells consistently exhibit a reduction in IL-2 expression and variable expression of inflammatory cytokines. IL-2 production in BACH2-edited cells was measured and found to be significantly reduced (1.2-fold lower ± 0.11, p < 0.05) compared with controls (Fig. 5D). There were no discernible differences in the production of two other key inflammatory cytokines, IFN-γ and TNF-α (Fig. 5D).

Interestingly, flow cytometry revealed a reduction in CD4 expression after HDR editing, a finding specific to BACH2 LTR-MSD–edited cells. Over 3 wk in culture, CD4+ cells decreased from 98 to 73% in the LTR-MSD–edited cells (average of four donors, p < 0.01) (Fig. 5E). Although CD4 loss varied among PBMC donors, a significant and reproducible loss of CD4 expression was observed in all but one donor, with the greatest reduction reaching 50% CD4 cells (data not shown). To investigate whether this process reflected downregulation of CD4 production versus surface internalization, we used a competitive flow cytometry–based CD4 mAb assay. Intracellular CD4 expression was not detected in the CD4 surface negative population, suggesting that CD4 loss reflected reduced transcription and/or altered posttranscriptional events (data not shown).

To begin to investigate the mechanism by which LTR-MSD BACH2-edited cells persisted and expanded in culture (Fig. 2C), we measured cellular proliferation using a flow cytometry–based proliferation dye dilution assay. At 5 wk postediting and growth in T cell expansion conditions, we observed that LTR-MSD cells proliferated 2.6-fold (p < 0.01) more than control cells (Fig. 5F), findings that correlate with enhanced growth rates of BACH2 LTR-MSD–targeted cells in culture.

Next, similar to CD25hi conventional Tregs (28), it was posited that increased CD25 expression in HDR-edited cells would facilitate an increase in signaling in response to IL-2. To initially test this idea, we performed flow analysis of p-STAT5 to measure IL-2 signaling. Cell populations were cultured in the absence of exogenous IL-2 for 24 h, stimulated with IL-2, and assessed by flow cytometry (29). No significant differences in p-STAT5 expression were observed in cells with integrated LTR-MSD over controls (Supplemental Fig. 3A). However, we reasoned that IL-2 starvation might differentially impact BACH2 LTR-MSD–targeted versus control cell populations. To this end, cells were exposed to different IL-2 doses (5 and 50 ng/ml IL-2) postediting, and proliferation was assessed by flow cytometry. Notably, in limiting IL-2 conditions, LTR-MSD–edited cells proliferated 1.9-fold more (p < 0.001) than the control cells (Supplemental Fig. 3B–D). Additionally, as shown in Supplemental Fig. 4, we performed LTR-MSD editing of BACH2 using input CD4 T cells that were depleted of natural Treg (nTreg) cells. In this setting, HDR editing again led to the generation and expansion of a Treg-like population. Thus, although LTR-MSD–edited Treg-like cells are not derived from nTreg cells, they exhibit phenotypic similarities to nTreg, including increased competitive fitness in an IL-2–limited environment.

In summary, our combined observations support the conclusion that BACH2 LTR-MSD editing results in increased BACH2 expression, thereby leading to generation of a unique T cell population exhibiting multiple Treg-like features: high levels of transcription factors, including FOXP3 and Helios; increased expression of inhibitory receptors, including CTLA-4 and LAG-3; upregulation of the high-affinity IL-2R (CD25high/CD127low); reduced expression of IL-2; and increased expression of IL-10. In parallel, this population exhibits evidence for increased cell activation (upregulation of CD69, PD-1, and acquisition of T effector cell phenotype) that correlates with enhanced proliferation and selective expansion in vitro.

To our knowledge, our findings provide the first direct demonstration that targeting of the HIV LTR to a defined genomic location can modulate the biological behavior of primary human T cells. These data show that two key retroviral elements, the HIV LTR and MSD, are sufficient to manipulate cellular function and phenotype through insertional mutagenesis mediated by LTR hybrid transcripts. Further, although performed in an ex vivo model system using truncated provirus elements, our results suggest a potential impact of these events in the context of HIV pathogenesis. Modeling a cis-acting LTR promoter targeting an IS frequently detected in patients with HIV, we demonstrate that BACH2 RNA and protein levels increase over time, resulting in proliferation and selective outgrowth of T cell populations with an activated, Treg-like phenotype. In contrast with previous protein overexpression studies (10, 30), our primary T cell HDR knock-in system closely recapitulates the endogenous transcriptional control at key HIV ISs, including both BACH2 and STAT5B.

Importantly, these results help to explain the persistence and strong orientation bias of HIV proviruses at the BACH2 locus in infected individuals. To date, the available published HIV insertion site data have documented 67 independent insertions across seven individuals at the BACH2 locus, all in the forward orientation with respect to gene polarity (4–7). The integrated proviral sequences within BACH2 in HIV subjects remain to be definitively characterized. Interestingly, to date, proviral sequences from four BACH2 ISs have been reported, and each retained an intact 5′-LTR and MSD (8), as required to generate hybrid transcripts. Because our results show that the MSD, immediately downstream of the 5′-LTR, is critical to this process, this suggests that proviruses within BACH2 may be defective and lack transcription terminating elements present within the 3′-LTR. Alternatively, intact integrated proviruses in the same location could also generate hybrid transcripts secondary to equivalent splice capture events, leading to expansion of these cells and contribution to the latent HIV reservoir. In our studies, detailed mapping of RNA sequencing (RNA-seq) reads at the BACH2 locus showed that although the CDS region downstream of the intron 5 insertion site has a modest 1.3-fold increase in expression, host BACH2 transcription is reduced by >4-fold. This suggests that BACH2 expression is carefully regulated to physiological levels in what appears to be a negative feedback system, i.e., LTR-driven expression of BACH2 negatively regulates BACH2 promoter–driven expression. This observation is consistent with previous findings of LTR promoter dominance (31–33).

Together with the previous observations indicating that BACH2 is a ubiquitous target for HIV integration, our findings suggest that T cells with this IS proliferate in vivo and thereby reach detection levels because of a unique survival advantage. Consistent with this concept, BACH2 HDR-edited cells, under the control of an LTR promoter, exhibit a significantly enhanced proliferative capacity and expansion in vitro compared with unedited cells. Interestingly, this selective expansion was most evident at time points later than typical for primary cell culture models, which reach peak expansion rates in 7–14 d. These findings appear to model the subtleties anticipated in vivo, in which a modest, but physiologically relevant, increase in a transcriptional regulator manifests a physiologically relevant phenotype over time. Overall, these observations are consistent with a proposed model where HIV-driven BACH2 IS–dependent proliferation leads to a gradual clonal expansion with selection observed over a scale of years in vivo (11).

BACH2-edited CD4+ T cells also progressively acquire a Treg-like phenotype in culture. This population exhibits many, but not all, features present in nTreg cells, as well as evidence for increased cell activation, enhanced proliferation, and selective expansion in vitro. These combined phenotypic and growth properties likely reflect the complex role for BACH2 in T cell development and functional specification. BACH2 acts as a transcriptional regulator required for thymic and peripheral Treg development and homeostasis, generation of T central memory cells, and in limiting effector T cell differentiation (26–28, 31–36). BACH2 has also been implicated as a transcriptional repressor that limits activity of superenhancers proposed to drive establishment and stability of T cell lineages and subsets in human and murine T cells. Although most studies of BACH2 have relied on loss of function or overexpression, our studies provide insight into the impacts of a modest change in BACH2 dosage on the biology of primary human T cells. Additional work examining the transcriptional and epigenetic changes driven by these LTR hybrid transcripts is required to fully elucidate the presumed multiple impacts of these events in primary T cells.

Interestingly, LTR or LTR-MSD insertion within the STAT5B locus did not elicit any measurable molecular, phenotypic, or growth differences in primary CD4 T cells (Supplemental Fig. 2). As previously reported in HIV-infected persons, LTR-initiated hybrid transcripts were detected in STAT5B-edited cells (12). However, in this study, HIV-STAT5B hybrid transcripts did not lead to significant increases in STAT5B protein expression or measurable functional effects. Of note, HIV ISs at this locus do not display a clear forward orientation bias (4–6). Although additional studies are warranted, our results are consistent with the lack of orientation bias and imply that one or more alternative (enhancer, chromatin, or yet unknown) trans-mediated mechanism(s) may be responsible for the observed frequency and persistence of STAT5B IS T cell clones.

Our findings support the concept that modest increases in BACH2 expression may lead to the reprogramming of conventional CD4 T cells, thereby mimicking HIV provirus integration within a newly infected circulating CD4 T cell. Based on HDR editing of both bulk (Fig. 2) and nTreg-depleted (Supplemental Fig. 4) CD4 T cells, our data do not support the concept that reprogramming primarily impacts nTreg cells, findings that contradict earlier work suggesting overexpression of BACH2 specifically within nTreg cells leads to their selective expansion (10). That work based its conclusions on identification of hybrid transcripts in purified CD25hi, CD127lo T cells isolated from patients with HIV, an approach that would similarly enrich for the novel Treg-like population identified in this study. Additional studies by those authors relied on overexpression of BACH2 in isolated induced Treg or nTreg cells, an approach that does not model the subtle impacts on BACH2 expression mediated by site-specific, HDR-based integration of a relevant LTR cassette.

The Treg-like population identified in this study may represent an immunosuppressive T cell subset. The frequency and retention of ISs within BACH2 in vivo thus raises the possibility that such cells may alter immune responses and/or HIV disease outcome, regardless of whether these cells harbor replication-competent proviruses. Collectively, previous work suggests that Tregs exert both negative and positive impacts on HIV disease outcome via limiting immune response to HIV and opportunistic infection and attenuating HIV-induced immune hyperactivation (37–39). Moreover, HIV viral load is positively correlated with Treg percentages (40–44) and negatively correlated with absolute Treg numbers (41, 45–47). However, immune hyperactivation is a negative prognostic factor associated with HIV disease progression (48, 49). Additional work is required to understand the impact of HIV integrations at BACH2 and the outcome on HIV disease progression and the viral reservoir.

In summary, we provide direct mechanistic support for how a specific HIV integration event may lead to expansion and persistence of clonally derived, CD4 T cell populations within infected individuals. Specifically, we show that transcriptional activation of BACH2 via the generation of hybrid HIV-human transcripts is sufficient to generate a proliferative, activated, and Treg-like population in vitro. Based on these findings, we propose a model wherein IS drives phenotype and whereby proviruses convert a population of CD4+ T cells to a potentially immune-suppressive state that, via parallel provision of selective advantage, may modulate disease outcome and/or HIV persistence.

We acknowledge and thank I. Khan, E. Lopez, C. Stoffers, C. Zavala, and A. Ott of the Viral Production Team at Seattle Children’s Research Institute for providing AAV stocks.

This work was supported by U.S. Department of Health and Human Services, National Institutes of Health Grants 1R01AI125026 (J.I.M., principal investigator [PI]), 1R61DA047010 (J.I.M., multiple PI), R01CA206466 (L.M.F., PI), R01AI134419 (L.M.F., PI), 1R01AI122361 (J.I.M., PI), 1R01DA040532-01, and P30AI027757; Centers for AIDS Research Retrovirology and Molecular Data Sciences Core (C. Celu, PI; J.I.M., Core Director); the Seattle Children’s Research Institute Program for Cell and Gene Therapy; a Children’s Guild Association Endowed Chair in Pediatric Immunology (to D.J.R.); and a Hansen Investigator in Pediatric Innovation Endowment (to D.J.R.).

J.I.M. conceived the concept with input from L.M.F.; M.L.C. and M.J.D. devised and led the project; J.I.M., A.K., and D.J.R. consulted on project direction and experimental design; M.L.C., M.J.D., and S.C.S. designed and performed experiments and completed the analyses; C.S., H.J., and S.C.S. performed experiments, and all authors contributed to interpretation of results; M.J.D. and M.L.C. wrote the manuscript with support from S.C.S., J.I.M., A.K., and D.J.R.; D.J.R. revised the manuscript in response to review.

The data presented in this article have been submitted to the National Center for Biotechnology Information (NCBI) BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/733175) under accession number PRJNA733175.

The online version of this article contains supplemental material.

Abbreviations used in this article:

AAV, adeno-associated virus; adj., adjusted; ART, antiretroviral therapy; cat., catalog number; ddPCR, digital droplet PCR; FDR, false discovery rate; GO, Gene Ontology; HDR, homology-directed repair; IDT, Integrated DNA Technologies; IS, integration site; LTR, long terminal repeat; MSD, major splice donor; NHEJ, nonhomologous end-joining; nTreg, natural T regulatory; RNA-seq, RNA sequencing; RNP, ribonucleoprotein; sgRNA, single guide RNA; Treg, T regulatory.

1. Lewinski, M. K., D. Bisgrove, P. Shinn, H. Chen, C. Hoffmann, S. Hannenhalli, E. Verdin, C. C. Berry, J. R. Ecker, F. D. Bushman. 2005. Genome-wide analysis of chromosomal features repressing human immunodeficiency virus transcription. J. Virol. 79: 6610–6619.
2. Schröder, A. R. W., P. Shinn, H. Chen, C. Berry, J. R. Ecker, F. Bushman. 2002. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110: 521–529.
3. Lucic, B., H.-C. Chen, M. Kuzman, E. Zorita, J. Wegner, V. Minneker, W. Wang, R. Fronza, S. Laufs, M. Schmidt, et al. 2019. Spatially clustered loci with multiple enhancers are frequent targets of HIV-1 integration. [Published erratum appears in 2021 Nat. Commun. 12: 6326.] Nat. Commun. 10: 4059.
4. Wagner, T. A., S. McLaughlin, K. Garg, C. Y. K. Cheung, B. B. Larsen, S. Styrchak, H. C. Huang, P. T. Edlefsen, J. I. Mullins, L. M. Frenkel. 2014. HIV latency. Proliferation of cells with HIV integrated into cancer genes contributes to persistent infection. Science 345: 570–573.
5. Maldarelli, F., X. Wu, L. Su, F. R. Simonetti, W. Shao, S. Hill, J. Spindler, A. L. Ferris, J. W. Mellors, M. F. Kearney, et al. 2014. HIV latency. Specific HIV integration sites are linked to clonal expansion and persistence of infected cells. Science 345: 179–183.
6. Ikeda, T., J. Shibata, K. Yoshimura, A. Koito, S. Matsushita. 2007. Recurrent HIV-1 integration at the BACH2 locus in resting CD4+ T cell populations during effective highly active antiretroviral therapy. J. Infect. Dis. 195: 716–725.
7. Mack, K. D., X. Jin, S. Yu, R. Wei, L. Kapp, C. Green, B. Herndier, N. W. Abbey, A. Elbaggari, Y. Liu, M. S. McGrath. 2003. HIV insertions within and proximal to host cell genes are a common finding in tissues containing high levels of HIV DNA and macrophage-associated p24 antigen expression. J. Acquir. Immune Defic. Syndr. 33: 308–320.
8. Simonetti, F. R., H. Zhang, G. P. Soroosh, J. Duan, K. Rhodehouse, A. L. Hill, S. A. Beg, K. McCormick, H. E. Raymond, C. L. Nobles, et al. 2021. Antigen-driven clonal selection shapes the persistence of HIV-1-infected CD4+ T cells in vivo. J. Clin. Invest. 131: e145254.
9. Jern, P., J. M. Coffin. 2008. Effects of retroviruses on host genome function. Annu. Rev. Genet. 42: 709–732.
10. Cesana, D., F. R. Santoni de Sio, L. Rudilosso, P. Gallina, A. Calabria, S. Beretta, I. Merelli, E. Bruzzesi, L. Passerini, S. Nozza, et al. 2017. HIV-1-mediated insertional activation of STAT5B and BACH2 trigger viral reservoir in T regulatory cells. Nat. Commun. 8: 498.
11. Liu, R., F. R. Simonetti, Y. C. Ho. 2020. The forces driving clonal expansion of the HIV-1 latent reservoir. Virol. J. 17: 4.
12. Sather, B. D., G. S. Romano Ibarra, K. Sommer, G. Curinga, M. Hale, I. F. Khan, S. Singh, Y. Song, K. Gwiazda, J. Sahni, et al. 2015. Efficient modification of CCR5 in primary human hematopoietic cells using a megaTAL nuclease and AAV donor template. Sci. Transl. Med. 7: 307ra156.
13. Hung, K. L., I. Meitlis, M. Hale, C.-Y. Chen, S. Singh, S. W. Jackson, C. H. Miao, I. F. Khan, D. J. Rawlings, R. G. James. 2018. Engineering protein-secreting plasma cells by homology-directed repair in primary human B cells. Mol. Ther. 26: 456–467.
14. Stemmer, M., T. Thumberger, M. Del Sol Keyer, J. Wittbrodt, J. L. Mateo. 2015. CCTop: an intuitive, flexible and reliable CRISPR/Cas9 target prediction tool. [Published erratum appears in 2017 PLoS One 12: e0176619.] PLoS One 10: e0124633.
15. Anderson, W., J. Thorpe, S. A. Long, D. J. Rawlings. 2019. Efficient CRISPR/Cas9 disruption of autoimmune-associated genes reveals key signaling programs in primary human T cells. J. Immunol. 203: 3166–3178.
16. Dobin, A., C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, T. R. Gingeras. 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21.
17. Anders, S., P. T. Pyl, W. Huber. 2015. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31: 166–169.
18. Love, M. I., W. Huber, S. Anders. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15: 550.
19. Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, et al.; The Gene Ontology Consortium. 2000. Gene ontology: tool for the unification of biology. Nat. Genet. 25: 25–29.
20. The Gene Ontology Consortium. 2019. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 47(D1): D330–D338.
21. Mi, H., A. Muruganujan, D. Ebert, X. Huang, P. D. Thomas. 2019. PANTHER version 14: more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Res. 47(D1): D419–D426.
22. Hale, M., B. Lee, Y. Honaker, W.-H. Leung, A. E. Grier, H. M. Jacobs, K. Sommer, J. Sahni, S. W. Jackson, A. M. Scharenberg, et al. 2017. Homology-directed recombination for enhanced engineering of chimeric antigen receptor T cells. Mol. Ther. Methods Clin. Dev. 4: 192–203.
23. Hubbard, N., D. Hagin, K. Sommer, Y. Song, I. Khan, C. Clough, H. D. Ochs, D. J. Rawlings, A. M. Scharenberg, T. R. Torgerson. 2016. Targeted gene editing restores regulated CD40L function in X-linked hyper-IgM syndrome. Blood 127: 2513–2522.
24. Honaker, Y., N. Hubbard, Y. Xiang, L. Fisher, D. Hagin, K. Sommer, Y. Song, S. J. Yang, C. Lopez, T. Tappen, et al. 2020. Gene editing to induce FOXP3 expression in human CD4+ T cells leads to a stable regulatory phenotype and function. Sci. Transl. Med. 12: eaay6422.
25. Sadelain, M., E. P. Papapetrou, F. D. Bushman. 2011. Safe harbours for the integration of new DNA in the human genome. Nat. Rev. Cancer 12: 51–58.
26. Chen, X., J. J. Oppenheim. 2011. Resolving the identity myth: key markers of functional CD4+FoxP3+ regulatory T cells. Int. Immunopharmacol. 11: 1489–1496.
27. Schmidt, A., N. Oberle, P. H. Krammer. 2012. Molecular mechanisms of treg-mediated T cell suppression. Front. Immunol. 3: 51.
28. Malek, T. R., I. Castro. 2010. Interleukin-2 receptor signaling: at the interface between tolerance and immunity. Immunity 33: 153–165.
29. Johnston, J. A., C. M. Bacon, D. S. Finbloom, R. C. Rees, D. Kaplan, K. Shibuya, J. R. Ortaldo, S. Gupta, Y. Q. Chen, J. D. Giri, et al. 1995. Tyrosine phosphorylation and activation of STAT5, STAT3, and Janus kinases by interleukins 2 and 15. Proc. Natl. Acad. Sci. USA 92: 8705–8709.
30. Eipers, P. G., J. F. Salazar-Gonzalez, C. D. Morrow. 2011. HIV gene expression from intact proviruses positioned in bacterial artificial chromosomes at integration sites previously identified in latently infected T cells. Virology 410: 151–160.
31.
Lenasi
T.
,
X.
Contreras
,
B. M.
Peterlin
.
2008
.
Transcriptional interference antagonizes proviral gene expression to promote HIV latency.
Cell Host Microbe
4
:
123
133
.
32.
Sherrill-Mix
S.
,
K. E.
Ocwieja
,
F. D.
Bushman
.
2015
.
Gene activity in primary T cells infected with HIV89.6: intron retention and induction of genomic repeats.
Retrovirology
12
:
79
.
33.
Liu
R.
,
Y. J.
Yeh
,
A.
Varabyou
,
J. A.
Collora
,
S.
Sherrill-Mix
,
C. C.
Talbot
Jr.
,
S.
Mehta
,
K.
Albrecht
,
H.
Hao
,
H.
Zhang
, et al
2020
.
Single-cell transcriptional landscapes reveal HIV-1-driven aberrant host gene transcription as a potential therapeutic target.
Sci. Transl. Med.
12
:
eaaz0802
.
34.
Grant
F. M.
,
J.
Yang
,
R.
Nasrallah
,
J.
Clarke
,
F.
Sadiyah
,
S. K.
Whiteside
,
C. J.
Imianowski
,
P.
Kuo
,
P.
Vardaka
,
T.
Todorov
, et al
2020
.
BACH2 drives quiescence and maintenance of resting Treg cells to promote homeostasis and cancer immunosuppression.
J. Exp. Med.
217
:
e20190711
.
35.
Sidwell
T.
,
Y.
Liao
,
A. L.
Garnham
,
A.
Vasanthakumar
,
R.
Gloury
,
J.
Blume
,
P. P.
Teh
,
D.
Chisanga
,
C.
Thelemann
,
F.
de Labastida Rivera
, et al
2020
.
Attenuation of TCR-induced transcription by Bach2 controls regulatory T cell differentiation and homeostasis.
Nat. Commun.
11
:
252
.
36.
Kim
E. H.
,
D. J.
Gasper
,
S. H.
Lee
,
E. H.
Plisch
,
J.
Svaren
,
M.
Suresh
.
2014
.
Bach2 regulates homeostasis of Foxp3+ regulatory T cells and protects against fatal lung disease in mice.
J. Immunol.
192
:
985
995
.
37.
Simonetta
F.
,
C.
Bourgeois
.
2013
.
CD4+FOXP3+ regulatory T-cell subsets in human immunodeficiency virus infection.
Front. Immunol.
4
:
215
.
38.
Chevalier
M. F.
,
L.
Weiss
.
2013
.
The split personality of regulatory T cells in HIV infection.
Blood
121
:
29
37
.
39. Valverde-Villegas, J. M., M. C. C. Matte, R. M. de Medeiros, J. A. B. Chies. 2015. New insights about Treg and Th17 cells in HIV infection and disease progression. J. Immunol. Res. 2015: 647916.
40. Lim, A., D. Tan, P. Price, A. Kamarulzaman, H.-Y. Tan, I. James, M. A. French. 2007. Proportions of circulating T cells with a regulatory cell phenotype increase with HIV-associated immune activation and remain high on antiretroviral therapy. AIDS 21: 1525–1534.
41. Schulze Zur Wiesch, J., A. Thomssen, P. Hartjen, I. Tóth, C. Lehmann, D. Meyer-Olson, K. Colberg, S. Frerk, D. Babikir, S. Schmiedel, et al. 2011. Comprehensive analysis of frequency and phenotype of T regulatory cells in HIV infection: CD39 expression of FoxP3+ T regulatory cells correlates with progressive disease. J. Virol. 85: 1287–1297.
42. Zhuang, Y., X. Wei, Y. Li, K. Zhao, J. Zhang, W. Kang, Y. Sun. 2012. HCV coinfection does not alter the frequency of regulatory T cells or CD8+ T cell immune activation in chronically infected HIV+ Chinese subjects. AIDS Res. Hum. Retroviruses 28: 1044–1051.
43. Zhang, Z., Y. Jiang, M. Zhang, W. Shi, J. Liu, X. Han, Y. Wang, X. Jin, H. Shang. 2008. Relationship of frequency of CD4+CD25+Foxp3+ regulatory T cells with disease progression in antiretroviral-naive HIV-1 infected Chinese. Jpn. J. Infect. Dis. 61: 391–392.
44. Loke, P., D. Favre, P. W. Hunt, J. M. Leung, B. Kanwar, J. N. Martin, S. G. Deeks, J. M. McCune. 2010. Correlating cellular and molecular signatures of mucosal immunity that distinguish HIV controllers from noncontrollers. Blood 115: e20–e32.
45. Tenorio, A. R., J. Martinson, D. Pollard, L. Baum, A. Landay. 2008. The relationship of T-regulatory cell subsets to disease stage, immune activation, and pathogen-specific immunity in HIV infection. J. Acquir. Immune Defic. Syndr. 48: 577–580.
46. Eggena, M. P., B. Barugahare, N. Jones, M. Okello, S. Mutalya, C. Kityo, P. Mugyenyi, H. Cao. 2005. Depletion of regulatory T cells in HIV infection is associated with immune activation. J. Immunol. 174: 4407–4414.
47. Nilsson, J., A. Boasso, P. A. Velilla, R. Zhang, M. Vaccari, G. Franchini, G. M. Shearer, J. Andersson, C. Chougnet. 2006. HIV-1-driven regulatory T-cell accumulation in lymphoid tissues is associated with disease progression in HIV/AIDS. Blood 108: 3808–3817.
48. Bouscarat, F., M. Levacher-Clergeot, M. C. Dazza, K. W. Strauss, P. M. Girard, C. Ruggeri, M. Sinet. 1996. Correlation of CD8 lymphocyte activation with cellular viremia and plasma HIV RNA levels in asymptomatic patients infected by human immunodeficiency virus type 1. AIDS Res. Hum. Retroviruses 12: 17–24.
49. Deeks, S. G., C. M. R. Kitchen, L. Liu, H. Guo, R. Gascon, A. B. Narváez, P. Hunt, J. N. Martin, J. O. Kahn, J. Levy, et al. 2004. Immune activation set point during early HIV infection predicts subsequent CD4+ T-cell changes independent of viral load. Blood 104: 942–947.

The authors have no financial conflicts of interest.

Supplementary data