Abstract
CMV is a prevalent human pathogen. The virus cannot be eliminated from the body, but is kept in check by CMV-specific T cells. Patients with an insufficient T cell response, such as transplant recipients, are at high risk of developing CMV disease. However, the CMV-specific T cell repertoire is complex, and it is not yet clear which T cells protect best against virus reactivation and disease. In this study, we present a highly resolved characterization of CMV-specific human CD8+ T cells based on enrichment by specific peptide stimulation and mRNA sequencing of their TCR β-chains (TCRβ). Our analysis included recently identified T cell epitopes restricted through HLA-C, whose presentation is resistant to viral immunomodulation, and well-studied HLA-B–restricted epitopes. In eight healthy virus carriers, we identified a total of 1052 CMV-specific TCRβ sequences. HLA-C–restricted, CMV-specific TCRβ clonotypes dominated the ex vivo T cell response and contributed the highest-frequency clonotype of the entire repertoire in two of eight donors. We analyzed sharing and similarity of CMV-specific TCRβ sequences and identified 63 public or related sequences belonging to 17 public TCRβ families. In our cohort, and in an independent cohort of 352 donors, the cumulative frequency of these public TCRβ family members was a highly discriminatory indicator of carrying both CMV infection and the relevant HLA type. Based on these findings, we propose CMV-specific TCRβ signatures as a biomarker for an antiviral T cell response to identify patients in need of treatment and to guide future development of immunotherapy.
This article is featured in In This Issue, p.627
Introduction
Like other members of the herpesvirus family, human CMV infects its carriers for life, and prevention of overt disease requires a protective repertoire of virus-specific T cells (1, 2). Persons who lack such a T cell repertoire, such as patients after allogeneic hematopoietic stem cell transplantation, are at risk for reactivating latent CMV infection, and CMV disease remains a threat to their survival and well-being (3). Conventional antiviral chemotherapy has significant adverse effects, is not universally effective, may delay viral reactivation rather than prevent it, and is subverted by viral resistance (3, 4).
Re-establishment of a functional and durable antiviral T cell repertoire is expected to enable patients to control CMV infection for their lifetime. Transfer of CMV-specific T cells from immunocompetent donors to patients after allogeneic hematopoietic stem cell transplantation has yielded encouraging results over the last three decades (5, 6). However, most of these trials were of small to medium scale and without randomized controls. Preliminary reports on two recent, large, randomized controlled trials suggest efficacy (5), but complete analysis is not yet available.
Therefore, present capability to clinically exploit the CMV-specific T cell repertoire is limited. Although the CMV-specific T cell response has been studied in great detail (7), fundamental questions remain unanswered. No consensus has been reached on which viral Ags and epitopes induce T cell responses that directly protect against infection and which T cell specificities are merely correlated with the presence of other, more effective specificities (7). The challenge of understanding the CMV-specific T cell repertoire is formidable because CMV expresses more than 200 viral proteins and has evolved multiple mechanisms to interfere with T cell recognition by modulating cytokine and chemokine responses, Ag processing, intracellular peptide translocation, stability of HLA molecules, and more (8, 9). Moreover, human CMV shows sequence variation, and some specific T cells may recognize only certain strain-specific epitope variants (10).
A particularly strong CD8+ T cell response is directed to two viral Ags with contrasting functional roles and kinetics of expression: the transcription factor IE-1 and the structural protein pp65 (11, 12). Earlier data indicated that pp65-specific CD8+ T cells were the most effective in attacking infected cells in vitro (13), whereas IE-1–specific T cells were more strongly associated with reduced viral reactivation in patients after transplantation (14, 15) and protective in the murine CMV model (16). Our recent findings suggested that viral immunoevasion is not predominantly guided by the identity of the Ag, but by the identity of the epitope and the HLA class I molecule that presents it (17, 18). For example, we identified an HLA-C–restricted CD8+ T cell epitope from IE-1, whose presentation is highly resistant to viral immunomodulation. T cells specific for this epitope are of high incidence and frequency in healthy donors (17, 18). These properties are shared (19) by a second CD8+ T cell epitope restricted through the same HLA (20); this epitope is derived from the rarely studied UL29/28-encoded CMV protein. It is unknown whether there is a causal relationship between escape of certain epitopes from viral immunomodulation and high incidence of epitope-specific T cells and whether T cells against such epitopes are associated with protection from reactivation. To address such questions, the CMV-specific T cell response needs to be analyzed and understood in much more detail using methods that are of sufficient resolution to adequately cover the complexity of the repertoire. High-resolution sequencing of the TCR is such a method.
Most human T cells express a heterodimeric αβ TCR that specifically recognizes the antigenic target, a complex of an HLA molecule and a peptide. Both chains, α and β, have highly variable sequences. The specificity of each αβ T cell is ensured by its expression of only one TCR β-chain (TCRβ) and one or, occasionally, two TCR α-chains (21). Variability of TCR sequence is produced by recombination in the thymus. In the case of the human TCRβ, a V(D)J reading frame is produced by imprecisely joining 1 of 46 functional V gene segments, one of two short D segments, and 1 of 13 J segments, mostly with insertion of template-independent nucleotides between the segments (22, 23). The sequence around these junctions encodes the CDR3, a loop that reaches out to the peptide embedded in the HLA molecule (24). The number of different TCRβs in the T cell repertoire of a human being was estimated to be in the range of millions (25–27) or even hundreds of millions (28); this is only a small fraction of the diversity that is theoretically possible (22, 27). CMV-specific TCR repertoires have been studied in detail before (29–31), but most studies were limited to pp65 and HLA-C–restricted T cells were not included. The advent of massively parallel sequencing of TCR-encoding DNA or mRNA (26, 32) has now made it possible to identify the specificity-defining element of millions of T cells in a sample, and pioneering studies have applied this technique to the analysis of CMV-specific T cells (33–36).
In this study, we used high-resolution TCRβ sequencing to investigate the repertoire of CMV-specific CD8+ T cells, focusing on previously unstudied HLA-C–restricted T cells that promise to be of high clinical interest. Specific T cells were selectively enriched by peptide-driven in vitro expansion; this method is transferable to settings in which samples are small or HLA/peptide multimers are not available. We found that T cells specific for HLA-C–restricted CMV epitopes showed exceptional clonal dominance within the overall TCRβ repertoire. Moreover, we identified a set of public and related CMV-specific TCRβ sequences that reliably distinguished persons with CMV-specific T cell immunity from those without, immediately suggesting future application of this method in clinical immunomonitoring.
Materials and Methods
Blood donors
Human T cells were derived from anonymous peripheral blood buffy coats purchased from Institut für Transfusionsmedizin, Ulm, Germany. The institutional review board (Ethikkommission bei der Medizinischen Fakultät der Ludwig-Maximilians-Universität München, Project no. 17-455, 16.10.2017) has approved our use of anonymous human material. We did not seek or obtain consent because all material and data were obtained anonymously. PBMCs were isolated by density centrifugation and cryopreserved until use. Donors were HLA-typed at four-digit resolution (Center for Human Genetics and Laboratory Diagnostics, Martinsried, Germany). All donors were positive for HLA-B*07:02 and HLA-C*07:02. CMV status (Supplemental Table I) was determined by anti-CMV IgG ELISA (Siemens).
Peptide stimulation assay
The four CMV-derived peptides CRVLCCYVL (CRV; HLA-C*07:02, IE-1), FRCPRRFCF (FRC; HLA-C*07:02, UL29/28), RPHERNGFTVL (RPH; HLA-B*07:02, pp65), and TPRVTGGGAM (TPR; HLA-B*07:02, pp65) (Supplemental Table I) were used separately to stimulate and selectively expand virus-specific T cells from CMV-positive donors P01–P08. Cell culture medium was RPMI 1640 (Invitrogen) supplemented with 8 or 10% FCS (BioSell or Invitrogen). Per culture, 25 million PBMCs were suspended in 2 ml cell culture medium containing 5 μg/ml peptide (JPT Peptide Technologies; ≥70% purity), incubated at 37°C for 1 h, and washed three times with PBS (PAN Biotech) to remove excess peptide. PBMCs were resuspended in 12.5 ml cell culture medium supplemented with 50 U/ml IL-2 (Proleukin; Novartis) and distributed at 2.5 ml/well to a 12-well plate. The plate was incubated at 37°C and 5% CO2. After 6 ± 1 d, the cells of each well were resuspended, distributed to two wells, and 1 ml of fresh culture medium supplemented with IL-2 was added to each well. Cells were harvested on day 10 of culture.
T cell stimulation with autologous mini–lymphoblastoid cell lines
PBMCs from three healthy donors (P01–P03) were infected with mini-Epstein–Barr viruses encoding pp65, IE-1, or no CMV protein (37). The resulting transformed B cell lines (mini–lymphoblastoid cell lines [mini-LCLs]) were maintained in RPMI 1640 medium supplemented with 8 or 10% FCS. For the stimulation assay, mini-LCLs were γ-irradiated with 50 Gy in a 137Cs device. Subsequently, 150,000 mini-LCL cells were combined with 6 million PBMCs in 3 ml RPMI 1640/FCS per replicate in a 12-well plate, in four replicates per culture. After 9 d, and then every 7 d, the T cells were restimulated: cultures were harvested, washed, and counted, and 3 million T cells per well (12-well plate) were coincubated with 1 million irradiated mini-LCLs in medium with 50 U/ml IL-2. On day 30, cultures were harvested to analyze T cells.
TCRβ library preparation
Where indicated (Supplemental Table II), CD8+ T cells were enriched from PBMCs or T cell cultures by magnetic separation with CD8 MicroBeads and MS or LS Columns (Miltenyi Biotec).
Total RNA was extracted from PBMCs, T cell cultures, or CD8-enriched T cells using the Qiagen RNeasy Kit (Qiagen). One microgram RNA per sample was reverse transcribed to cDNA with the QuantiTect Reverse Transcription Kit (Qiagen) using a primer designed to target both the Cβ1 and Cβ2 regions of the TCR RNA (5′-GCACC TCCTT CCCAT TCAC-3′). cDNA was amplified in two subsequent PCRs using Pfu DNA polymerase on a thermocycler (Biometra T Gradient). Both PCRs were initiated at 95°C for 2 min; cycles consisted of incubation at 95°C (30 s), 65°C (30 s), and 72°C (60 s); final elongation was at 72°C (10 min). The first reaction was a multiplex PCR with 45 distinct forward primers that bind to the Vβ region and cover all possible human TCR Vβ segments and a reverse primer that anneals to the Cβ1 and Cβ2 regions; the Cβ primer was optimized for this protocol, and Vβ binding sites were mostly taken from established protocols (26). Forward (Vβ) and reverse (Cβ) primers carried sequences complementary to the Illumina Read 2 and Read 1 priming sequence, respectively. To enhance the nucleotide diversity of TCRβ reads, facilitate cluster recognition, and avoid artifacts, three different forms of the reverse primer were used, with zero, one, or two degenerate nucleotides inserted between the Cβ-binding sequence and the Illumina Read 1 sequence. All primers were used in equimolar amounts (a total of 10 μM; for primer sequences, see Supplemental Table I). The first PCR consisted of only 10 amplification cycles to minimize PCR amplification bias. The second PCR was performed with index primers (NEBNext Multiplex Oligos for Illumina; New England BioLabs) to attach barcodes and the i5 and i7 adapters for cluster generation on the Illumina flow cell (Fig. 1B). After each PCR step, the PCR product was purified using Agencourt AMPure XP magnetic beads (Beckman Coulter).
Length and quantity of PCR products for sequencing were determined using the Agilent DNA 1000 Kit and the Bioanalyzer 2100 (Agilent Technologies).
High-throughput sequencing and data analysis
The barcoded samples were combined to a final concentration of 10 nM DNA and sequenced with the Illumina HiSeq 1500 system in paired-end, rapid-run mode. The libraries were bidirectionally sequenced with read lengths between 120 and 175 bp in each direction.
Raw data were demultiplexed and quality-filtered using web-based tools on the Galaxy platform. Next, all reads were aligned to the matching Vβ, Jβ, and Cβ genes, TCRβ clonotypes were built from identical sequences, and similar clonotypes were clustered with the MiXCR software (38). TCR clonotype data were further processed using custom scripts in R to compare samples and characterize specificity and sharing of TCRβ sequences. Graphs were made with R, GraphPad Prism, or Microsoft Excel. The p values were calculated with two-sided Mann–Whitney U tests in GraphPad Prism, version 7. Raw sequence data and TCR clonotype data are available at the National Center for Biotechnology Information’s Gene Expression Omnibus, accession number GSE114931 (https://www.ncbi.nlm.nih.gov/geo).
Identification of specific TCRβ sequences
Specific TCRβ clonotypes for each epitope were identified by comparing TCRβ clonotype frequencies in three samples: a sample S stimulated with the specific peptide of interest, a sample C stimulated with a control peptide, and a sample U of unstimulated PBMCs. Specific TCRβ clonotypes were required to be enriched in S over C and in S over U. Let $s_i$, $c_i$, and $u_i$ be the relative read frequencies (proportions of reads) of clonotype $i$ in the three samples. To count as specific, clonotypes must exceed two enrichment cutoffs (Supplemental Fig. 1B). The first enrichment cutoff was identified as a local minimum of a weighted density distribution of $\log_{10}(s_i/c_i)$ of all clonotypes $i$ that fulfilled the condition $s_i c_i > 1 \times 10^{-6}$, i.e., of all medium- to high-frequency clonotypes. Analogously, the second enrichment cutoff was identified as a local minimum of a weighted density distribution of $\log_{10}(s_i/u_i)$ of all clonotypes $i$ that fulfilled $s_i u_i > 1 \times 10^{-7}$. To eliminate low-fidelity background signals, specific clonotypes must also exceed a specific sample read count cutoff (Supplemental Fig. 1C). This cutoff was determined by analyzing the two density distributions of $\log_{10} s_i$ for all clonotypes $i$ that had a low frequency in control samples (i.e., an absolute frequency of 1–10 reads in samples C or U, respectively). The read count at a local minimum of each of these two distributions was identified, and the mean of these two read counts served as the cutoff value. If either of these two density distributions did not have a local minimum, the cutoff was positioned at its global maximum times 100 (i.e., at 100 reads or higher in sample S).
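The classification step of this three-sample comparison can be sketched in Python. This is only an illustration under stated assumptions, not the custom R scripts used in the study: the function name, the dictionary inputs, and the pseudo-count of 0.5 reads for clonotypes absent from a control sample are hypothetical, and the three cutoffs are supplied as parameters rather than derived from density minima.

```python
import math


def specific_clonotypes(reads_s, reads_c, reads_u,
                        log_cut_sc, log_cut_su, min_reads_s,
                        pseudo_reads=0.5):
    """Return clonotypes enriched in S over both C and U.

    reads_s, reads_c, reads_u: dicts mapping clonotype -> read count in the
    stimulated (S), control-peptide (C), and unstimulated (U) samples.
    log_cut_sc, log_cut_su: cutoffs on log10(s_i/c_i) and log10(s_i/u_i).
    min_reads_s: read count cutoff that must be exceeded in sample S.
    pseudo_reads: assumed stand-in count for clonotypes missing from C or U.
    """
    total_s = sum(reads_s.values())
    total_c = sum(reads_c.values())
    total_u = sum(reads_u.values())
    specific = []
    for clone, n_s in reads_s.items():
        if n_s <= min_reads_s:
            continue  # fails the read count cutoff
        s_i = n_s / total_s
        c_i = reads_c.get(clone, pseudo_reads) / total_c
        u_i = reads_u.get(clone, pseudo_reads) / total_u
        enriched_over_c = math.log10(s_i / c_i) > log_cut_sc
        enriched_over_u = math.log10(s_i / u_i) > log_cut_su
        if enriched_over_c and enriched_over_u:
            specific.append(clone)
    return sorted(specific)
```

A clonotype is reported only if it passes all three cutoffs, so a clone that expands equally under the control peptide, or one that is enriched over C but not over U, is excluded.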
Using these criteria, specific TCRβ clonotypes were identified for eight CMV-positive donors (P01–P08) and four epitopes (CRV, FRC, RPH, and TPR), resulting in identification of 1052 CMV peptide-specific TCRβ sequences that were unique at the amino acid level (all specific sequences are listed in Supplemental Table II). Identification of Ag-specific TCRβ sequences from mini-LCL stimulations was achieved in a similar manner by comparing the frequencies of each clonotype in the CMV Ag-stimulated sample with their ex vivo frequencies and with their frequencies in the samples obtained by stimulation with an empty-vector mini-LCL control. Only TCRβ sequences that were enriched compared with both control samples were considered specific for epitopes from that CMV Ag.
Identification of public and related TCRβ sequences
Public CMV epitope–specific TCRβ sequences and TCRβ families were identified based on the 1052 specific TCRβ sequences from donors P01–P08. In a first step, TCRβ sequences were categorized as public if they were CMV-specific in at least two donors with identical Vβ and Jβ gene segments, CDR3 aa sequence, and epitope specificity. In a second step, this set of sequences was extended by TCRβ sequences that were present in only one donor but were highly similar to public TCRβ sequences. Sequences were considered highly similar if they 1) used the same Vβ gene segment, 2) had the same CDR3 length, and 3) differed in at most 2 aa in the CDR3.
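The two-step procedure can be illustrated with a short Python sketch. The function names and the tuple layout are hypothetical, and requiring the same epitope specificity in the second step is an assumption of this sketch, because the three similarity criteria listed above do not mention it explicitly.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def public_and_related(records, max_mismatch=2):
    """records: iterable of (donor, v_gene, j_gene, cdr3_aa, epitope) tuples.

    Step 1: a sequence is public if the same (V, J, CDR3, epitope) is
    specific in at least two donors.
    Step 2: a single-donor sequence is related if it shares the V gene and
    CDR3 length with a public sequence and differs from it in at most
    max_mismatch CDR3 positions (same epitope is assumed here as well).
    """
    donors = defaultdict(set)
    for donor, v, j, cdr3, epitope in records:
        donors[(v, j, cdr3, epitope)].add(donor)
    public = {key for key, ds in donors.items() if len(ds) >= 2}
    related = set()
    for key in donors:
        if key in public:
            continue
        v, _, cdr3, epitope = key
        for pv, _, pcdr3, pepitope in public:
            if (v == pv and epitope == pepitope
                    and len(cdr3) == len(pcdr3)
                    and hamming(cdr3, pcdr3) <= max_mismatch):
                related.add(key)
                break
    return public, related
```

Because the CDR3 lengths must match, a position-wise mismatch count is sufficient and no alignment is needed.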
Results
A method for high-resolution TCRβ repertoire analysis by peptide stimulation
We devised a method to analyze human epitope-specific TCRβ repertoires at high resolution. This method combines simple, short-term in vitro stimulation of PBMCs with synthetic peptide (39, 40) and high-throughput TCRβ sequencing (Fig. 1). Our focus was on donors carrying the HLA class I haplotype B*07:02–C*07:02, which is the most frequent HLA-B/C haplotype in donors of European descent (41). We tested the four most immunogenic CMV peptides that are known to be presented by HLA allotypes encoded by this haplotype (17, 19, 42): the HLA-C*07:02–restricted epitopes CRV and FRC and the HLA-B*07:02–restricted epitopes RPH and TPR. HLA-C*07:02–restricted epitopes are of special clinical interest because their recognition resists viral immunoevasion (17, 18). HLA-C*07:02 is prevalent not only in people of European descent, but also in East Asian and Native American populations (41).
Experimental setup. (A) Schema of the three-sample assay used to expand and analyze peptide-specific T cells. PBMCs were isolated from peripheral blood of healthy donors, loaded with single CMV-derived peptides, and cultured for 10 d with IL-2. Cells before and after stimulation were lysed, and TCRβ libraries were prepared from bulk RNA and analyzed by high-throughput sequencing. Specific TCRβ sequences for each epitope were identified by comparing TCRβ clonotype frequencies in the following three samples: 1) stimulated with specific peptide, 2) stimulated with control peptide, and 3) before stimulation. Clonotypes that were enriched by the specific peptide, but not in controls, were considered specific. (B) Preparation of TCRβ libraries for bidirectional sequencing of the CDR3. After total RNA isolation, TCRβ RNA was reverse transcribed using a Cβ gene-specific primer (1). In a first PCR step, cDNA was amplified by semi-multiplexed PCR (2) with a mixture of 45 forward primers that covered all Vβ genes and appended the Illumina sequencing read 2 primer binding site to the product. The reverse primer was complementary to both Cβ genes and appended the sequencing read 1 priming site. A second PCR step was performed with a single primer on each side (3), which adds Illumina i5 and i7 adapters and sample indices for multiplexed Illumina sequencing (4).
TCRβ cDNA libraries for Illumina sequencing were prepared from T cell samples in a two-step RT-PCR (Fig. 1B). The two-step PCR procedure was designed to limit potential amplification bias due to multiplex priming and to increase fidelity by enabling bidirectional sequencing of the CDR3. After Illumina sequencing, TCRβ clonotypes were built using the software MiXCR (38). A median of $5.0 \times 10^{6}$ productive TCRβ sequence reads were obtained per sample (Supplemental Table II). When we plotted TCRβ clonotype frequencies before and after stimulation, well-separated clusters suggestive of peptide-reactive expanded TCRβ clonotypes became apparent in CMV-positive, but not in CMV-negative, donors (Fig. 2). For a more precise definition, we classified TCRβ sequences as epitope-specific if they appeared in the cluster of enriched clonotypes after cultivation with a specific peptide, but not after cultivation with a control peptide of different HLA restriction (three-sample comparison, enrichment cutoff). In addition, a frequency cutoff was applied to minimize statistical noise from low-frequency clonotypes (Supplemental Fig. 1); this cutoff was calculated from the sample-specific clonotype frequency distribution. Within an epitope-specific population, TCRβ clonotype frequencies before and after stimulation were generally well correlated (Fig. 2, Supplemental Fig. 1E).
Populations of enriched TCRβ clonotypes are exclusive to CMV-positive donors. Relative frequency (proportion of reads) of TCRβ clonotypes before (x-axis) and after (y-axis) stimulation with one of four CMV peptides in CMV-positive donor P01 (upper panel) and CMV-negative donor N05 (lower panel). Each TCRβ clonotype is defined as the entirety of identical reads on the nucleotide level and represented by a black dot. Clonotypes that were undetectable in one condition were assigned a pseudo-frequency corresponding to 0.5 reads to enable their display on a logarithmic axis.
We identified a total of 1052 unique TCRβ amino acid sequences in eight CMV-positive donors (P01–P08) that met the specificity criteria for exactly one epitope (listed in Supplemental Table II). Of these, 435 were specific for CRV, 266 for FRC, 191 for RPH, and 160 for TPR. Nineteen TCRβ sequences passed the criteria for two epitopes, and none passed the criteria for more than two. Thus, we observed only minor overlap between TCRβ sequences assigned to different specificities.
Peptide stimulation expands T cell clonotypes that recognize processed Ag
There is a concern that synthetic peptide may not exclusively stimulate T cells that will be able to recognize the naturally processed epitope (43). Therefore, we tested whether our peptide-enriched T cell clonotypes respond to endogenously processed CMV Ags. We established autologous mini-LCLs that constitutively express CMV proteins pp65 or IE-1 from a mini-EBV genome. Such mini-LCLs effectively present CMV epitopes of any autologous HLA restriction to CD8+ and CD4+ T cells (17, 37, 44, 45). PBMCs of three CMV-positive donors, P01–P03, were stimulated with autologous mini-LCLs (pp65, IE-1, or control mini-LCLs without CMV Ag), and the resulting TCRβ repertoires (Supplemental Table II) were compared with those obtained through peptide stimulation. In the mini-LCL condition, TCRβ sequences were considered CMV-specific if they were enriched by stimulation with mini-LCLs that expressed the CMV Ag of interest, but not by stimulation with control mini-LCLs that lacked the CMV Ag. We found that the majority of TCRβ clonotypes that were expanded by one of the peptides CRV, RPH, or TPR also recognized the corresponding CMV Ag processed by mini-LCLs (Fig. 3). These results confirm that our simple peptide stimulation assay is a generally valid approach for the identification of CMV-specific TCRβ clonotypes of both high and low frequency.
Specific TCRβ clonotypes identified in the peptide stimulation assay also respond to endogenously processed Ag. The analysis was performed for the IE-1 Ag (epitope CRV) in CMV-positive donors P01– P03 and for the pp65 Ag (epitopes RPH and TPR) in donors P01 and P03. The plot shows all TCRβ clonotypes that were identified as epitope specific in the peptide stimulation assay. The y-axis indicates the frequency of each clonotype after peptide stimulation. Black dots represent clonotypes that were specifically enriched by stimulation with an autologous mini-LCL that expresses the corresponding CMV Ag; gray dots indicate clonotypes for which this was not the case. The numbers on top indicate the total number of epitope-specific TCRβ clonotypes and, in parentheses, the number of clonotypes responding to Ag endogenously processed by mini-LCLs. Samples of donor P01 were CD8 enriched before sequencing.
CMV-specific TCRβ clonotypes are abundant in the T cell repertoire of virus carriers
We investigated the contribution of CMV epitope–specific TCRs to the overall TCRβ repertoire in peripheral blood of eight CMV-positive (P01–P08) and eight CMV-negative (N01–N08) donors. Among the top 100 most frequent ex vivo TCRβ clonotypes of any CMV-positive donor (Fig. 4A), 2–10 were specific for one of the CMV epitopes CRV, FRC, or TPR. In six of eight CMV-positive donors, CMV-specific clonotypes were among the five most frequent clonotypes, and in two of eight donors, they supplied the top-frequency clonotype of the entire repertoire. Among a donor’s CMV-specific clonotypes, the most frequent one was specific for CRV in six donors and specific for FRC in two of eight donors. In CMV-negative donors, none of the top 100 TCRβ clonotypes were specifically enriched by CMV peptide stimulation, and thus none of them was categorized as CMV specific. When looking at the cumulative read frequencies, CRV- or FRC-specific clonotypes dominated the response in CMV carriers (Fig. 4B). These results demonstrate that CMV-specific T cells distinctly shape the T cell repertoire of virus carriers, with a prominent role for HLA-C–restricted clonotypes.
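The quantities reported in this paragraph (number of CMV-specific clonotypes among the top 100, rank of the most frequent specific clonotype, and cumulative read frequency) could be computed along the following lines. This is a hedged sketch with hypothetical function and argument names; it assumes a non-empty repertoire and that specificity has already been assigned.

```python
def repertoire_summary(ex_vivo_reads, specific, top_n=100):
    """Summarize CMV-specific clonotypes in an ex vivo repertoire.

    ex_vivo_reads: dict mapping clonotype -> read count ex vivo.
    specific: set of clonotypes already classified as CMV-specific.
    Returns a tuple of
      - the number of specific clonotypes among the top_n most frequent,
      - the best (lowest) rank of any specific clonotype, or None,
      - the cumulative proportion of reads from specific clonotypes.
    """
    total = sum(ex_vivo_reads.values())
    ranked = sorted(ex_vivo_reads, key=ex_vivo_reads.get, reverse=True)
    n_in_top = sum(clone in specific for clone in ranked[:top_n])
    best_rank = next(
        (i + 1 for i, clone in enumerate(ranked) if clone in specific),
        None,
    )
    cumulative = sum(
        n for clone, n in ex_vivo_reads.items() if clone in specific
    ) / total
    return n_in_top, best_rank, cumulative
```

A best rank of 1 corresponds to the situation described for two of the eight donors, in which a CMV-specific clonotype is the most frequent clonotype of the whole repertoire.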
CMV-specific TCRβ clonotypes dominate the peripheral T cell repertoire of CMV-positive donors. (A) The figure shows the proportion of reads of the 100 most frequent TCRβ clonotypes of CMV-positive donors P01–P08 and CMV-negative donors N01–N08 in peripheral blood ex vivo. TCRβ clonotypes that were identified as specific for CMV epitopes CRV, FRC, or TPR are shown as colored dots, clonotypes of unknown specificity as black dots. Because epitope RPH was not tested in donors P02 or N01–N08, it was omitted from this analysis. (B) Frequencies of CMV-specific TCRβ sequences in the ex vivo repertoires (left) and CD8-enriched repertoires (right) of CMV-positive donors P01–P08 as cumulative proportion of reads. RPH-specific T cells were not investigated in donor P02, and such T cells are, therefore, not depicted in the plots.
Patterns of Vβ and Jβ gene segment usage in CMV epitope–specific TCRs
We analyzed TCRβ clonotypes sharing their epitope specificity for shared structural features. First, we evaluated overall usage of Vβ and Jβ gene segments (Fig. 5) in CMV-specific TCRβ clonotypes. For each epitope, particular Vβ and Jβ genes were overrepresented, such as Vβ6-1/-5/-6, Vβ25-1, and Vβ28 and Jβ1-1, Jβ2-1, or Jβ2-7 for epitope CRV. However, no Vβ/Jβ combination dominated the response to any of the four epitopes. It follows that gene segment use alone is not sufficiently informative as a marker of CMV-specific CD8+ T cell immunity to the epitopes investigated in this study.
Usage of Vβ and Jβ gene segments in CMV epitope–specific T cells. The left semicircle of each chord diagram represents Vβ gene segment usage (rainbow colors), the right semicircle represents Jβ usage (grayscale shades), and the chords indicate which gene segments appear together in TCRβ sequences. Sizes of sectors and chords are proportional to the sum of the number of nucleotide-unique, Ag-specific TCRβ sequences in each of the donors P01–P08. The most frequently used gene segments are labeled, and the numbers below the plots show how many TCRβ sequences were specific for each epitope.
Public CMV-specific TCRβ sequences and TCRβ families
Next, we searched for the presence of identical TCRβ amino acid sequences with the same CMV epitope specificity in different donors, known as shared or public TCRs. We found 26 TCRβ sequences that were specific in at least two out of eight CMV-positive donors (Table I). Several of these public sequences with the same epitope specificity were closely related in sequence because they used the same Vβ gene, had the same CDR3 length, and differed in a maximum of 2 aa within the CDR3. Hence, we looked for additional TCRβ sequences that appeared in only one of the eight donors but were closely related to one of the 26 public TCRβ sequences according to the criteria stated above; we identified 37 such TCRβ sequences. The resulting set of 63 public or related CMV-specific TCRβ sequences was composed of 17 similarity groups, which we refer to as public TCRβ families. Of the 63 sequences, 21 were HLA-B–restricted (epitopes TPR and RPH), and 10 of them had been previously described (29, 42, 46, 47). In contrast, the epitope specificity and HLA restriction of none of the 42 HLA-C–restricted TCRβ sequences (epitopes CRV or FRC) had been previously shown, although five of these sequences were found to be enriched in CMV-positive donors compared with CMV-negative donors (48).
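One way to form such similarity groups is to link sequences that meet the stated criteria and take the connected components of the resulting graph. The Python sketch below does this with a small union-find structure; treating families as connected components is an assumption made for illustration, and the function name and input layout are hypothetical.

```python
def group_families(sequences, max_mismatch=2):
    """Group (v_gene, cdr3_aa) pairs into similarity families.

    Two sequences are linked if they share the V gene and CDR3 length and
    differ in at most max_mismatch CDR3 positions. Families are the
    connected components of that link graph (an assumption of this sketch).
    """
    seqs = list(dict.fromkeys(sequences))  # de-duplicate, keep input order
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (v1, c1) in enumerate(seqs):
        for j in range(i + 1, len(seqs)):
            v2, c2 = seqs[j]
            if v1 == v2 and len(c1) == len(c2):
                if sum(a != b for a, b in zip(c1, c2)) <= max_mismatch:
                    parent[find(j)] = find(i)  # merge the two groups

    families = {}
    for i, seq in enumerate(seqs):
        families.setdefault(find(i), []).append(seq)
    return list(families.values())
```

With connected components, two sequences that differ in more than 2 aa can still end up in the same family if a third sequence bridges them, which is why this grouping rule is labeled as an assumption.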
Vβ Gene | CDR3β Sequence | Jβ Gene | Specificity | Donors | Public Family | Reference |
---|---|---|---|---|---|---|
TRBV25-1 | CASSPGDEQFF | TRBJ2-1 | CRV | P01, P05, P06, P07 | CRV1 | (48)a |
TRBV25-1 | CASTPGDEQFF | TRBJ2-1 | CRV | P03, P06, P07 | CRV1 | (48)a |
TRBV25-1 | CASSAGDEQYF | TRBJ2-7 | CRV | P05 | CRV1 | |
TRBV25-1 | CASSPGDEQYF | TRBJ2-7 | CRV | P01 | CRV1 | (48)a |
TRBV25-1 | CASSPGDTQYF | TRBJ2-3 | CRV | P06 | CRV1 | |
TRBV25-1 | CASTHGDEQFF | TRBJ2-1 | CRV | P05 | CRV1 | |
TRBV25-1 | CASTPGDEQYF | TRBJ2-7 | CRV | P06 | CRV1 | |
TRBV25-1 | CASTQGDEQFF | TRBJ2-1 | CRV | P06 | CRV1 | |
TRBV25-1 | CASTSGDEQFF | TRBJ2-1 | CRV | P07 | CRV1 | |
TRBV25-1 | CASTTGDEQFF | TRBJ2-1 | CRV | P03 | CRV1 | |
TRBV25-1 | CATSPGDEQYF | TRBJ2-7 | CRV | P01 | CRV1 | |
TRBV25-1 | CASTLGDEQYF | TRBJ2-7 | CRV | P05 | CRV1 | |
TRBV25-1 | CAVTAGDEQFF | TRBJ2-1 | CRV | P07 | CRV1 | |
TRBV28 | CASSPISNEQFF | TRBJ2-1 | CRV | P01, P07 | CRV2 | (48)a |
TRBV28 | CASSPVSNEQFF | TRBJ2-1 | CRV | P01, P02 | CRV2 | |
TRBV28 | CASSPISNEQYF | TRBJ2-7 | CRV | P01 | CRV2 | |
TRBV6-1/6-5/6-6 | CASSPGTPRDEQFF | TRBJ2-1 | CRV | P03, P05 | CRV3 | |
TRBV6-1/6-5/6-6 | CASSQGTPRDEQYF | TRBJ2-7 | CRV | P05 | CRV3 | |
TRBV6-1/6-5/6-6 | CASSSGQKNTEAFF | TRBJ1-1 | CRV | P01, P07 | CRV4 | |
TRBV6-1/6-5/6-6 | CASSTGQKNTEAFF | TRBJ1-1 | CRV | P01 | CRV4 | |
TRBV6-1/6-5/6-6 | CASTPGQKNTEAFF | TRBJ1-1 | CRV | P04 | CRV4 | |
TRBV6-1/6-5/6-6 | CASTTGQKNTEAFF | TRBJ1-1 | CRV | P05 | CRV4 | |
TRBV6-1/6-5/6-6 | CATTSGQKNTEAFF | TRBJ1-1 | CRV | P01 | CRV4 | |
TRBV6-1/6-5/6-6 | CASQPGQKNTEAFF | TRBJ1-1 | CRV | P08 | CRV4 | |
TRBV6-1/6-5/6-6 | CASSSGLTNTEAFF | TRBJ1-1 | CRV | P06 | CRV4 | |
TRBV20-1 | CSAPDWNNEQFF | TRBJ2-1 | CRV | P01, P02 | CRV5 | |
TRBV20-1 | CSAPDWGNEQFF | TRBJ2-1 | CRV | P08 | CRV5 | |
TRBV20-1 | CSAPNWFNEQFF | TRBJ2-1 | CRV | P05 | CRV5 | |
TRBV20-1 | CSAPTWDNEQFF | TRBJ2-1 | CRV | P01 | CRV5 | |
TRBV28 | CASSFPDTQYF | TRBJ2-3 | CRV | P01, P02 | CRV6 | |
TRBV28 | CASTPWGAEAFF | TRBJ1-1 | CRV | P04, P08 | CRV7 | |
TRBV15 | CATSRTGGETQYF | TRBJ2-5 | FRC | P01, P03, P05 | FRC1 | |
TRBV15 | CATSREGGETQYF | TRBJ2-5 | FRC | P05, P06, P08 | FRC1 | |
TRBV15 | CATSAEGGETQYF | TRBJ2-5 | FRC | P08 | FRC1 | |
TRBV15 | CATSGTAGETQYF | TRBJ2-5 | FRC | P08 | FRC1 | |
TRBV15 | CATSRDAGETQYF | TRBJ2-5 | FRC | P06 | FRC1 | |
TRBV15 | CATSRDGGETQYF | TRBJ2-5 | FRC | P02 | FRC1 | |
TRBV15 | CATSRVAGETQYF | TRBJ2-5 | FRC | P06 | FRC1 | (48)a |
TRBV15 | CATSVTGGETQYF | TRBJ2-5 | FRC | P02 | FRC1 | |
TRBV6-2/6-3 | CASSGGLEAFF | TRBJ1-1 | FRC | P03, P07 | FRC2 | |
TRBV6-2/6-3 | CASSLGLEAFF | TRBJ1-1 | FRC | P03 | FRC2 | |
TRBV4-3 | CASSPQRNTEAFF | TRBJ1-1 | RPH | P03, P04, P05, P06, P08 | RPH1 | (46) |
TRBV4-3 | CASSPARNTEAFF | TRBJ1-1 | RPH | P03, P05, P08 | RPH1 | (42) |
TRBV4-3 | CASSPSRNTEAFF | TRBJ1-1 | RPH | P03, P08 | RPH1 | (42) |
TRBV4-3 | CASSPHRNTEAFF | TRBJ1-1 | RPH | P03, P05 | RPH1 | (42) |
TRBV4-3 | CASSPNRNTEAFF | TRBJ1-1 | RPH | P03, P08 | RPH1 | (46) |
TRBV4-3 | CASSPGRNTEAFF | TRBJ1-1 | RPH | P03 | RPH1 | |
TRBV4-3 | CASSPTRNTEAFF | TRBJ1-1 | RPH | P08 | RPH1 | (46) |
TRBV7-8 | CASSFRTVSSYEQYF | TRBJ2-7 | TPR | P01, P03, P04 | TPR1 | |
TRBV7-8 | CASSFRTVNSYEQYF | TRBJ2-7 | TPR | P02, P03 | TPR1 | |
TRBV7-8 | CASSLRTVSSYEQYF | TRBJ2-7 | TPR | P02, P04 | TPR1 | |
TRBV7-9 | CASSLIGVSSYNEQFF | TRBJ2-1 | TPR | P01, P02, P06 | TPR2 | (29, 42) |
TRBV7-9 | CASSLKGVSSYNEQFF | TRBJ2-1 | TPR | P06 | TPR2 | |
TRBV7-9 | CASSLRGESSYNEQFF | TRBJ2-1 | TPR | P01 | TPR2 | |
TRBV7-9 | CASSFRQGVNTGELFF | TRBJ2-2 | TPR | P01, P02 | TPR3 | |
TRBV7-9 | CASSFRQGSNTGELFF | TRBJ2-2 | TPR | P01 | TPR3 | |
TRBV7-9 | CASSFRSGINTGELFF | TRBJ2-2 | TPR | P02 | TPR3 | |
TRBV7-9 | CASSFRQGTPTGELFF | TRBJ2-2 | TPR | P04 | TPR3 | |
TRBV6-2/6-3 | CASSYSSGELFF | TRBJ2-2 | TPR | P01, P08 | TPR4 | (29)b |
TRBV6-2/6-3 | CASSYSGNTEAFF | TRBJ1-1 | TPR | P02, P08 | TPR5 | (47)a |
TRBV7-2 | CASSSRGTVNTEAFF | TRBJ1-1 | TPR | P03, P05 | TPR6 | |
TRBV7-9 | CASSLHTQGARTEAFF | TRBJ1-1 | TPR | P02, P07 | TPR7 | |
TRBV7-9 | CASSLHSRGARTEAFF | TRBJ1-1 | TPR | P02 | TPR7 |
Only donors in whom the TCRs were functionally identified as epitope specific are listed. Boldface indicates the 26 public TCRβ sequences. Amino acid exchanges in the 37 related TCRβ sequences compared with the most similar public TCRβ sequence are underlined.
a: CMV epitope and HLA restriction are not described.
b: Reported with a different Vβ gene usage.
Public TCRβ families precisely distinguish CMV-positive from CMV-negative donors
We analyzed the frequency of public TCRβ families in primary T cells from an extended cohort of CMV-positive and CMV-negative donors: eight additional CMV-positive donors (P09–P16) and nine CMV-negative donors (N01–N09) with the HLA-B*07:02–C*07:02 haplotype. Public TCRβ family members were much more abundant in CMV-positive than CMV-negative donors (Fig. 6A). This was true for each of the four CMV epitopes when evaluated separately (Fig. 6B, left). Taking all epitopes together, the median cumulative read frequency was 171-fold higher in CMV-positive than in CMV-negative donors of our 25-donor cohort (Fig. 6C; data for each TCRβ and donor are provided in Supplemental Table III). Mean cumulative frequency in donors P09–P16 and donors P01–P08 was similar, which showed that there was no bias in favor of the P01–P08 cohort of donors, in whom the sequences were originally identified.
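As an illustration of the quantities compared here, the cumulative read frequency of a donor is the fraction of all TCRβ reads that belong to public or related sequences, and the fold difference is the ratio of the group medians. The Python sketch below makes several assumptions: clonotypes are keyed by Vβ gene and CDR3 amino acid sequence, the function names are hypothetical, and all read counts and frequencies are invented for demonstration and are not data from this study.

```python
from statistics import median

# A clonotype is identified here by (Vβ gene, CDR3β amino acid sequence).
Clonotype = tuple[str, str]


def cumulative_frequency(reads: dict[Clonotype, int], signature: set[Clonotype]) -> float:
    """Fraction of all reads in one repertoire that match the signature set."""
    total = sum(reads.values())
    if total == 0:
        return 0.0
    return sum(n for clonotype, n in reads.items() if clonotype in signature) / total


def median_fold_difference(positive: list[float], negative: list[float]) -> float:
    """Ratio of the median cumulative frequencies of two donor groups."""
    denominator = median(negative)
    if denominator == 0:
        raise ValueError("median of the reference group is zero")
    return median(positive) / denominator


# Two sequences from Table I serve as a toy signature.
signature = {("TRBV25-1", "CASSPGDEQFF"), ("TRBV4-3", "CASSPQRNTEAFF")}

# Invented read counts for one donor; the third clonotype is a placeholder.
donor_reads = {
    ("TRBV25-1", "CASSPGDEQFF"): 30,
    ("TRBV4-3", "CASSPQRNTEAFF"): 20,
    ("TRBV9", "CASSVGQGYEQYF"): 950,
}
print(cumulative_frequency(donor_reads, signature))

# Invented per-donor cumulative frequencies for two groups.
print(median_fold_difference([2e-3, 4e-3, 6e-3], [1e-5, 2e-5, 4e-5]))
```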
Frequency of public TCRβ families in peripheral T cell repertoires of CMV-positive and CMV-negative donors. (A) Cumulative proportion of TCRβ sequence reads for each individual TCRβ family in the ex vivo TCRβ repertoires of donors P01–P16 and N01–N09. Read frequencies for each of the 63 TCRβ sequences that form these families can be found in Supplemental Table III. (B) Cumulative proportion of reads for all public or related TCRβ sequences with the same epitope specificity in our donor cohort and an independent HLA-B7–positive subcohort from Emerson et al. (48). (C) Cumulative proportion of TCRβ sequence reads of all public/related TCRβ sequences in CMV-positive (solid circles) and CMV-negative (hollow circles) donors in our cohort and in HLA-B7–positive and HLA-B7–negative donors from Emerson et al. (Supplemental Table III). Gray circles identify donors P01–P08, from whose repertoires the TCRβ sequences were originally derived. The dashed line indicates a possible cutoff to separate CMV-positive and CMV-negative donors in our cohort (F1 score = 1) and the B7-positive cohort of Emerson et al. (F1 score = 0.93). Solid lines show the median cumulative read frequencies. The p values were calculated with a two-tailed Mann–Whitney U test.
To further validate the set of public TCRβ families, we tested it on a larger cohort of donors whose ex vivo TCRβ repertoires were recently published (48) and whose CMV status and partial HLA type (low-resolution HLA-A and -B, no HLA-C) were available. Because presence of HLA-B7 is a strong indicator of the presence of the haplotype HLA-B*07:02–C*07:02 in persons of European descent, whereas persons of Asian descent often express HLA-C*07:02 without HLA-B*07:02 (41), we limited our analysis to the 352 donors categorized as “White, not Hispanic or Latino.” Of these donors, 94 were HLA-B7 positive, and 258 were HLA-B7 negative. As shown in Fig. 6C, CMV-specific public TCRβ families were strongly enriched in HLA-B7–positive, CMV-positive donors, but not in HLA-B7–positive, CMV-negative donors (p < 1 × 10⁻¹⁵). In HLA-B7–negative donors, no enrichment was observed irrespective of CMV status. In a separate analysis of each epitope in this cohort, CRV was the strongest discriminator (p = 7.5 × 10⁻¹³; Fig. 6B, right).
A chosen cutoff value of 10⁻⁴ for the total proportion of reads of our set of 63 public or related TCRβ sequences led to perfect discrimination between CMV-positive and CMV-negative donors within our cohort, and very good identification of CMV-positive donors within the HLA-B7–positive published cohort (F1 score = 0.93; 100% specificity, 88% sensitivity). Taken together, our results show that the CMV-specific TCRβ signature, as identified through our approach, is highly indicative of CMV-specific T cell immunity associated with the CMV status in healthy donors.
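The relationship between a frequency cutoff and the reported performance measures can be made explicit with a short calculation. The Python sketch below classifies donors by comparing their cumulative read frequency with a cutoff of 1e-4 and derives sensitivity, specificity, and F1 score from the resulting confusion matrix. The function name `evaluate_cutoff` is hypothetical, treating values equal to the cutoff as positive is an assumption, and the donor values are invented and do not reproduce either cohort.

```python
def evaluate_cutoff(frequencies, is_cmv_positive, cutoff=1e-4):
    """Score a frequency cutoff against known CMV status.

    A donor is called CMV-positive if the cumulative frequency is at or
    above the cutoff (whether the boundary is inclusive is an assumption).
    """
    tp = fp = tn = fn = 0
    for frequency, truth in zip(frequencies, is_cmv_positive):
        predicted = frequency >= cutoff
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else float("nan"),
    }


# Invented example: eight CMV-positive donors (one below the cutoff)
# and four CMV-negative donors (all below the cutoff).
frequencies = [3e-3, 1e-3, 8e-4, 5e-4, 4e-4, 2e-4, 1.5e-4, 2e-5,
               1e-5, 3e-5, 0.0, 6e-6]
is_cmv_positive = [True] * 8 + [False] * 4
print(evaluate_cutoff(frequencies, is_cmv_positive))
```

Because the F1 score depends only on true positives, false positives, and false negatives, a cutoff with no false positives has an F1 score determined entirely by its sensitivity.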
Discussion
In this study, we analyzed the composition and sharing of TCRβ repertoires against four epitopes that are major targets of the CMV-specific CD8+ T cell response. Specific TCRs against two of these epitopes, restricted through HLA-C, had not to our knowledge been studied before. We identified a set of CMV-specific public TCRβ families that distinguishes CMV-positive from CMV-negative healthy persons in two independent cohorts with high precision.
We show that the CMV epitopes investigated in this study considerably shape the T cell repertoire of healthy virus carriers. HLA-C*07:02–restricted T cells were particularly prominent; in six of eight donors, they provided one of the four most frequent TCRβ clonotypes to the overall T cell repertoire and, in two of these donors, the top-frequency TCRβ clonotype. These results expand on the general observation that CMV-specific T cells make up a large proportion of the CD8+ T cell repertoire, on average amounting to 5% of the total CD8+ T cell response based on IFN-γ secretion (49). This proportion is likely to be even larger when all Ag-specific cells are included in measurements, not just those that exert a chosen effector function at the time of analysis (50). Advanced donor age boosts CMV-specific T cell frequencies as well (19), an aspect we could not investigate in our cohort of anonymous donors.
The TCRβ signatures of CMV epitope–specific T cells were diverse. For a given epitope, the dominant TCRβ sequence was usually a different one in different donors (Supplemental Table II). Thus, TCRβ repertoires were not dominated across the board by heavily conserved (public) sequences, in contrast to what was observed for some epitopes from influenza (51, 52) or EBV (53). Rather, the TCRβ repertoires in this study were mostly composed of clonotypes that were shared with only a subset of matched donors or were entirely donor-specific (private). These patterns are reminiscent of those previously found for TCR repertoires of HLA-A– and HLA-B–restricted CMV-specific T cells (29–31, 33, 34, 54, 55). Nonetheless, our approach identified a series of public TCRβ sequences whose cumulative frequency was strongly indicative of CMV carrier status in larger cohorts of healthy donors, even though these public TCRs did not necessarily represent the most abundant epitope-specific T cells in the eight donors in whom they were originally identified. It seems clear that exclusive TCRβ sequencing underestimates the true diversity of an epitope-specific T cell response because the same TCRβ can be paired with different TCR α-chains to generate the same, or an overlapping, human antiviral epitope specificity (53, 56, 57). However, our results show that TCRβ sequencing, even in the absence of TCRα analysis, is already highly informative regarding the CMV-specific T cell status of donors. This finding may already have been anticipated by a recent large-scale study (48) that showed that signatures of enriched TCRβ sequences distinguished CMV-positive from CMV-negative donors with high precision. Our study extends these findings by demonstrating the predictive power of TCRβ sequences with known CMV epitope specificity. 
In contrast, the previous study (48) analyzed TCRβ sequences that were CMV-associated, but mostly not known to be CMV-specific, which means that other pathogens with overlapping epidemiology may have contributed to the signal. For future diagnostic or prognostic applications in immunocompromised patients, it will be preferable to focus on analysis of precisely defined Ag-specific TCRβs because such patients may simultaneously reactivate or acquire multiple, and even related, pathogens. We conclude that TCRβ sequencing provides a highly informative and economical standalone approach to identification of epitope-specific T cells in healthy carriers and patients at risk for viral reactivation and in potential need of antiviral prophylaxis or treatment (1, 58, 59). For example, TCRβ sequencing could replace HLA/peptide multimer staining or other contemporary assays used in monitoring (1, 58) as a means to examine the presence of T cells specific for CMV and other infections after transplantation, covering multiple epitopes, HLAs, and virus specificities in a single assay. Such an assay will facilitate the identification of patients in need of antiviral chemotherapy or adoptive T cell therapy, thereby reducing adverse effects and costs of unnecessary treatment and minimizing the need for blood samples for multiple diagnostic tests. In the longer term, comprehensive analysis of the CMV-specific TCR repertoires in patients will allow identification of those epitope specificities and TCRs that are most strongly associated with protection from CMV reactivation. Hypothetically, these may include T cells specific for HLA-C–restricted epitopes because of their resistance to CMV immunoevasins (17, 18). Protective TCRs and epitope specificities will provide the best basis for prognostic evaluation and therapeutic decisions and will be the best candidates for development of adoptive T cell therapy or vaccination.
Conserved TCRβs are not necessarily perfectly conserved. It was often observed that human HLA-A– or HLA-B–restricted CD8+ T cell responses to an epitope contain nonidentical, but highly similar, TCRβ sequences; these use the same or closely related Vβ and Jβ segments and typically have CDR3 regions that show exchanges in only a few amino acid positions. Such relationships are apparent in datasets on multiple epitopes from CMV (29, 31, 42, 55, 60, 61), EBV (29, 31, 53, 61–64), HIV (29), or influenza virus (51, 52). Similarity between TCRβs of the same specificity can also manifest as the presence of conserved short motifs within the CDR3 (29, 52), often at or near residues 6 and 7; these residues are at its center and generally make strong contributions to binding of MHC/peptide by the TCR and shaping T cell specificity (65). Analysis of such TCRβ relationships was recently taken to a more comprehensive level (33, 34) by the design of computational algorithms to cluster T cells of the same specificity according to multiple indicators of sequence similarity. Global sequence similarity of the CDR3 is the single factor that predicts specificity best (34), and prediction is further improved by adding aspects such as CDR3 length and Vβ gene segment usage. Accordingly, we observed multiple occurrences of high similarity of CDR3 sequence, length, and Vβ usage among our 26 CMV-specific public TCRs. Therefore, we decided to group CMV-specific public and related TCRβs into families based on these criteria. We found that TCRβ families that were defined in this way could precisely distinguish healthy persons with and without established CMV-specific T cell immunity. Public TCRβ family sequences were as rare in CMV-negative persons who carry the relevant HLA haplotype as in CMV-positive persons who lack the HLA haplotype. 
This finding shows that such TCRβ family members are unlikely to appear in T cells of irrelevant specificity or HLA restriction, confirms the predictive power of TCRβ sequencing, and suggests that our TCRβ grouping approach can, in future studies, be extended to a broader variety of epitopes, viruses, or antigenic entities.
Since its initial description (26, 32), high-throughput sequencing of TCRβ repertoires has become a widely used research method, but it has not yet found wide clinical application. We have now introduced technical improvements that will enhance its robustness and applicability to accelerate advancement to routine clinical use. Dual-indexed, paired-end bidirectional sequencing of the entire CDR3 region is likely to reduce errors in the resulting Illumina sequencing data (66). Our technique to identify epitope-specific TCRβ repertoires by peptide stimulation and comparison with two controls eliminates the need to physically isolate specific T cells. Earlier studies that aimed at characterizing epitope-specific TCR repertoires have generally employed sorting of HLA/peptide multimer–labeled T cells (33–35, 54, 61) or occasionally sorting of T cells labeled with markers of activation or proliferation (67). High-throughput sequencing has the capacity to identify TCR sequences even from very low-frequency components of the sample. However, T cell subsets sorted based on multimers or other markers are not perfectly pure, and there is a possibility that low-frequency contaminants, which may derive from dominant clones from the parent population, are erroneously assigned the specificity of interest. Regardless of whether multimer sorting or peptide stimulation is used to identify specific TCR clonotypes, artifacts can be avoided by quantitatively verifying enrichment of clonotypes relative to the parent population and relative to a sample treated with a different multimer or peptide. Our present approach has the limitation that it only covers T cells capable of proliferating in vitro in response to Ag. However, depending on the research question at hand, this limitation may be advantageous because T cells capable of proliferation will, in many cases, be those that are functionally more relevant in disease or immune control.
Moreover, the numerical increase of Ag-responsive T cells due to proliferation, as well as the increased absolute amount of TCRβ mRNA per cell several days after activation (68), will, in itself, increase resolution and thus the likelihood that rare clonotypes can be detected in samples of limited size. In contrast, HLA/peptide multimer staining can capture T cells that express a specific TCR irrespective of their functional properties; however, despite recent progress (69, 70), multimers are still challenging to produce for certain HLA allotypes and epitopes, and they may not always stain the entire Ag-specific T cell population (71).
It seems safe to predict that identification and quantification of Ag-specific TCRβ repertoires will increasingly enter clinical practice for purposes of diagnostics and monitoring. Repositories of annotated Ag-specific TCRs (72), refined computational tools for TCR sequence analysis (38), and TCR sequence datasets (48), generously made available to the public, will be of great use in further developing the method. With growing datasets, an increasing number of epitope-specific TCR sequences against various pathogens will be found to be shared between carriers. Such public TCR sequences will find application in various fields. In clinical research, public TCR sequences may be used as indicators to track virus-specific T cell responses in patients after transplantation to identify epitopes that mobilize a protective T cell response against pathogens such as CMV (6). Virus-specific TCR signatures may also be exploited in diagnostics and disease monitoring (36, 73, 74) to inform about a patient’s status regarding past or present infection with multiple pathogens, success of vaccination or T cell transfer, and risk of future infection or reactivation. Pathogen-specific TCRs that are frequently present in the self-tolerant repertoire of multiple healthy donors are likely to be nonresponsive to human Ags in various genetic backgrounds. Such TCRs carry a low risk of allo-HLA cross-reactivity (75) and are therefore favorable candidates for immunotherapy with TCR-transgenic T cells (76).
Footnotes
This work was supported by the Deutsche Forschungsgemeinschaft (SFB-TR36, Project A4).
The sequences presented in this article have been submitted to the National Center for Biotechnology Information’s Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo) under accession number GSE114931.
The online version of this article contains supplemental material.
Abbreviations used in this article:
- CRV: CRVLCCYVL
- FRC: FRCPRRFCF
- mini-LCL: mini–lymphoblastoid cell line
- RPH: RPHERNGFTVL
- TCRβ: TCR β-chain
- TPR: TPRVTGGGAM
Disclosures
The authors have no financial conflicts of interest.