Abstract
Strong statistical associations between polymorphisms in HIV-1 population sequences and carriage of HLA class I alleles have been widely used to identify possible sites of CD8 T cell immune selection in vivo. However, there have been few attempts to prospectively and systematically test these genetic hypotheses arising from population-based studies at a cellular, functional level. We assayed CD8 T cell epitope-specific IFN-γ responses in 290 individuals from the same cohort, which gave rise to 874 HLA–HIV associations in genetic analyses, taking into account autologous viral sequences and individual HLA genotypes. We found immunological evidence for 58% of 374 associations tested as sites of primary immune selection and identified up to 50 novel HIV-1 epitopes using this reverse-genomics approach. Many HLA-adapted epitopes elicited equivalent or higher-magnitude IFN-γ responses than did the nonadapted epitopes, particularly in Nef. At a population level, inclusion of all of the immunoreactive variant CD8 T cell epitopes in Gag, Pol, Nef, and Env suggested that HIV adaptation leads to an inflation of Nef-directed immune responses relative to other proteins. We concluded that HLA–HIV associations mark viral epitopes subject to CD8 T cell selection. These results can be used to guide functional studies of specific epitopes and escape mutations, as well as to test, train, and evaluate analytical models of viral escape and fitness. The inflation of Nef and HLA-adapted variant responses may have negative effects on natural and vaccine immunity against HIV and, therefore, has implications for diversity coverage approaches in HIV vaccine design.
The dual challenges of HIV-1 diversity and evasion of human immunity have concentrated efforts in the vaccine field to optimize diversity coverage in vaccines on the one hand (1, 2), as well as to distinguish protective from nonprotective immune responses on the other hand (3). With respect to CD8 T cell immunity, diversity and immunogenicity considerations may well intersect if specific, predictable genetic variations in HIV-1 have important functional consequences for prevalent epitope-specific responses. HIV-1 mutational escape from cellular immune responses generated in acute and chronic infection contributes to HIV-1 diversity at the population level. In particular, HLA-restricted CD8 CTL responses are sufficiently suppressive to exert selection pressure on HIV quasispecies; however, in most individuals, ongoing viral replication allows the eventual outgrowth of CTL-adapted viruses (4, 5). Therefore, such variations have functional implications for immunogenicity, and if present in a vaccine immunogen, would effectively be preadapted to certain HLA types. Furthermore, the presence of escape mutations in a vaccine immunogen may influence the immunodominance of vaccine-induced CTL responses, as suggested by significant changes in immunodominance hierarchies that follow early viral evolution and diversification in natural infection (6). Understanding the immunological consequences of specific HIV variations may become increasingly important as more are incorporated into polyvalent vaccines designed to optimize population-diversity coverage (1, 7, 8).
Although information on specific variations can be derived from a number of observational CTL-escape studies (5, 9–11), the breadth of HLA backgrounds and viral mutations examined in these studies is narrow relative to the great breadth of HLA genotypes and HIV-1 diversity present in human populations. Since the first population-based HLA–HIV association study in 2002 (12), several large-scale studies have identified natural HIV-1 polymorphisms and networks of polymorphisms that seem to be significantly HLA allele-specific across the full HIV-1 subtype B and C proteomes, after accounting for viral phylogeny and linkage disequilibria in the MHC (13–19). These associations are not a functional demonstration of immune escape but, rather, may be considered individual hypotheses, based on a statistical association, of an in vivo biologic interaction between an HLA class I molecule and the viral epitope spanning the polymorphism or distant epitopes linked functionally to the polymorphic site. Although recent approaches have also sought to incorporate whole mutational networks involving multiple viral codons into the analyses (16), it is not possible to prove the order of consecutive changes by these analyses alone (i.e., it is possible that residues covary because of compensatory fitness-balancing interactions between viral residues or because of codominant targeting by the same HLA-restricted CTLs). These studies used published CTL epitope and escape data and known compensatory patterns to validate associations; however, the repertoire of confirmed, published epitopes is not complete, particularly for less-common HLA alleles, alleles associated with non-Caucasian populations, and HLA-C in general. HLA-C–restricted responses may be particularly important in view of recent evidence linking levels of HLA-C cellular expression to better immunological control (20).
There are even fewer viral escape data to validate the functional effect of all polymorphisms observed in vivo. Therefore, we sought to use population-derived HLA–HIV associations as starting hypotheses and systematically characterize the epitope-specific CD8 T cell responses that may account for them in vivo, as well as determine the functional effects of HLA-associated variations on T cell reactivity in individuals and in a population. For those HLA–HIV associations for which no evidence of a direct influence on viral epitope–T cell interactions could be found after systematic testing, the likelihood that they are driven by compensatory interactions or networks within the HIV proteome would also increase. We used a previously published dataset of genome-wide HLA–HIV-1 associations derived from a large diverse population from the United States to predict the epitopic targets of prevalent CTL responses (19) and assayed these responses ex vivo in the same population. For each individual, we tested known and predicted nonadapted or immune-susceptible HIV-1 epitopes along with the paired adapted epitope sequence relevant to their own HLA-A, -B, and -C alleles and autologous viral epitope sequences. We primarily aimed to determine the proportion of HLA–HIV genetic associations that could be additionally explained or supported by T cell epitope data gained as a result of this systematic testing compared with using only published epitope information. Having carried out large-scale population-based cellular testing, we aimed to characterize the distribution of these prevalent T cell responses across the HIV proteome, their response rates, and their magnitude. We also aimed to analyze how immune reactivity is influenced by the strength of the epitope-prediction value, the autologous virus sequence, and clinical indices.
Finally, we sought to determine the changes to reactivity caused by HLA-driven polymorphism on individual epitopes and overall patterns of immune reactivity at the population level that could impact vaccine-design considerations.
Materials and Methods
Study cohort and samples
The cohort of individuals examined in this study (n = 414) was a subset of the 555 individuals with chronic HIV-1 infection who were coenrolled in Adult AIDS Clinical Trials Group (AACTG) studies A5142 and A5128 from the United States. AACTG A5142 was a randomized clinical trial comparing three first-line antiretroviral drug regimens in individuals with no previous antiretroviral therapy and a viral load ≥2000 copies/ml plasma (21). There were no inclusion/exclusion criteria based on CD4 T cell counts. Subjects were recruited from 55 centers across the United States between 2003 and 2004 and were coenrolled in A5128 if they provided consent for inclusion in the AIDS Clinical Trials Group human DNA bank (22). Baseline pretreatment viral load measurements were available. All participants provided written informed consent to these investigations, and the study was approved by the Institutional Review Board governing the AACTG prior to commencement.
The subset of 414 individuals had undergone HIV-1 sequencing and HLA class I genotyping (resolved to four-digit types in all but three cases) and had participated in a previous population analysis involving 800 individuals that generated a dataset of 874 HLA allele-associated HIV-1 genome-wide subtype B polymorphisms (19). These study participants were selected based on availability of cryopreserved PBMCs for immunological studies. PBMCs obtained from baseline visits and before commencement of antiretroviral therapy had been cryopreserved in central AACTG facilities between 2003 and 2004 and were transported to the Centre for Clinical Immunology and Biomedical Statistics in 2008.
Formulation of HLA-based peptide sets
For every one of the 874 HLA associations identified in the previous genetic analysis involving the AACTG 5142/5128 cohort (19), we applied the Epipred T cell epitope-prediction program (23; http://atom.research.microsoft.com/bio/epipred.aspx) to a sequence window of 13 aa residues flanking either side of the HLA-associated site in the population-consensus sequence to score the probability of CD8 T cell epitopes with a matching HLA allelic restriction. Scores were generated for the sequence containing the adapted amino acid, as well as the nonadapted amino acids, to predict the effect of the polymorphism on immune reactivity. The Epipred prediction algorithm was trained on characteristics of known CD8 T cell epitopes, including HLA-specific peptide-binding motifs, TCR contact residues, epitope length, and flanking sequences, to generate a probability score for predicted epitopes relative to known, published epitopes assigned a score of 1. Epipred used Bayes rule to compute the posterior probability that a viral sequence contains an epitope, assuming a prior probability of 10%. A detailed example of an Epipred calculation for a single-input HLA allele–peptide sequence is provided in Supplemental Table I. All epitope sequences with a score ≥ 0.4 (representing ≥40% positive predictive value of being a true epitope flanking an association, and a 4-fold increase from prior probability) were considered putative epitopes for immunological testing, even if they contained the HLA-adapted polymorphism. Peptides representing the paired HLA-adapted (resistant/escaped) or nonadapted (susceptible/wild-type) sequences were synthesized and tested to confirm HLA-restricted immune reactivity to the nonadapted epitope, as well as loss or reduction of reactivity due to specified HLA-associated epitope variations from that epitope.
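The windowing and thresholding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Epipred implementation: the function names, the 0-based site index, and the likelihood-ratio form of Bayes' rule are hypothetical choices made for the example. Only the constants (13 flanking residues, a 10% prior, and a 0.4 score threshold) come from the text.

```python
PRIOR = 0.10      # prior probability that a sequence contains an epitope (from the text)
THRESHOLD = 0.40  # minimum score for a putative epitope (from the text)
FLANK = 13        # residues taken on either side of the HLA-associated site (from the text)


def flanking_window(consensus: str, site: int, flank: int = FLANK) -> str:
    """Return the residues within `flank` positions of the 0-based `site`.

    The window is clipped at the ends of the protein (an assumption).
    """
    start = max(0, site - flank)
    end = min(len(consensus), site + flank + 1)
    return consensus[start:end]


def posterior_from_likelihood_ratio(likelihood_ratio: float, prior: float = PRIOR) -> float:
    """Generic Bayes' rule in odds form; Epipred's actual model is not reproduced here."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def is_putative_epitope(score: float, threshold: float = THRESHOLD) -> bool:
    """A candidate is carried forward for testing when its score meets the threshold."""
    return score >= threshold


def fold_over_prior(score: float, prior: float = PRIOR) -> float:
    """How many times more probable than the prior a given score is."""
    return score / prior
```

Under these assumptions, a score at the 0.4 threshold corresponds to the 4-fold increase over the 10% prior stated in the text, and a full window around an internal site contains 27 residues (13 + 1 + 13).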
Additional epitopes (n = 137) that did not span any HLA–HIV polymorphism associations in the genetic analysis (19) but were in the “A” (optimally defined/confirmed) or “B” (not optimally defined) lists of defined CD8 T cell epitopes published in the January 2009 update of the Los Alamos National Laboratory (LANL) HIV immunology database (http://www.hiv.lanl.gov/content/immunology) were added to the testing protocol to act as positive controls where possible (identified as “A and B list epitopes without HLA-HIV associations”).
Epitope selection was predicated on the HLA genotype of the subject. However, the number of predictions that was finally tested was constrained by the number of PBMCs available. For this reason, epitopes for each individual were ranked in order of preference for testing based first on being possible novel epitopes, second on Epipred score, and third on sequence match to the autologous viral sequence. Ranked lists of epitopes for every individual in the cohort were generated electronically using an in-house database. PBMCs were thawed, rested overnight in RPMI 1640 and 10% heat-inactivated fetal calf serum (R10), and the number of cells was ascertained using a Vi-Cell XR (Beckman Coulter, Gladesville, NSW, Australia), as previously described (24). Epitopes were then selected for testing for each individual from the ranked lists based on the number of cells.
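The three-level ranking described above maps naturally onto a composite sort key. The sketch below is illustrative only: the `Candidate` record, the placeholder peptide labels, and the simple "one well per epitope" budget (which ignores control and replicate wells) are assumptions, not details of the in-house database.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    peptide: str              # placeholder label, not a real sequence
    is_novel: bool            # possible novel epitope (first ranking criterion)
    epipred_score: float      # Epipred score (second criterion, higher first)
    matches_autologous: bool  # matches the individual's viral sequence (third criterion)


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates: novel first, then by descending score, then autologous match."""
    return sorted(
        candidates,
        key=lambda c: (not c.is_novel, -c.epipred_score, not c.matches_autologous),
    )


def select_for_testing(
    candidates: List[Candidate], cells_available: int, cells_per_well: int
) -> List[Candidate]:
    """Take as many top-ranked candidates as the available cells allow.

    Assumes one well per epitope and ignores control wells (a simplification).
    """
    n_wells = cells_available // cells_per_well
    return rank_candidates(candidates)[:n_wells]
```

Sorting on a tuple keeps the priority order explicit: a later criterion only breaks ties left by the earlier ones.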
IFN-γ ELISPOT assays
IFN-γ responses to HIV-1–derived epitopes were quantified using Mabtech (Nacka Strand, Sweden) reagents in 96-well nitrocellulose-backed plates (Millipore, Bedford, MA). Plates coated with 2 μg/ml anti–IFN-γ Ab were blocked with R10 for a minimum of 30 min and washed using an ELx 405 washer (BioTek, Winooski, VT), after which 30,000–50,000 PBMCs, along with anti-CD28 Ab (BD Biosciences-Pharmingen, North Ryde, NSW, Australia) at a final concentration of 1 μg/ml, were added to each well (24). Lyophilized peptides (Invitrogen, Mulgrave, VIC, Australia) were reconstituted to 10 mg/ml in DMSO, from which 1 mg/ml aliquots were made and stored at −20°C before use. The 1 mg/ml peptide stocks were further diluted to 50 μg/ml in R10 and tested in single or duplicate wells at a final concentration of 5 μg/ml. Where possible, triplicate wells of media alone served as negative controls, whereas anti-CD3 Ab was used as a positive control either in single or duplicate wells. After the addition of cells, peptides, and anti-CD3 Ab, the ELISPOT plates were incubated overnight at 37°C. Plates were then washed, and IFN-γ spots were developed with biotinylated Ab and streptavidin horseradish peroxidase, according to the manufacturer’s instructions. IFN-γ spots were detected using 3,3′,5,5′-tetramethylbenzidine (24).
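The peptide dilution series above (10 mg/ml, 1 mg/ml, 50 μg/ml, 5 μg/ml final) can be checked with a short calculation. The helper name `dilution_factor` is hypothetical; the concentrations are taken from the text and expressed in μg/ml.

```python
def dilution_factor(stock_ug_per_ml: float, target_ug_per_ml: float) -> float:
    """Fold dilution needed to go from a stock to a target concentration."""
    if target_ug_per_ml <= 0 or target_ug_per_ml > stock_ug_per_ml:
        raise ValueError("target must be positive and no higher than the stock")
    return stock_ug_per_ml / target_ug_per_ml


# Concentrations from the text, all in ug/ml:
# DMSO stock, frozen aliquot, working dilution in R10, final in-well concentration.
series_ug_per_ml = [10_000, 1_000, 50, 5]

# Fold dilution at each consecutive step of the series.
step_factors = [
    dilution_factor(stock, target)
    for stock, target in zip(series_ug_per_ml, series_ug_per_ml[1:])
]

# Overall dilution from the DMSO stock to the well.
overall_factor = dilution_factor(series_ug_per_ml[0], series_ug_per_ml[-1])
```

The steps correspond to 10-fold, 20-fold, and 10-fold dilutions, for an overall 2000-fold dilution of the DMSO stock.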
The large number of peptides, PBMC samples, and individualized testing required use of a previously described automated system (24), in which the electronically generated peptide lists for each individual were integrated with the Biomek FX automated sample-handling platform (Beckman Coulter), with software developed to electronically track the locations and volumes of all reagents, including peptides and PBMCs, on the 96-well nitrocellulose plate. Databases were created in-house to track reagent stock volumes and the number of freeze/thaw cycles of peptide stocks and document experimental procedures and results. Once optimized, epitope-specific IFN-γ responses were investigated in a maximum of 30 individuals in 1 d (24). The plates were read on an AID plate reader (Autoimmun Diagnostika, Strassberg, Germany), and the average count for the background was subtracted from all wells. Positive responses were defined as greater than twice the mean of the background and ≥100 spot-forming units (SFUs)/10⁶ PBMCs (25). Very high spot counts for nine epitopes that could not be enumerated by the AID plate reader and were designated “too numerous to count” were assigned a value of 15,000 SFUs/10⁶ PBMCs for all quantitative analyses, based on the uppermost limit of values actually enumerated in the study.
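The positivity rule above can be expressed as a small scoring function. This sketch makes two assumptions that the text does not spell out: the "twice the background" comparison uses the raw well count, and the ≥100 SFUs/10⁶ PBMCs threshold is applied to the background-subtracted count. The function names are hypothetical.

```python
from typing import Sequence, Tuple

MIN_SFU_PER_MILLION = 100   # positivity threshold (from the text)
TNTC_SFU_PER_MILLION = 15_000  # value assigned to "too numerous to count" wells (from the text)


def sfu_per_million(spots: float, cells_per_well: int) -> float:
    """Scale a per-well spot count to spot-forming units per 10^6 PBMCs."""
    return spots * 1_000_000 / cells_per_well


def score_well(
    spots: float,
    background_spots: Sequence[float],
    cells_per_well: int,
    too_numerous_to_count: bool = False,
) -> Tuple[float, bool]:
    """Return (background-subtracted SFUs per 10^6 PBMCs, is_positive)."""
    if too_numerous_to_count:
        return float(TNTC_SFU_PER_MILLION), True
    background_mean = sum(background_spots) / len(background_spots)
    # Negative net counts are floored at zero (an assumption).
    net_spots = max(0.0, spots - background_mean)
    net_sfu = sfu_per_million(net_spots, cells_per_well)
    positive = spots > 2 * background_mean and net_sfu >= MIN_SFU_PER_MILLION
    return net_sfu, positive
```

For example, 10 spots in a well of 50,000 cells with a mean background of 1 spot gives 180 SFUs/10⁶ PBMCs after subtraction and is scored positive under these assumptions.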
Statistical analyses
A number of predicted epitopes had more than one possible HLA restriction; in some cases, individuals carried two or more of the associated HLA alleles. In this case, Epipred scores were used to identify the most likely responding peptide–HLA combination in the individual. Therefore, IFN-γ responses were inherently more likely to be attributed to putative epitopes with high scores or known epitopes (where the Epipred score was assigned as 1) over putative epitopes with low scores as the most conservative approach to the analyses.
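The attribution rule for peptides with more than one possible restriction can be sketched as a lookup over the alleles an individual carries. The function name and the example scores used in any call are hypothetical; the text specifies only that the highest Epipred score decides and that known epitopes are scored as 1.

```python
from typing import Dict, Iterable, Optional


def attribute_restriction(
    scores_by_allele: Dict[str, float], carried_alleles: Iterable[str]
) -> Optional[str]:
    """Pick the carried HLA allele with the highest Epipred score for a peptide.

    `scores_by_allele` maps each predicted restricting allele to its score
    (1.0 for known, published epitopes). Returns None if the individual
    carries none of the predicted alleles. Ties go to the first allele
    encountered in `scores_by_allele` (an arbitrary choice for this sketch).
    """
    carried = set(carried_alleles)
    eligible = {a: s for a, s in scores_by_allele.items() if a in carried}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

Because known epitopes carry a score of 1, this rule attributes a shared response to a known restriction before a putative one, which is the conservative behavior the text describes.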
For each epitope, the proportion of responders was calculated as the proportion of individuals tested who had a response ≥100 SFUs/10⁶ PBMCs (25). Selected analyses involving comparisons of relative magnitude of responses included all nonzero responses to account for the use of a predefined cut-off for positivity, as well as to avoid a zero-inflated distribution of responses given that nonzero responses were normally distributed on the log scale. Mann–Whitney tests were used for evaluation of group epitope-specific differences, Spearman correlations were used for assessment of correlations with Epipred scores, and generalized linear mixed models were used for assessing individual-specific associations using TIBCO Spotfire S+ 8.2 for Windows. All other analyses were performed using Prism 5.02 (GraphPad).
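The two summaries described above, the proportion of responders and the log-scale magnitudes of nonzero responses, can be sketched as follows. The function names are hypothetical, and the base-10 logarithm is an assumption because the text says only "the log scale". The statistical tests themselves (Mann–Whitney, Spearman, mixed models) are not reproduced here.

```python
import math
from typing import List, Sequence

CUTOFF_SFU = 100  # positivity cut-off in SFUs per 10^6 PBMCs (from the text)


def proportion_responders(responses: Sequence[float], cutoff: float = CUTOFF_SFU) -> float:
    """Fraction of tested individuals whose response meets the cut-off."""
    if not responses:
        raise ValueError("no individuals tested for this epitope")
    return sum(1 for r in responses if r >= cutoff) / len(responses)


def nonzero_log10(responses: Sequence[float]) -> List[float]:
    """Log-transform all nonzero responses, including those below the cut-off."""
    return [math.log10(r) for r in responses if r > 0]
```

Keeping sub-threshold nonzero values in the magnitude comparison, while dropping zeros, mirrors the stated rationale of avoiding both the arbitrary cut-off and a zero-inflated distribution.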
Results
Predicted CD8 T cell epitopes spanning HLA associations
There were 221 already-known CD8 T cell epitopes shown at or near sites of HLA–HIV associations with matching HLA restriction. There were an additional 53 epitopes that had minor variations in length or sequence to known epitopes, but these changes were not at sites of HLA-associated polymorphism. The remaining 157 epitopes seemed to be completely novel, giving a total of 431 epitopes with unique HLA restrictions and Epipred scores between 0.4 and 1 spanning 367 (of 874 total) HLA associations (Fig. 1). There were 507 HLA allele-specific polymorphisms, all with q-values [false-discovery rates (26)] < 0.2 following phylogenetic correction, for which no epitope sequence with predictive scores >0.4 was detected within a 26-codon sequence spanning the association. Among the 431 epitopes associated with the HLA allele-specific polymorphisms, there was a markedly higher proportion of epitopes in Nef (37% of all predicted epitopes; 0.77 epitopes/codon) compared with all other proteins (Fig. 2A).
For 52 epitopes, the presence of the HLA-associated substitution in the epitope changed the Epipred score of the epitope from >0.4 to <0.4, predicting reduced or lost immune reactivity and in keeping with CTL escape in vivo. However, we detected Epipred scores >0.4 for 131 adapted epitopes, of which 25 had higher scores relative to the nonadapted epitope. In these cases, we presume that the HLA-adapted variant sequence retained characteristics of an epitope still predicted to elicit a T cell response by the Epipred program.
IFN-γ T cell responses to predicted epitopes spanning HLA associations
We then sought to test these predictions in assays of ex vivo epitope-specific T cell responses using the IFN-γ ELISPOT assay. Of the 414 patient-specific PBMC samples thawed and enumerated, 290 had cell counts >1.5 × 10⁵ cells/ml, with an average viability of 82% (range, 33–100%) after thawing; these were used in subsequent immunological investigations. Using the known HLA class I alleles carried by individuals in the cohort with sufficient PBMCs available for testing (n = 290), we generated a list of known and putative CD8 T cell epitopes unique to each individual in the study. Of all 431 potential HLA/epitope combinations for testing arising from our genetic analysis, 320 (spanning 327 HLA–HIV associations) were ultimately tested; for the remainder, there were insufficient numbers of subjects with the relevant HLA or, less commonly, problems with synthesizing the peptide. Of these 327 HLA–HIV associations, only 35% were proximate to well-characterized, published CD8 T cell epitopes with the relevant HLA restriction.
CD8 T cell responses to these epitopes, together with A- and B-list epitopes without HLA–HIV associations, were investigated across 94 HLA-individualized 96-well plates for 290 individuals, with an average of 13 epitopes tested per individual (range, 1–56 epitopes). At least one positive IFN-γ response was elicited by 51% of the epitopes tested and in 140 of the 290 individuals investigated. The number of responses per individual ranged from 0 to 33, with an average of 2 epitope-specific responses. Among individuals who mounted positive responses, the median magnitude of their IFN-γ responses was 590 SFUs/10⁶ PBMCs (interquartile range [IQR], 280–1440 SFUs/10⁶ PBMCs); 128 individuals did not respond to any tested peptides, and 22 individuals failed to elicit a response to the anti-CD3 Ab control. These 22 individuals who did not respond to the positive control had an average cell viability of 67% compared with 84.1% in the remainder of the cohort (p < 0.0001; Mann–Whitney test).
Protein distribution of prevalent detected IFN-γ T cell responses
The protein distribution of responses was similar to the distribution of predicted epitopes, with epitopes in Nef eliciting the largest proportion of responses overall (38% of all epitope-specific responses); Nef also had the highest number of responding epitopes/codon compared with all other proteins (Fig. 2B), although the overall mean magnitude of responses was not significantly different among Gag, Nef, Tat, Pol, and Env (Fig. 3). Notably, no IFN-γ response was detected against Vpu epitopes, including the known Vpu epitope ER9 (EYRKILRQR) (27), although the genetic analyses identified the E29Q, I33L, and R37K mutations within the epitope associated with carriage of HLA-A*33 in this cohort (19).
On a within-individual basis, Nef-derived epitopes elicited the highest-magnitude response more commonly (n = 74 epitopes; median, 1780 SFUs/10⁶ PBMCs; IQR, 750–4600 SFUs/10⁶ PBMCs), whereas there were 31 epitopes in Gag (median, 980 SFUs/10⁶ PBMCs; IQR, 300–4000 SFUs/10⁶ PBMCs) and 26 epitopes in Pol (median, 680 SFUs/10⁶ PBMCs; IQR, 360–1130 SFUs/10⁶ PBMCs) that accounted for the highest-magnitude response in responding individuals. These patterns of reactivity both at the population and individual level largely reflected the distribution of HLA associations and epitope predictions, because there was a greater number of epitopes from Nef predicted (Fig. 2A) and tested (Fig. 2B). In a mixed-model regression analysis, which takes the numbers of epitopes tested into account, Nef epitopes were more likely to mount positive responses compared with Env (p = 0.02) but not compared with epitopes in Gag (p > 0.9) and Pol (p = 0.1). A slight majority (57%) of these “highest magnitude per individual” responses targeted known epitopes, whereas the remaining IFN-γ responses were directed against putative epitopes and minor variants of known epitopes. Of note, the number of individuals tested for each epitope was also a function of the prevalence of the restricting HLA allele, such that epitopes associated with rare alleles were tested less frequently. We identified a group of 33 epitopes that was tested in at least five individuals and elicited positive responses in ≥40% of those individuals tested. In this group of prevalent “responding” epitopes, 61% were clustered in Nef (Fig. 4). We did not detect any statistically significant differences in the distribution of HLA restrictions between putative versus A- or B-list epitopes (data not shown).
HLA associations marking novel CD8 T cell epitopes
In the study cohort overall, positive IFN-γ responses were directed against a total of 143 known epitopes drawn from those associated with the HLA associations in the original genetic analyses or those added from the January 2009 LANL update and not associated with HLA-driven polymorphism. Of these, 73 A-list epitopes and 70 B-list epitopes elicited at least one IFN-γ response in this cohort. In general, known epitopes had an average response rate of 33%, with 122 known epitopes eliciting no responses at all. There were consistent responses against nine novel epitopes in individuals carrying the HLA allele predicted to restrict the epitope (Table I). These nine epitopes were considered “high-probability” novel epitopes because they were not listed in A- or B-lists of the January 2009 LANL update (http://www.hiv.lanl.gov/content/immunology), there was common carriage of only one HLA allele predicted to bind the epitope, there were at least five individuals tested, and the response rate among those tested was ≥40% and, therefore, comparable to the mean response rate (33%) seen for known epitopes. For example, the HLA-association studies identified HLA-C*04:01–driven polymorphism within FF9 (FPQGKAREF) in the Gag/Pol transframe region restricted by HLA-C*04:01. This epitope elicited responses in three of six individuals with carriage of HLA-C*04:01 tested (median, 380 SFUs/10⁶ PBMCs; range, 340–1100 SFUs/10⁶ PBMCs).
Protein | HLA | Epitope | No. of Individuals Tested | Positive Responses (%) |
---|---|---|---|---|
Gag | C*07:01 | EIYKRWIIL | 12 | 58 |
Gag | B*42 | EPIDKELYPL | 5 | 40 |
Pol | B*27:05 | KRKGGIGGY^a | 7 | 86 |
Pol | C*04:01 | FPQGKAREF | 6 | 50 |
Tat | A*32:01 | NCYCKQCCF | 6 | 50 |
Nef | A*03 | SVVGWPAVR | 19 | 58 |
Nef | C*08 | QVPVRPMTYK | 11 | 55 |
Nef | B*14:02 | QRQDILDLW | 6 | 50 |
Nef | C*04:01 | VRYPLTFGW | 19 | 42 |
High-probability novel epitopes are those for which the sequence or the HLA restriction had not been published as of the January 2009 LANL update, that were tested in five or more individuals, and that had a positive response rate ≥40%.
^a This epitope was not listed in the January 2009 update of A- or B-list epitopes from the LANL HIV immunology database, but it was described in Ref. 28.
There were an additional 41 “possible” novel epitopes that were not listed in the LANL A- or B-lists and elicited at least one positive IFN-γ response in the study; however, fewer than five individuals were tested, or the response rate was <40% (Table II). For example, HIV adaptation to HLA-B*14:02 was associated with a change from threonine (T) at position 133 in Nef and was predicted to lie within the TW9 (TRYPLTFGW) epitope. IFN-γ responses were investigated in eight individuals with HLA-B*14:02; there were three responders (median magnitude, 480 SFUs/10⁶ PBMCs; range, 400–740 SFUs/10⁶ PBMCs).
Protein | Novel HLA Restrictions | Epitope | No. of Individuals Tested | Positive Responses (%) | Known Alternative HLA Restriction (Ref.) |
---|---|---|---|---|---|
Gag | B*57 | HQAISPRTL | 1 | 100 | B*15 (15), B*1510 (29) |
Gag | B*14:01 | DRWEKIRLR | 2 | 50 | None |
Gag | B*15:01 | RLRPGGK(R)KKY | 13 (15) | 38 (7) | A*03 (30), A*0301 (31) |
Gag | B*45:01 | AEQASQDVKNW | 5 | 20 | B*44 (32), B*4402 (32) |
Gag | C*03 | RLRPGGKKKY | 7 | 14 | A*03 (30), A*0301 (31) |
Gag | A*03:01 | RAPRKKGCWK | 27 | 4 | None |
Gag | A*31:01 | TVKCFNCGK | 6 | 17 | None |
Pol | B*35 | FPQGKARE(K)F | 37 (37) | 27 (8) | None |
Pol | B*35:12 | VPLTEEAEL | 13 | 15 | None |
Pol | A*68 | LVDFRELNK | 13 | 15 | None |
Pol | B*08:01 | QVRDQAEHL | 14 | 14 | None |
Pol | A*02:01 | IIKIQNFRV | 38 | 5 | None |
Pol | A*24 | QYDQILIEI | 19 | 5 | None |
Pol | B*07:02 | FPQGKAREL(F) | 26 (24) | 12 (8) | None |
Pol | A*33:01 | YLS(A)WVPAHK | 6 (4) | 17 (25) | None |
Vif | B*15 | ISKKAKRWFY | 18 | 6 | None |
Vpr | A*02:06 | WTLELLEEL | 4 | 25 | None |
Vpr | B*27:05 | SRIGITRQR | 5 | 20 | None |
Vpr | B*51:01 | FPRVWLHGL | 10 | 10 | None |
Tat | A*32:01 | CCFHCQVCF | 6 | 33 | None |
Tat | B*58:02 | CCFHCQVCF | 4 | 25 | None |
Tat | C*16:01 | CCFHCQVCF | 4 | 25 | None |
Tat | C*06 | CCFHCQVCF | 7 | 14 | None |
Tat | A*03 | GLGISYGRK | 30 | 17 | None |
Env | A*01:01 | GPGPGRAFY | 10 | 30 | None |
Env | A*01:01 | SFEPIPI(S)HY | 16 (16) | 19 (6) | A*29 (33), A*2902 (34) |
Nef | C*07:02 | QVPLRPMTY(F)K | 1 (20) | 100 (25) | A*03 (35), A*0301 (36), A*1101 (37) |
Nef | C*02:02 | KRQDILDLW | 2 | 50 | None |
Nef | C*04:01 | TRYPLTFGW | 18 | 39 | A*33 (15) |
Nef | B*14:02 | TRYPLTFGW | 8 | 38 | A*33 (15) |
Nef | B*18:01 | KEVLVWKF | 17 | 35 | None |
Nef | C*03:03 | QVPLRPMTFK | 4 | 25 | None |
Nef | B*42:01 | YPLTFGWCF | 4 | 25 | B*53 (38) |
Nef | B*08:01 | AFHHMAREL | 14 | 21 | None |
Nef | A*31:01 | DPEKEVLVWK | 5 | 20 | None |
Nef | B*14 | KRQDILDLW | 6 | 17 | None |
Nef | B*40:01 | MEDPEKEVL | 6 | 17 | None |
Nef | B*08:01 | GALDLSHFL | 10 | 10 | None |
Nef | B*44 | KRRDILDLW | 23 | 9 | None |
Nef | C*04:01 | GAFDLSHFL | 31 | 6 | None |
Nef | B*44 | GYFPDWQNY | 17 | 6 | None |
Either these epitopes were tested in fewer than five individuals or the proportion of responders was <40%. Epitopes and variants were counted as a single possible novel epitope. Data on variants of these epitopes with the same HLA restriction are shown in parentheses. Identical peptide sequences with different HLA restrictions are listed separately.
Taking all responses detected against novel epitopes, known epitopes, and minor variants of known epitopes and presuming that the ex vivo peptide presentation was mediated by the predicted HLA allele, ELISPOT testing in this cohort confirmed 190 (58%) of 327 HLA associations that had a predicted epitope and were ultimately able to be directly tested in our study, given the relevant HLA types available. There were 137 HLA–HIV associations for which we could not show minimal support for marking a primary site of T cell selection based on our immunological studies involving HLA-directed ELISPOT screens of 290 individuals (Fig. 5). As previously mentioned, there were 507 HLA allele-specific polymorphisms identified in the original genetic analysis that did not have any known or putative epitope predicted in proximity to the association. Those HLA associations for which we could not assign any known epitope or any predicted novel epitope or find at least one positive IFN-γ response in the proteomic region spanning the association may be considered more likely to represent secondary/compensatory amino acid covariation or false-positive associations and are less likely to indicate a primary site of T cell escape.
Epitope-specific responses and associations with Epipred scores, autologous viral sequences, viral loads, and CD4 counts
In the subset of epitopes tested in at least five individuals, higher proportions of individuals responded to known epitopes compared with putative epitopes (median proportion of responders: A-list = 33% and B-list = 19% versus putative = 7%; p = 0.0003 and p = 0.003, respectively; Mann–Whitney test; Fig. 6A). The magnitude of IFN-γ responses for known epitopes was also higher than the magnitude of responses for putative epitopes (median epitope-specific responses: A-list = 420 SFUs/10⁶ PBMCs and B-list = 200 SFUs/10⁶ PBMCs versus putative = 70 SFUs/10⁶ PBMCs; p = 0.003 and p = 0.02, respectively, taking all nonzero responses into account; Mann–Whitney U test; Fig. 6B). Among putative epitopes, we did not detect a statistically significant correlation between Epipred scores and either the proportion of positive responders (Spearman’s r = +0.06; p = 0.5) or the magnitude of IFN-γ responses (Spearman’s r = +0.05; p = 0.6).
At the individual level, baseline viral load and CD4 counts did not predict response (p > 0.1), but the probability of responding was significantly higher for nonadapted epitope sequences that matched the autologous viral sequences (p < 0.0001; generalized linear mixed-effect models).
IFN-γ responses to HLA-adapted epitopes
For the majority of epitopes, peptides with the nonadapted (susceptible/wild-type) sequence and the adapted (resistant/escaped) sequence were synthesized and tested to confirm HLA-restricted immune reactivity to the nonadapted epitope, as well as loss or reduction of reactivity due to the specified HLA-associated epitope variation. For 76 nonadapted epitopes tested in parallel with the paired adapted epitope, the HLA-associated amino acid substitution occurred within the epitope; for 32 of these, complete loss of an IFN-γ response to the adapted epitope was seen in all cases. In the remainder, the HLA-adapted version of the epitope still elicited IFN-γ responses ≥100 SFUs/10⁶ PBMCs.
HLA-associated polymorphisms occurred outside 66 epitopes tested in our study, representing potential sites of epitope-processing escape. IFN-γ responses were elicited by 30 of these epitopes (median magnitude, 1090 SFUs/10⁶ PBMCs; IQR, 420–2690 SFUs/10⁶ PBMCs). In addition, in 26 cases, mutations occurring within one putative or known epitope resulted in predictions of new possible epitopes (Epipred scores ≥ 0.4) adjacent to or partially overlapping the original epitope and associated with the same allele, suggesting that some neo-epitopes may remain available for HLA and T cell engagement, despite being in an HLA-adapted state. For example, an HLA-A*24:02–driven change from tyrosine (Y) to phenylalanine (F) at codon 135 in Nef RF10 (RYPLTFGWCF) (39) was still associated with an Epipred prediction (score = 0.61) of HLA-A*24:02–mediated recognition of FF9 (FPLTFGWCF). Both epitopes were tested in six individuals with carriage of HLA-A*24:02, with five individuals responding to the nonadapted epitope (median, 1580 SFUs/10⁶ PBMCs; range, 360–4560 SFUs/10⁶ PBMCs) and IFN-γ responses elicited by the adapted epitope in three individuals (median, 440 SFUs/10⁶ PBMCs; range, 200–520 SFUs/10⁶ PBMCs).
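The Nef RF10 example can be traced at the sequence level. The sketch below applies the Y-to-F substitution to RF10 and lists the 9-residue windows of the adapted peptide, showing that FF9 lies within it. The helper names are hypothetical, and the start position of 134 is inferred from the text (the tyrosine at codon 135 is the second residue of RF10). Epipred scoring is not reproduced here.

```python
# Sketch of the Nef RF10 -> FF9 example; helper names are hypothetical.
RF10 = "RYPLTFGWCF"   # nonadapted HLA-A*24:02 epitope quoted in the text
RF10_START = 134      # inferred: the text places the tyrosine at codon 135

def substitute(peptide, start, codon, old, new):
    """Replace the residue at a given protein codon within a peptide."""
    idx = codon - start
    if peptide[idx] != old:
        raise ValueError(f"expected {old} at codon {codon}, found {peptide[idx]}")
    return peptide[:idx] + new + peptide[idx + 1:]

def kmers(peptide, k):
    """Return all contiguous k-residue windows of a peptide."""
    return [peptide[i:i + k] for i in range(len(peptide) - k + 1)]

adapted = substitute(RF10, RF10_START, 135, "Y", "F")
print(adapted)                            # RFPLTFGWCF
print("FPLTFGWCF" in kmers(adapted, 9))   # True
```

Enumerating windows in this way only identifies candidate peptides; whether a given window is presented and recognized depends on the prediction scores and the ELISPOT results described above.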
A substantial number of adapted epitopes elicited IFN-γ responses ≥100 SFUs/10⁶ PBMCs (n = 74), including 11 in which the mean magnitude of the response was 2-fold higher for the adapted epitope relative to the nonadapted epitope in each individual tested. This seemed to be a general phenomenon, with examples in all proteins except Vpu, but it was very prominent in Nef (Fig. 7). There were some extremely complex patterns of new epitope creation resulting from HLA-associated changes in Nef, as described above for the HLA-A*24:02–restricted epitope, RF10. This was particularly evident in the central region of Nef, where a sequence of 77 aa (positions 71–148) contained 21 partially overlapping epitopes created by polymorphism, which elicited IFN-γ responses in our study cohort. Given the possible ramifications of this for vaccine-induced immunity, we compared the proportion of CD8 T cell epitopes that would be in Nef compared with Gag, Pol, and Env (as common vaccine Ags), if all HLA-specific variations and predicted epitopes were taken into account, versus the numbers of epitopes in these proteins in a single subtype B strain sequence (Fig. 8A, 8B). This indicated an inflation of Nef epitopes and contraction of Pol, Gag, and particularly, Env epitopes associated with diversity coverage at the population level. This was further replicated when comparing proportions of epitopes that induce IFN-γ responses, with Nef accounting for the greatest proportion of epitope-specific IFN-γ responses relative to the other proteins in individuals in this study (Fig. 8C).
To determine whether responses to HLA-adapted epitopes could reflect general cross-reactivity phenomena, as opposed to de novo responses to the adapted epitope specifically, we asked whether responses to adapted epitopes were more likely when there was a response to the nonadapted epitope despite a lack of match with the autologous viral epitope (that is, a more cross-reactive response). As noted above, the probability of responding, in general, was significantly higher for nonadapted epitope sequences that matched the autologous viral sequences (p < 0.0001; generalized linear mixed-effect model). Among those with demonstrated responses against a nonadapted epitope, those with a match between the autologous sequence and the nonadapted epitope sequence exhibited higher response rates to the adapted epitope (mean, adjusted for protein = 25%) compared with those whose autologous viral sequence matched only the adapted epitope (14%; p = 0.05) or neither the nonadapted nor the adapted epitope (14%; p = 0.02).
Discussion
To our knowledge, this is the first large-scale reverse-genomics study in which the results of a genetic analysis were used to directly inform the selection and subsequent testing of particular viral Ags. Overall, we were able to provide immunological support for 190 HLA-associated polymorphisms in subtype B HIV-1 as being sites of direct T cell recognition in vivo based on ex vivo IFN-γ responses in the appropriate HLA background. This was 58% of the HLA associations tested in the study, representing an increase from only 35% that could have been explained by well-characterized published CD8 T cell epitopes alone, prior to any cellular testing. For nine high-probability epitopes, there was a sufficiently frequent HLA type to show that the most likely HLA restriction of the epitopic response in the cohort matched that of the prediction, and there was sufficient frequency of testing and responses in ≥40% of cases to give the best level of evidence for immunoreactivity. An additional set of possible novel epitopes was defined with response rates <40% but immunoreactivity in at least one individual with the predicted restricting HLA allele. However, it is noteworthy that even well-characterized published epitopes that have been used as a standard to validate genetic associations and as reagents in immunological studies had a mean response rate of only 33%. Therefore, we applied a higher standard of evidence for immunogenicity to potential novel epitopes compared with that observed for known epitopes in this study. The fact that cellular responsiveness was correlated with sequence match of the testing Ag to autologous virus, as shown in other studies (40), further confirmed that viral diversity does influence the specificity of cellular responses within the individual. 
These data, in general, provided experimental evidence of a direct biological basis for 190 strongly HLA-associated subtype B HIV-1 polymorphisms proteome-wide as sites of HIV-1 adaptation to HLA-restricted T cell responses and should serve to guide further epitope characterization and viral-escape studies.
HIV-1 Nef was associated with the greatest number of epitopes that elicited IFN-γ over the whole cohort and within individuals. This intense immunogenicity is in keeping with the extreme levels of HLA allele-specific selection in Nef shown in several population-based genetics studies (14, 19, 41) and mirrors the distribution of well-characterized epitopes defined by cellular studies. Because the majority of putative epitopes were tested in parallel with their HLA-adapted pair, we were also able to determine whether any functional consequences of polymorphism within epitopes were apparent in a screening ELISPOT assay. Marked reductions in IFN-γ responses associated with the polymorphisms were seen in a proportion of cases, supporting a role for loss of TCR engagement or HLA–peptide binding in vivo in these examples. There were also instances in which the HLA-adapted or escaped version of the epitope elicited equivalent or higher magnitude responses than did the nonadapted versions. In a screening ELISPOT with excess peptide concentrations, it is possible that such reactivity patterns result from T cell cross-reactivity, although this seemed more likely to occur with Nef epitopes compared with other proteins, and it is not clear why TCR clonotypes specific for Nef epitopes should be inherently more cross-reactive than other TCRs. Furthermore, we did not find that responses that seemed more inherently cross-reactive, as indicated by lack of match with autologous viral sequences, were more likely to respond to the adapted epitopes. The general determinants of T cell recognition of viral variants have been explored in other studies (42, 43). It is important to emphasize that we tested specific epitope pairs based on population signals of adaptation. 
In all of these specific instances of positive responses to HLA-adapted epitopes, there was strong statistical evidence of the adapted residue being enriched in vivo in the selecting HLA context in the original HLA-associations analysis, suggesting that in our cellular studies, either the true differences in peptide avidity were not apparent at excess peptide concentrations and would diverge with serial peptide dilutions or, alternatively, that inducing immune responses to adapted variants provides some selective advantage to HIV-1 in vivo.
The formation of neo-epitopes as a result of T cell escape has been described in longitudinal studies (44), but our data suggested that this could be a reasonably common phenomenon. We previously described cases of HLA selection leading to high-avidity, neo-epitope–specific responses in chronic progressive HIV infection (45) and argued that this could represent a way for HIV mutations to promote “bad” immunodominance patterns in chronic infection and drive HIV evolution, not necessarily away from all immune recognition but toward enhanced, but ineffective, recognition of a narrow range of epitopes. In this study, there were several extremely complex patterns of HLA-associated polymorphisms in Nef leading to formation of new epitope targets for the same and new HLA alleles that were partially overlapping or distant from the original epitope. Given this combination of high variability with high density of reactive epitopes, including reactivity to many overlapping HLA-adapted variants, it is not surprising that Nef epitopes, as a proportion of all reactive epitopes, are relatively inflated, and that the IFN-γ responses to Nef dominate over all others when considered at a population level. If these Nef responses lead to a relative reduction in targeting of more structurally or functionally constrained proteins, such as Gag or Pol, in vivo, where viral adaptations are more likely to incur fitness costs, then Nef-dominated immunity is conceivably more advantageous to the virus than to the host. Because these immunodominance patterns characterize chronic infection where immune control has manifestly failed, recapitulating such immune hierarchies by a vaccine immunogen would seem empirically undesirable, particularly for therapeutic vaccines that could serve to boost this inflation.
It is not known whether broad poly-specific vaccine-induced responses prior to viral exposure could block, not block, or even enhance particular transmitting viral variants, although these data will emerge as more polyvalent strategies in preventative vaccines advance to clinical trials. Computational strategies that are based on conservation, that are polyvalent but seek to minimize the inclusion of rare or unfavorable epitopes, or that are based on acute transmitted founder viruses may overcome this issue. This set of immunological data could be useful in scoring algorithms used to computationally optimize inclusion of important circulating acute variants and perhaps help to exclude particular variants that seem prone to interference or immunodominance phenomena in vivo.
Despite the large size of our study cohort, the extreme polymorphism of HLA molecules still limits the degree to which the HLA allele restriction of many responses could be defined analytically, and limited stored cellular material of our study cohort subjects precluded further experimental studies. Because we assigned a higher ranking to known or high-probability HLA restrictions for those epitopes with overlapping HLA restrictions, our study is also inherently conservative, with a bias against assignment of novel epitope responses when there are limited numbers of individuals with that HLA. Furthermore, the use of an epitope-prediction program trained on characteristics of known epitopes will inevitably tend to predict epitopes more similar to known epitopes; therefore, the 507 associations for which no proximal epitope was predicted cannot be absolutely excluded as sites of true immune selection, particularly given the low mean response rate of even known epitopes shown in this study. However, the additional peptide synthesis, sample, and assay requirement of assessing all possible epitopic regions and variants spanning all associations is prohibitive at a practical level.
The challenges of translating the findings of genetic HLA polymorphism-association studies to the functional, cellular level are considerable and include the extreme polymorphism of HLA and HIV-1, as discussed above (which necessitates large sample sizes); availability of samples and subjects for immunological testing; limitations in amount and quality of cryopreserved sample material (particularly from pretreatment time points); the general heterogeneity of T cell responses between subjects and over time; limitations of ex vivo-based assays and single biomarkers, such as IFN-γ; and the false-discovery rate of associations arising from any genetic-associations study. Nevertheless, we were able to expand the base of immunological support for a number of subtype B HIV-1 polymorphisms being sites of immune selection. In addition to providing positive evidence for immune reactivity, the absence of any reactivity for some peptides can be useful in studies of secondary or compensatory mutational networks. Indeed, this study suggested that only a minority of HLA–HIV polymorphisms (given a q-value cut-off of 0.2, with adjustment for viral phylogeny) can be explained by primary escape or cotargeting of multiple epitopes, and many others are more likely secondary mutations affecting structurally or functionally interdependent residues. Mapping the mutational networks or genetic haplotypes in HIV-1 that determine viral fitness under diverse host environments will reveal more about the importance of specific residues in HIV replication and pathogenesis. The information provided in this study on mutations that are highly HLA allele specific but not within or near epitopes could help those modeling or studying such covariation networks for both vaccine research and identifying novel ligands for antiviral drugs.
More HLA-association population-based studies will continue to be done in new and genetically diverse populations (46) and in larger populations, and the output from several studies has already been used as presumptive sites of viral escape in a number of secondary analyses. However, there is little value in generating vast numbers of hypotheses across these studies unless they are systematically tested and validated at the functional level, where possible. The results of such testing can be used to refine mapping of primary viral escape and compensatory pathways; iterate and validate analytical approaches to genetic studies; and understand the links between HIV polymorphism, adaptation, and immunogenicity.
Acknowledgements
We thank the study teams, study sites, and participants in the U.S. Adult AIDS Clinical Trial Group A5142 (NCT00050895) and A5128 (NCT00031408) protocols, as well as colleagues at the Centre for Clinical Immunology and Biomedical Statistics, Institute of Immunology and Infectious Diseases. We also thank Donald Cooper, Shay Leary, Micheal Corkery, and Daniel Piccoli for computing work associated with large-scale high-throughput ELISPOT testing.
Footnotes
This work was supported by Grant RO1 AI060460 from the National Institute of Allergy and Infectious Diseases. The AIDS Clinical Trial Group (ACTG) is supported by Grant AI-68636, and the Vanderbilt DNA Resources Core is supported by Grant RR024975. The ACTG Clinical Trials Sites that collected DNA were supported by National Institutes of Health Grants AI64086, AI68636, AI68634, AI069471, AI27661, AI069439, AI25859, AI069477, AI069513, AI069452, AI27673, AI069419, AI069474, AI69411, AI69423, AI69494, AI069484, AI069472, AI069501, AI69467, AI069450, AI32782, AI69465, AI069424, AI38858, AI069447, AI069495, AI069502, AI069556, AI069432, AI46370, AI069532, AI046376, AI34853, and AI069434. The project was also supported by the Australian National Health and Medical Research Council (program Grant 384702) and the Bill and Melinda Gates Foundation (Grant 31844).
The content of this study is the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institutes of Health.
The online version of this study contains supplemental material.
Disclosures
The authors have no financial conflicts of interest.