Identification of the specific HLA locus and allele presenting an epitope for recognition by specific TCRs (HLA restriction) is necessary to fully characterize the immune response to Ags. Experimental determination of HLA restriction is complex and technically challenging. As an alternative, the restricting HLA locus and allele can be inferred by genetic association, using response data in an HLA-typed population. However, simple odds ratio (OR) calculations can be problematic when dealing with large numbers of subjects and Ags, and because the same epitope can be presented by multiple alleles (epitope promiscuity). In this study, we develop a tool, denominated Restrictor Analysis Tool for Epitopes, to extract inferred restriction from HLA class II–typed epitope responses. This automated method infers HLA class II restriction from large datasets of T cell responses in HLA class II–typed subjects by calculating ORs and relative frequencies from simple data tables. The program is validated by: 1) analyzing data of previously determined HLA restrictions; 2) experimentally determining in selected individuals new HLA restrictions using HLA-transfected cell lines; and 3) predicting HLA restriction of particular peptides and showing that corresponding HLA class II tetramers efficiently bind to epitope-specific T cells. We further design a specific iterative algorithm to account for promiscuous recognition by calculation of OR values for combinations of different HLA molecules while incorporating predicted HLA binding affinity. The Restrictor Analysis Tool for Epitopes program streamlines the prediction of HLA class II restriction across multiple T cell epitopes and HLA types.

Determination of the HLA restriction of human T cell responses is becoming increasingly necessary, as new approaches to immunophenotyping such as CyTOF, Fluidigm, and RNA profiling rely on tetramer staining as a way to gate or isolate Ag-specific T cells (1–3). In molecular terms, HLA restriction reflects the formation of a trimolecular complex, encompassing the Ag-specific TCR and a bimolecular complex formed by a specific epitope and a specific HLA molecule. TCR binding occurs when the epitope–HLA complex “fits” the Ag-binding site of the specific TCR. Hence, recognition of that epitope by the TCR is “restricted” by that particular HLA type. In cellular terms, restriction of a T cell response to a given epitope by a given HLA molecule means that T cell recognition will only occur when that epitope is presented by an APC or a target cell expressing the specific HLA molecule that was involved in the original priming and elicitation of the T cell response.

The production of tetramer staining reagents relies on the exact identification not only of the epitope recognized by the specific T cells, but also of the specific HLA molecule(s) that bind and present that epitope to T cells (4). This task is not trivial, because HLA molecules are polygenic (encoded by multiple loci) and polymorphic (each locus has many allelic variants), and because HLA diversity in human populations is extreme (5). HLA class I molecules are encoded at multiple loci (A, B, and C being the main ones). As of January 2015, the three major HLA class I α-chain loci comprise 9308 alleles (HLA-A [2995], HLA-B [3760], and HLA-C [2553]), all of which pair with the invariant β2-microglobulin chain. HLA class II molecules are heterodimers that consist of an α (less polymorphic) and a β (highly polymorphic) chain, and the four major class II loci include 97 α and 2963 β alleles (HLA-DPA [38 alleles] and -DPB [489], HLA-DQA [52] and -DQB [734], HLA-DRA [7], and -DRB1 and -DRB3/4/5 [1740]) (5).

The exact HLA type of human subjects in a population under study can be readily determined by a variety of methods, increasingly relying on second-generation sequencing methods (1–3). The gene and allelic variant restricting specific T cell responses is determined by additional experimentation relying on classical immunobiological approaches, such as inhibition by HLA locus–specific Abs, and use of matched/mismatched or single HLA molecule transfected cell lines (6).

An alternative is to use classic genetic tools based on calculations of odds ratios (ORs) (7). The OR method has been used extensively to estimate the likelihood that a certain genetic trait is associated with a certain biological condition or outcome (8–10). The method is based on comparing the frequency with which a certain outcome is observed in individuals carrying a given gene or allelic variant to the frequency of that outcome in individuals not expressing the given gene or allelic variant. For example, ORs have been extensively used to pinpoint and quantify the relative contribution of various genes to autoimmune diseases (11, 12).

The OR method can be used to determine likely restrictions, by considering the presence or absence of a response to a given epitope as the biological outcome, and calculating the OR and associated statistical significance for each HLA molecule expressed in human subjects in which a response, or lack thereof, is observed. The method is simple and powerful. However, by definition, statistical significance is reached only when a suitably large number of subjects are assayed. As a result, calculations of OR and statistical significance may become cumbersome, especially when a relatively large number of epitopes is simultaneously analyzed in a cohort represented by a large number of allelic polymorphisms.

Additional complexity may arise because of “HLA linkage” and/or “epitope promiscuity.” Certain HLA class I and II alleles are sometimes in very strong linkage disequilibrium; thus, a positive OR can be obtained for an HLA molecule that is not the real restricting element of the epitope but is in linkage disequilibrium with the real restricting allele. Further, an epitope can be restricted by multiple HLA molecules (a phenomenon called epitope promiscuity). In that case, obtaining a positive OR may be problematic, because restriction by multiple alleles may proportionally increase the noise (“rider” alleles) without increasing the signal, thereby masking relevant restricting alleles. This can further complicate identification of the restricting allele for an epitope.

In this article, we report the development and initial validation of the Restrictor Analysis Tool for Epitopes (RATE), a computational application that allows the user to obtain reports describing HLA class II restrictions inferred from response patterns in an HLA-typed human population based on a standardized process flow and statistical evaluation.

The RATE tool is a Python 2.6.5+ CGI script. The web interface is implemented using HTML and python-CGI with the statistical analysis done using RPy (http://rpy.sourceforge.net).

ORs assess the strength of association of one property to another in a sample population (7). In the current application, ORs are used to quantify the strength of associations between expression of a specific allele and detection of positive immune response. An OR greater than 1 indicates a positive association between the two properties in question (i.e., expressing the specific allele increases the “odds” of having positive immune response). ORs are calculated according to the following formula:

$$\mathrm{OR} = \frac{(A^{+}R^{+}) \times (A^{-}R^{-})}{(A^{-}R^{+}) \times (A^{+}R^{-})}$$

where A+ = number of donors expressing a specific allele, A− = number of donors not expressing the specific allele, R+ = number of donors with a positive immune response to the specific peptide, and R− = number of donors who do not have a positive immune response to the specific peptide. Thus, for example, A+R+ indicates the number of donors expressing the specific allele and having a positive response against a specific peptide.
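As a minimal sketch of this calculation, assuming the four cell counts of the allele-by-response table are already available, the cross-product OR can be written in Python as follows. The function name and the example counts are hypothetical and are not taken from the RATE source code; values reported by a statistics package may be based on a different estimator (for example, a conditional maximum likelihood estimate) and need not match this sample cross-product exactly.

```python
import math


def odds_ratio(a_pos_r_pos, a_neg_r_pos, a_pos_r_neg, a_neg_r_neg):
    """Sample (cross-product) odds ratio from the four donor counts.

    Argument order follows the table columns: A+R+, A-R+, A+R-, A-R-.
    """
    numerator = a_pos_r_pos * a_neg_r_neg
    denominator = a_neg_r_pos * a_pos_r_neg
    if denominator == 0:
        # Division by zero: report infinity, or NaN when the ratio is 0/0.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


# Hypothetical cohort: 12 allele carriers (9 responders) and
# 60 non-carriers (10 responders).
print(odds_ratio(9, 10, 3, 50))  # 15.0
```

An OR of 15 in this made-up cohort would indicate that the odds of responding are 15-fold higher among carriers of the allele than among non-carriers.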

The OR becomes infinite when none of the donors who do not express the allele have a positive response, that is, A−R+ = 0. Although this cannot be avoided, a relative frequency (RF) can be used to estimate the enrichment of responders expressing a given allele relative to the whole population. Because the denominator of the RF (the response frequency in all donors) is not zero as long as at least one donor responds, the RF measure remains finite even in instances where the OR measure is “infinity” due to division by zero. Accordingly, we also calculate RF, which is expressed as the ratio of the response in donors expressing the specific allele to the response in all donors, and it is calculated as follows:

$$\mathrm{RF} = \frac{A^{+}R^{+} / (A^{+}R^{+} + A^{+}R^{-})}{(A^{+}R^{+} + A^{-}R^{+}) / \text{Total donors}}$$
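The RF formula can be illustrated with a short Python sketch. The function name and the counts are hypothetical; the example is chosen so that A−R+ = 0, a case in which the cross-product OR would be infinite while the RF stays finite.

```python
def relative_frequency(a_pos_r_pos, a_neg_r_pos, a_pos_r_neg, a_neg_r_neg):
    """Response rate in allele carriers divided by the response rate in all donors.

    Argument order follows the table columns: A+R+, A-R+, A+R-, A-R-.
    Returns None when the ratio is undefined (no carriers or no responders).
    """
    carriers = a_pos_r_pos + a_pos_r_neg
    responders = a_pos_r_pos + a_neg_r_pos
    total = a_pos_r_pos + a_neg_r_pos + a_pos_r_neg + a_neg_r_neg
    if carriers == 0 or responders == 0:
        return None
    rate_in_carriers = a_pos_r_pos / carriers
    rate_in_all_donors = responders / total
    return rate_in_carriers / rate_in_all_donors


# Hypothetical counts: every responder carries the allele (A-R+ = 0).
print(round(relative_frequency(5, 0, 2, 13), 2))  # 2.86
```

Here 5 of 7 carriers respond (71%) against 5 of 20 donors overall (25%), giving an RF of about 2.9.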

Fisher’s exact test is applied here to calculate the statistical significance of the difference in immune response between the donors who express a specific allele and those who do not, thus highlighting the restricting allele for each peptide. A p value < 0.05 was considered statistically significant.

No adjustments are made to the p values to correct for multiple statistical tests. Accordingly, the p values cannot be taken as actual probabilities that a given restriction is true, but rather serve as a relative ranking of which restrictions have the most statistical support. The purpose of these rankings is to guide the further experiments that are necessary to fully confirm restrictions, allowing the experimenter to focus on prioritized candidate HLA alleles.

A specific algorithm was designed to address epitope promiscuity. The algorithm first identifies the HLA alleles expressed in each of the donors who gave a positive response to an epitope. The binding affinity of the epitope for each of the alleles identified is then predicted using the Immune Epitope Database (IEDB) MHC binding prediction tools through RESTful web services (http://tools.immuneepitope.org/main/html/tools_api.html) (13). The binding prediction is done using the consensus method, which uses a combination of NN-align, SMM-align, and CombLib/Sturniolo. If the specified allele is not available under the consensus method, the NetMHCIIpan method is chosen by default. More details on prediction methods are available in Paul et al. (14). All alleles predicted to bind the peptide (IEDB consensus percentile ≤ 15.0) are selected for further screening. For general binding predictions for individual alleles, the recommended threshold for considering a peptide to be a binder is an IEDB consensus percentile of 10.0, whereas for predicting promiscuous binding, the recommended threshold is 20.0 (14, 15). The cutoff used in this study (15.0) was chosen as the midpoint between these two thresholds, based on the reasoning that too stringent a cutoff might overlook potential promiscuous restrictions.
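The screening step can be sketched as a simple filter. The snippet below assumes that percentile ranks have already been retrieved (the call to the prediction service is deliberately omitted); the function name, the dictionary layout, and the percentile values are hypothetical and are not real predictions.

```python
BINDING_PERCENTILE_CUTOFF = 15.0  # cutoff stated in the text


def select_candidate_alleles(predicted_percentiles,
                             cutoff=BINDING_PERCENTILE_CUTOFF):
    """Keep alleles whose predicted percentile rank is at or below the cutoff.

    predicted_percentiles maps allele name -> percentile rank, or None when
    no prediction is available for that allele.
    """
    return sorted(
        allele
        for allele, percentile in predicted_percentiles.items()
        if percentile is not None and percentile <= cutoff
    )


# Made-up percentile ranks for one peptide, for illustration only.
example = {
    "DRB1*15:01": 3.2,
    "DRB5*01:01": 14.9,
    "DQB1*06:02": 41.0,
    "DPB1*04:01": 15.0,
    "DRB3*02:02": None,
}
print(select_candidate_alleles(example))
# ['DPB1*04:01', 'DRB1*15:01', 'DRB5*01:01']
```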

After the algorithm calculates the OR, RF, and p value for each individual allele, it then combines response data for all possible allele pairs and evaluates whether an improved p value is obtained. Subsequent iterations combine various allele groups, and p values are calculated with each iteration. Iteration cycles continue until the p value cannot be improved further. The algorithm reports the allele combinations associated with the best p value for each epitope. For example, consider alleles A, B, C, and D with the values listed in Table I for an epitope.

Table I.
Allele combinations
Allele | A+R+ | A−R+ | A+R− | A−R− | RF | OR | p
27 45 1.7 4.4 0.046 
10 62 2.2 3.5 0.084 
64 2.1 2.9 0.157 
63 2.3 3.9 0.065 
B + C + D 10 20 52 2.5 17.4 <0.001 

Although only allele A is associated with p < 0.05 and no other allele showed significant p value, iterative analysis by combining the data for the other alleles may reveal that the combination of the three alleles (B, C, and D) gives a more significant p value. The algorithm reports this combination of alleles as potential promiscuous restricting alleles.
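The idea of pooling alleles until the p value stops improving can be sketched as a greedy search. This is a simplified illustration, not the RATE implementation: it assumes that a donor counts as a carrier when any allele of the group is expressed, that only positively associated groups are of interest, and that one allele is added per round. All names and the toy cohort are hypothetical.

```python
from math import comb


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact p; a=A+R+, b=A-R+, c=A+R-, d=A-R-."""
    carriers, responders, total = a + c, a + b, a + b + c + d

    def prob(x):
        return (comb(carriers, x) * comb(total - carriers, responders - x)
                / comb(total, responders))

    low = max(0, responders - (total - carriers))
    high = min(carriers, responders)
    observed = prob(a)
    return min(1.0, sum(p for p in map(prob, range(low, high + 1))
                        if p <= observed * (1 + 1e-7)))


def group_p_value(donors, group):
    """p value for a group of alleles; 1.0 if the group is not enriched."""
    a = b = c = d = 0
    for alleles, responded in donors:
        carrier = bool(alleles & group)
        if carrier and responded:
            a += 1
        elif responded:
            b += 1
        elif carrier:
            c += 1
        else:
            d += 1
    if a * d <= b * c:  # negative or no association is out of scope
        return 1.0
    return fisher_p(a, b, c, d)


def best_allele_group(donors, candidates):
    """Greedily add the allele that most improves p until nothing improves."""
    best_group, best_p = frozenset(), 1.0
    while True:
        trials = [best_group | {al} for al in candidates if al not in best_group]
        if not trials:
            break
        trial = min(trials, key=lambda g: group_p_value(donors, g))
        trial_p = group_p_value(donors, trial)
        if trial_p >= best_p:
            break
        best_group, best_p = trial, trial_p
    return set(best_group), best_p


# Toy cohort: responders carry X or Y; non-responders carry only Z.
donors = ([({"X"}, True)] * 4 + [({"Y"}, True)] * 4 + [({"Z"}, False)] * 12)
group, p_value = best_allele_group(donors, ["X", "Y", "Z"])
print(sorted(group), p_value < 0.001)  # ['X', 'Y'] True
```

In this toy cohort neither X nor Y alone accounts for all responders, but the pooled group does, so the combined p value is far smaller than either single-allele p value.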

The RATE tool is available online at http://iedb-rate.liai.org.

Immune response datasets were generated as described previously (16) and by D.M. McKinney, C.S. Lindestam Arlehamn, V. Rozot, E. Makgotlho, W. Hanekom, T.J. Scriba, and A. Sette (manuscript in preparation). In brief, immune responses to various Mycobacterium tuberculosis epitopes in PBMCs from individuals with latent M. tuberculosis infection (LTBI) were measured by IFN-γ–specific ELISPOT as representative of Th1 responses and reported as spot-forming cells (SFCs) per million cells.

In addition, CD4+ T cell immune responses to 15-mer peptides from the acellular Bordetella pertussis vaccine (M.B.C. Dillon, T.A. Bancroft, R. Kolla, S. Paul, J. Sidney, B. Peters, and A. Sette, manuscript in preparation) were measured as previously described (17, 18). In brief, PBMCs isolated from whole blood were stimulated with isolated B. pertussis vaccine proteins for 14 d, with fresh human rIL-2 added every 3 d. Subsequently, the cells were restimulated with peptides and lymphokine production was measured with a dual IFN-γ and IL-5 ELISPOT assay, representative of Th1 and Th2 CD4+ subsets, respectively (18).

The criteria for positivity for the ELISPOT assay we used were as follows: responses were considered positive if the stimulus had >20 SFC/10^6 PBMCs, p < 0.05 by Student t test, and a stimulation index >2.0. These criteria are the ones we have consistently used for >10 y and have been maintained for consistency’s sake in this study as well (15, 17, 19–22).
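These three criteria combine into a single binary call per donor and peptide. A minimal sketch follows, assuming the SFC count, the t test p value, and the stimulation index have already been computed for the well; the function name and argument names are hypothetical.

```python
def is_positive_response(sfc_per_million, p_value, stimulation_index):
    """Apply the three ELISPOT positivity criteria stated in the text."""
    return (
        sfc_per_million > 20       # more than 20 SFC per 10^6 PBMCs
        and p_value < 0.05         # Student t test against background
        and stimulation_index > 2.0
    )


print(is_positive_response(150, 0.01, 6.0))  # True
print(is_positive_response(150, 0.20, 6.0))  # False: fails the t test
print(is_positive_response(20, 0.01, 6.0))   # False: not above 20 SFC
```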

HLA class II typing was performed using next-generation sequencing methods (D.M. McKinney, Z. Fu, L. Le, J.A. Greenbaum, B. Peters, and A. Sette, submitted for publication). Specifically, amplicons were generated from the appropriate class II locus for exons 2–4 by PCR amplification. Sequencing libraries were generated (Illumina Nextera XT) from these amplicons and sequenced with MiSeq Reagent Kit v3 as per manufacturer’s instructions (Illumina, San Diego, CA). Sequence reads were matched to HLA alleles and donor genotyping was assigned (D.M. McKinney, Z. Fu, L. Le, J.A. Greenbaum, B. Peters, and A. Sette, submitted for publication).

B. pertussis biotinylated HLA class II tetramers conjugated to streptavidin-PE were provided by the Tetramer Core Laboratory at Benaroya Research Institute. For B. pertussis ex vivo tetramer staining experiments, CD4+ T cells were isolated from cryopreserved PBMCs using CD4+ T cell Isolation Kit (Miltenyi) according to manufacturer’s instructions. For B. pertussis in vitro tetramer staining, isolated PBMCs were stimulated as described earlier with the tetramer-specific peptide for 14 d and subsequently harvested for analysis. Purified CD4+ cells or expanded cells were incubated with a 1:50 dilution of tetramer-PE for 2 h at room temperature and stained for 30 min at room temperature in FACS buffer (PBS with 2% FBS) with Abs to the following surface markers: CD3-AF700 (BD Biosciences), CD4-allophycocyanin-ef780 (eBioscience), CD8-V500 (BD Biosciences), CD45RA-ef450 (eBioscience), and CCR7-PerCP-Cy5.5 (Biolegend). After washing, cells were resuspended in PBS and read on a BD LSRII and analyzed with FlowJo. M. tuberculosis biotinylated HLA class II tetramers were generously provided by the National Institutes of Health tetramer core facility. For M. tuberculosis ex vivo tetramer staining, PBMCs were stained with tetramer-PE, CD4-FITC, CD8a-PECy5, CD19-PECy5, CD11b-PECy5, CD56-PECy5, and Live-Dead Aqua (Invitrogen). Tetramer-stained cells were enriched with anti-PE magnetic beads (Miltenyi) and analyzed.

To verify whether predicted HLA/epitope restrictions were correct, we performed Ag presentation assays with single HLA transfectants using cell lines as described previously (6). In brief, PBMCs isolated from whole blood were incubated ex vivo with peptide-pulsed EBV-transformed cell lines expressing selected HLA molecules in an IFN-γ ELISPOT assay, as described earlier. The cell lines used for transfection were DAP.3 (for DRB1*03:01, DRB1*04:01, DRB1*07:01, DRB1*11:01, DRB1*13:01, DRB1*15:01, DRB4*01:01, and DRB5*01:01) and RM3 (for DRB3*02:02, DPA1*02:01/DPB1*01:01, DPA1*01:01/DPB1*04:01, DQA1*05:01/DQB1*02:01, and DQA1*01:02/DQB1*06:02) (6). All DR lines used DRA1*01:01 as the α-chain. Allele restriction was determined by comparing responses to peptide-pulsed cell lines with responses to media-only–pulsed cell lines. The level of statistical significance was determined with a Student t test using the mean of triplicate values of the response against peptide-pulsed cell lines versus the response against the media-pulsed cell line control.

To afford analysis of congruent datasets in a standardized fashion, we designed RATE to allow importing test results from a number of different epitopes in a given biological assay, and for responses in a cohort of donors/test subjects. A tab delimited plain text file format was chosen because it is typically available as a data export option from most instrumentation, or can easily be generated from commonly used graphing and spreadsheet software. Supplemental Table I shows sample response data in a spreadsheet.

The main focus of our efforts was to design and validate a method to facilitate determination of restrictions for HLA class II molecules, and accordingly we mostly used data obtained in our laboratory where HLA class II responses were measured in a number of different settings, using 15-mer peptides and ELISPOT or intracellular cytokine staining assays, in a sufficient number of donors. To measure immune responses, we routinely use ELISPOT. However, data from any assay may be used, as long as operative criteria for positivity can be defined. Although the two statistical measures used (OR and RF, as described later) depend on a binary outcome (positive or negative), the absolute values determined by the assay can be entered directly, and a threshold for positivity can be chosen before calculation of the metrics. Positivity thresholds for ELISPOT measures have been described previously by our group (6, 16). The tool does not require that all epitopes be tested in all donors.
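Reading such a file and applying a positivity threshold can be sketched as follows. The layout assumed here (one row per donor, one column per peptide, empty cells for untested combinations) is a hypothetical stand-in and may differ from the layout of Supplemental Table I; the function name is likewise an assumption.

```python
import csv
import io


def load_responses(handle, threshold):
    """Return {peptide: {donor: True/False}} from a tab-delimited table.

    Assumed layout: header row 'donor<TAB>peptide1<TAB>peptide2...', then one
    row of assay values per donor. Empty cells mean 'not tested' and are
    skipped, so not every peptide needs to be tested in every donor.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    peptides = header[1:]
    calls = {peptide: {} for peptide in peptides}
    for row in reader:
        if not row:
            continue
        donor = row[0]
        for peptide, cell in zip(peptides, row[1:]):
            cell = cell.strip()
            if cell == "":
                continue  # donor not tested with this peptide
            calls[peptide][donor] = float(cell) > threshold
    return calls


sample = "donor\tpepA\tpepB\nD1\t150\t5\nD2\t\t40\nD3\t10\t\n"
calls = load_responses(io.StringIO(sample), threshold=20)
print(calls)
# {'pepA': {'D1': True, 'D3': False}, 'pepB': {'D1': False, 'D2': True}}
```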

Similar to the response data, RATE uploads HLA typing data provided as a tab delimited plain text file (Supplemental Table II). Although different typing methods determine HLA type to differing levels of resolution (e.g., allele group, protein) (23), the OR and RF methods are independent of the methods used for HLA typing. Indeed, the MHC type analyzed may be processed using any user-preferred nomenclature system as long as the typing categories are mutually exclusive. In our laboratory, we routinely HLA-type to the protein level (four-digit typing, e.g., HLA-DRB1*01:01), but serological (HLA-DR1), allele group (HLA-DRB1*01), and even typing formats down to the level demarcating synonymous DNA substitutions (HLA-DRB1*01:01:01) are also compatible with the approach. Furthermore, complete HLA class II typing is not required, and data for a given locus or even a partial list of alleles can be used, with the caveat that the tool will output the most likely statistical association given the data at hand, whereas a better candidate could have been identified with a more complete dataset.

For each epitope–HLA combination, RATE generates a matrix tabulating the number of positive versus negative responders among the subjects expressing that particular HLA type, as well as the number of positive and negative responders among the subjects not expressing that HLA type. Next, for each epitope, RATE calculates the ORs and RFs corresponding to each of the HLA molecules following classical formulas, as described in Materials and Methods. The addition of RF allows the user to calculate a numerical value for epitope–HLA pairs for which an OR would be incalculable because of lack of response among individuals not expressing that particular HLA allele. The Fisher’s exact test is used to calculate the significance (p value) associated with each epitope–HLA combination.
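The tabulation step for one epitope–HLA combination can be sketched as below. The data structures (a mapping from donor to expressed alleles, and a mapping from donor to a binary response for one epitope) and the function name are hypothetical; donors without a response entry are counted separately, in the spirit of the “Response n/a” column of the reports.

```python
def contingency_counts(allele, donor_alleles, responses):
    """Tabulate A+R+, A-R+, A+R-, and A-R- for one allele and one epitope.

    donor_alleles: {donor: set of alleles expressed by that donor}
    responses:     {donor: True/False} for a single epitope; donors missing
                   from this mapping were not tested and are counted apart.
    """
    counts = {"A+R+": 0, "A-R+": 0, "A+R-": 0, "A-R-": 0}
    not_tested = 0
    for donor, alleles in donor_alleles.items():
        if donor not in responses:
            not_tested += 1
            continue
        allele_part = "A+" if allele in alleles else "A-"
        response_part = "R+" if responses[donor] else "R-"
        counts[allele_part + response_part] += 1
    return counts, not_tested


# Made-up typing and response data for four donors.
typing = {
    "D1": {"DRB1*15:01", "DRB1*01:01"},
    "D2": {"DRB1*15:01"},
    "D3": {"DRB1*04:01"},
    "D4": {"DRB1*01:01"},
}
epitope_responses = {"D1": True, "D2": False, "D3": False}
print(contingency_counts("DRB1*15:01", typing, epitope_responses))
# ({'A+R+': 1, 'A-R+': 0, 'A+R-': 1, 'A-R-': 1}, 1)
```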

For each peptide, the OR, RF, and p values for each of the HLA types expressed in the responder subjects are ranked and tabulated from high to low OR. In this reporting format, the number of individuals responding in the assay (R+) and expressing each particular HLA (A+) is also presented (A+R+). The numbers of similarly defined A+R−, A−R+, and A−R− individuals are also reported. A partial example of this type of report is shown in Table II.

Table II.
RATE HLA restriction complete results
Peptide No. | Peptide ID | Peptide Sequence | Allele No. | Allele | A+R+ | A−R+ | A+R− | A−R− | No. of Donors | Response n/a | RF | OR | p
3531.0375 MSQIMYNYPAMMAHA DPA1*01:03 13 48 11 81 0.78 0.34 0.048 
3531.0461 AGCQTYKWETFLTSE DPA1*01:03 54 19 81 1.16 2.44 0.672 
3531.0511 GEEYLILSARDVLAV DPA1*01:03 59 22 87 1.34 inf 0.330 
3550.0065 STHEANTMAMMARDT DPA1*01:03 34 47 40 0.95 0.80 1.000 
3550.0063 DLVRAYHSMSSTHEA DPA1*01:03 32 10 47 40 1.27 inf 0.569 
3550.0061 DLVRAYHAMSSTHEA DPA1*01:03 33 10 47 40 1.27 inf 0.564 
3550.0060 AMEDLVRAYHAMSST DPA1*01:03 35 10 47 40 1.27 inf 1.000 
3550.0059 IMYNYPTMLGHAGDM DPA1*01:03 33 47 40 1.02 1.09 1.000 
3550.0058 MSQIMYNYPTMLGHA DPA1*01:03 33 47 40 1.02 1.09 1.000 
10 3550.0057 IMYNYPAMLGHAGDM DPA1*01:03 30 47 40 1.11 2.07 0.667 
11 3550.0056 MSQIMYNYPAMLGHA DPA1*01:03 30 47 40 0.99 0.93 1.000 
12 3550.0055 TEIRRSNAPRLVDLV DPA1*01:03 36 10 47 40 1.27 inf 1.000 
13 3550.0052 GTEIRRSDAPRLVDL DPA1*01:03 35 10 47 40 1.27 inf 1.000 
14 3550.0051 SNIKIIRIDEFRRCG DPA1*01:03 36 10 47 40 1.27 inf 1.000 
15 3550.0046 HSNIKIIRIDEFRRY DPA1*01:03 36 10 47 40 1.27 inf 1.000 
16 3550.0028 PYVIELDGQFCGQLT DPA1*01:03 42 14 57 30 1.33 inf 1.000 
17 3550.0026 EWTVRHTVAAWPAVC DPA1*01:03 42 14 57 30 1.33 inf 1.000 
18 3550.0024 GTEIRRSNAPRLVDLV DPA1*01:03 38 12 51 36 1.31 inf 1.000 
19 3550.0020 HSNIKIIRIDEFRRYG DPA1*01:03 38 12 51 36 1.31 inf 1.000 
20 3550.0006 AAVLRFQEAANKQKQ DPA1*01:03 38 12 53 34 1.29 inf 1.000 
21 3536.0170 THSWEYWGAQLNAMK DPA1*01:03 41 11 53 34 1.26 inf 1.000 
22 3536.0147 AGSLSALLDPSQGMG DPA1*01:03 41 12 56 31 0.87 0.59 0.555 
23 3536.0144 SAMILAAYHPQQFIY DPA1*01:03 42 13 56 31 1.30 inf 1.000 
24 3536.0139 PQWLSANRAVKPTGS DPA1*01:03 42 13 56 31 1.30 inf 1.000 
25 3536.0138 LTSELPQWLSANRAV DPA1*01:03 42 13 56 31 1.30 inf 1.000 
26 3536.0136 GCQTYKWETFLTSEL DPA1*01:03 42 13 56 31 1.30 inf 1.000 
27 3536.0133 SSFYSDWYSPACGKA DPA1*01:03 42 13 56 31 1.30 inf 1.000 
28 3536.0132 PVGGQSSFYSDWYSP DPA1*01:03 42 13 56 31 1.30 inf 1.000 
29 3536.0131 LSIVMPVGGQSSFYS DPA1*01:03 42 13 56 31 1.30 inf 1.000 
30 3536.0121 PSMGRDIKVQFQSGG DPA1*01:03 42 13 56 31 1.30 inf 1.000 
31 3536.0109 ALGATPNTGPAPQGA DPA1*01:03 40 12 53 34 1.29 inf 1.000 
32 3536.0102 GGHNGVFDFPDSGTH DPA1*01:03 43 14 58 29 1.32 inf 1.000 
33 3536.0071 QTYKWETFLTSELPG DPA1*01:03 42 14 57 30 1.33 inf 1.000 

Table shows a part of the complete results obtained for a sample dataset.

inf, infinity; n/a, number of donors not tested with the specific peptide.

This complete report is usually too large and cumbersome to evaluate. For example, even considering only the 25 most common variants of the four polymorphic HLA class II molecules yields 100 different HLA molecules, and in the case of a set of 200 peptides, this generates an output with 20,000 entries. For this reason, the tool also generates a concise report (Table III) that lists, for each peptide, only RF values >2.0. This threshold is used because the determination of negative HLA association is not within the scope of the present application, and stronger RF (or OR) values are more likely to reflect HLA restrictions. HLA molecules that are associated with significant ORs and RFs and predicted to bind the corresponding epitope with high affinity (IEDB consensus prediction score less than the 15th percentile) (13) are considered as potential restrictions.
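The reduction from the complete to the concise report amounts to a filter and a sort. The sketch below uses a handful of RF values copied from Tables II and III; the row layout, the function name, and the choice to sort by descending RF within each peptide are assumptions made for illustration.

```python
def concise_report(rows, rf_cutoff=2.0):
    """Keep rows with RF above the cutoff, ordered by peptide and descending RF."""
    kept = [row for row in rows if row["RF"] > rf_cutoff]
    return sorted(kept, key=lambda row: (row["peptide_id"], -row["RF"]))


# RF values taken from Tables II and III for two peptides.
complete = [
    {"peptide_id": "3531.0375", "allele": "DPA1*01:03", "RF": 0.78},
    {"peptide_id": "3531.0375", "allele": "DRB5*01:01", "RF": 2.71},
    {"peptide_id": "3531.0375", "allele": "DRB1*15:01", "RF": 3.38},
    {"peptide_id": "3531.0461", "allele": "DPA1*01:03", "RF": 1.16},
    {"peptide_id": "3531.0461", "allele": "DPB1*04:01", "RF": 2.64},
]
for row in concise_report(complete):
    print(row["peptide_id"], row["allele"], row["RF"])
# 3531.0375 DRB1*15:01 3.38
# 3531.0375 DRB5*01:01 2.71
# 3531.0461 DPB1*04:01 2.64
```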

Table III.
RATE HLA restriction concise results
Peptide No. | Peptide ID | Peptide Sequence | Allele No. | Allele | A+R+ | A−R+ | A+R− | A−R− | No. of Donors | Response n/a | RF | OR | p
3531.0375 MSQIMYNYPAMMAHA 66 DQB1*06:01 18 59 81 3.68 inf 0.004 
3531.0375 MSQIMYNYPAMMAHA 101 DRB1*15:02 18 59 81 3.68 inf 0.004 
3531.0375 MSQIMYNYPAMMAHA 111 DRB5*01:02 20 59 81 3.68 inf 0.071 
3531.0375 MSQIMYNYPAMMAHA DPA1*01:05 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 18 DPB1*104:01 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 31 DPB1*27:02 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 33 DPB1*40:01 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 43 DQA1*03 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 58 DQB1*03:08 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 76 DRB1*04:02 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 81 DRB1*04:10 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 87 DRB1*08:06 21 59 81 3.68 inf 0.272 
3531.0375 MSQIMYNYPAMMAHA 100 DRB1*15:01 11 11 58 81 3.38 53.97 0.000 
3531.0375 MSQIMYNYPAMMAHA 110 DRB5*01:01 14 54 81 2.71 17.84 0.000 
3531.0375 MSQIMYNYPAMMAHA 25 DPB1*14:01 20 58 81 2.45 5.65 0.178 
3531.0375 MSQIMYNYPAMMAHA 62 DQB1*05:02 20 58 81 2.45 5.65 0.178 
3531.0375 MSQIMYNYPAMMAHA 84 DRB1*08:02 20 58 81 2.45 5.65 0.178 
3531.0375 MSQIMYNYPAMMAHA 39 DQA1*01:03 17 55 81 2.05 3.96 0.056 
3531.0461 AGCQTYKWETFLTSE 43 DQA1*03 73 81 10.13 inf 0.099 
3531.0461 AGCQTYKWETFLTSE 76 DRB1*04:02 73 81 10.13 inf 0.099 
3531.0461 AGCQTYKWETFLTSE 84 DRB1*08:02 72 81 6.75 21.84 0.025 
3531.0461 AGCQTYKWETFLTSE 111 DRB5*01:02 72 81 5.06 9.69 0.189 
3531.0461 AGCQTYKWETFLTSE 62 DQB1*05:02 71 81 3.38 4.90 0.271 
3531.0461 AGCQTYKWETFLTSE 66 DQB1*06:01 71 81 3.38 4.90 0.271 
3531.0461 AGCQTYKWETFLTSE 86 DRB1*08:04 71 81 3.38 4.90 0.271 
3531.0461 AGCQTYKWETFLTSE 101 DRB1*15:02 71 81 3.38 4.90 0.271 
3531.0461 AGCQTYKWETFLTSE 71 DRB1*01:01 66 81 3.04 5.47 0.055 
3531.0461 AGCQTYKWETFLTSE 60 DQB1*04:02 68 81 2.89 4.40 0.140 
3531.0461 AGCQTYKWETFLTSE 47 DQA1*04:01 65 81 2.76 4.73 0.072 
3531.0461 AGCQTYKWETFLTSE 13 DPB1*04:01 17 56 81 2.64 9.54 0.006 
3531.0461 AGCQTYKWETFLTSE 68 DQB1*06:03 70 81 2.53 3.26 0.346 
3531.0461 AGCQTYKWETFLTSE 14 DPB1*04:02 10 63 81 2.34 3.70 0.113 
3531.0461 AGCQTYKWETFLTSE 37 DQA1*01:01 66 81 2.25 3.08 0.216 

The table shows a part of the concise results obtained for a sample dataset.

inf, infinity; n/a, number of donors not tested with the specific peptide.

To validate the approach, we examined whether RATE would successfully reidentify HLA class II restrictions experimentally determined in previous studies (16). Immune response and HLA allele–typing data for three previously validated epitope–HLA restrictions were analyzed using the program (16). As shown in Table IV, donors expressing HLA DRB1*15:01 accounted for 11 of 22 responders for M. tuberculosis Rv0288/Rv3019c epitope MSQIMYNYPAMMAHA, with an OR of 54.0 and an RF of 3.4 (p < 0.001). The M. tuberculosis Rv3804c epitope AGCQTYKWETFLTSE and Rv3418c epitope GEEYLILSARDVLAV were predicted to be restricted by DPB1*04:01 (p = 0.0056) and DRB1*01:01 (p = 0.0249), respectively (Table IV). All three restrictions had been previously validated by peptide-tetramer staining of PBMCs from HLA-matched donors (16). Thus, RATE correctly reidentified known HLA-epitope restrictions.

Table IV.
Epitope/RATE predictions match validated tetramer data
Sequence | Allele | A+R+ | A−R+ | A+R− | A−R− | RF | OR | p^a
MSQIMYNYPAMMAHA DRB1*15:01 11 11 58 3.4 54.0 <0.001 
AGCQTYKWETFLTSE DPB1*04:01 17 56 2.6 9.5 0.006 
GEEYLILSARDVLAV DRB1*01:01 73 4.0 8.7 0.025 

The response data and HLA typing from three previously validated tetramers (16) were matched using the RATE program.

^a p values calculated by Fisher’s exact test.

A+, genotyped HLA allele positive; A−, genotyped HLA allele negative; R+, epitope response positive; R−, epitope response negative.
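The statistics reported in these tables can be recomputed from the four cell counts of each 2 × 2 table. The following is a minimal Python sketch, not the RATE implementation. The function names are hypothetical, RF is assumed here to be the response frequency among allele-positive donors divided by the response frequency in the whole cohort, an undefined OR is returned as infinity, and the two-sided form of Fisher’s exact test is assumed.

```python
from math import comb, inf


def odds_ratio(a, b, c, d):
    """OR for counts a = A+R+, b = A-R+, c = A+R-, d = A-R-."""
    if b * c == 0:
        return inf  # assumption: a zero denominator is reported as infinity
    return (a * d) / (b * c)


def relative_frequency(a, b, c, d):
    """Response frequency in allele-positive donors over that in all donors."""
    allele_pos = a + c
    responders = a + b
    if allele_pos == 0 or responders == 0:
        return 0.0
    return (a / allele_pos) / (responders / (a + b + c + d))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the same 2 x 2 table."""
    allele_pos, allele_neg, responders = a + c, b + d, a + b
    denom = comb(allele_pos + allele_neg, responders)

    def prob(k):
        # Hypergeometric probability of k responders among allele-positive donors.
        return comb(allele_pos, k) * comb(allele_neg, responders - k) / denom

    observed = prob(a)
    low = max(0, responders - allele_neg)
    high = min(allele_pos, responders)
    # Sum every table that is no more probable than the observed one.
    p = sum(prob(k) for k in range(low, high + 1) if prob(k) <= observed * (1 + 1e-9))
    return min(p, 1.0)


# Counts from the A*30:02 row for IKIQNFRVYYRDSRDPIW in Table VIII.
counts = (24, 16, 43, 548)
print(round(odds_ratio(*counts), 1))          # 19.1
print(round(relative_frequency(*counts), 1))  # 5.7
print(fisher_exact_two_sided(*counts) < 0.001)
```

With these definitions, the example row yields the OR and RF values tabulated for it; whether RATE applies any additional correction for sparse tables is not stated here, so none is applied in the sketch.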

To further validate the use of RATE to predict novel restrictions, we considered a dataset generated by testing various M. tuberculosis–derived epitopes as described previously (16) and by D.M. McKinney, C.S. Lindestam Arlehamn, V. Rozot, E. Makgotlho, W. Hanekom, T.J. Scriba, and A. Sette (manuscript in preparation). In brief, immune responses to various M. tuberculosis epitopes in PBMCs from individuals with LTBI were measured by IFN-γ–specific ELISPOT as representative of Th1 responses and reported as SFCs per million cells. These data were used to generate RATE predicted restrictions.

In parallel, the HLA class II restrictions of five selected M. tuberculosis–derived epitopes (Rv1705c epitope FFGQNTAAIAATEAQ, Rv1195 epitope SSYAATEVANAAAGQ, Rv0288 epitope IMYNYPAMLGHAGDM, and Rv3874 epitopes AAVVRFQEAANKQKQ and AQAAVVRFQEAANKQ) were determined using single HLA-transfected cell lines (6). PBMCs from LTBI donors were incubated together with a panel of cell lines presenting the specific epitopes and expressing HLA molecules matching those expressed in the donors (Fig. 1). Responses were evaluated in a standard IFN-γ ELISPOT assay to measure Th1 responses. In the case of LTBI donors, the strong CD4+ T cell responses allow detection of IFN-γ responses directly ex vivo (16).

FIGURE 1.

Novel HLA-epitope restrictions predicted by RATE. PBMCs were incubated with single HLA-transfected cells pulsed with (A) Rv1705c epitope FFGQNTAAIAATEAQ, (B) Rv1195 epitope SSYAATEVANAAAGQ, (C) Rv0288 epitope IMYNYPAMLGHAGDM, or (D) Rv3874 epitopes AAVVRFQEAANKQKQ and (E) AQAAVVRFQEAANKQ for 24 h. IFN-γ release was measured by ELISPOT. White bars show significant responses (p < 0.05), whereas the black bars represent nonsignificant responses. HLA alleles expressed by donor are presented in the table insert. Predicted binding from IEDB is listed below each allele tested. RFs and Fisher’s exact test p values (when significant) are listed for predicted restrictions. N/A, HLA-transfected cell lines that are not available.

As can be seen in Fig. 1A, a significant response to the peptide was observed in the case of DRB3*02:02-transfected cells, but not for any of the other lines transfected with HLA molecules expressed by the donor. These results match those obtained with the RATE program, where DRB3*02:02 was associated with an RF of 2.4 (p = 0.004) for the FFGQNTAAIAATEAQ epitope. Similar patterns of restrictions and RATE predictions were observed in the four additional epitopes tested (Fig. 1B–E). Thus, the RATE approach correctly predicted five new HLA restrictions.

As a next step toward validation, we tested whether RATE could predict HLA-peptide restrictions de novo as a means to guide generation of specific tetrameric staining reagents. Tetramer staining was used as an alternative to the transfection assay described earlier for validating restrictions, to show the versatility of the RATE predictions. In an initial set of experiments, we used the immune response dataset from a study of 31 HLA class II–typed donors vaccinated with the acellular B. pertussis vaccine (M.B.C. Dillon, T.A. Bancroft, R. Kolla, S. Paul, J. Sidney, B. Peters, and A. Sette, manuscript in preparation). In this cohort, little to no reactivity was detected ex vivo, but good T cell reactivity was detected after expansion in vitro with the vaccine component proteins. After expansion, epitopes were defined using a set of 785 overlapping peptides (16-mers overlapping by 8 residues) completely spanning the component proteins, of which 154 nonredundant epitopes induced positive Th1 and/or Th2 CD4+ T cell responses. RATE indicated potential restrictions for 35 of these epitopes.

More specifically, when the combined IFN-γ and IL-5 CD4+ T cell immune response reactivity was analyzed by RATE, epitope YYSNVTATRLLSSTNS from pertussis toxin subunit B129–144 was found to be associated with significant OR values (p < 0.05; Table V) for DRB1*07:01. Accordingly, the corresponding PE-conjugated tetramer was generated and staining was measured on PBMCs (Fig. 2). As expected, little staining was detected directly ex vivo, with positive staining for YYSNVTATRLLSSTNS of <0.01% (Fig. 2A). However, after 14 d of stimulation with the corresponding peptide, the tetramer displayed significant binding to CD3+CD4+CD8− cells, representing an increase of >20-fold compared with ex vivo staining (0.42%; Fig. 2B). The staining was specific because, as expected, no significant staining was detected on CD3+CD4−CD8+ cells from the same expanded PBMC cultures (Fig. 2C).

Table V.
Selection of significant predicted restrictions to guide tetramer generation
Sequence, Allele, A+R+, A−R+, A+R−, A−R−, RF, p^a, Predicted Binding^b (Percentile)
YYSNVTATRLLSSTNS DRB1*07:01 22 2.9 0.043 0.76 
AAFQAAHARFVAAAA DRB1*07:01 12 58 2.7 0.003 0.03 
MSQIMYNYPAMRAHA DRB1*15:01 11 11 58 3.4 0.000 0.90 
AAFQGAHARFVAAAA DRB1*07:01 14 56 2.7 0.001 0.13 

The response data and HLA typing from donors recently vaccinated with acellular B. pertussis or with LTBI were matched using the RATE program.

^a p values calculated by Fisher’s exact test.

^b Predicted binding values were obtained from IEDB (13).

A+, genotyped HLA allele positive; A−, genotyped HLA allele negative; R+, epitope response positive; R−, epitope response negative.

FIGURE 2.

Novel HLA-epitope restrictions validated by tetramer binding. (A) Purified CD4+CD3+CD8− cells stained with YYSNVTATRLLSSTNS tetramer-PE. (B) CD3+CD4+CD8− PBMCs stained with YYSNVTATRLLSSTNS tetramer-PE after 14 d of stimulation with peptide. (C) CD3+CD4−CD8+ PBMCs stained with YYSNVTATRLLSSTNS tetramer-PE after 14 d of stimulation with peptide. (D) Top panels present CD3+CD8−CD19−CD11b−CD56− PBMCs stained with the indicated tetramer-PE combinations after anti-PE magnetic bead enrichment. Bottom panels present flow-through from magnetic bead enrichment. Numbers shown are percentages of tetramer+ CD4+ or CD8+ cells.

To expand the results obtained with the B. pertussis epitope, we examined additional instances of RATE-predicted restrictions in the context of the M. tuberculosis response rates in LTBI donors (as described earlier in the case of Fig. 1). As stated earlier, in the case of LTBI donors and M. tuberculosis epitopes, the strong CD4+ T cell responses allow detection of IFN-γ responses directly ex vivo (16).

FIGURE 3.

Predicted promiscuous binding epitope validated by tetramer binding. (A) Purified CD4+CD3+CD8− cells stained with VKAQNITNKRAALIEA tetramer-PE. (B) CD3+CD4+CD8− PBMCs stained with VKAQNITNKRAALIEA tetramer-PE after 14 d of stimulation with peptide. (C) CD3+CD4−CD8+ PBMCs stained with VKAQNITNKRAALIEA tetramer-PE after 14 d of stimulation with peptide. Numbers shown are percentages of tetramer+ CD4+ or CD8+ cells.

Three different instances of restrictions predicted by the RATE approach were selected for further investigation, as shown in Table V. More specifically, Rv0287 epitope AAFQAAHARFVAAAA and Rv3020c epitope AAFQGAHARFVAAAA were predicted to be restricted by DRB1*07:01 (p = 0.003 and 0.001, respectively) and Rv3019c epitope MSQIMYNYPAMRAHA by DRB1*15:01 (p < 0.001). Accordingly, PBMCs from epitope-responsive LTBI subjects were stained ex vivo with the respective tetramer and enriched with anti-PE magnetic beads. In all three epitope–allele combinations, tetramer staining was detected at 13- to 160-fold higher percentages than those detected in the negative control flow-through from the magnetic bead enrichment (Fig. 2D).

These results further validate the use of the RATE approach to predict HLA restrictions for the purpose of generating functional tetrameric staining reagents. The method thus allows a rapid transition from HLA typing and response data to tetramers, essentially skipping the usual experimental steps for determining HLA association.

Many HLA allelic variants are functionally similar (24–26). As a result, a given epitope may be restricted by multiple HLA molecules encoded by a particular locus (especially if they are close variants), or even by molecules from different loci (promiscuous restriction). In these cases, the fact that multiple HLA molecules may restrict the response to a single epitope will (paradoxically) lower the statistical significance of each individual HLA restriction. To overcome this issue, we have developed an algorithm that calculates, for a given peptide, OR and RF values for all possible combinations of alleles for which predicted binding is within the 15th percentile, and tabulates the combinations of alleles associated with the best p value.
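The combination search described above can be sketched in Python as follows. This is an illustrative, exhaustive version and not the RATE implementation. The data layout (each donor as a pair of an allele set and a response flag), the function names, the `max_size` limit, and the rule that a donor counts as allele-positive when carrying any allele of the combination are all assumptions; only the 15th-percentile binding cutoff comes from the text. The toy allele names and values are hypothetical, and Fisher’s exact test is reimplemented in its two-sided form so that the sketch is self-contained.

```python
from itertools import combinations
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p for a = A+R+, b = A-R+, c = A+R-, d = A-R-."""
    pos, neg, resp = a + c, b + d, a + b
    denom = comb(pos + neg, resp)

    def prob(k):
        return comb(pos, k) * comb(neg, resp - k) / denom

    observed = prob(a)
    ks = range(max(0, resp - neg), min(pos, resp) + 1)
    return min(1.0, sum(prob(k) for k in ks if prob(k) <= observed * (1 + 1e-9)))


def rank_combinations(donors, binding_percentile, cutoff=15.0, max_size=2):
    """Return (p, combination, counts) tuples sorted by ascending p value."""
    # Keep only alleles predicted to bind within the percentile cutoff.
    candidates = sorted(al for al, pct in binding_percentile.items() if pct <= cutoff)
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            a = b = c = d = 0
            for alleles, responded in donors:
                carries = any(al in alleles for al in combo)
                if carries and responded:
                    a += 1
                elif responded:
                    b += 1
                elif carries:
                    c += 1
                else:
                    d += 1
            results.append((fisher_two_sided(a, b, c, d), combo, (a, b, c, d)))
    results.sort(key=lambda item: item[0])
    return results


# Hypothetical toy cohort: responders carry either A1 or A2, non-responders carry A3.
toy_donors = [({"A1"}, True)] * 4 + [({"A2"}, True)] * 4 + [({"A3"}, False)] * 8
toy_binding = {"A1": 2.0, "A2": 4.0, "A3": 40.0}  # A3 falls outside the cutoff

for p, combo, counts in rank_combinations(toy_donors, toy_binding):
    print(" + ".join(combo), counts, round(p, 4))
```

In the toy cohort, neither allele reaches p < 0.05 on its own because the responders are split between the two, whereas the pooled combination does, which mirrors the situation the paragraph describes.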

To validate the approach, we selected the epitope EEWEPLTKKGNVWEV from Phlp341–55, which was previously shown, using single HLA-transfected cell lines, to be promiscuously restricted by two HLA alleles, DRB1*08:01 and DRB1*11:01 (15). Indeed, when the reactivity of EEWEPLTKKGNVWEV in the cohort of allergic individuals (15) was analyzed by RATE, of 14 HLA alleles (including DRB1*08:01 and DRB1*11:01) expressed by donors responsive to the epitope, 8 combinations had RF values >1.5, but none was associated with a significant p value (p > 0.05).

When the results were analyzed by the promiscuous restriction algorithm, the allele combination of DRB1*08:01 and DRB1*11:01 was associated with a significant p value (p < 0.05 and RF = 3.1; Table VI). We thus concluded that RATE was able to correctly predict this previously described example of promiscuous restriction.

Table VI.
Combined RF for responsive alleles predicts restriction for EEWEPLTKKGNVWEV
Allele(s), A+R+, A−R+, A+R−, A−R−, RF, p^a, Predicted Binding (Percentile)^b
DRB1*08:01 22 8.3 0.120 2.58 
DRB1*11:01 17 2.4 0.180 3.99 
DRB1*08:01 + DRB1*11:01 17 3.1 0.024 n/a^c 

The response data and HLA typing from Timothy grass–allergic donors (19) were matched using the RATE program promiscuity algorithm. The combined predictions are shown in the bottom row.

^a p values calculated by Fisher’s exact test.

^b Predicted binding values were obtained from IEDB (13).

^c Predicted binding percentiles are not available for combined alleles.

A+, genotyped HLA allele positive; A−, genotyped HLA allele negative; R+, epitope response positive; R−, epitope response negative.

To further validate that the RATE algorithm could also predict novel promiscuous restrictions, we selected the epitope VKAQNITNKRAALIEA from filamentous hemagglutinin 1753–1768. When the reactivity of this epitope in the cohort of individuals vaccinated with the acellular B. pertussis vaccine (M.B.C. Dillon, T.A. Bancroft, R. Kolla, S. Paul, J. Sidney, B. Peters, and A. Sette, manuscript in preparation) was analyzed by RATE, no alleles were predicted as potential restrictions (p > 0.05; Table VII). However, when analyzed by the promiscuous restriction algorithm, the allele combination of DQB1*06:02 and DRB1*14:04 was associated with a significant p value (p < 0.05 and RF ≥ 2.6; Table VII), as the predicted binding of the epitope for DQA1*01:02/DQB1*06:02 is 9.98 percentile and for DRB1*14:04 is 9.92 percentile.

Table VII.
Combined RF for responsive alleles predicts restriction for VKAQNITNKRAALIEA
Allele(s), A+R+, A−R+, A+R−, A−R−, RF, p^a, Predicted Binding^b (Percentile)
DQB1*06:02 23 3.0 0.120 9.98 
DRB1*14:04 28 10.3 0.097 9.92 
DQB1*06:02 + DRB1*14:04 23 3.9 0.012 n/a^c 

The response data and HLA typing from donors recently vaccinated with acellular B. pertussis were matched using the RATE program promiscuity algorithm. The combined predictions are shown in the bottom row.

^a p values calculated by Fisher’s exact test.

^b Predicted binding values were obtained from IEDB (13).

^c Predicted binding percentiles are not available for combined alleles.

A+, genotyped HLA allele positive; A−, genotyped HLA allele negative; R+, epitope response positive; R−, epitope response negative.

To validate this predicted restriction, we developed a DQB1*06:02/VKAQNITNKRAALIEA tetramer. To match the most common α/β combination in the population, we selected DQA1*01:02 for the α-chain. The staining of the PE-conjugated tetramer of VKAQNITNKRAALIEA with DQB1*06:02 was measured on PBMCs from a responsive donor (Fig. 3). As earlier with YYSNVTATRLLSSTNS and DRB1*07:01, limited staining was detected ex vivo (Fig. 3A). In contrast, 14 d of stimulation with the corresponding peptide dramatically increased tetramer binding to CD3+CD4+CD8− cells, from 0.24% ex vivo to 6.85% in the stimulated cells (Fig. 3B). Specificity of the stain was confirmed by low staining (0.19%) in CD3+CD4−CD8+ cells (Fig. 3C). The ideal validation for promiscuous restriction of this epitope would also include the corresponding DRB1*14:04 tetramer stain. Unfortunately, DRB1*14:04 is not available as a tetramer at this time, and a binding assay for this allele has not yet been developed. However, the promiscuous algorithm of RATE does take into account the in silico predicted binding to this allele. In conclusion, the promiscuous RATE algorithm allowed identification of additional restrictions that could not be detected statistically when each allele was considered alone, and this was experimentally verified where production of tetramer reagents was technically feasible.

Although the RATE tool was designed with a focus on class II alleles, we speculate that it might also be applicable to the determination of HLA class I restrictions. Datasets related to epitopes of known restriction tested in groups of HLA-typed donors are not generally available in our laboratory, because we and most other groups routinely infer HLA class I restriction on the basis of the presence of specific motifs and HLA binding, and only test HLA-matched donors for reactivity.

However, analysis of a dataset on class I alleles obtained from the HIV Molecular Immunology Database by the Los Alamos National Laboratory (27) was able to address the applicability to HLA class I restrictions. The dataset (Study 4 in HLA Typing and Epitope Mapping section) (28), the largest among the available studies, contained class I HLA typing data for 631 HIV patients and the reaction data (SFC values) for 409 HIV-1 peptides in each of the respective positive patients (patients who gave a positive immune response to the peptide). The HIV database provides “A list of HIV CTL epitopes,” which lists the best defined CTL/CD8+ epitopes and the restricting class I alleles for each epitope.

There were 118 A list epitopes embedded in 108 peptides in the dataset, and the data from these 108 peptides and the HLA typing of the 631 patients were used to validate the RATE tool. The 118 epitopes (embedded in 108 peptides) were restricted by 67 class I alleles, and 45 of them were expressed by the patients in the study. These 45 alleles restricted 86 (of 108) peptides in the dataset. The results from the RATE tool showed 33 peptide–allele combinations where the allele was relatively frequent among the patients (present in ≥10% of patients) and expressed in at least one positive donor (A+R+ ≥ 1; Table VIII). Twenty-seven peptide–allele restrictions out of the earlier 33 combinations were significant hits (p ≤ 0.05); that is, the RATE tool could confirm 82% of the relevant peptide–allele combinations as restrictions.
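The selection criteria in this paragraph amount to a filter followed by a ratio. A minimal Python sketch follows; the record layout and the function name are hypothetical, and the three example records are copied from Table VIII rows in which all four counts are legible, so the sketch illustrates the procedure on a subset rather than reproducing the full analysis.

```python
def confirmation_counts(rows, min_allele_freq_pct=10.0, alpha=0.05):
    """Return (significant hits, relevant combinations) for peptide-allele rows."""
    # Relevant: allele present in at least 10% of patients and A+R+ >= 1.
    relevant = [
        row for row in rows
        if row["allele_freq_pct"] >= min_allele_freq_pct and row["a_pos_r_pos"] >= 1
    ]
    # Significant hits among the relevant combinations (p <= alpha).
    hits = [row for row in relevant if row["p"] <= alpha]
    return len(hits), len(relevant)


# Three rows from Table VIII (p values as tabulated).
rows = [
    {"peptide": "IKIQNFRVYYRDSRDPIW", "allele": "A*30:02",
     "a_pos_r_pos": 24, "p": 0.000, "allele_freq_pct": 10.62},
    {"peptide": "TGTEELRSLYNTVATLY", "allele": "A*30:02",
     "a_pos_r_pos": 30, "p": 0.000, "allele_freq_pct": 10.62},
    {"peptide": "GPKEPFRDYVDRFFKTLR", "allele": "B*15:03",
     "a_pos_r_pos": 24, "p": 0.798, "allele_freq_pct": 17.91},
]

hits, relevant = confirmation_counts(rows)
print(hits, relevant)           # 2 3
print(round(100 * 27 / 33))     # 82, the percentage reported for the full table
```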

Table VIII.
Validation using class I data from Los Alamos HIV Molecular Immunology database
Peptide Sequence, Allele, A+R+, A−R+, A+R−, A−R−, RF, OR, p, Allele Frequency (%), Embedded Epitopes from A List Epitopes
NDIQKLVGKLNWASQIY A*30:02 58 561 7.1 29.0 0.000 10.62 KLNWASQIY 
TKELQKQIIKIQNFRVYY A*30:02 18 49 556 6.5 25.5 0.000 10.62 KIQNFRVYY 
SKLNWASQIYPGIKVRQL A*30:02 28 61 536 1.7 1.9 0.160 10.62 KLNWASQIY 
IKIQNFRVYYRDSRDPIW A*30:02 24 16 43 548 5.7 19.1 0.000 10.62 KIQNFRVYY 
TGTEELRSLYNTVATLY A*30:02 30 66 37 498 2.9 6.1 0.000 10.62 RSLYNTVATLY 
GIWQLDCTHLEGKIILVA B*15:10 22 77 527 5.2 30.1 0.000 15.69 THLEGKIIL 
GGHQAAMQMLKDTINEEA B*15:10 11 17 88 515 2.5 3.8 0.002 15.69 GHQAAMQML 
LQTGERDWHLGHGVSIEW B*15:10 51 15 48 517 4.9 36.6 0.000 15.69 WHLGHGVSI 
NTMLNTVGGHQAAMQMLK B*15:10 92 525 3.2 5.7 0.003 15.69 GHQAAMQML 
QMVHQAISPRTLNAWVKV B*15:10 20 14 79 518 3.7 9.4 0.000 15.69 HQAISPRTL 
QGYFPDWQNYTPGPGVRY A*29 45 95 482 1.0 1.0 1.000 16.48 YFPDWQNYT 
QITLWQRPLVSIKVGGQI A*68:02 12 109 509 0.4 0.4 0.709 17.43 ITLWQRPLV 
QLEKEPIAGAETFYVDGA A*68:02 25 17 85 504 3.4 8.7 0.000 17.43 GAETFYVDGA 
GLGQYIYETYGDTWTGV A*68:02 40 13 70 508 4.3 22.3 0.000 17.43 ETYGDTWTGV 
ETYGDTWTGVEALIRIL A*68:02 35 11 75 510 4.4 21.6 0.000 17.43 ETYGDTWTGV 
GAETFYVDGAANRETKI A*68:02 65 61 45 460 3.0 10.9 0.000 17.43 GAETFYVDGA 
GIQQEFGIPYNPQSQGVV B*15:03 12 106 506 2.1 2.8 0.060 17.91 IQQEFGIPY 
YHCLVCFQTKGLGISYGR^a B*15:03 11 102 511 3.4 7.9 0.000 17.91 FQTKGLGISY 
VKAACWWAGIQQEFGIPY^a B*15:03 38 11 75 507 4.3 23.4 0.000 17.91 IQQEFGIPY 
PRTLNAWVKVIEEKAF^a B*15:03 58 15 55 503 4.4 35.4 0.000 17.91 VKVIEEKAF 
AVFIHNFKRKGGIGGYSA^a B*15:03 80 12 33 506 4.9 102.2 0.000 17.91 FKRKGGIGGY 
YVDRFFKTLRAEQATQDV B*15:03 18 132 95 386 0.7 0.6 0.038 17.91 YVDRFFKTL 
GPKEPFRDYVDRFFKTLR B*15:03 24 105 89 413 1.0 1.1 0.798 17.91 YVDRFFKTL 
GKKAIGTVLVGPTPVNII B*15:03 22 19 91 499 3.0 6.3 0.000 17.91 GKKAIGTVL 
EVNIVTDSQYALGII B*15:03 21 112 497 0.3 0.2 0.152 17.91 VTDSQYALGI 
WVKVIEEKAFSPEVIPMF^a B*15:03 39 111 479 0.3 0.2 0.020 17.91 VKVIEEKAF 
IYPGIKVRQLCKLLRGAK B*42:01 21 12 99 499 3.3 8.8 0.000 19.02 YPGIKVRQL 
NYTPGPGVRYPLTFGWCF^a B*42:01 64 136 56 375 1.7 3.2 0.000 19.02 TPGPGVRYPL 
SKLNWASQIYPGIKVRQL B*42:01 16 18 104 493 2.5 4.2 0.000 19.02 YPGIKVRQL 
EVGFPVRPQVPLRPMTFK B*42:01 67 150 53 361 1.6 3.0 0.000 19.02 RPQVPLRPM 
MASEFNLPPIVAKEIVA^a B*42:01 47 23 73 488 3.5 13.7 0.000 19.02 LPPIVAKEI 
GATPQDLNTMLNTVGGH B*42:01 90 90 30 421 2.6 14.0 0.000 19.02 TPQDLNTML 
GIKQLQTRVLAIERYLK B*58:02 36 96 496 4.4 62.0 0.000 20.92 QTRVLAIERYL 

The analysis of the data resulted in 33 peptide–allele combinations where the allele was relatively frequent among the patients (present in ≥10% of patients) and expressed in at least one positive donor (A+R+ ≥ 1). Of these 33 combinations, 27 peptide–allele restrictions were significant hits (p ≤ 0.05).

^a Epitope–allele restrictions confirmed by Erup Larsen et al. (29).

We also analyzed the peptide–allele restrictions confirmed by Erup Larsen et al. (29). Out of the 18 peptides and respective restricting alleles, 7 peptide–allele combinations were present in RATE results for Los Alamos HIV database dataset with the allele being expressed in at least one positive donor. All of these peptide–allele restrictions were significant hits as per the RATE results (p ≤ 0.05; Table VIII).

In this article, we describe a novel approach to determine restriction of CD4+ T cell epitopes at the population level using genetic association methods. The main focus of our efforts was to design and validate a method to facilitate determination of restrictions for HLA class II molecules, and accordingly we mostly used data obtained in our laboratory where HLA class II responses were measured in a number of different settings, using 15-mer peptides and ELISPOT or intracellular cytokine staining assays, in a sufficient number of donors. We anticipate that this methodology will greatly facilitate the rapid generation of tetrameric staining reagents for investigations of immune reactivity in human populations.

Identification of the specific HLA locus and allele responsible for presenting an epitope for recognition by specific TCRs (HLA restriction) is necessary to fully characterize the immune response to an Ag. In this context, it is helpful to distinguish determination of the HLA molecule restricting responses to a given epitope in a particular donor (i.e., individual restriction of that peptide in that donor) from the HLA molecule(s) restricting the response in a population (i.e., general restriction of that peptide in that population).

Individual and general restrictions are not necessarily, and in practice often are not, the same. This is because not all donors expressing a given allele will generate T cells recognizing an epitope presented by that allele, because of immunodominance at the epitope and HLA levels (15, 18, 30, 31). Thus, population restriction does not always predict individual restriction. Conversely, because the same epitope can be presented by multiple alleles (epitope promiscuity) (30, 32–36), the fact that a given epitope is presented in a particular individual by a specific allele does not fully predict its general pattern of restriction in a population.

Experimental determination of HLA class II restriction is complex and technically challenging. HLA molecules are polygenic (encoded by multiple loci) and polymorphic (each locus has many allelic variants), so the task is complicated by the extreme HLA class II diversity present in human populations (5). To determine which gene and allelic variant acts as the restriction element, we used classic approaches such as inhibition by HLA locus–specific Abs and/or the use of matched/mismatched or single HLA-transfected cell lines.

Inhibition with anti-HLA class II Abs is a good way to determine the locus (but not the allele). A primary limitation is that there are no available Abs that can distinguish the DRB1 and DRB3/4/5 loci. In addition, questions remain in the field as to whether pan-reactive Abs for the DQ locus truly detect all DQ α and β combinations. Finally, anti-HLA Abs that inhibit T cell recognition may act by simply killing or inhibiting the APC, and hence need to be titrated to find a “sweet spot.” The use of HLA class II–transfected cell lines as described by McKinney et al. (6) is technically the most accurate, but using transfected cell lines is cumbersome and transfected cell lines are available for only the most frequent HLA alleles.

HLA binding and motif predictions could be used, as they are widely available. However, HLA binding in itself is a necessary, but not sufficient, requirement for T cell recognition. Because of the considerations listed earlier, it is of interest to develop alternative strategies. There have been similar efforts in this direction, mostly focusing on class I alleles. For example, Kiepiela et al. (37) have applied a statistical approach similar to ours to identify the HLA class I restriction of HIV peptides. Another statistical method, designed by Listgarten et al. (38), can identify restricting HLA alleles for a specific epitope from ELISPOT data from a set of patients and their respective allele types. The HLA restrictor tool developed by Erup Larsen et al. (29) is based on the class I binding prediction method NetMHCpan and can predict the patient-specific epitopes restricted by alleles based on the patient's HLA allele type. However, these methods are focused on class I alleles and are applicable to only one peptide or one patient's data at a time, whereas the RATE tool that we describe in this study was developed to address class II restrictions and relies on datasets describing the responses of multiple donors, based on genetic inference. Accordingly, the output of our method is based on the OR, RF, and p value from Fisher’s exact test for each peptide–allele combination to assess the strength of association between the peptide and allele.

The RATE tool described in this article represents a novel approach for determining HLA class II allele restriction of epitopes. In particular, it does not depend on experimental work and is most suited to analyze and extract immunological information from complex datasets encompassing large numbers of peptides and donors (as long as HLA typing data for each donor are available) as generated in clinical studies and vaccine trials. As such, the method is also likely to be of value in systems biology studies where large amounts of data are generated. Furthermore, because the program is agnostic to the MHC nomenclature used, it can be expanded to other species and we have, in fact, used it to analyze class II immunogenicity data obtained in the rhesus macaque system (B.R. Mothé, C.S. Lindestam Arlehamn, C. Dow, M.B.C. Dillon, R.W. Wiseman, P. Bohn, J. Karl, N.A. Golden, T. Gilpin, T.W. Foreman, M.A. Rodgers, S. Mehra, T.J. Scriba, J.L. Flynn, D. Kaushal, D.H. O'Connor, and A. Sette, submitted for publication).

Although the experimental examples and data analyses provided in this article are focused on HLA class II, we speculate that the program can also be applied to class I or any set of responses associated with large polymorphisms. The limited validation we have performed to date using the data from the Los Alamos HIV Molecular Immunology database suggests that this is likely to be the case.

However, certain caveats should be kept in mind when interpreting results derived from the approach. First, despite the various options provided in this article, ambiguous results are likely in some instances, especially for peptides that are weakly or infrequently recognized. This is most commonly observed when too few subjects have been tested, or in the case of alleles that are either rare or very frequent. As a rule of thumb, strong associations can be detected with as few as 10–15 subjects, but ∼30 seem to be required in most cases, with the power of the analysis increasing dramatically as more subjects are included. However, the additional calculation of RF as described in this article increases the likelihood of detecting strong associations even with a limited number of subjects. Ambiguous instances are usually relatively few, and the ambiguity can be resolved with additional testing using transfected cell lines (39) or direct testing of tetrameric staining reagents.

Second, it is possible that HLA molecules encoded at different loci might be associated with statistically significant OR values for the same epitope. Although in some cases this may indeed be because of the promiscuity of the epitope, in others it may reflect the fact that the different HLA loci are physically close to one another in their chromosomal location, and thus are in strong linkage disequilibrium with each other (40, 41). For this reason, if alleles for more than one HLA locus are associated with significant OR values for a specific epitope, further analysis is warranted. We recommend that the locus with the best p value be considered first. If the combination of the data from this locus with any other locus does not lead to a better p value for the combined data, the association is likely due to linkage disequilibrium and should be discarded. In addition, instances where an allele encoded by a particular locus is not predicted to bind the epitope under consideration likely reflect an association due to linkage disequilibrium and should be considered with caution or discarded.
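The triage recommended in this paragraph can be written as a short decision procedure. The Python sketch below is one possible reading of the recommendation, not part of RATE. The function name, the input layout (precomputed single-allele and pairwise p values), the verdict labels, and the reuse of the 15th-percentile cutoff as the definition of "predicted to bind" are assumptions, and the example alleles and values are hypothetical.

```python
def triage_associations(single_p, combined_p, binding_percentile,
                        alpha=0.05, binding_cutoff=15.0):
    """Label each significant allele relative to the one with the best p value."""
    significant = sorted((p, allele) for allele, p in single_p.items() if p <= alpha)
    if not significant:
        return {}
    best_p, best_allele = significant[0]
    verdict = {best_allele: "primary"}
    for _, allele in significant[1:]:
        # An allele not predicted to bind the epitope is treated as a likely LD artifact.
        if binding_percentile.get(allele, 100.0) > binding_cutoff:
            verdict[allele] = "likely LD (not predicted to bind)"
            continue
        pair_p = combined_p.get(frozenset((best_allele, allele)))
        if pair_p is not None and pair_p < best_p:
            # Pooling with the best allele improves the p value.
            verdict[allele] = "possible promiscuous restriction"
        else:
            verdict[allele] = "likely LD (combination does not improve p)"
    return verdict


# Hypothetical example: W is not significant, V is significant but a predicted non-binder.
single_p = {"X": 0.001, "Y": 0.010, "Z": 0.020, "V": 0.030, "W": 0.300}
combined_p = {frozenset(("X", "Y")): 0.0002, frozenset(("X", "Z")): 0.004}
binding = {"X": 2.0, "Y": 5.0, "Z": 3.0, "V": 60.0, "W": 1.0}

for allele, label in triage_associations(single_p, combined_p, binding).items():
    print(allele, "->", label)
```

Any allele flagged as a likely LD artifact by such a rule would still merit experimental confirmation before being discarded, in line with the caution expressed above.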

Third, although one of the advantages of RATE is the ability to globally analyze a dataset generated over multiple experiments (because it is, in our experience, impossible to determine restrictions in a single experiment by cellular methods for many donors and many peptides), the issue of reproducibility from experiment to experiment needs to be carefully considered. If significant experiment-to-experiment variability is present in a given dataset, this would correspondingly affect the conclusions. Therefore, the application of appropriate positive and negative controls within each experiment is necessary. In our experiments, we always include a negative control and a positive PHA control, and ensure that each falls within acceptable ranges, based on our routine quality control of experimental assays.

Finally, we acknowledge that the experimental validation of the RATE approach is still somewhat limited. In total, RATE correctly predicted 10 novel restrictions (8 M. tuberculosis and 2 B. pertussis) and 5 previously validated restrictions (3 M. tuberculosis and 2 Timothy grass). Evaluation of additional epitopes mapped by other investigators is limited by the fact that immunogenicity data need to be described on a donor-by-donor basis, and HLA typing must be available for each donor. Additional experimental work will nonetheless be needed to establish the success rate of the approach more firmly.

In conclusion, we have developed an automated method to infer HLA restriction from large datasets of T cell responses in HLA-typed subjects. The Web-accessible program calculates ORs and RFs from simple data tables, incorporates prediction of HLA binding capacity, and accounts for linkage disequilibrium and promiscuous recognition by iterative calculation of OR values for different combinations of HLA molecules. We consider the current algorithm and software implementation a proof of principle that HLA restrictions can be derived from genetic associations. To the best of our knowledge, the program presented in this article is the first that allows determination of restriction at the population level and that estimates response rate and immunodominance, as well as promiscuous restrictions. Accordingly, we believe it is important that the current prototype be made available to the scientific community. The tool is available online at http://iedb-rate.liai.org, and we look forward to receiving user feedback for its improvement and optimization. We expect that future refinements of the approach will lead to improved results, for example, by modeling the statistical underpinnings of HLA linkage and promiscuous binding more precisely, and by incorporating the predicted binding affinities as statistical priors rather than binary cutoffs.

This work was supported by National Institutes of Health Contracts HHSN272201200010C, HHSN272200900042C, HHSN272201400045C, and HHSN272200900044C and Bill and Melinda Gates Foundation Grant OPP1066265.

The online version of this article contains supplemental material.

Abbreviations used in this article:

IEDB, Immune Epitope Database; LTBI, latent Mycobacterium tuberculosis infection; OR, odds ratio; RATE, Restrictor Analysis Tool for Epitopes; RF, relative frequency; SFC, spot-forming cell.

1. Lange, V., I. Böhme, J. Hofmann, K. Lang, J. Sauter, B. Schöne, P. Paul, V. Albrecht, J. M. Andreas, D. M. Baier, et al. 2014. Cost-efficient high-throughput HLA typing by MiSeq amplicon sequencing. BMC Genomics 15: 63.
2. Boegel, S., M. Löwer, M. Schäfer, T. Bukur, J. de Graaf, V. Boisguérin, O. Türeci, M. Diken, J. C. Castle, and U. Sahin. 2012. HLA typing from RNA-Seq sequence reads. Genome Med. 4: 102.
3. Moonsamy, P. V., T. Williams, P. Bonella, C. L. Holcomb, B. N. Höglund, G. Hillman, D. Goodridge, G. S. Turenchalk, L. A. Blake, D. A. Daigle, et al. 2013. High throughput HLA genotyping using 454 sequencing and the Fluidigm Access Array™ System for simplified amplicon library preparation. Tissue Antigens 81: 141–149.
4. Nepom, G. T. 2012. MHC class II tetramers. J. Immunol. 188: 2477–2482.
5. Robinson, J., K. Mistry, H. McWilliam, R. Lopez, P. Parham, and S. G. E. Marsh. 2011. The IMGT/HLA database. Nucleic Acids Res. 39: D1171–D1176.
6. McKinney, D. M., S. Southwood, D. Hinz, C. Oseroff, C. S. L. Arlehamn, V. Schulten, R. Taplitz, D. Broide, W. A. Hanekom, T. J. Scriba, et al. 2013. A strategy to determine HLA class II restriction broadly covering the DR, DP, and DQ allelic variants most commonly expressed in the general population. Immunogenetics 65: 357–370.
7. Morris, J. A., and M. J. Gardner. 1988. Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. Br. Med. J. (Clin. Res. Ed.) 296: 1313–1316.
8. Ahmed, I., R. Tamouza, M. Delord, R. Krishnamoorthy, C. Tzourio, C. Mulot, M. Nacfer, J.-C. Lambert, P. Beaune, P. Laurent-Puig, et al. 2012. Association between Parkinson’s disease and the HLA-DRB1 locus. Mov. Disord. 27: 1104–1110.
9. Klein, N. P., J. Bartlett, B. Fireman, A. Rowhani-Rahbar, and R. Baxter. 2013. Comparative effectiveness of acellular versus whole-cell pertussis vaccines in teenagers. Pediatrics 131: e1716–e1722.
10. Yucesoy, B., Y. Talzhanov, V. J. Johnson, N. W. Wilson, R. E. Biagini, W. Wang, B. Frye, D. N. Weissman, D. R. Germolec, M. I. Luster, and M. M. Barmada. 2013. Genetic variants within the MHC region are associated with immune responsiveness to childhood vaccinations. Vaccine 31: 5381–5391.
11. Jin, H., N. Arase, K. Hirayasu, M. Kohyama, T. Suenaga, F. Saito, K. Tanimura, S. Matsuoka, K. Ebina, K. Shi, et al. 2014. Autoantibodies to IgG/HLA class II complexes are associated with rheumatoid arthritis susceptibility. Proc. Natl. Acad. Sci. USA 111: 3787–3792.
12. Wu, L., S. Guo, D. Yang, Y. Ma, H. Ji, Y. Chen, J. Zhang, Y. Wang, L. Jin, J. Wang, and J. Liu. 2014. Copy number variations of HLA-DRB5 is associated with systemic lupus erythematosus risk in Chinese Han population. Acta Biochim. Biophys. Sin. (Shanghai) 46: 155–160.
13. Kim, Y., J. Ponomarenko, Z. Zhu, D. Tamang, P. Wang, J. Greenbaum, C. Lundegaard, A. Sette, O. Lund, P. E. Bourne, et al. 2012. Immune epitope database analysis resource. Nucleic Acids Res. 40: W525–W530.
14. Paul, S., R. V. Kolla, J. Sidney, D. Weiskopf, W. Fleri, Y. Kim, B. Peters, and A. Sette. 2013. Evaluating the immunogenicity of protein drugs by applying in vitro MHC binding data and the immune epitope database and analysis resource. Clin. Dev. Immunol. 2013: 467852.
15. Oseroff, C., J. Sidney, M. F. Kotturi, R. Kolla, R. Alam, D. H. Broide, S. I. Wasserman, D. Weiskopf, D. M. McKinney, J. L. Chung, et al. 2010. Molecular determinants of T cell epitope recognition to the common Timothy grass allergen. J. Immunol. 185: 943–955.
16. Lindestam Arlehamn, C. S., A. Gerasimova, F. Mele, R. Henderson, J. Swann, J. A. Greenbaum, Y. Kim, J. Sidney, E. A. James, R. Taplitz, et al. 2013. Memory T cells in latent Mycobacterium tuberculosis infection are directed against three antigenic islands and largely contained in a CXCR3+CCR6+ Th1 subset. PLoS Pathog. 9: e1003130.
17. Schulten, V., J. A. Greenbaum, M. Hauser, D. M. McKinney, J. Sidney, R. Kolla, C. S. Lindestam Arlehamn, C. Oseroff, R. Alam, D. H. Broide, et al. 2013. Previously undescribed grass pollen antigens are the major inducers of T helper 2 cytokine-producing T cells in allergic individuals. Proc. Natl. Acad. Sci. USA 110: 3459–3464.
18. Oseroff, C., J. Sidney, V. Tripple, H. Grey, R. Wood, D. H. Broide, J. Greenbaum, R. Kolla, B. Peters, A. Pomés, and A. Sette. 2012. Analysis of T cell responses to the major allergens from German cockroach: epitope specificity and relationship to IgE production. J. Immunol. 189: 679–688.
19. Schulten, V., V. Tripple, J. Sidney, J. Greenbaum, A. Frazier, R. Alam, D. Broide, B. Peters, and A. Sette. 2014. Association between specific timothy grass antigens and changes in TH1- and TH2-cell responses following specific immunotherapy. J. Allergy Clin. Immunol. 134: 1076–1083.
20. Oseroff, C., J. Sidney, R. Vita, V. Tripple, D. M. McKinney, S. Southwood, T. M. Brodie, F. Sallusto, H. Grey, R. Alam, et al. 2012. T cell responses to known allergen proteins are differently polarized and account for a variable fraction of total response to allergen extracts. J. Immunol. 189: 1800–1811.
21. Arlehamn, C. S., J. Sidney, R. Henderson, J. A. Greenbaum, E. A. James, M. Moutaftsi, R. Coler, D. M. McKinney, D. Park, R. Taplitz, et al. 2012. Dissecting mechanisms of immunodominance to the common tuberculosis antigens ESAT-6, CFP10, Rv2031c (hspX), Rv2654c (TB7.7), and Rv1038c (EsxJ). J. Immunol. 188: 5020–5031.
22. Moutaftsi, M., H.-H. Bui, B. Peters, J. Sidney, S. Salek-Ardakani, C. Oseroff, V. Pasquetto, S. Crotty, M. Croft, E. J. Lefkowitz, et al. 2007. Vaccinia virus-specific CD4+ T cell responses target a set of antigens largely distinct from those targeted by CD8+ T cell responses. J. Immunol. 178: 6814–6820.
23. Lefranc, M.-P. 2011. IMGT, the International ImMunoGeneTics Information System. Cold Spring Harb. Protoc. 2011: 595–603.
24. Doolan, D. L., S. Southwood, R. Chesnut, E. Appella, E. Gomez, A. Richards, Y. I. Higashimoto, A. Maewal, J. Sidney, R. A. Gramzinski, et al. 2000. HLA-DR-promiscuous T cell epitopes from Plasmodium falciparum pre-erythrocytic-stage antigens restricted by multiple HLA class II alleles. J. Immunol. 165: 1123–1137.
25. Schulze zur Wiesch, J., G. M. Lauer, C. L. Day, A. Y. Kim, K. Ouchi, J. E. Duncan, A. G. Wurcel, J. Timm, A. M. Jones, B. Mothe, et al. 2005. Broad repertoire of the CD4+ Th cell response in spontaneously controlled hepatitis C virus infection includes dominant and highly promiscuous epitopes. J. Immunol. 175: 3603–3613.
26. Cecconi, V., M. Moro, S. Del Mare, J. Sidney, A. Bachi, R. Longhi, A. Sette, M. P. Protti, P. Dellabona, and G. Casorati. 2010. The CD4+ T-cell epitope-binding register is a critical parameter when generating functional HLA-DR tetramers with promiscuous peptides. Eur. J. Immunol. 40: 1603–1616.
27. Yusim, K., B. T. M. Korber, C. Brander, B. F. Haynes, R. Koup, J. P. Moore, B. D. Walker, and D. I. Watkins. 2009. HIV Molecular Immunology 2009. LA-UR 09-05941. Los Alamos National Laboratory, Theoretical Biology and Biophysics, Los Alamos, NM.
28. Kiepiela, P., K. Ngumbela, C. Thobakgale, D. Ramduth, I. Honeyborne, E. Moodley, S. Reddy, C. de Pierres, Z. Mncube, N. Mkhwanazi, et al. 2007. CD8+ T-cell responses to different HIV proteins have discordant associations with viral load. Nat. Med. 13: 46–53.
29. Erup Larsen, M., H. Kloverpris, A. Stryhn, C. K. Koofhethile, S. Sims, T. Ndung’u, P. Goulder, S. Buus, and M. Nielsen. 2011. HLArestrictor—a tool for patient-specific predictions of HLA restriction elements and optimal epitopes within peptides. Immunogenetics 63: 43–55.
30. Assarsson, E., H.-H. Bui, J. Sidney, Q. Zhang, J. Glenn, C. Oseroff, I. N. Mbawuike, J. Alexander, M. J. Newman, H. Grey, and A. Sette. 2008. Immunomic analysis of the repertoire of T-cell specificities for influenza A virus in humans. J. Virol. 82: 12241–12251.
31. Larché, M. 2008. Determining MHC restriction of T-cell responses. Methods Mol. Med. 138: 57–72.
32. Sinigaglia, F., M. Guttinger, J. Kilgus, D. M. Doran, H. Matile, H. Etlinger, A. Trzeciak, D. Gillessen, and J. R. Pink. 1988. A malaria T-cell epitope recognized in association with most mouse and human MHC class II molecules. Nature 336: 778–780.
33. Panina-Bordignon, P., S. Demotz, G. Corradin, and A. Lanzavecchia. 1989. Study on the immunogenicity of human class-II-restricted T-cell epitopes: processing constraints, degenerate binding, and promiscuous recognition. Cold Spring Harb. Symp. Quant. Biol. 54: 445–451.
34. Panina-Bordignon, P., A. Tan, A. Termijtelen, S. Demotz, G. Corradin, and A. Lanzavecchia. 1989. Universally immunogenic T cell epitopes: promiscuous binding to human MHC class II and promiscuous recognition by T cells. Eur. J. Immunol. 19: 2237–2242.
35. Krieger, J. I., R. W. Karr, H. M. Grey, W. Y. Yu, D. O’Sullivan, L. Batovsky, Z. L. Zheng, S. M. Colón, F. C. Gaeta, J. Sidney, et al. 1991. Single amino acid changes in DR and antigen define residues critical for peptide-MHC binding and T cell recognition. J. Immunol. 146: 2331–2340.
36. Roche, P. A., and P. Cresswell. 1990. High-affinity binding of an influenza hemagglutinin-derived peptide to purified HLA-DR. J. Immunol. 144: 1849–1856.
37. Kiepiela, P., A. J. Leslie, I. Honeyborne, D. Ramduth, C. Thobakgale, S. Chetty, P. Rathnavalu, C. Moore, K. J. Pfafferott, L. Hilton, et al. 2004. Dominant influence of HLA-B in mediating the potential co-evolution of HIV and HLA. Nature 432: 769–775.
38. Listgarten, J., N. Frahm, C. Kadie, C. Brander, and D. Heckerman. 2007. A statistical framework for modeling HLA-dependent T cell response data. PLOS Comput. Biol. 3: 1879–1886.
39. Mothé, B. R., S. Southwood, J. Sidney, A. M. English, A. Wriston, I. Hoof, J. Shabanowitz, D. F. Hunt, and A. Sette. 2013. Peptide-binding motifs associated with MHC molecules common in Chinese rhesus macaques are analogous to those of human HLA supertypes and include HLA-B27-like alleles. Immunogenetics 65: 371–386.
40. Ahmad, T., M. Neville, S. E. Marshall, A. Armuzzi, K. Mulcahy-Hawes, J. Crawshaw, H. Sato, K. L. Ling, M. Barnardo, S. Goldthorpe, et al. 2003. Haplotype-specific linkage disequilibrium patterns define the genetic topography of the human MHC. Hum. Mol. Genet. 12: 647–656.
41. Hviid, T. V. F., and O. B. Christiansen. 2005. Linkage disequilibrium between human leukocyte antigen (HLA) class II and HLA-G—possible implications for human reproduction and autoimmune disease. Hum. Immunol. 66: 688–699.

The authors have no financial conflicts of interest.
