The classical HLA-C and the nonclassical HLA-E and HLA-G molecules play important roles both in the innate and adaptive immune system. Starting already during embryogenesis and continuing throughout our lives, these three Ags exert major functions in immune tolerance, defense against infections, and anticancer immune responses. Despite these important roles, identification and characterization of the peptides presented by these molecules has been lagging behind the more abundant HLA-A and HLA-B gene products. In this study, we elucidated the peptide specificities of these HLA molecules using a comprehensive analysis of naturally presented peptides. To that end, the 15 most frequently expressed HLA-C alleles as well as HLA-E*01:01 and HLA-G*01:01 were transfected into lymphoblastoid C1R cells expressing low endogenous HLA. Identification of naturally presented peptides was performed by immunoprecipitation of HLA and subsequent analysis of HLA-bound peptides by liquid chromatographic tandem mass spectrometry. Peptide motifs of HLA-C unveil anchors in position 2 or 3 with high variances between allotypes, and a less variable anchor at the C-terminal end. The previously reported small ligand repertoire of HLA-E was confirmed within our analysis, and we could show that HLA-G combines a large ligand repertoire with distinct features anchoring peptides at positions 3 and 9, supported by an auxiliary anchor in position 1 and preferred residues in positions 2 and 7. The wealth of HLA ligands resulted in prediction matrices for octa-, nona-, and decamers. Matrices were validated in terms of their binding prediction and compared with the latest NetMHC prediction algorithm NetMHCpan-3.0, which demonstrated their predictive power.

This article is featured in In This Issue, p.2609

The MHC is a polygenic and polymorphic segment on human chromosome 6 that encodes histocompatibility Ags including the classical (or class Ia) and nonclassical (or class Ib) MHC molecules (in humans also called HLA). HLA-A, HLA-B, and HLA-C belong to the classical MHC molecules, which display a high degree of polymorphism. In contrast, HLA-E and HLA-G are considered nonclassical MHC molecules showing limited polymorphism. Similar to classical MHC molecules, HLA-E and HLA-G are heterodimers, consisting of a heavy α-chain and β2-microglobulin, and take part in the peptide-presentation pathway. HLA-C, -E and -G share the ability to interact with NK cell receptors as well as TCRs, thereby bridging between innate and adaptive immunity.

Within the classical HLA molecules, HLA-C plays a special role in the interaction with NK cells. This feature manifests itself in the unusually conserved α1 domain (1) that, in combination with a generally less polymorphic region in the α2 domain, shapes the binding site of killer cell Ig-like receptors (KIRs). Compared to HLA-A and HLA-B, HLA-C shows a lower expression level at the cell surface and represents only ∼10% of classical MHC molecules. HLA-C allotypes have been implicated in many diseases, including viral infections, cancer, and autoimmune disorders, with HLA-C–restricted epitopes recognized by either CTLs or NK cells. One of the most frequent cancer mutations, KRAS G12D, has recently been shown to be presented by HLA-C*08:02. Moreover, the corresponding epitope is able to induce T cell responses in cancer patients, which can be harnessed for adoptive-transfer immunotherapy (2).

Many genetic associations of HLA-C alleles with several diseases have been reported, ranging from increased protection to higher susceptibility for a certain disease (3). Last but not least, HLA-C expression on extravillous trophoblasts plays a central role in the development and tolerance of the fetus during pregnancy by interacting with maternal NK cells (4).

Peptide motifs of HLA-C were first based on pool sequencing and few individual sequences (5). The first high-throughput approach to determine the binding specificities of a larger set of HLA-C alleles was conducted by Rasmussen et al. (6) applying an in vitro peptide–HLA class I dissociation assay with synthetic peptides. By using this approach, binding motifs for 16 HLA-C allotypes were uncovered, although often with less-pronounced anchor residues.

The nonclassical HLA-E has been implicated in the presentation of MHC class I leader peptides (7, 8). Its expression level is dependent on the HLA class Ia expression level, and previous reports suggest it to be around 5% of the HLA-C expression (9). The HLA-E–peptide complex acts as ligand for the family of CD94/NKG2 receptors expressed predominantly on NK cells, but also on a subset of CD8+ T cells (10, 11). Both the KIR and CD94/NKG2 receptor family sense changes in HLA expression by interacting with HLA-C or HLA-E, respectively. Whereas the conserved HLA-E–CD94/NKG2 system seems to be specialized in sensing HLA expression levels, polymorphic KIRs are able to detect early changes in the peptide repertoire presented on classical HLA, especially HLA-C (12–14). The HLA-E–CD94/NKG2 interaction has also been associated with fetal–maternal tolerance through inhibition of uterine NK cells by HLA-E–expressing, extravillous trophoblasts (15). In addition to the presentation of MHC class I leader peptides, HLA-E is able to present pathogenic epitopes to CTLs (16, 17). However, the peptide binding pocket of HLA-E is highly hydrophobic and thus especially adapted for binding of HLA class I leader peptides. This unusual hydrophobicity within the binding pockets may further restrict the peptide repertoire. In fact, only a few peptides could be shown to be presented in vivo by HLA-E.

The nonclassical HLA-G is mainly expressed on fetal tissue exerting a major tolerogenic function and promoting fetal development (18). In adults, expression of HLA-G is found on immune-privileged organs, including cornea, thymus, pancreatic islets, endothelial cells, and erythroblasts. In addition, dendritic cells and macrophages may also express HLA-G (19). Moreover, expression can be induced during various diseases, including cancer, viral infections, inflammatory diseases, or autoimmune disorders, mainly as an escape strategy to avoid immune recognition. Due to the checkpoint function, HLA-G is considered an attractive target for anticancer treatment using blocking Abs (20). In contrast, HLA-G expression in transplants is associated with better tolerance of the graft (21). HLA-G interacts with different inhibitory receptors such as Ig-like transcript 2 (ILT2) expressed by B cells, subsets of NK and T cells, monocytes, and dendritic cells (22); ILT4, which is solely expressed by monocytes and dendritic cells (23); and KIR2DL4, which is expressed mainly on NK cells (24). Compared to HLA-E, the peptide repertoire of HLA-G is larger but less complex than the peptide repertoire of MHC class Ia molecules (25). The peptide motif of HLA-G was first defined by Diehl et al. (26) from a small set of naturally eluted and pool-sequenced peptides exhibiting anchors at position 2 (isoleucine or leucine), position 3 (proline), and position 9 (leucine).

Considering the high importance of HLA-C, -E, and -G in many immunological processes, the clarification of ligand characteristics of these HLA molecules is of great relevance. In this study, peptide motifs were unveiled via comprehensive analyses of naturally presented HLA ligands. HLA-presented peptides were analyzed by liquid chromatographic tandem mass spectrometry (LC-MS/MS) after immunoprecipitation of HLA molecules from transfected C1R cells. The EBV-transformed lymphoblastoid C1R cell line is well suited for this approach due to a functional Ag presentation pathway and low endogenous HLA expression (27, 28). We had applied this approach previously for monoallelic motif determinations (29–37), and it was also used more recently by Abelin et al. (38). We used this approach for the 15 most frequent HLA-C alleles. To our knowledge, we comprehensively analyzed, for the first time, the peptide pool presented by the nonclassical HLA molecules HLA-E and HLA-G. All analyzed HLA-C allotypes as well as HLA-G binding motifs were generated by Gibbs clustering (39). SYFPEITHI (40) matrices were subsequently created for octa-, nona-, and decamers, and their predictive power has been analyzed in comparison with NetMHCpan-3.0 (41).

DNA of HLA-C alleles, HLA-E, and HLA-G was synthesized and integrated into the pcDNA3.1(+) plasmid using the GeneArt gene synthesis service from Thermo Fisher Scientific (Waltham, MA). Nucleotide sequences were obtained from the IMGT/HLA database (42) for C*01:02:01, C*02:02:01, C*03:03:01, C*03:04:01:01, C*04:01:01:01, C*05:01:01:01, C*06:02:01:01, C*07:01:01:01, C*07:02:01:01, C*08:02:01:01, C*12:03:01:01, C*14:02:01, C*15:02:01, C*16:01:01, C*17:01:01:01, E*01:01:01:01, and G*01:01:01:01. Codon usage was adapted to the codon bias of Homo sapiens genes without changing the protein sequence.

Vectors were linearized by mixing 50 μg plasmid DNA with 50 μl CutSmart Buffer (New England Biolabs, Ipswich, MA), 10 μl PvuI-HF (20,000 U/ml; New England Biolabs), and 390 μl double-distilled H2O and incubating for 2 h at 37°C. Complete linearization was confirmed by agarose gel electrophoresis. DNA was extracted by phenol (Sigma-Aldrich, St. Louis, MO)/chloroform (Merck, Darmstadt, Germany)/isoamyl alcohol (Sigma-Aldrich) and precipitated by adding 1/10 vol of 3 M sodium acetate (Roth, Karlsruhe, Germany) and 2.5 vol 100% ethanol (VWR Chemicals, Radnor, PA). The linearized vector was frozen for 2 h at −80°C and then centrifuged at 13,000 rpm for 30 min at 4°C. The supernatant was removed and the pellet was dried under sterile conditions. The DNA pellet was dissolved in 40 μl sterile Ampuwa water and the concentration was determined by absorbance at 260 nm (NanoDrop 1000 Spectrophotometer; Peqlab, Erlangen, Germany).

Prior to transfection, C1R cells were washed three times with cold RPMI 1640 (Thermo Fisher Scientific) and resuspended to a final concentration of 40 × 10⁶ cells/ml. For transfection, 500 μl mycoplasma-free cell suspension was mixed with 10 μg linearized plasmid DNA in a Gene Pulser electroporation cuvette (0.4-cm gap; Bio-Rad, Hercules, CA). Electroporation was conducted using the Gene Pulser II (Bio-Rad) at 250 V and 975 μF. Afterward, cells were incubated in 75-cm² flasks with 12 ml prewarmed RPMI 1640, 10% FBS (Thermo Fisher Scientific), and 1× penicillin/streptomycin (Sigma-Aldrich). Transfected cells were exposed to selection medium 24 h after electroporation by adding 1 mg/ml G418 (Merck) into the culture medium. Selection medium was exchanged twice a week.

HLA cell surface expression was verified by flow cytometry. For this purpose, 1 × 10⁶ cells were washed with FACS buffer consisting of 2% FBS with 2 mM EDTA (Roth) in PBS (Lonza, Basel, Switzerland) and transferred into a 96-well plate (Greiner Bio-One, Kremsmünster, Austria). After an additional wash, cells were incubated with 100 μl of 20 μg/ml of pan-HLA class I–specific monoclonal W6/32 Ab (in-house production) (43) or the HLA-E–specific monoclonal 3D-12 Ab (BioLegend, San Diego, CA) on ice for 20 min. Cells were washed twice and subsequently incubated with 100 μl 1:100 polyclonal anti-mouse IgG-FITC secondary Ab (Agilent Technologies, Santa Clara, CA) on ice for 20 min, protected from light. After three additional washing steps, cells were resuspended in 75 μl FACS buffer. Finally, 7.5 μl 7-aminoactinomycin D (BioLegend) was added to each sample and the cells were analyzed on a FACSCanto II analyzer (BD Biosciences, San Jose, CA). Data analysis was performed by FlowJo 10.0.7 (FlowJo, Ashland, OR).

For intracellular staining of the C1R–E*01:01 transfectant, cells were fixed with 100 μl Cytoperm/Cytofix solution (BD Biosciences) for 20 min prior to incubation with the respective Abs. For cell washes, 2% FBS, 2 mM EDTA, 0.1% saponin (AppliChem, St. Louis, MO), and 0.5% BSA (Roth) in PBS was used.

Cell populations showing high expression of HLA were sorted using a BD FACSJazz Cell Sorter (BD Biosciences) following the HLA cell surface staining procedure.

Cells were cultured up to an amount of 2.5 × 10⁹ cells and harvested by centrifugation at 1500 rpm for 15 min at 4°C. After two washing steps with cold PBS, cells were collected in a 50 ml centrifugation tube and frozen at −80°C.

HLA class I molecules were isolated using standard immunoaffinity purification as described previously (44, 45). In brief, cell pellets were lysed in 10 mM CHAPS (Applichem)/PBS (Lonza) containing protease inhibitor (Complete; Roche, Basel, Switzerland). HLA molecules were purified employing the pan-HLA class I–specific monoclonal W6/32 Ab, covalently linked to CNBr-activated Sepharose (GE Healthcare, Little Chalfont, U.K.). HLA–peptide complexes were eluted by repeated addition of 0.2% trifluoroacetic acid (Merck). Elution fractions E1–E8 were pooled and HLA ligands were separated from larger molecules by ultrafiltration using centrifugal filter units (Amicon; Merck Millipore). HLA ligands were extracted and desalted using ZipTip C18 pipette tips (Merck Millipore). Extracted peptides were eluted in 35 μl of acetonitrile (Merck)/0.1% trifluoroacetic acid, vacuum centrifuged to complete dryness, and resuspended in 25 μl of 1% acetonitrile/0.05% trifluoroacetic acid. Samples were stored at −20°C until analysis by LC-MS/MS.

Peptide samples were separated by reversed-phase liquid chromatography (nanoUHPLC, UltiMate 3000 RSLCnano; Dionex) and subsequently analyzed in an online-coupled Orbitrap Fusion Lumos (Thermo Fisher Scientific). Samples were analyzed in five technical replicates. Sample volumes of 5 μl (sample shares of 20%) were injected onto a 75 μm × 2 cm trapping column (Acclaim PepMap RSLC; Dionex) at 4 μl/min for 5.75 min. Peptide separation was subsequently performed at 50°C and a flow rate of 300 nl/min on a 50 μm × 25 cm separation column (Acclaim PepMap RSLC; Dionex), applying a gradient ranging from 2.4 to 32.0% of acetonitrile over the course of 90 min. Eluting peptides were ionized by nanospray ionization and analyzed in the mass spectrometer implementing the TopSpeed method. Survey scans were generated in the Orbitrap at a resolution of 120,000. Precursor ions were isolated in the quadrupole, fragmented by collision-induced dissociation in the ion trap, and finally fragment ions were recorded in the Orbitrap. Mass range was limited to 400–650 m/z with charge states 2+ and 3+ selected for fragmentation.

Data were processed against the human proteome included in the Swiss-Prot database (http://www.uniprot.org, release September 27, 2013; containing 20,279 reviewed protein sequences) applying the SequestHT algorithm (46) in the Proteome Discoverer (version 1.3; Thermo Fisher) software. Precursor mass tolerance was set to 5 ppm and fragment mass tolerance to 0.02 Da. The search was not restricted to an enzymatic specificity. Oxidized methionine was enabled as a dynamic modification. Percolator (47)-assisted false discovery rate (FDR) calculation was set at a target value of q ≤ 0.05 (5% FDR). Peptide-spectrum matches with q ≤ 0.05 were filtered according to additional orthogonal parameters to ensure spectral quality and validity. Peptide lengths were limited to 8–12 aa.
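The q-value and length cutoffs described above can be illustrated with a minimal Python sketch. It is not the Proteome Discoverer or Percolator workflow itself; the `PSM` record, the function names, and the example q-values are hypothetical and serve only to show how a q ≤ 0.05 filter combined with an 8–12 aa length window acts on a list of peptide-spectrum matches. The additional orthogonal quality parameters mentioned in the text are not specified and are therefore not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical record for one peptide-spectrum match."""
    sequence: str   # peptide sequence assigned to the spectrum
    q_value: float  # Percolator-style q-value (illustrative values only)


def filter_psms(psms, max_q=0.05, min_len=8, max_len=12):
    """Keep PSMs with q <= max_q whose peptide length lies in [min_len, max_len]."""
    return [
        p for p in psms
        if p.q_value <= max_q and min_len <= len(p.sequence) <= max_len
    ]


def unique_peptides(psms):
    """Collapse PSMs to unique peptide sequences, preserving first-seen order."""
    return list(dict.fromkeys(p.sequence for p in psms))


# Invented q-values; the sequences are leader peptides named in this article.
example = [
    PSM("VMAPRTLIL", 0.001),
    PSM("VMAPRTLIL", 0.020),  # second spectrum of the same peptide
    PSM("VMAPRTLVL", 0.050),  # exactly at the cutoff, retained
    PSM("VMAPRTLLL", 0.080),  # fails the q-value filter
    PSM("VMAPRTL", 0.001),    # 7 aa, outside the length window
]
kept = filter_psms(example)
peptides = unique_peptides(kept)
print(peptides)
```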

Due to endogenous expression of HLA-B*35:03 and HLA-C*04:01 in C1R cells, isolated HLA ligands of these allotypes had to be excluded from further analysis to allow for identification of HLA ligands of the transfected allele. GibbsCluster 1.1 (39) is an unsupervised way to cluster peptides according to their sequence similarity. For each transfectant, clustering of nonameric peptides was carried out. Nonamers represent the most abundant length variant in all analyzed alleles. The number of clusters was set to 1–3. A “trash cluster” with a threshold of 0 was incorporated to remove outliers. Sequence weighting type was set to “Clustering.” The default settings were used for all other options. The peptide motifs of HLA-B*35:03 and HLA-C*04:01 were previously described (48, 49) and could be confirmed performing exemplarily Gibbs clustering of some HLA-B*35:03– or HLA-C*04:01–positive samples of our in-house database containing different samples and corresponding HLA typings. Thus, clusters of these two allotypes could be well distinguished from the previously undefined analysis cluster that was assigned to the transfected HLA. The transfected HLA cluster was visualized employing Seq2Logo 2.0 (50) and Kullback–Leibler logotype using default settings. Anchor and auxiliary anchor positions were defined based on respective nonamer clusters that were assigned to the transfected HLA and subsequently adopted for octa- and decamers. This workaround was necessary because a clear distinction of all three expressed allotypes was not possible in all cases due to low peptide count and a higher proportion of non-HLA peptides (unsupervised clusters show combinations of transfected HLA, HLA-B*35:03, and HLA-C*04:01 motifs). With the exception of HLA-C*01:02, peptide anchor residues did not differ over the different length variants and clusters for octa- and decamers showed no obvious difference to the nonamer cluster. 
Peptides possessing anchor residues of the assigned transfected HLA cluster were selected from the initial peptide list for 8- to 11-mers and were defined as ligands. SYFPEITHI matrices were determined for 8- to 10-mers using frequencies of amino acids at each position from defined ligands according to established procedures (40). Length distribution was calculated including 8- to 11-mer ligands. Ligand overlap was determined using the 500 highest expressed ligands of each allele, defined by the sum of all precursor areas in all five technical replicates. Source proteome overlap was determined using the source proteins of the respective top 500 presented ligands.
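As a rough illustration of the two steps above (anchor-based ligand selection, then a per-position amino acid frequency table), the following Python sketch may help. It is a simplification under stated assumptions: the actual SYFPEITHI scoring conventions (40) are not reproduced, the function names are hypothetical, the toy peptides are invented, and the anchor definition only loosely follows the HLA-G*01:01 motif reported in this study (P/I/V at position 3, L at the C terminus).

```python
from collections import Counter


def select_ligands(peptides, anchors):
    """Keep peptides that carry an allowed residue at every anchor position.

    anchors maps a 1-based position to a string of allowed residues;
    a negative position counts from the C terminus (-1 = last residue).
    """
    def has_anchors(pep):
        for pos, allowed in anchors.items():
            idx = pos - 1 if pos > 0 else pos
            if pep[idx] not in allowed:
                return False
        return True

    return [p for p in peptides if has_anchors(p)]


def frequency_matrix(ligands, length):
    """Per-position amino acid frequencies for ligands of a single length."""
    same_len = [p for p in ligands if len(p) == length]
    matrix = []
    for i in range(length):
        counts = Counter(p[i] for p in same_len)
        matrix.append({aa: n / len(same_len) for aa, n in counts.items()})
    return matrix


def score(peptide, matrix):
    """Sum of positional frequencies (a simplified stand-in for a matrix score)."""
    return sum(col.get(aa, 0.0) for aa, col in zip(peptide, matrix))


# Invented nonamers; only three satisfy the assumed anchor definition.
toy_peptides = ["KLPAQFYIL", "RIIQEHFAL", "AAAAAAAAA", "KLPEDLYVL"]
ligands = select_ligands(toy_peptides, {3: "PIV", -1: "L"})
matrix = frequency_matrix(ligands, length=9)
print(ligands)
print(matrix[2])  # residue frequencies at anchor position 3
```

Separate matrices per length (8-, 9-, and 10-mers) would follow by calling `frequency_matrix` once per length on the same ligand list.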

For SYFPEITHI matrix validation, a k-fold (k = 5) cross-validation was used (51). For this purpose, peptide lists of each transfected allotype were randomly split into five equal folds, whereby four folds were used as training data sets to determine a SYFPEITHI matrix applying the GibbsCluster approach described above. The fifth fold was used for evaluation of the matrix. Clustering was performed on the fifth fold and peptides in the transfected HLA cluster were defined as true binders, whereas peptides in the other clusters and outliers were defined as false binders for the transfected HLA. Evaluation was performed exemplarily for one nonamer evaluation data set. Receiver operating characteristic (ROC) curve analysis was conducted to visualize the performance. Area under the curve (AUC) was calculated for each ROC curve. For comparison with NetMHCpan-3.0, commonly used thresholds were set to decide whether a peptide is defined as a binder or not. For SYFPEITHI, a threshold of ≥60% of the maximal score (defined by the sum of the highest possible scores in each position of the peptide) was set, and for NetMHCpan-3.0 a threshold of rank <2 was employed.
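The validation scheme can be sketched in plain Python as follows. This is an illustrative outline, not the code used in the study: the function names are hypothetical, the fold assignment uses a simple seeded shuffle, and the AUC is computed as the probability that a randomly chosen true binder outscores a randomly chosen false binder, which is equivalent to the area under the ROC curve. The 60% rule mirrors the SYFPEITHI threshold stated above.

```python
import random


def k_fold_split(items, k=5, seed=0):
    """Randomly split items into k folds whose sizes differ by at most one."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one true and one false binder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def is_binder(score, max_score, fraction=0.6):
    """Threshold rule: score must reach the given fraction of the maximal score."""
    return score >= fraction * max_score


# Toy data: 23 placeholder peptides, four folds for training, one for evaluation.
folds = k_fold_split([f"pep{i}" for i in range(23)], k=5)
training = [p for fold in folds[:4] for p in fold]
evaluation = folds[4]

# Invented scores and labels for the evaluation step.
auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
print(len(training), len(evaluation), auc)
```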

HLA expression of transfected C1R cells was analyzed by flow cytometry using the pan-HLA class I–specific Ab W6/32. Untransfected C1R cells were included as a negative control to distinguish expression of the transfected HLA from endogenous HLA-B*35:03 and HLA-C*04:01 expression. All transfectants, except C1R–HLA-E*01:01, demonstrated expression of the transfected HLA at the cell surface (Supplemental Fig. 1). C1R–HLA-E*01:01, stained by either W6/32 or HLA-E–specific Ab 3D-12, exhibited no cell surface expression of transfected HLA-E*01:01. However, intracellular staining of C1R–HLA-E*01:01 with 3D-12 Ab revealed the presence of intracellular pools of HLA-E*01:01. Furthermore, PCR of isolated plasmid DNA and subsequent sequencing of the HLA-E*01:01 locus confirmed the persistence of the transfected gene as well as the correct sequence (data not shown). Because the C1R cell line is HLA-E*01:03+ (52), which has a higher affinity to MHC class Ia leader peptides (53), this might explain the missing expression of transfected HLA-E*01:01 due to a lack of sufficient leader peptides. However, neither HLA-E*01:01 nor HLA-E*01:03 could be detected on the cell surface by flow cytometry, which in turn might be due to the overall low expression of endogenous HLA (Supplemental Fig. 1C). For all remaining transfected HLA, cell surface expression was sufficient for subsequent characterization of naturally processed and presented HLA ligands.

Peptides were obtained after immunoaffinity chromatography of HLA molecules from cell lysates. After separation by reversed-phase liquid chromatography, peptides were analyzed by mass spectrometry. GibbsCluster 1.1 (39) was used to separate ligands of the transfected HLA from those of endogenously expressed alleles in an unbiased manner (Supplemental Fig. 2). For HLA-C*05:01 and HLA-C*08:02, clustering revealed a similar motif to the endogenously expressed HLA-C*04:01. To avoid cross-contamination within the groups, clustering was repeated after exclusion of all peptides extracted from the C1R–HLA-C*04:01 transfectant.

In total, 392–3,463 ligands could be identified for the respective HLA transfectants possessing the anchor amino acids defined by clustering of nonamers (Table I, Supplemental Table I). Fig. 1 displays the peptide motifs of the 15 analyzed HLA-C molecules. All HLA-C allotypes share a hydrophobic C-terminal anchor position with differences in the preferred amino acid residues. This varies from aliphatic residues, such as valine or leucine in HLA-C*15:02, to aromatic residues phenylalanine and tyrosine in HLA-C*02:02. Most allotypes accept multiple hydrophobic or aromatic anchor residues at the C terminus, whereas a few have a clear preference for a single amino acid (e.g., leucine in HLA-C*01:02, -C*03:03/04, or -C*17:01). The frequency of aromatic residues correlates with the polymorphism at position 116 within the HLA molecules (Table II) (54, 55). Allotypes with a serine at position 116 more often favor aromatic residues at the C-terminal position of the peptide, whereas phenylalanine, tyrosine, or leucine at position 116 may interfere with the binding of aromatic residues. Eleven of fifteen HLA-C allotypes accept a second anchor shaped by peptide residues at position 2 (HLA-C*02:02, -C*03:03, -C*03:04, -C*06:02, -C*07:01, -C*07:02, -C*12:03, -C*14:02, -C*15:02, -C*16:01, and -C*17:01), whereas residues at position 3 constitute the second anchor for four of 15 HLA-C alleles (HLA-C*01:02, -C*04:01, -C*05:01, and -C*08:02). In contrast to small variations with regard to the C-terminal anchor residues, preferred residues at position 2 or 3 display a high degree of variability. A unique preference of proline in position 3 is favored by HLA-C*01:02. Small aliphatic or hydrophilic residues at position 2 constitute the anchor of HLA-C*02:02, -C*03:03, -C*03:04, -C*12:03, -C*15:02, -C*16:01, and -C*17:01. All of these allotypes possess a tyrosine at position 9, which may inhibit binding of larger anchor residues (54). 
Of note, six of them favor large aromatic residues at position 1, which may support the interaction provided by the small anchor residue at position 2. Only HLA-C*15:02 displays preferences for basic residues at position 1, which are also able to support the binding of the peptide. This may be feasible due to an asparagine at position 66 instead of a lysine, which occupies this position in most allotypes. Differences in peptide specificities are marginal within the HLA-C*03 subtypes. Acidic residues at position 3 form the anchor for HLA-C*04:01, -C*05:01, and -C*08:02. All three molecules combine an asparagine at position 114 and an arginine at position 156. The arginine may serve for electrostatic interaction, whereas the asparagine at position 114 instead of aspartic acid may enable the binding of an acidic residue. Whereas HLA-C*04:01 has a clear preference for aromatic residues at position 2, HLA-C*05:01 and -C*08:02 accept only small residues at this position. An explanation for this may be that the phenylalanine at position 9 of HLA-C*05:01 and -C*08:02 reduces the space to accommodate larger residues (54). HLA-C*04:01 possesses a serine at this position, which may enable the binding of large aromatic residues. Basic anchor residues at position 2 are preferred by HLA-C*06:02, -C*07:01, and -C*07:02. This may be explained by the aspartic acid at position 9 of HLA-C*06:02, -C*07:01, and -C*07:02. Major differences in the peptide specificities of the HLA-C*07 subtypes HLA-C*07:01 and -C*07:02 are revealed at position 1 and the anchor position 2. Both subtypes prefer arginine as anchor residue, whereas alternatively accepted anchor residues are threonine or asparagine for HLA-C*07:01 and tyrosine or lysine for HLA-C*07:02. The tyrosine in position 2 of HLA-C*07:02 ligands may be accepted due to a serine at position 99, where an aromatic residue is usually located. 
Further, HLA-C*07:01 favors basic residues at position 1, whereas HLA-C*07:02 does not have such a preference. As for HLA-C*15:02, the preference for basic residues at position 1 of HLA-C*07:01 ligands can be explained by an asparagine at position 66. Unique to HLA-C*14:02 is its preference for aromatic residues at anchor position 2. Again, a serine at position 9 instead of an aromatic amino acid which is generally placed at this position may enable binding of large aromatic residues. Further allotypes favoring aromatic residues at anchor position 2 are HLA-A*23 and -A*24, which also possess a serine at position 9. Auxiliary anchors (defined by a percentage share of >50% of amino acids with similar features) are located at position 1 of HLA-C*03:03, -C*03:04, and -C*17:01 ligands and at position 2 of HLA-C*04:01 ligands, with a preference for aromatic residues. Remarkable is the higher frequency of aromatic residues at position 5 and 7 of HLA-C*07:01 and -C*07:02 ligands and at position 8 of HLA-C*17:01 ligands, which may be explained by a leucine at position 147 instead of a tryptophan situated in this position in the other allotypes. The preference for aromatic residues at positions 5 and 7 of HLA-C*07:01 and -C*07:02 ligands may also be explained by an alanine at position 152, which may provide a larger pocket for residues at positions 5 and 7 of the ligands, a feature not shared by HLA-C*17:01. Exceptional to HLA-C*01:02 is its change at the anchor position 3 with proline for octameric and nonameric HLA ligands to a shared anchor with aliphatic residues at position 2, and proline, serine, or histidine at position 3 for longer ligands. In sum, peptide motifs of all analyzed HLA-C molecules could be identified and are in agreement with our knowledge of allotype-specific pocket characteristics. All HLA allotypes that have been analyzed in this study prefer nonameric ligands with frequencies varying from 62.5 to 91.3% (Fig. 2). 
Octamer frequency ranges from 4.9 to 25.0%. Decamers and undecamers were less frequent with 2.9–17.3% or 0.4–7.1%, respectively.
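The nonamer shares quoted above can be recomputed from the 9-mer counts and ligand totals listed in Table I. The short Python sketch below does this; the dictionary and function names are hypothetical, and only the counts are taken from the table.

```python
# (9-mers, total number of ligands) per transfectant, copied from Table I.
TABLE_I = {
    "C*01:02": (987, 1360),
    "C*02:02": (1533, 1963),
    "C*03:03": (852, 1080),
    "C*03:04": (1601, 2119),
    "C*04:01": (1161, 1816),
    "C*05:01": (1563, 2502),
    "C*06:02": (870, 953),
    "C*07:01": (310, 392),
    "C*07:02": (589, 777),
    "C*08:02": (2231, 3463),
    "C*12:03": (1160, 1388),
    "C*14:02": (1604, 2439),
    "C*15:02": (1639, 1908),
    "C*16:01": (1899, 2740),
    "C*17:01": (418, 632),
    "G*01:01": (1725, 2258),
}


def nonamer_share(allele):
    """Percentage of ligands that are nonamers, rounded to one decimal."""
    nine, total = TABLE_I[allele]
    return round(100.0 * nine / total, 1)


shares = {allele: nonamer_share(allele) for allele in TABLE_I}
lowest = min(shares, key=shares.get)
highest = max(shares, key=shares.get)
print(lowest, shares[lowest])    # C*05:01 62.5
print(highest, shares[highest])  # C*06:02 91.3
```

The two extremes reproduce the 62.5–91.3% range stated in the text.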

Table I.
HLA ligand yields for each corresponding C1R transfectant and numbers of source proteins
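| Allotype | 8-mers | 9-mers | 10-mers | 11-mers | No. of Ligands | No. of Source Proteins | No. of Source Proteins (Cumulative) |
|---|---|---|---|---|---|---|---|
| C*01:02 | 102 | 987 | 235 | 36 | 1360 | 1165 | 1165 |
| C*02:02 | 116 | 1533 | 214 | 100 | 1963 | 1589 | 2483 |
| C*03:03 | 91 | 852 | 99 | 38 | 1080 | 945 | 2963 |
| C*03:04 | 251 | 1601 | 176 | 91 | 2119 | 1716 | 3530 |
| C*04:01 | 467 | 1161 | 153 | 35 | 1816 | 1484 | 4243 |
| C*05:01 | 626 | 1563 | 249 | 64 | 2502 | 1898 | 4985 |
| C*06:02 | 47 | 870 | 32 | | 953 | 846 | 5271 |
| C*07:01 | 55 | 310 | 19 | | 392 | 366 | 5357 |
| C*07:02 | 116 | 589 | 53 | 19 | 777 | 700 | 5472 |
| C*08:02 | 792 | 2231 | 330 | 110 | 3463 | 2444 | 5981 |
| C*12:03 | 146 | 1160 | 53 | 29 | 1388 | 1158 | 6143 |
| C*14:02 | 484 | 1604 | 313 | 38 | 2439 | 1879 | 6590 |
| C*15:02 | 191 | 1639 | 56 | 22 | 1908 | 1522 | 6834 |
| C*16:01 | 685 | 1899 | 106 | 50 | 2740 | 2086 | 7133 |
| C*17:01 | 120 | 418 | 49 | 45 | 632 | 542 | 7184 |
| E*01:01 | | | | | | | 7184 |
| G*01:01 | 248 | 1725 | 204 | 81 | 2258 | 1816 | 7536 |
| Sum | | | | | 22,197 | 7,536 | |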

Overlapping ligands and source proteins are removed from the sum of ligands and the sum of source proteins.

FIGURE 1.

Sequence logos of the clusters corresponding to the transfected allotype visualized using Seq2Logo 2.0 (50). The size of the letter indicates the impact of the corresponding amino acid, presented by a given position in either positive or negative fashion. Black, aliphatic residues; gray, aromatic residues; green, hydrophilic residues; blue, basic residues; red, acidic residues.

Table II.
Polymorphic residues within HLA-C molecules and the position within the peptide interacting with the respective residue (54, 55)

Polymorphic Residues within HLA-C Molecules
Allotype 9 24 66 73 77 80 95 97 99 114 116 143 147 152 156 163
 C*01:02 F 
 C*02:02 Y S 
 C*03:03 Y 
 C*03:04 Y 
 C*04:01 S N R 
 C*05:01 Y N R 
 C*06:02 D S 
 C*07:01 D N S L A 
 C*07:02 D S S L A 
 C*08:02 Y N R 
 C*12:03 Y S 
 C*14:02 S S 
 C*15:02 Y N 
 C*16:01 Y S 
 C*17:01 Y L 
Position in peptide interacting with respective residue 1–4/6 5–8 7/8 8/9 3/5/6/9 2/3 3/5–7 5/7/9 5/7–9 3/5–7 3–7 1/2/4 

Boldface indicates polymorphic residues, which may explain differences in the peptide motifs.

FIGURE 2.

Length distribution of HLA-C and HLA-G ligands.

The concept of grouping HLA allotypes into supertypes depending on their main anchor specificities was introduced in 1995 (36, 56). In sum, nine supertypes could be defined covering most of the HLA-A and HLA-B alleles: HLA-A*01, -A*02, -A*03, -A*24, -B*07, -B*27, -B*44, -B*58, and -B*62 (57, 58). In 2004, Doytchinova et al. (59) applied a bioinformatics approach based on structural similarities between allotypes, also integrating the HLA-C alleles. Using this strategy, two HLA-C supertypes could be defined, named C1 and C4. Supertype C1 was defined by a serine or glycine at position 77, whereas C4 supertypic allotypes possess an asparagine at this position. Allotypes from our study belonging to the C1 supertype are HLA-C*01:02, -C*03:03, -C*03:04, -C*07:02, -C*08:02, -C*12:03, -C*14:02, and -C*16:01; whereas HLA-C*02:02, -C*04:01, -C*05:01, -C*06:02, -C*07:01, -C*15:02, and -C*17:01 belong to the C4 supertype. However, this definition is not in line with the peptide motifs of HLA-C allotypes unveiled in this study (Fig. 1). Considering the peptide motifs of HLA-C, we now propose a new categorization into five groups. Three of these groups may be integrated into HLA-A and HLA-B supertypes (HLA-C*02:02, -C*03:03, -C*03:04, -C*12:03, -C*15:02, -C*16:01, and -C*17:01 into the A*01, B*58, or B*62 supertype; HLA-C*14:02 into the A*24 supertype; and HLA-C*06:02, -C*07:01, and -C*07:02 into the B*27 supertype). Allotypes with an anchor at position 3 may deserve additional supertype definitions. A C*01 supertype with proline at position 3 and aliphatic residues at the C terminus may account for the uniqueness of HLA-C*01. A C*04 supertype would integrate HLA-C*04:01, -C*05:01, and -C*08:02 into the supertype concept.
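The mismatch between the structure-based C1/C4 supertypes and the motif-based grouping proposed here can be made explicit with a small lookup. The Python sketch below only encodes the allotype assignments listed in the paragraph above; the variable and function names, and the abbreviated group label "A*01/B*58/B*62", are illustrative choices rather than established nomenclature.

```python
# Motif-based groups proposed in the text (labels are illustrative).
PROPOSED_GROUPS = {
    "A*01/B*58/B*62": ["C*02:02", "C*03:03", "C*03:04", "C*12:03",
                       "C*15:02", "C*16:01", "C*17:01"],
    "A*24": ["C*14:02"],
    "B*27": ["C*06:02", "C*07:01", "C*07:02"],
    "C*01": ["C*01:02"],
    "C*04": ["C*04:01", "C*05:01", "C*08:02"],
}

# Structure-based supertypes (59) as assigned in the text.
C1 = {"C*01:02", "C*03:03", "C*03:04", "C*07:02",
      "C*08:02", "C*12:03", "C*14:02", "C*16:01"}
C4 = {"C*02:02", "C*04:01", "C*05:01", "C*06:02",
      "C*07:01", "C*15:02", "C*17:01"}


def structural_supertype(allotype):
    """Return 'C1' or 'C4' for an allotype analyzed in this study."""
    if allotype in C1:
        return "C1"
    if allotype in C4:
        return "C4"
    raise KeyError(allotype)


# Motif-based groups whose members fall into both structural supertypes.
mixed = [
    group for group, members in PROPOSED_GROUPS.items()
    if len({structural_supertype(m) for m in members}) > 1
]
print(mixed)
```

Every motif-based group with more than one member spans both C1 and C4, which is the sense in which the structural definition does not track the peptide motifs.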

HLA-E*01:01–transfected C1R cells present two MHC class I leader peptides, namely VMAPRTLIL derived from HLA-C*04:01 and VMAPRTLVL derived from HLA-A*02:01. The latter is to some extent surprising because there is no evidence for surface expression of HLA-A*02:01 in C1R (27, 60). VMAPRTLVL was detected in every C1R transfectant (note: C1R is HLA-E*01:03+), indicating that it is not a false positive (FP) but is most probably derived from a defective ribosomal product. Overall, three additional MHC class I leader peptides, VMAPRTLLL (HLA-C*02:02 and -C*15:02), VMAPRALLL (HLA-C*07:01 and -C*07:02), and VMAPRTLFL (HLA-G*01:01), were detected, which are presented by HLA-E*01:03. MHC class I leader peptides of HLA-B*35:03 (VTAPRTVLL) and HLA-C*17:01 (VMAPQALLL) are not presented by HLA-E*01:01 (note: only the HLA-B*35:03 signal sequence could have been expressed on C1R-E*01:01) or the endogenously expressed HLA-E*01:03. This discrimination of peptides with one or two amino acid changes, mostly in positions contributing less to the interaction with HLA, illustrates the adaptation of HLA-E to MHC leader peptide presentation and its restricted peptide repertoire.
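The statement that the non-presented leader peptides differ from presented ones by only one or two residues can be checked with a small position-wise comparison. The following is a minimal sketch, not part of the original analysis; the function names are hypothetical, and only the peptide sequences are taken from the text.

```python
# Leader peptides named in the text as presented by HLA-E.
PRESENTED = ["VMAPRTLIL", "VMAPRTLVL", "VMAPRTLLL", "VMAPRALLL", "VMAPRTLFL"]
# Leader peptides named in the text as not presented by HLA-E.
NOT_PRESENTED = ["VTAPRTVLL", "VMAPQALLL"]


def differing_positions(a, b):
    """Return the 1-based positions at which two equal-length peptides differ."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def closest_presented(peptide, presented=PRESENTED):
    """Return (closest presented peptide, differing positions) by Hamming distance."""
    candidates = ((p, differing_positions(peptide, p)) for p in presented)
    return min(candidates, key=lambda pair: len(pair[1]))


if __name__ == "__main__":
    for pep in NOT_PRESENTED:
        ref, positions = closest_presented(pep)
        print(f"{pep} vs {ref}: differs at positions {positions}")
```

With these sequences, VTAPRTVLL differs from its closest presented peptide at two positions and VMAPQALLL at one, in line with the "one or two amino acid changes" described above.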

HLA-G*01:01 reveals a marked peptide motif with anchors at position 3, composed of proline, isoleucine and valine, and at the C-terminal position (Ω) formed by leucine. An auxiliary anchor with lysine and arginine is shaped at position 1. Hydrophobic residue preferences show up at position 2 and position Ω-2 (Fig. 1). Contrary to HLA-E, HLA-G*01:01 exhibits a large peptide binding repertoire with 2258 identified ligands eluted solely from HLA-G*01:01–transfected C1R cells.
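The motif described for HLA-G*01:01 can be written down as a simple residue check. This is an illustrative sketch only: it is a yes/no test of the anchors named in the text, not the SYFPEITHI scoring matrix, and the dictionary layout and function name are assumptions made for the example.

```python
# Anchor definitions transcribed from the text; the data layout is hypothetical.
HLA_G_MOTIF = {
    "p3_anchor": set("PIV"),        # anchor at position 3
    "c_terminal_anchor": set("L"),  # anchor at the C-terminal position (omega)
    "p1_auxiliary": set("KR"),      # auxiliary anchor at position 1
}


def check_hla_g_motif(peptide):
    """Report which of the described HLA-G*01:01 anchor features a peptide carries."""
    if len(peptide) < 8:
        raise ValueError("expected a peptide of at least 8 residues")
    return {
        "p3_anchor": peptide[2] in HLA_G_MOTIF["p3_anchor"],
        "c_terminal_anchor": peptide[-1] in HLA_G_MOTIF["c_terminal_anchor"],
        "p1_auxiliary": peptide[0] in HLA_G_MOTIF["p1_auxiliary"],
    }


if __name__ == "__main__":
    # Two HLA-G*01:01 ligands listed in Table V.
    for pep in ["RSPVYLTVL", "VQVQMKFRL"]:
        print(pep, check_hla_g_motif(pep))
```

A check of this kind only flags anchor residues; it does not weight preferred residues at position 2 or Ω-2.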

To look for ligand overlap across the analyzed allotypes, the top 500 most abundant ligands (defined by the sum of the AUC values of five technical replicates) of each HLA molecule were integrated. For HLA-C*07:01, only 392 ligands could be considered (Table III). The overlap in HLA-presented peptides among allotypes with clearly distinguishable peptide motifs (one or both anchor residues are different) was marginal with a maximal overlap of 2.46% between HLA-C*03:04 and HLA-C*08:02. Allotypes with consistent anchor residue preferences display higher overlap within the presented peptides, ranging from 3.20% between HLA-C*07:02 and HLA-C*14:02, and up to 11.98% between HLA-C*02:02 and HLA-C*12:03. Notably, HLA-C*05:01 and HLA-C*08:02 show a high overlap, sharing 27.39% ligands. The comparison to HLA-C*04:01 is limited due to endogenous expression of the allotype on C1R and its similarity to HLA-C*05:01 and -C*08:02. Because peptides from C1R-C*04:01 had to be excluded for ligand definition of HLA-C*05:01 and -C*08:02, the overlap is consequently zero. Nevertheless, overlap of HLA-C*05:01 and HLA-C*08:02 to HLA-C*04:01 should be markedly lower because HLA-C*04:01 favors large aromatic residues in position 2, whereas HLA-C*05:01 and HLA-C*08:02 prefer small residues. In fact, including intrinsic HLA-C*04:01 ligands (no exclusion of peptides of C1R-C*04:01 from C1R-C*05:01 and -C*08:02 peptide lists), the overlap is higher between HLA-C*05:01 and -C*08:02 with 40.45% compared with 26.26% between HLA-C*04:01 and -C*05:01 or 25.31% between HLA-C*04:01 and -C*08:02, respectively. High overlap is seen between the HLA-C*03 subtypes HLA-C*03:03 and HLA-C*03:04, with 42.86% of shared ligands. The HLA-C*07 subtypes HLA-C*07:01 and HLA-C*07:02 display a rather small overlap of 10.12% within their ligands compared with the HLA-C*03 subtypes, which can be explained by differences in the preferred residues in anchor position 2. 
In general, ligand overlap is marginal within HLA allotypes unless the same anchor residues are shared (61).
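A pairwise overlap of this kind can be sketched as follows. The text does not state how the percentage was normalized, so dividing the number of shared ligands by the size of the smaller list is an assumption of this example, as are all function names; the input is taken to be a mapping from peptide sequence to its summed AUC.

```python
from itertools import combinations


def top_ligands(abundance, n=500):
    """Return the n most abundant peptides from a {peptide: summed AUC} mapping."""
    ranked = sorted(abundance, key=abundance.get, reverse=True)
    return set(ranked[:n])


def overlap_percent(a, b):
    """Shared ligands as a percentage of the smaller set (assumed normalization)."""
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / min(len(a), len(b))


def pairwise_overlap(ligand_sets):
    """Compute the overlap for every pair of allotypes in {allotype: set of peptides}."""
    return {
        (x, y): overlap_percent(ligand_sets[x], ligand_sets[y])
        for x, y in combinations(sorted(ligand_sets), 2)
    }
```

Normalizing by the smaller list keeps an allotype with fewer than 500 ligands, such as HLA-C*07:01 with 392, from lowering the value artificially; other normalizations would give different percentages.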

Table III.
Ligand overlap (%) of the top 500 HLA ligands of each allele
 
 

Ligand overlap was determined using the top 500 ligands of each allele defined by the sum of area in all five technical replicates. Crossed-out numbers are not representative.

* C1R–C*04:01 peptides removed.
** Only 392 ligands.

Theoretically, all proteins within a cell may be used as source for peptide presentation. However, different factors such as source protein expression level, Ag processing, and transport efficiency and affinity of the peptide to the HLA and their stability may select for a smaller set of source proteins, which are presented by one allotype. The source proteins of the 500 most abundant ligands were selected for the overlap analysis of the source proteome (Tables IV, V) (62). The source protein overlap was the highest for allotypes with a high ligand overlap, which is obvious because the overlapping ligands derive from the same source protein. More interesting is the overlap of the source proteome added by nonoverlapping ligands. In fact, the additional overlap contributed by nonoverlapping ligands is comparable within all allotypes, independent of their peptide motifs, with a median increase of 5.42%. This allotype- and also subtype-independent low increase in the source proteome overlap displays the high diversification that is added by an additional HLA molecule.
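The distinction between source proteome overlap caused by shared ligands and overlap added by nonoverlapping ligands can be made concrete with a short sketch. The normalization by the smaller source proteome and all names below are assumptions of this example; the original calculation is not specified in the text.

```python
def source_proteome_overlap(ligands_a, ligands_b):
    """Split source protein overlap into a part explained by shared ligands and
    an additional part contributed only by nonoverlapping ligands.

    ligands_a, ligands_b: {peptide: source protein} for two allotypes.
    Percentages refer to the smaller source proteome (assumed normalization).
    """
    proteins_a = set(ligands_a.values())
    proteins_b = set(ligands_b.values())
    denominator = min(len(proteins_a), len(proteins_b))
    if denominator == 0:
        return {"total": 0.0, "from_shared_ligands": 0.0, "additional": 0.0}

    shared_proteins = proteins_a & proteins_b
    shared_peptides = set(ligands_a) & set(ligands_b)
    # Proteins that overlap because the very same peptide is presented by both.
    via_shared = {ligands_a[p] for p in shared_peptides}
    # Proteins that overlap although only different peptides are presented.
    additional = shared_proteins - via_shared

    def to_percent(count):
        return 100.0 * count / denominator

    return {
        "total": to_percent(len(shared_proteins)),
        "from_shared_ligands": to_percent(len(shared_proteins & via_shared)),
        "additional": to_percent(len(additional)),
    }
```

In this sketch the "additional" value corresponds to the increase attributed in the text to nonoverlapping ligands.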

Table IV.
Source proteome overlap (%)
 
 

Source proteins of the top 500 expressed HLA ligands, defined by the sum of areas in all five technical replicates, were included.

Table V.
HLA-C and HLA-G ligands of tumor-associated Ags according to Cheever et al. (62) and the allotype by which the ligand is presented
Protein | UniProt Identification | Ligand | HLA
MAGEA3 | P43357 | FQAALSRKV | C*02:02
MAGEA3 | P43357 | FVQENYLEY | C*02:02
MAGEA3 | P43357 | TFPDLESEF | C*14:02
MAGEA3 | P43357 | NYPLWSQSY | C*14:02
TP53 | P04637 | TAKSVTCTY | C*02:02
PSMA | Q04609 | FTEIASKF | C*12:03
gp100/PMEL | P40967 | HFLRNQPL | C*14:02
PSA | P55786 | ISTVEVLKV | C*15:02
PSA (PSAL) | P55786 | VVPKDRVAL | G*01:01
PSA | P55786 | RSPVYLTVL | G*01:01
Cyclin B1 | P14635 | VQDLAKAV | C*05:01/C*08:02
Cyclin B1 | P14635 | FRLLQETMY | C*07:02
Cyclin B1 | P14635 | VQVQMKFRL | G*01:01
RhoC (RhoA) | P08134 | FSIDSPDSL | C*03:03
RhoC | P08134 | MATRAGLQV | C*15:02
RhoC (RhoB) | P08134 | KTKEGVREV | C*15:02/C*16:01
SART3 | Q15020 | YIDFEMKI | C*05:01
SART3 | Q15020 | IGDPARIQL | C*05:01
SART3 | Q15020 | NADFAKLFL | C*08:02
SART3 | Q15020 | IFSNRGDF | C*14:02
SART3 | Q15020 | VAAATYKTM | C*16:01
SART3 | Q15020 | AAFTRALEY | C*16:01
Sperm protein 17 | Q15506 | RIPQGFGNLL | G*01:01
LCK | P06239 | ITFPGLHEL | C*07:01/C*12:03/C*15:02
LCK | P06239 | FYISPRITF | C*14:02
LCK | P06239 | KTPSGIKL | G*01:01
B7H3 | Q5ZPR3 | FSPEPGFSL | C*01:02
B7H3 | Q5ZPR3 | LFDVHSVL | C*04:01
B7H3 | Q5ZPR3 | VAAPYSKPSM | C*16:01

Proteins in parentheses may also be the source of the HLA ligand.

Identified peptides were used to establish SYFPEITHI matrices. To this end, peptides were clustered using GibbsCluster 1.1 (39). Anchor positions and residues were defined from clusters of the transfected HLA. Peptides harboring predefined anchor residues were defined as ligands and were used to establish SYFPEITHI matrices (Supplemental Fig. 2, Supplemental Table II). To examine the performance of the established SYFPEITHI matrices, k-fold cross-validation was performed (51). For that purpose, the initial peptide list of each analyzed transfectant was split into five parts. Four parts were used for clustering and subsequent definition of the SYFPEITHI matrix, whereas one part remained to validate the matrix. Peptides in the cluster corresponding to the transfected HLA were defined as true binders, whereas peptides of the other clusters and peptides fitting to no cluster were defined as false binders. ROC points were calculated in 5% steps of the SYFPEITHI maximal score. AUC values range from 0.88 for HLA-C*14:02 and HLA-C*17:01 to 0.97 for the HLA-C*01:02 nonamer matrix. Only the HLA-C*16:01 matrix performed less well, with an AUC of 0.78 (Fig. 3). In conclusion, the performance of the matrices is excellent for discrimination of true and false binders within the data set.
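The validation scheme described here (a five-part split, ROC points in 5% steps of the maximal score, and an area under the curve) can be sketched in a few lines. This is a hedged illustration, not the authors' pipeline: the clustering and matrix construction steps are omitted, the trapezoidal rule for the AUC is an assumption, and all function names are hypothetical.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (training, validation) lists for a shuffled k-fold split."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, folds[i]


def roc_points(scores, labels, max_score, step=0.05):
    """Return (FP rate, TP rate) pairs for thresholds from 0% to 100% of
    max_score in increments of `step`; labels are True for true binders."""
    n_pos = sum(1 for is_binder in labels if is_binder)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false binder")
    n_steps = round(1 / step)
    points = []
    for k in range(n_steps + 1):
        threshold = max_score * k / n_steps
        tp = sum(1 for s, b in zip(scores, labels) if b and s >= threshold)
        fp = sum(1 for s, b in zip(scores, labels) if not b and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule (assumed method)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

In use, the matrix would be built from the training part of each split and `roc_points` applied to the scores of the held-out part.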

FIGURE 3.

ROC analysis of SYFPEITHI matrices of nonamers. Each point represents the TP- and FP-predicted ligands from applying SYFPEITHI thresholds in 5% steps from 0 to 100% of the maximal score.


SYFPEITHI (40) and NetMHC (41, 63) are commonly used tools for HLA binding predictions. However, the two prediction tools are based on different strategies. SYFPEITHI uses a position-based matrix scoring system that depends on amino acid frequencies at each position and the definition of anchor and auxiliary anchor positions using naturally eluted MHC ligands. In contrast, NetMHCpan-3.0 uses artificial neural networks that were trained on quantitative in vitro binding data of peptide–MHC class I complexes from the Immune Epitope Database (64). Thus, all ligands are also binders, but peptides identified to be binders in vitro are not necessarily natural ligands. To compare both prediction tools, peptides of each transfectant were split into true or false binders for the transfected HLA by clustering (peptides of the transfected HLA cluster = “true binders,” other peptides = “false binders”). Commonly used thresholds for binder definition were applied: ≥60% of the maximal score for SYFPEITHI and rank <2 for NetMHCpan-3.0. The rates of FP- and true positive (TP)–predicted binders are illustrated in Fig. 4 for all analyzed allotypes, except for HLA-E*01:01. SYFPEITHI provides a powerful prediction with a high TP rate ranging from 0.66 to 0.91 and a low FP rate ranging from 0.00 to 0.22. Only the nonameric matrix for HLA-C*14:02, with an FP rate of 0.54, performed poorly, which can be explained by the motif similarities to HLA-C*04:01 (endogenously expressed by C1R) at position 2 (anchor and auxiliary anchor, respectively, preferring aromatic residues) and anchor position 9. NetMHCpan-3.0 exhibits higher TP rates between 0.71 and 0.97, but at the same time higher FP rates between 0.07 and 0.65. Similar to SYFPEITHI, the NetMHCpan-3.0 prediction for HLA-C*14:02 performs poorly, with an FP rate of 0.72. The highest disparity is seen for HLA-G*01:01 with a TP rate of 0.91 and an FP rate of 0.05 for SYFPEITHI, and a TP rate of 0.67 and an FP rate of 0.65 for NetMHCpan-3.0. Hence, for HLA-G*01:01, NetMHCpan-3.0 displays a rather random prediction. In conclusion, the performance of the established SYFPEITHI matrices could be confirmed by comparison with NetMHCpan-3.0. SYFPEITHI achieved higher precision than NetMHCpan-3.0 for all allotypes (Fig. 5).
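The threshold-based comparison can be illustrated with a short sketch that turns scores into binder calls and derives TP rate, FP rate, and precision. The thresholds (at least 60% of the maximal SYFPEITHI score, NetMHCpan-3.0 rank below 2) come from the text; the function names and the use of counts in the precision formula TP/(TP+FP) are assumptions of this example.

```python
def syfpeithi_call(score, max_score, fraction=0.60):
    """Predicted binder if the score reaches the given fraction of the maximal score."""
    return score >= fraction * max_score


def netmhcpan_call(rank, cutoff=2.0):
    """Predicted binder if the percentile rank is below the cutoff."""
    return rank < cutoff


def confusion_rates(predicted, truth):
    """Return TP rate, FP rate, and precision for boolean predictions and labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    return {
        "tp_rate": tp / (tp + fn) if tp + fn else float("nan"),
        "fp_rate": fp / (fp + tn) if fp + tn else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }


if __name__ == "__main__":
    # Toy data: three true binders and two false binders with made-up scores.
    truth = [True, True, True, False, False]
    scores = [30, 20, 10, 25, 5]
    calls = [syfpeithi_call(s, max_score=30) for s in scores]
    print(confusion_rates(calls, truth))
```

A predictor with similar TP and FP rates, as reported for NetMHCpan-3.0 on HLA-G*01:01, separates the two classes little better than chance, which is what "random" refers to here.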

FIGURE 4.

Performance of SYFPEITHI matrices and NetMHCpan-3.0 prediction. Data set of each transfectant was divided into true and false binders to the respective HLA using unbiased clustering. Peptides in clusters of the corresponding transfected HLA were defined as true binders; peptides in clusters representing the endogenously expressed HLA molecules were defined as false binders of the transfected HLA. Peptides were defined as ligands with a SYFPEITHI score of ≥60% or NetMHCpan-3.0 rank <2.

FIGURE 5.

Precision of SYFPEITHI and NetMHCpan-3.0. The precision is defined as TPs divided by the sum of TPs and FPs (= TP/[TP+FP]).


Defining the binding specificities of the classical but often underestimated HLA-C alleles and of the nonclassical HLA-E and HLA-G is of great importance in itself. Beyond that, given the roles of these molecules in many conditions, such as cancer, viral infections, inflammatory diseases, autoimmune disorders, and transplantation, naturally processed and presented HLA ligands may contribute to our understanding of disease and foster approaches for intervention.

In this study, the peptide motifs of the 15 most frequently represented HLA-C alleles were comprehensively analyzed using mass spectrometry–based characterization of naturally presented HLA ligands (Fig. 1). Due to the low expression of HLA-C (65–69), it is hardly feasible to determine the peptide motifs in a system with simultaneous normal expression of HLA-A and -B in an unsupervised clustering approach. Hence, the lymphoblastoid C1R cell line with a low endogenous expression of HLA-B*35:03 (27) and HLA-C*04:01 was used to uncover the peptide motifs of HLA-C alleles.

The ligand yields encompass a wide range, which is because of differences in the expression levels but may also be caused by performance variances of mass spectrometric measurements. Nevertheless, yields of extracted HLA ligands were sufficient for all alleles to determine the binding motif of predominant nonamers in an unsupervised manner using GibbsCluster 1.1 (39). Peptide yields for less frequent length variants were in some cases insufficient, leading to contaminated clusters of the transfected HLA with peptides from the endogenously expressed HLA molecules. However, except for HLA-C*01:02, no changes in the preferred anchor residues emerged between the less frequent lengths and nonamers. Therefore, anchor residues were defined from the cluster of nonamers and assigned to the other length variants. This was helpful for the definition of ligands of all length variants. For HLA-C*01:02, additional anchor residues for longer length variants were included.

A common feature of all allotypes is their preference for hydrophobic and/or aromatic residues at the C-terminal position, similar to HLA-B, whereas some HLA-A allotypes accept basic amino acids as C-terminal anchors. The restricted repertoire of anchor residues at the C-terminal position could be a result of the proximity to the interaction site of KIRs with HLA-C. KIRs interact with residues α73 to α90 of the HLA molecule (70, 71), which are less polymorphic in HLA-C compared with HLA-A and -B. This region is also mainly involved in the C-terminal anchor contacts (54). The low polymorphism of the α1 domain of HLA-C molecules restricts the binding repertoire of HLA-C alleles, reducing the number of potential ligands. At first glance, this appears rather unfavorable for the body’s defense, because in theory fewer pathogen-derived or tumor-associated Ags could be presented on HLA-C molecules. This low polymorphism could be associated with the particular role of HLA-C in delivering inhibitory signals to NK cells to ensure self-tolerance. HLA-C alleles guarantee NK cell inhibition in every individual regardless of their HLA allele combination because, in contrast to HLA-A and -B (of which only a few alleles carry epitopes for KIR recognition), all HLA-C molecules have either the C1 or C2 epitope for KIR recognition (72, 73). HLA-C thereby allows combinatorial diversity of HLA-A and -B molecules and a still sufficiently broad binding repertoire within the population, without the disadvantage of a loss of self-tolerance due to missing-self signals.

The anchors in position 2 or 3, respectively, display high variability and thus contribute most to the peptide repertoire of HLA-C alleles. Based on the motif variability, five groups can be determined: 1) small residues in position 2 of HLA-C*02:02, -C*03:03, -C*03:04, -C*12:03, -C*15:02, -C*16:01, and -C*17:01; 2) acidic residues in position 3 of HLA-C*04:01, -C*05:01, and -C*08:02; 3) basic residues in position 2 of HLA-C*06:02, -C*07:01, and -C*07:02; 4) proline in position 3 of HLA-C*01:02; and 5) aromatic residues in position 2 of HLA-C*14:02. Interestingly, small residues in position 2 are often accompanied by aromatic residues in position 1 or 3, which may stabilize the binding of the position 2 anchor. A striking change in the peptide motif is seen for longer-length variants of HLA-C*01:02, where the auxiliary anchor in position 2 almost reaches the importance of the anchor at position 3.
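For downstream use, the five motif groups can be captured in a small lookup table. The structure below is a hypothetical encoding made for illustration; only the group definitions and allotype assignments are taken from the text.

```python
# Five motif groups as listed in the text; the dictionary layout is an assumption.
MOTIF_GROUPS = {
    1: {"anchor_position": 2, "anchor_residues": "small",
        "allotypes": ["C*02:02", "C*03:03", "C*03:04", "C*12:03",
                      "C*15:02", "C*16:01", "C*17:01"]},
    2: {"anchor_position": 3, "anchor_residues": "acidic",
        "allotypes": ["C*04:01", "C*05:01", "C*08:02"]},
    3: {"anchor_position": 2, "anchor_residues": "basic",
        "allotypes": ["C*06:02", "C*07:01", "C*07:02"]},
    4: {"anchor_position": 3, "anchor_residues": "proline",
        "allotypes": ["C*01:02"]},
    5: {"anchor_position": 2, "anchor_residues": "aromatic",
        "allotypes": ["C*14:02"]},
}


def group_of(allotype):
    """Return the motif group number for an HLA-C allotype such as 'C*07:02'."""
    for group, info in MOTIF_GROUPS.items():
        if allotype in info["allotypes"]:
            return group
    raise KeyError(f"allotype not covered by the five groups: {allotype}")
```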

The predominant length variant in HLA-C is 9 aa. However, a higher rate of shorter HLA ligands was obtained for HLA-C*04:01, -C*05:01, -C*07:01, -C*07:02, -C*08:02, -C*14:02, -C*15:02, and -C*17:01. Interestingly, six of the eight listed allotypes prefer charged or aromatic anchor residues, leading to the assumption that stronger interaction with the HLA molecule by charged or aromatic anchor residues may stabilize shorter peptides in the binding pocket.

Peptide overlap was generally low among the HLA alleles, ranging from 0% in allotypes with nonoverlapping motifs (note: C-terminal anchor has low variance throughout the HLA-C alleles) to 10.86% in allotypes with similar motifs (HLA-C*02:02 and HLA-C*12:03). This underlines the distinct peptide repertoire even of allotypes with similar binding specificities, a feature that has also been reported for members of the HLA-B*44 supertype (61). However, high overlap is observed between HLA-C*05:01 and HLA-C*08:02, which display peptide motifs that are virtually indistinguishable from each other. HLA subtypes usually demonstrate a high degree of binding similarity (41.24% peptide overlap between HLA-C*03:03 and HLA-C*03:04). An exception is the low overlap (8.12%) between the HLA-C*07 subtypes HLA-C*07:01 and -C*07:02, caused primarily by distinct differences in the favored anchor residues in position 2 and differences in position 1 (basic auxiliary anchor in HLA-C*07:01 or uncharged residues in HLA-C*07:02, respectively).

The source proteome overlap of the top 500 ligands of each allotype was particularly increased by overlapping ligands. A limitation is considering only the source proteome of the top 500 ligands, which underestimates the source proteome overlap, because there is a higher probability for further-included ligands (up to 3463 ligands were detected for one HLA molecule [Table I]) to derive from already included source proteins. However, this was necessary due to high variations in the ligand yields. By including the source proteins of all ligands, the source proteome overlap of nonoverlapping ligands would increase from 5.42 to ∼10%. This percentage still illustrates the high diversification added by a second allotype, including subtypes.

The SYFPEITHI matrices resulting from this work reveal high TP prediction rates for HLA-C ligands in combination with a low FP prediction rate, whereas NetMHCpan-3.0 generally achieves slightly higher TP rates that are often accompanied by a high FP rate. This high FP rate is problematic in that the HLA-C*04:01 and HLA-B*35:03 peptides used for this comparison (endogenously expressed by C1R) exhibit distinguishable anchor residues. In general, SYFPEITHI prediction is more conservative (lower TP rates but markedly lower FP rates) and more precise than binding prediction with NetMHCpan-3.0.

The importance of HLA-C becomes apparent in that several peptides of tumor-associated Ags are known to be presented by HLA-C and are recognized by CD8+ T cells. HLA-C ligands, which are known to be recognized by CD8+ T cells, arise from the shared tumor-specific Ags MAGE (74–77), BAGE (78), GAGE (79), and NY-ESO1 (80); the differentiation Ags DCT, PMEL (81), and SLC45A3 (82); the overexpressed Ag TPBG (83); and the Ag PARP12 (84). Indeed, SAFPTTINF (MAGEA1) and VYPEYVIQY (PARP12) were also found within our data set. Furthermore, neoepitopes are known to arise from KRAS (85) or MUM2 (86). Within our data set, further peptides of tumor-associated Ags (according to Ref. 62) were found to be presented by HLA-C alleles, which may be targets of CD8+ T cells and NK cells (Table V).

C1R cells transfected with HLA-E*01:01 exhibit no increase in cell surface expression, although successful transfection was demonstrated by sequencing. This is in line with results of Braud et al. (52) revealing a correlation of HLA-E surface expression with the presence of HLA molecules. In fact, the presentation of the HLA-C*04:01 signal peptide VMAPRTLIL was higher in C1R transfected with an HLA allele harboring the same signal peptide (HLA-C*01:02, -C*03:03, -C*03:04, -C*05:01, -C*06:02, -C*08:02, -C*12:03, -C*14:02, and -C*16:01) compared with HLA-E*01:01–transfected cells (data not shown).

In accordance with Braud et al. (52), only five HLA signal peptides could be found to be presented by HLA-E when looking for recurring sequence similarities throughout the transfectants after the exclusion of HLA-B*35:03 and HLA-C*04:01 ligands (clustering was likewise unsuccessful). However, some conventional peptides from pathogens (87–90) and a prostate cancer–associated Ag (91) were reported to elicit HLA-E–dependent T cell responses. Furthermore, a broader binding repertoire of HLA-E was reported with similarities to the HLA-A*02 binding motif in TAP-deficient K562 cells (55, 92).

In contrast to HLA-E*01:01, HLA-G*01:01 displayed a much larger peptide repertoire with >2200 detected HLA ligands, including peptides of tumor-associated Ags such as PSA, cyclin B1, sperm protein 17, and LCK (Table V), which may elicit inhibitory effects on T and NK cells. The peptide motif displays unusual and highly specialized binding preferences. In contrast to the conclusion made by Diehl et al. (26) assuming three anchor residues (I/L in position 2, P in position 3, and L in position 9), our results indicate that position 2 seems to be less important for peptide binding. The preferred length of HLA-G*01:01 ligands is 9 aa with a sparse peptide overlap to the analyzed HLA-C alleles. The SYFPEITHI matrix for nonameric HLA-G*01:01 reveals a strong performance with a TP rate of 0.91 and an FP rate of 0.05 using the C1R-HLA-G*01:01 peptide data set, whereas NetMHCpan-3.0 exhibits a random prediction with a TP rate of 0.67 and an FP rate of 0.65.

In summary, the peptide motif of HLA-G*01:01 was uncovered, making use of >2200 HLA-G*01:01 ligands. The SYFPEITHI matrix for nonamers outperforms prediction by NetMHCpan-3.0.

We thank Claudia Falkenburger, Zsofia Bittner, and Martin Laure for supporting some of the cell culture experiments and Beate Pömmerl for PCR of isolated plasmid DNA.

This work was supported by the European Union (European Research Council Grant AdG339842 Mutaediting), the Deutsche Forschungsgemeinschaft (SFB 685 and GRK 794), and the Interfaculty Center for Pharmacogenomics and Pharma Research Graduate School at Tübingen–Stuttgart.

The online version of this article contains supplemental material.

Abbreviations used in this article:

AUC, area under the curve; FDR, false discovery rate; FP, false positive; ILT2/4, Ig-like transcript 2/4; KIR, killer cell Ig-like receptor; LC-MS/MS, liquid chromatographic tandem mass spectrometry; ROC, receiver operating characteristic; TP, true positive.

1. Zemmour, J., P. Parham. 1992. Distinctive polymorphism at the HLA-C locus: implications for the expression of HLA-C. J. Exp. Med. 176: 937–950.
2. Rech, A. J., R. H. Vonderheide. 2017. T-cell transfer therapy targeting mutant KRAS. N. Engl. J. Med. 376: e11.
3. Blais, M. E., T. Dong, S. Rowland-Jones. 2011. HLA-C as a mediator of natural killer and T-cell activation: spectator or key player? Immunology 133: 1–7.
4. Trowsdale, J., A. Moffett. 2008. NK receptor interactions with MHC class I molecules in pregnancy. Semin. Immunol. 20: 317–320.
5. Falk, K., O. Rötzschke, B. Grahovac, D. Schendel, S. Stevanović, V. Gnau, G. Jung, J. L. Strominger, H. G. Rammensee. 1993. Allele-specific peptide ligand motifs of HLA-C molecules. Proc. Natl. Acad. Sci. USA 90: 12005–12009.
6. Rasmussen, M., M. Harndahl, A. Stryhn, R. Boucherma, L. L. Nielsen, F. A. Lemonnier, M. Nielsen, S. Buus. 2014. Uncovering the peptide-binding specificities of HLA-C: a general strategy to determine the specificity of any MHC class I molecule. J. Immunol. 193: 4790–4802.
7. Braud, V., E. Y. Jones, A. McMichael. 1997. The human major histocompatibility complex class Ib molecule HLA-E binds signal sequence-derived peptides with primary anchor residues at positions 2 and 9. Eur. J. Immunol. 27: 1164–1169.
8. Aldrich, C. J., A. DeCloux, A. S. Woods, R. J. Cotter, M. J. Soloski, J. Forman. 1994. Identification of a Tap-dependent leader peptide recognized by alloreactive T cells specific for a class Ib antigen. Cell 79: 649–658.
9. Apps, R., Z. Meng, G. Q. Del Prete, J. D. Lifson, M. Zhou, M. Carrington. 2015. Relative expression levels of the HLA class-I proteins in normal and HIV-infected cells. J. Immunol. 194: 3594–3600.
10. Braud, V. M., D. S. Allan, C. A. O’Callaghan, K. Söderström, A. D’Andrea, G. S. Ogg, S. Lazetic, N. T. Young, J. I. Bell, J. H. Phillips, et al. 1998. HLA-E binds to natural killer cell receptors CD94/NKG2A, B and C. Nature 391: 795–799.
11. Borrego, F., M. Masilamani, A. I. Marusina, X. Tang, J. E. Coligan. 2006. The CD94/NKG2 family of receptors: from molecules and cells to clinical relevance. Immunol. Res. 35: 263–278.
12. Malnati, M. S., M. Peruzzi, K. C. Parker, W. E. Biddison, E. Ciccone, A. Moretta, E. O. Long. 1995. Peptide specificity in the recognition of MHC class I by natural killer cell clones. Science 267: 1016–1018.
13. Fadda, L., G. Borhis, P. Ahmed, K. Cheent, S. V. Pageon, A. Cazaly, S. Stathopoulos, D. Middleton, A. Mulder, F. H. Claas, et al. 2010. Peptide antagonism as a mechanism for NK cell activation. Proc. Natl. Acad. Sci. USA 107: 10160–10165.
14. Borhis, G., P. S. Ahmed, B. Mbiribindi, M. M. Naiyer, D. M. Davis, M. A. Purbhoo, S. I. Khakoo. 2013. A peptide antagonist disrupts NK cell inhibitory synapse formation. J. Immunol. 190: 2924–2930.
15. King, A., D. S. Allan, M. Bowen, S. J. Powis, S. Joseph, S. Verma, S. E. Hiby, A. J. McMichael, Y. W. Loke, V. M. Braud. 2000. HLA-E is expressed on trophoblast and interacts with CD94/NKG2 receptors on decidual NK cells. Eur. J. Immunol. 30: 1623–1631.
16. Adams, E. J., A. M. Luoma. 2013. The adaptable major histocompatibility complex (MHC) fold: structure and function of nonclassical and MHC class I-like molecules. Annu. Rev. Immunol. 31: 529–561.
17. Sullivan, L. C., H. L. Hoare, J. McCluskey, J. Rossjohn, A. G. Brooks. 2006. A structural perspective on MHC class Ib molecules in adaptive immunity. Trends Immunol. 27: 413–420.
18. Curigliano, G., C. Criscitiello, L. Gelao, A. Goldhirsch. 2013. Molecular pathways: human leukocyte antigen G (HLA-G). Clin. Cancer Res. 19: 5564–5571.
19. Carosella, E. D., B. Favier, N. Rouas-Freiss, P. Moreau, J. Lemaoult. 2008. Beyond the increasing complexity of the immunomodulatory HLA-G molecule. Blood 111: 4862–4870.
20. Lin, A., X. Zhang, H. H. Xu, D. P. Xu, Y. Y. Ruan, W. H. Yan. 2012. HLA-G expression is associated with metastasis and poor survival in the Balb/c nu/nu murine tumor model with ovarian cancer. Int. J. Cancer 131: 150–157.
21. Lila, N., A. Carpentier, C. Amrein, I. Khalil-Daher, J. Dausset, E. D. Carosella. 2000. Implication of HLA-G molecule in heart-graft acceptance. Lancet 355: 2138.
22. Colonna, M., F. Navarro, T. Bellón, M. Llano, P. García, J. Samaridis, L. Angman, M. Cella, M. López-Botet. 1997. A common inhibitory receptor for major histocompatibility complex class I molecules on human lymphoid and myelomonocytic cells. J. Exp. Med. 186: 1809–1818.
23. Colonna, M., J. Samaridis, M. Cella, L. Angman, R. L. Allen, C. A. O’Callaghan, R. Dunbar, G. S. Ogg, V. Cerundolo, A. Rolink. 1998. Human myelomonocytic cells express an inhibitory receptor for classical and nonclassical MHC class I molecules. J. Immunol. 160: 3096–3100.
24. Rajagopalan, S., E. O. Long. 1999. A human histocompatibility leukocyte antigen (HLA)-G-specific receptor expressed on all natural killer cells. [Published erratum appears in 2000 J. Exp. Med. 191: 1.] J. Exp. Med. 189: 1093–1100.
25. Lee, N., A. R. Malacko, A. Ishitani, M. C. Chen, J. Bajorath, H. Marquardt, D. E. Geraghty. 1995. The membrane-bound and soluble forms of HLA-G bind identical sets of endogenous peptides but differ with respect to TAP association. Immunity 3: 591–600.
26. Diehl, M., C. Münz, W. Keilholz, S. Stevanović, N. Holmes, Y. W. Loke, H. G. Rammensee. 1996. Nonclassical HLA-G molecules are classical peptide presenters. Curr. Biol. 6: 305–314.
27. Zemmour, J., A. M. Little, D. J. Schendel, P. Parham. 1992. The HLA-A,B “negative” mutant cell line C1R expresses a novel HLA-B35 allele, which also has a point mutation in the translation initiation codon. J. Immunol. 148: 1941–1948.
28. Storkus, W. J., D. N. Howell, R. D. Salter, J. R. Dawson, P. Cresswell. 1987. NK susceptibility varies inversely with target cell class I HLA antigen expression. J. Immunol. 138: 1657–1659.
29. Yamada, N., Y. Ishikawa, T. Dumrese, K. Tokunaga, T. Juji, T. Nagatani, K. Miwa, H. G. Rammensee, M. Takiguchi. 1999. Role of anchor residues in peptide binding to three HLA-A26 molecules. Tissue Antigens 54: 325–332.
30. Dumrese, T., S. Stevanović, F. H. Seeger, N. Yamada, Y. Ishikawa, K. Tokunaga, M. Takiguchi, H. Rammensee. 1998. HLA-A26 subtype A pockets accommodate acidic N-termini of ligands. Immunogenetics 48: 350–353.
31. Shiga, H., T. Shioda, H. Tomiyama, Y. Takamiya, S. Oka, S. Kimura, Y. Yamaguchi, T. Gojoubori, H. G. Rammensee, K. Miwa, M. Takiguchi. 1996. Identification of multiple HIV-1 cytotoxic T-cell epitopes presented by human leukocyte antigen B35 molecules. AIDS 10: 1075–1083.
32. Kikuchi, A., T. Sakaguchi, K. Miwa, Y. Takamiya, H. G. Rammensee, Y. Kaneko, M. Takiguchi. 1996. Binding of nonamer peptides to three HLA-B51 molecules which differ by a single amino acid substitution in the A-pocket. Immunogenetics 43: 268–276.
33. Falk, K., O. Rötzschke, M. Takiguchi, V. Gnau, S. Stevanović, G. Jung, H. G. Rammensee. 1995. Peptide motifs of HLA-B38 and B39 molecules. Immunogenetics 41: 162–164.
34. Falk, K., O. Rötzschke, M. Takiguchi, V. Gnau, S. Stevanović, G. Jung, H. G. Rammensee. 1995. Peptide motifs of HLA-B51, -B52 and -B78 molecules, and implications for Behçet’s disease. Int. Immunol. 7: 223–228.
35. Falk, K., O. Rötzschke, M. Takiguchi, V. Gnau, S. Stevanović, G. Jung, H. G. Rammensee. 1995. Peptide motifs of HLA-B58, B60, B61, and B62 molecules. Immunogenetics 41: 165–168.
36. Sidney, J., M. F. del Guercio, S. Southwood, V. H. Engelhard, E. Appella, H. G. Rammensee, K. Falk, O. Rötzschke, M. Takiguchi, R. T. Kubo, et al. 1995. Several HLA alleles share overlapping peptide specificities. J. Immunol. 154: 247–259.
37. Falk, K., O. Rötzschke, M. Takiguchi, B. Grahovac, V. Gnau, S. Stevanović, G. Jung, H. G. Rammensee. 1994. Peptide motifs of HLA-A1, -A11, -A31, and -A33 molecules. Immunogenetics 40: 238–241.
38. Abelin, J. G., D. B. Keskin, S. Sarkizova, C. R. Hartigan, W. Zhang, J. Sidney, J. Stevens, W. Lane, G. L. Zhang, T. M. Eisenhaure, et al. 2017. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. Immunity 46: 315–326.
39. Andreatta, M., O. Lund, M. Nielsen. 2013. Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach. Bioinformatics 29: 8–14.
40. Rammensee, H., J. Bachmann, N. P. Emmerich, O. A. Bachor, S. Stevanović. 1999. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50: 213–219.
41. Nielsen, M., M. Andreatta. 2016. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets. Genome Med. 8: 33.
42. Robinson, J., J. A. Halliwell, J. D. Hayhurst, P. Flicek, P. Parham, S. G. Marsh. 2015. The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res. 43: D423–D431.
43. Barnstable, C. J., W. F. Bodmer, G. Brown, G. Galfre, C. Milstein, A. F. Williams, A. Ziegler. 1978. Production of monoclonal antibodies to group A erythrocytes, HLA and other human cell surface antigens-new tools for genetic analysis. Cell 14: 9–20.
44. Falk, K., O. Rötzschke, S. Stevanović, G. Jung, H. G. Rammensee. 1991. Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature 351: 290–296.
45. Kowalewski, D. J., S. Stevanović. 2013. Biochemical large-scale identification of MHC class I ligands. Methods Mol. Biol. 960: 145–157.
46. Eng, J. K., A. L. McCormack, J. R. Yates. 1994. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.
J. Am. Soc. Mass Spectrom.
5
:
976
989
.
47
Käll
,
L.
,
J. D.
Canterbury
,
J.
Weston
,
W. S.
Noble
,
M. J.
MacCoss
.
2007
.
Semi-supervised learning for peptide identification from shotgun proteomics datasets.
Nat. Methods
4
:
923
925
.
48
Steinle
,
A.
,
K.
Falk
,
O.
Rötzschke
,
V.
Gnau
,
S.
Stevanović
,
G.
Jung
,
D. J.
Schendel
,
H.-G.
Rammensee
.
1996
.
Motif of HLA-B*3503 peptide ligands.
Immunogenetics
43
:
105
107
.
49
Schittenhelm
,
R. B.
,
N. L.
Dudek
,
N. P.
Croft
,
S. H.
Ramarathinam
,
A. W.
Purcell
.
2014
.
A comprehensive analysis of constitutive naturally processed and presented HLA-C*04:01 (Cw4)-specific peptides.
Tissue Antigens
83
:
174
179
.
50
Thomsen
,
M. C.
,
M.
Nielsen
.
2012
.
Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.
Nucleic Acids Res.
40
:
W281
W287
.
51
Backert
,
L.
,
O.
Kohlbacher
.
2015
.
Immunoinformatics and epitope prediction in the age of genomic medicine.
Genome Med.
7
:
119
.
52
Braud
,
V. M.
,
D. S.
Allan
,
D.
Wilson
,
A. J.
McMichael
.
1998
.
TAP- and tapasin-dependent HLA-E surface expression correlates with the binding of an MHC class I leader peptide.
Curr. Biol.
8
:
1
10
.
53
Strong
,
R. K.
,
M. A.
Holmes
,
P.
Li
,
L.
Braun
,
N.
Lee
,
D. E.
Geraghty
.
2003
.
HLA-E allelic variants. Correlating differential expression, peptide affinities, crystal structures, and thermal stabilities.
J. Biol. Chem.
278
:
5082
5090
.
54
Huyton
,
T.
,
N.
Ladas
,
H.
Schumacher
,
R.
Blasczyk
,
C.
Bade-Doeding
.
2012
.
Pocketcheck: updating the HLA class I peptide specificity roadmap.
Tissue Antigens
80
:
239
248
.
55
Rammensee, H.G., J. Bachmann, and S. Stevanovic. 1997. MHC Ligands and Peptide Motifs. In Molecular Biology Intelligence Unit. Springer-Verlag, Berlin, Heidelberg.
56
del Guercio
,
M. F.
,
J.
Sidney
,
G.
Hermanson
,
C.
Perez
,
H. M.
Grey
,
R. T.
Kubo
,
A.
Sette
.
1995
.
Binding of a peptide antigen to multiple HLA alleles allows definition of an A2-like supertype.
J. Immunol.
154
:
685
693
.
57
Sette
,
A.
,
J.
Sidney
.
1999
.
Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism.
Immunogenetics
50
:
201
212
.
58
Sidney
,
J.
,
B.
Peters
,
N.
Frahm
,
C.
Brander
,
A.
Sette
.
2008
.
HLA class I supertypes: a revised and updated classification.
BMC Immunol.
9
:
1
.
59
Doytchinova
,
I. A.
,
P.
Guan
,
D. R.
Flower
.
2004
.
Identifying human MHC supertypes using bioinformatic methods.
J. Immunol.
172
:
4314
4323
.
60
Barth
,
S. M.
,
C. M.
Schreitmüller
,
F.
Proehl
,
K.
Oehl
,
L. M.
Lumpp
,
D. J.
Kowalewski
,
M.
Di Marco
,
T.
Sturm
,
L.
Backert
,
H.
Schuster
, et al
.
2016
.
Characterization of the canine MHC class I DLA-88*50101 peptide binding motif as a prerequisite for canine T cell immunotherapy.
PLoS One
11
:
e0167017
.
61
Hillen
,
N.
,
G.
Mester
,
C.
Lemmel
,
A. O.
Weinzierl
,
M.
Müller
,
D.
Wernet
,
J.
Hennenlotter
,
A.
Stenzl
,
H. G.
Rammensee
,
S.
Stevanović
.
2008
.
Essential differences in ligand presentation and T cell epitope recognition among HLA molecules of the HLA-B44 supertype.
Eur. J. Immunol.
38
:
2993
3003
.
62
Cheever
,
M. A.
,
J. P.
Allison
,
A. S.
Ferris
,
O. J.
Finn
,
B. M.
Hastings
,
T. T.
Hecht
,
I.
Mellman
,
S. A.
Prindiville
,
J. L.
Viner
,
L. M.
Weiner
,
L. M.
Matrisian
.
2009
.
The prioritization of cancer antigens: a national cancer institute pilot project for the acceleration of translational research.
Clin. Cancer Res.
15
:
5323
5337
.
63
Nielsen
,
M.
,
C.
Lundegaard
,
P.
Worning
,
S. L.
Lauemøller
,
K.
Lamberth
,
S.
Buus
,
S.
Brunak
,
O.
Lund
.
2003
.
Reliable prediction of T-cell epitopes using neural networks with novel sequence representations.
Protein Sci.
12
:
1007
1017
.
64
Vita
,
R.
,
J. A.
Overton
,
J. A.
Greenbaum
,
J.
Ponomarenko
,
J. D.
Clark
,
J. R.
Cantrell
,
D. K.
Wheeler
,
J. L.
Gabbard
,
D.
Hix
,
A.
Sette
,
B.
Peters
.
2015
.
The immune epitope database (IEDB) 3.0.
Nucleic Acids Res.
43
:
D405
D412
.
65
McCutcheon
,
J. A.
,
J.
Gumperz
,
K. D.
Smith
,
C. T.
Lutz
,
P.
Parham
.
1995
.
Low HLA-C expression at cell surfaces correlates with increased turnover of heavy chain mRNA.
J. Exp. Med.
181
:
2085
2095
.
66
Neisig
,
A.
,
C. J.
Melief
,
J.
Neefjes
.
1998
.
Reduced cell surface expression of HLA-C molecules correlates with restricted peptide binding and stable TAP interaction.
J. Immunol.
160
:
171
179
.
67
Neefjes
,
J. J.
,
H. L.
Ploegh
.
1988
.
Allele and locus-specific differences in cell surface expression and the association of HLA class I heavy chain with beta 2-microglobulin: differential effects of inhibition of glycosylation on class I subunit association.
Eur. J. Immunol.
18
:
801
810
.
68
Setini
,
A.
,
A.
Beretta
,
C.
De Santis
,
R.
Meneveri
,
A.
Martayan
,
M. C.
Mazzilli
,
E.
Appella
,
A. G.
Siccardi
,
P. G.
Natali
,
P.
Giacomini
.
1996
.
Distinctive features of the alpha 1-domain alpha helix of HLA-C heavy chains free of beta 2-microglobulin.
Hum. Immunol.
46
:
69
81
.
69
Schaefer
,
M. R.
,
M.
Williams
,
D. A.
Kulpa
,
P. K.
Blakely
,
A. Q.
Yaffee
,
K. L.
Collins
.
2008
.
A novel trafficking signal within the HLA-C cytoplasmic tail allows regulated expression upon differentiation of macrophages.
J. Immunol.
180
:
7804
7817
.
70
Mandelboim
,
O.
,
H. T.
Reyburn
,
E. G.
Sheu
,
M.
Vales-Gomez
,
D. M.
Davis
,
L.
Pazmany
,
J. L.
Strominger
.
1997
.
The binding site of NK receptors on HLA-C molecules.
Immunity
6
:
341
350
.
71
Boyington
,
J. C.
,
A. G.
Brooks
,
P. D.
Sun
.
2001
.
Structure of killer cell immunoglobulin-like receptors and their recognition of the class I MHC molecules.
Immunol. Rev.
181
:
66
78
.
72
Colonna
,
M.
,
G.
Borsellino
,
M.
Falco
,
G. B.
Ferrara
,
J. L.
Strominger
.
1993
.
HLA-C is the inhibitory ligand that determines dominant resistance to lysis by NK1- and NK2-specific natural killer cells.
Proc. Natl. Acad. Sci. USA
90
:
12000
12004
.
73
Hilton
,
H. G.
,
P.
Parham
.
2017
.
Missing or altered self: human NK cell receptors that recognize HLA-C.
Immunogenetics
69
:
567
579
.
74
Chaux
,
P.
,
R.
Luiten
,
N.
Demotte
,
V.
Vantomme
,
V.
Stroobant
,
C.
Traversari
,
V.
Russo
,
E.
Schultz
,
G. R.
Cornelis
,
T.
Boon
,
P.
van der Bruggen
.
1999
.
Identification of five MAGE-A1 epitopes recognized by cytolytic T lymphocytes obtained by in vitro stimulation with dendritic cells transduced with MAGE-A1.
J. Immunol.
163
:
2928
2936
.
75
Breckpot
,
K.
,
C.
Heirman
,
C.
De Greef
,
P.
van der Bruggen
,
K.
Thielemans
.
2004
.
Identification of new antigenic peptide presented by HLA-Cw7 and encoded by several MAGE genes using dendritic cells transduced with lentiviruses.
J. Immunol.
172
:
2232
2237
.
76
Vantomme
,
V.
,
P.
Boël
,
E.
De Plaen
,
T.
Boon
,
P.
van der Bruggen
.
2003
.
A new tumor-specific antigenic peptide encoded by MAGE-6 is presented to cytolytic T lymphocytes by HLA-Cw16.
Cancer Immun.
3
:
17
.
77
Heidecker
,
L.
,
F.
Brasseur
,
M.
Probst-Kepper
,
M.
Guéguen
,
T.
Boon
,
B. J.
Van den Eynde
.
2000
.
Cytolytic T lymphocytes raised against a human bladder carcinoma recognize an antigen encoded by gene MAGE-A12.
J. Immunol.
164
:
6041
6045
.
78
Boël
,
P.
,
C.
Wildmann
,
M. L.
Sensi
,
R.
Brasseur
,
J. C.
Renauld
,
P.
Coulie
,
T.
Boon
,
P.
van der Bruggen
.
1995
.
BAGE: a new gene encoding an antigen recognized on human melanomas by cytolytic T lymphocytes.
Immunity
2
:
167
175
.
79
Van den Eynde
,
B.
,
O.
Peeters
,
O.
De Backer
,
B.
Gaugler
,
S.
Lucas
,
T.
Boon
.
1995
.
A new family of genes coding for an antigen recognized by autologous cytolytic T lymphocytes on a human melanoma.
J. Exp. Med.
182
:
689
698
.
80
Gnjatic
,
S.
,
Y.
Nagata
,
E.
Jager
,
E.
Stockert
,
S.
Shankara
,
B. L.
Roberts
,
G. P.
Mazzara
,
S. Y.
Lee
,
P. R.
Dunbar
,
B.
Dupont
, et al
.
2000
.
Strategy for monitoring T cell responses to NY-ESO-1 in patients with any HLA class I allele.
Proc. Natl. Acad. Sci. USA
97
:
10917
10922
.
81
Castelli
,
C.
,
P.
Tarsini
,
A.
Mazzocchi
,
F.
Rini
,
L.
Rivoltini
,
F.
Ravagnani
,
F.
Gallino
,
F.
Belli
,
G.
Parmiani
.
1999
.
Novel HLA-Cw8-restricted T cell epitopes derived from tyrosinase-related protein-2 and gp100 melanoma antigens.
J. Immunol.
162
:
1739
1748
.
82
Friedman
,
R. S.
,
A. G.
Spies
,
M.
Kalos
.
2004
.
Identification of naturally processed CD8 T cell epitopes from prostein, a prostate tissue-specific vaccine candidate.
Eur. J. Immunol.
34
:
1091
1101
.
83
Redchenko
,
I.
,
R.
Harrop
,
M. G.
Ryan
,
R. E.
Hawkins
,
M. W.
Carroll
.
2006
.
Identification of a major histocompatibility complex class I-restricted T-cell epitope in the tumour-associated antigen, 5T4.
Immunology
118
:
50
57
.
84
Nagata
,
Y.
,
T.
Hanagiri
,
M.
Takenoyama
,
T.
Fukuyama
,
M.
Mizukami
,
T.
So
,
Y.
Ichiki
,
M.
Sugaya
,
K.
Sugio
,
K.
Yasumoto
.
2005
.
Identification of the HLA-Cw*0702-restricted tumor-associated antigen recognized by a CTL clone from a lung cancer patient.
Clin. Cancer Res.
11
:
5265
5272
.
85
Tran
,
E.
,
M.
Ahmadzadeh
,
Y. C.
Lu
,
A.
Gros
,
S.
Turcotte
,
P. F.
Robbins
,
J. J.
Gartner
,
Z.
Zheng
,
Y. F.
Li
,
S.
Ray
, et al
.
2015
.
Immunogenicity of somatic mutations in human gastrointestinal cancers.
Science
350
:
1387
1390
.
86
Chiari
,
R.
,
F.
Foury
,
E.
De Plaen
,
J. F.
Baurain
,
J.
Thonnard
,
P. G.
Coulie
.
1999
.
Two antigens recognized by autologous cytolytic T lymphocytes on a melanoma result from a single point mutation in an essential housekeeping gene.
Cancer Res.
59
:
5785
5792
.
87
Heinzel
,
A. S.
,
J. E.
Grotzke
,
R. A.
Lines
,
D. A.
Lewinsohn
,
A. L.
McNabb
,
D. N.
Streblow
,
V. M.
Braud
,
H. J.
Grieser
,
J. T.
Belisle
,
D. M.
Lewinsohn
.
2002
.
HLA-E-dependent presentation of Mtb-derived antigen to human CD8+ T cells.
J. Exp. Med.
196
:
1473
1481
.
88
Salerno-Gonçalves
,
R.
,
M.
Fernandez-Viña
,
D. M.
Lewinsohn
,
M. B.
Sztein
.
2004
.
Identification of a human HLA-E-restricted CD8+ T cell subset in volunteers immunized with Salmonella enterica serovar Typhi strain Ty21a typhoid vaccine.
J. Immunol.
173
:
5852
5862
.
89
Romagnani
,
C.
,
G.
Pietra
,
M.
Falco
,
P.
Mazzarino
,
L.
Moretta
,
M. C.
Mingari
.
2004
.
HLA-E-restricted recognition of human cytomegalovirus by a subset of cytolytic T lymphocytes.
Hum. Immunol.
65
:
437
445
.
90
García
,
P.
,
M.
Llano
,
A. B.
de Heredia
,
C. B.
Willberg
,
E.
Caparrós
,
P.
Aparicio
,
V. M.
Braud
,
M.
López-Botet
.
2002
.
Human T cell receptor-mediated recognition of HLA-E.
Eur. J. Immunol.
32
:
936
944
.
91
Housseau
,
F.
,
R. K.
Bright
,
T.
Simonis
,
M. I.
Nishimura
,
S. L.
Topalian
.
1999
.
Recognition of a shared human prostate cancer-associated antigen by nonclassical MHC-restricted CD8+ T cells.
J. Immunol.
163
:
6330
6337
.
92
Lampen
,
M. H.
,
C.
Hassan
,
M.
Sluijter
,
A.
Geluk
,
K.
Dijkman
,
J. M.
Tjon
,
A. H.
de Ru
,
S. H.
van der Burg
,
P. A.
van Veelen
,
T.
van Hall
.
2013
.
Alternative peptide repertoire of HLA-E reveals a binding motif that is strikingly similar to HLA-A2.
Mol. Immunol.
53
:
126
131
.

H.-G.R. is a shareholder of Immatics Biotechnologies GmbH (Tübingen, Germany) and CureVac GmbH (Tübingen, Germany). H.S. is an employee of Immatics Biotechnologies GmbH. The authors declare that Immatics did not provide financial or scientific support in any direct relation to this manuscript or the underlying studies and was not involved in data collection, analysis, or decision to publish. The other authors have no financial conflicts of interest.

Supplementary data