Abstract
The majority of >2000 HLA class I molecules can be clustered according to overlapping peptide binding specificities or motifs recognized by CD8+ T cells. HLA class I motifs are classified based on the specificity of residues located in the P2 and the C-terminal positions of the peptide. However, it has been suggested that other positions might be relevant for peptide binding to HLA class I molecules and therefore be used for further characterization of HLA class I motifs. In this study we performed large-scale sequencing of endogenous peptides eluted from K562 cells (HLA class I null) made to express a single HLA molecule from HLA-B*3501, -B*3502, -B*3503, -B*3504, -B*3506, or -B*3508. Using sequence data from >1,000 peptides, we characterized novel peptide motifs that include dominant anchor residues extending to all positions in the peptide. The length distribution of HLA-B35-bound peptides included peptides of up to 15 residues. Remarkably, we determined that some peptides longer than 11 residues represented N-terminal-extended peptides containing an appropriate HLA-B35 peptide motif. These results provide evidence for the occurrence of endogenous N-terminal-extended peptide-HLA class I configurations. In addition, these results expand the knowledge about the identity of anchor positions in HLA class I-associated peptides that can be used for characterization of HLA class I motifs.
The HLA class I loci encode molecules that are present on the surface of all nucleated cells, where they bind and present peptides derived from the cytosol to circulating CD8+ T cells. HLA class I molecules are heterodimers of a H chain type I integral membrane glycoprotein and the soluble β2-microglobulin protein (1, 2). The extracellular region of the H chain folds into three domains (α1, α2, and α3), with β2-microglobulin contributing a fourth domain. The α1 and α2 domains of the H chain form the peptide binding site, a groove on the upper surface of the HLA class I molecule that binds peptides generally between eight and 11 aa long. HLA class I peptide binding specificity or motif is conferred by pockets in the peptide binding groove that accommodate the preferred side chains of anchor residues, often at the second (P2) position and the C-terminal positions in the peptide (3).
Precursor peptides 8–25 residues long are generated in the cytoplasm by the proteasome and delivered to the endoplasmic reticulum (ER), where they bind to assembling HLA class I molecules (4, 5). Proteasomes are thought to generate the final C-terminal end of HLA class I binding peptides, but further trimming of the N terminus is required for most peptides (6). The ER aminopeptidase associated with Ag processing (ERAAP; human ortholog, ERAP1) catalyzes sequential cleavage of N-terminal residues (7, 8, 9). Bulging in the middle and C-terminal extensions are well-documented mechanisms for binding peptides longer than 11 residues (10, 11, 12). N-terminal-extended peptide-HLA class I configurations appear to be transient and difficult to observe due to the decreased stability of ERAAP-untrimmed peptide-MHC I complexes (13). However, a recent study has reported the presence of nested sets of N-terminal-extended peptides derived from a single HIV protein transfected into a mouse cell line (14). To date, there is no evidence for N-terminal-extended peptide-HLA class I configurations of endogenous peptides.
HLA class I molecules are extremely polymorphic. As of April of 2008, there were >2,000 different HLA class I molecules encoded by all HLA class I loci (15) (www.ebi.ac.uk/imgt/hla). Despite this large number, HLA class I molecules can be grouped into clusters or supertypes (reviewed in Refs. 16 and 17). HLA class I supertypes are sets of HLA class I molecules that share overlapping peptide binding specificities at the P2 and C-terminal positions in the peptide. The concept of HLA class I supertypes has been useful to identify common HLA-restricted CD8+ T cell epitopes from a variety of infectious diseases (18, 19, 20, 21) and cancer (22, 23), as well as providing a principle for predicting peptide candidates with common HLA class I binding characteristics (24, 25, 26, 27).
The majority of HLA class I peptide binding motifs, the basis for the supertype classification, have been defined from the data of a few crystal structures of peptide-HLA class I complexes or from pool sequencing of a limited number of endogenously bound peptides (28, 29). Given this limitation in data, it is possible that other positions in the peptide, besides the P2 and the C-terminal positions, can represent characteristic anchor residues that are accommodated in pockets of the peptide binding groove.
In this study, we report the sequence of >1,000 endogenously bound peptides eluted from six closely related HLA-B35 molecules. We characterized novel peptide motifs that include anchor residues extending to all positions in the peptide. Remarkably, we identified several long peptides, some of which represent N-terminal-extended versions of canonical peptides containing an appropriate HLA-B35 peptide motif.
Materials and Methods
Cell lines
The human myelogenous leukemia cell line K562 and human embryonic kidney 293 cells were obtained from the American Type Culture Collection (ATCC). The EBV-transformed homozygous human B cell lines WT100BIS (HLA-B*3501), JO528239 (HLA-B*3502), KOSE (HLA-B*3503), and TISI (HLA-B*3508) from the Tenth International Histocompatibility Workshop were obtained from the American Society for Histocompatibility and Immunogenetics Cell Repository (Minneapolis, MN).
Abs and Western blot analysis
The mouse mAb w6/32 (ATCC) recognizes an epitope of native HLA-A, -B, and -Cw Ags. The Ab CSA630 (30) (Stressgen Bioreagents) is a rabbit polyclonal Ab raised against human tapasin. The mouse mAbs Map.ERp57 (31) and FMC75 (32) (Santa Cruz Biotechnology) are directed against recombinant human ERp57 and calreticulin proteins, respectively. The mouse mAb 148.3 (31) (a gift from Dr. P. Cresswell, Yale University, New Haven, CT) is directed against the C terminus of TAP1. Abs N20 and S19 (Santa Cruz Biotechnology) are polyclonal Abs raised against human HLA-B H chain and ERAP1 proteins, respectively. For immunoblot analysis, a total of 1 × 10⁶ K562 cells were lysed in 1 ml of buffer containing 20 mM Tris (pH 7.4), 150 mM NaCl, and 1% Nonidet P-40 (calreticulin and ERp57 blots) or 1% digitonin (TAP1, tapasin, and ERAP1 blots). SDS-PAGE and Western blotting were performed using standard methods.
Generation of K562 cell lines expressing a single HLA class I molecule
K562 cells (HLA class I null) expressing a single HLA class I allele were generated by retroviral-mediated gene transfer as described previously (33). Briefly, cDNAs encoding each HLA class I molecule were obtained from EBV-transformed homozygous human B cell lines expressing the appropriate HLA class I molecule. For HLA-B*3504 and HLA-B*3506 cDNA production, mutagenic primers were used to create isogenic molecules containing the codons specifying the sequence of each allele. Each cDNA was then cloned into the retroviral vector pLPCX (Clontech). 293 T cells were plated in 60-mm plates and the next day each retroviral vector containing the cDNA of interest was transfected by calcium phosphate precipitation. Viral supernatants were collected 48 h later and used to infect K562 cells. K562 cell lines expressing a single HLA class I molecule were selected using 2 mg/ml puromycin and verified by flow cytometry with the w6/32 mAb. Mock-transduced K562 cells were generated by transduction with an empty retroviral pLPCX vector and used as a negative control throughout the study.
Isolation of HLA class I-bound peptides
The procedure used for the isolation of detergent-solubilized HLA class I molecules was modified from the one described by Purcell (34). Initially, immunoaffinity columns were constructed using protein A-Sepharose (GE Healthcare) covalently coupled with a purified HLA-class I mAb (w6/32, IgG2a) as described by Gorga et al. with some minor modifications (35). Briefly, w6/32 mAb (12.5 mg/ml) was added to 5 ml of protein A-Sepharose beads and mixed by rocking overnight. Following incubation, the beads were washed with 100 mM sodium borate (pH 8.2) and resuspended in 10 volumes of 200 mM triethanolamine (pH 8.2). The protein A-Sepharose beads were then resuspended in 20 volumes of 20 mM dimethylpimelimidate in 200 mM triethanolamine (pH 8.3). The coupling reaction was allowed to proceed for 60 min. The reaction was stopped with 10 volumes of 20 mM ethanolamine (pH 8.2) and incubation for 2 h at room temperature. Noncoupled mAb was eluted with 5 ml of glycine-HCl at 0.2 M (pH 2.7). Columns were stored in PBS with 0.02% sodium azide.
K562 wild-type and K562 cell lines expressing surface HLA class I molecules were grown in roller bottles. Batches of 1 × 10⁹ cells were harvested by centrifugation. Cell pellets were washed twice in PBS (pH 7.4) at 4°C and then lysed in buffer containing 0.5% Nonidet P-40, 500 mM Tris-HCl (pH 8.0), 150 mM NaCl, and protease inhibitors. Cell lysates were clarified by several rounds of centrifugation, and the supernatant was passed over a Tris-blocked Sepharose column. The precleared lysates were mixed with 1 ml of Sepharose beads covalently linked to anti-HLA class I mAb w6/32 as described above. Separate columns were used for each K562 transfectant. After 60 min of gentle mixing, the beads were washed with 20 bead volumes of several buffers in the following order: 0.005% Nonidet P-40, 50 mM Tris-HCl (pH 8.0), 150 mM NaCl, 5 mM EDTA; 50 mM Tris-HCl (pH 8.0), 150 mM NaCl; 50 mM Tris-HCl (pH 8.0), 450 mM NaCl; and 50 mM Tris-HCl (pH 8.0). Subsequently, the HLA-peptide complexes were eluted with 6 ml of 10% acetic acid. HLA molecules were separated from peptides by ultrafiltration with a 3-kDa cutoff. Peptide pools were separated into ∼40 fractions of 250 μl each using an HPLC system (Beckman Coulter). Individual HPLC fractions were concentrated to a final volume of 10 μl before mass spectrometry analysis.
Mass spectrometric peptide sequencing and analysis
Briefly, 1–2 μl of each HPLC fraction was analyzed by tandem mass spectrometry (Agilent 6510 quadrupole time-of-flight (Q-TOF) instrument with Chip Cube electrospray ionization; Agilent Technologies). The samples were injected using nanospray protein chip no.1 (40-nl trap, 75 μm × 43 mm, C-18SB-ZX chip, 5-μm particles) at a flow rate of 400 nl/min. Data acquisition was done using MassHunter (version B.01.03) in a 2-GHz extended dynamic range at a rate of three scans per second followed by data-dependent tandem mass spectrometric fragment scans of the three most intense ions. Precursor ion exclusion was set for 12 s after two consecutive tandem mass spectrometric scans. Before each experiment, the Q-TOF analyzer was tuned to a resolution of >12,000, and mass accuracy was calibrated to <2 ppm. Acquired tandem mass spectra were searched with no enzyme specificity using Spectrum Mill (Agilent Technologies) against the UniProt human FASTA protein database (August 2007 download). Raw peptide data files generated were converted into Excel format (Microsoft) and sorted according to their corresponding mass-to-charge ratio (m/z) values, charge state, retention time, and intensity. A user-defined intensity threshold (7.0) above the background noise was fixed to limit false-positive identifications. All identified peptide sequences above this score were manually verified. In addition, peptides found in the fractions from K562 wild-type lysates were considered contaminants and subtracted from the final list of peptides.
Determination of HLA class I motifs
To determine the HLA motif for each HLA class I molecule, we used the method proposed by Rammensee et al., with some minor modifications (29). Briefly, amino acid frequency was calculated at each position of the peptide from P1 through P7 starting at the N-terminal P1 position. For this particular study, we grouped positions P8 through P11 or longer into a single position termed PΩ to account for variation in length of class I peptides. Thus, PΩ position reflects the C-terminal residue in all peptides. To visualize the characteristics of the different binding motifs, we used the logo program (36). The information content at each position in the sequence motif corresponds to the height of a column of letters. The height of each letter within the columns is proportional to frequency of the corresponding amino acid at that position.
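The position-wise tabulation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `position_frequencies`, the `N_FIXED` constant, and the decision to skip peptides shorter than eight residues are assumptions made for the example. Positions P1 through P7 are counted individually, and the C-terminal residue of every peptide is counted under PΩ regardless of peptide length.

```python
from collections import Counter

N_FIXED = 7  # P1..P7 are tabulated individually (as in the text)


def position_frequencies(peptides):
    """Return {position label: {residue: frequency}} for P1..P7 and PΩ.

    Hypothetical helper: peptides shorter than N_FIXED + 1 residues are
    skipped, which is an assumption of this sketch.
    """
    labels = [f"P{i}" for i in range(1, N_FIXED + 1)] + ["PΩ"]
    counts = {label: Counter() for label in labels}
    for pep in peptides:
        if len(pep) < N_FIXED + 1:
            continue
        for i in range(N_FIXED):
            counts[f"P{i + 1}"][pep[i]] += 1
        counts["PΩ"][pep[-1]] += 1  # C-terminal residue, whatever the length
    freqs = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        freqs[label] = {aa: n / total for aa, n in counter.items()} if total else {}
    return freqs


# The five representative HLA-B*3501 peptides listed in Table I.
b3501_examples = ["MPLEDMNEF", "FPEELTQTF", "MAWLVDHVY", "MPADTNKAF", "FPYDYSASEY"]
example_freqs = position_frequencies(b3501_examples)
```

With only five peptides the resulting frequencies are illustrative; the motifs reported in this study were derived from the complete peptide lists.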
Results
Construction of cells expressing single HLA class I molecules
To determine the peptide binding motifs of HLA class I molecules, we constructed retroviral vectors encoding the sequence of several HLA class I H chains and then created single HLA class I-expressing cells using retroviral-mediated transfer of each HLA class I coding gene into HLA class I-null K562 cells. We confirmed by FACS analysis (Fig. 1A) and Western blot analysis of the HLA-B H chain (Fig. 1B) that the HLA class I molecules were successfully transduced into and expressed in K562 cells.
Expression of HLA-B molecules in retrovirally transduced K562 (K) cells. A, Flow cytometric analysis of HLA-B allelic proteins expressed on mock-transduced K562 (HLA class I null) before and after retroviral-mediated gene transfer as indicated in the figure. Cells were stained with an anti-HLA-B w6/32 mAb, demonstrating that the mock-transduced K562 cells did not express HLA-B whereas the transduced cells express surface HLA-B35 molecules. B, Western blot analysis of HLA-B expression. Approximately 1 × 10⁶ mock-transduced K562 cells and transfectants were lysed. Proteins were separated by SDS-PAGE and blotted with an anti-HLA-B N20 polyclonal Ab that detects a 45-kDa band of the HLA-B H chain.
K562 cells are competent for endogenous HLA class I peptide loading
The human cell line K562 lacks HLA class I and II expression on its cell surfaces (37). K562 transfectants expressing a single HLA class I molecule have been used as APCs for CD8+ T lymphocytes and allorecognition studies (38, 39). However, to date there are no data confirming that K562 cells express all of the minimum components involved in peptide loading of HLA class I molecules. Using Western blot analyses of K562 cell lysates, we examined the expression of several components of the MHC I peptide-loading complex including TAP, tapasin, calreticulin, ERp57, and ERAP1. As shown in Fig. 2, the expression of each of these proteins was confirmed in K562 cells. We conclude that K562 transfectants expressing a single HLA class I molecule can be used to determine HLA peptide binding motifs.
Schematic representation of the MHC I peptide-loading complex in K562 cells. The MHC I H chain and β2-microglobulin (B2M) molecules are folded with the assistance of the chaperones calnexin (not shown) and calreticulin (CRT). The multisubunit peptide-loading complex is centered on the adaptor molecule tapasin, which interacts with TAP, the MHC I molecule, and the oxidoreductase ERp57. ERAP1, the ER aminopeptidase associated with Ag processing, is responsible for trimming the N-terminal extensions of precursor peptides. Western blot analyses of the proteins depicted in the figure were performed in mock-transduced K562 (K) cell lysates. Proteins were separated by SDS-PAGE and blotted with the respective Abs as described in Materials and Methods. Cell lysates from T2 cells (TAP null) and 721.220 (.220) cells (tapasin null) were used as negative controls for TAP1 and tapasin blots, respectively. Spleen cell lysates from a C57BL/6 (B6) mouse were used as a negative control for ERp57, CRT, and ERAP1 blots.
Validation of the experimental approach for the identification of HLA class I-associated peptides
Immunoaffinity purification of MHC-peptide complexes from detergent-solubilized cell lysates is the gold standard method for identifying HLA class I-associated peptides. To test the validity of our experimental design and mass spectrometric parameters, we conducted a series of validation steps using peptide eluates from immunoaffinity purification of HLA-B*3501-peptide complexes. We then applied these parameters for the analysis of the remaining transfectants.
Samples obtained by immunoaffinity purification of peptides from 1 × 10⁹ HLA-B*3501 transfectants were analyzed using an Agilent 6510 Q-TOF (see Materials and Methods for more details). As a first step in the process of peptide data validation, all tandem mass spectrometric fragmentation spectra were sorted according to their intensity. All identified peptide sequences below a defined intensity threshold of 7.0 were discarded. All peptide sequences above this threshold were manually verified. We then discarded peptide signals that were recovered from wild-type K562 preparations to discriminate HLA-B*3501-associated peptides from contaminant peptides. We note that the length distribution of contaminant peptides was variable, ranging from five to 23 aa, and included several peptides from hemoglobin, keratin, and heat shock proteins. Of 543 peptide signals that were reproducibly detected across three biological replicates, 248 unique peptides were detected within the HLA-B*3501 transfectants.
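The two automated filtering steps in this paragraph (the intensity threshold and the subtraction of peptides recovered from wild-type K562 cells) can be expressed compactly. The following is a sketch under stated assumptions: the function name `filter_identifications`, the `(sequence, score)` input layout, and the example scores are hypothetical, and the manual verification step is not modeled.

```python
INTENSITY_THRESHOLD = 7.0  # threshold value stated in the text


def filter_identifications(hits, control_sequences, threshold=INTENSITY_THRESHOLD):
    """Keep unique sequences at or above the threshold that are absent from the control.

    hits: iterable of (sequence, score) pairs (hypothetical input layout).
    control_sequences: sequences recovered from wild-type K562 preparations.
    """
    control = set(control_sequences)
    kept, seen = [], set()
    for sequence, score in hits:
        if score < threshold:
            continue  # below the intensity threshold: discarded
        if sequence in control or sequence in seen:
            continue  # contaminant or duplicate signal
        seen.add(sequence)
        kept.append(sequence)
    return kept


# Made-up scores and a placeholder contaminant, for illustration only.
example_hits = [("MPLEDMNEF", 9.1), ("FPEELTQTF", 6.5),
                ("AAAAAAAAA", 12.0), ("MPLEDMNEF", 8.0)]
example_kept = filter_identifications(example_hits, {"AAAAAAAAA"})
```

In this toy input only `MPLEDMNEF` survives: one signal falls below the threshold, one matches the control, and one is a repeat detection of a sequence already kept.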
Reproducibility of the tandem mass spectrometric data is shown in Fig. 1 of the supplemental section. Mass accuracy of the top 500 peptides identified from the HLA-B*3501 experiments was excellent, with an average error of ±2.3 ppm. Reproducibility of peptide retention time during replicate analysis was on average within 0.08 min. Peak intensity also demonstrated good reproducibility among analytical replicates with an average SD of signal of <10,000 counts.
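For readers unfamiliar with the unit, mass error in parts per million is the relative deviation of the observed m/z from the theoretical m/z, scaled by 10⁶. The short sketch below shows the calculation; the function names and the example m/z values are hypothetical and are not taken from the study data.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mean_absolute_ppm(pairs):
    """Average absolute ppm error over (observed, theoretical) m/z pairs."""
    errors = [abs(ppm_error(obs, theo)) for obs, theo in pairs]
    return sum(errors) / len(errors)


# Hypothetical values: a 0.001 m/z deviation at m/z 500 is about 2 ppm.
example_error = ppm_error(500.001, 500.000)
```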
In the experimental design of this study, we used K562 wild-type cells as a negative control to discriminate HLA class I-associated peptides from contaminant peptides. However, the use of an HLA class I-deficient negative control could be insufficient to rule out contaminant peptides. We therefore investigated whether bioinformatics tools could be used to assess the validity of our method to identify HLA class I-associated peptides. For this, we classified all HLA-B*3501-associated peptides as a function of their estimated MHC binding score by using three separate peptide binding prediction methods, including the artificial neural network (40), the stabilized matrix method (41), and average relative binding (42). These algorithms are validated bioinformatics tools that predict the binding of peptides to a number of different MHC molecules using large in vitro peptide binding data sets. Prediction values are given as nanomolar IC50 values, which represent the equilibrium dissociation constant (KD) of the peptide in relation to a particular HLA molecule. The calculated KD value classifies peptides as binders or nonbinders over a large range of values (0.1–100,000 nM) (Fig. 3). The analysis revealed that >99% of the HLA-B*3501-associated peptides scored below the nonbinding threshold of 50,000 nM using all three methods. The median KD for the HLA-B*3501-associated peptides was 61.25 nM (25th percentile = 18; 75th percentile = 1013) for the artificial neural network algorithm, 89.13 nM (25th percentile = 33.43; 75th percentile = 427.4) for the stabilized matrix method algorithm, and 50 nM (25th percentile = 10.89; 75th percentile = 490) for the average relative binding method. We note that differences in the data sets used to generate these prediction methods are responsible for the observed minor differences in the distributions of the predicted binding data of HLA-B*3501-associated peptides between methods.
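The summary statistics reported for each prediction method (median, quartiles, and the fraction of peptides below the nonbinding threshold) can be reproduced from a list of predicted KD values as sketched below. The function name and the example values are hypothetical, and the quartile interpolation rule (`method="inclusive"`) is an assumption, because the text does not state which percentile definition was used.

```python
import statistics

NONBINDER_THRESHOLD_NM = 50_000  # nonbinding threshold quoted in the text


def summarize_predictions(kd_values_nm):
    """Median, quartiles, and fraction of predicted KD values below the threshold."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(kd_values_nm, n=4, method="inclusive")
    below = sum(1 for kd in kd_values_nm if kd < NONBINDER_THRESHOLD_NM)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "fraction_below_threshold": below / len(kd_values_nm),
    }


# Made-up predicted KD values (nM), for illustration only.
example_summary = summarize_predictions([10, 20, 30, 40, 60_000])
```

Running the same function once per prediction method would give one summary per algorithm, comparable to the three sets of values reported above.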
Thus, we conclude that the analysis using these bioinformatics tools further validates our study design and mass spectrometric parameters for the identification of HLA class I-associated peptides. Given that the neural network method has been shown to perform better than the matrix-based predictions (43), we decided to use the artificial neural network algorithm for the validation of subsequent experiments. Specifically, the 90th percentile obtained with the artificial neural network algorithm (KD < 19,680 nM for HLA-B*3501-associated peptides) was used as the cutoff for peptide identification. This threshold corresponds to a conservative 10% false-negative rate, that is, a 10% chance that an HLA class I-associated peptide is wrongly classified as a contaminant. The 90th percentile was calculated for each HLA class I transfectant in all subsequent experiments and used as the cutoff for identifying specific HLA class I-associated peptides.
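The percentile-based cutoff can be sketched as follows. This is an illustration only: the helper names, the input layout (a mapping from peptide sequence to predicted KD), and the nearest-rank percentile definition are assumptions, since the text does not specify how the 90th percentile was computed.

```python
def percentile_cutoff(kd_values_nm, percentile=90):
    """Nearest-rank percentile of the predicted KD values (assumed definition)."""
    ordered = sorted(kd_values_nm)
    # Integer ceiling of percentile/100 * n avoids floating-point rounding issues.
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def accept_peptides(predictions, percentile=90):
    """Return (cutoff, accepted sequences) for one transfectant.

    predictions: {sequence: predicted KD in nM} (hypothetical input layout).
    """
    cutoff = percentile_cutoff(list(predictions.values()), percentile)
    accepted = [seq for seq, kd in predictions.items() if kd <= cutoff]
    return cutoff, accepted


# Ten placeholder peptides with made-up KD values of 100, 200, ..., 1000 nM.
example_predictions = {f"PEPTIDE{i}": 100 * i for i in range(1, 11)}
example_cutoff, example_accepted = accept_peptides(example_predictions)
```

Because the cutoff is recomputed from each transfectant's own predictions, the absolute KD value of the threshold differs between HLA-B35 molecules while the retained fraction stays near 90%.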
Validation of the method for identifying HLA I-associated peptides using bioinformatics tools. For 248 peptides eluted from K562 transfectants expressing HLA-B*3501, the y-axis shows computed MHC binding scores determined with the three separate computational methods shown in the x-axis, including artificial neural network (ann), average relative binding (arb), and stabilized matrix method (smm). Bars represent the median and the 25th and 75th percentiles of computed binding data for each method.
HLA-B35-bound peptides include long N-terminal-extended peptide motifs
We processed batches of 1 × 10⁹ cells from each K562 cell line expressing a single HLA-B35 molecule. Using the process described in the previous section, we obtained a total of 216 unique peptides from HLA-B*3501, 193 peptides from HLA-B*3502, 90 peptides from HLA-B*3503, 169 peptides from HLA-B*3504, 97 peptides from HLA-B*3506, and 158 peptides from HLA-B*3508. Table I shows representative peptide sequences from each HLA-B35 molecule analyzed. Complete peptide lists are presented in the supplemental data section.
Representative list of peptides eluted from HLA-B35 molecules
Peptide Sequence | Protein Identifier | Protein Name |
---|---|---|
HLA-B*3501 | ||
MPLEDMNEF | RFA2 | Replication protein A 32-kDa subunit |
FPEELTQTF | EF1G | Elongation factor 1-γ |
MAWLVDHVY | 2AAA | Serine/threonine-protein phosphatase 2A, 65 kDa |
MPADTNKAF | TM111 | Transmembrane protein 111 |
FPYDYSASEY | ACOD | Acyl-CoA desaturase |
HLA-B*3502 | ||
LPQYRDAVM | DEF | Digestive organ expansion factor homolog |
MPTKETTKL | BUB1B | Mitotic checkpoint serine/threonine-protein kinase BUB1 β |
FPVEVNTVL | ANM5 | Human protein arginine N-methyltransferase 5 |
HPESERISM | SPTB2 | Spectrin β-chain, brain 1 |
MPANGETVTL | INT11 | Integrator complex subunit 11 |
HLA-B*3503 | ||
IPAEGRVAL | GRHPR | Glyoxylate reductase/hydroxypyruvate reductase |
LPDERTISL | CSTF1 | Cleavage stimulation factor 50-kDa subunit |
YPGQPHPAL | HES1 | Transcription factor HES-1 |
SPNYDHVVL | IF32 | Eukaryotic translation initiation factor 3, subunit 2 |
LPTEKEVAL | MAP4 | Microtubule-associated protein 4 |
HLA-B*3504 | ||
LPDERTISL | CSTF1 | Cleavage stimulation factor 50-kDa subunit |
SPNYDHVVL | IF32 | Eukaryotic translation initiation factor 3, subunit 2 |
LPVQPENAL | TOIP1 | Torsin-1A-interacting protein 1 |
MPEPTVLSL | TOM7 | Probable mitochondrial import receptor subunit TOM7 homolog |
APGEQTVPAL | SPT16 | FACT complex subunit SPT16 |
HLA-B*3506 | ||
LPDERTISL | CSTF1 | Cleavage stimulation factor 50-kDa subunit |
SPSKNYILSV | GSLG1 | Golgi apparatus protein 1 precursor |
SPQAPTHFL | HINT1 | Histidine triad nucleotide-binding protein 1 |
APEEHPVLL | ACTB | Actin, cytoplasmic 1 |
FPDTPLAL | ILF3 | Interleukin enhancer-binding factor 3 |
HLA-B*3508 | ||
FPSSNVHVY | DJB12 | DnaJ homolog subfamily B member 12 |
MPAPEIVSY | ZBT43 | Zinc finger and BTB domain-containing protein 43 |
YPVPDVSTY | RBM10 | RNA-binding protein 10 |
FPEELTQTF | EF1G | Elongation factor 1-γ |
MPQVAPDLY | CF211 | UPF0364 protein C6orf211 |
HLA class I molecules bind primarily to peptides 8–11 aa long. Consistent with this idea, we found that the majority of HLA-B35-associated peptides were between eight and 11 residues long (Fig. 4). Specifically, peptides nine residues in length constituted ∼70% of all peptides eluted. Peptides longer than 11 residues have rarely been identified during immunoaffinity purification of MHC-peptide complexes. Bulging and C-terminal extension are documented mechanisms for binding longer peptides (10, 11, 12, 44). In this study, we identified several peptides longer than 11 residues in all HLA-B35 molecules (Table II). Remarkably, the analysis of the sequence of these peptides revealed that many of them contained an appropriate peptide motif that was preceded by several residues at the N-terminal end. We note that N-terminal-extended peptides were identified in association with HLA-B*3502, -B*3503, -B*3504, and -B*3506 molecules. In contrast, all long peptides identified in association with HLA-B*3501 and -B*3508 molecules had a proline and an aromatic residue in the P2 and C-terminal positions, respectively, both known to be the anchor residues in peptides associated with HLA-B*3501 (45) and HLA-B*3508 molecules (results from this study).
Length distribution of peptides eluted from HLA-B35 molecules. Peptides longer than 11 residues were clustered in a single group.
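The binning used for this length distribution, with all peptides longer than 11 residues pooled into one group, can be sketched as follows. The function name `length_distribution` and the four-peptide example are illustrative assumptions; the example sequences are taken from Tables I and II.

```python
from collections import Counter


def length_distribution(peptides, max_len=11):
    """Fraction of peptides at each length, pooling lengths above max_len."""
    counts = Counter()
    for pep in peptides:
        key = str(len(pep)) if len(pep) <= max_len else f">{max_len}"
        counts[key] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# One 8-mer, two 9-mers, and one 12-mer from the tables in this study.
example_lengths = length_distribution(
    ["FPDTPLAL", "MPLEDMNEF", "FPEELTQTF", "RPSSTSSASALY"]
)
```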
List of peptides longer than 11 residues eluted from HLA-B35 molecules
Peptide Sequence | Peptide Length | Protein Identifier | Protein Name |
---|---|---|---|
HLA-B*3501 | |||
RPSSTSSASALY | 12 | UBP4 | Ubiquitin carboxyl-terminal hydrolase 4 |
APIKVGDAIPAVEVF | 15 | PRDX5 | Peroxiredoxin-5, mitochondrial precursor |
HLA-B*3502 | |||
AVDGEPLGRVSF | 12 | PPIA | Peptidyl-prolyl cis-trans isomerase A |
EEEDVPGQAKDEL | 13 | CALR | Calreticulin precursor |
RGEVAPDAKSFVL | 13 | LEG1 | Galectin-1 |
HLA-B*3503 | |||
AVDGEPLGRVSF | 12 | PPIA | Peptidyl-prolyl cis-trans isomerase A |
VETRSAGQGEVL | 12 | FLNA | Filamin-A |
EEEDVPGQAKDEL | 13 | CALR | Calreticulin precursor |
HLA-B*3504 | |||
EEDVPGQAKDEL | 12 | CALR | Calreticulin precursor |
AVDGEPLGRVSF | 12 | PPIA | Peptidyl-prolyl cis-trans isomerase A |
SALAPGVRAVEL | 12 | ZN313 | Zinc finger protein 313 |
EEEDVPGQAKDEL | 13 | CALR | Calreticulin precursor |
GGSAVISLEGKPL | 13 | COF1 | Cofilin-1 |
RGEVAPDAKSFVL | 13 | LEG1 | Galectin-1 |
QEAGILSAEELQRL | 14 | PLEC1 | Plectin-1 |
HLA-B*3506 | |||
VETRSAGQGEVL | 12 | FLNA | Filamin-A |
AVDGEPLGRVSF | 12 | PPIA | Peptidyl-prolyl cis-trans isomerase A |
SALAPGVRAVEL | 12 | ZN313 | Zinc finger protein 313 |
GGSAVISLEGKPL | 13 | COF1 | Cofilin-1 |
ADKVPKTAENFRAL | 14 | PPIA | Peptidyl-prolyl cis-trans isomerase A |
HLA-B*3508 | |||
AVAPTSGPALGGLF | 14 | PRCC | Proline-rich protein PRCC |
APIKVGDAIPAVEVF | 15 | PRDX5 | Peroxiredoxin-5, mitochondrial precursor |
Boldfaced amino acid symbols indicate anchor residues at position P2 and the C terminus. Underlined symbols indicate N-terminal-extended residues starting from position P1 of the peptide.
New HLA-B35 motifs include additional anchor positions
The amino acid frequency at each position of peptides eluted from each HLA-B35 molecule is represented in Fig. 5. Detailed amino acid frequency at each position is presented in the supplemental data section. The results confirm the identity of the previously reported dominant anchor residue proline in position P2 of HLA-B*3501- and -B*3503-associated peptides (45, 46). We also confirmed the dominant presence of aromatic residues (tyrosine, phenylalanine) and leucine in the C-terminal positions of peptides from HLA-B*3501 and -B*3503, respectively (45, 46). The results revealed that the newly described peptide motifs from HLA-B*3502, -B*3504, and -B*3506 molecules are essentially the same as those from HLA-B*3503, with minor differences in the order and identity of residues in the P2 and C-terminal positions. Similarly, the peptide motif of HLA-B*3508 resembled the motif of HLA-B*3501 at those two positions. The results also identified a strong preference for aspartic acid at position P3 of peptides from HLA-B*3502, -B*3504, -B*3506, and -B*3508 molecules. In general, acidic residues (aspartic acid, glutamic acid) were also found at a higher frequency in positions P4, P5, and P6 of peptides eluted from all HLA-B35 molecules except HLA-B*3501 and -B*3503. In addition, we identified preferred hydrophobic anchor residues located at positions P1 and P7 that are shared by all HLA-B35 molecules.
Logos displaying the peptide binding motif of HLA-B35 molecules. Acidic (red), basic (blue), hydrophobic (black), neutral (green), and aromatic (purple) are illustrated. The height of each column of letters is equal to the information content (in bits) at the given position in the binding motif. The relative height of each letter within each column is proportional to the frequency of the corresponding residue at that position. Only residues with frequencies above 5% are shown. Position PΩ refers to the C-terminal residue in the peptide and includes position P8 or higher in peptides longer than eight residues.
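The quantities described in the logo caption can be illustrated with a short Python sketch. It is not the software used to draw Fig. 5. It assumes the common Shannon formulation of information content against a uniform 20-residue background, without any small-sample correction, and it follows the caption in collapsing the C-terminal residue of every peptide into a single PΩ column. The function names `position_columns` and `logo_heights` are hypothetical.

```python
import math
from collections import Counter

N_CORE = 7  # P1..P7 are counted individually; the last column is PΩ


def position_columns(peptides):
    """Count residues at P1..P7 and at PΩ (the C-terminal residue)."""
    cols = [Counter() for _ in range(N_CORE + 1)]
    for pep in peptides:
        if len(pep) < N_CORE + 1:
            continue  # peptides shorter than eight residues are skipped
        for i in range(N_CORE):
            cols[i][pep[i]] += 1
        cols[N_CORE][pep[-1]] += 1  # PΩ, whatever the peptide length
    return cols


def logo_heights(counter, min_freq=0.05, alphabet_size=20):
    """Return (information content in bits, letter heights) for one column.

    Assumption: information = log2(alphabet_size) - Shannon entropy.
    Each letter height is frequency * information; residues at or below
    min_freq are dropped, mirroring the 5% display cutoff in the caption.
    """
    total = sum(counter.values())
    if total == 0:
        return 0.0, {}
    freqs = {aa: n / total for aa, n in counter.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(alphabet_size) - entropy
    heights = {aa: f * info for aa, f in freqs.items() if f > min_freq}
    return info, heights
```

Under these assumptions, a column occupied by a single residue (for example, proline at P2 in every peptide) reaches the maximum of log2(20), about 4.32 bits, whereas a column split evenly between two residues carries one bit less.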
Discussion
HLA class I molecules bind and present peptides derived from the intracellular processing of proteins to CD8+ T cells. HLA class I molecules are highly polymorphic, and this variability influences peptide binding specificity and the repertoire of peptides that are available for recognition by CD8+ T cells. Despite this polymorphism, HLA class I molecules can be grouped into sets of molecules or supertypes according to overlapping peptide binding repertoires or motifs (16, 17). HLA class I motifs were initially defined using pool sequencing information from a limited number of eluted peptides (∼10–15 total) (28, 29). HLA class I motifs are generally defined by conserved residues at the C-terminal and at the P2 internal position. However, it has been suggested that internal positions other than P2 and the C terminus can be used to characterize HLA class I peptide motifs (3, 47, 48). Strong support for this idea comes from the observation that HLA-B8 molecules use positions P3 and P5 as primary anchor residues (48, 49). Moreover, in the murine system it is recognized that several H2 molecules use positions other than P2 and the C terminus as anchor residues (3, 47).
Our results expand the knowledge about HLA class I peptide binding motifs. In this study, using large-scale sequence information from ∼1,000 peptides, we established that HLA class I motif characterization can be extended to all positions in the peptide, most of which are shared by all HLA-B35 molecules studied. Remarkably, only the C-terminal position and position P3 of the peptide differ among the six HLA-B35 molecules analyzed. Specifically, HLA-B*3501- and -B*3508-associated peptides showed a preference for the aromatic residues tyrosine and phenylalanine at the C-terminal position. In contrast, peptides bound to the remaining HLA-B35 molecules showed a preference for leucine at the C-terminal position. At the P3 position, all peptides except those eluted from HLA-B*3501 had an increased presence of acidic residues (aspartic acid, glutamic acid).
These results can be explained by the few differences between the H chain sequences of the six HLA-B35 molecules analyzed in this study (Fig. 6). Positions 109 and 114 of the H chain have not been implicated in the peptide binding specificity of HLA class I molecules. In contrast, position 116 of the H chain is located in the floor of the peptide binding site in MHC class I molecules and is critical for determining the specificity of the C-terminal residue (1, 2). In the case of HLA-B*3501 and -B*3508, position 116 is occupied by serine, which facilitates the binding of large aromatic residues such as tyrosine and phenylalanine to the base of pocket F in the peptide binding cleft, as shown previously (50, 51). Conversely, the remaining HLA-B35 molecules analyzed carry aromatic residues at position 116, which could limit the binding of large aromatic residues and favor the binding of small hydrophobic residues such as leucine.
A PyMol model depicting the peptide binding groove of the HLA-B*3501 molecule bound to the peptide VPLRPMTY (50). A portion of the front α-helix has been erased for better view. Residues located at positions 109, 114, 116, and 156 of the H chain and differing among HLA-B35 molecules studied are noted in reference to the sequence of HLA-B*3501 used as consensus.
The strong preference for acidic residues in position P3 of peptides eluted from HLA-B*3508 but not from HLA-B*3501-associated peptides can be explained by the polymorphism at position 156 of the H chain. Position 156 is located in the wall of the peptide binding cleft and has not been directly implicated in the specificity of peptide binding (2, 3). Our results point to the idea that position 156 can influence the specificity of position P3 of the peptide. In the case of HLA-B*3501, position 156 is occupied by leucine, whereas in HLA-B*3508 the same position is occupied by arginine. We believe that the presence of the positively charged arginine residue in position 156 of HLA-B*3508 is likely to provide the positive counter charge for the acidic group of the P3 anchor residue found in HLA-B*3508-associated peptides. Other HLA class I molecules including HLA-A1, B39 (52, 53), and B53 (our unpublished data) have also been shown to bind acidic residues in position P3. Interestingly, the results also revealed a strong preference for acidic residues in positions P4, P5, and P6 of peptides eluted from all HLA-B35 molecules except HLA-B*3501 and -B*3503. We note that both HLA-B*3501 and -B*3503 molecules carry an aspartic acid at position 114, whereas in HLA-B*3502, -B*3504, and -B*3506 molecules the same position is occupied by asparagine. It is possible that the presence of the negatively charged aspartic acid at position 114 of HLA-B*3501 and -B*3508 disfavors the occurrence of acidic residues in internal positions of the peptide. Future crystal structures along with molecular simulation studies will be required to understand in detail the molecular principles responsible for the increased occurrence of acidic residues in internal anchor positions of peptides eluted from HLA class I molecules.
All of the HLA-B35 molecules analyzed in this study are members of the HLA-B7 supertype. The HLA-B7 supertype is characterized by a strong preference for a proline residue in the P2 position and a combination of aromatic, aliphatic, or hydrophobic residues in the C-terminal position. Given that the specificity of the C-terminal position is difficult to correlate in some HLA class I supertypes (16, 29), the addition of other positions with similar residue specificity might be useful for the subclassification of known supertypes or the addition of new supertype groups. In this study we established that peptide anchor residues can be extended to all positions in a set of HLA-B35 molecules and that this motif is generally shared by members of the group. To our knowledge, this is the first study showing a large set of peptide sequences from several molecules of the same supertype group, suggesting that the HLA class I supertype classification can be further extended by using information from anchor positions other than the P2 and C-terminal positions. Future studies using HLA molecules from different supertype groups will be necessary to validate this idea.
The results also revealed a complex mixture of peptide-HLA class I configurations. One of the most notable aspects of our results is the occurrence of N-terminal-extended motifs in the face of ER aminopeptidase activity in HLA-transduced K562 cells (Fig. 2). We found that a large proportion of peptides longer than 11 residues represented N-terminal-extended versions of canonical peptide sequences (eight and nine residues long) containing the appropriate peptide binding motif. This result is unexpected, given the failure of previous large-scale mass spectrometric studies to identify N-terminal-extended motifs (54, 55, 56) and the reported instability of ERAAP-unedited peptide complexes in mice (13). In the literature, there is only one study reporting the sequence of one set of nested antigenic long N-terminal-extended peptides derived from a transfected HIV gp120 protein in mice (14). Thus, to our knowledge, this is the first report of endogenous N-terminal-extended peptides eluted from human MHC molecules.
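The notion of an N-terminal-extended peptide can be made concrete with a minimal Python sketch. It is an illustration under simplified assumptions, not the procedure used in this study: a long peptide is flagged when its C-terminal eight or nine residues form a core with proline at P2 and tyrosine, phenylalanine, or leucine at the C terminus, which are the dominant P2 and C-terminal anchors described in the text. The function name `find_nested_core`, its parameters, and the 12-residue minimum length are hypothetical choices.

```python
def find_nested_core(
    peptide,
    p2_anchors=frozenset("P"),    # assumed P2 anchor set (proline only)
    c_anchors=frozenset("YFL"),   # assumed C-terminal anchor set
    core_lengths=(8, 9),          # canonical core lengths, tried in order
    min_length=12,                # only peptides longer than 11 residues
):
    """Return (N-terminal extension, core) if a canonical core is nested
    at the C-terminal end of a long peptide, otherwise None."""
    if len(peptide) < min_length:
        return None
    for n in core_lengths:
        core = peptide[-n:]
        if core[1] in p2_anchors and core[-1] in c_anchors:
            return peptide[:-n], core
    return None


# Peptides taken from the table of long peptides above.
for pep in ("AVDGEPLGRVSF", "EEEDVPGQAKDEL", "RPSSTSSASALY"):
    print(pep, find_nested_core(pep))
```

With these anchor sets, AVDGEPLGRVSF splits into the extension AVDG and the core EPLGRVSF, and EEEDVPGQAKDEL into EEED and VPGQAKDEL, whereas RPSSTSSASALY yields no nested core because its proline sits at P2 of the full-length peptide. A real analysis would use allele-specific anchor sets rather than this single simplified motif.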
Our results suggest that aminopeptidase activity in the ER efficiently edits the majority of N-terminal-extended peptides that are delivered to the ER from the cytosol, whereas a minority of peptides might escape trimming. A possible explanation for the escape of these peptides is that they might bind to certain HLA class I molecules that dissociate more quickly from the peptide-loading complex (57, 58, 59, 60, 61, 62). In this regard, it is interesting to note that two of the HLA-B35 molecules studied (HLA-B*3501 and HLA-B*3508) were not associated with N-terminal-extended peptides. Moreover, preliminary data from our laboratory using large-scale sequencing of peptides eluted from human HLA-B*0702 and -B*5301 molecules as well as mouse H2-Db and Kb molecules have failed to identify N-terminal-extended peptides (our unpublished data). We note that certain HLA-B35 molecules, including HLA-B*3502, have been associated with an increased susceptibility to developing a rapid progression to AIDS after HIV-1 infection (63). Thus, it is possible that the capacity of certain MHC class I molecules to present untrimmed N-terminal-extended peptides is related to their degree of affinity with the peptide-loading complex and their capacity to elicit an appropriate CD8+ T cell response. Given that the experimental design in our study does not discriminate between peptides eluted from HLA class I molecules located inside the ER or from molecules that have departed the ER, we are unable to determine whether the N-terminal-extended peptides found in this study represent peptides that indeed escaped ER aminopeptidase activity or just simply represent intermediate N-terminal-extended peptide-HLA class I complexes located in the ER before ER aminopeptidase trimming. 
Future studies analyzing peptides eluted from other HLA class I molecules as well as studies analyzing peptides bound to HLA class I molecules inside the ER as compared with the cell surface might provide answers to these questions.
We also observed that the length distribution of HLA-B35-bound peptides included peptides of up to 15 residues. Peptides longer than 11 amino acids might be accommodated by bulging in the middle or an extension out of the peptide binding space at the C or N terminus. Bulging and C-terminal extension are documented mechanisms for binding longer peptides (10, 12, 44, 64), and at least one crystal structure of a peptide protruding at the C terminus has been reported (11). Additional work is also required to characterize directly the HLA binding conformation adopted by N-terminal-extended peptides.
In summary, the present study provides evidence for the occurrence of endogenous N-terminal-extended peptide-HLA class I configurations. In addition, these results expand our knowledge about the identity of positions that can be used for characterization of HLA class I peptide binding motifs. Previous determinations of HLA class I motifs are incomplete, and similar studies using sensitive methods for sequencing large numbers of peptides are valuable to establish precise motif information.
Disclosures
The authors have no financial conflict of interest.
Footnotes
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This work was supported by National Institutes of Health Grants AI33614 and AI20554 (to P.E.J.) and funds from the Associated Regional and University Pathologists (ARUP) Institute for Clinical and Experimental Pathology (to D.K.C., A.L.R., P.E.J., and J.C.D.).
Abbreviations used in this paper: ER, endoplasmic reticulum; ERAAP, ER aminopeptidase associated with Ag processing; ERAP1, ERAAP human ortholog; Q-TOF, quadrupole time-of-flight.
The online version of this article contains supplemental material.