Abstract
Peptides associated with class II MHC molecules are of variable length because in contrast to peptides associated with class I MHC molecules, their amino and C termini are not constrained by the structure of the peptide interaction with the binding site. The proteolytic processing events that generate these peptides are still not well understood. To address this question, peptides extracted from HLA-DR*0401 were analyzed using two types of mass spectrometry instrumentation. This enabled identification of >700 candidate peptides in a single analysis and provided relative abundance information on 142 peptides contained in 11 nested sets of 3–36 members each. Peptides of 12 residues or less occurred only at low abundance, despite the fact that they were predicted to fully occupy the HLA-DR*0401 molecule in a single register. Conversely, the relative abundance of longer species suggested that proteolytic events occurring after MHC binding determine the final structure of most class II-associated peptides. Our data suggest that C-terminal residues of these peptides reflect the action of peptidases that cleave at preferred amino acids, while amino termini appear to be determined more by proximity to the class II MHC binding site. Thus, the analysis of abundance information for class II-associated peptides comprising nested sets has offered new insights into proteolytic processing of MHC class II-associated peptides.
Class II MHC molecules bind peptides and present them to CD4+ T cells. The major sources of such peptides are either external or membrane-associated proteins (1) that are internalized and proteolyzed in endosomes or lysosomes (2, 3, 4). The selective presentation by class II MHC molecules of peptides produced in endosomal compartments is due to the involvement of the invariant chain, which associates with the class II MHC α- and β-chains immediately after synthesis in the endoplasmic reticulum, blocks the peptide binding site, and retargets the complex to the endosomal/lysosomal compartment (5). Endosomal proteases digest most of the invariant chain, leaving a set of peptides, called class II-associated invariant chain peptides (CLIP), associated with the class II peptide binding site (2, 4). HLA-DM catalyzes the dissociation of CLIP, allowing for the subsequent binding of other peptides present in the endosomal compartment (6, 7).
Although the assembly, transport, and peptide binding of class II MHC molecules are relatively well understood, little is known about the processes and proteases that form these peptides. Although endosomes contain a variety of proteases (reviewed in Ref. 4), few have been directly implicated in the generation of class II-associated peptides. Cathepsin D has been implicated in the generation of T cell epitopes from OVA, hen egg lysozyme, sperm whale myoglobin, and insulin based on in vitro digestion to produce antigenic fragments (8, 9, 10, 11, 12). However, splenocytes from cathepsin B- or cathepsin D-deficient mice process and present multiple antigenic determinants normally, including those from OVA and hen egg lysozyme (13). Recently, a newly identified asparaginyl endopeptidase was shown to generate a proteolytic intermediate in the processing of a tetanus toxin Ag (14). This study also suggested the existence of at least two processing events: an initial generation of precursor fragments from the intact protein and secondary processing to an optimal size for MHC binding and/or T cell recognition. However, it remains unclear whether processing is completed before binding to class II MHC molecules in the endosome, or whether some processing of MHC-associated peptides occurs subsequent to binding.
Binding of peptides to class II MHC molecules involves several hydrogen bonds between residues in the binding site and the peptide backbone, as well as contacts between specific peptide side chains and pockets in the binding site (15, 16, 17). In contrast to class I MHC molecules, the ends of the class II molecule binding site are open and do not constrain the length of class II-associated peptides. Indeed, sequence analysis of peptides associated with class II MHC molecules has identified peptides of lengths between 10 and 34 residues (1, 18, 19, 20). Furthermore, many of these class II-associated peptides comprise nested sets that contain the same core sequence but vary in length at the amino or C-terminal ends. However, little information has been gathered concerning the relative abundance of related peptides or the factors that determine the preferred length of class II-associated peptides.
Previous sequence analyses of class II MHC-associated peptides have relied on either Edman degradation or mass spectrometry (1, 18, 19, 20, 21, 22). The utility of Edman degradation is limited by the complexity of the peptide mixture even after HPLC fractionation. Chicz and colleagues (1, 20) used mass spectrometry to correlate masses in individual HPLC fractions with Edman data, and thus inferred the identity of several peptides from different human class II alleles. Although this improves on the use of Edman degradation alone, the mass accuracy of the instrument used does not allow for truly unambiguous identification of such peptides. Only by using tandem mass spectrometry (MS/MS) is it possible to select an individual peptide ion and subject it to collision-activated dissociation to generate an unambiguous sequence (19, 22).
In the present study, we used a high throughput mass spectrometric technique together with automated database searching of collision-activated dissociation spectra to determine candidate sequences for >700 different peptides bound to the human class II MHC molecule HLA-DR*0401. Next, we used Fourier Transform mass spectrometry to analyze 142 peptides that were members of nested sets and determine their relative abundances. Using this information, we provide a comprehensive view of the distribution of peptides within a nested set that are associated with this class II MHC molecule. Our analysis has also allowed us to draw inferences regarding the proteolytic processes that lead to final peptide products displayed by HLA-DR4.
Materials and Methods
Cell lines
The human B lymphoblastoid cell line PRIESS (kindly provided by B. Nepom, Virginia Mason Research Center, Seattle, WA) is homozygous for HLA-DR*0401 (DRA*0101, DRB1*0401). The hybridoma LB3.1 secretes an mAb that specifically recognizes human HLA-DR molecules (23). Both cell lines were maintained in RPMI 1640 supplemented with 10% FBS and 2 mM glutamine.
Isolation of HLA-DR*0401-associated peptides
PRIESS cells (5 × 10⁸) were lysed in 5 ml of lysis buffer (20 mM Tris-HCl (pH 8.0), 150 mM NaCl, 1% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate, 5 μg/ml aprotinin, 10 μg/ml leupeptin, 10 μg/ml pepstatin A, 5 mM EDTA, 0.04% sodium azide, and 1 mM PMSF) for 1 h at 4°C. The lysate was clarified by centrifugation at 16,000 × g for 30 min at 4°C, and the supernatant was precleared by incubation with 100 μl recombinant protein A-Sepharose beads (Amersham Pharmacia Biotech, Piscataway, NJ) for 4 h at 4°C. After removal of the beads, the supernatant was incubated overnight at 4°C with 100 μl recombinant protein A-Sepharose beads to which 2 mg of LB3.1 Ab had been bound. The beads were subsequently washed twice in lysis buffer, four times in 20 mM Tris-HCl (pH 8.0), 150 mM NaCl, twice in 20 mM Tris (pH 8.0), 1 M NaCl, and three times in 20 mM Tris-HCl (pH 8.0). Peptides were eluted in acid and separated from beads, Ab, and class II MHC molecules by passage through a 10,000-Da cutoff ultrafiltration unit (Millipore, Bedford, MA).
Mass spectrometry
An aliquot corresponding to 1 × 10⁷ cell equivalents of the peptide mixture described above was analyzed by nanoflow HPLC-electrospray ionization coupled directly to an ion trap mass spectrometer (24, 25). Following data-dependent MS/MS analysis (24, 25), spectra with characteristic features of peptide fragmentation were searched against the nonredundant database maintained at the National Center for Biotechnology Information using the SEQUEST algorithm (26) to identify candidate sequences. Selected sequences were confirmed by manual interpretation of MS/MS spectra and their identification as members of nested sets. A second aliquot corresponding to 1 × 10⁶ cell equivalents of the peptide mixture was subsequently analyzed by nanoflow HPLC coupled to electrospray ionization on a home-built Fourier transform mass spectrometer (27). Full scan mass spectra (300 ≤ m/z ≤ 5000) were acquired at approximately one scan per second. Peptide ions were correlated with those identified in ion trap experiments by relative HPLC retention time and accurate mass. Peptide abundance was estimated based on the observed ion current. Because peptides in nested sets are chemically similar to one another, it is reasonable to expect that they will exhibit similar ionization efficiencies.
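As an illustration of the accurate-mass correlation step, the following minimal Python sketch computes the theoretical monoisotopic m/z of a candidate sequence and tests whether an observed value falls within a mass tolerance. The function names and the 10 ppm default tolerance are assumptions made for this example only; they are not the software or the acceptance criteria used in this study, and retention-time matching is not modeled.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free amino and carboxyl termini
PROTON = 1.00728   # mass of one charge-carrying proton


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def peptide_mz(seq: str, charge: int) -> float:
    """Theoretical m/z of the [M + zH]z+ ion."""
    return (peptide_mass(seq) + charge * PROTON) / charge


def matches(observed_mz: float, seq: str, charge: int,
            tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies within tol_ppm (hypothetical
    tolerance) of the theoretical value for this sequence and charge."""
    expected = peptide_mz(seq, charge)
    return abs(observed_mz - expected) / expected * 1e6 <= tol_ppm


if __name__ == "__main__":
    # HLA-A 54-68 from Table I, as a doubly charged ion.
    print(round(peptide_mz("DTQFVRFDSDAASQR", 2), 4))
```

In practice, a candidate would be accepted only if both the mass and the relative retention time agreed with the ion trap identification.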
Results
Rapid identification and characterization of HLA-DR*0401-associated peptides belonging to nested sets
The peptides associated with immunoaffinity-purified HLA-DR*0401 molecules were extracted and analyzed by nanoflow HPLC interfaced to electrospray ionization on an ion-trap mass spectrometer. By using a long, shallow HPLC gradient and by acquiring mass spectra in the data-dependent mode (24), ∼2000 nonredundant peptide MS/MS spectra were acquired in a single run on this instrument. A search of these MS/MS spectra against the NCBI nonredundant database using SEQUEST yielded 701 candidate peptide sequences. Based on estimates of ∼10,000 peptides presented by class I and II MHC molecules on the surface of a typical lymphoid cell (28), this represents ∼7% of the total peptide repertoire presented by HLA-DR*0401 on these cells. Among these 701 candidate peptide sequences, 273 were identifiable as members of nested sets (defined as groups of peptides that share core sequences but have distinct amino and C-terminal ends). By manual interpretation of the MS/MS spectra, we confirmed the sequences of 142 peptides that comprised 11 nested sets (Table I).
Major nested sets extracted from HLA-DR4a
Nested Set . | Residues . | Peptide Sequence . | Abundance (× 10−7)b . | ||
---|---|---|---|---|---|
HLA-A49–74 | 54–72 | DTQFVRFDSDAASQRMEPR | 160 | ||
52–68 | VDDTQFVRFDSDAASQR | 150 | |||
54–68 | DTQFVRFDSDAASQR | 110 | |||
52–71 | VDDTQFVRFDSDAASQRMEP | 64 | |||
54–71 | DTQFVRFDSDAASQRMEP | 59 | |||
53–68 | DDTQFVRFDSDAASQR | 43 | |||
54–70 | DTQFVRFDSDAASQRME | 43 | |||
55–68 | TQFVRFDSDAASQR | 41 | |||
53–72 | DDTQFVRFDSDAASQRMEPR | 33 | |||
52–70 | VDDTQFVRFDSDAASQRME | 32 | |||
54–67 | DTQFVRFDSDAASQ | 23 | |||
52–67 | VDDTQFVRFDSDAASQ | 21 | |||
53–71 | DDTQFVRFDSDAASQRMEP | 20 | |||
53–70 | DDTQFVRFDSDAASQRME | 17 | |||
54–69 | DTQFVRFDSDAASQRM | 15 | |||
55–70 | TQFVRFDSDAASQRME | 15 | |||
57–69 | FVRFDSDAASQRMc | 12 | |||
57–68 | FVRFDSDAASQR | 9.8 | |||
52–74 | VDDTQFVRFDSDAASQRMEPRAP | 8.8 | |||
52–69 | VDDTQFVRFDSDAASQRM | 8.6 | |||
55–67 | TQFVRFDSDAASQ | 7 | |||
57–71 | FVRFDSDAASQRMEP | 6 | |||
51–68 | YVDDTQFVRFDSDAASQRMEPRAP | 5.2 | |||
53–67 | DDTQFVRFDSDAASQ | 4.8 | |||
52–66 | VDDTQFVRFDSDAAS | 3.4 | |||
53–69 | DDTQFVRFDSDAASQRM | 2.5 | |||
57–70 | FVRFDSDAASQRME | 2.5 | |||
58–68 | VRFDSDAASQRd | 2.5 | |||
55–69 | TQFVRFDSDAASQRM | 1.9 | |||
57–67 | FVRFDSDAASQ | 1.8 | |||
57–72 | FVRFDSDAASQRMEPR | 1.6 | |||
52–73 | VDDTQFVRFDSDAASQRMEPRA | 1.3 | |||
49–68 | VGYVDDTQFVRFDSDAASQR | 0.9 | |||
56–68 | QFVRFDSDAASQR | 0.6 | |||
59–68 | RFDSDAASQR | 0.4 | |||
55–71 | TQFVRFDSDAASQRMEP | 0.1 | |||
HLA-B51–72 | 52–72 | VDDTQFVRFDSDAASPRMAPR | 24 | ||
52–70 | VDDTQFVRFDSDAASPRMA | 19 | |||
52–71 | VDDTQFVRFDSDAASPRMAP | 19 | |||
52–68 | VDDTQFVRFDSDAASPR | 14e | |||
54–68 | DTQFVRFDSDAASPR | 12e | |||
52–69 | VDDTQFVRFDSDAASPRM | 11 | |||
54–72 | DTQFVRFDSDAASPRMAPR | 9.1 | |||
54–69 | DTQFVRFDSDAASPRM | 9 | |||
54–71 | DTQFVRFDSDAASPRMAP | 6.8 | |||
53–68 | DDTQFVRFDSDAASPR | 4.9e | |||
52–67 | VDDTQFVRFDSDAASP | 3.5e | |||
54–67 | DTQFVRFDSDAASP | 2.3e | |||
53–72 | DDTQFVRFDSDAASPRMAPR | 2.6 | |||
54–70 | DTQFVRFDSDAASPRMA | 2 | |||
53–69 | DDTQFVRFDSDAASPRM | 1.9 | |||
53–71 | DDTQFVRFDSDAASPRMAP | 1.7 | |||
53–70 | DDTQFVRFDSDAASPRMA | 1.4 | |||
53–67 | DDTQFVRFDSDAASP | 1.3e | |||
55–68 | TQFVRFDSDAASPR | 1.1e | |||
57–68 | FVRFDSDAASPR | 0.4e | |||
51–68 | YVDDTQFVRFDSDAASPR | 0.3e | |||
57–67 | FVRFDSDAASP | 0.2e | |||
58–68 | VRFDSDAASPR | 0.2e | |||
56–68 | QFVRFDSDAASPR | 0.1e | |||
HLA-C51–74 | 52–72 | VDDTQFVRFDSDAASPRGEPR | 180 | ||
52–68 | VDDTQFVRFDSDAASPR | 116e | |||
54–68 | DTQFVRFDSDAASPR | 98e | |||
54–72 | DTQFVRFDSDAASPRGEPR | 74 | |||
53–72 | DDTQFVRFDSDAASPRGEPR | 47 | |||
53–68 | DDTQFVRFDSDAASPR | 41e | |||
52–67 | VDDTQFVRFDSDAASP | 29e | |||
52–74 | VDDTQFVRFDSDAASPRGEPRAP | 23 | |||
52–71 | VDDTQFVRFDSDAASPRGEP | 21 | |||
54–67 | DTQFVRFDSDAASP | 20e | |||
54–71 | DTQFVRFDSDAASPRGEP | 14 | |||
53–67 | DDTQFVRFDSDAASP | 11e | |||
55–68 | TQFVRFDSDAASPR | 8.9e | |||
53–70 | DDTQFVRFDSDAASPRGE | 4.3 | |||
57–68 | FVRFDSDAASPR | 3.4e | |||
54–70 | DTQFVRFDSDAASPRGE | 2.9 | |||
(Table continues) |
The sample of HLA-DR*0401-associated peptides was also analyzed using a Fourier Transform mass spectrometer, which has enhanced sensitivity and mass accuracy compared with other instruments and also provided quantitative information on the 142 nested set peptides. We used this information to examine the display of related peptides derived from the same protein sequence. We found that the most abundant individual peptide species in the different nested sets varied between 14 and 21 residues long, while the mean length of all peptides in a nested set, after normalizing for the abundance of individual species, varied between 14.5 and 17.8 residues (Table II). Based on visibility in the x-ray crystallographic structure of HLA-DR*0401 associated with the collagen II peptide 1168–1180 and involvement in hydrogen bonds, only 12 peptide residues interact with the binding site of this class II MHC molecule (17). However, only 8 of the 142 peptides were 12 residues or shorter, and these were invariably of very low relative abundance (Table I). These data suggested that features of each composite nested set sequence, apart from the minimal sequences required for binding to HLA-DR*0401, determined both the optimal length and the distribution of displayed peptides.
Length distribution of peptides in nested setsa
Set . | Mean Length . | Mean Length Normalized for Abundance . | Most Abundant Species . | . | |||
---|---|---|---|---|---|---|---|
. | . | . | . | Sequence . | Length . | ||
HLA-C51–74 | NDb | NDb | VDDTQFVRFDSDAASPRGEPR | 21 | |||
HLA-B51–72 | NDb | NDb | VDDTQFVRFDSDAASPRMAPR | 21 | |||
HLA-A49–74 | 16.3 | 17.2 | DTQFVRFDSDAASQRMEPR | 19 | |||
HLA-E148–167 | 17.0 | 17.2 | NEDLRSWTAVDTAAQISEQ | 19 | |||
LAMP234–252 | 17.2 | 16.5 | LPSYEEALSLPSKTPE | 16 | |||
HLA-C150–169 | 17.8 | 16.5 | DLRSWTAADTAAQITQ | 16 | |||
HLA-B150–170 | 16.2 | 16.1 | LSSWTAADTAAQITQ | 15 | |||
Igκ140–162 | 17.1 | 16.0 | KVQWKVDNALQSGNS | 15 | |||
BSP554–572 | 15.5 | 15.1 | DVAFVKDQTVIQNTD | 15 | |||
NSFA148–163 | 14.5 | 14.8 | DYYKGEESNSSANK | 14 | |||
HSC71482–499 | 15.3 | 14.5 | GILNVSAVDKSTGK | 14 |
The mean length was calculated simply by summing the total number of residues in each sequence of a nested set and dividing by the number of sequences. The mean length normalized for abundance was calculated by multiplying the length of each sequence in a nested set by its abundance in Table I, summing these values, and dividing by the total abundance of all members of the set. Determination of most abundant species is based on the data in Table I.
Values for these nested sets were ND, because several peptides could not be unambiguously assigned to one set or another.
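The two length statistics described in the footnote to Table II can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the example uses just the four most abundant HLA-A49–74 peptides from Table I rather than the full 36-member set, so its output does not reproduce the values reported in Table II.

```python
from typing import List, Tuple

# (sequence, abundance) pairs; abundances are in the units of Table I.
Peptide = Tuple[str, float]


def mean_length(peptides: List[Peptide]) -> float:
    """Unweighted mean: total residues divided by number of sequences."""
    return sum(len(seq) for seq, _ in peptides) / len(peptides)


def abundance_normalized_mean_length(peptides: List[Peptide]) -> float:
    """Sum of (length x abundance) divided by total abundance of the set."""
    total_abundance = sum(abundance for _, abundance in peptides)
    weighted = sum(len(seq) * abundance for seq, abundance in peptides)
    return weighted / total_abundance


# Illustrative subset only: the four most abundant HLA-A49-74 peptides.
subset = [
    ("DTQFVRFDSDAASQRMEPR", 160.0),   # 54-72
    ("VDDTQFVRFDSDAASQR", 150.0),     # 52-68
    ("DTQFVRFDSDAASQR", 110.0),       # 54-68
    ("VDDTQFVRFDSDAASQRMEP", 64.0),   # 52-71
]

if __name__ == "__main__":
    print(mean_length(subset))
    print(round(abundance_normalized_mean_length(subset), 2))
```

Because abundant species contribute proportionally more, the normalized mean shifts toward the lengths of the dominant peptides in a set.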
Alignment of nested set peptides to HLA-DR4 binding motif
To facilitate understanding the factors contributing to the presentation of nested set peptides, we used three predictive algorithms to determine the likely binding frames of the peptides in each nested set. The three binding algorithms used different methodology, but predicted the same preferred frame in each case. For human Ig κ-chain (Igκ)140–162, human α-soluble N-ethylmaleimide-sensitive factor attachment protein148–163, human lysosomal-associated multitransmembrane protein (LAMP)234–252, and HLA-C150–169, these algorithms identified single binding frames that were contained within all peptides comprising these sets and had relative affinities significantly higher than any other predicted frame (Table III). The nested sets from HLA-C150–169, HLA-B150–170, and HLA-E148–167 are derived from homologous regions of these three class I MHC molecules, and the sequences contained in any two share 15–18 of 19 possible residues. A single homologous high-affinity binding frame accommodated 5/5, 14/16, and 10/11 peptides from these sets, respectively. Similarly, the nested sets HLA-C51–74, HLA-B51–72, and HLA-A49–74 were also derived from homologous regions of these proteins, and share 20/22 residues. In fact, 12 peptides were derived from a region that is identical in HLA-B and -C molecules and therefore could come from either or both of these molecules. A common homologous high-affinity frame accounted for the binding of 28/32 peptides from the combined HLA-B and -C sets and 27/36 of the peptides from the HLA-A49–74 set.
Alignment of nested set peptides to predicted HLA-DR4-binding motifs
Set . | Alignment and Predicted Binding Frame . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | Algorithm . | . | . | . | . | Peptides Containing Complete Framed . | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | P(−2) . | P(−1) . | P1 . | P2 . | P3 . | P4 . | P5 . | P6 . | P7 . | P8 . | P9 . | P10 . | 1a (IC50) . | 2b . | 3c . | . | ||||||||||||||||||||||||||
Igκ140–162 | V | Q | W | K | V | D | N | A | L | Q | S | G | 8 | 2.81 | 22 | 18 /18 | ||||||||||||||||||||||||||
NSFA148–163 | D | Y | Y | K | G | E | E | S | N | S | S | A | 12 | 0.73 | 28 | 4 /4 | ||||||||||||||||||||||||||
A | D | Y | Y | K | G | E | E | S | N | S | S | 3369 | 0 | 16 | 2 /4 | |||||||||||||||||||||||||||
LAMP-5234–252 | P | S | Y | E | E | A | L | S | L | P | S | K | 1369 | 3.11 | 28 | 6 /6 | ||||||||||||||||||||||||||
HLA-C150–169 | R | S | W | T | A | A | D | T | A | A | Q | I | 140 | 0.44 | 28 | 5 /5 | ||||||||||||||||||||||||||
E | D | L | R | S | W | T | A | A | D | T | A | 906 | 0.16 | 6 | 3 /5 | |||||||||||||||||||||||||||
HLA-B150–170 | S | S | W | T | A | A | D | T | A | A | Q | I | 140 | 0.44 | 28 | 14 /16 | ||||||||||||||||||||||||||
E | D | L | S | S | W | T | A | A | D | T | A | 2539 | 0.22 | 14 | 7 /16 (1) | |||||||||||||||||||||||||||
HLA-E148–166 | R | S | W | T | A | V | D | T | A | A | Q | I | 144 | 0.29 | 28 | 10 /11 | ||||||||||||||||||||||||||
E | D | L | R | S | W | T | A | V | D | T | A | 1079 | 0.12 | 14 | 5 /11 (0) | |||||||||||||||||||||||||||
HLA-C51–74 | T | Q | F | V | R | F | D | S | D | A | A | S | 29 | 1.06 | 28 | 16 /20 | ||||||||||||||||||||||||||
Q | F | V | R | F | D | S | D | A | A | S | P | 299 | 0 | 20 | 17 /20 (1) | |||||||||||||||||||||||||||
HLA-B51–72 | T | Q | F | V | R | F | D | S | D | A | A | S | 29 | 1.06 | 28 | 20 /24 | ||||||||||||||||||||||||||
Q | F | V | R | F | D | S | D | A | A | S | P | 299 | 0 | 20 | 21 /24 (1) | |||||||||||||||||||||||||||
HLA-A49–74 | T | Q | F | V | R | F | D | S | D | A | A | S | 29 | 1.06 | 28 | 27 /36 | ||||||||||||||||||||||||||
Q | F | V | R | F | D | S | D | A | A | S | Q | 144 | 0 | 20 | 28 /36 (1) | |||||||||||||||||||||||||||
V | G | Y | V | D | D | T | Q | F | V | R | F | 27 | 0 | 22 | 1 /36 (0) | |||||||||||||||||||||||||||
BSP554–572 | V | A | F | V | K | D | Q | T | V | I | Q | N | 82 | 37.6 | 28 | 9 /11 | ||||||||||||||||||||||||||
A | F | V | K | D | Q | T | V | I | Q | N | T | 84 | 5.17 | 14 | 7 /11 (0) | |||||||||||||||||||||||||||
HSC71482–499 | G | I | L | N | V | S | A | V | D | K | S | T | 1456 | 0.34 | 14 | 3 /3 | ||||||||||||||||||||||||||
N | G | I | L | N | V | S | A | V | D | K | S | 2445 | 0.28 | 20 | 3 /3 |
IC50 values were calculated according to Ref. 32. This algorithm does not take into account the effect of an amino-terminal P(−2) residue, which has been shown by crystallographic analysis to interact with the class II-binding site by hydrogen bonding interactions with the peptide backbone (17). All other potential binding frames had calculated IC50 values >40,000 and have been omitted for clarity.
Relative binding affinity values of 9-mer peptides initiating at P1 were calculated according to the algorithm of Southwood et al. (33). Higher values predict higher binding affinity, and values of 0 predict no binding. The average relative binding affinity for all class II-associated peptides is 2.671. No other binding frames containing the peptides in Table I scored >0.
Binding scores for 15-mer peptides with position 4 corresponding to P1 in the table were obtained using the SYFPEITHI program at http://www.uni-tuebingen.de/uni/kxi/. Higher values predict higher binding affinity. The maximum possible score is 36 and the minimum is 0. No other binding frames in Table I scored >0.
In cases where the primary binding frame was not contained by all peptides in a nested set, the number in parentheses indicates the additional peptides of that set that contain the complete secondary binding frame.
It should be noted that the 16 of 128 peptides from these 9 nested sets that did not contain the complete predicted high affinity binding frame generally fell into two categories (Table I). Five were shorter than 12 residues; and therefore, could not fully contain any frame based on this number of residues. Of the remaining 11, only the 3 peptides of the HLA-B150–170 and HLA-E148–166 sets that terminated in Q could fully occupy any discernable alternate binding frame, albeit of low affinity (Table III). Regardless, these nonconforming peptides were generally among the lowest abundance species of a particular set. Thus, the predictive algorithm, together with the abundance data provided from the Fourier Transform mass spectrometry experiments, suggests that the single high-affinity binding frames in Table III represent the dominant mode of binding for the peptides in these nested sets.
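The "Peptides Containing Complete Frame" counts in Table III amount to asking how many members of a nested set contain the full 12-residue P(−2) to P10 frame. A minimal Python sketch of that check follows; the function name is hypothetical, and the example uses only four HLA-A49–74 peptides from Table I, so the resulting count is illustrative and is not the 27/36 reported for the full set.

```python
from typing import List, Tuple


def frame_coverage(peptides: List[str], frame: str) -> Tuple[int, int]:
    """Return (number of peptides containing the complete frame,
    total number of peptides in the set)."""
    hits = sum(1 for peptide in peptides if frame in peptide)
    return hits, len(peptides)


# Primary predicted frame for HLA-A49-74, P(-2) through P10 (Table III).
FRAME = "TQFVRFDSDAAS"

# Illustrative subset of the HLA-A49-74 nested set (Table I).
subset = [
    "DTQFVRFDSDAASQR",   # 54-68: contains the complete frame
    "TQFVRFDSDAASQ",     # 55-67: contains the complete frame
    "FVRFDSDAASQR",      # 57-68: lacks P(-2) and P(-1)
    "RFDSDAASQR",        # 59-68: shorter than 12 residues
]

if __name__ == "__main__":
    hits, total = frame_coverage(subset, FRAME)
    print(f"{hits}/{total} peptides contain the complete frame")
```

A simple substring test suffices here because every member of a nested set is a contiguous fragment of the same parent sequence.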
The two nested sets derived from bovine serotransferrin precursor (BSP)554–572 and human heat shock cognate 71-kDa protein (HSC71)482–499 had two frames of similar predicted affinity that were shifted by one residue with respect to one another. These two frames also accommodated the same sets of peptides. In addition, no predicted alternate binding frame can account for the two peptides of the BSP554–572 set that terminate at Q, and thus do not contain the complete primary predicted frame. In these cases, it cannot be determined what proportions of these peptides will be displayed by HLA-DR*0401 in one or both frames.
Distribution of peptide length and abundance within a nested set reveals holes in the presented peptide repertoire
With the information about likely binding frames in hand, we reexamined in more detail the abundance of individual peptides within two representative nested sets containing a minimum of 11 unique members. The 16-member HLA-B150–170 set was dominated by four peptides (154–168 (15 mer), 153–168 (16 mer), 153–169 (17 mer), and 152–169 (18 mer)) that collectively accounted for almost 80% of the total peptide abundance (Fig. 1, upper panel). In contrast, eight peptides in this set were each present at <1.5%. The distribution of peptide lengths within this set is relatively regular and continuous, and the dominant peptides were related to one another by the gain or loss of single residues at either the amino or the C termini. A small number of dominant species was also evident in the Igκ140–162 set, with the most abundant three peptides (140–158, 145–158, and 145–159) accounting for 64% of the total peptide abundance (Fig. 1, lower panel). However, the 140–158 species was not related to the other dominant peptides by the gain or loss of a single residue at the amino terminal end, but instead involved a difference of five residues. More generally, Igκ-derived peptides beginning at residues 140 or 145 were relatively frequent, while those beginning at 141–144 were much less abundant. At the extreme, no Igκ peptides beginning at R142 were found, leading to a distribution “gap” and two “clusters” of peptides based on their amino termini. It is unlikely that an alternate binding frame could account for this discontinuous distribution, since only a single binding frame of any significance was identified (Table III). Thus, these features are most likely due to the specificity of the proteolytic enzymes that produce these peptides during Ag processing.
Abundance of individual peptides in two nested sets associated with HLA-DR*0401. Peptides are identified based on their amino and C-terminal residues as indicated on the X and Y axes, respectively. Abundance data is taken from Table I. Abundance normalized length was calculated by multiplying the length of each sequence in a nested set by its abundance in Table I, summing these values, and dividing by the total abundance of all members of the set.
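The abundance normalized length defined in the legend is an abundance-weighted mean of peptide length. As an illustration only, the sketch below applies that definition to the five-member HLA-C150–169 set using the values listed in Table I; the function name `abundance_normalized_length` and the tuple layout are assumptions made for this example, and peptide length is taken as end minus start plus one.

```python
def abundance_normalized_length(peptides):
    """Abundance-weighted mean length of a nested set.

    `peptides` is a list of (start_residue, end_residue, abundance) tuples.
    Each length is multiplied by its abundance, the products are summed,
    and the sum is divided by the total abundance of the set.
    """
    total = sum(abundance for _, _, abundance in peptides)
    weighted = sum((end - start + 1) * abundance for start, end, abundance in peptides)
    return weighted / total


# HLA-C 150-169 nested set, transcribed from Table I (abundance in units of 10^7).
hla_c_150_169 = [
    (153, 168, 97),
    (153, 169, 39),
    (151, 168, 13),
    (152, 169, 2),
    (150, 169, 0.6),
]

anl = abundance_normalized_length(hla_c_150_169)
print(f"abundance normalized length: {anl:.2f} residues")
```

For this set the weighted mean is roughly 16.5 residues, close to the length of the dominant 16 mer because that single species carries most of the abundance.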
Summed abundance of peptides with common amino and C termini provides evidence for the action of specific proteases
The data of Fig. 1, taken together with the differences in dominant and mean peptide lengths among the nested sets and the predicted binding motif information, suggested the possibility that the proteases responsible for the generation of the peptides associated with HLA-DR4 preferentially cleave at certain residues. Alternatively, the HLA-DR4 binding site might sterically inhibit access of a nonspecific exoprotease or endoprotease to the ends of an already bound peptide, even though those ends might not be immobilized in the site. To gain more information about these possibilities, we summed the abundances of all peptides in each of the nested sets that either began or ended at a particular residue. For example, 5 of the 18 peptides in the Igκ nested set ended with N158, and their summed abundance is 59.5 × 10^7 (Table I). Similarly, six peptides from this set begin with K145, and their summed relative abundance is 73.1 × 10^7. By combining these data with the predicted binding frames in Table II, we examined whether the distribution of peptide ends was related simply to binding site proximity, or suggested specificity in proteolysis (Fig. 2).
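The summation described above amounts to grouping the members of a nested set by their first or last residue and adding up their abundances. A minimal Python sketch, assuming the Igκ140–162 values as transcribed from Table I, is given here; the function name `sum_by_terminus` and the tuple layout are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict

# Igk 140-162 nested set as (start_residue, end_residue, abundance) tuples,
# transcribed from Table I (abundance in units of 10^7).
igk_140_162 = [
    (145, 159, 37), (145, 158, 31), (140, 158, 14), (144, 158, 7.2),
    (140, 159, 6.3), (145, 160, 5.3), (141, 158, 4.4), (144, 159, 4.3),
    (145, 161, 3.8), (143, 158, 2.9), (140, 160, 2.4), (143, 159, 2.0),
    (143, 160, 2.0), (144, 161, 2.0), (145, 162, 1.6), (144, 162, 1.2),
    (144, 160, 1.1), (145, 157, 0.4),
]


def sum_by_terminus(peptides):
    """Return two dicts mapping residue number to summed abundance:
    one keyed by amino-terminal residue, one keyed by C-terminal residue."""
    by_start = defaultdict(float)
    by_end = defaultdict(float)
    for start, end, abundance in peptides:
        by_start[start] += abundance
        by_end[end] += abundance
    return dict(by_start), dict(by_end)


by_start, by_end = sum_by_terminus(igk_140_162)
n_end_158 = sum(1 for _, end, _ in igk_140_162 if end == 158)
print(n_end_158, round(by_end[158], 1))  # peptides ending at N158 and their summed abundance
print(sorted(by_start))                  # amino-terminal residues observed in the set
```

Run on these values, the sketch reports five peptides ending at N158 with a summed abundance of 59.5, and the list of observed amino termini contains no entry for residue 142, which corresponds to the gap noted for the Igκ set.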
Summed abundance of peptides with common amino and C termini. For each nested set, the abundance values of all peptides initiating or terminating at a specific position were summed and plotted with respect to the primary predicted binding frame identified in Table III (indicated by shaded area). The secondary predicted binding frame is indicated by a filled horizontal bar only in those instances where that frame can account for the binding of additional peptides in the nested set, as shown in Table III.
We first examined the relationship of dominant C termini to one another and to the P(-2) through P10 residues predicted to interact with the HLA-DR4 binding site. Only in the case of the Igκ140–162 set was there significant representation of peptides that terminated at P11; the most abundant C-terminal residues for the remaining 10 sets were usually at the P12 or P13 positions. While the paucity of peptides terminating at P11 could be due to steric inhibition of protease access, we also observed that certain residues were preferred cleavage sites although they were more distant from the apparent end of the binding site. Thus, peptides of the HLA-C52–74, HLA-B51–72, and HLA-A49–74 sets terminated at R at the P12 or P16 positions with a similar frequency, and peptides terminating at one of these two residues were more abundant than those terminating at intervening residues. R was also found as a strong secondary C terminus at the P13 position for peptides of the HLA-C150–169 and HLA-B150–170 sets. Similarly, Q was the preferred C-terminal residue for the HLA-C150–169, HLA-B150–170, and HLA-E148–166 sets. While Q occupies the P12 position in the first two sets, it occupies the P13 position in the latter, although the sequences of the binding frames for these nested sets are otherwise highly homologous. These preferred cleavage sites were not explainable as steric hindrance due to the use of a secondary binding frame, since these frames all would have protected amino terminal, rather than C-terminal residues. The enhanced representation of certain C-terminal residues in multiple sequences at different positions is consistent with the hypothesis that they are due to selective protease activity rather than simply to steric constraint imposed by proximity to the peptide binding site.
In contrast to the C termini, dominant amino termini were often found immediately adjacent to the peptide binding site at the P(-3) position. In two cases (human α-soluble NSF attachment protein (NSFA)148–163 and HSC71482–499), the dominant terminal residues were at the P(-2) position. In neither case could this observation be accounted for by a secondary binding frame. It is also more difficult to discern evidence for the specificity of proteases that produce the amino termini of the remaining nested sets. These observations are consistent with the idea that steric constraints on protease activity are less severe at this end of the peptide binding site, or that substantially complete processing of peptide amino termini occurs before stable HLA-DR4 binding.
Discussion
In this study, we used quadrupole ion trap mass spectrometry technology to identify >700 candidate peptides associated with HLA-DR*0401. Of these sequences, 142 were contained within 11 nested sets. The sequences of these peptides were manually confirmed and then quantitated using Fourier transform mass spectrometry. This quantitative information concerning constituent peptide species of nested sets provides several advantages: 1) nested sets derive from a common protein source, thereby eliminating any skew introduced by differences in protein expression; 2) any observed differences in peptide distribution must result from proteolytic processing or binding to the class II MHC molecule; and 3) comparison of multiple nested sets allows us to define general trends in peptide processing.
Based on hydrogen bond interactions and visibility in the x-ray crystallographic structure of HLA-DR*0401 (DRA*0101, DRB1*0401) associated with the collagen II peptide 1168–1180 (17), as well as other crystal structures of HLA-DR molecules (15), only 12 peptide residues interact with the peptide binding site of this molecule. Furthermore, studies using sequentially smaller peptides have shown that 11 mers bind to HLA-DR4 with affinities comparable to those of 13 mers (29). Our work confirms on a large scale that the average naturally occurring class II-associated peptide is four to seven residues longer than necessary for maximal hydrogen bonding between the peptide and the class II molecule. However, perhaps most surprising was that 12 mers corresponding to this “ideal” binding site were not observed among the peptides of most nested sets, or if observed, were of low relative abundance. Although it is formally possible that these additional residues contribute to the affinity of these particular peptides, these observations suggest that most peptides interact initially with the class II binding site in the context of larger precursors, which are then proteolyzed further at both amino and C-terminal ends. The additional residues beyond the minimum necessary for binding site occupancy are either protected from proteolytic removal after the peptide is bound to the class II molecule or are a reflection of the nonrandom cleavage of peptides by endosomal proteases. However, in a small number of cases, a fraction of the peptides from individual nested sets terminate at residues that lie within the peptide binding site. Such peptide termini likely represent dominant protease cleavage sites that are used early in the degradation process.
Several observations are consistent with the idea that the final form of peptides displayed by HLA-DR*0401 reflects the action of proteases with preferences for certain amino acid side chains, rather than simply amino and carboxyl peptidase digestion of exposed ends until steric hindrance prevents further action. First, the average peptide lengths and length of the most dominant species differ among nested sets. Second, when individual species were aligned, some sets showed evidence of gaps or low frequency of occurrence of particular amino or C termini. Third, preferential occurrence of R and Q at the C terminus was observed when different nested sets were compared. This preference was only partly dependent on the absolute position relative to the peptide binding site. The specificities of proteases that have been localized to endosomes are rather broad (4), and so it may be somewhat surprising that any evidence of selectivity in proteolysis was observed in this work. However, with the exception of a recently described asparaginyl endopeptidase (14), none of these proteolytic activities have been directly implicated in the generation of class II-associated peptides other than those derived from the invariant chain. Our work may offer useful insights into the characteristics of proteases that are important in this process.
Previous work had established that nested sets were a typical feature of peptides associated with class II MHC molecules (1, 18, 19, 20, 21, 22). Indeed, 4 of the 11 sets described in the present work were previously identified with a more limited number of members (1). However, we have shown in this study that while they tend to be dominated by a relatively small number of species, many such sets are quite complex. Previous studies have demonstrated that T cells may differentially recognize different members of such sets. For example, the crystal structure of hen egg lysozyme50–62 restricted by I-Ak showed that residues 61 and 62 had very weak electron density, consistent with their being considerably solvent exposed (30). However, two-thirds of the T cells that respond to this epitope recognize the peptide 48–63, but not the minimal 48–61 peptide (31). Therefore, nested sets may add a level of antigenic diversity that increases immunogenicity. It will be interesting to determine how different members of these nested sets are viewed by the immune system.
Continued
| Nested Set | Residues | Peptide Sequence | Abundance (× 10^−7)^b |
|---|---|---|---|
| | 51–68 | YVDDTQFVRFDSDAASPR | 2.9^e |
| | 58–68 | VRFDSDAASPR | 2.1^e |
| | 57–67 | FVRFDSDAASP | 1.4^e |
| | 56–68 | QFVRFDSDAASPR | 0.6^e |
| HLA-B150–170 | 154–168 | LSSWTAADTAAQITQ | 65 |
| | 153–168 | DLSSWTAADTAAQITQ | 34 |
| | 153–169 | DLSSWTAADTAAQITQR | 32 |
| | 152–169 | EDLSSWTAADTAAQITQR | 23 |
| | 154–167 | LSSWTAADTAAQIT | 11 |
| | 154–169 | LSSWTAADTAAQITQR | 11 |
| | 152–168 | EDLSSWTAADTAAQITQ | 6.9 |
| | 153–170 | DLSSWTAADTAAQITQRK | 5.4 |
| | 153–167 | DLSSWTAADTAAQIT | 2.7 |
| | 152–170 | EDLSSWTAADTAAQITQRK | 2.2 |
| | 151–168 | NEDLSSWTAADTAAQITQ | 1.3 |
| | 152–167 | EDLSSWTAADTAAQIT | 1 |
| | 155–168 | SSWTAADTAAQITQ | 0.8 |
| | 150–168 | LNEDLSSWTAADTAAQITQ | 0.2 |
| | 152–165 | EDLSSWTAADTAAQ | 0.2 |
| | 153–165 | DLSSWTAADTAAQ | 0.2 |
| HLA-C150–169 | 153–168 | DLRSWTAADTAAQITQ | 97 |
| | 153–169 | DLRSWTAADTAAQITQR | 39 |
| | 151–168 | NEDLRSWTAADTAAQITQ | 13 |
| | 152–169 | EDLRSWTAADTAAQITQR | 2 |
| | 150–169 | LNEDLRSWTAADTAAQITQR | 0.6 |
| HLA-E148–167 | 148–166 | NEDLRSWTAVDTAAQISEQ | 7.4 |
| | 150–166 | DLRSWTAVDTAAQISEQ | 6 |
| | 150–167 | DLRSWTAVDTAAQISEQK | 4 |
| | 150–162 | DLRSWTAVDTAAQ | 3.2 |
| | 151–165 | LRSWTAVDTAAQISE | 2.6 |
| | 149–167 | EDLRSWTAVDTAAQISEQK | 2.2 |
| | 150–165 | DLRSWTAVDTAAQISE | 1.6 |
| | 149–166 | EDLRSWTAVDTAAQISEQ | 1.1 |
| | 150–164 | DLRSWTAVDTAAQIS | 1.1 |
| | 148–167 | NEDLRSWTAVDTAAQISEQK | 0.7 |
| | 149–165 | EDLRSWTAVDTAAQISE | 0.1 |
| Igκ140–162 | 145–159 | KVQWKVDNALQSGNS | 37 |
| | 145–158 | KVQWKVDNALQSGN | 31 |
| | 140–158 | YPREAKVQWKVDNALQSGN | 14 |
| | 144–158 | AKVQWKVDNALQSGN | 7.2 |
| | 140–159 | YPREAKVQWKVDNALQSGNS | 6.3 |
| | 145–160 | KVQWKVDNALQSGNSQ | 5.3 |
| | 141–158 | PREAKVQWKVDNALQSGN | 4.4 |
| | 144–159 | AKVQWKVDNALQSGNS | 4.3 |
| | 145–161 | KVQWKVDNALQSGNSQE | 3.8 |
| | 143–158 | EAKVQWKVDNALQSGN | 2.9 |
| | 140–160 | YPREAKVQWKVDNALQSGNSQ | 2.4 |
| | 143–159 | EAKVQWKVDNALQSGNS | 2 |
| | 143–160 | EAKVQWKVDNALQSGNSQ | 2 |
| | 144–161 | AKVQWKVDNALQSGNSQE | 2 |
| | 145–162 | KVQWKVDNALQSGNSQES | 1.6 |
| | 144–162 | AKVQWKVDNALQSGNSQES | 1.2 |
| | 144–160 | AKVQWKVDNALQSGNSQ | 1.1 |
| | 145–157 | KVQWKVDNALQSG | 0.4 |
| BSP554–572 | 555–569 | DVAFVKDQTVIQNTD | 41 |
| | 554–569 | GDVAFVKDQTVIQNTD | 16 |
| | 555–570 | DVAFVKDQTVIQNTDG | 16 |
| | 555–567 | DVAFVKDQTVIQN | 9.7 |
| | 554–566 | GDVAFVKDQTVIQ | 6 |
| | 554–567 | GDVAFVKDQTVIQN | 5.6 |
| | 555–571 | DVAFVKDQTVIQNTDGN | 5.5 |
| | 555–566 | DVAFVKDQTVIQ | 4.6 |
| | 554–570 | GDVAFVKDQTVIQNTDG | 3.8 |
| | 554–571 | GDVAFVKDQTVIQNTDGN | 3.7 |
| | 554–572 | GDVAFVKDQTVIQNTDGNN | 0.4 |
| NSFA148–163 | 148–163 | SADYYKGEESNSSANK | 0.7 |
| | 150–163 | DYYKGEESNSSANK | 0.7 |
| | 149–163 | ADYYKGEESNSSANK | 0.6 |

(Table continues)
Continued
| Nested Set | Residues | Peptide Sequence | Abundance (× 10^−7)^b |
|---|---|---|---|
| | 150–162 | DYYKGEESNSSAN | 0.2 |
| LAMP234–252 | 236–251 | LPSYEEALSLPSKTPE | 27 |
| | 236–252 | LPSYEEALSLPSKTPEG | 22 |
| | 236–250 | LPSYEEALSLPSKTP | 7.1 |
| | 234–251 | VVLPSYEEALSLPSKTPE | 3.4 |
| | 234–252 | VVLPSYEEALSLPSKTPEG | 3.3 |
| | 235–252 | VLPSYEEALSLPSKTPEG | 0.3 |
| HSC71482–499 | 484–498 | GILNVSAVDKSTGK | 8.1 |
| | 484–499 | GILNVSAVDKSTGKE | 7.1 |
| | 482–499 | ANGILNVSAVDKSTGKE | 0.3 |
a. Peptides were identified in HLA-DRB*0401 extracts from MS/MS spectra that were recorded using an ion trap mass spectrometer and searched against a sequence database using the SEQUEST algorithm.
b. Peptide abundance is estimated based on the observed ion current.
c. Sequences in bold do not contain the primary predicted binding frame identified in Table II.
d. Sequences in italics are shorter than 12 residues.
e. Because residues 51–68 of HLA-B and HLA-C are identical, the exact contribution of these two sources to peptides contained within this sequence could not be directly determined. Therefore, the total ion abundances for all HLA-B and all HLA-C peptides that terminated at residue 72 were calculated, and their ratio was then used to calculate the amounts of each common peptide that would arise from each molecule. Peptides terminating at residue R72 were chosen because no peptides for HLA-C were detected with residue G69 as a terminus. Therefore, we could not sum the abundances for residues 69–72 because the abundance of peptides terminating at residue 69 is not 0 but below detection threshold. In addition, residues 69 and 70 are unique to each molecule. If site-specific endoproteases do generate these peptides, then any difference in amino acid could account for changes in abundance.
Footnotes
This work was supported by U.S. Public Health Service Grants AI20963 (to V.H.E.), AI33993 (to D.F.H.), and AI45199 (to V.H.E. and D.F.H.). C.J.L. was supported by Public Health Service Medical Scientist Training Program Grant (GM07267).
Abbreviations used in this paper: MS/MS, tandem mass spectrometry; Igκ, human Ig κ-chain; BSP, bovine serotransferrin precursor; NSFA, human α-soluble NSF attachment protein; LAMP, human lysosomal-associated multitransmembrane protein; HSC71, human heat shock cognate 71-kDa protein.