Celiac disease is an enteropathy caused by intolerance to dietary gluten. The disorder is strongly associated with DQA1*0501/DQB1*0201 (HLA-DQ2), as ∼95% of celiac patients express this molecule. HLA-DQ2 has unique Ag-binding properties that allow it to present a diverse set of gluten peptides to gluten-reactive CD4+ T cells, thereby instigating an inflammatory reaction. Previous work has indicated that the presence of negatively charged amino acids within gluten peptides is required for specific binding. This, however, only partly explains the scale of the interaction. We have now characterized 432 natural ligands of HLA-DQ2 representing length variants of 155 distinct sequences. The sequences were aligned and the binding cores were inferred. Analysis of the amino acid distribution of these cores demonstrated that negatively charged residues in HLA-DQ2-bound peptides are favored at virtually all positions. This contrasts with a more restricted presence of such amino acids in T cell epitopes from gluten. Yet, HLA-DQ2 was also found to display a strong preference for proline at several anchor and nonanchor positions that largely match the position of proline in gluten T cell epitopes. Consequently, the bias for proline at p6 and p8 facilitates the enzymatic conversion of glutamine into glutamic acid in gluten peptides at p4 and p6, two important anchor sites. These observations provide new insights into the unique ability of HLA-DQ2 to bind a large repertoire of glutamine- and proline-rich gluten peptides. This knowledge may be an important asset in the development of future treatment strategies.

Referred to as HLA-DQ2, DQA1*0501/DQB1*0201 is the strongest genetic factor predisposing to celiac disease (1). HLA-DQ2 dimers were shown to present gluten-derived peptides to gluten-specific CD4+ T cells in the intestinal lamina propria, thereby inducing and sustaining strong inflammatory responses (2, 3). It is well-established that HLA-DQ2 can bind and present a relatively large and diverse repertoire of gluten-derived peptides. The association of HLA-DQ2 with celiac disease is extraordinarily strong: ∼95% of celiac disease patients express HLA-DQ2, whereas in the general population, the frequency of this allele is between 20 and 30% (4). Virtually all HLA-DQ2-negative patients are HLA-DQ8 positive and it has been described that several gluten-derived peptides can bind to HLA-DQ8 and elicit T cell responses (5, 6). Together, these data indicate that the ability to present gluten-derived peptides is a unique feature of HLA-DQ2 and HLA-DQ8 molecules. A binding motif for HLA-DQ2 has previously been elucidated and indicates a preference for negatively charged amino acids at the p4, p6, and p7 positions and bulky hydrophobic amino acids at the p1 and p9 positions in bound peptides (7, 8). Gluten, however, hardly contains negatively charged amino acids. This discrepancy was solved when it became clear that the glutamine- and proline-rich gluten molecules are substrates for the enzyme tissue transglutaminase. This enzyme can convert glutamine into glutamic acid, thereby introducing the negative charges required for binding to HLA-DQ2 (2, 3). A preference for hydrophobic and negatively charged anchor residues, however, is not unique to HLA-DQ2. HLA-DR3, for example, is known to preferentially bind peptides with an isoleucine and glutamic acid at the p1 and p4 positions, respectively. We therefore sought to further investigate the peptide-binding properties of HLA-DQ2 to understand its capacity to bind and present a large repertoire of gluten-derived peptides. 
To this end, we have now characterized a large number of peptides that are naturally bound to HLA-DQ2. Alignment and analysis of the binding cores of those peptides provided unexpected insight into the peptide-binding characteristics of HLA-DQ2, clarifying why this molecule is so uniquely suited to bind gluten peptides.

Human HLA-DQ2 dimers were isolated from a DQA1*0501/DQB1*0201 homozygous EBV-transformed B lymphoblastoid cell line generated from B cells of a healthy individual. Approximately 10^10 cells were grown in RPMI 1640 medium supplemented with L-glutamine and 10% FCS. Subsequently, the cells were harvested, washed with PBS, and the cell pellet was stored at −80°C. The cells were lysed with 50 ml of lysis buffer (50 mM Tris, 150 mM NaCl, 5 mM EDTA, 0.5% Nonidet P-40, 10 mM iodoacetamide, and protease inhibitor mix (Complete; Roche)). To remove the nuclei and insoluble material, the lysate was centrifuged for 60 min at 10,000 × g. It was subsequently precleared for 60 min with Sepharose beads and mixed with 7 ml of Sepharose beads onto which 16 mg of the HLA-DQ-specific mAb SPV-L3 (9) was covalently linked. After 60 min of gentle mixing, the beads were washed successively with 10 bead volumes of lysis buffer; 20 mM Tris-HCl, 120 mM NaCl (pH 8.0); 20 mM Tris-HCl, 1 M NaCl (pH 8.0); 20 mM Tris-HCl (pH 8.0); and finally 10 mM Tris-HCl (pH 8.0). Subsequently, the HLA-peptide complexes were eluted with 30 ml of 10% acetic acid in water. All purification steps were performed at 4°C. High molecular mass material (HLA molecules) was removed by filtration through Centriprep filtration units with a cutoff value of 10 kDa and analyzed by SDS-PAGE. The peptide fraction was freeze-dried, dissolved in 400 μl of 10% acetic acid in water, and fractionated with an HPLC system (micro Smart; Pharmacia) equipped with a reverse-phase C2/C18 sc 2.1/10 column (Pharmacia). The material was eluted using a gradient of 0–100% acetonitrile (0.25% per minute) supplemented with 0.1% trifluoroacetic acid.

Peptides present in the collected fractions were subsequently sequenced by tandem mass spectrometry. To this end, half of the peptide fractions from the Smart system were freeze-dried and resuspended in 95/3/0.1 v/v/v water/acetonitrile/formic acid (solvent A). These resuspended fractions were analyzed by an online nano-HPLC-MS system. The nano-HPLC system consisted of a conventional gradient HPLC system (Agilent 1100), the flow of which was reduced to 300 nl/min by an in-house constructed splitter. Two-microliter injections were done onto a precolumn (10 mm × 100 μm; AQUA-C18, 5-μm particle size (Phenomenex)) and eluted via an analytical nano-HPLC column (15 cm × 75 μm; AQUA-C18, 5-μm particle size). HPLC columns were packed in-house. The gradient was run from 0 to 50% solvent B (10/90/0.1 v/v/v water/acetonitrile/formic acid) in 90 min. The nano-HPLC column was connected to the needle of the electrospray source of the mass spectrometer. The mass spectrometer was an HCTplus (Bruker Daltonics), which was run in data-dependent MS/MS mode during peptide elution. All tandem mass spectra produced in this way were searched against the human IPI database with the database search program MASCOT version 2.1.0 (Matrixscience). All reported hits were assessed manually, and peptides with MASCOT scores <40 were usually discarded.

The high molecular mass fraction of the immunoprecipitate was analyzed by standard 12% PAGE. To this end, the fraction was freeze-dried, dissolved in 1 M Tris-HCl buffer (pH 7.5), diluted with 4× protein sample buffer (60% glycerol, 300 mM Tris (pH 6.8), 12 mM EDTA (pH 8.0), 12% SDS, 864 mM 2-ME, 0.05% bromphenol blue) and run on a 12% SDS-PAGE gel. The proteins were visualized using Imperial Protein Stain (Pierce). Selected bands were cut out of the gel, extracted, and digested with trypsin according to Shevchenko et al. (10). The tryptic fragments were identified by online nano-HPLC-MS on an HCTplus as described above.

A tailored database of all the known protein sequences of gluten and gluten-like proteins (gliadins, glutenins, hordeins, secalins, and avenins), referred to as a gluteome, was built by extraction of relevant entries from the UniprotKB database at the European Bioinformatics Institute (EBI; www.ebi.ac.uk) with the SRS system with the following search terms: organism: Triticum aestivum and (description: gliadin or glutenin) for wheat, Hordeum vulgare and (description: hordein) for barley, Secale cereale and (description: secalin) for rye, and Avena sativa and (description: avenin) for oats.
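
The selection logic behind the gluteome can be illustrated as a simple filter over database records. The sketch below is not the SRS query itself: the record fields (`organism`, `description`, `sequence`), the helper `build_gluteome`, and the toy entries are hypothetical, and only the organism and description terms are taken from the search terms listed above.

```python
# Organism -> description keywords, as listed in the search terms above.
GLUTEOME_QUERIES = {
    "Triticum aestivum": ("gliadin", "glutenin"),  # wheat
    "Hordeum vulgare": ("hordein",),               # barley
    "Secale cereale": ("secalin",),                # rye
    "Avena sativa": ("avenin",),                   # oats
}


def build_gluteome(records):
    """Keep records whose organism is listed and whose description
    contains one of the keywords for that organism (case-insensitive)."""
    selected = []
    for rec in records:
        keywords = GLUTEOME_QUERIES.get(rec["organism"])
        if keywords is None:
            continue  # organism is not one of the four cereals
        description = rec["description"].lower()
        if any(keyword in description for keyword in keywords):
            selected.append(rec)
    return selected


# Toy records (hypothetical; sequences are placeholders, not real entries).
toy_records = [
    {"organism": "Triticum aestivum", "description": "Alpha/beta-gliadin", "sequence": "SEQ1"},
    {"organism": "Triticum aestivum", "description": "Alpha-amylase inhibitor", "sequence": "SEQ2"},
    {"organism": "Hordeum vulgare", "description": "B1-hordein", "sequence": "SEQ3"},
    {"organism": "Homo sapiens", "description": "Gliadin-like toy entry", "sequence": "SEQ4"},
    {"organism": "Avena sativa", "description": "Avenin-3", "sequence": "SEQ5"},
]

gluteome = build_gluteome(toy_records)
print([rec["description"] for rec in gluteome])
```

The organism check comes first so that a matching keyword in an unrelated species does not enter the database.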

The human IPI database was downloaded from the EBI (www.ebi.ac.uk/IPI/IPIhelp.html). The set of the protein sequences contained in the database is referred to as the human proteome.

Peptides were synthesized by standard Fmoc chemistry on a multiple peptide synthesizer (Syro II). The integrity of synthetic peptides was checked by HPLC and MS.

Ninety-six-well FluoroNunc plates were coated with the HLA-DQ-specific mAb SPV-L3, 2 μg/well in 100 μl of carbonate buffer (50 mM Na2CO3, 50 mM NaHCO3 (pH 9.6)) for 2 h at 37°C, and subsequently blocked for 3 h at 37°C with 2% FCS in PBS. EBV-transformed B cells homozygous for DQA1*0501/DQB1*0201 were grown and a sample was checked for proper HLA-DQ2 expression by FACS analysis using the SPV-L3 mAb. Subsequently, the remainder of the cells were lysed in 20 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 1% Nonidet P-40, and protease inhibitor mix (Complete; Roche) at 4°C, at a concentration of 4 × 10^6 cells/ml of lysis buffer. Subsequently, nuclei and cell debris were removed by centrifugation (4°C, 2000 × g, 15 min). Next, the lysate was mixed with an equal volume of ice-cold 1% solution of BSA in PBS and 100-μl aliquots were pipetted into the SPV-L3-coated wells. After overnight incubation at 4°C, the plates were washed and 50 μl of binding buffer (0.1% Nonidet P-40, 0.1% Tween 20, 33.6 mM citric acid, 72 mM Na2HPO4 (pH 5.5), and Complete protease inhibitor mix) was added to each well. Titration ranges of peptides to be tested (concentration range 300–0.2 μM) were prepared in 10% DMSO containing a fixed amount of the biotin-labeled indicator peptides Glt-156 or MHC class Iα (MHC Iα) (46–63) at concentrations of 2.5 and 0.6 μM, respectively. Subsequently, 50 μl of the samples was applied to the SPV-L3/HLA-DQ2-coated plates. Following a 48-h incubation at 37°C, each well was washed extensively. Subsequently, 100 μl of 100 nM streptavidin-europium in assay buffer (both obtained from Wallac) was added and incubated for 45 min at room temperature. After extensive washing, 150 μl/well of enhancement solution (Wallac) was applied and the plates were read in a time-resolved fluorometer (1234; Wallac) 15–30 min thereafter.
IC50 values were calculated based on the observed competition between the test peptides and biotin-labeled indicator peptides and indicate the concentration of the tested peptide required for half-maximal inhibition of the binding of the indicator peptide.
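
The fitting procedure used to derive IC50 values is not specified here. One simple possibility, shown only as a sketch, is log-linear interpolation between the two titration points that bracket 50% inhibition. The function names `percent_inhibition` and `estimate_ic50` and the example numbers are hypothetical and are not measured data.

```python
import math


def percent_inhibition(signal, signal_max, signal_min=0.0):
    """Inhibition of indicator-peptide binding relative to the
    uninhibited signal (signal_max) and the background (signal_min)."""
    return 100.0 * (signal_max - signal) / (signal_max - signal_min)


def estimate_ic50(concentrations, inhibitions):
    """Interpolate on a log10 concentration axis to find 50% inhibition.

    concentrations: ascending, positive test-peptide concentrations (e.g. in uM)
    inhibitions:    matching percent-inhibition values
    Returns None when 50% is not bracketed by the titration range.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical titration: concentrations in uM and percent inhibition.
example_concentrations = [1.0, 100.0]
example_inhibitions = [0.0, 100.0]
print(estimate_ic50(example_concentrations, example_inhibitions))
```

Interpolating on a logarithmic axis matches the way a titration series is spaced; a four-parameter logistic fit would be a more rigorous alternative when enough points are available.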

Molecular simulation of the structures of complexes between HLA-DQ2 molecules and selected peptides was performed as previously described (11). In brief, a hybrid HLA-DQ2 molecule was constructed containing the αβ-chains from the crystal structure of HLA-DQ2 (12) and the peptide backbone coordinates from HLA-DQ8 (13). This step was necessary because the antigenic peptide in the HLA-DQ2 crystal structure has its p9 pocket (the most spacious) unoccupied and its p9Tyr residue occupies the so-called p10 pocket (shelf) (14). Previous binding experiments indicated that this pocket is very spacious, accommodating aromatic and large aliphatic amino acids and secondarily acidic ones. This selectivity would not be exhibited by a residue occupying the p10 pocket. The peptides used were 13-mer with their nonamer core as obtained from the alignment of the nested peptides from the same family. The rotamer chosen for each peptide residue position was derived from a library of rotamers, so that there would be no steric clashes. In case a glutamate residue was to occupy pocket 6, we chose to have a rotamer nearly identical with that occupying pocket 6 in the crystal structure of the complex of DQ2 and the gliadin peptide (12). The molecular simulation was conducted by the program Discover of Accelrys with 1000 steps of the steepest gradient method followed by 1000 steps of the conjugate gradient method (15, 16). Numbering of the HLA-DQαβ residues follows the scheme suggested by Fremont et al. (17), as this provides structural equivalence for residues from different MHC class II (MHC II) gene loci independent of animal species. Graphical representations of the results were performed with the program WebLabViewer (version 3.5; Accelrys).

Human HLA-DQ2 dimers were purified from HLA-homozygous EBV-transformed B cells and the HLA-DQ2-bound peptides were eluted with acid (see Materials and Methods). Subsequently, filtration was used to separate peptides from the protein fraction. Analysis of the high molecular mass material by SDS-PAGE revealed the presence of two bands corresponding to the predicted molecular mass of HLA-DQ2 α- and β-chains (respectively, 33 and 28 kDa). To confirm the identity of these proteins, we performed in-gel trypsin digestion of these two proteins and sequenced the generated fragments by MS. Several protein fragments corresponding to sequences present in the HLA-DQ2 α- and β-chain were identified, demonstrating that HLA-DQ2 was successfully purified (data not shown). Next, the low molecular mass material that was eluted from the HLA-DQ2 was fractionated by reverse-phase HPLC and the peptides present in the obtained fractions were sequenced by tandem MS. This resulted in the identification of 432 peptides that represent length variants of 155 unique peptides.
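
How the 432 peptides were collapsed into 155 unique sequences is not detailed in the text. A minimal sketch of one possible grouping rule follows: two peptides are treated as length variants when they can be laid over each other without mismatches across at least nine residues. The helper names and the truncated example peptides are hypothetical; only the full-length sequences are taken from Table I. Low-complexity stretches (for example, runs of glutamate) could be merged spuriously under this rule, so mapping each peptide to its source protein would be more robust.

```python
def consistent_overlap(a, b, min_overlap=9):
    """True if a and b can be superimposed, without mismatches,
    so that they share at least min_overlap residues."""
    # offset = position of b[0] relative to a[0]
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        start = max(0, offset)
        end = min(len(a), offset + len(b))
        if end - start >= min_overlap and a[start:end] == b[start - offset:end - offset]:
            return True
    return False


def group_length_variants(peptides, min_overlap=9):
    """Cluster peptides into families of length variants (union-find)."""
    unique = list(dict.fromkeys(peptides))  # drop exact duplicates, keep order
    parent = list(range(len(unique)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(unique)):
        for j in range(i + 1, len(unique)):
            if consistent_overlap(unique[i], unique[j], min_overlap):
                parent[find(j)] = find(i)

    families = {}
    for i, peptide in enumerate(unique):
        families.setdefault(find(i), []).append(peptide)
    return list(families.values())


# Two Table I sequences with hypothetical truncations, plus a homolog.
example_peptides = [
    "APWIEQEGPEYWDQ", "WIEQEGPEYWD",   # same family
    "APWVEQEGPEYWDR",                  # homolog (I -> V), separate family
    "IEIIPIQEEEEE", "IIPIQEEEEE",      # same family
]
for family in group_length_variants(example_peptides):
    print(family)
```

Requiring a mismatch-free overlap keeps the homologous I/V variants apart even though they share the 9-residue window EQEGPEYWD.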

In agreement with previous studies, the most abundant peptide identified was the HLA-class I-derived peptide IEQEGPEYW. This sequence was represented by 28 length variants and 7 length variants of the homologous sequence VEQEGPEYW (Fig. 1 A). Alignment of the length variants shows that the I/V-EQEGPEYW is the only 9-aa sequence common to all length variants (Fig. 1 A), indicating that this is the binding core. This is in line with previous studies (7, 8). In a similar way, the alignment of the length variants of a CD20-derived peptide implies that the sequence IPIQEEEEE represents the binding core (Fig. 1 B). Using this method, the core sequences could be determined for several other peptides. In most cases, however, fewer length variants were identified so that the minimal cores could not be determined this way. The putative minimal cores of those peptides were deduced with the aid of the previously identified HLA-DQ2-specific peptide-binding motif. This motif incorporates large hydrophobic amino acids at position p1 and/or p9 and negatively charged amino acids at p4, p6, and/or p7. Through this procedure, a single minimal core could be identified for 124 of 155 peptides (Table I). For the remainder of the peptides, either multiple binding frames were possible or a binding core could not be identified (data not shown).
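
The two steps described above can be sketched as follows: first, find the 9-aa windows shared by all length variants; second, when too few variants are available, rank the possible frames of a single peptide against the binding motif. This is an illustration under assumptions and not the procedure actually used: the residue set taken to represent "large hydrophobic" amino acids, the equal weighting of anchors, the function names, and the truncated example variants are all hypothetical.

```python
HYDROPHOBIC = set("FILMVWY")  # assumed definition of "large hydrophobic"
ACIDIC = set("DE")            # negatively charged residues


def shared_cores(variants, core_length=9):
    """Return the core_length-mers present in every length variant."""
    def windows(seq):
        return {seq[i:i + core_length] for i in range(len(seq) - core_length + 1)}

    common = windows(variants[0])
    for seq in variants[1:]:
        common &= windows(seq)
    return sorted(common)


def motif_score(core):
    """Count motif anchors satisfied by a 9-mer (0 to 5)."""
    score = sum(core[i] in HYDROPHOBIC for i in (0, 8))  # p1 and p9
    score += sum(core[i] in ACIDIC for i in (3, 5, 6))   # p4, p6, and p7
    return score


def candidate_frames(peptide, core_length=9):
    """Return the frame(s) with the highest motif score; more than one
    frame in the result means the core is ambiguous under this scoring."""
    frames = [peptide[i:i + core_length] for i in range(len(peptide) - core_length + 1)]
    if not frames:
        return []
    best = max(motif_score(frame) for frame in frames)
    return [frame for frame in frames if motif_score(frame) == best]


# Table I sequence plus two hypothetical truncations of it.
variants = ["APWIEQEGPEYWDQ", "WIEQEGPEYW", "IEQEGPEYWDQ"]
print(shared_cores(variants))
print(candidate_frames("APWIEQEGPEYWDQ"))
```

For this example both routes point to the same frame, IEQEGPEYW, which is the core reported in the text for the MHC I-derived peptide.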

FIGURE 1.

Alignment of the length variants of two MHC I-derived sequences (A) and a highly acidic CD20-derived peptide (B). Anchor residues are given in bold.

Table I.

Overview of peptides eluted from HLA-DQ2 (a)

Peptide Source | Sequence | Length Variants
Actin | KEITALAPSTMKIK |
α tubulin | APVISAEKAYHE |
α tubulin | DYEEVGADSADGEDEG |
β2-microglobulin | HPSDIEVDLLKN |
β tubulin | DATAEEEEDFGEEAEEE |
β tubulin | DPTGTYHGDSDLQLDR |
β-actin | EEEIAALVVDN |
β-actin | YVALDFEQEM |
β-actin | VPIYEGYALPHAI |
Calcium-binding protein (stromal cell-derived factor 4) | DGIVTAEELESYMDP |
Calmodulin | LQDMINEVDADG |
CD20 | IEIIPIQEEEEE | 15
CD20 | IKEEVVGLTETSSQPK |
CD74 | MRMATPLLMQALPM | 22
Desmocollin type 4 | ADGYSADLPLPLPIR |
Endoplasmin precursor | ESDDEAAVEEEEEEK |
Eukaryotic translation initiation factor 4A | DIETFYNTTVEEM |
Glyceraldehyde 3-phosphate dehydrogenase | GKVDIVAINDPFID |
Glyceraldehyde 3-phosphate dehydrogenase | DIQVVAINDPFID |
Golgi phosphoprotein | EFEEAEQVREENLPDE |
Golgi-localized phosphoprotein 130 kDa | VDEQYQEEAEEEVQED |
GP73 Golgi phosphoprotein 2 | AALSVSQENPEMEGPE |
Heat shock cognate 71-kDa protein | FDVSILTIEDGIF |
Heparan sulphate 2 sulphotransferase | GPRQDATLDEEEDM |
HLA DM α-chain | ADWAQEQGDAPAILFDK |
HSP-90 | DPTADDTSAAVTEEMPP |
HSP-90 | IDEDEVAAEEPN |
HSP-70 | KSINPDEAVAY |
HSP-70 | TPSYVAFTDTERLIG |
Hyaluronan-binding protein | YPEDEVGQEDEAESD |
Hypoxanthine-guanine phosphoribosyltransferase | SPGVVISDDEPGYD |
Ig κ-chain V-III region POM | TLTISSLQSEDFAV |
IgE FcR 2 | NLNGLQADLSSFK |
IgG | YPSDIAVEWESNGQPE |
IgG | FTLTISSLQSEDF |
Importin β-1 subunit | EAADVADDQEEPA |
Initiation factor 4A-I | GPDGMEPEGVIESNWNEIV |
Integral membrane protein 2A | DSDPAAIIHDFE |
Integrin β1 | DDLENVKSLGTDLMNEM |
IFN-γ receptor α | APTSFGYDKPHVLVDL |
IL-21 receptor | GSPLAGLDMDTFDSG |
IL-2 receptor β-chain precursor | SPFPSSSFSPGGLAPEISPLE |
IL-4 receptor α chain precursor | APVECEEEEEVEEE |
ITM2A | REDLVAVEEIRD |
ITM2C protein | LPQTYIIQEEM |
LckBP1 p75; HCLS1; HS1 | DEPEGDYEEVLEPE |
LDL receptor | DIQAPDGLAVDWIHSN |
Legumain | DYTGEDVTPQ |
Legumain | GDAEAVKGIGSGKV |
Low-affinity Igε FcR | ISQELEELRAEQQ |
Major vault protein | GDKVVAGDEWLFEGPG |
Mannose-6-phosphate receptor | VPAAYRGVGDDQLGE |
MAPK-activating protein | SDTEVYGEFYPVPPP |
MHC I | APWIEQEGPEYWDQ | 28
MHC I | APWVEQEGPEYWDR |
MHC I | DTQFVRFDSDAAS | 16
MHC I | GKDYIALNEDLRS |
MHC I | TAADTAAQITQ | 11
MHC I | EPAADLIDMGPDPA |
MHC I | FISVGYVDDTQF |
MHC I | LTWKRDGEDQTQD |
MHC II | AEYWNSQKDLLEQKR |
MHC II | DVGEFRAVTELGRPD |
MHC II | SDVGEYRAVTELG |
MHC II | FDFDGDEIF | 13
MHC II | NPDQSGEFMFDFDGDEFF |
MHC II | GKPVTTGVSETVFLPR |
MHC II | GPSGQYTHEFD |
MHC II | DPLKVYPPLKGSFPENLRHL |
MHC II | DQEETAGVVSTPLIR |
Microsomal stress 70 protein ATPase core | DVYVGYESVELADSNPQ |
(Table continues)   
Table IA.

(Continued)

Peptide Source | Sequence | Length Variants
Microtubule-associated protein 1A | APAKAENEEAAANPAW |
Moesin | YPEDVSEELIQDIT |
MOP-4 | APDIDGDEDLPGPP |
Na/K-transporting ATPase | FASIVTGVEEGR |
NAP-1-related protein | DLDDVEEVEEEETG |
Natural killer cell-specific antigen KLIP1 | EEEVDADAADAAAA |
NEDD4-2 minus WW2 | VPEPWETISEEV |
Nef-associated factor 1 | TSSEFEVVTPEEQ |
Neutral amino acid transporter B | GPAGDATVASEKESVM |
Notch 1 (TAN-1) | IEAVQSETVEPPPP |
Nucleoporin | FGTLSPSLGNSSIL |
Occludin | ELDDYREESEEYM |
Phosphatidylinositol 4-kinase type-II β | EEGEAGDEELPLPPG |
Phosphoglycerate kinase 1 | LPVDFVTADKF |
Putative amyloid protein | GDEVEEEAEEPYE |
Putative NF-κB activating protein | SDTEVYGEFYPVPPPY |
Ras GTPase-activating-like protein | QDILNEIAKDIRNQR |
Ras-related protein Rap-2c | VPLILVGNKVDLEPER |
Ras-related protein Ral-B | DGEEVQIDILDTAGQE |
Regulator of G-protein signaling 14 | GGNEQKALVL |
Rho-GDP-dissociation inhibitor S12121 | LAQIAAENEEDEHSVN |
Saposin | IVDSYLPVILDII |
Saposin | FVAEYEPVLIEIL |
Saposin | SSILSILLEEVSPE |
Saposin | VAPFMANIPLLLYP |
Secretory granule proteoglycan core protein | DLGQHGLEEDFM | 13
Sequestome | EPEAEAEAAAGPGP |
Sequestome 1 | MPESEGPSSLDPSQEGP |
Sialidase 1 | DGDVPDGLNLGAVVSDVETG |
Sialyl transferase | WDILQEISPEEIQPNPP |
Similar to calsyntenin-3 precursor | EISLVGDDLDPERE |
Snf7 homolog associated with Alix3 | EDELMAELEELE |
Sodium/potassium-transporting ATPase α-1 chain precursor | TPIAAEIEHFIH |
Sodium/potassium-transporting ATPase α-1 chain precursor | SSLTGESEPQTRSPD |
Sortilin-related receptor | HPINEYYIADASEDQ |
Sorting nexin 1 | KPQPTYEELEEEEQE |
STAM | EPEPEPAFIDEDKMDQ |
TGF-β1 | DRVAGESAEPEPEPE |
TNF-α | LPDGPAGSWEQLIQER |
TRAF 6-binding protein | KPSPSAAEADFD |
Transferrin receptor A chain | IQGAANALSGDVWDIDN |
Transferrin receptor A chain | AAAEVAGQFVIKLTHD |
Transferrin receptor A chain | IPVQTISRAAAEKLF |
Transferrin receptor A chain | SAGDFGSVGATEWL |
Transformation/transcription domain associated protein | EIDIELAPGDQTST |
Translation initiation factor IF1H | GPGDDDEIQFDDIGDDDE |
Tripeptidyl-peptidase I precursor | EPFLITNEIVDYISGG |
Tripeptidyl-peptidase I precursor | SELVQAVSDPSSP |
Ubiquitin and ribosomal protein | GKTITLEVEPSDT |
Ubiquitin-conjugating enzyme E2 N | EPVPGIKAEPDESN |
Vacuolar ATP synthase subunit E | DVDVQIDQESYLPE |
Vacuolar protein sorting protein | LPDEGEPTDEETDGDISDSMD |
WB syndrome protein G | GEPDYVNGEVAATE |
(a) The putative binding cores are underlined. The average pI of the 9-mer core sequences was 4.1 (SD ± 0.9).

To verify that the above approach correctly identified the 9-aa binding cores, we selected 14 9-aa core sequences and tested their ability to bind to HLA-DQ2 in a cell-free binding assay (Fig. 2 A). For all those peptides, IC50 values ranged from 0.2 to 7 μM, which is consistent with the values previously found for other HLA-DQ2 ligands (7, 18). To further ensure the correctness of the identified 9-aa cores, we tested the ability of N- and C-terminally truncated versions of the 9-aa cores to bind to HLA-DQ2 (Fig. 2 B). In four of five cases, this explicitly confirmed that the identified cores represent the only binding frame. The fifth peptide (endoplasmin precursor), with the highly acidic sequence DDEAAVEEEEEE, appeared to bind also in frames other than the predicted one. This binding can be explained by the fact that HLA-DQ2 accepts negatively charged amino acids along the entire p1-p9 binding core (see below). The results of the binding studies were used for further refinement of our core prediction algorithm. Further confirmation of the accuracy of the approach was obtained by molecular simulation of HLA-DQ2 binding with several other ligand peptides (data not shown).

FIGURE 2.

Binding of 9-aa core sequences to HLA-DQ2. The ability of 14 predicted core sequences to bind to HLA-DQ2 (A). The correctness of the binding core prediction as tested in the binding studies of the truncated peptides (B). Mean values of three independent experiments are shown. The IC50 values were calculated based on the observed competition between the biotin-labeled indicator peptide (QLQPFPQPELPY) and the test peptide. IC50 represents the concentration of the test peptide that is required for 50% inhibition of the binding of the indicator peptide.

Based on the alignment of the 124 minimal cores, the frequency of amino acids was calculated and compared with that of the human proteome (Fig. 3, A and B). Most of the amino acids in HLA-DQ2-bound peptides were present at a level that is comparable to that found in the human proteome with the notable exception of negatively and positively charged amino acids. Although basic amino acids were virtually absent in HLA-DQ2-bound peptides, the vast majority of the eluted peptides were found to contain multiple acidic residues, the most extreme example being a peptide EEEEEEAEEEEEEE derived from glucosidase II. Strikingly, a preference for aspartate and particularly glutamate was found not only at the p4, p6, and p7 anchor positions, but also at virtually all positions in the binding core including the nonanchor positions p2, p3, p5, and p8 (Fig. 4).
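
The per-position analysis can be sketched as a frequency count over aligned 9-mer cores. The example below is only illustrative: it uses three core sequences mentioned in the text rather than the full set of 124, and the function names are hypothetical. Comparison with the human proteome would additionally require background frequencies, which are not reproduced here.

```python
from collections import Counter

ACIDIC = set("DE")  # aspartate and glutamate


def positional_frequencies(cores):
    """Fraction of each amino acid at p1..p9 across aligned 9-mer cores.
    Returns a list of nine dicts (index 0 corresponds to p1)."""
    total = len(cores)
    return [
        {aa: count / total for aa, count in Counter(core[pos] for core in cores).items()}
        for pos in range(9)
    ]


def acidic_fraction_by_position(cores):
    """Fraction of cores carrying D or E at each of p1..p9."""
    total = len(cores)
    return [sum(core[pos] in ACIDIC for core in cores) / total for pos in range(9)]


# Three binding cores named in the text (not the complete data set).
example_cores = ["IEQEGPEYW", "VEQEGPEYW", "IPIQEEEEE"]

frequencies = positional_frequencies(example_cores)
acidic = acidic_fraction_by_position(example_cores)
for position, fraction in enumerate(acidic, start=1):
    print(f"p{position}: {fraction:.2f}")
```

Dividing each positional frequency by the corresponding background frequency would give an enrichment ratio per residue and position.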

FIGURE 3.

Amino acid distribution in: the human proteome (A), the binding cores of peptides eluted from HLA-DQ2 (B), the gluteome (C), and T cell stimulatory sequences from gluten and related proteins in barley, rye, and oats (D).

FIGURE 4.

Distribution of amino acids at the p1 through p9 positions in peptides eluted from HLA-DQ2 (A) and T cell stimulatory sequences from gliadins, glutenins, hordeins, secalins, and avenins (B) (see also Table II).

To explain this preference for negatively charged amino acids at both anchor and nonanchor positions, we analyzed the charge distribution on the surface of the HLA-DQ2 molecule. This revealed that HLA-DQ2 has a surplus of positive charges (+8) that is larger than that found in HLA-DQ8 (+6) or HLA-DQ6 (+4) (Fig. 5 and data not shown). As these positive charges are located around the peptide-binding groove, they could facilitate the accommodation of negatively charged residues along the entire p1-p9 register.

FIGURE 5.

The distribution of charges on the surface of HLA-DQ2. The HLA-DQ2 molecule was modeled with an antigenic peptide in the groove and represented in the van der Waals convention. The surface is color-coded according to the atomic charges (negative, red; neutral, gray; and positive, blue). Note the positive charges in and around the binding groove.

In addition, we modeled a number of the eluted peptides in complex with HLA-DQ2 to analyze the molecular basis for the acceptance of a negatively charged amino acid at p1, p5, and p9, three positions previously unknown to be involved in harboring such amino acids. The molecular simulation of peptide AGEEGEAGDEELP in HLA-DQ2 indicates that the p1Glu forms a salt bridge with the terminal guanidine group of α52Arg (Fig. 6 A). Because glutamate occupies and covers the opening of the p1 pocket, it is very likely that below this residue water molecules are present in this pocket that further promote hydrogen bonding between p1Glu and other polar residues e.g., β89Thr, β90Thr, and α52Arg. Interestingly, α52Arg is also present in the p1 pocket of HLA-DQ8, which prefers acidic residues at p1, but is replaced with glycine in HLA-DQ6, which does not tolerate negatively charged amino acids at the p1 position.

FIGURE 6.

Molecular simulation of HLA-DQ2-peptide complexes explaining anchoring via glutamate at positions p1, p9, and p5. The antigenic peptide anchor residues are shown in space-filling mode (color code: oxygen, red; nitrogen, blue; carbon, green; and hydrogen, white) while the residues from HLA-DQ2 are shown in stick form (identical atom color code except for carbon that is in orange) with transparent van der Waals surfaces and atomic charges (positive, blue; neutral, gray; negative, red). A, TCR view of pocket 1 of the complex between peptide AGEEGEAGDEELP (anchors in bold) and HLA-DQ2. The glutamate residue at p1 forms a salt bridge with the terminal guanidine group of α52Arg. In addition, the β86Glu forms hydrogen bonds with α31Gln (not seen because of p1Glu) and α9Tyr. The p1Glu occupies and covers the opening of this pocket; it is thus very likely that below the p1Glu there are water molecules in the pocket that further promote hydrogen bonding among polar residues (e.g., p1Glu, β89T, β90T, and α52R). B, TCR view of pocket 9 of the complex between peptide EIIPIQEEEEEET and HLA-DQ2. The glutamate at p9 forms a salt bridge with α76Arg and a hydrogen bond with α72Ser. C, TCR view of the complex between peptide AAVEEEEEEKVAE (anchors in bold) and HLA-DQ2. Inset, Note how β77Arg swings into position to form a salt bridge with p5Glu in the bound peptide.


The simulated structure of the complex between HLA-DQ2 and the peptide AAVEEEEEEKVAE indicates that the presence of multiple acidic residues in the binding core of the bound peptide attracts the β77Arg residue toward the p5 acidic residue (Fig. 6 C). Thus, besides the residues β28Ser, β30Ser, β70Arg, β71Lys, and β74Ala, which readily accommodate acidic residues at p4, p6, and p7, β77Arg functions as an additional attractor and stabilizer of highly negatively charged peptides bound to HLA-DQ2.

Finally, the modeling of EIIPIQEEEEEET demonstrated that the p9Glu forms an ionic bond with α76Arg and a hydrogen bond with α72Ser (Fig. 6 B). These strong interactions likely explain the presence of glutamate at the p9 position in roughly 30% of the peptides found in HLA-DQ2.

To determine whether the preference of HLA-DQ2 for negatively charged amino acids matches the sequences of cereal proteins, we compared the frequency of amino acids in the HLA-DQ2-eluted peptides with that found in the collection of all available gluten and gluten-like molecules from wheat, barley, rye, and oat present in the public databases (referred to as the gluteome hereafter) and in the known HLA-DQ2-restricted gluten T cell stimulatory epitopes (Table II). Large discrepancies in the abundance of certain amino acids were found between these three groups of sequences (Fig. 3, B–D). To a large extent, this can be explained by the restricted amino acid composition of gluten and gluten-like molecules, which are strongly dominated by glutamine, proline, and hydrophobic amino acids. Moreover, while gluten lacks negatively charged residues, these are prominent in HLA-DQ2-restricted, T cell stimulatory gluten peptides due to the activity of tissue transglutaminase, which converts glutamine residues into glutamic acid, a prerequisite for high-affinity HLA-DQ2 binding.

Table II.

Overview of sequences derived from gluten and homologous proteins known to stimulate responses of T cells from celiac patients

No.   Sequence (a)     Peptide name          Reference
      1 4 6 7 9
1     P E P Y Q        Glia-α 2 (α-II)       28
2     P Q E L Y        Glia-α 9 (α-I)        28
3     P Q E L Y        Glia-α 9 (α-III)      29
4     F E P Y Q        Glia-α 20             24
5     Q E P Q F        Glu-5                 24
6     P S P E E        γ-I (Glia-γ 1)        30
7     I E P A L        γ-II (Glia-γ 30)      23, 31
8     E E P Y Q        γ-III                 29
9     S E E F Q        γ-IV                  29
10    E F E Q Q        γ-VI                  23
11    P E E F Q        γ-VIIA                23
12    E E P F Q        γ-VIIB                23
13    F E E S F        Glt-156               24
14    F E E Q L        Glt-17A               24
15    P E P F Q        Hor-α 2               32
16    P E P F Q        Sec-α 2               32
17    P Q E Q F        Hor-α 9               32
18    P Q E Q F        Sec-α 9               32
19    P E E E F        Av-α 9A               32, 33
20    P E E Q F        Av-α 9B               32, 33
21    F E E Q L        Glt-156 homolog       24
22    F E E Q I        Glt-156 homolog       24
23    F Q Q P L        Glt-156 homolog       24
24    F Q Q Q Q F      Glia-γ 2A             11
25    Y Q Q Q Q F      Glia-γ 2B             11
26    F Q Q Q Y        −                     31
(a) Anchor positions are given in bold.

Strikingly, while HLA-DQ2 can accommodate glutamic acid all along the binding core (Fig. 7 A), the occurrence of glutamic acid in the known T cell stimulatory peptides is restricted to far fewer positions (Fig. 7 B). It was mostly limited to the anchor positions p4 and p6, as well as nonanchor position p3. Also, the p7-binding pocket, which has the highest preference for acidic residues, was found to mostly accommodate glutamine and aromatic amino acids in the case of gluten epitopes (Fig. 4). Similarly, glutamate was hardly found at the p2, p5, and p8 nonanchor positions of gluten epitopes. Thus, the preference for negatively charged residues does not by itself explain the unique binding properties of HLA-DQ2 that allow efficient presentation of gluten peptides.

FIGURE 7.

Patterns of relative distribution of glutamate and proline at the p1 through p9 positions of the HLA-DQ2-binding register. A and C, The relative abundance of the indicated amino acids in the core sequences of HLA-DQ2-eluted peptides. B and D, The relative abundance of the indicated amino acids in the binding cores of cereal-derived T cell stimulatory sequences. The dotted line indicates the abundance of the amino acids in the sources, i.e., human proteome for the eluted peptides and the gluteome for the T cell epitopes. There is no dotted line in B because most of the glutamate residues in the T cell epitopes stem from the deamidation of glutamine.


A further analysis of the distribution of amino acids in the core of the HLA-DQ2-bound peptides revealed that the abundance of several amino acids is biased at both anchor and nonanchor positions. This was observed for alanine, phenylalanine, serine, glycine, and proline (Fig. 4). Although the overall presence of proline in HLA-DQ2-bound peptides is comparable to its presence in the human proteome (Fig. 3, A and B), strong positive and negative effects are observed depending on the position (Fig. 7 C). Most notably, proline was found to be favored at the p1 anchor position and at the p8 nonanchor position while its presence in the p3 and p6 position is comparable to that found in the human proteome. At all other positions, proline is underrepresented.
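The positional analysis described above can be sketched in a few lines of Python. This is only an illustration of the calculation, not the procedure used in the study: the function names, the toy 9-mer cores, and the background frequency are all hypothetical placeholders, and real input would be the inferred binding cores together with the residue frequency in the source proteome.

```python
def positional_abundance(cores, residue):
    """Return the fraction of cores carrying `residue` at each position (p1..pN)."""
    if not cores:
        raise ValueError("need at least one core sequence")
    length = len(cores[0])
    if any(len(core) != length for core in cores):
        raise ValueError("all cores must have the same length")
    return [
        sum(core[i] == residue for core in cores) / len(cores)
        for i in range(length)
    ]


def enrichment(cores, residue, background):
    """Ratio of positional abundance to a background frequency.

    Values above 1 suggest the residue is favored at that position,
    values below 1 suggest it is underrepresented.
    """
    if background <= 0:
        raise ValueError("background frequency must be positive")
    return [f / background for f in positional_abundance(cores, residue)]


# Toy 9-mer cores invented for illustration only (NOT eluted peptides).
toy_cores = ["PAAEAEAPL", "PAAAAAAPF", "EAAEAEAAL", "PAPAAEEAF"]

# Hypothetical background frequency for proline; a real analysis would
# measure this in the source proteome.
toy_background = 0.25

proline_profile = positional_abundance(toy_cores, "P")
proline_enrichment = enrichment(toy_cores, "P", toy_background)
```

With these toy cores, proline appears in three of four sequences at p1 and in two of four at p8, so both positions score above the assumed background, while positions without proline score zero.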

It is well-established that all known HLA-DQ2-restricted gluten-derived T cell stimulatory peptides contain multiple proline residues (Table II). As noted previously, proline residues in these gluten peptides are usually found at positions p1, p3, p5, p6, and/or p8 (Fig. 7 D), positions at which HLA-DQ2 favors or tolerates the presence of proline (Fig. 7 C). Remarkably, the alignment of known T cell stimulatory peptides from gluten and gluten homologs in barley, rye, and oats indicates that proline is present at position p8 in 22 of 26 sequences, a value that is much higher than the proline content in the gluteome (Fig. 7 D). Together, these results indicate that the proline profile found in gluten peptides closely matches the preference of HLA-DQ2 for this amino acid, with the exception of the p5 position.

The association between autoimmune disorders and specific MHC haplotypes indicates a fundamental role for Ag presentation in the development of these diseases. This is particularly clear in the case of celiac disease, which shares many similarities with other autoimmune disorders (19), and where roughly 95% of the patients carry HLA-DQ2 while the remainder are HLA-DQ8 positive. There is strong evidence that the capacity of these HLA molecules to bind and present gluten peptides to CD4+ T cells in the small intestine is a prerequisite for the development of celiac disease (20).

Peptide motifs specific for a number of HLA alleles have been defined and can be used to identify potential T cell epitopes in proteins. A peptide-binding motif for HLA-DQ2 was first described in 1996 (7, 8). These and subsequent studies indicated that HLA-DQ2 preferred large hydrophobic amino acids at p1 and p9 as well as negatively charged amino acids at p4, p6, and/or p7 while gluten lacks such negatively charged residues. The observation that gluten-derived peptides undergo an enzymatic modification in the intestine, in which glutamine is converted into the negatively charged glutamic acid due to the activity of tissue transglutaminase, explained this apparent discrepancy (2, 3).

In the present study, we have investigated whether the preference of HLA-DQ2 for negative charges could fully explain the ability of HLA-DQ2 to bind and present a wide range of gluten-derived peptides. For this purpose, we have characterized a large number of peptides that are naturally bound to HLA-DQ2. Our study revealed that HLA-DQ2 accepts negatively charged residues along the entire p1-p9 register. This can be mainly explained by the strategically positioned polar or positively charged residues in all five binding pockets (Fig. 6). Additionally, analysis of the charge distribution on the surface of the HLA-DQ2 molecule indicates that HLA-DQ2, in contrast to HLA-DQ8 and HLA-DQ6 in particular, has many positive surface charges around the peptide-binding groove that facilitate the accommodation of negatively charged residues along the p1-p9 register (Fig. 5 and data not shown). It is important to note that these unpaired positive surface charges can attract free peptides, facilitating the docking process, but do not stabilize the peptide within the binding groove. The unique ability of HLA-DQ2 to accommodate negatively charged residues along the entire p1-p9 register, however, does not by itself explain the gluten peptide-binding properties, as the presence of glutamic acid residues in T cell stimulatory gluten peptides is restricted to far fewer positions in these peptides (Fig. 7 B). Also, peptides containing acidic residues can be presented by other HLA haplotypes whereas only HLA-DQ2, and to a much lower extent HLA-DQ8, are associated with celiac disease. A more plausible explanation for the exceptional gluten peptide-binding properties is offered by the unique ability of HLA-DQ2 to accommodate proline-rich peptides, while gluten is known to be exceptionally proline rich.

In a recent study, the structure of the complex between HLA-DQ2 and a gluten peptide was elucidated, which demonstrated that proline at p1/3/5/8 does not interfere with the hydrogen bond network from the polar residues of HLA-DQ2 that line the Ag-binding groove (12). In agreement with this, our results indicate that the presence of proline in T cell stimulatory gluten peptides largely matches the preference for or acceptance of proline by HLA-DQ2. Several unique features of HLA-DQ2 explain this property. First, HLA-DQ2 is the only DQ allele known to accept proline at p1 and roughly one-third of T cell stimulatory gluten peptides carry a proline at p1. The uniqueness of the p1 pocket in HLA-DQ2 arises from an α53 deletion, which eliminates the requirement for a hydrogen bond from the p1 amide. There exists, however, the possibility of hydrogen bonding to the α50–52 HLA-DQ2 backbone by the amide groups of p-1 and p-2 peptide residues (Refs. 11 and 12 and this work). Second, a proline at p3 is accepted and present in over 50% of T cell stimulatory gluten peptides. Third, proline at p6 of DQ2 is tolerated despite the loss of one hydrogen bond to the side chain carbonyl of α62Asn, provided that there is no proline at p4 or p9 (21). Roughly one-third of T cell stimulatory gluten peptides carry a proline at p6. The peculiar character of pocket 6 stems from the presence of β30Ser. HLA-DQ2 is the only HLA allele to have such a residue at this position (www.anthonynolan.uk.org). Importantly, our results now indicate that proline is preferred at the p8 nonanchor position and 85% of T cell stimulatory gluten peptides carry proline at p8. Although selection for a particular amino acid would not be expected at nonanchor positions, the peculiar nature of proline restricts conformational freedom in peptides and may thus influence the position of the amino acid at p9, which in all cases is an amino acid with a bulky side chain that can dock into the spacious p9 pocket.

We have previously demonstrated that the spacing between the proline and glutamine residues determines which glutamines will be deamidated by tissue transglutaminase (31). Notably, in the very common gluten sequence QXP, where X is any amino acid, the glutamine will almost invariably be deamidated by tTG. Consequently, the preference for a proline at p8 will lead to the deamidation of a glutamine at p6, a strong anchor residue. Likewise, a proline at p6 will often lead to the deamidation of a Q at p4, another strong anchor residue. In the same manner, the acceptance of proline at p5 and p3 can result in QXP-containing peptides that, when deamidated, will have a p1Glu and/or a p3Glu residue. Although at p1 Glu is a highly favored anchor residue, at p3 it is well-tolerated because of the positive surface electrostatic potential of the HLA-DQ2 molecule. Thus, HLA-DQ2 is capable of binding peptides that contain multiple negatively charged residues as well as multiple proline residues, at p1/3/5/6/8. Together, these exceptional binding properties explain why this MHC molecule is so well-suited to bind and present a large set of various gluten-derived peptides.
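The QXP rule described above lends itself to a small worked example. The sketch below is a deliberate simplification, not a model of tissue transglutaminase: it implements only the single rule stated in the text (a glutamine followed by any residue and then a proline is converted to glutamic acid) and ignores every other determinant of enzyme specificity. The function name and the two native input sequences are assumptions introduced here for illustration and are not taken from this text.

```python
def deamidate_qxp(peptide):
    """Convert Q to E wherever the motif Q-X-P occurs (X = any residue).

    The scan runs over the original sequence, so a conversion at one
    position never changes whether another position matches the motif.
    """
    residues = list(peptide)
    for i, aa in enumerate(peptide):
        if aa == "Q" and i + 2 < len(peptide) and peptide[i + 2] == "P":
            residues[i] = "E"
    return "".join(residues)


# Illustrative native 9-mers (assumed inputs, not data from this study).
# First sequence: proline at p8 directs deamidation of the glutamine at p6.
# Second sequence: proline at p6 directs deamidation of the glutamine at p4.
example_p8 = deamidate_qxp("PFPQPQLPY")
example_p6 = deamidate_qxp("PQPQLPYPQ")
print(example_p8, example_p6)
```

In both cases the glutamine two residues upstream of a proline becomes glutamic acid, whereas glutamines in a Q-P-Q context or at the C terminus are left untouched, which is the pattern the paragraph above describes for the p4 and p6 anchor positions.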

Despite the large repertoire of gluten T cell epitopes implicated in the pathogenesis of celiac disease, it is clear that in adults, T cells specific for certain gluten sequences (e.g., α-I, -II, and -III) are almost invariably present, whereas responses to others (e.g., γ-IV, -VIIa, and -VIIb) are less frequently observed (22, 23). This suggests immunodominance of certain peptides. In children with celiac disease, however, we were often not able to find responses to these immunodominant peptides. Instead, we observed a diverse response to several other gluten sequences, some of which are not found in adults (24). To explain this apparent discrepancy, we have hypothesized that a T cell response to gluten can be initiated against a relatively large number of peptides and that, as the response evolves, it focuses on the most immunodominant sequences (24). This hypothesis still awaits confirmation in an independent study with children with celiac disease.

Two recent studies have further investigated the specificity of the HLA-DQ8-restricted gluten-specific T cell response (6, 25). These studies have confirmed our original observation of the immunodominant character of T cell responses toward one particular α-gliadin peptide (5). Thus, immunodominant T cell responses to certain gluten peptides are found both in adult HLA-DQ2- and HLA-DQ8-positive patients. In addition, the above-mentioned studies identified several HLA-DQ8-restricted subdominant peptides. In agreement with our previous studies (3), deamidation of many of these peptides at the p1 and p9 positions enhanced their T cell recognition, presumably increasing their binding affinity for HLA-DQ8, known to prefer negatively charged amino acids at those positions. Remarkably, both studies reported that identical gluten peptides can bind to either HLA-DQ2 or HLA-DQ8, depending on the deamidation pattern (6, 25). Of interest, these promiscuous sequences are nonimmunodominant.

The presence of gluten-specific T cells in celiac disease patients implies that such T cells are positively selected in the thymus, most likely on the basis of the expression of HLA-DQ2-restricted self peptides in the thymus. Therefore, it is conceivable that the peptides identified in the current study may be involved in the selection of such T cells, and we intend to address this issue in our future studies.

Gluten is a glutamine- and proline-rich family of proteins with related but distinct properties and sequences. It is well-established that the high proline content renders gluten peptides resistant to proteolytic degradation in the gastrointestinal tract. Given that the average daily gluten consumption ranges between 10 and 15 grams, it is clear that in HLA-predisposed individuals, HLA-DQ-gluten peptide complexes may almost continuously be formed in the small intestine. Although under normal circumstances, tolerance will be maintained, a perturbation of the homeostasis in the intestine may allow the development of an inflammatory T cell response. This is in line with the observation that HLA-DQ2 homozygous individuals, expressing higher numbers of HLA-DQ2 dimers on their APCs compared with HLA-DQ2 heterozygous individuals, have a much higher risk of developing celiac disease (18, 26). This implies that blocking of gluten presentation by HLA-DQ2 molecules might be an effective way to treat patients or prevent disease. Similarly, such compounds may be of use for the prevention of type I diabetes. The first steps toward generating such blocking compounds have already been undertaken (27), but it is clear that to obtain a therapeutically useful agent a substantial improvement will be required. The natural ligands for HLA-DQ2 identified in the present study provide new insight into peptide-binding properties of this molecule and novel templates, both of which are likely to be of importance for the development of an effective HLA-DQ2 blocker.

We thank Bart Roep, Jeroen van Bergen, and John Sidney for critical reading of the manuscript and Willemien Benckhuijsen for peptide synthesis.

The authors have no financial conflict of interest.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 This work was supported by the Netherlands Organization for Scientific Research (Grant 912-02-028), the Celiac Disease Consortium, an Innovative Cluster approved by the Netherlands Genomics Initiative, and was partially funded by the Dutch Government (BSIK03009), and the Centre for Medical Systems Biology, a Center of Excellence approved by the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research.

3 Abbreviations used in this paper: MS, mass spectrometry; MHC I, MHC class I; MHC II, MHC class II.

1. Sollid, L. M., E. Thorsby. 1993. HLA susceptibility genes in celiac disease: genetic mapping and role in pathogenesis. Gastroenterology 105: 910-922.
2. Molberg, O., S. N. McAdam, R. Korner, H. Quarsten, C. Kristiansen, L. Madsen, L. Fugger, H. Scott, O. Noren, P. Roepstorff, et al. 1998. Tissue transglutaminase selectively modifies gliadin peptides that are recognized by gut-derived T cells in celiac disease. Nat. Med. 4: 713-717.
3. van de Wal, Y., Y. Kooy, P. van Veelen, S. Pena, L. Mearin, G. Papadopoulos, F. Koning. 1998. Selective deamidation by tissue transglutaminase strongly enhances gliadin-specific T cell reactivity. J. Immunol. 161: 1585-1588.
4. Sollid, L. M., G. Markussen, J. Ek, H. Gjerde, F. Vartdal, E. Thorsby. 1989. Evidence for a primary association of celiac disease to a particular HLA-DQ α/β heterodimer. J. Exp. Med. 169: 345-350.
5. van de Wal, Y., Y. M. Kooy, P. A. van Veelen, S. A. Pena, L. M. Mearin, O. Molberg, K. E. Lundin, L. M. Sollid, T. Mutis, W. E. Benckhuijsen, et al. 1998. Small intestinal T cells of celiac disease patients recognize a natural pepsin fragment of gliadin. Proc. Natl. Acad. Sci. USA 95: 10050-10054.
6. Henderson, K. N., J. A. Tye-Din, H. H. Reid, Z. Chen, N. A. Borg, T. Beissbarth, A. Tatham, S. I. Mannering, A. W. Purcell, N. L. Dudek, et al. 2007. A structural and immunological basis for the role of human leukocyte antigen DQ8 in celiac disease. Immunity 27: 23-34.
7. van de Wal, Y., Y. M. C. Kooy, J. W. Drijfhout, R. Amons, F. Koning. 1996. Peptide binding characteristics of the coeliac disease-associated DQ (α1*0501, β1*0201) molecule. Immunogenetics 44: 246-253.
8. Vartdal, F., B. H. Johansen, T. Friede, C. J. Thorpe, S. Stevanovic, J. E. Eriksen, K. Sletten, E. Thorsby, H. G. Rammensee, L. M. Sollid. 1996. The peptide binding motif of the disease associated HLA-DQ (α1*0501, β1*0201) molecule. Eur. J. Immunol. 26: 2764-2772.
9. Spits, H., J. Borst, M. Giphart, J. Coligan, C. Terhorst, J. E. De Vries. 1984. HLA-DC antigens can serve as recognition elements for human cytotoxic T lymphocytes. Eur. J. Immunol. 14: 299-304.
10. Shevchenko, A., M. Wilm, O. Vorm, M. Mann. 1996. Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels. Anal. Chem. 68: 850-858.
11. Stepniak, D., L. W. Vader, Y. Kooy, P. A. van Veelen, A. Moustakas, N. A. Papandreou, E. Eliopoulos, J. W. Drijfhout, G. K. Papadopoulos, F. Koning. 2005. T-cell recognition of HLA-DQ2-bound gluten peptides can be influenced by an N-terminal proline at p-1. Immunogenetics 57: 8-15.
12. Kim, C. Y., H. Quarsten, E. Bergseng, C. Khosla, L. M. Sollid. 2004. Structural basis for HLA-DQ2-mediated presentation of gluten epitopes in celiac disease. Proc. Natl. Acad. Sci. USA 101: 4175-4179.
13. Lee, K. H., K. W. Wucherpfennig, D. C. Wiley. 2001. Structure of a human insulin peptide-HLA-DQ8 complex and susceptibility to type 1 diabetes. Nat. Immunol. 2: 501-507.
14. Zavala-Ruiz, Z., I. Strug, M. W. Anderson, J. Gorski, L. J. Stern. 2004. A polymorphic pocket at the P10 position contributes to peptide binding specificity in class II MHC proteins. Chem. Biol. 11: 1395-1402.
15. De Oliveira, D. B., E. Harfouch-Hammoud, H. Otto, N. A. Papandreou, L. J. Stern, H. Cohen, B. O. Boehm, J. Bach, S. Caillat-Zucman, T. Walk, et al. 2000. Structural analysis of two HLA-DR-presented autoantigenic epitopes: crucial role of peripheral but not central peptide residues for T-cell receptor recognition. Mol. Immunol. 37: 813-825.
16. Moustakas, A. K., Y. van de Wal, J. Routsias, Y. M. Kooy, P. van Veelen, J. W. Drijfhout, F. Koning, G. K. Papadopoulos. 2000. Structure of celiac disease-associated HLA-DQ8 and non-associated HLA-DQ9 alleles in complex with two disease-specific epitopes. Int. Immunol. 12: 1157-1166.
17. Fremont, D. H., D. Monnaie, C. A. Nelson, W. A. Hendrickson, E. R. Unanue. 1998. Crystal structure of I-Ak in complex with a dominant epitope of lysozyme. Immunity 8: 305-317.
18. Vader, W., D. Stepniak, Y. Kooy, L. Mearin, A. Thompson, J. J. van Rood, L. Spaenij, F. Koning. 2003. The HLA-DQ2 gene dose effect in celiac disease is directly related to the magnitude and breadth of gluten-specific T cell responses. Proc. Natl. Acad. Sci. USA 100: 12390-12395.
19. Sollid, L. M., B. Jabri. 2005. Is celiac disease an autoimmune disorder? Curr. Opin. Immunol. 17: 595-600.
20. Koning, F., D. Schuppan, N. Cerf-Bensussan, L. M. Sollid. 2005. Pathomechanisms in celiac disease. Best Pract. Res. Clin. Gastroenterol. 19: 373-387.
21. van de Wal, Y., Y. M. Kooy, J. W. Drijfhout, R. Amons, G. K. Papadopoulos, F. Koning. 1997. Unique peptide binding characteristics of the disease-associated DQ(α1*0501, β1*0201) vs the non-disease-associated DQ(α1*0201, β1*0202) molecule. Immunogenetics 46: 484-492.
22. Shan, L., O. Molberg, I. Parrot, F. Hausch, F. Filiz, G. M. Gray, L. M. Sollid, C. Khosla. 2002. Structural basis for gluten intolerance in celiac sprue. Science 297: 2275-2279.
23. Qiao, S. W., E. Bergseng, O. Molberg, G. Jung, B. Fleckenstein, L. M. Sollid. 2005. Refining the rules of gliadin T cell epitope binding to the disease-associated DQ2 molecule in celiac disease: importance of proline spacing and glutamine deamidation. J. Immunol. 175: 254-261.
24. Vader, W., Y. Kooy, P. van Veelen, A. De Ru, D. Harris, W. Benckhuijsen, S. Pena, L. Mearin, J. W. Drijfhout, F. Koning. 2002. The gluten response in children with celiac disease is directed toward multiple gliadin and glutenin peptides. Gastroenterology 122: 1729-1737.
25. Tollefsen, S., H. Arentz-Hansen, B. Fleckenstein, O. Molberg, M. Raki, W. W. Kwok, G. Jung, K. E. Lundin, L. M. Sollid. 2006. HLA-DQ2 and -DQ8 signatures of gluten T cell epitopes in celiac disease. J. Clin. Invest. 116: 2226-2236.
26. Mearin, M. L., I. Biemond, A. S. Pena, I. Polanco, C. Vazquez, G. T. Schreuder, R. R. de Vries, J. J. van Rood. 1983. HLA-DR phenotypes in Spanish coeliac children: their contribution to the understanding of the genetics of the disease. Gut 24: 532-537.
27. Xia, J., M. Siegel, E. Bergseng, L. M. Sollid, C. Khosla. 2006. Inhibition of HLA-DQ2-mediated antigen presentation by analogues of a high affinity 33-residue peptide from α2-gliadin. J. Am. Chem. Soc. 128: 1859-1867.
28. Arentz-Hansen, H., R. Korner, O. Molberg, H. Quarsten, W. Vader, Y. M. Kooy, K. E. Lundin, F. Koning, P. Roepstorff, L. M. Sollid, S. N. McAdam. 2000. The intestinal T cell response to α-gliadin in adult celiac disease is focused on a single deamidated glutamine targeted by tissue transglutaminase. J. Exp. Med. 191: 603-612.
29. Arentz-Hansen, H., S. N. McAdam, O. Molberg, B. Fleckenstein, K. E. Lundin, T. J. Jorgensen, G. Jung, P. Roepstorff, L. M. Sollid. 2002. Celiac lesion T cells recognize epitopes that cluster in regions of gliadins rich in proline residues. Gastroenterology 123: 803-809.
30. Sjostrom, H., K. E. Lundin, O. Molberg, R. Korner, S. N. McAdam, D. Anthonsen, H. Quarsten, O. Noren, P. Roepstorff, E. Thorsby, L. M. Sollid. 1998. Identification of a gliadin T-cell epitope in coeliac disease: general importance of gliadin deamidation for intestinal T-cell recognition. Scand. J. Immunol. 48: 111-115.
31. Vader, L. W., A. De Ru, Y. van der Wal, Y. M. Kooy, W. Benckhuijsen, M. L. Mearin, J. W. Drijfhout, P. van Veelen, F. Koning. 2002. Specificity of tissue transglutaminase explains cereal toxicity in celiac disease. J. Exp. Med. 195: 643-649.
32. Vader, L. W., D. T. Stepniak, E. M. Bunnik, Y. M. Kooy, W. de Haan, J. W. Drijfhout, P. A. van Veelen, F. Koning. 2003. Characterization of cereal toxicity for celiac disease patients based on protein homology in grains. Gastroenterology 125: 1105-1113.
33. Arentz-Hansen, H., B. Fleckenstein, O. Molberg, H. Scott, F. Koning, G. Jung, P. Roepstorff, K. E. Lundin, L. M. Sollid. 2004. The molecular basis for oat intolerance in patients with celiac disease. PLoS Med. 1: e1.