Abstract
As with host proteins, viral proteins undergo N-myristoylation, which dictates their pathological function. However, this lipid-modifying reaction creates a novel class of “lipopeptide” Ags targeted by host CTLs. The primate MHC class I–encoded protein, Mamu-B*098, was previously shown to bind N-myristoylated 5-mer peptides. Nevertheless, T cells exist that recognize even shorter lipopeptides, and much remains to be elucidated concerning the molecular mechanisms of lipopeptide presentation. In this study, we demonstrate that the MHC class I allele, Mamu-B*05104, binds the N-myristoylated 4-mer peptide (C14-Gly-Gly-Ala-Ile) derived from the viral Nef protein for its presentation to CTLs. A phylogenetic tree analysis indicates that these two classical MHC class I alleles are not closely related; however, high-resolution x-ray crystallographic analyses indicate that both molecules share lipid-binding structures defined by an exceptionally large, hydrophobic B pocket that accommodates the acylated glycine (G1) as an anchor. The C-terminal isoleucine (I4) of C14-Gly-Gly-Ala-Ile anchors at the F pocket, which is distinct from that of Mamu-B*098 and is virtually identical to that of the peptide-presenting MHC class I molecule, HLA-B51. Only the two central amino acid residues (G2 and A3) are exposed externally for recognition by T cells, and the methyl side chain on A3 constitutes a major T cell epitope, underscoring that the epitopic diversity is highly limited for lipopeptides as compared with that for MHC class I–presented long peptides. These structural features suggest that lipopeptide-presenting MHC class I alleles comprise a distinct MHC class I subset that mediates an alternative pathway for CTL activation.
Introduction
Major histocompatibility complex class I molecules bind fragments of intracellular proteins and present them to CTLs bearing specific αβ TCR and the associated CD8 coreceptors (1, 2). These CD8+ CTLs precisely discriminate peptides derived from self and nonself proteins, thereby monitoring microbial insults and cellular transformation that may occur within cells. As a consequence, CTLs efficiently eliminate abnormal cells while leaving healthy cells unaffected and thus serve as a critical element in controlling viral infections and cancer (3–5). Since the milestone determination of the x-ray crystallographic structure of the peptide-bound HLA–A2 complex (6), extensive investigations have been conducted over the past three decades to elucidate the molecular mechanisms responsible for peptide presentation by various MHC class I alleles and have established a paradigmatic model that delineates how peptides are captured by MHC class I molecules and recognized by TCRs (7, 8). Six pockets, designated A through F, are present in the Ag-binding groove of MHC class I molecules; among these pockets, the allele-specific B and F pockets play a major role in influencing the ligand repertoire (9). Peptides of 9 aa residues are typically captured with their P2 and C-terminal (P9) anchors accommodated in the B and F pockets, respectively, whereas the side chains of several other residues protrude externally for close interactions with TCRs. Accordingly, virtually innumerable T cell epitopic variations may be generated within MHC class I–presented peptides, which allows T cells to recognize foreign Ags specifically without eliciting autoimmunity. However, this fundamental paradigm may now need some modification to incorporate the novel MHC class I function of lipopeptide Ag presentation.
A group of cellular proteins with the N-terminal Gly-x-x-x-Ser/Thr motif (where x is any amino acid) undergo N-myristoylation, a protein lipidation reaction in which N-myristoyltransferase catalyzes the conjugation of a 14-carbon fatty acid (myristic acid) to the N-terminal glycine residue, using myristoyl-CoA as its substrate (10). Besides host proteins, N-myristoylation also occurs for viral proteins in virus-infected cells by borrowing the host machinery (11), and this lipid modification often dictates their pathogenic function (12–14). Therefore, the ability of CTLs to monitor the N-myristoylation of viral proteins may be valuable for the efficient control of pathogenic viruses; the findings of our recent studies using SIV-infected monkeys indicated that T cells capable of mediating such functions exist (15). The rhesus macaque CD8+ CTL line, 2N5.1, specifically recognized N-myristoylated 5-mer peptides (C14-Gly-Gly-Ala-Ile-Ser [C14nef5]) derived from the SIV Nef protein, and the classical MHC class I–encoded protein, Mamu-B*098, was found to bind C14nef5 and functioned as the restriction element for its presentation to 2N5.1 (16, 17).
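As an illustration only, the N-terminal Gly-x-x-x-Ser/Thr motif described above can be expressed as a short pattern check. The following Python sketch is not part of the reported work; the function name is hypothetical, and it assumes that the input sequence starts at the glycine (that is, the initiator methionine has already been removed).

```python
import re

# Motif as stated in the text: Gly-x-x-x-Ser/Thr at the N terminus.
# Assumption: the sequence passed in already starts at the glycine.
MYR_MOTIF = re.compile(r"^G.{3}[ST]")


def has_myristoylation_motif(n_terminal_seq: str) -> bool:
    """Return True if the sequence starts with G-x-x-x-S/T (one-letter code)."""
    return MYR_MOTIF.match(n_terminal_seq.upper()) is not None


if __name__ == "__main__":
    # Peptide portion of C14nef5 (Gly-Gly-Ala-Ile-Ser) from the text.
    print(has_myristoylation_motif("GGAIS"))  # True
    # Peptide portion of C14nef4 lacks the fifth (Ser/Thr) position.
    print(has_myristoylation_motif("GGAI"))   # False
```

The check covers only the sequence motif named in the text; it does not predict whether a given protein is actually myristoylated in cells.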
An x-ray crystallographic analysis of the Mamu-B*098:C14nef5 complex revealed that the myristoyl group of C14nef5 was accommodated in the B pocket lined with hydrophobic amino acid residues, whereas the C-terminal serine residue fitted into the small F pocket. Because the myristoyl group and serine residue at the given position are basic elements for most N-myristoylated proteins (18), Mamu-B*098 may potentially bind a wide array of N-myristoylated 5-mer lipopeptides derived not only from the SIV Nef protein but also from other viral proteins containing the N-myristoylation motif. Although 2N5.1 failed to recognize the N-myristoylated Nef 4-mer lipopeptide (C14-Gly-Gly-Ala-Ile [C14nef4]) lacking the C-terminal serine residue, the rhesus CD8+ CTL line, termed SN45, isolated from the circulation of a SIV-infected donor, exhibited a prominent reactivity to C14nef4 (19). Thus, we predicted that another MHC class I allele may exist that is capable of binding 4-mer lipopeptides in a manner that differs from that for Mamu-B*098.
In the current study, we identified the restriction element for the presentation of C14nef4 to SN45 as the classical MHC class I allele, Mamu-B*05104. The high-resolution x-ray crystal structures of the two MHC class I:lipopeptide complexes (Mamu-B*05104:C14nef4 and Mamu-B*098:C14nef5) revealed marked differences as well as conserved structural features between long peptide- and N-myristoylated short lipopeptide-presenting MHC class I molecules. These structural features indicate that two distinct MHC class I subsets have evolved to mediate the presentation of long peptides or short lipopeptides to CTLs.
Materials and Methods
MHC genotyping and phylogenetic tree analysis
The typing of Mamu genes was performed as described previously (20). Briefly, cDNA samples derived from rhesus macaque PBMCs were used as templates for PCR amplification with primer pairs that were designed to amplify all monkey MHC class I genes. Pyrosequencing of the PCR products was performed using the GS Junior system and the amplicon sequencing protocol (Roche, Branford, CT). MHC class I genotypes were identified by referring to the Immuno Polymorphism Database (http://www.ebi.ac.uk/ipd/index.html). A phylogenetic tree was constructed by the neighbor-joining method using GENETYX software (GENETYX, Tokyo, Japan) based on the amino acid sequences of the MHC class I α1/α2 domains.
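Neighbor-joining operates on a matrix of pairwise distances between aligned sequences. The source does not state which distance measure GENETYX applied, so the Python sketch below assumes the simple p-distance (fraction of differing aligned positions, gaps skipped) purely for illustration; the function names and the toy sequences are hypothetical and are not real allele sequences.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned, non-gap positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    names = sorted(seqs)
    return {
        (x, y): p_distance(seqs[x], seqs[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not real MHC class I sequences.
    toy = {"seq1": "GSHSMRYF", "seq2": "GSHSLRYF", "seq3": "GSHSLKYF"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, dist)
```

A matrix of this kind is the input that a neighbor-joining implementation would cluster into a tree.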
Cloning of MHC class I genes
Total RNA was extracted from rhesus macaque (MM570)–derived PBMCs using the RNeasy mini kit (QIAGEN, Hilden, Germany), and first-strand cDNA was synthesized from 0.5 μg of total RNA using oligo(dT) and the PrimeScript reverse transcriptase (Takara Bio, Otsu, Japan). PCR amplification was performed with Pfu DNA polymerase (Stratagene, La Jolla, CA) for 35 cycles at 94°C for 45 s, at 58°C for 45 s, and at 72°C for 1.5 min, followed by an additional 10 min incubation at 72°C. The primers used were as follows: 5′-TAT GGT ACC ATG GCG CCC CGA ACC CTC CTT-3′ (sense) and 5′-TAT GCG GCC GCC ACA AGA CAG TTG TCT TTT CA-3′ (antisense) for Mamu-B*05104 and 5′-GCG GAA TTC GAG ACG CCA AGA TGC GGT-3′ (sense) and 5′-GCG CTC GAG TCA AGC CGT GAG AGA CAC AT-3′ (antisense) for Mamu-B11L*0101. Mamu-B*06004 cDNA was chemically synthesized (Integrated DNA Technologies). All cDNA samples were cloned into pcDNA3.1(+), and their identity was confirmed by DNA sequencing.
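The primers listed above carry short 5′ extensions ahead of the gene-specific sequence. The source does not name the restriction enzymes used for cloning, so the following Python sketch is only an illustrative check: it scans each primer for four standard recognition sequences (KpnI, NotI, EcoRI, and XhoI), chosen here as an assumption. The dictionary and function names are hypothetical.

```python
# Standard recognition sequences; their use for cloning here is an assumption,
# because the source does not name the enzymes.
SITES = {
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "B*05104 sense": "TAT GGT ACC ATG GCG CCC CGA ACC CTC CTT",
    "B*05104 antisense": "TAT GCG GCC GCC ACA AGA CAG TTG TCT TTT CA",
    "B11L*0101 sense": "GCG GAA TTC GAG ACG CCA AGA TGC GGT",
    "B11L*0101 antisense": "GCG CTC GAG TCA AGC CGT GAG AGA CAC AT",
}


def find_sites(primer: str) -> dict[str, int]:
    """Map each site found in the primer to its 0-based start index."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for label, primer in PRIMERS.items():
        print(label, find_sites(primer))
```

Each primer contains exactly one of the four sites, starting at index 3, immediately after a three-nucleotide 5′ clamp.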
T cell assays
SN45-derived TCR α and β cDNAs were cloned into pREP7 and pREP9, respectively, and transfection into TCRβ-deficient J.RT3 cells was performed by electroporation as described previously (21). Cells were cultured in the presence of G418 (1.5 mg/ml) and hygromycin B (0.5 mg/ml) for the selection of transfectants and used as responder cells in T cell assays. T cells (5 × 10⁴/well) were cultured with HeLa cell transfectants (5 × 10⁴/well) expressing each Mamu-B allele in the presence of 10 μg/ml lipopeptides and 20 nM PMA in 96-well flat-bottom microtiter plates. After 24 h, culture supernatants were collected, and the amount of IL-2 released into the medium was measured using the BD ELISA kit (BD Biosciences, Franklin Lakes, NJ).
Generation of lipopeptide-bound Mamu-B*05104 complexes
Recombinant proteins were prepared as described previously (17). Briefly, DNA constructs encoding the ectodomain of Mamu-B*05104 (from G1 to P276 with R128 and K177 mutated into Glu and a Met-Ala added to the N terminus) and rhesus β2-microglobulin (β2m) (from I1 to M99 with a Met-Ala added to the N terminus) were synthesized and cloned into pLM1. The expression plasmids were introduced into the Escherichia coli Rosetta 2 (DE3) pLysS strain (Novagen, Madison, WI), and protein expression was induced in the presence of isopropyl-β-d-thiogalactoside, followed by the isolation of inclusion bodies. Purified inclusion bodies were dissolved in buffer containing 6 M guanidine-HCl, and the insoluble material was removed by centrifugation. The supernatant was treated with 50 mM DTT at 37°C for 3 h, and aliquots were stored at −80°C until used.
C14nef4 and its structural analogue (C14-GGGI), in which A3 was mutated into glycine, were solubilized in methanol. To obtain lipopeptide-loaded Mamu-B*05104 complexes, solubilized Mamu-B*05104 heavy chains (32 mg) and β2m (12 mg) were refolded by rapid dilution in the presence of C14nef4 (7.5 mg) or C14-GGGI (10 mg). After dialysis against 10 mM Tris-HCl (pH 8), the refolded proteins were purified by HiLoad 16/600 Superdex 200 pg (GE Healthcare, Milwaukee, WI) size-exclusion chromatography, followed by monoQ (GE Healthcare) anion exchange chromatography.
Crystallization and structural elucidation
Crystals of Mamu-B*05104:lipopeptide complexes were formed as described previously with some modifications (22). For the C14nef4-loaded complex, 1 μl of a 10 mg/ml protein solution and 1 μl of a mother liquor containing 100 mM MIB buffer (Molecular Dimensions, U.K.) (pH 7) and 25% polyethylene glycol 1500 were mixed at 4°C. For the C14-GGGI–loaded complex, 1 μl of a 10 mg/ml protein solution and 1 μl of a mother liquor containing 200 mM sodium fluoride, 100 mM Bis-Tris propane (pH 7.5), and 20% polyethylene glycol 3350 were mixed at 20°C. The crystals that formed were then cryoprotected in 20% ethylene glycol. Diffraction data were collected at 100 K (in a cold nitrogen gas stream) on a Rigaku Saturn A200 charge-coupled device detector (Rigaku/MSC, Woodlands, TX) using synchrotron radiation with a wavelength of 1.0 Å at the BL26B1 station (SPring-8, Hyogo, Japan). The resulting data sets were processed, merged, and scaled using HKL-2000 (HKL Research, Charlottesville, VA) (23). Complex structures were solved by molecular replacement (MOLREP) with Mamu-B*098 (Protein Data Bank code 4ZFZ) as a search model, as implemented in CCP4i software (24). Models were refined using the PHENIX software package (25). Structures were rebuilt using COOT 0.8.1 (26) and further modified on σ-weighted (2|Fo| − |Fc|) and (|Fo| − |Fc|) electron density maps. Repeated cycles of rebuilding and refinement of the Mamu-B*05104:C14nef4 and Mamu-B*05104:C14-GGGI complexes resulted in 96.8 and 98.4% of residues being in favored regions and 0.3 and 0.0% of residues being outliers, respectively, in a Ramachandran plot. Crystallographic images were visualized using PyMOL (DeLano Scientific, San Carlos, CA).
The root mean-square deviation (RMSD) values over all Cα atoms in the α1 and α2 domains after the superimposition of Mamu-B*05104 with Mamu-A*002 (3JTT) (27), Mamu-B*017 (3RWG) (28), HLA-A2 (3MRE), HLA-B27 (1K5N), HLA-E (3BZE), and HLA-F (5IUE) were calculated. The size of the B pocket, which was lined by amino acid residues at positions 7, 9, 22, 24, 25, 34, 35, 36, 45, 63, 66, 67, 70, 74, 97, and 99, was calculated for each representative MHC class I allele using the CASTp web server (http://cast.engr.uic.edu) (29) with a probe radius of 1.2 Å.
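For readers unfamiliar with the metric, the RMSD over paired Cα atoms can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: it assumes that the two coordinate sets have already been superimposed and that atoms are paired in the same order, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as input) between two equal-length lists of (x, y, z).

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates: each atom is displaced by 1 Å.
    a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    b = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(rmsd(a, b))  # 1.0
```

Lower values indicate closer agreement between the two backbones; the superposition step itself (finding the optimal rotation and translation) is outside this sketch.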
Accession numbers
Atomic coordinates and structure factors for the reported crystal structures have been deposited in the Protein Data Bank (https://pdbj.org) under accession codes 6IWG (for Mamu-B*05104:C14nef4) and 6IWH (for Mamu-B*05104:C14-GGGI).
SN45 TCR tetramer and flow cytometry
A soluble form of the SN45 TCR protein was generated using the disulfide-linked TCR method as described previously (30). DNA constructs encoding the extracellular domains of the SN45-derived TCR α-chain (from K1 to G197 with T155 mutated to Cys) and the TCR β-chain (from D1 to D246 with S173 mutated to Cys) that was fused with the C-terminal BirA substrate peptide sequence, LHHILDAQKMVWNHR, were cloned into pLM1, and expression plasmids were expressed in E. coli as described above. Purified TCRα (21 mg) and TCRβ (35 mg) proteins were combined in 1 l of a buffer containing 100 mM Tris-HCl (pH 8.1), 400 mM l-arginine, 2 mM EDTA, 5 M urea, 3.7 mM cystamine, and 6.6 mM cysteamine and incubated at 10°C for 24 h with continuous stirring. After dialysis against 6 l of 10 mM Tris-HCl (pH 8), the paired TCR α/β protein was purified as described above, for which intermolecular disulfide bond formation was confirmed by a nonreducing SDS-PAGE analysis. The purified protein was then biotinylated using the BirA enzyme (Sigma-Aldrich), and tetramer formation was achieved using streptavidin-R–PE (Invitrogen). To assess the reactivity of the TCR tetramer, K562 cell transfectants expressing Mamu-B*05104 or mock-transfected cells were pulsed with 160 μM lipopeptides for 4 h and then incubated with the TCR tetramer (50 μg/ml) at room temperature for 1 h. Labeled cells were analyzed by flow cytometry using FACS LSRFortessa and FlowJo software.
Animals
Rhesus macaques (Macaca mulatta) used in the current study were treated humanely in accordance with institutional regulations (31), and experimental protocols were approved by the Committee for Experimental Use of Nonhuman Primates at the Institute for Frontier Life and Medical Sciences and at the Primate Research Institute of Kyoto University.
Results
Identification of the restriction element for the presentation of C14nef4 to SN45
The SN45 CTL line, which was established from an SIV-infected rhesus monkey (MM521), responded to C14nef4 in the presence of PBMCs derived from MM521 and three other donors (MM570, MM571, and MM606) (“positive donors”), whereas two donors (MM601 and MM1867) failed to activate SN45 (“negative donors”). The clear separation of positive and negative donor groups for CTL activation indicated that the presentation of C14nef4 to SN45 was mediated by a polymorphic MHC class I allele. The rhesus classical MHC class I is marked by the presence of multiple Mamu-A and Mamu-B alleles per chromosome (32, 33); thus, the expression of several Mamu-A and Mamu-B alleles is simultaneously detected in a single individual (Fig. 1A). Deep sequencing of the MHC class I alleles expressed in each of the donors revealed that three alleles, Mamu-B*05104, Mamu-B*06004, and Mamu-B11L*01, were shared among the positive donors and were absent in the negative donors (Fig. 1B, highlighted in orange, blue, and green, respectively). To assess their potential to mediate the presentation of C14nef4 to SN45, HeLa cell transfectants expressing each of these alleles were tested for their ability to present C14nef4 to the J.RT3/SN45 responder cells that were obtained by transfection of the TCR-deficient J.RT3 T cell line with cDNAs encoding the SN45 TCR α- and β-chains. The J.RT3/SN45 response by IL-2 production was only observed when T cells were incubated in the presence of C14nef4 with HeLa cell transfectants expressing Mamu-B*05104 but not those expressing Mamu-B*06004 or Mamu-B11L*01 (Fig. 2A, left panel). The C14nef4-specific, Mamu-B*05104-dependent response was undetectable for untransfected J.RT3 cells, indicating the involvement of TCRs in Ag recognition (Fig. 2A, right panel). 
Although Mamu-B*05104 shared sites for N-glycosylation and intradomain disulfide bond formation with Mamu-B*098, significant differences were observed in the primary amino acid sequences of their α1 and α2 domains; accordingly, a phylogenetic tree analysis indicated that Mamu-B*05104 and Mamu-B*098, both of which belonged to the classical MHC-B family, were only remotely associated (Fig. 2B, 2C). Collectively, these results provide compelling evidence to show that the classical MHC class I allele, Mamu-B*05104, functions as a novel lipopeptide Ag-presenting molecule capable of mediating the presentation of N-myristoylated 4-mer lipopeptides to CTLs.
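The donor-based screen described above amounts to a set operation: keep alleles present in every positive donor, then discard any allele found in a negative donor. The Python sketch below illustrates that logic only. The donor names and the three candidate alleles come from the text, but the remaining genotype entries are hypothetical placeholders, not the actual typing results shown in Fig. 1B.

```python
def candidate_alleles(positive: dict[str, set[str]],
                      negative: dict[str, set[str]]) -> set[str]:
    """Alleles shared by all positive donors and absent from all negative donors."""
    if not positive:
        return set()
    shared = set.intersection(*(set(a) for a in positive.values()))
    seen_in_negative = set().union(*(set(a) for a in negative.values()))
    return shared - seen_in_negative


# The three candidate alleles are from the text; "placeholder-N" entries are
# hypothetical stand-ins for the donors' other alleles.
CANDIDATES = {"Mamu-B*05104", "Mamu-B*06004", "Mamu-B11L*01"}
POSITIVE = {
    "MM521": CANDIDATES | {"placeholder-1"},
    "MM570": CANDIDATES | {"placeholder-2"},
    "MM571": CANDIDATES | {"placeholder-1", "placeholder-3"},
    "MM606": CANDIDATES | {"placeholder-2"},
}
NEGATIVE = {
    "MM601": {"placeholder-1", "placeholder-3"},
    "MM1867": {"placeholder-2", "placeholder-4"},
}

if __name__ == "__main__":
    print(sorted(candidate_alleles(POSITIVE, NEGATIVE)))
```

A screen of this kind only narrows the candidates; as in the study, the restricting allele still has to be confirmed functionally with transfectants.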
MHC class I alleles expressed in rhesus donors. (A) Models of the classical MHC class I loci in humans (HLA) and in rhesus macaques (Mamu) are illustrated in parallel. Note that several Mamu-A and Mamu-B alleles are present per chromosome. (B) Mamu-A and Mamu-B alleles expressed in each individual are shown, and those that were shared among the positive donors and absent in the negative donors were differentially highlighted.
Identification of Mamu-B*05104 as a lipopeptide-presenting molecule. (A) HeLa cell transfectants expressing Mamu-B*05104, Mamu-B*06004, or Mamu-B11L*01 as well as mock-transfected cells were cultured with SN45 TCR-reconstituted (left panel) or untransfected J.RT3 cells (right panel) in the presence or absence of C14nef4. T cell responses were monitored by measuring the amount of IL-2 released into the medium. Mean values with SEM are shown. (B) A phylogenetic tree constructed by the neighbor-joining method for representative classical (Mamu-A and -B) and nonclassical (Mamu-AG, -I, -E, and -F) MHC class I alleles is shown. Note that Mamu-B*05104 as well as Mamu-B*098 belonged to the classical Mamu-B family. (C) The amino acid sequences of Mamu-B*098 and Mamu-B*05104 are shown, in which unmatched amino acid residues are highlighted. Solid and open triangles indicate paired cysteine residues for intramolecular disulfide bonds and the asparagine residue for N-glycosylation, respectively. Asterisks indicate amino acid residues that establish VDW interactions with the acyl chain in Mamu-B*098.
Structure of the C14nef4-bound Mamu-B*05104 complex
The ability of MHC class I molecules to bind N-myristoylated lipopeptides with a short stretch of 4 aa residues has not been fully recognized; therefore, to elucidate the molecular mechanisms underlying the binding of C14nef4 to Mamu-B*05104, we sought to determine the x-ray crystal structure of the Mamu-B*05104:C14nef4 complex. The ectodomain of the Mamu-B*05104 H chain and monkey β2m were produced in E. coli as inclusion bodies, and the purified recombinant proteins were refolded in the presence of C14nef4, followed by crystallization. We solved the crystal structure of C14nef4-bound Mamu-B*05104 at a resolution of 1.8 Å (Table I). The overall structure of Mamu-B*05104 was virtually indistinguishable from that of other MHC class I molecules, in which the two semisymmetrical α1 and α2 domains formed a β-sheet platform topped by two semiparallel α helices (Fig. 3A) (8). The α1/α2 fold of Mamu-B*05104 exhibited a high degree of structural similarity with those of the conventional peptide-presenting MHC class I molecules, including Mamu-A*002, Mamu-B*017, HLA-A2, and HLA-B27, with RMSD values of 0.74, 0.53, 0.61, and 0.68 Å, respectively, whereas larger structural deviations were noted for the nonclassical MHC class I molecules (HLA-E and -F) (Fig. 3B).
| | Mamu-B*05104:C14nef4ᵃ | Mamu-B*05104:C14-GGGIᵃ |
|---|---|---|
| Data collection | | |
| Space group | P2₁2₁2₁ | P2₁2₁2₁ |
| Cell dimensions | | |
| a, b, c (Å) | 53.82, 81.76, 106.63 | 55.10, 80.79, 106.42 |
| α, β, γ (°) | 90, 90, 90 | 90, 90, 90 |
| Resolution (Å)ᵃ | 50–1.80 (1.83–1.80) | 50–1.95 (1.98–1.95) |
| Rmerge | 0.062 (0.321) | 0.073 (0.262) |
| I/σI | 43.9 (7.2) | 46.0 (10.0) |
| Completeness (%) | 99.9 (99.5) | 99.8 (98.4) |
| Redundancy | 7.1 (6.6) | 8.6 (7.8) |
| Refinement | | |
| Resolution (Å) | 1.80 (1.84–1.80) | 1.95 (2.00–1.95) |
| No. of reflections | 44052 (2700) | 35237 (2852) |
| Rwork/Rfree (%) | 17.9 (19.0)/21.5 (25.7) | 17.5 (18.6)/21.7 (26.6) |
| No. of atoms | | |
| Protein | 3133 | 3115 |
| MYR/sodium/EDO/BO3 | 15/4/72/12 | 15/5/52/0 |
| Water | 304 | 255 |
| B-factors (Å²) | | |
| Protein | 25.4 | 26.4 |
| Ligand and ion | 37.8 | 31.1 |
| Water | 31.1 | 30.8 |
| RMSDs | | |
| Bond lengths (Å) | 0.010 | 0.010 |
| Bond angles (°) | 1.201 | 1.142 |
ᵃ The highest resolution shell is shown in parentheses.
EDO, ethylene glycol; MYR, myristic acid.
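The source does not spell out how the R values in Table I are defined. Assuming the conventional crystallographic definitions (an assumption, not a statement from the authors), they would read as follows, where $I_i(hkl)$ is the $i$th measurement of reflection $hkl$, $\langle I(hkl)\rangle$ is the mean of its measurements, and $F_o$ and $F_c$ are the observed and calculated structure factors:

```latex
\begin{align}
R_{\mathrm{merge}} &=
  \frac{\sum_{hkl}\sum_{i}\bigl|I_i(hkl)-\langle I(hkl)\rangle\bigr|}
       {\sum_{hkl}\sum_{i} I_i(hkl)} \\[4pt]
R_{\mathrm{work}} &=
  \frac{\sum_{hkl}\bigl|\,|F_o(hkl)|-|F_c(hkl)|\,\bigr|}
       {\sum_{hkl}|F_o(hkl)|}
\end{align}
```

Under the same convention, $R_{\mathrm{free}}$ uses the expression for $R_{\mathrm{work}}$ but is computed only over a test set of reflections excluded from refinement.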
The overall structure of the Mamu-B*05104:C14nef4 complex. (A) Two views of the trimer complex of the ectodomain of Mamu-B*05104 heavy chains (green), β2m (orange), and C14nef4 (yellow sticks) are shown. (B) Superimposed images of the α1 and α2 domains of Mamu-B*05104 (green) with those of the classical (Mamu-A*00201, -B*1701, HLA-A*0201, and -B*2705) and nonclassical (HLA-E and -F) MHC class I molecules (magenta) are shown with RMSD values provided in parentheses. (C) The binding of C14nef4 (yellow sticks) to Mamu-B*05104 is demonstrated by a 2Fo-Fc map (green mesh) contoured at 0.8σ. (D) The surface of the Ag-binding groove with pockets A through F and the bound C14nef4 lipopeptide (yellow sticks) are shown. (E) The bound C14nef4 lipopeptide (yellow sticks) accommodated in the semitransparent Ag-binding groove is shown.
Electron density corresponding to C14nef4 was observed in the Ag-binding groove of Mamu-B*05104, located between the α1 and α2 helices on top of the antiparallel β-sheet (Fig. 3C). Six pocket structures, designated A through F, that are common to MHC class I molecules were also identified in the Ag-binding groove of Mamu-B*05104 (Fig. 3D); the myristoyl group (C14) and C-terminal isoleucine residue (I4) were buried deeply in the groove and interacted primarily with the B pocket and F pocket, respectively, forcing the central 2 aa residues of the peptide chain to protrude out of the groove and into the solvent (Fig. 3D, 3E). The A pocket of Mamu-B*05104 was unoccupied (Fig. 3D), which contrasted sharply with conventional peptide-presenting MHC class I molecules in which A pockets primarily accommodated the N-terminal residue of peptide ligands (8).
Accommodation of the myristoyl group in the B pocket
The two anchors, namely, the myristoyl group and C-terminal isoleucine, located at both ends of the lipopeptide appeared to function as bridge piers stabilizing its binding to MHC class I. The acyl chain of C14nef4 extended deeply into the B pocket of Mamu-B*05104 in a relatively straight configuration, and no alternative conformations were detected. This was apparently distinct from the U-shaped configuration observed for the acyl chain of C14nef5 in the B pocket of Mamu-B*098 (Fig. 4A). As was noted previously for the Mamu-B*098 B pocket (Fig. 4A, right panel), the B pocket of Mamu-B*05104 was lined with an array of hydrophobic or noncharged amino acid residues, including Y7, Y24, V34, F36, T45, A67, W97, and A99, which contacted the acyl chain of C14nef4 via numerous intermolecular van der Waals (VDW) forces (Fig. 4A, left panel). Furthermore, the side chains of R66 and R70 of Mamu-B*05104 were directed outward; therefore, the polar guanidine groups at their tips were located distantly from the B pocket, whereas their hydrophobic stems were able to establish VDW interactions with the acyl chain of C14nef4 (Fig. 4A, Supplemental Table I).
Interactions with the acyl chain in the B pocket. (A) The C14nef4 lipopeptide (space-filling model) captured in the semitransparent Ag-binding groove of Mamu-B*05104 is shown with the side chains (green) of amino acid residues surrounding the acyl chain (left panel). The C14nef5 lipopeptide captured in Mamu-B*098 is also shown for comparison (right panel). (B) Mamu-B*098, Mamu-B*05104, and other MHC class I molecules for which crystal structures have been resolved are arranged in order of decreasing size of the B pocket cavity. (C) B pockets of Mamu-B*05104 and Mamu-B*098 accommodating the acyl chain (yellow sticks) as well as those of representative peptide-presenting alleles (Mamu-B*1701 and HLA-B*0801) are shown as a semitransparent groove. The side chains of amino acid residues at positions 9 and 99 are also shown with an emphasis on nonbulky side chains at these positions for Mamu-B*05104 and Mamu-B*098.
Besides its hydrophobic properties, the B pocket of Mamu-B*05104 was also marked by its ample cavity, which was similar to that of the Mamu-B*098 B pocket. The cavity sizes of the two lipopeptide-presenting MHC class I alleles were exceptional among MHC class I alleles (more than 1.5 SD above the mean) (Fig. 4B). We found that amino acid residues at positions 9 and 99 had a critical influence on the size of the B pocket; in the case of conventional peptide-presenting MHC class I alleles, such as Mamu-B*1701 and HLA-B*0801, the side chains of Y9 and Y99 and those of D9 and Y99, respectively, protruded into the cavity, thereby reducing the cavity sizes of the B pockets. In contrast, the amino acid residues at these positions were small for Mamu-B*05104 (G9 and A99) and Mamu-B*098 (S9 and S99), which provided extra space to accommodate the acyl chain (Fig. 4C).
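The "1.5 SD above the mean" criterion can be illustrated with a small Python sketch. The cavity volumes below are hypothetical numbers invented for the example (the measured values are those plotted in Fig. 4B), and the use of the sample SD rather than the population SD is an assumption, because the source does not say which was applied.

```python
from statistics import mean, stdev


def unusually_large(volumes: dict[str, float], k: float = 1.5) -> list[str]:
    """Names whose value exceeds the mean by more than k sample SDs."""
    values = list(volumes.values())
    threshold = mean(values) + k * stdev(values)
    return [name for name, v in volumes.items() if v > threshold]


# Hypothetical cavity volumes (arbitrary units); not data from the study.
HYPOTHETICAL_VOLUMES = {
    "hypothetical-01": 100, "hypothetical-02": 110, "hypothetical-03": 90,
    "hypothetical-04": 105, "hypothetical-05": 95, "hypothetical-06": 100,
    "hypothetical-07": 98, "hypothetical-08": 102, "hypothetical-09": 92,
    "hypothetical-10": 108,
    "hypothetical-large-1": 200, "hypothetical-large-2": 190,
}

if __name__ == "__main__":
    print(unusually_large(HYPOTHETICAL_VOLUMES))
```

With these invented values, only the two deliberately large entries pass the threshold, mirroring the pattern reported for the two lipopeptide-presenting alleles.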
Accommodation of the C-terminal residue in the F pocket
The C-terminal isoleucine residue of C14nef4 was accommodated in the F pocket of Mamu-B*05104, which was larger than that of Mamu-B*098. In the case of Mamu-B*098, the side chain of glutamine at position 116 protruded into the F pocket, which not only pushed up the bottom of the pocket but also established a unique hydrogen bond with the side chain of the C-terminal serine residue of C14nef5 (Fig. 5, right panels). Based on these structural features, we previously predicted that the Mamu-B*098 F pocket was elaborately designed for binding a small polar amino acid residue, such as serine or threonine, which constituted the N-myristoylation motif (18). In contrast, the small amino acid residue at position 116 (S116) of Mamu-B*05104 appeared to be less influential, making its F pocket more similar to that of conventional peptide-presenting MHC class I alleles, such as HLA-B51, which had a binding preference for bulky nonpolar residues, including isoleucine (34). We found that the main chain of the isoleucine of C14nef4 established a hydrogen bond network with D77, Y84, T143, and K147 of Mamu-B*05104 (Fig. 5, middle panels), which was virtually identical to that established between HLA-B51 and the C-terminal isoleucine residue of the 9-mer peptide ligand (left panels) (35). The side chain of the C-terminal isoleucine residue of C14nef4 also contributed to ligand binding by establishing VDW interactions with L81, Y84, L95, and Y123 of Mamu-B*05104 (Supplemental Table I).
Interactions with the C-terminal amino acid residues in the F pocket. Top- and side-view images of F pockets are presented for Mamu-B*05104 (left), HLA-B51 (middle), and Mamu-B*098 (right) as semitransparent grooves with the side chain of the bound C-terminal amino acid residue (isoleucine, isoleucine, and serine, respectively) shown with yellow sticks. Amino acid residues establishing hydrogen bonds (dashed lines) with the bound C-terminal residue are also indicated.
Potential T cell epitopes
Because the N-myristoylated glycine (C14-G1) and C-terminal isoleucine (I4) residues were submerged deeply in the Ag-binding groove of Mamu-B*05104, only the central 2 aa residues of C14nef4, namely, G2 and A3, were positioned outside of the groove (Fig. 3E). Therefore, we attempted to assess the T cell epitopic potential of the methyl group of A3, the sole side chain exposed to the solvent. The SN45 TCR tetramer (Fig. 6A, left), which was capable of detecting C14nef4 in the context of Mamu-B*05104, specifically stained K562 cell transfectants expressing Mamu-B*05104 (K562/B*05104) that were pulsed with C14nef4 (C14-GGAI) (Fig. 6A, right). In contrast, the tetramer failed to react with K562/B*05104 cells pulsed with C14-GGGI, the mutant ligand in which the methyl group was removed from A3 (Fig. 6A, right). Similar results were obtained using J.RT3/SN45 T cells as responder cells, which recognized C14-GGAI in a dose-dependent manner but failed to react to C14-GGGI (Fig. 6B). Furthermore, the high-resolution x-ray crystal structure of Mamu-B*05104 complexed with the C14-GGGI mutant lipopeptide (Supplemental Fig. 1, Table I) confirmed that the spatial configuration of the complex (Fig. 6C) and the hydrogen bond network involved in ligand binding (Fig. 6D) were virtually identical to those observed for the Mamu-B*05104:C14nef4 complex, except for the presence or absence of the methyl group of A3. Based on these functional and structural observations, we concluded that the methyl group of A3 comprised a major epitope recognized by SN45 T cells (Fig. 7).
T cell recognition of bound lipopeptides. (A) The PE-labeled SN45 TCR tetramer (left) was tested for its reactivity to K562 cell transfectants expressing Mamu-B*05104 (K562/B*05104) by flow cytometry (right). Histograms for TCR tetramer-stained cells (filled) and unstained cells (unfilled) are shown. Specific staining was only observed for K562/B*05104 cells that were pulsed with C14nef4 (C14-GGAI) and not those pulsed with the structural analogue, C14-GGGI. (B) HeLa cell transfectants expressing Mamu-B*05104 were cultured with J.RT3/SN45 T cells in the presence of either C14nef4 or C14-GGGI at indicated concentrations. T cell responses were assessed by measuring the amount of IL-2 released into the medium. Mean values with SEM are shown. (C) Top-view images of the α1 and α2 domains of Mamu-B*05104 complexed with C14-GGAI (left, green) or C14-GGGI (middle, pink) are superimposed (right). (D) Hydrogen bond interactions with C14-GGAI (left panel, yellow sticks) as well as those with C14-GGGI (right panel, yellow sticks) are shown as orange dashed lines. Amino acid residues of Mamu-B*05104 that are involved in hydrogen bond interactions are indicated with their side chains displayed as stick models.
Molecular models that recapitulate salient features of lipopeptide-presenting MHC class I molecules. Molecular models for lipopeptide-bound Mamu-B*05104 (middle) and Mamu-B*098 (right) are presented with the representative MHC class I allele, HLA-B*51, that binds long peptides (left) for comparison. Note that the large, hydrophobic B pocket of Mamu-B*05104 is similar to that of Mamu-B*098, whereas its F pocket is virtually identical to that of HLA-B*51.
Discussion
The MHC class I–mediated presentation of 8–10-mer peptides to CTLs is a basic paradigm that modern immunology has established; however, it is now clear that a couple of MHC class I alleles have evolved the ability to present N-myristoylated short peptide chains to CTLs. Structural analyses of Mamu-B*05104 and Mamu-B*098 highlighted the unique B pocket structure that was distinct from that of conventional peptide-presenting MHC class I alleles. Amino acid residues that construct the B pocket of MHC class I molecules are highly polymorphic, and this was also the case with the two lipopeptide-presenting MHC class I alleles. A phylogenetic tree analysis indicated that Mamu-B*05104 and Mamu-B*098 were not closely associated, and distinct sets of amino acid residues constituted the B pockets of Mamu-B*05104 (G9, Y24, T45, E63, R66, A67, R70, D74, W97, and A99) and Mamu-B*098 (S9, V24, M45, Q63, R66, V67, A70, F74, T97, and S99) (Fig. 4A). Nevertheless, these sets of amino acid residues functioned similarly to establish VDW interactions with the acyl chain (Fig. 4A, Supplemental Table I), and the voluminous B pocket structure was achieved by placing small amino acid residues at the floor (G9 and A99 for Mamu-B*05104; S9 and S99 for Mamu-B*098). These molecular features, which appeared to be better suited to accommodating the acyl chain than amino acid residues, have not been noted previously for peptide-presenting MHC class I alleles.
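The divergence between the two B pockets can be made explicit with a small position-by-position comparison. The following is only an illustrative sketch in Python: the residue assignments are copied from the lists given in the text, whereas the dictionary names and the helper function `compare_pockets` are hypothetical and are not part of the analysis performed in this study.

```python
# B pocket residues as listed in the text (position -> one-letter residue code).
# Variable and function names are illustrative assumptions.
MAMU_B05104 = {9: "G", 24: "Y", 45: "T", 63: "E", 66: "R",
               67: "A", 70: "R", 74: "D", 97: "W", 99: "A"}
MAMU_B098 = {9: "S", 24: "V", 45: "M", 63: "Q", 66: "R",
             67: "V", 70: "A", 74: "F", 97: "T", 99: "S"}


def compare_pockets(a, b):
    """Return (shared, differing) pocket positions common to both alleles."""
    common = sorted(set(a) & set(b))
    shared = [pos for pos in common if a[pos] == b[pos]]
    differing = [pos for pos in common if a[pos] != b[pos]]
    return shared, differing


shared, differing = compare_pockets(MAMU_B05104, MAMU_B098)
print("Identical positions:", shared)
print("Differing positions:", differing)
for pos in differing:
    print(f"  position {pos}: {MAMU_B05104[pos]} vs. {MAMU_B098[pos]}")
```

With the residues listed above, only position 66 (arginine) is identical between the two alleles, which is consistent with the statement that distinct sets of residues build functionally similar B pockets.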
It is also important to note that some amino acid residues shared between peptide- and lipopeptide-presenting MHC class I alleles were assigned to play different roles in binding peptides and lipopeptides. For example, the majority of MHC class I alleles possess tyrosine at position 7, and the hydroxyl group at the tip of its side chain is commonly used to establish a hydrogen bond with the N-terminal amino acid residue of peptide ligands for its anchoring at the A pocket (8). In contrast, the benzene ring of Y7 was used to establish VDW interactions with lipopeptide ligands. Thus, it is interesting to speculate that lipopeptide-presenting MHC class I alleles may have emerged during the evolutionary process of polymorphic MHC class I genes, and this is also supported by the F pocket structure and function of Mamu-B*05104 (Fig. 5). As is the case with HLA-B51, which has a binding preference for bulky nonpolar residues (34), the F pocket of Mamu-B*05104 captured the Ile residue by establishing numerous VDW interactions (Supplemental Table I), most of which would be lost if Ser were substituted for Ile at this position. Therefore, the F pocket of Mamu-B*05104 is more similar in structure and function to that of HLA-B51 than to that of Mamu-B*098.
The highly specific recognition of 9-mer peptides by MHC class I–restricted CTLs is mediated by TCR interactions with multiple epitopic determinants, which generally involve several amino acid residues (7). In contrast, the x-ray crystal structure of the Mamu-B*05104:C14nef4 complex indicated that epitopic variations generated within lipopeptide Ags are so limited that T cells may be almost unable to discriminate self-derived from nonself-derived lipopeptides with the precision achieved for 9-mer peptides. However, SN45 T cells exhibited an exceptional ability to recognize C14nef4 in the context of Mamu-B*05104 without showing an apparent autoreactivity to Mamu-B*05104+ cells (19). In addition, rhesus monkeys mounted T cell responses to SIV Nef-derived lipopeptide Ags after, but not before, SIV infection, indicating the specific recognition of viral lipopeptides. Results obtained from our preliminary experiments indicated that a distinct chemical class of self-ligands, such as a phospholipid species abundantly expressed within cells, comprised a major ligand repertoire for lipopeptide-presenting MHC class I alleles in healthy cells. In virus-infected cells, an excess amount of N-myristoylated viral proteins and their degradation products as well as defective ribosomal products may be generated under the influence of inflammatory cytokines (36). Accordingly, the rapid intracellular accumulation of viral lipopeptides may facilitate their presentation to CTLs.
Compared with the large population of CTLs that recognize long peptides, lipopeptide-specific CTLs may constitute a small population; nevertheless, these cells are likely to play a unique role in critical aspects of host defense. N-myristoylation of the Nef protein is essential for its anchoring at the plasma membrane, thereby exerting its immunosuppressive activity (37). This immune escape mechanism that retroviruses have evolved may be simultaneously counterbalanced by the host immune system through the elicitation of lipopeptide-specific CTL responses, which are capable of sensing the N-myristoylation event in infected cells and eliminating them. Upon infection, CTLs that recognize Nef-derived peptides vigorously expand; however, their ability to recognize long peptides in a highly specific manner may adversely allow pathogenic viruses to escape simply by mutating a single amino acid residue that constitutes the T cell epitope (38). In contrast, the short stretch of N-terminal amino acid residues constitutes the N-myristoylation motif; therefore, there are few chances for pathogens to introduce arbitrary mutations without affecting the efficiency of N-myristoylation (39, 40).
Based on the present results, we propose a molecular model that recapitulates contrasting features of conventional peptide-presenting and novel lipopeptide-presenting MHC class I subsets (Fig. 7). N-myristoylated short peptides are captured by MHC class I molecules with their N-terminal C14-Gly and C-terminal residues accommodated in the B and F pockets, respectively, leaving the A pocket unoccupied. The large, hydrophobic B pocket constitutes the most salient feature of lipopeptide-presenting MHC class I alleles; the F pocket structure appears to be variable and may be small (Mamu-B*098) or similar in size (Mamu-B*05104) to that of peptide-presenting MHC class I molecules depending on the nature of the C-terminal anchor residue. Because of the short stretch of amino acid residues exposed to the solvent, the T cell epitopic variations generated within lipopeptide ligands are more limited than those generated within long peptide ligands; however, T cells are capable of monitoring the presence or absence of the alkyl side chain of the ligand, including the methyl group of A3 of C14nef4 captured by Mamu-B*05104. Structural differences as well as similarities between long peptide- and short lipopeptide-presenting MHC class I molecules are now beginning to be elucidated, providing novel insights into their coevolution.
Acknowledgements
We thank Dr. Tatsuhiko Igarashi for his helpful discussions concerning animal experiments.
Footnotes
This work was supported by Japan Society for the Promotion of Science KAKENHI Grants 17H05791, 18K19563, 18H02852, and 19H04805 (to M.S.), and 16K19151 (to D.M.) and by a grant from Kato Memorial Bioscience Foundation (to D.M.). It was also supported by the Cooperation Research Program of the Primate Research Institute, Kyoto University.
The atomic coordinates and structure factors have been submitted to the Protein Data Bank (https://pdbj.org) under accession numbers 6IWG and 6IWH.
The online version of this article contains supplemental material.
References
Disclosures
The authors have no financial conflicts of interest.