Abstract
Abs are very efficient drugs: ∼70 of them are already approved for medical use, over 500 are in clinical development, and many more are in preclinical development. One important step in the characterization and protection of a therapeutic Ab is the determination of its cognate epitope. The gold standard is the determination of the three-dimensional structure of the Ab/Ag complex by crystallography or nuclear magnetic resonance spectroscopy. However, it remains a tedious task, and its outcome is uncertain. We have developed MAbTope, a docking-based method for epitope prediction associated with straightforward experimental validation procedures. We show that MAbTope predicts the correct epitope for each of 129 tested examples of Ab/Ag complexes of known structure. We further validated this method through the successful determination, and experimental validation (using human embryonic kidney 293 cells), of the epitopes recognized by two therapeutic Abs targeting TNF-α: certolizumab and golimumab.
Introduction
The use of Abs as drugs against a large number of diseases has dramatically increased in the last decade, and this tendency should still intensify in the near future (1). Because many Abs are often developed against the same target, it has become essential to determine the epitope of an Ab early in its development. Moreover, the identification of the epitope is an important element in the understanding of Ab mechanism of action (2).
Aside from three-dimensional (3D) structures, most experimental methods available for epitope determination are based on one of the following: 1) site-directed mutagenesis; 2) peptide arrays (3–5); or 3) mass spectrometry (6). Most peptide-based methods use 15–30 aa overlapping peptides of the target arrayed on a solid support, which are then exposed to the Ab (4, 5). This identification of interacting peptides can then be completed by alanine scanning to define the epitope more precisely (3). In the mass spectrometry–based approach, the Ab/Ag complex is subjected either to hydrogen/deuterium exchange (7) or to enzymatic digestion, which allows differentiating target peptides that are “protected” by the presence of the Ab. These peptides can then be identified using mass spectrometry [see (6) for a review]. It should be noted that even when successful, these different approaches are likely to provide nonidentical definitions of the epitope. Indeed, because of the crystallization step that freezes the complex structure in one out of many possible conformations, an X-ray structure identifies only the most stable interactions. Alanine scanning does not allow identifying all the interacting residues, for several reasons: the mutated amino acid might interact with the Ab through its main chain, or the mutation to alanine might not be drastic enough to give rise to a measurable difference in affinity. Still, there is usually a large overlap between the epitopes identified by each method, which corresponds to the core of the interface.
However, these approaches are expensive, time-consuming, and, except for crystallography, remain error prone. Indeed, the results obtained through hydrogen/deuterium exchange mass spectrometry are sometimes very difficult to interpret, for example, when there is a conformational change in the target between the free and complexed forms (7). The performance of peptide arrays at identifying epitopes is limited by several factors (8): immobilization methods, affinity of the peptides, and conformational constraints induced by the immobilization. For these reasons, much effort has been put into developing in silico methods capable of predicting Ab/Ag interactions. This endeavor has taken two main directions: 1) B cell epitope prediction, which aims at predicting the regions of a protein that are the most amenable to being targeted by an Ab; and 2) partner-specific approaches, which aim at predicting the epitope for a single Ab/target pair [see (9, 10) for reviews]. Only the second type of method leads to the prediction of the epitope for a given Ab, although B cell epitope prediction can be a useful first step in this process. Among the partner-specific approaches, three main categories can be distinguished: predictors based on the intrinsic properties of the partners, predictors based on coevolution of the partners, and predictors based on docking. However, few of these methods are dedicated to the special case of Ab/Ag interaction.
Docking methods were originally designed to predict the conformation of the assembly between two interacting proteins. From a correct prediction of this conformation, the interaction regions can be straightforwardly defined. For this reason, docking methods have been applied to the prediction of interaction interfaces and, in some cases, to the specific problem of predicting the epitope and the paratope. Some methods provide accurate results, such as Rosetta (11) and Z-dock (12), but in local docking only, meaning that they require partial knowledge of the epitope. The introduction of decoys as the reference state (DARS), a pairwise statistical potential specific to Ab/Ag interactions, allows PIPER/ClusPro (which is the algorithm used for docking within the BioLuminate suite) to achieve satisfactory results (13), placing at least one near-native solution in the top 10 predicted conformations. The particularity of this statistical potential compared with previously used ones is that it accounts for the asymmetry of the Ab/Ag interaction. Another example of a web server specific for Ab/Ag docking is FRODOCK (14, 15). FRODOCK uses spherical harmonics for conformation generation (as opposed to the fast Fourier transform used by most other algorithms, including PIPER) and a combination of energetic (van der Waals, electrostatics, and desolvation) and knowledge-based potentials, optimized for the different categories of complexes (enzyme, Abs, and others). However, the goal of these methods is to predict the conformation of the assembly, that is, not only the interaction region but also the precise relative orientations of the two partners, rather than to predict the epitope. Even though they perform better at this task than the other types of epitope prediction methods, they are not optimized for it.
We have developed a new method for epitope determination named MAbTope, which integrates both a docking-based prediction method and experimental steps. Indeed, the software part of the method automatically outputs peptides, without any human intervention, that can be readily used for experimental validation. We also show how these peptides can be used to design point mutations in the target, allowing a more precise definition of the epitope. Thus, this method, although in part computational, is not just a prediction method, but also includes the experimental validation of the epitope.
Materials and Methods
Overview of the method
The 3D structures of the Ab and of the target are used as input to the Hex software (16) (Fig. 1). Hex generates more than 10⁸ docking poses and ranks them according to energetic criteria (H-ranking). Each of the Hex top 500 docking poses is evaluated using 30 specific and 30 nonspecific scoring functions. The nonspecific scoring function is identical to the one used in PRIOR (17); the specific scoring function has been reoptimized using the learning dataset described hereafter and the same machine-learning procedure, combining a genetic algorithm and the covariance matrix adaptation evolutionary strategy (CMA-ES) (18). In both cases, the area under the ROC curve is used as the fitness function. A consensus score is then computed using the following formula:
where is the ranking of pose i according to the nonspecific function j, and is the ranking of pose i according to the specific function j. The rankings of pose 0 (the best ranked according to Hex) are used for normalization.
For each pose, the algorithm also computes the A-score, C-score, and P-score (see hereafter). For each residue, r, of the target, we compute a value, Vr, which is the sum of the Hex ranks of the poses in which r belongs to the interface. For a given pose, the A-score is the sum of the Vr of the residues that belong to the epitope in this particular pose. For each pose, the C-score is the sum of the ranks of the other poses that have a root mean square deviation (RMSD) value lower than 5 Å with this particular pose.
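As a rough illustration of the A-score and C-score definitions above, the following Python sketch computes both on a toy set of three poses. The data structures (a dictionary of interface residue sets, a dictionary of Hex ranks, and a pairwise RMSD lookup) and all function names are assumptions made for this example; they are not taken from the MAbTope implementation.

```python
from collections import defaultdict


def residue_values(interfaces, hex_ranks):
    """V_r: sum of the Hex ranks of the poses whose interface contains residue r."""
    values = defaultdict(int)
    for pose, residues in interfaces.items():
        for residue in residues:
            values[residue] += hex_ranks[pose]
    return dict(values)


def a_scores(interfaces, hex_ranks):
    """A-score of a pose: sum of V_r over the target residues in its interface."""
    v = residue_values(interfaces, hex_ranks)
    return {pose: sum(v[r] for r in residues) for pose, residues in interfaces.items()}


def c_scores(hex_ranks, rmsd, cutoff=5.0):
    """C-score of a pose: sum of the Hex ranks of the other poses closer than `cutoff` (in Å)."""
    scores = {}
    for p in hex_ranks:
        scores[p] = sum(
            hex_ranks[q]
            for q in hex_ranks
            if q != p and rmsd[frozenset((p, q))] < cutoff
        )
    return scores


# Toy data (hypothetical): three poses, target residues identified by number.
interfaces = {"p1": {10, 11}, "p2": {11, 12}, "p3": {40}}
hex_ranks = {"p1": 1, "p2": 2, "p3": 3}
rmsd = {
    frozenset(("p1", "p2")): 3.0,
    frozenset(("p1", "p3")): 20.0,
    frozenset(("p2", "p3")): 18.0,
}

print(a_scores(interfaces, hex_ranks))  # {'p1': 4, 'p2': 5, 'p3': 3}
print(c_scores(hex_ranks, rmsd))        # {'p1': 2, 'p2': 1, 'p3': 0}
```

In this toy case, residue 11 belongs to the interface of two poses and therefore accumulates both of their ranks, which is what makes the A-score sensitive to regions of the target that are contacted repeatedly.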
The consensus, Hex rank, A-score, C-score, and P-score are used to generate five different rankings. For each pose, the sum of its ranks in the different rankings is computed. These numbers are used to generate the final ranking. The top 30 solutions are then used to compute the interface frequency of each residue of the target, which is equal to the number of poses within these 30 in which the residue belongs to the interface. This interface frequency is used to design the interface peptides (see hereafter).
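The rank aggregation and the interface frequency described above can be sketched as follows. This is a minimal example under stated assumptions: each of the rankings is supplied as a dictionary mapping a pose to its rank (1 being the best), poses are ordered by increasing sum of ranks, and ties are broken by pose name. The function names and the toy data are hypothetical.

```python
from collections import Counter


def final_ranking(rankings):
    """Order poses by the sum of their ranks over all rankings (smallest sum first)."""
    poses = rankings[0].keys()
    totals = {p: sum(r[p] for r in rankings) for p in poses}
    return sorted(totals, key=lambda p: (totals[p], p))


def interface_frequency(ordered_poses, interfaces, top_n=30):
    """Count, for each target residue, the top-ranked poses whose interface contains it."""
    freq = Counter()
    for pose in ordered_poses[:top_n]:
        freq.update(interfaces[pose])  # interfaces[pose] is a set: one count per pose
    return freq


# Toy data (hypothetical): three rankings instead of five, three poses.
rankings = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": 1, "c": 3},
    {"a": 1, "b": 3, "c": 2},
]
interfaces = {"a": {1, 2}, "b": {2, 3}, "c": {9}}

order = final_ranking(rankings)
freq = interface_frequency(order, interfaces, top_n=2)
print(order)                       # ['a', 'b', 'c']
print(dict(sorted(freq.items())))  # {1: 1, 2: 2, 3: 1}
```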
P-score
A new postprocessing function has been introduced: the P-score. For a given docking pose, we count the number of CDR amino acids that are closer than 4 Å to an atom of the target and normalize by the total number of CDR residues. The docking poses are then ranked by decreasing values of this ratio. This rank is the P-score of the pose.
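A minimal sketch of the P-score computation is given below, assuming that atomic coordinates are available as plain (x, y, z) tuples in Å. The residue identifiers, the function names, and the tie-breaking rule (poses with equal ratios keep their input order) are assumptions made for illustration only.

```python
import math


def cdr_contact_ratio(cdr_residues, target_atoms, cutoff=4.0):
    """Fraction of CDR residues having at least one atom closer than `cutoff` to a target atom.

    cdr_residues: dict mapping a residue id to the list of its atom coordinates.
    target_atoms: list of target atom coordinates.
    """
    in_contact = sum(
        1
        for atoms in cdr_residues.values()
        if any(math.dist(a, t) < cutoff for a in atoms for t in target_atoms)
    )
    return in_contact / len(cdr_residues)


def p_scores(ratios):
    """Rank poses by decreasing contact ratio; the rank (1 = highest ratio) is the P-score."""
    ordered = sorted(ratios, key=lambda pose: -ratios[pose])
    return {pose: rank for rank, pose in enumerate(ordered, start=1)}


# Toy data (hypothetical): one target atom, two CDR residues.
target_atoms = [(0.0, 0.0, 0.0)]
cdr = {"H31": [(3.0, 0.0, 0.0)], "H52": [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]}

print(cdr_contact_ratio(cdr, target_atoms))        # 0.5
print(p_scores({"a": 0.5, "b": 0.9, "c": 0.1}))    # {'b': 1, 'a': 2, 'c': 3}
```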
Specific learning dataset
The learning dataset is composed of 393 nonredundant Ab-target complexes manually extracted from the Brookhaven Protein Data Bank in January 2015. Only the complexes in which the target is larger than 40 residues were considered. These complexes contain 392 distinct Abs, and the targets belong to 165 distinct Pfam families. The definition of nonredundancy we use is weaker than what is usually used, because Abs are very special proteins, and overall sequence identity, even restricted to the variable domain, is not indicative of the Ab specificity and, consequently, of its ability to form a complex with its target. The criteria retained for considering two Ab/Ag complexes as nonredundant were the following: 1) targets are not related (they belong to different Pfam families); 2) targets are related, but the epitopes recognized by the Abs have <20% overlap; or 3) targets are related and epitopes are overlapping, but the CDRs of the considered Abs differ in 10 or more positions. This third criterion is justified by the fact that most pairs of Abs that differ by 10 or more residues within the CDRs, even when they present a very high overall sequence identity, do not share the same target. We further computed, for each test case, the highest sequence identity (on the CDRs only) with the remainder of the learning dataset, and found that it corresponded to an Ab having a target unrelated to that of the test case.
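The three nonredundancy criteria can be written as a single decision function. The sketch below is only illustrative: it assumes that each target is summarized by one Pfam family identifier, that the epitope overlap is already available as a fraction between 0 and 1, and that the number of differing CDR positions has been computed beforehand. How these quantities are obtained is not specified here, and the function name and Pfam identifiers are hypothetical.

```python
def are_nonredundant(pfam_a, pfam_b, epitope_overlap, cdr_differences):
    """Apply the three criteria in order; return True if the two complexes are nonredundant."""
    # 1) Unrelated targets: different Pfam families.
    if pfam_a != pfam_b:
        return True
    # 2) Related targets, but epitopes overlapping by less than 20%.
    if epitope_overlap < 0.20:
        return True
    # 3) Related targets and overlapping epitopes, but CDRs differing in 10 or more positions.
    return cdr_differences >= 10


print(are_nonredundant("PF00001", "PF00002", 0.9, 0))   # True  (criterion 1)
print(are_nonredundant("PF00001", "PF00001", 0.1, 0))   # True  (criterion 2)
print(are_nonredundant("PF00001", "PF00001", 0.5, 12))  # True  (criterion 3)
print(are_nonredundant("PF00001", "PF00001", 0.5, 3))   # False (redundant)
```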
Test dataset
To evaluate the performance of the method, a test dataset has been designed. It consists of the 82 complexes of the learning dataset for which the 3D structures of the individual partners are known. For the evaluation, the learning has been done in a leave-one-out fashion, meaning that the epitope of a given Ab is predicted using a scoring function learnt on a dataset not containing the 3D structure of the complex it forms with its target. Forty-seven new complexes, whose 3D structures were determined after January 2015 and that were nonredundant with those already present in the learning dataset, have been added to this test set.
We distinguished “small” targets (40–300-aa long) from “large” targets (more than 300-aa long). However, the results obtained for the two categories only slightly differ.
Negative controls
To better evaluate the method performance, we have included negative controls. To this aim, we have compared, for each target of the test set, the epitope predicted by docking each of the noncognate Abs of the test set with the actual epitope.
Epitope definition
In this work, an amino acid of a protein targeted by an Ab will be considered as belonging to the epitope if at least one of its atoms is at <4 Å of an atom belonging to an amino acid of the Ab. These distances are computed on the crystallographic structure of the complex.
Definition of epitope peptides
Each amino acid of the target is attributed a value, which is the number of poses within the 30 top-ranked ones in which this amino acid belongs to the predicted epitope. Different sizes of pose sets have been tested, and 30 is a satisfactory compromise (data not shown). Each 15-aa peptide of the sequence is then given a score equal to the sum of these values for each amino acid in the peptide. The peptides are then ranked according to this score. Peptides overlapping by at least 8 aa with a better-ranked peptide are ignored. For benchmarking, the relevance of a given peptide is evaluated by the number of its residues that belong to the crystallographic epitope. This definition of epitope peptide was also used for the testing of ClusPro and FRODOCK. In EpiPred predictions, amino acids present in the first predicted epitope were given a score of 3, those of the second epitope a score of 2, and those of the third epitope a score of 1. In PPiPP predictions, the scores given in the program output were considered. Epitope peptides were then built as explained above.
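The peptide design step can be sketched as a sliding-window scan followed by a greedy selection. In the example below, the per-residue values are given as a list in sequence order; windows are scored by summing these values, ranked by decreasing score, and a window is skipped when it shares 8 or more positions with an already selected one. The tie-breaking rule (the leftmost window wins), the default of four peptides, and the function name are assumptions of this sketch.

```python
def design_peptides(frequencies, length=15, min_overlap=8, n_peptides=4):
    """Return up to `n_peptides` windows as (start, end, score), best first (end exclusive)."""
    # Score every window of `length` consecutive residues.
    windows = [
        (sum(frequencies[s:s + length]), s)
        for s in range(len(frequencies) - length + 1)
    ]
    # Best score first; ties are broken by the smallest start position (assumption).
    windows.sort(key=lambda w: (-w[0], w[1]))

    selected = []
    for score, start in windows:
        # Two windows of equal length overlap by length - |start difference| positions.
        if any(length - abs(start - s) >= min_overlap for _, s in selected):
            continue
        selected.append((score, start))
        if len(selected) == n_peptides:
            break
    return [(start, start + length, score) for score, start in selected]


# Toy data (hypothetical): a 40-residue target with two contacted regions.
freq = [0] * 40
for i in range(5, 10):
    freq[i] = 30
for i in range(30, 33):
    freq[i] = 10

print(design_peptides(freq))  # [(0, 15, 150), (8, 23, 60), (18, 33, 30)]
```

In this toy case only three peptides are returned, because every remaining window overlaps a selected one by at least 8 residues.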
The choice of 15 mers is a compromise between two empirical observations we have made along the development of this method: 1) shorter peptides tend to give a poor signal. Our hypotheses are that they are too flexible (short peptides present fewer long-range interactions and are thus more flexible), which decreases their binding to the Ab. Moreover, the secondary structure is important for binding, and very short peptides have no chance to adopt hairpin or strand conformations. 2) Longer peptides tend to span over more than one loop, and interpretation of experimental results is then more difficult. A second aspect is that longer peptides have a higher tendency to precipitate.
Evaluation criteria
The first output of MAbTope is a ranked list of docking poses. To evaluate the distance between these poses and the native solution, critical assessment of predicted interactions (CAPRI) criteria were used (19): high quality (***): fnat > 0.5 and (Irmsd < 1 or Lrmsd < 1); medium quality (**): [fnat ∈ [0.3, 0.5] and (Irmsd < 2 or Lrmsd < 5)] or [fnat > 0.5 and Irmsd > 1 and Lrmsd > 1]; and acceptable (*): [fnat > 0.3 and Irmsd > 2 and Lrmsd > 5] or [fnat ∈ [0.1, 0.3] and (Irmsd < 4 or Lrmsd < 10)]. fnat is the fraction of correctly predicted native contacts, Lrmsd (ligand RMSD) is the RMSD between the predicted position of the ligand and its position in the crystal structure, and Irmsd (interface RMSD) is the same but reduced to the interface residues.
Because our epitope predictions are based on the evaluation of a set of conformations, and not on a single conformation, it was necessary for us to also evaluate the number of “indicative” conformations (+) that satisfy [fnat > 0.1], [Lrmsd < 10], or [Irmsd < 5].
Introducing this new category is very useful for evaluating docking performance in the perspective of epitope determination. Indeed, the docking poses falling in this category, even though their geometry is too distant from the crystal structure to be considered as acceptable by the CAPRI criteria, still define an interaction area on the target that overlaps with the actual epitope and thus, give valuable information on the epitope.
To evaluate the docking performances of our algorithms, for each complex in the test set, we calculate the rank of the first near-native pose with the CAPRI criteria and with our own criteria (CAPRI plus indicative).
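The quality categories and the rank of the first near-native pose can be expressed compactly in code. The sketch below transcribes the thresholds exactly as they are written above and tests the categories from the most to the least stringent; the order of the tests, the handling of values falling exactly on a threshold, and the function names are assumptions of this example.

```python
def classify_pose(fnat, irmsd, lrmsd):
    """Return 'high', 'medium', 'acceptable', 'indicative', or 'incorrect' (RMSDs in Å)."""
    if fnat > 0.5 and (irmsd < 1 or lrmsd < 1):
        return "high"
    if (0.3 <= fnat <= 0.5 and (irmsd < 2 or lrmsd < 5)) or (
        fnat > 0.5 and irmsd > 1 and lrmsd > 1
    ):
        return "medium"
    if (fnat > 0.3 and irmsd > 2 and lrmsd > 5) or (
        0.1 <= fnat <= 0.3 and (irmsd < 4 or lrmsd < 10)
    ):
        return "acceptable"
    # Additional "indicative" category: any one of the three conditions is sufficient.
    if fnat > 0.1 or lrmsd < 10 or irmsd < 5:
        return "indicative"
    return "incorrect"


def first_near_native_rank(ranked_classes, accepted):
    """Rank (1-based) of the first pose whose class is in `accepted`, or None if there is none."""
    for rank, quality in enumerate(ranked_classes, start=1):
        if quality in accepted:
            return rank
    return None


CAPRI = {"high", "medium", "acceptable"}
CAPRI_PLUS_INDICATIVE = CAPRI | {"indicative"}

# Toy ranked list of poses (hypothetical values of fnat, Irmsd, Lrmsd).
poses = [(0.02, 8.0, 20.0), (0.05, 4.5, 15.0), (0.4, 1.5, 6.0)]
classes = [classify_pose(*p) for p in poses]
print(classes)                                                 # ['incorrect', 'indicative', 'medium']
print(first_near_native_rank(classes, CAPRI))                  # 3
print(first_near_native_rank(classes, CAPRI_PLUS_INDICATIVE))  # 2
```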
The second output of MAbTope is a list of peptides ranked on the predicted probability they match with the epitope. To evaluate the epitope prediction accuracy, we calculate the number of residues in each peptide that belong to the actual epitope (and do not belong to better-ranked peptides), normalized by the total number of residues in the epitope.
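The peptide-level accuracy measure can be illustrated with a short sketch. It assumes that peptides are given in rank order as half-open (start, end) index ranges on the target sequence and that the actual epitope is a set of residue indices; the function name and the toy data are hypothetical.

```python
def peptide_coverage(peptides, epitope):
    """Fraction of the epitope newly covered by each peptide, in rank order."""
    seen = set()  # epitope residues already credited to a better-ranked peptide
    fractions = []
    for start, end in peptides:
        hits = {r for r in range(start, end) if r in epitope} - seen
        fractions.append(len(hits) / len(epitope))
        seen |= hits
    return fractions


# Toy data (hypothetical): a six-residue epitope and three ranked 15-aa peptides.
epitope = {3, 4, 5, 20, 21, 40}
peptides = [(0, 15), (10, 25), (12, 27)]

fractions = peptide_coverage(peptides, epitope)
print([round(f, 2) for f in fractions])  # [0.5, 0.33, 0.0]
print(round(sum(fractions), 2))          # 0.83
```

The third peptide scores zero here because the epitope residues it contains were already counted for the second, better-ranked peptide.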
Binding kinetics of certolizumab to biotinylated peptides using biolayer interferometry
All measurements were performed with the Octet RED96 system (Pall ForteBio, Fremont, CA) in the manufacturer’s kinetics buffer at 30°C with shaking at 1000 rpm. Biotinylated peptides were immobilized for 200 s on streptavidin-coated sensors at 0.5, 1, and 5 μg/ml for P1-3, P1-1, and P1-2, respectively, and left for equilibration for 120 s in kinetics buffer. Typical capture variability within a row of eight tips did not exceed 0.1 nm. Binding was assessed at 100, 200, 400, 600, 800, 1000, and 1200 μg/ml certolizumab for 300 s. Two parallel corrections were carried out by subtracting the association of certolizumab on an immobilized, nonrelevant biotinylated peptide and by subtracting the loading baseline drift on nonassociated sensors. Data were analyzed using Octet Software version 9.0. Because certolizumab is a Fab’, experimental data were fitted with the binding equation describing a 1:1 interaction. Considering the weak affinity of peptides for the Ab and the fact that the dissociation is almost immediate, we restricted the dissociation analysis to the first 20 s. Global analyses of the datasets assuming that binding was reversible (full dissociation) were carried out using nonlinear least squares fitting, allowing a single set of binding parameters to be obtained simultaneously for all concentrations used in each experiment.
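For reference, the standard 1:1 (Langmuir) binding model commonly used for such fits is recalled below. The exact parameterization used by the analysis software is not given here, so the symbols are assumptions of this summary: R is the sensor response, C the certolizumab concentration, t_a the end of the association phase, and k_on and k_off the association and dissociation rate constants, which a global fit shares across all concentrations.

```latex
% Association phase at analyte concentration C (0 \le t \le t_a):
R(t) = R_{\mathrm{eq}}\left(1 - e^{-(k_{\mathrm{on}} C + k_{\mathrm{off}})\,t}\right),
\qquad
R_{\mathrm{eq}} = R_{\max}\,\frac{C}{C + K_D}

% Dissociation phase (t > t_a), assuming full dissociation:
R(t) = R(t_a)\, e^{-k_{\mathrm{off}}\,(t - t_a)}

% Equilibrium dissociation constant:
K_D = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}
```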
Homogeneous time-resolved fluorescence–based competition assay
The competition between golimumab and either certolizumab or the peptides for TNF-α was assessed in vitro using a homogeneous time-resolved fluorescence (HTRF)–based assay in 384-well plates. Golimumab and certolizumab were kindly provided by D. Mulleman (Hôpital Bretonneau, Centre Hospitalier Régional Universitaire de Tours, Tours, France). Golimumab was incubated at 0.1, 0.33, and 1 nM with 8 ng of TNF-α (NP_000585.2, Val77-Leu233) N-terminally fused to the AviTag (Avidity, Aurora, CO), purchased from ACROBiosystems (Newark, DE), in 10 μl of PPI–Terbium detection buffer (Cisbio Bioassays, Codolet, France). Five microliters containing either 4 mM of nonbiotinylated peptides (GeneCust, Dudelange, Luxembourg) or 4 μM of certolizumab were added. The HTRF-compatible fluorophores Terbium cryptate and d2, conjugated to either an anti-Fc Ab or streptavidin (Cisbio Bioassays), were finally added in 5 μl. After 1-h incubation at room temperature, the fluorescence emissions at 620 and 665 nm were measured on the TriStar2 LB 942 Microplate Reader (Berthold Technologies, Wildbad, Germany). Data were expressed as the 665 nm/620 nm emission ratio minus the nonspecific signal obtained without Ab or peptide.
Interaction measurement by peptide array
Peptide array.
The interaction between the different biotinylated peptides (GeneCust) and golimumab was assessed in vitro using a peptide array. Biotinylated peptides are first diluted in printing buffer (20% glycerol and 1 M DMSO) to a final concentration of 0.8 and 1.6 mM. Peptides are spotted in two replicates in 16 identical subarrays on a nitrocellulose-coated glass slide (ONCYTE Film slides; Grace Bio-Labs) using a Nano-Plotter (GeSiM, Germany). Slides are dried overnight at room temperature.
Preparation of Abs.
Golimumab is fluorescently labeled with iFluor 680 amine dye (AAT Bioquest) following the protocol of the provider. Excess dye is eliminated by centrifugation on an Amicon Ultra filter (Merck Millipore, Darmstadt, Germany). Abs are prepared fresh for the incubation by diluting into PBS-T (PBS 1 X, 0.1% Tween 20) supplemented with 1% of BSA (Sigma-Aldrich) to a final concentration of 2 ng/ml.
Incubation.
Slides are mounted with ProPlate chamber (Grace Bio-Labs) for the following steps. Slides are hydrated with 150 μl/well PBS-T solution for 15 min under agitation on a seesaw rocker. PBS-T is removed, and 100 μl of Super G blocking buffer (Grace Bio-Labs) is added for 1-h incubation on a seesaw rocker. After removing the blocking buffer, 100 μl/well Abs diluted in PBS-T supplemented with 1% BSA (corresponding to 200 ng) are added for 2-h incubation on a seesaw rocker. Then, Abs are removed, and slides are washed two times with PBS-T for 5 min and once with PBS (150 μl/well). Finally, slides are rinsed with filtered water for 1 min and air dried.
Detection and analysis.
Slides are scanned with an InnoScan 710-IR scanner (Innopsys, Carbonne, France) at 670-nm wavelength, 3-μm resolution, PMT of 1, and low laser intensity. Image analysis is performed using the circular feature alignment of Mapix software (Innopsys). The relative fluorescence unit is obtained by subtracting the median fluorescence signal intensity surrounding each feature from the median fluorescence signal of the feature. The relative fluorescence unit is used to measure the interaction between the different peptides and the Ab. Graphs are generated using GraphPad Software (GraphPad Prism 5 Software, San Diego, CA).
In vitro fluorescence resonance energy transfer binding measurement
The interaction between the different biotinylated peptides (GeneCust) and certolizumab, golimumab, or eculizumab (used as a negative control) was assessed by HTRF. All experiments were performed in PPI–Terbium or –Europium detection buffers (Cisbio Bioassays). For this, 5 μl of biotinylated peptides (4 mM) were first incubated with 5 μl of one of the mAbs (1.6 μg/ml) for 1 h at room temperature. Then, 5 μl of streptavidin and 5 μl of anti-Fab (for certolizumab) or anti-Fc (for golimumab and eculizumab) Abs conjugated with HTRF-compatible fluorophores, Terbium or Europium cryptate and d2, were added in quantities recommended by the manufacturer. After an overnight incubation at 4°C, the fluorescence emissions at 620 and 665 nm were measured using the appropriate HTRF program on a TriStar2 LB 942 Modular Microplate Reader (Berthold Technologies). Data are represented as specific fluorescence resonance energy transfer signals, calculated as the 665 nm/620 nm emission ratio minus the signal obtained with the nonrelevant Ab.
Golimumab binding on mutant TNF-α by flow cytometry
Three TNF-α mutants were designed starting from the sequence NP_000585.2 by incorporating the mutations predicted to alter the interaction with golimumab according to our docking solution. The mutant TNF-α constructions contain the following mutations: TNF-α_P1-1m6 (N222A, R223A, D225A, F229A, E231A, and Q234A), TNF-α_P3-1m7 (R167A, Y172A, Q173A, T174A, K175A, and N177A), and TNF-α_P4-1m6 (Q106A, E108A, Q110A, Q112A, and R116A). The cDNAs of the three mutants and of the wild-type TNF-α, fused to a Flag tag at their N terminus and lacking the first 77 residues (which contain the transmembrane part of the protein targeted by proteases), were synthesized and subcloned in pcDNA3.1 by GenScript (Piscataway, NJ). HEK293N cells were transiently transfected with the TNF-α constructions or a mock vector using Metafectene (Biontex Laboratories, München, Germany) according to the manufacturer’s instructions. Thirty hours after transfection, the cells were fixed and permeabilized according to the BD Cytofix/Cytoperm Kit instructions (BD Biosciences, San Jose, CA). All the following hybridizations were performed in the kit’s perm/wash buffer. Five hundred thousand cells of each transfected population were incubated with 5 μg of golimumab for 1 h at room temperature and washed once in 2 ml of buffer. The binding of golimumab was assessed with the allophycocyanin-labeled anti-IgG1 Ab from Miltenyi Biotec (Bergisch Gladbach, Germany) diluted to 1:100. The expression level of each of the constructions was evaluated with an anti-Flag Ab coupled to PE, also from Miltenyi Biotec. After staining, all the cells were washed once in 2 ml of working buffer and once in 2 ml PBS/2 mM EDTA and finally suspended in 200 μl PBS/2 mM EDTA. The fluorescence was assessed with the MACSQuant Analyzer 10 (Miltenyi Biotec), and the data were analyzed with FlowJo software (FlowJo, Ashland, OR).
Statistical analysis
Experimental data were analyzed using Prism 6 software (GraphPad Software, La Jolla, CA). Data were expressed as mean ± SEM, and ANOVA statistical analysis was applied.
Results
Principle and benchmarking
MAbTope involves three successive steps. The first step is the docking of the Ab on its target, which results in the generation of docking poses (possible conformations of the Ab/Ag complex) through a method related to PRIOR, a general protein/protein docking method we had previously developed (17, 20–22). The second step is the ranking of these docking poses to extract 30 poses that tile the epitope, and the design of four so-called interacting peptides, that is, peptides predicted to be part of the epitope. The third step is the experimental validation based on the interacting peptides. Different methods can be used: measurement of the binding of each of these four peptides with the Ab, competition for Ab binding between the peptides and the target, or measurement of the binding of target mutated on residues belonging to these peptides.
The design of the interacting peptides from the docking poses is crucial for the success of the method. At this step, all the possible 15-aa–long peptides of the target are ranked according to the frequency at which their amino acids are found within the epitope in the 30 top-ranked docking poses. MAbTope predicts a correct peptide, that is, a peptide that contains residues belonging to the crystallographic interface, within the four best-ranked ones for all of the 129 complexes tested. On average, the four best-ranked peptides contain more than 80% of the epitope residues, and the minimum is 30%, meaning that the epitope is at least partly found for all complexes in the test set (Fig. 1A, Supplemental Table I). As a control, each Ab of the test set was docked to all the targets of the other Abs. In this test, on average, only 36% of the residues belonging to the epitope of the specific Ab are found within the four best-ranked peptides.
Principle and performance of the method. (A) From the 3D structures of the Ab and the target, Hex generates docking poses and ranks them according to energetic criteria (H-rank). Each of the Hex top 500 docking poses is evaluated using both nonspecific and specific scoring functions. A consensus score is computed, and the poses are ranked according to this score. In parallel, for each docking pose, the A-score, C-score, and P-score (see hereafter) are computed, and the poses are ranked. The final ranking of poses is a consensus of the five different rankings (Hex, consensus, A-ranking, C-ranking, and P-ranking). The top 30 solutions are used to compute, for each residue of the target, the frequency at which it appears within the epitope in these top 30 poses and to design the epitope peptides. (B) Ratio of residues of the epitope within the designed peptides identified using MAbTope on the complete test set (blue), ClusPro (orange), FRODOCK (green), EpiPred (violet), and PPiPP (purple). The values obtained when docking the false positives are shown in red. (C) Ratio of epitope residues within the designed peptides identified using MAbTope on the complete test set (blue), on unique targets (green), on small targets (orange), and on large targets (purple). The values obtained when docking the false positives on unique targets are shown in red.
MAbTope performs much better than ClusPro or FRODOCK at predicting the epitopes: these two methods identify, within the four best-ranked peptides, 36 and 35% of the epitope residues, respectively. One reason is that, in MAbTope, the 30 top-ranked docking poses are centered on the correct epitope and are not distributed on the whole surface of the target. This is illustrated in Fig. 2 (see also Supplemental Figs. 1, 2) by the example of the complex between the HIV gp120 glycoprotein and the VRC03 Ab (Brookhaven Protein Data Bank 3SE8) (23). This particularity, which can be found for all of the tested examples, arises for two main reasons. First, for conformation generation, we use Hex with very restrictive angle parameters; the obtained poses are consequently already well focused. Second, the A-score (as defined in Materials and Methods) favors overrepresented poses and consequently decreases the diversity of the top-ranked poses. As a result, the amino acids constituting the epitope are almost all found in more than half of the 30 selected docking poses. Consequently, the four best-ranking peptides all contain amino acids belonging to the interface (Supplemental Fig. 3). In addition, peptides 1 and 2 contain 7 and 6 aa belonging to the epitope, respectively. It should be noted that peptides 3 and 4 also contain 8 and 6 residues, respectively, and can also be considered as good predictions. Finally, the six best-ranking peptides contain all the amino acids belonging to the epitope.
Epitope peptides designed for 25 Abs targeting HIV gp120. (A) Top four epitope peptides designed for the 25 Abs of the benchmark targeting gp120. The designed peptides are all mapped on the sequence of 4ZMJ (chains B and G of GP120), although some of the Abs target gp120 proteins of different clades. However, the epitopes recognized by the different Abs have a homologous region in 4ZMJ. Each colored region represents one designed peptide; a black star indicates that the peptide belongs to the epitope. Red stars outside of these colored regions indicate residues of the epitope that do not belong to a designed peptide. (B) 3D structures of the complexes between the 25 Abs (cartoon) and gp120 (surface). All the structures have been superimposed on 4ZMJ; the color code is given in (A).
We also compared the performance of MAbTope with that of two nondocking-based epitope prediction methods: PPiPP (24) and EpiPred (25). The results show that MAbTope clearly outperforms these two methods, confirming that a detailed consideration of shape and electrostatic complementarity, which results from the docking procedure, is necessary for high-quality predictions (Fig. 1B, Supplemental Table I).
The last step of the method consists in the experimental validation. Our first approach consists in measuring the binding of the Ab to the peptides. For each designed peptide, three peptides are synthesized, all of the same length but sliding 3 aa along the sequence. The first one starts and ends 3 aa upstream of the designed epitope peptide, the second one corresponds to the designed one, and the third one starts and ends 3 aa downstream. This choice was made to overcome the issue of some peptides being insoluble. A second approach is to measure the competition between these peptides and the target for the binding of the Ab. Finally, as the residues present within these peptides are those predicted to belong to the epitope, they can be used to predict point mutations of the target reducing the binding of the Ab.
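The three-peptide scheme described above (one peptide shifted 3 aa upstream, the designed peptide, and one shifted 3 aa downstream) can be sketched as follows. This is an illustrative helper, not code from the study; the function name, the 0-based `start` index, and the choice to return `None` when a shifted window would run past the sequence ends are assumptions.

```python
def sliding_peptides(sequence, start, length=15, shift=3):
    """Return the upstream, designed, and downstream peptides of equal length.

    `start` is the 0-based index of the designed peptide in `sequence`.
    A window that would extend beyond the sequence is reported as None.
    """
    peptides = {}
    for label, offset in (("upstream", -shift), ("designed", 0), ("downstream", shift)):
        s = start + offset
        if s < 0 or s + length > len(sequence):
            peptides[label] = None  # shifted window does not fit in the sequence
        else:
            peptides[label] = sequence[s:s + length]
    return peptides
```

Synthesizing all three variants means that if one of them turns out to be insoluble, the neighboring windows still cover most of the same residues.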
It should be highlighted that MAbTope is able to find the epitope of each Ab, and not only the most antigenic sites on the target protein as defined by B cell epitope prediction methods. This is well illustrated by the example of gp120, to which 25 Abs of the benchmark bind. Whereas some regions of gp120 are targeted by a large number of Abs, including some that do not belong to the benchmark because the structure of the isolated Ab is not known, other regions are also targeted. Accordingly, the interaction peptides designed through MAbTope are spread on the whole target sequence (Fig. 2). MAbTope correctly builds at least one correct peptide for each of these 25 Abs, and two peptides for 19.
Validation on golimumab and certolizumab
To validate the method, we next predicted the epitopes of two therapeutic Abs targeting TNF-α: golimumab and certolizumab. These two Abs are already widely used in the clinic, but their respective epitopes are still unknown. We built homology models of the two Abs and used MAbTope to predict the epitopes they bind. On the basis of the predicted epitope–Ab interface, four different sets of peptides were selected and synthesized (G1–G4 for golimumab and C1–C4 for certolizumab, Fig. 3). The P1 family overlaps with G3 and C4 and corresponds to the region containing the highest overlap between both predictions. The P4 family overlaps with C2 and G4. The P3 family overlaps with G1 and C1. Finally, the P2 family does not overlap with any of the four top-predicted peptides, but lies in a well-exposed region that MAbTope predicts to belong to the certolizumab epitope, but not to that of golimumab. Peptides G2 and C3 were ignored because they are partly buried and consequently have little chance of interacting efficiently with the Ab.
Residues present at the interface of docking poses of the golimumab/TNF and certolizumab/TNF complexes. (A) Golimumab: dark blue, residues present at the interface of more than 20 poses; medium blue, 10–20 poses; light blue, 1–10 poses. Certolizumab: dark, medium, and light violet. Validation peptides used in experiments are shown below the sequences. First best-ranking peptides predicted are boxed for each sequence. Red stars indicate the residues of certolizumab epitope in the crystal structure of the complex. (B and C) Selected docking poses for the assembly of TNF-α with golimumab (B) and certolizumab (C). The three TNF-α monomers are shown in different shades of gray (light, medium, and dark gray). The peptides selected for experimental validation are shown (P1 in red, P2 in orange, P3 in dark green, and P4 in light green).
After the initial submission of this paper, the structure of the complex between certolizumab and TNF-α was published (26). Comparison with our prediction shows that out of the 20 residues constituting the epitope, 17 belong to peptides C1–C4 (Fig. 3A). This shows that the certolizumab epitope can be considered as conformational because it involves residues belonging to five different peptides. Nevertheless, we are still able to show the specific binding of some of these peptides to the Ab through HTRF and interferometry (Supplemental Fig. 2).
To validate the epitope of golimumab, we first showed, to our knowledge for the first time, that it competes with certolizumab for binding to TNF-α, using HTRF (Fig. 4A, Supplemental Fig. 2). We therefore performed further experimental validations on golimumab only. We also showed, using both HTRF and reverse-phase peptide array, that golimumab specifically binds the P3-1, P3-2, and P3-3 peptides (Fig. 4B, 4C). Finally, we showed, using HTRF, that peptides P1-1, P1-2, P1-3, P3-1, and P3-3 decrease the binding of golimumab to TNF-α in a dose-dependent manner. Note that we observe strong competition with the P1 series peptides in this last experiment, whereas we could not observe the binding of these peptides in the direct binding experiments. One hypothesis is that the biotin, which is attached at the N terminus of the peptide in the direct binding experiments, could prevent the binding to the Ab. The specificity of the binding of the P1 series peptides is confirmed by the flow cytometry experiments presented hereafter.
Validation of the predicted epitope of golimumab. (A) Certolizumab-induced displacement of golimumab from AviTag TNF-α, indicating that the two Abs bind the same epitope. The initial binding of golimumab on AviTag TNF-α was measured at 0.1, 0.33, and 1 nM (brown, orange, and red full symbols, respectively) using the HTRF mixture anti–IgG-Tb/streptavidin-d2. Increasing doses of certolizumab were added from 10⁻¹⁵ to 10⁻⁵ M (empty symbols). The IC50 values of the displacements are indicated above the graph. (B–D) Peptide-based validation assays of the golimumab epitope. (B) The 11 peptides predicted to belong to the epitope at 1 mM, biotinylated at their N terminus, were incubated with 8 ng of golimumab or a nonrelevant mAb and the HTRF mixture anti–IgG-Tb/streptavidin-d2. The HTRF signals obtained with golimumab were corrected by the nonspecific binding on the irrelevant mAb considered as a baseline. (C) The 11 predicted peptides and one control peptide were spotted on a nitrocellulose-coated glass slide. After a blocking step, the slides were incubated with 200 ng of fluorescently labeled golimumab. The relative fluorescence unit (RFU) value was calculated and used to compare the interaction between the peptides and golimumab. (D) Displacement of golimumab from AviTag TNF-α by the peptides. Selected nonbiotinylated peptides at 1, 0.1, or 0.01 mM were incubated with golimumab and biotinylated AviTag TNF-α. The complex formed by golimumab and TNF-α was detected by the HTRF mixture anti–IgG-Tb/streptavidin-d2. As a displacement control, certolizumab was incubated with the golimumab/TNF-α complex. The signal obtained with the control Ab (eculizumab) was subtracted from the HTRF signals. *p < 0.05, ***p < 0.001. ns, not significant.
To further validate the epitope, we mutated to alanine the TNF-α residues belonging to peptides of series 1, 3, and 4 and measured the binding of golimumab using flow cytometry (Fig. 5A, Supplemental Fig. 2). By detecting the Flag epitope that was added to all constructs, we observed that each TNF-α construct was well expressed in cells. Interestingly, we found that the binding of golimumab to its target was almost abolished when TNF-α was mutated at the positions indicated within the P1 and P3 series and was reduced by 50% for mutations within the P4 series peptides. Finally, for peptides P3-1 and P3-3, which gave the best signals in HTRF, we individually mutated the residues of these peptides whose side chains are exposed and measured the binding to golimumab using HTRF (Fig. 5C, 5D). These results show that, as predicted, residues Y172, T174, and K175 are essential for golimumab binding to TNF-α.
Predicted mutations abolish the binding of golimumab on complete TNF-α and peptides. (A) HEK293 cells were transfected with either a mock vector, the wild-type TNF-α (Val 77–Leu 233), or with three mutated TNF-α constructions. The mutations (letters in red) were selected among the amino acids whose side chain is solvent exposed within peptides P1-1, P3-1, and P4-1. Cells were fixed, permeabilized, and incubated with golimumab (0.33 μM). They were then stained by detection of the Flag epitope fused to the different TNF-α constructs (PE-conjugated anti-Flag Ab) (upper panels) or of golimumab using an anti-IgG coupled to allophycocyanin as the secondary Ab (lower panels). The plots show the side-scatter versus light intensity for both channels. (B–D) Validation of the residues predicted to be implicated in the interaction between golimumab and TNF-α. Mutated variants of the P3-1 (B) and P3-3 (C) peptides were designed as indicated by red letters. Biotinylated peptides were incubated at 1 mM with 8 ng of either golimumab or a nonrelevant mAb and the HTRF mixture anti–IgG-Tb/streptavidin-d2. The HTRF signals obtained with golimumab were corrected by the nonspecific binding on the irrelevant mAb considered as a baseline. (D) Summary of the residues found critical in the interaction between golimumab and TNF-α. In red are indicated the residues obtained from the P3-1 mutants series and in blue the ones from the P3-3 mutants series. *p < 0.05, **p < 0.01, ***p < 0.001. ns, not significant.
Discussion
The results obtained on the 129 Ab-target complexes of the benchmark show that the in silico prediction is robust because, within the benchmark, the predicted peptides contain on average 80% of the epitope residues. This number is not much affected by the type of epitope: 79% for conformational epitopes (105 out of 129) and 89% for linear epitopes (14 out of 129). Neither is it much affected by the size of the target: 88% for targets up to 300 residues long and 70% for larger targets. The main limitation of the in silico step is that the 3D structure of the target is needed. We have already tested the approach using homology models of the target when the 3D structure is not available. Although good results could be obtained in the few tested cases, this requires further investigation.
Based on the designed peptides, we present three different experimental validations of the predicted epitope. Our first approach consists in measuring the direct binding of the designed peptides through HTRF, peptide array, or interferometry. Good results could be obtained for the golimumab peptides of series 3. However, no signal is observed for series 1 peptides, although we later demonstrate that these peptides belong to the epitope. The second approach consists in setting up a competition between the peptides and the target for the binding of the Ab. Using this method, we were able to validate the peptides of series 1 and confirm the peptides of series 3. Nevertheless, both approaches are limited by the fact that some peptides tend to be “sticky.” Another limitation of these approaches is the solubility of the peptides, which is not always sufficient.
Importantly, the interaction peptides can also be used to design point mutations in the target that potentially decrease the affinity of the Ab. In TNF-α, we mutated to alanine the residues belonging to peptide series 1, 3, and 4 whose side chains point toward the solvent. We show, using flow cytometry, that these mutations indeed abolish (series 1 and 3) or decrease (series 4) the binding of the Ab. However, this approach also has its limitations, notably the difficulty of expressing some targets or their mutated forms, especially if they are toxic for the cells. The endogenous expression of the native target could also raise some issues.
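As an illustration of this mutation design step, the sketch below replaces a chosen set of solvent-exposed positions with alanine and reports the substitutions. It is a hypothetical helper, not the procedure used in the study: the list of exposed positions must be supplied by the user (for example from a solvent-accessibility calculation), and the mutation labels use 1-based positions within the sequence passed in, which will not match the TNF-α numbering used in the text unless the appropriate offset is applied.

```python
def alanine_mutant(sequence, positions):
    """Mutate the given 0-based positions to alanine.

    Returns the mutated sequence and a list of labels such as "K2A"
    (wild-type residue, 1-based position in `sequence`, mutant residue).
    Positions that already hold an alanine are left unchanged and unlabeled.
    """
    residues = list(sequence)
    labels = []
    for pos in sorted(set(positions)):
        wild = residues[pos]
        if wild == "A":
            continue  # nothing to mutate at this position
        residues[pos] = "A"
        labels.append(f"{wild}{pos + 1}A")
    return "".join(residues), labels
```

Calling the function once with all exposed positions of a peptide gives a multi-residue mutant of the kind tested by flow cytometry, whereas calling it once per position gives the single-residue variants used to identify individual critical residues.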
Despite the known limitations of each experimental approach proposed, it is reasonable to assume that their combined use will convey more robustness to the overall validation process.
Further demonstration of MAbTope's ability to determine the epitope is given through the examples of certolizumab and golimumab. For these two Abs, although their 3D structures were not known at the beginning of this study, we were able to predict and experimentally validate the epitopes. A good example is given by peptide 1.3, which contains only one residue belonging to the epitope, but for which we were able to measure the specific binding with certolizumab (Supplemental Fig. 2). Using mutated peptides, we were also able to refine these results and show the importance of individual residues in the epitope.
Two other therapeutic Abs are used in the clinic for their ability to bind TNF-α: infliximab and adalimumab, and the 3D structures of the corresponding complexes with the target are known [4G3Y for infliximab (27) and 3WD5 for adalimumab (28)]. A recent meta-analysis has compared the efficacy of different TNF-α–blocking agents, including the four Abs cited above. It concludes that infliximab and golimumab are less efficient in the treatment of rheumatoid arthritis than adalimumab and certolizumab (29). By contrast, a meta-analysis performed in ulcerative colitis indicated that infliximab is better than adalimumab, and probably golimumab (30). Their affinities for TNF-α [4.5 × 10⁻¹⁰ M for infliximab (28), 7.05 × 10⁻¹¹ M for adalimumab (28), 1.8 × 10⁻¹¹ M for golimumab (31), and 1.32 × 10⁻¹⁰ M for certolizumab (US patent US20050042219 A1)] do not explain these differences. Hu et al. (28) hypothesized that the difference in efficacy between infliximab and adalimumab could be partly due to the fact that adalimumab binds in the groove between two monomers and consequently has a higher overlap with the TNF-α receptor binding interface and a better neutralizing activity than infliximab, which binds to a monomer. By contrast, the ability to target inflammatory cells expressing membrane TNF-α, which could be monomeric, and to induce apoptotic signals seem to be important determinants of the therapeutic activity of anti–TNF-α agents in inflammatory bowel diseases (32). These reasons could also account for the difference in efficacy between certolizumab and golimumab, as certolizumab binds in the groove (like adalimumab), whereas golimumab binds to the monomer (Supplemental Fig. 3), keeping in mind that certolizumab differs from the three others by its monovalency and the absence of an Fc region. However, the fact that the structures of the four anti–TNF-α therapeutic Abs are now known will help in understanding the subtle differences in their clinical activities.
In conclusion, MAbTope's initial prediction of the epitope is very robust. On a benchmark of 129 Ab/Ag complexes, MAbTope correctly defines the epitope in each case. In addition, MAbTope allows defining four 15-aa peptides, at least one of which belongs to the epitope, which in turn allows experimental validation. These peptides also allow the design of point mutations that can be used to validate and refine the predicted epitope. Although the information obtained through MAbTope does not allow defining the precise interactions taking place between the Ab and the target, it allows defining, with good precision, the region of the target involved in the interaction. This information is sufficient for understanding the mechanism of action of the Ab, a crucial step in the development of therapeutics, but also of diagnostic or biotechnological tools. Taken together, MAbTope is not just a prediction method, but constitutes an integrated workflow allowing identification of the epitope. With the example of two therapeutic Abs, certolizumab and golimumab, we show that it can be successfully applied to Abs whose 3D structure is unknown.
Acknowledgements
We thank Prof. Denis Mulleman for providing certolizumab and golimumab. We thank Dr. Olivier Lichtarge for reading and advice.
Footnotes
This work was supported by the French National Research Agency (ANR) under the program Investissements d’Avenir Grant Agreement (LabEx MabImprove: ANR-10-LABX-53), and by ANR (Contract ANR-2011–1619 01), ARTE2, MODUPHAC, MAbSilico, and ARD 2020 Biomédicament grants from Région Centre.
The online version of this article contains supplemental material.
References
Disclosures
T. Bourquard, A.M., V.P., E.R., P.C., and A.P. are shareholders of the company MAbSilico, which proposes as a service the application of MAbTope to clients’ Abs. The other authors have no financial conflicts of interest.