The killer cell Ig-like receptor (KIR) gene family encodes MHC class I-specific receptors, which regulate NK cell responses and are also expressed on subpopulations of T cells. KIR haplotypes vary in gene content, which, in combination with allelic polymorphism, extensively diversifies the KIR genotype both within and between human populations. Species comparison indicates that formation of new KIR genes and loss of old ones are frequent events, so that few genes are conserved even between closely related species. In this regard, the hominoids define a time frame that is particularly informative for understanding the processes of KIR evolution and its potential impact on killer cell biology. KIR cDNA were characterized from PBMC of three gorillas, and genomic DNA were characterized for six additional individuals. Eleven gorilla KIR genes were defined. With attainment of these data, a set of 75 KIR sequences representing five hominoid species was assembled, which also included rhesus monkey, cattle, and rodent KIR. Searching this data set for recombination events, and phylogenetic analysis using Bayesian methods, demonstrated that new KIR were usually the result of recombination between loci in which complete protein domains were shuffled. Further phylogenetic analysis of the KIR sequences after removal of confounding recombined segments showed that only two KIR genes, KIR2DL4 and KIR2DL5, have been preserved throughout hominoid evolution, and one of them, KIR2DL4, is also common to rhesus monkey and hominoids. Other KIR genes represent recombinant forms present in a minority of species, often only one, as exemplified by 8 of the 11 gorilla KIR genes.
Killer cell Ig-like receptors (KIR) are encoded by a compact gene-rich family, which forms part of the leukocyte receptor complex on human chromosome 19q13.4 (1, 2, 3, 4). KIR are expressed on the surfaces of NK cells, some γδ T cells (5, 6), and some memory αβ T cells, most of which express CD8 (7). Apparent from this cellular distribution is that KIR have the potential to function in both innate and adaptive immunity. The KIR family includes both inhibitory and activating receptors of various specificity for HLA class I ligands (8). The inhibitory class I-specific KIR protect autologous cells from NK cell attack and are indirectly responsible for alloreactive NK cell responses toward targets lacking a cognate HLA class I ligand. For patients receiving an HLA-mismatched hemopoietic-cell transplant for acute myelogenous leukemia, an alloreactive response directed toward the recipient’s cells and mediated by donor-derived NK cells can help eliminate residual leukemia and prevent graft-vs-host disease (9). Although the functions of the activating KIR remain poorly defined, KIR3DS1 is associated with slowed progress of HIV infection for patients having a cognate HLA-B ligand (10). KIR associations have also been reported with autoimmune conditions (11, 12, 13) and with outcome following HLA-matched hemopoietic cell transplantation (14). KIR-HLA class I interactions are also implicated in the contribution made to implantation and establishment of pregnancy by intrauterine NK cells during the first trimester (15).
Underlying the biological functions and clinical associations of the KIR gene family is a degree of genetic diversity, which may approach that of the MHC class I genes. Human KIR haplotypes differ in the number of genes they contain (7–15 genes) (2, 16, 17) and are further diversified through allelic polymorphism of individual genes (18). As a consequence, unrelated individuals almost always differ in KIR genotype (19). Because KIR haplotypes differ in the number of activating receptors they encode, KIR genotypes range from ones biased for inhibition to ones rich in activating receptors. Such range has the potential to modify NK cell responses in qualitative, as well as quantitative, ways. Because NK cells respond early during infection and influence its subsequent course, and the immune response to it, an appealing hypothesis is that KIR diversity is the consequence of past selection by pathogens upon the NK cell response.
Comparison of gene content for human KIR haplotypes has served to emphasize the contributions of two modes of homologous recombination to haplotype diversification (20). First, asymmetric recombination between homologous intergenic regions adds genes through duplication or subtracts them through deletion; second, reciprocal recombination in the unique sequence near the center of the KIR locus reassorts motifs from the centromeric and telomeric halves. Through these mechanisms, an existing set of genes is permuted to generate diversity. In contrast, over time frames separating closely related species, a majority of the KIR loci have been reconfigured and no longer seem orthologous. Favored animal models, e.g., mouse (21, 22) and rhesus monkey (23, 24), have little commonality with the human KIR system. Because much more is known of human KIR than of any other species, the comparison of human KIR with those of other hominoid species provides an excellent system for the identification of processes by which an existing set of KIR genes is used to make new KIR genes (25, 26, 27). Within the hominoids, there is much interspecies diversity but sufficient commonality to give confidence that the comparisons are valid. In this study, we describe characterization of gorilla KIR and an analysis of recombination and phylogenetic relationship for an extensive set of KIR sequences representing five species of hominoid, as well as rhesus monkey, cattle (28), and rodents.
Materials and Methods
Peripheral blood or spleen samples of healthy gorillas (Gorilla gorilla (Gg)) and common chimpanzees (Pan troglodytes (Pt)) were obtained from the Yerkes Regional Primate Center of Emory University School of Medicine (Atlanta, GA). PBMC were isolated by Ficoll-Hypaque gradient centrifugation. Aliquots of PBMC were transformed with EBV to establish B lymphoblastoid cell lines (29). The genomic DNA was isolated either from B lymphoblastoid cell lines or from spleen samples using QIAamp blood kit or DNeasy tissue kit (Qiagen, Chatsworth, CA).
Cloning of cDNA encoding gorilla KIR
Total RNA was extracted from PBMC and reverse transcribed using standard methods (30). KIR transcripts were PCR amplified from gorilla cDNA using two different forward primers (2IgFa, 5′-CATGGCGTGTGTTGGGTTCT-3′; or 3IgFa, 5′-GCACCGGCAGCACCATGT-3′) paired with reverse primer (NKRc, 5′-TGAGTTCCTCAGTGTGATTGCAGCCTC-3′). The following optimized temperature conditions were used for PCR: initial denaturation for 2 min at 94°C; then 29 cycles of 94°C for 30 s, 60°C for 30 s, and 72°C for 3 min; and a final extension at 72°C for 10 min. The PCR products of relevant size (∼1.3 and ∼1.6 kb) were purified using the QIAEX II gel extraction kit (Qiagen) and cloned into the pCR4-TOPO vector (Invitrogen, Carlsbad, CA). Partial sequences were determined on randomly picked clones using an oligonucleotide sequencing primer (NKRb, 5′-TTTGAGACAGGGCTGTTGTCTCCCTAG-3′) that anneals to a segment of 3′-untranslated region conserved in all human and chimpanzee KIR sequences. If the NKRb-primed sequencing reaction failed, then standard T3 and T7 primers were used. Ten groups of gorilla KIR sequences were distinguished by this preliminary screening. Multiple clones from three gorillas representing these distinct KIR groups were selected for full-length sequencing. In addition to T3 and T7 primers, a set of seven oligonucleotide primers (Table I) annealing with the segments conserved in human and chimpanzee KIR sequences was used to determine the full-length sequence in both directions. Sequencing reactions were performed using the BigDye terminator cycle sequencing kit (Applied Biosystems, Foster City, CA) and analyzed on a 377 automated DNA sequencer (Applied Biosystems).
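As a small illustration of how the amplification primers quoted above can be sanity-checked before ordering or reuse, the following sketch reports length, GC fraction, and reverse complement. The primer sequences are copied from the text; the helper names (`reverse_complement`, `gc_fraction`) are hypothetical and are not part of the original protocol.

```python
# Primer sequences copied from the Methods text; helper names are illustrative only.
PRIMERS = {
    "2IgFa": "CATGGCGTGTGTTGGGTTCT",
    "3IgFa": "GCACCGGCAGCACCATGT",
    "NKRc": "TGAGTTCCTCAGTGTGATTGCAGCCTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"reverse complement {reverse_complement(seq)}")
```

The translation table handles only A, C, G, and T, so a degenerate primer (one containing codes such as R) would need an extended table.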
|Oligo (a)||Sequence (5′-3′)||Annealing Region|
F and R in the oligo name stand for forward and reverse primers, respectively.
PCR typing of gorilla KIR genes
Genomic DNAs from six gorillas were PCR typed for gorilla KIR genes using PCR methods based upon human and chimpanzee KIR sequences (16, 19, 26, 27, 31, 32). A similar sequence-specific PCR typing system was also developed for the 10 gorilla KIR cDNA sequences. The oligonucleotide primers and the sizes of the PCR amplicons are listed in Table II. The PCR mixture and the temperature conditions were the same as used for typing human and chimpanzee KIR (16). The amplified products from all gorilla KIR typing reactions were sequenced directly, confirming all gorilla cDNA KIR sequences at the genomic level. Gorilla KIR were named according to convention with the prefix Gg (http://www.gene.ucl.ac.uk/nomenclature/genefamily/kir.html) (33, 34).
|KIR||Sense Primer (5′-3′)||Antisense Primer (5′-3′)||Amplicon Size (bp)|
Second set of primers that recognize all KIR2DL4-like sequences in human, chimpanzees, and gorilla.
Sequence determination of pseudoexon 3 from KIR2D genes
Using a group-specific forward primer and KIR2D gene-specific reverse primers (Table III), a ∼3-kb segment that covers exons 2–4 of KIR genes was PCR amplified from two gorillas (Simsim and Oko) and from three common chimpanzees (Alex, Elwood, and Agatha). The PCR was conducted using the Expand Long Template PCR system according to the manufacturer’s instructions (Boehringer Mannheim, Mannheim, Germany). The PCR conditions included 3 min of initial denaturation at 94°C; five cycles of 20 s at 94°C, 1 min at 62°C, and 3 min at 72°C; 25 cycles of 20 s at 94°C, 30 s at 64°C, and 3 min at 72°C; and final extension at 72°C for 10 min. PCR products were purified using the QIAEX II gel extraction kit (Qiagen) and cloned into the pCR4-TOPO vector (Invitrogen). Preliminary sequencing was performed using the standard T3 and T7 primers, which provided the exon 2 and exon 4 sequences. These exon 4 genomic sequences were compared with cDNA sequences to identify the relevant gorilla KIR genes, and multiple clones, obtained from more than one animal, were used to determine the pseudoexon 3 sequences using two additional primers (ps3F, 5′-TGGTCAGGACAARCCCTT-3′; and ps3R, 5′-GGGGTTGCTGGGTGCCGACC-3′).
|Target KIR||Antisense Primer (5′-3′)||Sense Primer (a)|
|Gg-KIR2DSa and Gg-KIR3DL3||gKIR5Ra: CAGAGGGTCACTGGGAGCC||gKIR2DF1|
The sequences of sense primers are as follows: N9P11F, 5′-CAGGGGGCCTGGCCACAT-3′; and gKIR2DF1, 5′-TCTTGCTGCAGGGGGCCTGG-3′.
GenBank accession numbers
Nucleotide sequences were submitted to GenBank and given the following accession numbers: Gg-KIR2DL4 (AY122865), Gg-KIR2DL5 (AY122866), Gg-KIR3DLa.v1 (AY122867), Gg-KIR3DLa.v2 (AY122868), Gg-KIR3DL7 (AY122869), Gg-KIR2DL6 (AY122870), Gg-KIR2DLb (AY122871), Gg-KIR2DLc (AY122872), Gg-KIR2DLd (AY122873), Gg-KIR2DLe (AY122874), Gg-KIR2DSa (AY122875), Gg-KIR3DL3-ex3 (AY122884), Gg-KIR3DL3-ex4 (AY122885), Pt-KIR2DL6-ex3 (AY122876), Pt-KIR2DS4-ex3 (AY122877), Gg-KIR2DL6-ex3 (AY122878), Gg-KIR2DLb-ex3 (AY122881), Gg-KIR2DLc-ex3 (AY122879), Gg-KIR2DLd-ex3 (AY122880), Gg-KIR2DLe-ex3 (AY122882), and Gg-KIR2DSa-ex3 (AY122883).
A preliminary sequence comparison was made of the 94 KIR coding-region sequences available from the mammalian species studied so far. Only a single bovine KIR was included in the analysis, because all of the known bovine KIR sequences form an external group to the primate KIR sequences (35). The pseudoexon 3 sequences were added where available. For groups of sequences having <1% divergence and likely to be allelic variants, a single sequence was chosen to represent the group, and the others were removed from further analysis. The remaining 75 sequences were aligned using the program ClustalX (36) and then corrected manually. The corrected alignment begins at the common start codon and ends at the stop codon of the long-tailed KIR. To identify recombination events, we used a modified version of the recombination detection program (RDP) (37) that implements additional methods for detecting recombination (kindly provided by D. Martin (University of Cape Town, Cape Town, South Africa)). The bootscanning method (38) was used to check the result of the initial RDP analysis. The parameters used for the RDP analysis were as follows: no multiple comparison correction, a window of 50 bp, a highest acceptable p value of 0.001, and a reference sequence selection using internal and external references (results were also compared with analyses using internal references only).
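The reduction of near-identical sequences to single representatives can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes pre-aligned sequences of equal length, measures divergence as the uncorrected proportion of differing sites (gapped columns skipped), groups sequences by single linkage at the 1% threshold, and keeps the first-listed member of each group. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gapped columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def pick_representatives(aligned, threshold=0.01):
    """Single-linkage grouping of sequences diverging by < threshold; keep first of each group."""
    names = list(aligned)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if p_distance(aligned[a], aligned[b]) < threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    first_seen = {}
    for n in names:  # input order decides which member represents a group
        first_seen.setdefault(find(n), n)
    return list(first_seen.values())


def mutate(seq, positions):
    """Toy helper: change the base at each given position."""
    chars = list(seq)
    for i in positions:
        chars[i] = "T" if chars[i] != "T" else "A"
    return "".join(chars)


if __name__ == "__main__":
    base = "ACGT" * 50  # 200-nt invented sequence, not a real KIR
    toy = {
        "seqA": base,
        "seqA_variant": mutate(base, [10]),          # 0.5% divergent: treated as allelic
        "seqB": mutate(base, range(0, 200, 20)),     # 5% divergent: kept separately
    }
    print(pick_representatives(toy))
```

Single linkage is only one possible grouping rule; a stricter rule (every pair within a group below the threshold) could give different groups for borderline sequences.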
Phylogenetic analysis of individual domains
The gorilla Gg-KIR3DL3 Ig domain 0 (D0) and D1 sequences were added to the KIR sequence alignment used for the recombination analysis, which was then divided into four main parts: the first three parts each comprised an Ig domain, D0, D1, and D2; and the last part comprised the stem (S), transmembrane (TM), and cytoplasmic (CYT) domains together. Each of the four parts was analyzed separately. For each part, we selected the model of DNA substitution from a comparison of 56 models using the Akaike information criterion (39) as implemented in MODELTEST 3.06 (40), with an α level of significance of 0.01. Bayesian phylogenetic analyses were conducted with MRBAYES 2.01 (41). The Metropolis-coupled Markov chain Monte Carlo (MCMCMC) sampling approach was used to calculate the posterior probabilities, starting with random trees. Four chains were run simultaneously, and each was incrementally heated by applying a temperature of 0.5. Specific nucleotide substitution model parameter values were not defined a priori for analyses. Instead, model parameters were treated as unknown variables and estimated as part of the analysis. Preliminary analyses, in which plots were made of generation vs log likelihood, showed that every analysis converged before 200,000 generations. Subsequently, we used 600,000 generations for all analyses. Every twentieth tree was sampled from the MCMCMC analysis. The stationary generation was determined by plotting the generations vs log likelihoods; all trees below the stationary level were discarded. A majority rule consensus tree (retaining all compatible clades, even under 50% frequency of occurrence) was then determined using PAUP*4.0b10 (42). Five independent MCMCMC runs were performed for each domain, and the five resulting tree topologies were compared statistically using the Shimodaira-Hasegawa test of alternative phylogenetic hypotheses (43) with resampling estimated log-likelihood optimization and 10,000 bootstrap replicates. 
For all of the analyses presented, the test failed to reject any of the alternative tree topologies (with a significance level of 0.05). To avoid keeping the weakly supported nodes in the final trees, a new majority rule consensus tree was determined (retaining only the clades with 50% or more frequency of occurrence) for each of the five runs, and a majority rule consensus of the five topologies obtained was established. The average support over the five runs, as well as the SD, was calculated for each node of the tree. All of the trees were rooted at the midpoint. Branch lengths and the midpoint were determined with PAUP*, using a maximum likelihood (ML) model with the parameters determined by MODELTEST.
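The post-processing of the MCMCMC samples described above (discarding trees sampled before stationarity, keeping clades with 50% or more frequency in each run, taking a majority rule consensus across runs, and averaging node support with its SD) can be sketched as follows. This is an illustrative sketch under stated assumptions: each sampled tree is represented simply as a set of clades (frozensets of taxon names) rather than parsed from program output, the sample SD is used because the text does not say which SD was computed, and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def clade_support(sampled_trees, burn_in):
    """Frequency of each clade among the trees kept after discarding the first burn_in samples."""
    kept = sampled_trees[burn_in:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def summarize_runs(runs, burn_in, min_support=0.5):
    """Mean and sample SD of support for clades present in a majority of per-run consensus trees.

    Requires at least two runs, because a sample SD is undefined for one value.
    """
    per_run = [clade_support(run, burn_in) for run in runs]
    # Per-run majority rule: clades with >= min_support frequency.
    consensus_sets = [{c for c, s in sup.items() if s >= min_support} for sup in per_run]
    votes = Counter(c for clades in consensus_sets for c in clades)
    final = [c for c, v in votes.items() if v > len(runs) / 2]
    summary = {}
    for clade in final:
        values = [sup.get(clade, 0.0) for sup in per_run]
        summary[clade] = (mean(values), stdev(values))
    return summary


if __name__ == "__main__":
    # Sampling arithmetic from the text: 600,000 generations, every 20th tree sampled.
    print(600_000 // 20, "trees sampled per run")

    # Toy data: two runs of four sampled trees each, first sample discarded as burn-in.
    AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
    run1 = [{AC}, {AB, CD}, {AB, CD}, {AB}]
    run2 = [{AC}, {AB}, {AB, CD}, {AC}]
    for clade, (avg, sd) in summarize_runs([run1, run2], burn_in=1).items():
        print(sorted(clade), round(avg, 3), round(sd, 3))
```

In the toy data only the clade {A, B} reaches 50% support in both runs, so it is the only node reported.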
Full-length phylogenetic analysis of a trimmed data set
Recombination analysis identified those segments of the KIR sequences that had been involved in recombination. These segments, as well as the sequences for which the domain analysis was unclear, were removed from the sequence alignment. This trimmed full-length alignment was then subjected to Bayesian analysis, conducted exactly as described for the domain-by-domain analysis. In addition to the Bayesian analysis, we also performed a nonparametric bootstrap ML analysis using PAUP*. The analysis was performed with the parameters defined by MODELTEST, using the tree-bisection-reconnection algorithm. Because of computational limitations imposed by the ML, only 100 bootstrap replicates (44) were performed. The tree topologies obtained with the two methods were statistically compared using the Shimodaira-Hasegawa test of alternative phylogenetic hypotheses (43) with resampling estimated log-likelihood optimization and 10,000 bootstrap replicates. The test failed to reject one of the two tree topologies (with a significance level of 0.05).
Identification of 11 gorilla KIR
Gorilla KIR cDNA clones were obtained following RT-PCR amplifications using oligonucleotides based on sequences conserved in chimpanzee and human KIR (26, 27). Ten groups of KIR sequences were characterized from the mRNA extracted from PBMC of three gorillas. Each group consisted of 60–100 cDNA clones, of which 8–29 clones per group were sequenced completely (Fig. 1,A). The gorilla KIR groups differ from each other by 3.9–13%, with a mean difference of 6.7%, comparable to the differences distinguishing KIR groups in human, chimpanzees, orangutans, and rhesus monkeys (Fig. 1,B); the largest difference being observed with the common chimpanzee (similar range, 2.5–13.3%, but higher average difference, 8.4%). This difference is in part due to KIR3DL3, which accounts for 0.4–0.5 points of the diversity in human and common chimpanzee, not being completely characterized in the gorilla and not included in the comparison. A sequence-specific PCR typing system for gorilla KIR was used to analyze genomic DNA from six additional gorillas, revealing that each of the nine gorillas has a distinct KIR type (Fig. 1 A). The amplified products from all of these typing reactions were sequenced, confirming all the gorilla KIR cDNA sequences at the genomic level.
Our initial approach to Gg-KIR2DL4-specific typing gave positive reactions for only two of the gorillas, a surprising result given that all chimpanzees and almost all humans type positively for KIR2DL4 (16, 19, 26, 27, 45). Using a different set of primers, five of the six gorillas typed positively for KIR2DL4 (Fig. 1 A). The one negatively reacting gorilla (Shamba) may not have a KIR2DL4 gene, as has been described for rare human donors (45, 46). To determine whether this is also a rare phenomenon in gorillas would require analysis of a larger panel of individuals than studied here.
Gorilla genomic DNAs were also typed using the methods previously applied to human and chimpanzee panels (16, 19, 26, 27, 32). PCR products of relevant size were obtained only from the reactions specific for the KIR2DL5 and KIR3DL3 genes (data not shown). Because no gorilla KIR3DL3-like cDNA clones had been identified, we cloned and sequenced a genomic fragment comprising exon 2 through exon 4 of Gg-KIR3DL3.
Human genes encoding KIR2D with the D1+D2 configuration of Ig-like domains contain a nonexpressed sequence, called pseudoexon 3, which corresponds to the D0 domain-encoding exon 3 of KIR3D genes. All gorilla KIR genes of this type (Gg-KIR2DL6, Gg-KIR2DLb, Gg-KIR2DLc, Gg-KIR2DLd, Gg-KIR2DLe, and Gg-KIR2DSa), as well as Pt-KIR2DS4 and Pt-KIR2DL6 of the common chimpanzee, were similarly demonstrated to have a pseudoexon 3. A summary of the characteristics of the gorilla KIR, as well as a comparison with other species KIR, is presented in Fig. 1 C.
Gorilla KIR represent the major hominoid KIR lineages
On the basis of overall sequence similarity and phylogenetic analysis of the complete cDNA sequences, Gg-KIR2DL4 and Gg-KIR2DL5 appear orthologous to the human KIR2DL4 and KIR2DL5 genes, respectively. Discerning relationships for other gorilla KIR was confounded by the effects of recombination. To confront this problem, which had been evident in previous interspecies KIR comparisons, we analyzed the KIR sequences for possible recombinations. An initial analysis performed using an updated version of the RDP program (37) revealed the propensity for recombination within the KIR gene family, because almost all the sequences were found to be affected by recombination (Fig. 2). The analysis showed a second feature of the KIR data set: a strong correlation between the detected breakpoint locations and the domain junctions. This result indicates that recombination was almost exclusively confined to events shuffling complete or nearly complete exons encoding individual extracellular Ig-like domains or the set of smaller exons encoding the S, TM, and CYT regions (Fig. 2). Given this result, we were able to continue investigation of recombination events by comparison of the trees obtained by phylogenetic analysis on individual domains.
Phylogenetic analysis of individual domains was performed using a Bayesian phylogenetic approach, which allows large phylogenetic data sets to be analyzed under complex evolutionary models (47). The results obtained for the four different domains, D0, D1, D2, and S/TM/CYT, which included pseudoexon 3 sequences, are shown in Fig. 3.
The recombination events identified in the RDP analysis were verified by comparison of the phylogenetic trees. The trees were also analyzed for the presence of discordant topologies: sequences or groups of sequences whose position in the phylogenetic trees changed depending upon domain and was in each case supported statistically. This analysis allowed us to characterize a total of 25 recombination events: 8 involving groups of sequences and 17 involving individual sequences. The nature of recombination for each of the 75 KIR sequences analyzed is summarized on a KIR-by-KIR basis in Fig. 4. From these results, the KIR data set was trimmed to remove the recombinant regions from each sequence. Also removed were sequences that were too short or for which the domain-by-domain analysis gave ambiguous results. The leader sequences were likewise excluded, because they were too short to be analyzed separately and not available for some KIR. All of these modifications were designed to limit bias or inaccuracy in the subsequent phylogenetic analysis. The trimmed data set retained 54 of the 75 sequences, and represents 52% of the original data set (Fig. 5,A). Because the trimmed data set was selected conservatively, the actual proportion affected by recombination was less (23–37%) than the 48% removed (Fig. 5,A). When the trimmed data set was tested by analysis with the RDP program (with parameters identical with those used for the earlier analysis of complete KIR sequences), no recombination events were detected. The trimmed data set was then subjected to phylogenetic analyses, including parametric Bayesian analysis and the nonparametric ML analysis. The results are presented in the phylogenetic tree of Fig. 5 B.
This tree shows that primate KIR form a cluster separated from mouse and cattle KIR. Within the primates, the overall topology is like that previously obtained with full-length coding-region sequences (25), with four statistically well-supported groups: the KIR2DL4 group, the KIR2DL5 group, one large group including the lineage II and III KIR, and a last group consisting exclusively of rhesus monkey KIR. The large group including the lineage II and III KIR contains all of the hominoid KIR genes that either encode KIR3D or encode KIR2D but have a pseudoexon 3.
KIR2DL4 and KIR2DL5 are genes that encode KIR2D with D0+D2 configuration of Ig-like domains and lack a sequence equivalent to exon 4 that specifies the D1 domain of KIR3D. On the basis of similar exon-intron organization and sequence similarity, KIR2DL4 and KIR2DL5 have been grouped together as lineage I KIR. However, with an expanding KIR data set and more refined methods, it has become increasingly apparent that the statistical support for grouping KIR2DL4 with KIR2DL5 is not strong. This is evident in the tree of Fig. 5 B where the KIR2DL4 and KIR2DL5 groups belong to different parts of the tree. However, as can be seen from the modest support for the basal nodes, these positions are not reliable, and so neither keeping them together nor splitting them apart can be excluded. To reflect this situation, we have designated the KIR2DL4 group as lineage I-A and the KIR2DL5 group as lineage I-B.
In the phylogenetic tree of Fig. 5,B, gorilla KIR are represented in four lineages (I-A, I-B, II, and III) containing hominoid KIR, as are human and chimpanzee KIR (26, 27). In contrast, only three of these four lineages are described in the orangutan: I-A, II, and III (25), and only one in the rhesus monkey: the lineage I-A (Macaca mulatta (Mm)-KIR2DL4). A potential ortholog of the hominoid KIR2DL5 was described in the rhesus monkey (Mm-KIR2DL5) (23), but our analyses only confirm this relationship for the D0 domain (Fig. 3 A). For the other two domains, Mm-KIR2DL5 has different patterns: its D2 domain clusters with the D2 domains of Pt-KIRC1 and KIR3DL3, and its S/TM/CYT domain clusters with the S/TM/CYT domain of Mm-KIR1D. These relationships indicate that recombination events involving Mm-KIR2DL5 have occurred. Without additional information, it is impossible to establish whether Mm-KIR2DL5 was the donor or the receiver in these recombination events, and therefore care should be taken before designating it as the ortholog of the hominoid KIR2DL5. Of all the primate KIR so far identified, only KIR2DL4 can be inferred with confidence as having been present in the common ancestor of Old World monkeys and hominoids. In contrast, the lineages II and III are found in all the hominoids analyzed so far, and were probably present in the ancestor of the hominoids.
An additional lineage (lineage V) is represented in gorilla by a partial gene: Gg-KIR3DL3. This partial sequence is orthologous to the corresponding part of the human and common chimpanzee KIR3DL3 sequences, as indicated by the domain analysis (Fig. 3, A and B). This demonstrates that the Ig structure of KIR3DL3 predated the gorilla and human/chimpanzee divergence. The status of this Ig structure in the hominoid ancestor is unknown, because none of these domains were found in the orangutan. In the rhesus monkey, the conservation of this structure seems to be partial, because only the KIR3DL3 D2 domain appears to have a clear ortholog: the D2 domain of Mm-KIR2DL5 (Fig. 3 C).
The two lineage I gorilla KIR are conserved in hominoids
The results of the recombination and phylogenetic analysis were used to assess the relationships of each gorilla KIR to other KIR. Gg-KIR2DL4 and Gg-KIR2DL5 are orthologous to human and chimpanzee KIR2DL4 and 2DL5, respectively. For each domain tree, the sets of orthologs cluster similarly (Fig. 3), and in the phylogenetic analysis of Fig. 5, the monophyly of the group containing the human, chimpanzee, and gorilla sequences is strongly supported. Characteristic features of the cytoplasmic tail for several hominoid KIR2DL4 are substitutions that disrupt immunoreceptor tyrosine-based inhibitory motifs (ITIMs), and insertion/deletions that change the reading frame and cause premature termination. The KIR2DL4 cDNA obtained from each of three gorillas was in this category: a nucleotide substitution disrupts the membrane-proximal ITIM, and a downstream deletion of 2 nt eliminates the membrane-distal ITIM through change of reading frame and premature termination after 26 residues of novel sequence. The former of these changes, but not the latter, is shared with chimpanzee KIR2DL4. Although Gg-KIR2DL4 lacks inhibitory signaling motifs in its cytoplasmic tail, the transmembrane region conserves the arginine residue implicated in the transduction of activating signals by human KIR2DL4.
The lineage II gorilla KIR is a recombinant
A single gorilla lineage II KIR, Gg-KIR3DLa, was identified and shown to be present in all the individuals analyzed. Two variants (v1 and v2) differ by <1% in nucleotide sequence and are therefore probably alleles. Gorilla and common chimpanzee have a single gene encoding lineage II KIR, whereas human KIR haplotypes can have one or two such genes. Gg-KIR3DLa clusters with human KIR3DL2, although not with strong support, and chimpanzee KIR3DL1/2 clusters with human KIR3DL1, KIR3DS1, and KIR3DL1/2v (Fig. 5). Gg-KIR3DLa is a simple example of a recombinant product: its D0, D2, and S/TM/CYT are those of a lineage II KIR, but the D1 domain is more related to the D1 domains of lineage III KIR (Figs. 3 and 6). Two chimpanzee lineage II KIR, Pt-KIR3DL3 (a divergent allotype of Pt-KIR3DL1/2) and Pp-KIR3DLa, also have a similar chimeric structure, a lineage II KIR with a lineage III D1 domain (Fig. 4).
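The domain-by-domain reasoning used to recognize a recombinant such as Gg-KIR3DLa can be sketched in a few lines. The lineage labels below are transcribed by hand from the text (Gg-KIR3DLa: D0, D2, and S/TM/CYT of lineage II with a lineage III D1; Gg-KIR2DL4: a D0+D2 receptor of lineage I-A throughout); the function name is hypothetical, and the sketch omits the statistical support that the actual analysis required for each discordant position.

```python
from collections import Counter

# Lineage of each domain, transcribed from the text; not derived from sequence data here.
DOMAIN_LINEAGE = {
    "Gg-KIR3DLa": {"D0": "II", "D1": "III", "D2": "II", "S/TM/CYT": "II"},
    "Gg-KIR2DL4": {"D0": "I-A", "D2": "I-A", "S/TM/CYT": "I-A"},  # no D1 domain
}


def describe(domains):
    """Return (backbone lineage, {domain: lineage} for domains that differ from the backbone).

    The backbone is the most frequent lineage; ties resolve to the first one listed.
    """
    counts = Counter(domains.values())
    backbone, _ = counts.most_common(1)[0]
    foreign = {d: lin for d, lin in domains.items() if lin != backbone}
    return backbone, foreign


if __name__ == "__main__":
    for name, domains in DOMAIN_LINEAGE.items():
        backbone, foreign = describe(domains)
        status = f"chimeric, foreign domains: {foreign}" if foreign else "concordant"
        print(f"{name}: lineage {backbone} backbone, {status}")
```

Taking the most frequent lineage as the backbone is a simplification; it cannot say which parent was donor and which was recipient in the recombination event.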
The seven lineage III gorilla KIR are unique to gorilla
The gorilla’s lineage III KIR are more diverse and more complicated than the lineage II KIR, as is the case for the other hominoids. Of the seven lineage III gorilla KIR, only two complete sequences, Gg-KIR3DL7 and Gg-KIR2DLd, survived the selection for contributing to the final phylogenetic analysis; of the remainder, three did not contribute (Gg-KIR2DLb, Gg-KIR2DSa, and Gg-KIR2DLc), and two contributed about one-half of their sequence (Gg-KIR2DL6 and Gg-KIR2DLe). None of the lineage III gorilla KIR appears to be a full-length ortholog of any human or chimpanzee KIR.
Gg-KIR2DLd clusters with the orthologous KIR2DS4 and Pt-KIR2DS4 and might be an ortholog. Complicating the situation is a distinctive S/TM/CYT domain and the fact that the D0s of other gorilla KIR (Gg-KIR2DLc, Gg-KIR2DSa, and Gg-KIR2DLe) are closer to the KIR2DS4 D0 than is the Gg-KIR2DLd D0 (Fig. 3,A). Gg-KIR3DL7 is very similar to Pt-KIR3DL5 in D1 and D2, but it has a distinctive S/TM/CYT and a D0 that clusters with the pseudoexons 3 of Gg-KIR2DLb and the human pseudogene KIR2DP1 (Fig. 3 A).
Functional human KIR of lineage III are all KIR2D containing a pseudoexon 3, whereas chimpanzee lineage III KIR can be either KIR2D with pseudoexon 3 or KIR3D with expressed exon 3. The pseudoexons 3 are distinguished by a 3-bp deletion from the expressed exons (48). As expected, the six lineage III Gg-KIR2D have pseudoexons 3 containing the deletion. Unexpected was our finding that the expressed exon 3 of Gg-KIR3DL7 has a sequence similar to that of the pseudoexons 3, including the 3-bp deletion. Thus, the deletion is not sufficient to prevent the incorporation of exon 3 into mature mRNA. In the phylogenetic analysis, there is strong statistical support for the monophyly of all pseudoexons 3 and the expressed exon 3 of Gg-KIR3DL7. That the position of the Gg-KIR3DL7 sequence within this cluster is not basal supports a history in which exon 3 of Gg-KIR3DL7 was previously a pseudoexon that reverted to an expressed form (Fig. 3,A). Five differences are observed between the exon 3 sequence of Gg-KIR3DL7 and its closest sequence: the pseudoexon 3 sequence of Gg-KIR2DLb (Fig. 3 A). Therefore, one or more of these five mutations could be responsible for the re-expression.
Within the lineage III KIR, two different structures can be defined: one sublineage containing all pseudoexons 3 and exon 3 of Gg-KIR3DL7, and a second sublineage containing chimpanzee lineage III KIR3D and human pseudogene KIR3DP1 that have a D0 domain related to the lineage V KIR D0 domain (KIR3DL3, Pt-KIRC1, and Gg-KIR3DL3). For the latter group, the similarity in D0 does not extend to other exons, where chimpanzee lineage III KIR3D and human pseudogene KIR3DP1 cluster with other lineage III KIR, but lineage V KIR do not have the same relationship. The similarity in D0 is therefore consistent with a model in which chimpanzee KIR3D of lineage III and KIR3DP1 evolved from an ancestral recombinant in which exon 3 of a lineage III KIR was replaced by exon 3 of a Pt-KIRC1/KIR3DL3/Gg-KIR3DL3 ancestor. This event gave rise to a new three-Ig KIR structure, with the D1 and D2 domains of the lineage III KIR and the D0 domain of KIR3DL3/Pt-KIRC1/Gg-KIR3DL3 (see Fig. 8). In chimpanzee, the progeny of the ancestral recombinant are still expressed genes, whereas in human, there is only a pseudogene.
Of the six Gg-KIR2D, four had in-frame stop codons in their pseudoexon 3 (Fig. 7). However, the codon usage and location of these stop codons are different from those seen in human KIR2DL1, KIR2DL2, KIR2DS1, and KIR2DS2. Comparison of the intron 2-pseudoexon 3 boundaries revealed that Gg-KIR2DL6 has its splice site altered from the canonical AG/GT to GG/GT. This mutation is different from the CG/GT seen in the human KIR2DL3-NKAT2-like sequence and KIR2DS1. Like those of human KIR2DL3, KIR2DS3, KIR2DS4, and KIR2DS5, the pseudoexons 3 of Gg-KIR2DLb, Pt-KIR2DS4, and Pt-KIR2DL6 have no major structural abnormalities.
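The two checks described above, scanning a pseudoexon for in-frame stop codons and inspecting the intron-exon boundary, can be sketched as follows. This is an illustrative sketch and not the procedure used in the study. It assumes that the boundary notation (for example AG/GT) gives the last two bases of intron 2 followed by the first two bases of pseudoexon 3, and that the reading frame of the exon is supplied by the user. The function names are hypothetical, and the sequences are invented toy strings, not KIR sequences.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(exon: str, frame: int = 0) -> List[int]:
    """Return the 0-based codon indices of stop codons in the given frame."""
    seq = exon.upper()
    return [
        (i - frame) // 3
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


def boundary_motif(intron: str, exon: str) -> str:
    """Format the boundary as 'last two intron bases/first two exon bases'."""
    return f"{intron[-2:].upper()}/{exon[:2].upper()}"


def has_canonical_acceptor(intron: str) -> bool:
    """True when the intron ends with the canonical AG acceptor dinucleotide."""
    return intron[-2:].upper() == "AG"


# Invented toy sequences (assumptions for illustration only).
toy_exon = "GTCCATGGGTAAGCA"      # codons: GTC CAT GGG TAA GCA
toy_intron_ok = "ttctcttcag"      # ends with the canonical AG
toy_intron_mut = "ttctcttcgg"     # acceptor altered to GG

print(in_frame_stops(toy_exon))                  # stop codon positions, frame 0
print(boundary_motif(toy_intron_ok, toy_exon))   # boundary with canonical acceptor
print(boundary_motif(toy_intron_mut, toy_exon))  # boundary with altered acceptor
print(has_canonical_acceptor(toy_intron_mut))
```

Reporting stop codons by codon index makes it straightforward to compare their locations between sequences, as is done in the text for the gorilla and human pseudoexons 3.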
Overall, the key feature of the lineage III gorilla KIR is the lack of orthology to KIR in other hominoid species, although orthologs are identified for individual domains. Only four of seven gorilla KIR have an overall structure that can be related to particular human or chimpanzee KIR: Gg-KIR3DL7, Gg-KIR2DLe, Gg-KIR2DL6, and Gg-KIR2DLd (Fig. 6).
The first clue that the KIR gene family has evolved rapidly was the finding that human and mouse NK cells use entirely different molecular types as receptors for polymorphic determinants of classical MHC class I molecules: the Ig superfamily KIR and the lectin-like Ly49 receptors, respectively (49). Genomic analysis then placed these functional observations into sharper focus. Whereas the human species was found to have a diverse and polymorphic family of KIR genes, its Ly49 gene family has but a single unexpressed gene (50, 51); conversely, laboratory mice have a diverse and polymorphic family of Ly49 genes (52), but their KIR gene family comprises only two genes (21, 22). The divergence of the human and murine NK cell receptor systems stimulated further species comparisons, particularly ones involving primate species that are either closely related to humans (apes) or favored as a model for human disease (rhesus monkey).
For each primate species examined, only a minority of KIR corresponded to those in other species (23, 24, 25, 26, 27). Among the divergent KIR, examples were seen of a patchwork pattern, the hallmark of recombination. Because of the inherent complexity of the KIR gene family, previous species comparisons were limited both by the amount of data forming the basis for analysis and by phylogenetic methods that ignored the confounding effect of recombination. In this study, we took three approaches to reduce these limitations. First, new data were acquired for the KIR of the gorilla, a previously unstudied species that is second only to chimpanzees in its affinity to humans (53). Second, we did not limit our analysis to comparison of human and gorilla KIR, but also undertook a general comparison of 75 KIR representing six primate species. Third, methods were used to determine the impact of recombination upon the set of KIR sequences analyzed and to correct for its effects in subsequent phylogenetic analysis.
This combined approach has led to three general conclusions. First, rapid evolution in the KIR gene family is principally due to the formation of new KIR genes and the loss of existing KIR genes. This is illustrated, for example, by the D0 of the lineage II KIR and its relationship with the D0 domain of the rhesus monkey KIR (Fig. 3A); this relationship, and the fact that it is not observed for the other domains, can be explained only by a duplication followed by independent losses in the different species (hominoids and rhesus monkey) (Fig. 8). Second, new KIR genes are produced by recombination between existing genes: recombination is, for example, at the origin of the lineage II KIR and of the chimpanzee lineage III three-Ig KIR (Fig. 8). Third, among the new KIR genes that ultimately become fixed, those formed by recombinations that reassort intact protein domains predominate (Figs. 2 and 4).
For KIR genes, the exon-intron structure correlates with the four structural domains: D0, D1, D2, and S/TM/CYT (54). One possible cause for the predominance of domain shuffling is that certain introns are hot spots for recombination; another is that KIR made by domain shuffling more frequently have functions favored by natural selection. These two explanations are not mutually exclusive. The study of KIR mutants and variants has emphasized the segregation of different functions to different domains. Activating and inhibitory signaling functions are carried by alternative forms of S/TM/CYT; direct interaction with MHC class I is mediated by the D1 and D2 domains, and ligand specificity is determined by their polymorphisms and combination; and enhancement of this interaction is mediated by D0, the presence or absence of which is determined by alternative forms of exon 3 (55). This modular aspect of KIR molecules means that domain shuffling should be an effective means for changing the strength and specificity of KIR binding to MHC class I ligands, as well as for reversing their signaling function.
In the higher primates studied so far, it is the KIR genes of lineages II and III, comprising the MHC-A, -B, and -C receptors, that have been rapidly evolving through domain shuffling. These two lineages appear specific to the hominoids: the ape and human species. In contrast, genes encoding the lineage I KIR (KIR2DL4 and KIR2DL5) have been relatively resistant to change and appear to be the only KIR common to hominoids (KIR2DL4 and KIR2DL5) and monkeys (KIR2DL4). The gorilla KIR exemplify these general observations. Whereas the lineage I gorilla KIR have orthologs in the other primate species, the lineage II gorilla KIR is a recombinant, and all of the lineage III gorilla KIR are unique. In general, gorilla KIR are most closely related to human and chimpanzee KIR, with particular features being more like the chimpanzee and others more like the human.
We thank Yerkes Regional Primate Center of Emory University for providing samples of peripheral blood from gorillas and chimpanzees. We thank D. Martin for having provided us with an updated version of RDP and for help in setting up the recombination analysis. We thank Dr. L. Guethlein for useful discussions.
This study was supported by National Institutes of Health Grant AI31168 to P.P.
Abbreviations used in this paper: KIR, killer cell Ig-like receptor; Gg, Gorilla gorilla; Pt, Pan troglodytes; Mm, Macaca mulatta; RDP, recombination detection program; D0, D1, D2, Ig domains 0, 1, and 2; S, stem domain; TM, transmembrane domain; CYT, cytoplasmic domain; MCMCMC, Metropolis-coupled Markov chain Monte Carlo; ML, maximum likelihood; ITIM, immunoreceptor tyrosine-based inhibitory motif.