The immune and reproductive functions of human NK cells are regulated by interactions of the C1 and C2 epitopes of HLA-C with C1-specific and C2-specific lineage III killer cell Ig-like receptors (KIR). This rapidly evolving and diverse system of ligands and receptors is restricted to humans and great apes. In this context, the orangutan has particular relevance because it represents an evolutionary intermediate, one having the C1 epitope and corresponding KIR but lacking the C2 epitope. Through a combination of direct sequencing, KIR genotyping, and data mining from the Great Ape Genome Project, we characterized the KIR alleles and haplotypes for panels of 10 Bornean orangutans and 19 Sumatran orangutans. The orangutan KIR haplotypes have between 5 and 10 KIR genes. The seven orangutan lineage III KIR genes all locate to the centromeric region of the KIR locus, whereas their human counterparts also populate the telomeric region. One lineage III KIR gene is Bornean specific, one is Sumatran specific, and five are shared. Of 12 KIR gene–content haplotypes, 5 are Bornean specific, 5 are Sumatran specific, and 2 are shared. The haplotypes have different combinations of genes encoding activating and inhibitory C1 receptors that can be of higher or lower affinity. All haplotypes encode an inhibitory C1 receptor, but only some haplotypes encode an activating C1 receptor. Of 130 KIR alleles, 55 are Bornean specific, 65 are Sumatran specific, and 10 are shared.
This article is featured in In This Issue, p.3003
Orangutans occupy a pivotal position in the evolution of hominid killer cell Ig-like receptors (KIR). These receptors are expressed by NK cells and subsets of T cells and interact with specific epitopes on MHC molecules (1). In phylogenetic analyses, orangutans are the species most distant from humans that have MHC-A, -B, and -C orthologs of the HLA-A, -B, and -C molecules that are KIR ligands (2–4). More distant species, such as Old World monkeys, lack a strict ortholog of HLA-C (4, 5), which, in humans, is the dominant KIR ligand. A characteristic of this dominance is that all humans have at least one functionally interacting HLA-C and KIR allotype pair, whereas numerous individuals (an average of 13% worldwide) lack any HLA-A or -B allotype that is a KIR ligand. Popy-C, the orangutan ortholog of HLA-C, has properties indicating that it plays a less prominent role than does HLA-C. Although HLA-C is fixed in the human species, Popy-C is present on only ∼50% of orangutan MHC haplotypes; thus, ∼25% of the orangutan population does not have Popy-C (2, 6, 7). Another critical difference is that Popy-C allotypes only have the C1 epitope (6, 7), whereas human HLA-C allotypes are of two types that have either the C1 or C2 epitope, and they are recognized by different human KIR (8). These and other observations suggest that C1+MHC-C evolved first, in a common human–orangutan ancestor, and that C2+MHC-C evolved later, on the human lineage after separation from the orangutan lineage (9–11). Thus, the orangutan system represents an intermediate stage in the evolution of the human system, one using interaction of C1+MHC-C with KIR but not interaction of C2+MHC-C with KIR.
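The step from ∼50% of MHC haplotypes to ∼25% of individuals can be made explicit. As a sketch, assuming that haplotypes pair at random (Hardy-Weinberg proportions, an assumption that is not stated in the text) and writing p for the frequency of MHC haplotypes that carry Popy-C, the expected fraction of individuals lacking Popy-C on both haplotypes is:

```latex
\[
P(\text{no Popy-C}) = (1 - p)^2 \approx (1 - 0.5)^2 = 0.25
\]
```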
In hominids, the KIR locus contains KIR genes of four phylogenetic lineages (11). MHC-A and -B epitopes are recognized by lineage II KIR, whereas MHC-C epitopes are recognized by lineage III KIR, and ligands for lineage I and lineage V KIR are less well characterized. Extensive study of the KIR locus in human populations shows that KIR haplotypes combine high gene-content diversity with high polymorphism for most constituent genes (12–14). Such diversity is observed in both the centromeric and telomeric regions of the KIR locus. Differences between human KIR haplotypes are associated with numerous human diseases (15). More limited study of other hominids (3, 9, 16–19), including orangutan (7, 10, 11), suggests that KIR diversity is restricted to the centromeric region of the KIR locus in these species. Because the orangutan has already proved to be a valuable model for understanding evolution of the human system, we performed an in-depth study of the structure and polymorphism of the orangutan KIR locus, including comparison of the Sumatran and Bornean orangutan species.
Materials and Methods
Isolation of bacterial artificial chromosome clones containing orangutan KIR genes
Filters containing the orangutan bacterial artificial chromosome (BAC) library CH253 (BACPAC, Children’s Hospital Oakland Research Institute), derived from the genomic DNA of male orangutan Segundo, were screened with 32P-labeled cDNA clones encoding orangutan KIR: Popy-KIR3DLA and Popy-KIR2DL4A (7, 10). Random priming with incorporation of [32P]-dCTP was used to label the cDNA probes. For screening, the filters were prehybridized at 42°C for 2 h in a solution containing 50% formamide, 5% Denhardt’s solution, 5× SSPE, 5% dextran sulfate, 1% SDS, and sheared salmon sperm DNA (100 μg/ml). Radiolabeled probe was added, and the hybridization continued overnight. After decanting the hybridization solution, the filters were washed twice in a solution of 2× SSPE/0.5% SDS at 42°C, followed by two washes in 0.2× SSPE/0.5% SDS at room temperature. Fourteen BAC clones that hybridized with the Popy-KIR probe were plated, from which individual colonies were isolated.
That the selected BACs contained KIR genes was confirmed by dot blotting the individual cultures, which were then hybridized with the same probes used to screen the original library. Eight to sixteen colonies of each BAC were cultured individually in 1 ml of broth (2× YT with 25 μg/ml chloramphenicol) in 96-well culture blocks. Following overnight culture at 37°C, dot blots were made by spotting 50 μl of culture onto a prewetted Hybond-N+ membrane in a 96-well vacuum manifold. Vacuum was applied to remove the liquid. Then, 100 μl of 0.5 M NaOH, 1.5 M NaCl solution was added to each well. After incubation at room temperature for 5 min, vacuum was applied to remove the liquid. Finally, 100 μl of 1.5 M NaCl, 0.5 M Tris (pH 8) solution was added to each well and incubated at room temperature for 5 min. Vacuum was applied to remove the liquid. The filters were removed from the manifold, cross-linked by exposure to UV light, and air dried prior to hybridization. The filters were hybridized using the same hybridization conditions and probes used in screening the library filters (described above). For 4 of the 14 BACs identified in the initial screen (CH253-75E4, CH253-76D24, CH253-272D24, and CH253-134A21), all subclones gave a positive hybridization signal. For one additional BAC (CH253-418C19), 5 of 16 subclones gave a positive signal. The positive clones from these five BACs were analyzed further to determine their KIR gene content.
Determining the Popy-KIR gene content of BACs
PCR analysis was used to determine the gene content and relatedness of the five BACs. Gene-specific amplifications were performed to detect Popy-ILT10 and Popy-FCAR, the genes that flank the KIR locus (11), as well as lineage II Popy-KIR3D. PCR reactions contained 1× PCR buffer, 0.2 mM each primer, 0.8 mM each dNTP, 1 U Taq polymerase, 100 ng of template DNA, and Milli-Q–distilled water; the last was added to give a final volume of 25 μl. Cycling conditions were 3 min at 95°C, 5 cycles of 20 s at 95°C, 1 min at 62°C, and 45 s at 72°C, 25 cycles of 20 s at 95°C, 40 s at 60°C, and 90 s at 72°C, followed by a final extension of 10 min at 72°C. The sequences of the primers are given in Fig. 1A. The PCR products from the Popy-KIR3D amplification were cloned and sequenced, and the results were used to assess the overlaps between the BACs. Three sequences were identified: two are haplotype specific, and one is common to the two haplotypes. The presence or absence of the haplotype-specific sequences was used to sort the BACs into haplotype groups. The BACs were also end-sequenced to determine their genomic context. The results of this PCR analysis are summarized in Fig. 1B. BACs CH253-75E4 and CH253-76D24 represent one of the two KIR haplotypes, with CH253-75E4 containing the entire KIR locus. BACs CH253-272D24 and CH253-418C19 represent the second KIR haplotype. Neither of these two BACs contains the complete KIR locus. BAC CH253-134A21 could not be assigned to one of the two haplotypes because it only contains the Popy-KIR3D gene present on both haplotypes.
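For readers who want to program a thermocycler from the conditions above, the two-stage cycling profile can be written down as data and its length estimated. This is only an illustrative sketch: the stage layout and the function name are our own, the step durations are transcribed from the text, and the time spent ramping between temperatures is not counted.

```python
# Cycling program for the BAC gene-content PCR, transcribed from the text.
# Each stage is (number_of_cycles, [step durations in seconds]).
PROGRAM = [
    (1, [180]),          # initial denaturation: 3 min at 95°C
    (5, [20, 60, 45]),   # 95°C, 62°C, 72°C
    (25, [20, 40, 90]),  # 95°C, 60°C, 72°C
    (1, [600]),          # final extension: 10 min at 72°C
]


def total_hold_seconds(program):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    return sum(cycles * sum(steps) for cycles, steps in program)


total = total_hold_seconds(PROGRAM)
print(f"{total} s (about {total / 60:.0f} min) of programmed hold time")
```

The actual run time on an instrument will be longer, by an amount that depends on its ramp rates.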
The Orangutan Genome Sequencing project (20) studied the genome of Susie, a female orangutan whose genomic DNA was also used to construct the CH276 BAC library. This library was used to finish parts of the orangutan genome that were incomplete, including the KIR locus (20). Of Susie’s two KIR haplotypes, the one determined in the Orangutan Genome Sequencing project (by sequencing BAC CH276-525G20) corresponds to haplotype H2 in Fig. 2. We determined the sequence of Susie’s second KIR haplotype. To do this, we extracted all sequencing reads homologous to orangutan KIR from the orangutan Whole Genome Shotgun (WGS) sequence archive at GenBank. The extracted reads were sorted into two groups. One group contains all reads having sequence identical to Susie’s first KIR haplotype (the CH276-525G20 haplotype, H2). This group also contains reads that map to both haplotypes. The second group contains reads specific to Susie’s second KIR haplotype and lacks reads that can also map to the first (sequenced) haplotype. These reads mapped well to haplotype H3 (Fig. 2), the first KIR haplotype sequence determined (11). That H3 haplotype sequence was obtained from clones isolated from the cosmid library prepared by the Resourcezentrum Primärdatenbank (Berlin, Germany) from DNA obtained from another, unspecified, orangutan. Susie’s H2 and H3 haplotypes differ substantially, particularly in the region flanked by KIR3DL3 and KIR2DP (Fig. 2). Because Segundo is Susie’s son, he has one KIR haplotype in common with Susie, the H2 haplotype, and a second KIR haplotype (H1) that is different from H2 and of paternal origin. Comparison of partial sequences of the KIR-containing BACs from Segundo (library CH253) with the complete sequences of Susie’s two KIR haplotypes showed that Segundo’s paternal KIR haplotype is contained in two overlapping BACs: CH253-272D24 and CH253-418C19. Complete sequences of these two BACs were determined, giving the complete sequence of an H1 KIR haplotype.
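The sorting of WGS reads into the two groups described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes exact substring matching on one strand only, the function name is hypothetical, and the sequences are toy strings rather than orangutan data. Reads found in the sequenced haplotype (including reads that would map to both haplotypes) go into the first group; the remainder are candidates for the second haplotype.

```python
def partition_reads(reads, first_haplotype):
    """Split reads into those identical to the sequenced haplotype and the rest.

    A read is placed in the first group if it occurs verbatim in
    `first_haplotype`. Reverse-complement matches and sequencing errors
    are deliberately ignored in this simplified sketch.
    """
    matches_first, second_specific = [], []
    for read in reads:
        if read in first_haplotype:
            matches_first.append(read)
        else:
            second_specific.append(read)
    return matches_first, second_specific


# Toy example (made-up sequences, for illustration only).
reference = "ACGTTGCAAGGCTTAC"
toy_reads = ["ACGTTG", "GGCTTAC", "ACGTAG", "AGGATT"]
group_one, group_two = partition_reads(toy_reads, reference)
```

In practice an aligner that tolerates mismatches and handles both strands would replace the substring test.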
Sequencing the BACs containing KIR
BAC DNA was prepared using the QIAGEN Large-Construct Kit, following the manufacturer’s recommendations. This included an exonuclease step to remove contaminating bacterial genomic DNA. The BACs were sequenced by the Stanford Genome Technology Center using GS-FLX Titanium chemistry on a 454 instrument.
We used Pacific Biosciences technology to confirm these sequences. BAC DNA was fragmented to a final size of 10 kb using Covaris tubes. Libraries were prepared using the Pacific Biosciences SMRTbell Template Prep Kit and the DNA/Polymerase Binding Kit. Sequencing was performed on an RS II Instrument (Pacific Biosciences, Menlo Park, CA) using the DNA sequencing kit and conditions recommended by the manufacturer. Sequence runs were 2 h in duration.
Sequence assembly and analysis
The sequence data obtained from the 454 instrument were assembled using the MIRA software package (http://www.chevreux.org/projects_mira.html) (21). An initial round of de novo assembly was performed for each BAC, the contigs were assessed for gene content, and a putative reference sequence was constructed using known sequences as templates. This reference sequence was used in subsequent rounds of assembly in MIRA using four passes of de novo assembly and two passes of mapping assembly to the reference sequence.
Sequence reads obtained from the RS II Instrument were subjected to de novo assembly using SMRT Analysis Software and Quiver (Pacific Biosciences). The sequence obtained was compared with that obtained from the MIRA assembly of the 454 data, and ambiguities were resolved by targeted PCR amplification and sequencing. The final sequence has been submitted to the GenBank sequence database (https://www.ncbi.nlm.nih.gov) under accession number KU757291.
Genotyping orangutan KIR
We developed a KIR genotyping method with which we assessed KIR variation in a panel of 21 orangutan genomic DNA samples banked in our laboratories. This method uses a combination of sequence-specific priming genotyping and gene-specific PCR amplification, followed by sequencing. Both methods determine the presence or absence of KIR genes, and the latter can also distinguish some KIR alleles. PCR reactions contained 1× PCR buffer, 0.2 mM each primer, 0.8 mM each dNTP, 1 U Taq polymerase, and 100 ng of template DNA; Milli-Q–distilled water was added to give a final volume of 25 μl. Primers, their location, expected product size, annealing temperature, and extension times are shown in Fig. 1C. Cycling conditions were 3 min at 95°C, 30 cycles of 30 s at 95°C, 1 min at the annealing temperature, and 72°C for the indicated extension time, followed by a final extension of 10 min at 72°C. Products were cloned and sequenced to determine allelic identity. For each individual, the genotype and the known family relationships (Supplemental Fig. 1) were used to determine the KIR haplotypes present in each orangutan’s genome. Novel sequences have been submitted to the GenBank sequence database (https://www.ncbi.nlm.nih.gov) under accession numbers KY490019–KY490037. An alignment containing all orangutan KIR is provided (Supplemental Fig. 2).
Determining KIR genotypes from the ape genome database
The genomes of 10 orangutans (5 Bornean and 5 Sumatran) were sequenced, using paired-end technology, as part of a larger study of great ape genomic diversity (22). Using Bowtie 2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml), we screened the data in the Sequence Read Archive, using complete genomic sequences and individual exons of cDNA sequences of all available orangutan KIR as the reference. These reference sequences were masked to remove repeat elements. Reads that mapped to the KIR region and their pairs were retained, and gene content was assessed using in silico probing for KIR gene–specific reads. The reads obtained were then assembled using MIRA, with individual orangutan KIR gene sequences as the reference. The assembly was manually inspected to determine the alleles of each KIR gene.
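The in silico probing step can be pictured as counting reads that contain a sequence unique to one KIR gene. The sketch below is an assumption-laden illustration rather than the pipeline used here: the probe sequences, gene labels, read strings, and the `min_reads` threshold are all hypothetical, and matching is by exact substring.

```python
from collections import Counter


def gene_content(reads, probes, min_reads=2):
    """Call each gene present if at least `min_reads` reads contain its probe.

    `probes` maps a gene label to a gene-specific sequence. Both the
    labels and the threshold are illustrative choices.
    """
    counts = Counter()
    for read in reads:
        for gene, probe in probes.items():
            if probe in read:
                counts[gene] += 1
    return {gene: counts[gene] >= min_reads for gene in probes}


# Toy data (made-up sequences, not real KIR probes).
toy_probes = {"geneA": "ACGTAC", "geneB": "TTGGCC"}
toy_reads = ["GGACGTACTT", "ACGTACAA", "CCTTGGCCAA", "AAAAAAAA"]
calls = gene_content(toy_reads, toy_probes)
```

Requiring more than one supporting read is a simple guard against calling a gene from a single spurious match.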
Orangutan KIR nomenclature
When orangutan KIR sequences were first determined by cDNA sequencing (7), there was uncertainty in distinguishing whether a new sequence represented a new gene or a new allele of a known gene. Because of the ambiguity, we adopted a provisional nomenclature in which orangutan KIR were distinguished by upper case letters, rather than by numbers, as was the practice for other primate species (16–18, 23). Having complete sequences for several orangutan KIR haplotypes has now eliminated the ambiguities. After consultation with the curators of the IPD-KIR database (https://www.ebi.ac.uk/ipd/kir/introduction.html) (24), we developed a new and permanent nomenclature for orangutan KIR genes and alleles. Supplemental Fig. 3 gives the new names for the orangutan KIR and their previous designations. Following the convention established for assigning allele names (https://www.ebi.ac.uk/ipd/kir/nomenclature.html), the alleles determined for each species were numbered in order of acquisition. Alleles that are identical between the two species are numbered independently within each species set. In some cases the numbers coincide (e.g., Poab-KIR2DS10*001 and Popy-KIR2DS10*001), but in others they differ. For example, Poab-KIR3DL1*003 is identical to Popy-KIR3DL1*00101. All gene sequences used in this study, whether they were determined previously, are newly sequenced, or were assembled from the ape genome database, are provided in Supplemental Fig. 2.
Haplotype inference for unrelated individuals
For individuals for whom family relationships are not known, KIR haplotypes were inferred by comparison with the dataset of genotypes and the confirmed KIR haplotypes obtained from family analysis. For example, individual SRS396838 (Fig. 3B) from the ape genome dataset had a gene-content genotype identical to that of four other individuals (Allen, Penari, Puti, PPY1) (Fig. 3A). Of these four individuals, two (Penari and Puti) had haplotypes assigned by family analysis as H1 and H4. As well as sharing gene content, several KIR genes had identical alleles in all of these individuals (2DL11, 2DL5, 3DS1, 2DL12), further supporting the presence of identical KIR gene-content haplotypes. In this case, the three individuals (Allen, PPY1, SRS396838) were all assigned as having the same haplotype content (H1 and H4) as that established for Penari and Puti. For some individuals, one haplotype identical to a known haplotype was extracted from the dataset, either by comparison with known haplotypes or by comparison with other individuals having the same alleles. For example, individual SRS396839 was determined to have an H1 haplotype that is also present in other panel members. The second haplotype in SRS396839, designated H6, was inferred from the genes/alleles remaining after subtracting those of the H1 haplotype. As further confirmation, this combination of genes/alleles was also found in individual SRS396840.
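The subtraction used to infer a second haplotype can be sketched as a multiset difference. This is a simplified illustration under stated assumptions: the function name is hypothetical, the allele labels in the example are invented placeholders rather than alleles from Fig. 3, and a genotype is treated as a flat list of allele names in which a homozygous allele appears twice.

```python
from collections import Counter


def subtract_haplotype(genotype, known_haplotype):
    """Return the alleles left after removing one known haplotype.

    Returns None when the known haplotype is not fully contained in the
    genotype, in which case no inference is made.
    """
    remaining = Counter(genotype)
    known = Counter(known_haplotype)
    if known - remaining:  # some known allele is missing from the genotype
        return None
    return sorted((remaining - known).elements())


# Placeholder allele labels, for illustration only.
toy_genotype = ["2DL11*001", "2DL11*002", "3DL1*002", "3DL1*005", "2DS10*001"]
toy_known = ["2DL11*001", "3DL1*002", "2DS10*001"]
inferred = subtract_haplotype(toy_genotype, toy_known)
```

As in the text, a haplotype inferred this way is best treated as provisional until the same combination of genes and alleles is seen in another individual.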
Phylogenetic and diversity analyses
Sequences were aligned in Geneious using the MAFFT algorithm (25), with manual correction. Phylogenetic trees were constructed with MEGA 6 using Neighbor Joining, pairwise deletion, Tamura-Nei, and 1000 bootstrap replicates (26). Pairwise distances for alleles of a KIR gene were calculated using MEGA 6. The results were plotted using ggplot2 (http://ggplot2.org) (27) in RStudio (https://www.rstudio.com) (28).
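To make the pairwise-deletion option concrete, the sketch below computes a simple proportion of differing sites (p-distance) between aligned sequences, skipping any site where either sequence of the pair has a gap or ambiguous base. It is not the Tamura-Nei model used in MEGA 6 and will not reproduce those values; the function names and the set of characters treated as missing are our own assumptions.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, missing="-N?"):
    """Proportion of differing sites, with pairwise deletion of missing sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # pairwise deletion: drop the site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_distance(alleles):
    """Average p-distance over all pairs of aligned allele sequences."""
    pairs = list(combinations(alleles, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

With pairwise deletion, a gap in one allele removes the site only from comparisons that involve that allele, so the number of sites compared can differ from pair to pair.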
High-resolution analysis of orangutan KIR haplotypes
Complete nucleotide sequences were determined for three orangutan KIR haplotypes: Pongo-H1, -H2, and -H3 (Figs. 1, 2A). Comparison of these high-resolution haplotypes shows how the orangutan KIR locus is organized around four fixed framework genes that represent the four KIR lineages of the catarrhine primates: humans, apes, and Old World monkeys (Fig. 2A). These framework genes include Popy-KIR3DL3 of lineage V and Popy-KIRDP of lineage III, which define the centromeric region of the KIR locus, as well as lineage I Popy-KIR2DL4 and lineage II Popy-KIR3DL1, which define the telomeric region. This organization of the KIR locus is conserved in humans and chimpanzees (9, 11, 19). From this first comparison of three orangutan KIR haplotypes, we see that each haplotype has a different KIR gene content in the centromeric region but identical gene content in the telomeric region.
The orangutan KIR locus is located in a chromosomal region corresponding to that of the KIR locus of humans, as well as to the KIR loci of other catarrhine primates (https://www.ncbi.nlm.nih.gov/projects/mapview/) (7, 9, 29–31). It is flanked on the centromeric side by the LILR gene family and on the telomeric side by genes encoding FCAR, NCR1, and NLRP7 (Fig. 2A). Distinguishing the orangutan haplotypes is a 9-kb insertion, which is situated 5 kb upstream from the 5′ end of the LILRP1 pseudogene (Fig. 2A). BLAST searches of GenBank found the inserted sequence to be orangutan specific. Consistent with this observation were the results of our analysis of aligned human, chimpanzee, orangutan, and rhesus macaque genomic sequences for the region containing the LILR and KIR gene families. No sequence equivalent to that identified in orangutan was detected in human, chimpanzee, or rhesus monkey.
The inserted sequence is primarily composed of a series of LINE fragments interrupted by insertion of an AluSx repeat, a partial MSTB retroviral element, and a vomeronasal receptor 1 (V1R) pseudogene (ponPygV1R-1924ps) (Fig. 2B). With 85% sequence identity, the closest relative of this pseudogene is ponPygV1R-1923ps, another V1R pseudogene. This second vomeronasal pseudogene is also situated in the LILR gene region, upstream of the LILRB4 gene.
A minority of KIR alleles are present in both the Bornean and Sumatran orangutan species
We developed a KIR genotyping method that determines the presence or absence of each Popy-KIR gene and also distinguishes alleles of lineage II and III KIR that have sequence differences in exons encoding the functionally important Ig-like domains. This method also distinguishes Popy-KIR2DL4 alleles that differ by the presence or absence of a premature termination codon in exon 5. A panel of 21 orangutans was KIR genotyped (Fig. 3A, 3C, 3E). Contemporaneous with our study, genome sequences from 10 wild orangutans (5 Bornean and 5 Sumatran) were determined by other investigators, as part of a general survey of ape genome diversity (Great Ape Genome Project [GAGP]) (22). From this dataset, we extracted all KIR sequences and determined the KIR genotype for each individual orangutan (Fig. 3B, 3D).
Including previously known alleles (7, 11), a total of 130 KIR alleles, encoding 107 allotypes, was defined, of which 94 were novel. Of these, only 11 KIR allotypes are found in both the Bornean and Sumatran orangutan species. For 10 of these shared allotypes, the Bornean and Sumatran allotypes are encoded by identical nucleotide sequences (Poab-KIR3DL3*001/Popy-KIR3DL3*001, Poab-KIR2DS13*004/Popy-KIR2DS13*00101, Poab-KIR2DS10*001/Popy-KIR2DS10*001, Poab-KIR2DS10*002N/Popy-KIR2DS10*002N, Poab-KIR2DL11*001/Popy-KIR2DL11*001, Poab-KIRDP*001/Popy-KIRDP*001, Poab-KIRDP*002/Popy-KIRDP*002, Poab-KIR3DL1*002/Popy-KIR3DL1*003, Poab-KIR3DL1*003/Popy-KIR3DL1*001, and Poab-KIR3DL1*00401/Popy-KIR3DL1*004). In contrast, Sumatran Popy-KIR2DL4*001 and Bornean Poab-KIR2DL4*006 encode identical proteins but differ by two to four synonymous substitutions.
By analysis of familial relationships (Supplemental Fig. 1), combined with inference from the full set of genotypes, we defined 12 gene-content KIR haplotypes (Fig. 4A, Supplemental Fig. 4). When KIR allelic diversity was also taken into account, the number of different KIR haplotypes increased to 41 (Fig. 4A). As seen in other catarrhine primates, groups of linked KIR genes and alleles are observed in the centromeric or telomeric region, but there is no linkage between centromeric and telomeric motifs because of the prevalence of recombination between the centromeric and telomeric regions (13, 32). For example, three of the four forms of haplotype H1 have the same KIR alleles in the centromeric region but differ with regard to their telomeric KIR2DL4 and KIR3DL1 alleles. The telomeric combination of KIR2DL4*001N:3DL1*002 on haplotype H1 is common; it also is present on haplotypes H2 and H4.
Species-specific differences in KIR haplotype structure were observed. Five haplotypes (H1, H4, H5, H6, and H7) are present only in Sumatran orangutans, with H4 having the highest frequency (Fig. 4A). Five haplotypes (H8, H9, H10, H11, and H12) are present only in Bornean orangutans. The remaining haplotypes, H2 and H3, occur in both species. They are the most frequent haplotypes in Bornean orangutans and are of intermediate frequency in Sumatran orangutans. Although the animals studied are small in number, our results point to there being substantial species-specific differences between the KIR loci of Bornean and Sumatran orangutans.
Orangutan KIR haplotypes vary in their content of activating and inhibitory MHC-C receptor genes
Differentially distributed among the haplotypes are five lineage III KIR genes that encode receptors specific for MHC-C and a sixth lineage III KIR gene (KIR2DS14) that encodes a receptor that was shown not to interact with MHC-C (10). Two of the five encode activating receptors (KIR2DS11 and 2DS13), two encode inhibitory receptors (KIR2DL11 and 2DL12) and one encodes both activating and inhibitory receptors (KIR2DS/L10). Also distinguishing these receptors is the presence of lysine (KIR2DS14 and 2DS/L10) or glutamate (KIR2DS13, 2DL11, and 2DL12) at position 44 in the D1 domain, the specificity-determining residue. A previous study that examined the binding of the lineage III orangutan KIR to an extensive panel of human HLA-A, -B, and -C allotypes showed that K44 KIR2DS/L10 binds specifically to C1+HLA-C with high avidity, whereas K44 KIR2DS14 lacks avidity for HLA class I (10). This difference in function and the strong linkage disequilibrium between functional KIR2DS/L10 and nonfunctional KIR2DS14 have a striking parallel in the human KIR system. KIR2DL2, a strong C1 receptor, and KIR2DS2, which lacks avidity for HLA class I, both have K44, high sequence similarity, and genes that are in almost complete linkage disequilibrium. The three orangutan KIR that have E44 (KIR2DL11, 2DL12, and 2DS13) all recognize HLA-C. They have a lower avidity than K44 KIR2DS/L10 but a broader specificity for HLA-C that includes C1+HLA-C and C2+HLA-C. In the physiological context of the orangutan, where there is no C2+MHC-C, their difference in avidity for C1 distinguishes the K44 and E44 orangutan KIR.
Orangutan KIR haplotypes exhibit considerable variation in their functional potential. Two groups of KIR gene-content haplotypes account for 89 and 72% of the haplotypes in Sumatran and Bornean orangutans, respectively (Fig. 4B). Most common are haplotypes that provide one inhibitory MHC-C receptor in the absence of an activating MHC-C receptor. These include all H2 and H4 haplotypes and one H1 haplotype that has a nonfunctional 2DS10 allele (KIR2DS10*003N). The inhibitory receptors of the H1 and H4 haplotypes have K44, whereas that of the H2 haplotypes has E44. The second most common haplotype group has one inhibitory and one activating MHC-C receptor. This includes the H3 and H5 haplotypes and the majority of H1 haplotypes. The activating and inhibitory receptors of the H3 haplotypes both have E44, whereas the H1 haplotypes combine a K44 activating receptor with an E44 inhibitory receptor, and the single H5 haplotype combines an E44 activating receptor with a K44 inhibitory receptor. As in humans, all orangutan KIR haplotypes provide an inhibitory MHC-C receptor, whereas only some haplotypes supply an activating MHC-C receptor (Fig. 4B) (29, 33). This is consistent with inhibitory MHC class I receptors being essential for NK cell education and function, as well as with the activating receptors having a subsidiary and modulating function. Seven KIR gene-content haplotypes that are relatively rare (H6 through H12) include ones with two activating MHC-C receptors and up to three inhibitory MHC-C receptors.
Popy-KIR2DS15: a novel KIR gene specific to Bornean orangutans
From this investigation and our previous studies (7, 11), coding region sequences were obtained for a total of 130 alleles of orangutan KIR genes and pseudogenes. Of these, 65 are specific to Sumatran orangutans, 55 are specific to Bornean orangutans, and 10 are present in both. Phylogenetic analysis shows how these KIR segregate into the four lineages of hominid KIR (Fig. 5A). Orangutan lineage I consists of two clades corresponding to alleles of the KIR2DL4 and KIR2DL5 genes, which are orthologous to the human KIR2DL4 and KIR2DL5 genes, respectively. Lineage V contains a single clade, corresponding to alleles of the KIR3DL3 gene, which is orthologous to the human KIR3DL3 gene. Although KIR2DL4, KIR2DL5, and KIR3DL3 are phylogenetically well defined, their functions are the least understood. Lineage II contains two major clades corresponding to alleles of the activating KIR3DS1 gene and the inhibitory KIR3DL1 gene. The latter has two subclades, of which the minor one corresponds to recombinant alleles that have sequence derived from lineage III KIR in exon 5 encoding the D2 domain.
The largest clade of orangutan KIR accounts for half of the total KIR and corresponds to the lineage III KIR that expanded with the emergence of MHC-C. This clade contains alleles of seven genes that make up two ancient clades: one corresponds to the truncated pseudogene DP, and the other corresponds to the six full-length genes. Of these, five were identified in a previous analysis. In contrast, the sixth gene, KIR2DS15, which forms a separate subclade within the phylogenetic tree, was uncovered in our analysis of KIR data from the GAGP (http://biologiaevolutiva.org/greatape/) (22). Of 10 orangutan genomes sequenced, we found that KIR2DS15 was present in all five Bornean orangutans but none of the five Sumatran orangutans. The KIR2DS15 coding sequence is characterized by its minimal polymorphism, which is restricted to one nucleotide substitution in pseudoexon 3. Because of this homogeneity, we could not distinguish which individual orangutans are hemizygous or homozygous for KIR2DS15. The only other gene that exhibits species specificity is KIR2DL10, which is present only in Sumatran orangutans (Figs. 3, 4A).
The coding sequence of KIR2DS15 is consistent with this gene encoding an activating MHC-C receptor that has K44. The T52, K54, N72 motif in the D1 domain distinguishes KIR2DS15 from the other orangutan lineage III KIR (Fig. 5B). Residues 52 and 54 map to the distal end of the D1 domain, whereas residue 72 maps within the binding site near the specificity-determining residue 44 (Fig. 6A). Crystallographic analysis of the KIR:MHC interface in lineage III KIR showed that residue 72 of the KIR forms a hydrogen bond with Q72 of MHC-C (34). The N72 found in KIR2DS15 is unique to orangutans (Fig. 6B); based on its disparity from the amino acids found in this position in all other KIR, it is likely to disrupt binding of KIR2DS15 to MHC. Although the other residues of the motif (T52 and K54) are located at positions outside of the binding site, it is possible that they may also affect function. In addition to KIR2DS15, lysine at position 54 is found in only a few macaque KIR and replaces what is primarily glutamate or glutamine in all other KIR (Fig. 6B), suggesting it too may alter function. Threonine at position 52 is somewhat more common and is found in chimpanzee and human KIR. In humans it is found in KIR2DP1 and is part of a motif (N46, T52, E71) that distinguishes it from other human lineage III KIR. Mutating human KIR2DL3 to have T52 reduced its affinity for the C1 epitope (35). Taken together, these results suggest that KIR2DS15 has reduced or no binding to MHC class I.
Allele diversity is highest in the framework KIR genes
The KIR3DL3, DP, 2DL4, and 3DL1 framework genes are components of all of the orangutan KIR haplotypes (Figs. 3, 7A). The other KIR genes exhibit frequencies that vary between 0 and 64% in the dataset for all 33 orangutans studied (Fig. 7A). Being components of the H4 haplotype, which is at high frequency in Sumatran orangutans, KIR2DS14 and 2DL10 have high frequencies of 55 and 41%, respectively. In striking contrast, KIR2DL10 is absent from Bornean orangutans, and 2DS14 was detected in just one individual (Fig. 3). The slightly higher frequencies of KIR2DL11, 2DL12, 2DL5, and 3DS1 in Bornean orangutans reflect the absence of the H4 and H5 haplotypes that lack these KIR genes.
Comparison of the allelic polymorphism of orangutan KIR shows that KIR3DL1 exhibits the greatest allelic differences (Fig. 7B). This reflects the role that recombination has played in forming new KIR3DL1 alleles, as illustrated by the clusters of pairwise differences that are seen for comparisons within a species and between the two species. The two groups of KIR3DL1 alleles are distinguished by a recombination event that replaced the ancestral exon 5 with an exon 5 from a lineage III KIR. A similar, but less dramatic, clustering is seen for KIR2DL11, in which there is little polymorphism beyond the recombination distinguishing the two groups. This recombination produced allotypes distinguished by the sequences of their cytoplasmic tails. One group of allotypes has the ancestral sequence, whereas the other group has a sequence derived from KIR2DL12. Overall, the lowest level of allelic variation is seen for the activating KIR of lineage III: KIR2DS10, 2DS13, 2DS14, and 2DS15. Most striking is KIR2DS15, the only orangutan KIR for which the coding sequence is completely conserved.
In the absence of recombination events, the neighboring KIR2DL12 gene and the KIRDP pseudogene have the highest mean pairwise distances (Fig. 7C). KIR2DL12 also has more allotypes (17) than the other orangutan KIR, with 12 in the Bornean species and 5 in the Sumatran species (Fig. 7D). The increased diversity in Bornean orangutans is another effect of the high frequency of the H4 haplotype, which lacks KIR2DL12, in Sumatran orangutans.
Sumatran and Bornean orangutans share a total of 11 allotypes, 10 of which are encoded by identical alleles (Fig. 7D). The eleventh is a KIR2DL4 allotype that is encoded by four KIR2DL4 alleles (three in Bornean orangutans and one in Sumatran orangutans) that all differ by two to four synonymous substitutions. Of another 13 KIR allotypes encoded by more than one allele, 9 are encoded by two alleles, 3 are encoded by three alleles, and 1 is encoded by four alleles. All other KIR differ by one or more nonsynonymous substitutions, but some have synonymous substitutions as well.
The divergence time that separates the orangutans from humans and other hominids is estimated to be around 14–18 million years (36). Although the genomic organization, polymorphism, and population genetics of the human KIR locus have been studied in considerable detail and reviewed (13, 37, 38), knowledge of the orangutan KIR locus was limited prior to this study. To some extent, that limitation prevented accurate assignment of alleles to genes and development of an accurate picture of the KIR locus and its variation. By defining the genes, the alleles, and their organization in a set of 41 orangutan KIR haplotypes, we were able to achieve these goals. Additionally, we compared the KIR locus in the Bornean and Sumatran species of orangutan. This provided an opportunity to examine KIR evolution over a shorter time period than was previously possible.
The ancestral hominoid KIR haplotype is predicted to have had five KIR genes: three framework genes of lineage Ia (like KIR2DL4), lineage II, and lineage V, as well as ancestral forms of lineage Ib (like KIR2DL5) and lineage III (3). All of these genes are present in orangutan KIR haplotypes. Distinguishing the modern haplotypes from the ancestral haplotype is a variable number (between 1 and 5) of lineage III KIR genes, an expansion that occurred with the emergence of MHC-C. The orangutan lineage III KIR genes are restricted to the centromeric region, leaving the telomeric region with just the KIR2DL4 and KIR3DL1 framework genes. Such segregation of the lineage III KIR genes is also a feature of chimpanzee KIR haplotypes (9, 19) but not human KIR haplotypes. In the latter, lineage III and lineage Ib KIR genes can be present in both centromeric and telomeric locations (29). Thus, orangutan and chimpanzee KIR haplotypes have an overall organization similar to that of the common ancestor, whereas the human lineage has evolved new forms. This culminated in the formation of functionally different groups of KIR A and KIR B haplotypes.
Of the 12 orangutan KIR gene-content haplotypes, 4 are common, and 8 are rare. These haplotypes are all related by events of gene deletion and insertion. The number of genes in a haplotype varies from 5 to 10 compared with 7–17 for human KIR haplotypes. The longest common haplotype is H1. Haplotype H3 appears to be derived from H1 by a deletion that eliminated KIR2DL5 and KIR3DS1 and created a chimeric form of KIR2DL11, with KIR2DL12 exons encoding the cytoplasmic tail. H1 and H3 also differ by having KIR2DS14 and KIR2DS13, respectively. On most haplotypes, KIR2DS13 and KIR2DS14 segregate as alleles, with the exception of H7, which has both genes. This haplotype appears to be the product of an insertion that placed KIR2DS13, KIR2DS10, and KIR2DL11 (found on H3) in between the KIR2DS14 and KIR2DL12 genes, likely of an H1 haplotype. In contrast, haplotype H2 was likely produced from H1 by deletion. This event eliminated KIR2DS14, KIR2DS10, and KIR2DL11 and created a chimeric form of KIR3DL3, with KIR2DL11 exons encoding the cytoplasmic tail. Similarly, H4 was likely derived from H1 by a deletion that eliminated KIR2DL11, KIR2DL5, KIR3DS1, and KIR2DL12 and created a chimeric form of KIR2DS10, with KIR2DL12 exons encoding the cytoplasmic tail. Thus, this novel form of KIR2DS10 (called KIR2DL10) is long-tailed and an inhibitory receptor, whereas its parent is a short-tailed activating receptor. As seen by these examples, the insertion and deletion events by which orangutan KIR haplotypes evolve often involve formation of a novel KIR having a distinctive combination of intracellular signaling and extracellular ligand-binding domains.
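The deletion events described above can be written as simple set operations. The following Python sketch is illustrative only: the gene content assigned to H1 is an assumption reconstructed from the deletions named in the text (it is not taken from Fig. 3), the helper `derive` is hypothetical, and sets ignore gene order and copy number, so the insertion that produced H7 is not modeled.

```python
# Framework genes present on all orangutan KIR haplotypes (from the text).
FRAMEWORK = frozenset({"KIR3DL3", "KIRDP", "KIR2DL4", "KIR3DL1"})

# ASSUMPTION: H1 gene content inferred from the genes that the text says
# were deleted from H1 to give H2, H3, and H4.
H1 = FRAMEWORK | {
    "KIR2DS14", "KIR2DS10", "KIR2DL11",
    "KIR2DL5", "KIR3DS1", "KIR2DL12",
}


def derive(parent, deleted, replaced=None):
    """Return the gene content left after deleting genes from a parent.

    `replaced` maps a parental gene to the gene found in its place on the
    derived haplotype (for example, a chimeric gene given a new name).
    """
    deleted = set(deleted)
    if not deleted <= parent:
        raise ValueError("cannot delete genes absent from the parent")
    genes = set(parent) - deleted
    for old, new in (replaced or {}).items():
        genes.discard(old)
        genes.add(new)
    return frozenset(genes)


# H2: loses KIR2DS14, KIR2DS10, KIR2DL11 (KIR3DL3 becomes chimeric).
H2 = derive(H1, {"KIR2DS14", "KIR2DS10", "KIR2DL11"})
# H3: loses KIR2DL5 and KIR3DS1; carries KIR2DS13 where H1 has KIR2DS14.
H3 = derive(H1, {"KIR2DL5", "KIR3DS1"}, {"KIR2DS14": "KIR2DS13"})
# H4: loses four genes; chimeric KIR2DS10 is named KIR2DL10.
H4 = derive(
    H1,
    {"KIR2DL11", "KIR2DL5", "KIR3DS1", "KIR2DL12"},
    {"KIR2DS10": "KIR2DL10"},
)

for name, hap in [("H1", H1), ("H2", H2), ("H3", H3), ("H4", H4)]:
    print(name, len(hap), sorted(hap))
```

Under these assumptions the framework genes are retained on every derived haplotype, and the gene counts fall within the range of 5 to 10 stated in the text.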
Two critical characteristics distinguish orangutan MHC-C from HLA-C. First, the orangutan MHC-C gene is carried by only 50% of MHC haplotypes, whereas HLA-C is present on all HLA haplotypes. Second, all orangutan MHC-C allotypes carry the C1 epitope, whereas an average of 65% (range 24–98%) of HLA-C allotypes carry the C1 epitope, and the other 35% carry the C2 epitope (37). The orangutan lineage III KIR that recognize MHC-C are of four types that are distinguished by the combination of two dimorphisms. In the cytoplasmic tail, dimorphism determines whether a receptor has inhibitory or activating function. In the extracellular D1 domain, dimorphism at position 44 determines whether a receptor has higher or lower affinity for MHC-C. A characteristic of the orangutan KIR haplotypes is that they encode various combinations of these four receptor types. Reflecting the importance of inhibitory receptors for NK cell education and function, all orangutan KIR haplotypes encode an inhibitory receptor, but that is not true for the activating receptors. For example, in the cohort of orangutans that we studied, the most frequent gene-content KIR haplotype encodes an activating receptor that has lost the capacity to bind MHC-C.
Pongo pygmaeus and P. abelii inhabit the Southeast Asian islands of Borneo and Sumatra, respectively. At various times in the past during periods of low sea level, these islands, situated on the Sunda shelf, were connected by a land bridge (39). During these periods, the most recent being 17,000 y ago, there was the possibility for migration of Bornean orangutans to Sumatra and Sumatran orangutans to Borneo (40). Pointing to complexity in the demographic history of the orangutans, the time of divergence of the two species is ∼3.5 million years ago when estimated from mitochondrial DNA sequences and 0.3–0.4 million years ago when estimated from autosomal DNA sequences from the same cohort of orangutans (40). To explain this difference, Ma et al. (40) proposed that divergence of the maternally inherited mitochondrial DNA marks the beginning of a protracted speciation process in which the specific migration and dispersal of males, but not females, between the islands served to blend the autosomal genes of the two species.
KIR genes of Bornean and Sumatran orangutans exhibit impressive differences. Of the common KIR gene–content haplotypes, H1 and H4 are specific to Sumatra, whereas H2 and H3 are present in both species. Of the rare haplotypes, three are specific to Sumatra, and five are specific to Borneo. At allele-level resolution, no KIR haplotype is shared by the two orangutan species. Of the 130 orangutan KIR sequences that we defined, only 10 were detected in both Sumatran and Bornean orangutans: three KIR3DL1, two KIR2DS10, two KIRDP, one KIR3DL3, and one KIR2DL11. Such divergence is consistent with the functional importance of KIR variation in binding MHC ligands and modulating NK cell functions in immunity and reproduction. Humans and chimpanzees have a divergence time of 5 million years and, like the two orangutan species, have no KIR haplotype sequence in common (9).
Phylogenetic analyses show few species-specific branches or clades on the tree. The exceptions are the few genes that are specific to each species. In Bornean orangutans, a new gene, KIR2DS15, was discovered through analysis of the GAGP sequence database and has no counterpart in Sumatran orangutans. Unique to the Sumatran orangutans is KIR2DL10, the gene formed by the deletion event that formed haplotype H4. KIR2DS14 is enriched in Sumatran orangutans, whereas a group of KIR3DL1 alleles characterized by a recombination that inserted an exon from a lineage III KIR into this lineage II KIR is enriched in Bornean orangutans.
Bornean orangutans have lower autosomal gene diversity and lower effective population size than Sumatran orangutans (20, 22, 40). In contrast, we find similar levels of KIR diversity in the orangutan species. This difference is likely the consequence of selection for KIR diversity. For both species, there was greater diversity in the framework genes than in other KIR genes. For some of these lower-diversity genes, notably KIR2DL11 and KIR3DL1, greater pairwise distances were observed as a result of the allelic variation that was due to recombination events.
The contribution of NK cells to human reproduction is to cooperate with fetal extravillous trophoblast in the remodeling of maternal arteries that will supply the placenta with blood (41). This cooperation is mediated by various NK cell receptors that interact with ligands on the trophoblast. Because HLA-C, but not HLA-A or HLA-B, is expressed on trophoblast and engages the KIR of maternal NK cells, the polymorphism of HLA-C can influence the outcome of pregnancy. Two classes of syndrome are associated with the fetus expressing paternally derived C2+HLA-C. If the mother has the KIR A/A genotype, which provides strong inhibitory C2-specific KIR2DL1 but no activating C2-specific KIR2DS1, there is increased risk for low birth weight, pre-eclampsia, and miscarriage (42), conditions favored by an insufficient supply of blood to the placenta. Conversely, if the mother has a KIR A/B or B/B genotype that combines activating C2-specific KIR2DS1 with weak inhibitory C2-specific KIR2DL1, there is increased risk for higher birth weight and obstructed labor (43), conditions favored by an oversupply of blood to the placenta. Thus, the combination of C2+HLA-C, which evolved in a common ancestor of humans and African apes, with human-specific A and B KIR haplotype differences appears to have had the effect of increasing the frequency of pregnancy syndromes. The orangutan system of MHC-C ligands and cognate KIR lacks both of these risk factors, and there is scant evidence for such pregnancy syndromes in this species.
Because MHC-C is present on only half of orangutan MHC haplotypes, 25% of individuals lack MHC-C and are predicted to lack NK cells that are educated by KIR. In pregnancy, there are four maternal–fetal MHC-C combinations: MHC-C+ mothers with an MHC-C+ fetus, MHC-C+ mothers with an MHC-C− fetus, MHC-C− mothers with an MHC-C+ fetus, and MHC-C− mothers with an MHC-C− fetus. It would be of considerable interest to assess the effect of these genotypes and their combinations with KIR on the success of pregnancy, as well as the health and survival of the progeny. Improving technology for typing KIR and MHC from fecal samples makes such studies a feasible proposition in the wild.
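The 25% figure follows from the haplotype frequency if the two MHC haplotypes of an individual are treated as independent draws. The Python sketch below makes that arithmetic explicit and extends it to the four maternal–fetal combinations. It assumes random mating and Hardy-Weinberg proportions, which the article does not state, and the function name `maternal_fetal_mhc_c` is hypothetical; the resulting combination frequencies are expectations under those assumptions, not observations.

```python
from fractions import Fraction


def maternal_fetal_mhc_c(p):
    """Expected frequencies under random mating (an assumption).

    p is the frequency of MHC haplotypes that carry MHC-C. Returns the
    fraction of individuals lacking MHC-C and a dict keyed by
    (mother status, fetus status), where "+" means MHC-C is present.
    """
    q = 1 - p                 # haplotypes lacking MHC-C
    half = Fraction(1, 2)     # chance a heterozygote transmits either haplotype
    mother_neg = q * q        # both maternal haplotypes lack MHC-C
    combos = {
        # An MHC-C- mother always transmits a haplotype lacking MHC-C,
        # so the fetus status depends only on the paternal haplotype.
        ("-", "-"): mother_neg * q,
        ("-", "+"): mother_neg * p,
        # Only a heterozygous mother (2pq) can be MHC-C+ and still
        # transmit a haplotype lacking MHC-C.
        ("+", "-"): 2 * p * q * half * q,
    }
    combos[("+", "+")] = 1 - sum(combos.values())
    return mother_neg, combos


lacking, combos = maternal_fetal_mhc_c(Fraction(1, 2))
print("individuals lacking MHC-C:", lacking)
for (mother, fetus), freq in sorted(combos.items()):
    print(f"mother MHC-C{mother}, fetus MHC-C{fetus}: {freq}")
```

With a haplotype frequency of 1/2, the fraction of individuals lacking MHC-C is 1/4, matching the 25% given in the text. Under the same assumptions, each of the three combinations involving an MHC-C− mother or fetus is expected at 1/8, and the MHC-C+ mother with MHC-C+ fetus combination at 5/8.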
We thank Swati Ranade and Brett Bowman of Pacific Biosciences for help with the sequencing and analysis of BACs.
This work was supported by National Institutes of Health Grants AI24258 and AI31168 (to P.P.). This project was funded in part by National Institutes of Health Grant ORIP/OD P51OD011132 to the Yerkes Regional Primate Center, from which the majority of the source material for the banked samples was obtained.
The sequences presented in this article have been submitted to the GenBank database (https://www.ncbi.nlm.nih.gov) under accession numbers KU757291 and KY490019–KY490037.
The online version of this article contains supplemental material.
Abbreviations used in this article:
BAC, bacterial artificial chromosome
GAGP, Great Ape Genome Project
KIR, killer cell Ig-like receptor
vomeronasal receptor 1
WGS, Whole Genome Shotgun.
The authors have no financial conflicts of interest.