The reptiles are the last major group of jawed vertebrates in which the organization of the IGH locus and its encoded Ig H chain isotypes have not been well characterized. In this study, we show that the green anole lizard (Anolis carolinensis) expresses three Ig H chain isotypes (IgM, IgD, and IgY) but no IgA. The presence of the δ gene in the lizard demonstrates an evolutionary continuity of IgD from fishes to mammals. Although the germline δ gene contains 11 CH exons, only the first 4 are used in the expressed IgD membrane-bound form. The μ chain lacks the cysteine in CH1 that forms a disulfide bond between H and L chains, suggesting that (as in IgM of some amphibians) the H and L polypeptide chains are not covalently associated. Although conventional IgM transcripts (four CH domains) encoding both secreted and membrane-bound forms were detected, alternatively spliced transcripts encoding a short membrane-bound form were also observed and shown to lack the first two CH domains (VDJ-CH3-CH4-transmembrane region). Similar to duck IgY, lizard IgY H chain (υ) transcripts encoding both full-length and truncated (IgYΔFc) forms (with two CH domains) were observed. The absence of an IgA-encoding gene in the lizard IGH locus suggests a complex evolutionary history for IgA in the saurian lineage leading to modern birds, lizards, and their relatives.
The Igs are uniquely characteristic of the adaptive immune systems of all jawed vertebrates, being found in jawed fish, amphibia, reptiles, birds, and mammals (1, 2). A typical Ig molecule consists of two H and two L chains, with the H and L chains being encoded in separate IGH and IGL loci (2). The H and L polypeptides are typically covalently linked by disulfide bonds formed by positionally conserved cysteine residues; however, noncovalent interactions alone are sufficient to hold L and H chains together, as observed for human IgA2m (1) and Igs of Rana catesbeiana (3, 4).
The Ig classes expressed in mammals are IgM, IgD, IgG, IgE, and IgA. This classification is based on their H chain isotypes, which are encoded by the μ, δ, γ, ε, and α genes, respectively (1, 2). The evolutionary history of these genes is of interest from many perspectives, two of which have received particular attention. The first is the question of how the genes encoding the H chains arose, one from the other, and how their subsequent evolutionary divergence permitted the acquisition of novel Ab functions including transport and secretion into external body fluids such as mucus and milk, or the mediation of allergic reactions. In summary, first, IgM is the only class of Ab that is expressed in all species of jawed vertebrates (1, 2, 5). Second, IgD is a class of Ab that is probably as ancient as IgM, existing in elasmobranchs as a class that was originally named IgW but is now recognized as a homolog of IgD (6). IgD is an enigmatic Ab the structure of which is highly variable (which, in some cases, is due to duplication or loss of CH exons; Ref. 7), the function of which is a matter of debate (8), and which has clearly been lost from certain species. In bony fish and pigs, IgD can be expressed as chimeric H chains with inclusion of the IgM CH1 domain (9, 10). Although IgD/W has been described in fish (cartilaginous and bony), amphibians, and numerous mammals (6, 9, 11, 12), it is absent in birds and rabbits (13, 14). Third, the major class of low-m.w. Ab in the amphibia, reptiles, and birds is IgY, which is thought to have been derived from IgM by a gene duplication event (5). Fourth, IgY played a central role in Ab evolution in the tetrapod lineage, giving rise to IgA (through an apparent recombination event with IgM; Ref. 15) and to IgG and IgE through a gene duplication event followed by differentiation of structure and function (16). 
Overlaid on this broad scheme are other important events in the evolutionary history of Abs where certain lineages developed unique classes of Abs, such as the IgNAR in elasmobranchs (17), IgT/Z in some species of bony fish (18, 19), and IgX and IgF in amphibia (12, 20).
A second important perspective on the evolution of the Ab genes is the structure of the IGH locus. Interest in this question was stimulated by observations that in cartilaginous fish (phylogenetically the most primitive jawed vertebrates), the IGH loci show a unique structure. Whereas in the mammals the genes present in the IGH locus are arranged in a (VH)n-(DH)n-(JH)n-(CH)n order, the IGH loci of cartilaginous fish show a multicluster organization, where multiple clusters, each consisting of (V-DH-DH-JH-CH), are arranged in tandem (21, 22, 23). These alternative organizations (translocon or multicluster) present different challenges for the orderly rearrangement and clonal expression by B lymphocytes of the Ab H chains (24).
Although the IGH loci have been extensively investigated in fish, amphibia, birds, and mammals, our current knowledge of the Ig genes in reptiles remains limited. Reptiles are known, on the basis of studies at the protein level, to express two Ig classes, IgM and IgY (25, 26), and an IgM-encoding gene has been cloned in turtles (27). Two types of IgY (7S, 5.7S) that differ in the lengths of their H chains have been described at the biochemical level in turtles (28, 29), but it is not known whether the genetic basis of this phenomenon is the same as that underlying the expression of two isoforms of IgY in the duck (30). A gene encoding an IgA-like isotype was recently cloned in the leopard gecko (15), suggesting that the repertoire of Ig classes in reptiles may be similar to that of birds. The IgD-encoding gene, despite its absence in birds, was also recently reported in the gecko (31). A paucity of knowledge of the structure of the reptilian IGH locus and of its encoded Ig H chain isotypes has been a barrier to a more complete understanding of the evolutionary history of Igs in vertebrates. We report here an investigation of the expression of the IGH genes of a reptile, the green anole lizard (Anolis carolinensis). (While our paper was under revision, another group (32) reported a computational analysis of the lizard IgH locus but without providing any experimental data.)
Materials and Methods
Annotation of the green anole lizard (A. carolinensis) IGH constant genes
Basic local alignment search tool (BLAST) searches were performed against the genome sequence of the lizard (AnoCar1.0 assembly) deposited in Ensembl (http://www.ensembl.org/index.html). Genomic scaffolds were retrieved for further analysis. To search for putative Ig domains, the genomic DNA sequences were translated into proteins, which were subsequently subjected to protein-protein BLAST searches against the National Center for Biotechnology Information (NCBI) GenBank database. Recombination signal sequences for VH, DH, and JH gene segments were examined using the program FUZZNUC (http://bioweb.pasteur.fr/seqanal/interfaces/fuzznuc.html).
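The recombination signal sequence search described above is essentially a pattern search that tolerates mismatches. The following is a minimal sketch of that idea, not the FUZZNUC program itself. The heptamer and nonamer consensus sequences, the 12- and 23-bp spacer lengths, and the mismatch threshold are textbook assumptions rather than values stated in this paper, and the function name `find_rss` is hypothetical. The sketch scans one orientation only (heptamer, spacer, nonamer); the opposite orientation would require scanning the reverse complement as well.

```python
# Sketch: scan a DNA sequence for candidate recombination signal sequences
# (heptamer - spacer - nonamer), allowing a limited number of mismatches.
HEPTAMER = "CACAGTG"    # textbook consensus (assumption, not from the paper)
NONAMER = "ACAAAAACC"   # textbook consensus (assumption, not from the paper)


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_rss(seq: str, spacers=(12, 23), max_mismatches: int = 2):
    """Return (start, spacer_length, total_mismatches) for each candidate."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(HEPTAMER) + 1):
        hep_mm = mismatches(seq[start:start + len(HEPTAMER)], HEPTAMER)
        if hep_mm > max_mismatches:
            continue
        for spacer in spacers:
            non_start = start + len(HEPTAMER) + spacer
            nonamer = seq[non_start:non_start + len(NONAMER)]
            if len(nonamer) < len(NONAMER):
                continue  # the nonamer would run past the end of the sequence
            total = hep_mm + mismatches(nonamer, NONAMER)
            if total <= max_mismatches:
                hits.append((start, spacer, total))
    return hits


# Toy sequence: 6-nt flank, heptamer, 12-nt spacer, nonamer, 6-nt flank.
toy = "GGATCC" + HEPTAMER + "T" * 12 + NONAMER + "GGATCC"
exact_hits = find_rss(toy, max_mismatches=0)
print(exact_hits)  # [(6, 12, 0)]
```

Raising `max_mismatches` makes the scan more permissive, at the cost of more spurious candidates that would then need manual inspection.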
Animals, RNA isolations, and reverse transcriptions
Green anole lizards approximately 3 years old were purchased from a local pet market in Beijing. Total RNA from different tissues was prepared using a TRNzol kit (Tiangen Biotech). Reverse transcription was conducted using Moloney murine leukemia virus reverse transcriptase following the manufacturer’s instructions (Invitrogen).
Cloning of the IgM, IgD, and IgY H chain constant region cDNAs
Two JH primers, JHs1 (5′-GGGGCCAAGGAACATTTCTG-3′) and JHs2 (5′-ACATTTCTGGCTGTAACTTCAG-3′), were designed based on conserved JH sequences. Membrane-bound IgM cDNA was amplified by a nested PCR using JHs1 and IgMTMas1 (5′-TTGATGACAGTCACGGTGGCAC-3′, in which TM is transmembrane region), then JHs2 and IgMTMas2 (5′-TGGCACTGTAGAACAGGCTCAC-3′), whereas the secreted IgM cDNA was amplified by a nested PCR using primers JHs1 and lizard sIgMas1 (5′-GGTGATGGTTAGTGGCAAGTGT-3′, in which sIg is secretory Ig), then JHs2 and lizard sIgMas2 (5′-AGTGTTGCCAGTATCGGAGAGC-3′). IgD constant region cDNA was amplified using JHs1 and IgDTMas1 (5′-GTGCAACGACAAACAGAACCAT-3′), then JHs2 and IgDTMas2 (5′-GAAACTTTGTTCCAGACGT-3′). IgY full-length cDNA (secreted form) was amplified using JHs1 and IgYas1 (5′-TCTGGGATAGGACAAGGAGGGC-3′), then JHs2 and IgYas2 (5′-GGGCTGTGATGCTGGATAGAGA-3′), whereas IgY(ΔFc) cDNA was amplified using primers JHs1 and IgYDFcas1 (5′-TTTATATCCTGCCCTTCTCA-3′), then JHs2 and IgYDFcas2 (5′-CAACAATTAGATGCCCATAC-3′). The IgY membrane-bound form cDNA was amplified using primers JHs1 and lizard IgYTMas1 (5′-AAATTGCCATGCTTATAGGAGA-3′), then JHs2 and IgYTMas2 (5′-CTTGGCCAGGGACTTCAGCTTT-3′). The DNA polymerase used for amplification of IgD cDNA was LA-Taq DNA polymerase (Takara), whereas Taq DNA polymerase (Tiangen) was used for amplification of IgM and IgY cDNA. All the resulting PCR products were cloned into a T vector (pMD19; Takara) and sequenced. The sequences have been deposited in the NCBI GenBank under the following accession numbers: EF683585 and EF690357–EF690361 (http://www.ncbi.nlm.nih.gov/sites/entrez).
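In a nested PCR, the second-round (inner) primers anneal inside the first-round product. As a small illustration, the sketch below checks how the inner primers listed above relate to the outer ones by finding the longest stretch that ends the outer primer and begins the inner primer. The primer sequences are copied from the text; the helper name `suffix_prefix_overlap` is hypothetical, and whether the overlaps were a deliberate design choice is not stated in the paper.

```python
def suffix_prefix_overlap(outer: str, inner: str) -> str:
    """Longest suffix of `outer` that is also a prefix of `inner`."""
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer[-k:] == inner[:k]:
            return outer[-k:]
    return ""


# Primer sequences as printed in the Methods (written 5' to 3').
JHS1 = "GGGGCCAAGGAACATTTCTG"
JHS2 = "ACATTTCTGGCTGTAACTTCAG"
IGMTMAS1 = "TTGATGACAGTCACGGTGGCAC"
IGMTMAS2 = "TGGCACTGTAGAACAGGCTCAC"

overlap_jh = suffix_prefix_overlap(JHS1, JHS2)
overlap_tm = suffix_prefix_overlap(IGMTMAS1, IGMTMAS2)

for label, shared in [("JHs1/JHs2", overlap_jh),
                      ("IgMTMas1/IgMTMas2", overlap_tm)]:
    print(f"{label}: inner primer starts {len(shared)} nt "
          f"before the outer primer's 3' end ({shared})")
```

For both pairs the inner primer begins within the 3′ end of the outer primer and extends further in the direction of synthesis, which is consistent with a nested design.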
The μ (membrane-bound form), δ (membrane-bound form), and υ (secreted form) cDNAs amplified as described above were used as probes, all of which were labeled using a PCR digoxigenin probe synthesis kit (Roche). Hybridization and detection were conducted following the manufacturer’s instructions for digoxigenin hybridization.
Tissue expression of the lizard IGH genes
Tissue expression of membrane-bound and secreted IgM was detected by PCR using the primers IgMCH3s (5′-CATCAGTGGCATCCCTCACA-3′) and IgMTMas3 (5′-ATCCCATCTAGGTAAGCCGT-3′) or lizard sIgMas2. Full-length IgY expression was detected using primer IgY detections (5′-TTTCCTGATCCGGTCACAGTGC-3′) and IgY detections (5′-AGCCCTTCTTCACTTTCCAGAT-3′), whereas IgY(ΔFc) expression was detected using primer IgY detections and IgYDFc detections (5′-CAACAATTAGATGCCCATAC-3′). Expression of IgD was detected by a nested PCR using primers JHs1 and IgDTMas1, then JHs2 and IgDTMas2.
Analysis of the expressed Ig isotypes in the small intestine
Small intestine total RNA was reverse-transcribed using the primer NotI-d(T)18 (5′-AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT-3′) and Moloney murine leukemia virus reverse transcriptase. One round of 3′-RACE PCR was conducted using a mixture of JH2s (5′-CTGGGGGAAAGGCATGATCGTTGTC-3′, JH2 derived) and JH8s (5′-CTGGGGGGATGGGACATTCCTCCTCGTA-3′, JH8 derived) as sense primers and R2 (5′-AAGAATTCGCGGCCGCAGGAA-3′; this primer was designed based on the anchor sequence of the primer NotI-d(T)18) as an antisense primer. The 3′-RACE PCR parameters were 94°C for 5 min for 1 cycle, followed by 40 cycles of 94°C for 30 s, 66.5°C for 30 s, and 72°C for 90 s, and a final extension at 72°C for 7 min. The polymerase used was LA-Taq DNA polymerase (Takara). The resulting ∼1.6-kb PCR products were cloned into pMD19-T vector (Takara) to generate an Ig cDNA minilibrary. The white clones (after blue-white screening) were subjected to PCR screening using primers M13F (5′-AGCGGATAACAATTTCACACAGG-3′) and M13R (5′-AGACAGGGTTTTCCCAGTCACGAC-3′) for positively recombined clones containing the correct insert size; IgM1 (5′-ATCCGTGTAGAGACTATTGC-3′) and IgM2 (5′-GGTTTACCCGTGTTCTTGTC-3′) for IgM-positive clones; and IgY1 (5′-AGAAATCTGCTTGAGGGGAC-3′) and IgY2 (5′-CTTTTGGCCATCCCATTGTTG-3′) for IgY-positive clones. The clones containing the correct insert size but negative for both IgM and IgY were sent for sequencing to identify the nature of the inserts.
DNA and protein sequence computations
DNA and protein sequence editing, alignments, and comparisons were performed using the DNAStar program. Phylogenetic trees were made using MrBayes3.1.2 (33) and viewed in TreeView (34). Multiple sequence alignments were performed using ClustalW. Accession numbers for the sequences (http://www.ncbi.nlm.nih.gov/sites/entrez) used in phylogenetic analysis are as follows. α or χ gene: chicken, S40610; cow, AF109167; duck, AJ314754; human, J00220; mouse, J00475; platypus, AY055778; Xenopus laevis, BC072981. δ gene: catfish, U67437; cow, AF411245; dog, DQ297185; fugu, AB159481; human, BC021276; horse, AY631942; mouse, J00449; pig, AF411239; rat, AY148494; sheep, AF515673; Xenopus tropicalis, DQ350886; zebrafish, BX510335. γ gene: human, J00228; mouse, J00453; platypus, AY055781. ε gene: cow, AY221098; human, J00222; mouse, X01857; platypus, AY055780. μ gene: catfish, X52617; chicken, X01613; cow, AY145128; duck, AJ314754; human, X14940; horned shark, X07781; lungfish, AF437724; mouse, V00818; nurse shark, M92851; platypus, AY168639; skate, M29679; trout, X65261; X. laevis, BC084123; zebrafish, AF281480. υ gene: chicken, X07175; duck, X78273; X. laevis, X15114. ζ/τ gene: zebrafish, AY643752; trout, AY872256. ω gene: sandbar shark, U40560; lungfish, AF437727; nurse shark, U51450.
Results
The lizard IGH locus
Using the turtle μ gene sequence as a query (27), a BLAST search was performed against the green anole lizard whole-genome shotgun sequences deposited in the Ensembl database. A 369-kb genomic contig (scaffold 869) was found to contain sequences related to those of the turtle μ. An examination of this scaffold revealed that it contained the 3′-region of the lizard IGH locus, with VH, DH, and JH segments and μ, δ, and υ genes being readily identifiable and, as expected, arranged in a VH-DH-JH-μ-δ-υ order (Fig. 1; see supplemental Figs. 1 and 2 for deduced DH and JH segments). We subsequently performed Southern blotting using the μ, δ, and υ cDNA probes, revealing that the hybridization bands either were consistent with the expected patterns or could be readily explained by loss or gain of restriction enzyme sites (polymorphisms in an outbred species) (Fig. 2). This, to a certain extent, confirmed the accuracy of the assembly of the IGH locus.
The lizard μ gene
Four exons encoding sequences homologous (43.8% similarity at the overall protein level) to the turtle μ CH1–4 sequences were identified downstream of the JH locus. Phylogenetic analysis indicates that the identified gene is the lizard μ gene (Fig. 3). Alignment of the lizard μ sequence (inferred amino acids) with those from other species revealed a distinct pattern with regard to the distribution of noncanonical cysteines; i.e., those cysteines not involved in the intradomain disulfide bond (supplemental Fig. 3). Surprisingly, the cysteine residue in the N-terminal region of CH1, typically involved in disulfide bonding of H and L chains, is absent (supplemental Fig. 3). This was confirmed both in the retrieved genomic sequence and by sequencing of RT-PCR products generated using forward primers in the JH region and reverse primers in the 3′-region of the μ sequence. In contrast, three noncanonical cysteines were observed in the CH2 domain (supplemental Fig. 3). The cysteine residues that covalently polymerize pentameric IgM in mammals (35) are also found in the CH3 and secreted tail of the lizard μ (supplemental Fig. 3). Three potential N-linked glycosylation sites, one in CH1, one in CH2, and one in the secreted tail, are conserved in reptiles, birds, and Xenopus (supplemental Fig. 3). The first and third (invariant NVS) glycosylation sites are also conserved between reptiles and mammals (36, 37), suggesting a functional importance (37, 38).
To examine the splicing pattern of the lizard μ gene transcript, we used primers in JH, TM, or secreted tail sequences to perform nested RT-PCR amplifications. This approach revealed the presence of 4-CH secreted and membrane-bound forms (VDJ-CH1-CH2-CH3-CH4-s; VDJ-CH1-CH2-CH3-CH4-TM). Surprisingly, a short, membrane-bound encoding transcript, lacking the first two CH domains (i.e., VDJ-CH3-CH4-TM) was also observed (Fig. 4; supplemental Figs. 4–6). However, Northern blotting suggested that this short IgM transcript was a very minor form as it was not detectable using this approach (data not shown). Transcripts encoding the secreted form were detected only with four CH exons (VDJ-CH1-CH2-CH3-CH4-s) in both PCR and Northern experiments (Figs. 4 and 5, supplemental Fig. 4, and data not shown). Transcripts encoding both the membrane-bound and secreted forms of μ were detectable in multiple tissues (Fig. 5).
Characterization of the rearranged VDJ-Cμ fragments
To analyze the expressed VDJ-Cμ sequences, we used four VH sense primers that matched most of the VH genes identified in the current genome assembly (including those not assembled to scaffold 369), and an antisense primer derived from the IgM CH1 to amplify the rearranged VDJ-Cμ fragments by RT-PCR. We sequenced 92 clones, which provided 34 uniquely rearranged VDJ junctions (supplemental Fig. 7). Among these 34 clones, only 1 (LDA24) was expressed from the 8 VH gene segments (VH7) identified in scaffold 369. The frequency of JH utilization was JH1 (0), JH2 (7), JH3 (1), JH4 (3), JH5 (4), JH6 (5), JH7 (5), JH8 (7), and JH9 (2); see supplemental Fig. 7. The most frequently used DH segments were DH11 (or DH20, n = 10) and DH2 (or DH4, n = 6). The average length of the CDR3 was 8.9 codons, which is nearly the same as that in Xenopus (8.6 codons) and mice (8.7 codons) (39). On average, the D gene plus N and P nucleotides contributed 43.1% of the CDR3 length, a ratio lower than that observed in X. laevis (57%) and adult mice (57%) (39).
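The JH usage counts above can be tabulated directly to check that they account for all 34 unique junctions and to express usage as proportions. The sketch below uses only the numbers given in the text; the variable names are illustrative, and the conversion of the 43.1% contribution into codons and nucleotides is simple arithmetic on the reported averages rather than a value reported by the authors.

```python
# JH usage among the 34 unique VDJ junctions, as listed in the text.
jh_counts = {
    "JH1": 0, "JH2": 7, "JH3": 1, "JH4": 3, "JH5": 4,
    "JH6": 5, "JH7": 5, "JH8": 7, "JH9": 2,
}

total_junctions = sum(jh_counts.values())
for name, n in jh_counts.items():
    print(f"{name}: {n:2d} ({100 * n / total_junctions:.1f}%)")

# Segments sharing the highest count.
top = max(jh_counts.values())
most_used = [name for name, n in jh_counts.items() if n == top]
print("Most used:", most_used)

# Average D + N + P contribution, derived from the reported averages
# (8.9 codons of CDR3, 43.1% contributed by D plus N and P nucleotides).
mean_cdr3_codons = 8.9
dnp_fraction = 0.431
dnp_codons = mean_cdr3_codons * dnp_fraction
print(f"D + N + P: about {dnp_codons:.2f} codons "
      f"({3 * dnp_codons:.1f} nt) on average")
```

The counts sum to 34, matching the number of unique junctions, with JH2 and JH8 tied as the most frequently used segments (7 junctions each, about 21%).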
Identification of the lizard δ gene
An array of 11 C domain-encoding exons spanning ∼12 kb was identified further downstream of the μ gene. Two typical Ig transmembrane region-encoding exons were also observed 1 kb downstream of the CH11 exon. A BLAST search using the translated protein sequence from the 11 exons showed homology to the Xenopus and fugu IgD H chains, supporting the notion that it is the δ ortholog in lizards. This conclusion was also supported by a phylogenetic analysis (Fig. 3 and supplemental Figs. 8 and 9).
All the δ CH domains show C1-set Ig superfamily structure with seven strands, and canonical cysteines at invariant positions (Ref. 40 and supplemental Fig. 10). In the CH1, the second canonical tryptophan is replaced by a phenylalanine, in CH4 by a glycine, and in CH8 by a glutamine. As in Xenopus IgD (and in lungfish IgW; Refs. 6, 12, and 41), the N-terminal region of CH1 contains two consecutive cysteines, either of which could potentially provide the disulfide bond to the L chain.
Domain to domain comparisons of the lizard and human δ sequences revealed a relatively high identity between the lizard CH7 and the human CH2 (30.8%), and between the lizard CH8 and human CH3 (38.3%; supplemental Table I). Two potential N-linked glycosylation sites in the CH7 and CH8 are conserved across species when compared with the mammalian δ sequences (supplemental Fig. 10). Strikingly, CH10 contains seven potential N-linked glycosylation sites.
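Percent identity values such as those quoted above are computed from pairwise alignments. The sketch below shows one common definition (identical residues divided by aligned, ungapped columns). This is an assumption for illustration: the paper does not state which denominator the DNAStar program uses, and the toy sequences are invented, not lizard or human δ domains.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this definition
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Invented toy alignment: 6 ungapped columns, 5 of them identical.
toy_identity = percent_identity("ACDE-GH", "ACDFWGH")
print(f"{toy_identity:.1f}%")
```

Other definitions divide by the alignment length or by the length of the shorter sequence, so values from different programs are not always directly comparable.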
The presence of 11 δCH exons suggested that the lizard δ gene may be expressed as a long, multiple domain H chain as in Xenopus and bony fish (6, 9, 12). However, a nested RT-PCR using primers derived from the conserved JH sequence and the predicted δTM sequence generated a single product of ∼1.4 kb. Sequencing of this amplicon showed that the first 4 δCH exons were directly spliced to the δTM exon, with the remaining exons (CH5–CH11) being spliced out (Fig. 4 and supplemental Fig. 11). We were unable to identify the 3′-end of a secreted form using 3′-RACE PCR, due, perhaps, to the overall low level of expression of δ transcripts. The encoded lizard IgD H chain is thus more similar in size to mammalian IgD than to that of bony fish and Xenopus (Fig. 6). The membrane form of the lizard δ gene, however, appears to be weakly expressed given that even a two-round, nested RT-PCR generated only weak amplicons in the spleen and lung (Fig. 5).
Identification of the lizard υ gene
Approximately 17 kb downstream of the δTM2 exon, a third Ig gene consisting of four CH domain-encoding exons, plus two transmembrane region-encoding exons, was identified. A comparison of sequences, as well as phylogenetic analysis, indicated that this was the lizard υ gene (Fig. 3). The entire gene, including four CH- and two TM region-encoding exons, spans ∼8 kb of DNA.
Alignment of the lizard IgY H chain (υ) sequence with those of other species revealed several conserved features (supplemental Fig. 12). These included the three cysteines (potentially linking H chains and L chains) in the N-terminal region of the CH1, and three noncanonical cysteines located in the N- and C-terminal regions of the CH2 domain, all of which are conserved across the species examined. The two potential N-linked glycosylation sites in the lizard CH2 and CH3 are also found in birds, but not in Xenopus.
RT-PCR using primers for the JH and secreted tail (or TM region) followed by sequencing of the amplicons revealed that the lizard IgY is expressed in secreted and membrane-bound forms that contain all four C-region exons (Fig. 4 and supplemental Figs. 13 and 14).
IgY(ΔFc) is expressed in the lizard
Based on serological data, a truncated IgY form has previously been reported in turtles (28). This is reminiscent of the situation in ducks, in which both full-length IgY and IgY(ΔFc), lacking the Fc region (CH3 and CH4), have been reported (16, 30). To determine whether a truncated form is also expressed in the green anole lizard, we searched the CH2-CH3 intron for a potential transcriptional termination signal (or polyadenylate addition signal, AATAAA). Such signal sequences were found 464 bp downstream of the CH2 exon. To identify the putative expression of the truncated IgY H chains, RT-PCR was conducted using primers for the predicted 3′-untranslated region and conserved JH sequences. The sequences of the amplicon generated showed that a truncated IgY transcript was indeed present (Fig. 4 and supplemental Figs. 15–17) but was expressed much more weakly than the four-CH forms in many tissues, as shown by Northern blotting (data not shown).
IgA-encoding gene is absent in the lizard
An α gene was also recently cloned in a reptile, the leopard gecko (15). However, neither the intron between the lizard δ and υ genes (∼17 kb) nor the DNA (∼238 kb in the retrieved genomic scaffold) downstream of the lizard υ gene contained any potential Ig CH domain-encoding sequences. We also searched the entire lizard genome and did not find any potential IgA CH-encoding sequences in any other contigs, suggesting an absence of the IgA-encoding gene in the lizard. To further confirm this point, an Ig cDNA-specific minilibrary was constructed using total RNA derived from the small intestine where IgA, if present, is predicted to be highly expressed. By screening of the library using PCR, we obtained 302 clones containing ∼1.6-kb inserts, of which 267 contained an IgM sequence, whereas 22 were IgY. No IgD cDNA or additional Ig isotypes were found in this minilibrary; the remaining clones were all shown to have non-Ig sequences. Taken together, these data strongly suggest that there is no IgA-encoding gene in the lizard.
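The clone counts above can be turned into proportions, and into a rough statement of how rare an undetected isotype would have to be to escape this screen. The sketch below is an illustration only and is not an analysis performed by the authors. It assumes the clones are independent draws and that any additional isotype would have been amplified and cloned as efficiently as IgM and IgY; under those assumptions, observing zero clones among n gives a one-sided 95% upper bound of 1 - 0.05^(1/n) on its frequency.

```python
# Clone counts reported for the small-intestine minilibrary.
total_clones = 302
igm_clones = 267
igy_clones = 22
non_ig_clones = total_clones - igm_clones - igy_clones

for label, n in [("IgM", igm_clones), ("IgY", igy_clones),
                 ("non-Ig", non_ig_clones)]:
    print(f"{label}: {n} ({100 * n / total_clones:.1f}%)")

# Upper bound on the frequency of an isotype seen in 0 of the Ig clones
# (assumes independent sampling and equal cloning efficiency; illustrative).
ig_clones = igm_clones + igy_clones
upper_bound = 1 - 0.05 ** (1 / ig_clones)
print(f"0 of {ig_clones} Ig clones: frequency below "
      f"{100 * upper_bound:.2f}% at 95% confidence")
```

Under these assumptions, a transcript making up more than about 1% of intestinal Ig H chain messages would probably have been seen, which is in line with the genomic evidence but does not by itself prove absence.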
Putative switch regions are present upstream of the μ and υ but not δ genes
Expression of H chain isotypes other than IgM (and IgD) in tetrapods is accomplished through class switch recombination, in which a functional VDJ region is translocated from upstream of an expressed μ gene to a position immediately 5′ of the H chain gene to be expressed (42). The recombination is mediated by switch (S) regions, which are located upstream of the μ gene and of the other H chain genes (except for δ, which is expressed by RNA processing rather than by class-switch recombination). The switch regions are typically repeat rich (in mammals; Ref. 42) and contain short repetitive sequences rich in AGCT motifs (43). Analysis by dot-plot under relatively relaxed conditions (window, 30 bp; percent match, 70%) of the JH-μ intron identified an ∼3.2-kb DNA repetitive sequence block (supplemental Fig. 18a). In this 3.2-kb DNA region, 12 CTGGG (both strands) and 18 CTGAG (both strands) motifs, which characterize the mammalian Sμ, were found. The region is also rich in the AGCT motifs (supplemental Fig. 18a) that can promote isotype switching (43), suggesting that this region is most likely the Sμ. Using the same method, an ∼5-kb Sυ was also identified (supplemental Fig. 18c). When dot-plot analysis was applied to the μ-δ intron, some repetitive sequences (mainly due to the presence of two (ATT)n microsatellite sites) could be identified. This intron is, however, very much less rich in AGCT motifs than are the putative Sμ and Sυ regions (supplemental Fig. 18b), suggesting that there is no Sδ region in the lizard. The δ gene is thus likely to be expressed through cotranscription together with the μ gene, followed by alternative processing of the primary transcript, as occurs in other species that express IgD.
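The two analyses described above, counting short motifs on both strands and finding internal repeats with a windowed dot-plot, can be sketched as follows. This is a minimal illustration on an invented toy sequence, not the software used in the study; the function names are hypothetical, and real dot-plot programs may score windows differently. Counting "both strands" is implemented here as counting the motif plus its reverse complement on the given strand, with palindromic motifs such as AGCT counted once.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    pos = seq.find(motif)
    while pos != -1:
        count += 1
        pos = seq.find(motif, pos + 1)
    return count


def count_both_strands(seq: str, motif: str) -> int:
    """Count a motif on both strands; palindromic motifs are counted once."""
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    total = count_overlapping(seq, motif)
    if rc != motif:
        total += count_overlapping(seq, rc)
    return total


def self_dotplot(seq: str, window: int = 30, min_identity: float = 0.70):
    """Return (i, j) pairs, i < j, whose windows match at >= min_identity."""
    seq = seq.upper()
    last = len(seq) - window
    dots = []
    for i in range(last + 1):
        for j in range(i + 1, last + 1):
            same = sum(a == b for a, b in
                       zip(seq[i:i + window], seq[j:j + window]))
            if same >= min_identity * window:
                dots.append((i, j))
    return dots


toy = "CTGGGAAACCCAGAGCT"  # invented sequence for illustration
motif_counts = {m: count_both_strands(toy, m)
                for m in ("CTGGG", "CTGAG", "AGCT")}
print(motif_counts)

repeat_dots = self_dotplot("ACGTACGTACGT", window=4, min_identity=1.0)
print(len(repeat_dots), "matching window pairs")
```

The pairwise window comparison is quadratic in sequence length, which is acceptable for an intron of a few kilobases but would need a faster method for whole scaffolds.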
Discussion
Making use of the recently released whole-genome sequence combined with experimental investigation of gene expression, we report that the green anole lizard expresses three IgH isotypes (IgM, IgD, and IgY) but no IgA. These findings have implications for our understanding of the evolutionary history of the vertebrate IGH locus and the Ab classes it encodes (Fig. 7). These results, as discussed below, also allow a more complete reconstruction of the evolutionary history of the vertebrate IGH locus and suggest conclusions (some unexpected) regarding the structural and functional plasticity of this locus and of the Abs that it encodes.
First, the demonstration of IgD in a lizard makes clear that IgD, an Ab once thought to be present only in primates, shows an evolutionary continuity from fish, through the amphibia and reptiles to the mammals (Fig. 7). In those taxa that are lacking IgD (e.g., chickens, ducks, and rabbits), its absence is clearly the result of a loss of the δ gene from the IGH locus. It is perhaps surprising that Ab classes can be lost without compromising the immune defenses of an animal, but this theme is illustrated not only by IgD, but also by IgA, which is a class of Ab highly expressed in mammals and which is also believed to be a critical component of mucosal immunity (44). The fact that IgA appears to be present in some reptiles, such as the gecko (15), but not in the anole lizard suggests that it is not essential for immune defense in reptiles. In humans, some IgA-deficient individuals show marked susceptibility to infection, whereas others do not, which is thought to reflect increased (and compensatory) IgM secretion (44). In teleost fish that do not produce IgA, its function is fulfilled by IgM (45). Considering that IgM shares the same transport receptor with IgA (or IgX) in mammals and birds and Xenopus (46, 47) and that it is highly expressed in the lizard intestine (Fig. 4), it is likely to play a role in mucosal immunity when IgA is absent. Thus, the fact that an absence of IgA poses no apparent problem for immunity in the lizard exemplifies the plasticity and redundancy of the vertebrate Ab system.
The failure to find the α gene in the lizard was unexpected, given that IgA is present in both mammals and birds, and its evolutionary origins must therefore have predated the divergence of the synapsid (mammalian) and diapsid (bird, reptile) lineages. The demonstration of a gene encoding the IgA H chain in the leopard gecko (Eublepharis macularius) (15) is compatible with this scenario; thus, the absence of IgA in the lizard most likely represents the loss of the α gene from the IGH locus of this species. The α gene recently cloned from the gecko (15) was proposed to have originated from a recombination between the μ and υ genes, with the first two CH domains being derived from the υ CH1–2 and the last two derived from μ CH3–4, a scenario that also applies to bird α and Xenopus IgX H chains (15). The hypothesis that α originated from a recombination between the μ and υ genes explains why only IgM and IgA can form polymeric Abs and also why IgM and IgA share the same polymeric Ig receptor (46, 47). Indeed, sequence comparisons of the gecko α with the lizard μ and υ show strikingly high similarities at the inferred protein level between the gecko α CH1–2 and the lizard υ CH1–2 (59.0% identity), and between the gecko α CH3–4 and the lizard μ CH3–4 (59.6% identity; supplemental Table II and supplemental Fig. 19). Dot plot analysis of the lizard μ and υ with the gecko α gene supports the same conclusion (supplemental Fig. 20). Although IgA contains four CH domains in birds and reptiles, mammalian IgA possesses only three and a hinge region. Phylogenetic analysis suggested that the CH2 domain (of 4-CH IgA) was lost in mammals (supplemental Fig. 21).
The position of IgA-encoding genes in the IGH locus differs greatly between birds and mammals. In mammals it is found at the 3′-end of the locus, but in birds it is found (in an inverted transcriptional orientation) between the μ and υ genes. It is possible that, following its creation, the α gene was in an unstable chromosomal location that favored deletion and/or recombination events, thus explaining the presence of the α gene in different sites in birds and mammals and its absence in the lizard.
A broad overview of the Abs present in the jawed vertebrates, including the lizard, leads to two conclusions. First, certain classes of Ab (such as IgD or IgA) can be either present or absent from otherwise related species without any apparent effects on fitness. Second, structural variants of Abs (such as IgY) that are predicted to impair important functions are also well tolerated. An example of the latter is the presence in the lizard of two forms of IgY, one of which is of normal size (four-C-domain H chain), but the other is truncated. The observation in the lizard of a message that encodes a truncated form of IgY, termed IgY(ΔFc) (16), mirrors findings made in ducks (30) and in turtles (28, 29). The IgY(ΔFc) of the lizard appears to be a structural equivalent of a F(ab′)2 fragment, with a VH-CH1-CH2 domain structure of the H chain. Such a fragment would be predicted to maintain virus-neutralizing functions but to be inactive in opsonization and complement fixation (16, 48), although some potential selective advantages of IgY(ΔFc) have been proposed, such as an inability to initiate anaphylactic reactions (48). In this study, it was clearly shown that the truncated υ chain of the lizard has a genetic basis different from that described in ducks. In ducks, the C-terminal sequence of the υ(ΔFc) chain results from the use of a small additional exon that is present in the CH2-CH3 intron (supplemental Fig. 17) (30). However, the C-terminal encoding sequence and 3′-untranslated region of the lizard IgY(ΔFc) transcripts are located immediately downstream of the CH2 exon, encoding a slightly longer tail (GKTSCGLH; supplemental Figs. 16 and 17). Thus, processing of the primary υ transcript in the lizard involves a choice between splicing from the donor site at the end of the CH2 exon to the acceptor site of the CH3 exon or cleavage/polyadenylation of the transcript at a site downstream of the CH2 splice donor site. 
This is reminiscent of the alternative RNA processing pathways associated with the expression of the secreted vs membrane-bound forms of Ig H chains (49). The expression of the membrane form of an Ig H chain typically involves splicing of the TM1 exon into a cryptic site (consensus GG/GTAAA) within the terminal secreted exon. The CH4 exons of the lizard μ and υ genes both use cryptic splice sites of this sequence for generating messages that encode the membrane-anchored form of IgM or IgY. In contrast, the splice donor site at the 3′ end of the CH2 exon of the lizard υ gene (AG/GTAAG) conforms more closely to the canonical splice donor sequence (AG/GTGAG).
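The statement that the CH2 donor site conforms more closely to the canonical sequence can be checked by counting mismatches between the three sequences quoted above. The sketch below uses only those sequences (with the exon/intron slash removed); the dictionary keys and the function name are illustrative. A simple mismatch count treats every position equally, which is a simplification, since real splice-site strength depends on position-specific preferences.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Donor-site sequences as quoted in the text, written exon/intron.
CANONICAL = "AG/GTGAG"
sites = {
    "cryptic site in CH4 (consensus)": "GG/GTAAA",
    "lizard υ CH2 donor": "AG/GTAAG",
}

canonical = CANONICAL.replace("/", "")
distances = {name: hamming(seq.replace("/", ""), canonical)
             for name, seq in sites.items()}

for name, d in distances.items():
    print(f"{name}: {d} mismatch(es) to {CANONICAL}")
```

The CH2 donor differs from the canonical sequence at a single position, whereas the cryptic consensus differs at three, consistent with the comparison made in the text.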
Thus, the IgY(ΔFc) of the lizard and of the duck arise from different genetic events and are examples of convergent evolution. This would argue for a selective advantage for the presence of IgY(ΔFc) in the Ab responses of lizards, sea turtles, and ducks, but the biological significance of IgY(ΔFc) expression in any of these species has yet to be established.
Overall, these observations on the Ig genes of the lizard reinforce several conclusions concerning the vertebrate IGH locus and the Abs it encodes. First, this locus shows remarkable evolutionary flexibility in the number of classes of Ab that are expressed in different species of jawed vertebrate. Furthermore, the patterns of expression (Fig. 7) show little consistency, outside the universal presence of IgM. Second, any single class of vertebrate Ab (such as IgD, IgA, or IgY) can show substantial structural variation, in ways that do not clearly correlate with obvious differentiated functions. Third, it is clear that Abs show both structural plasticity and functional redundancy. Although the evolutionary history of the Ab family is now better described, we are still far from understanding the functional significance of the remarkable diversity that has been uncovered.
The authors have no financial conflict of interest.
This work was supported by National Science Fund for Distinguished Young Scholars Grant 30725029, Program for New Century Excellent Talents in University of China, National Key Basic Research Program Grant 2006CB102100, and National Natural Science Foundation of China Grant 30671497.
This material is based in part on work supported by the National Science Foundation. Any opinion, finding, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Abbreviations used in this paper: IGH, Ig H chain gene; IGL, Ig L chain gene; BLAST, basic local alignment search tool; NCBI, National Center for Biotechnology Information; IgSF, Ig superfamily; TM, transmembrane region; sIg, secretory Ig.
The online version of this article contains supplemental material.