T lymphocytes or T cells are key components of the vertebrate response to pathogens and cancer. There are two T cell classes based on their TCRs, αβ T cells and γδ T cells, and each plays a critical role in immune responses. The squamate reptiles may be unique among the vertebrate lineages by lacking an entire class of T cells, the γδ T cells. In this study, we investigated the basis of the loss of the γδ T cells in squamates. The genome and transcriptome of a sleepy lizard, the skink Tiliqua rugosa, were compared with those of tuatara, Sphenodon punctatus, the last living member of the Rhynchocephalian reptiles. We demonstrate that the lack of TCRγ and TCRδ transcripts in the skink are due to large deletions in the T. rugosa genome. We also show that tuataras are on a growing list of species, including sharks, frogs, birds, alligators, and platypus, that can use an atypical TCRδ that appears to be a chimera of a TCR chain with an Ab-like Ag-binding domain. Tuatara represents the nearest living relative to squamates that retain γδ T cells. The loss of γδTCR in the skink is due to genomic deletions that appear to be conserved in other squamates. The genes encoding the αβTCR chains in the skink do not appear to have increased in complexity to compensate for the loss of γδ T cells.
This article is featured in Top Reads, p.
T cells are essential to the immune responses of all jawed vertebrates. There are two ancient types of T cells, the αβ and γδ T cells, present in all major gnathostome lineages and defined by the protein composition of their Ag-binding TCR (1) (Fig. 1) (2–4). αβ T cells, with a TCR composed of TCRα- and TCRβ-chains, are comparatively well studied and perform a variety of roles in the immune system, including promoting Ab responses, killing virus-infected and tumor cells, stimulating inflammation, and suppressing or regulating immune responses (5). Ag recognition by the αβTCR involves the binding of Ags, or fragments of Ags, also bound or presented by cell-surface molecules encoded by the genes of the MHC (6).
The γδ T cells, possessing a TCR composed of TCRγ- and TCRδ-chains, are functionally more enigmatic than the αβ T cells. The γδTCR has been found to bind Ags presented on MHC molecules, as well as Ags presented on non-MHC molecules and free Ags (7). γδ T cells have been found to play a role in early stress responses, wound healing, and a wide variety of immune responses to cancer and infection (7). In addition, the relative ratios of αβ to γδ T cells vary among species. In humans and mice, for example, γδ T cells make up a small fraction (<10%) of circulating T cells. In contrast, species such as cows and pigs are known as “high γδ T cell” species and have a larger fraction (up to 40%) of circulating γδ T cells (8).
The genes encoding the four TCR chains, TRA, TRB, TRG, and TRD, are well conserved from mammals to sharks (1). As with the Ig, the genes encoding the ligand-binding V domains of each TCR chain exist in an unassembled form in the germline DNA. During T cell development, TCR gene segments are recombined to create the exon encoding the V domain (9). The TRB and TRD loci assemble the V exon using three gene segments, the V, D, and J gene segments, whereas the TRA and TRG loci use two gene segments, V and J. Unique combinations of gene segments in individual developing T cells contribute to the diversity of TCRs and clonally unique Ag specificity.
Among the four conventional TCR chains, TCRδ has demonstrated the greatest evolutionary plasticity. The TRD locus is nested within the TRA locus in most species examined, and TCRδ can share V gene segments with TRA, utilizing either TRDV or TRAV gene segments to generate diversity (10) (Fig. 2). In a growing list of vertebrates, the TRA/D loci contain V genes termed TRDVH. TRDVH genes appear to have been translocated from the IgH locus and are used to encode TCRδ-chains (11–14) (Fig. 1). How TRDVH genes influence γδ T cell function or Ag recognition, if at all, remains unknown. A more extreme example of TCRδ plasticity has been found in marsupials and monotremes. These lineages contain a fifth TCR chain, TCRμ, with three extracellular domains, two V and one C domain, that appears to have evolved from a TCRδ ancestor (15, 16). TCRμ pairs with TCRγ, defining the γμ T cell lineage, unique to marsupials and monotremes (17) (Fig. 1).
Recent reports have detailed the inability to identify TRG and TRD transcripts and/or genes in a broad range of squamate reptile species (lizards and snakes) (18–20) (R.D. Miller, unpublished observations). If confirmed, this would make squamates the first known major vertebrate lineage to specifically lack γδ T cells. To investigate the genetic basis of the loss of γδ T cells in some reptiles, we compared the recently published genome and transcriptome of the sole surviving rhynchocephalian, the tuatara Sphenodon punctatus (4), to that of the Australian sleepy lizard, Tiliqua rugosa, a member of the large Scincidae family of squamates. Although separated by >250 million years, the Rhynchocephalia are the closest living relatives to the Squamata that retain the TRG and TRD genes, serving here as a genomic outgroup to the squamates (4, 19).
Materials and Methods
Spleen transcriptome data were generated from two T. rugosa, one from Western Australia and one from South Australia. The West Australian animal was an adult of unknown sex that was brought to the Kanyana Wildlife Rehabilitation Center (Lesmurdie, WA, Australia) where it was euthanized. The South Australian animal was an adult male archived at the South Australian Museum as SAMAR71619. All animal procedures were conducted under ethics permits from Flinders University E454-17 under permits FO25000145 and A23436.
The T. rugosa genome was assembled from PacBio HiFi consensus reads (Bioplatforms Australia Data Portal—AusARG PacBio HiFi 102.100.100/350719) using hifiasm (21) with default parameters. The animal used was the same individual, SAMAR71619 (South Australian Museum), used for the splenic transcriptome. The primary assembly consists of 144 contigs with an N50 of 125.5 Mb. Contigs containing TRA and TRB sequences were identified by BLASTn using putative V and C gene sequences from the transcriptome analyses (see below). When regions of interest were fully contained within unitigs, these were used in preference to contigs to avoid potential phasing issues. A match to the S. punctatus TCRG locus C gene was not found in the assembly; however, AMPH and STARD3 that flank this locus in other vertebrates were identified by BLASTn analysis. PacBio reads mapped back to utg000010l using minimap2 (22) revealed that several individual PacBio reads spanned the distance between the two loci, confirming that there was no misassembly in this region. Unitigs identified and analyzed in T. rugosa were as follows: TRA, utg000827l; TRB allele 1, utg000549l; TRB allele 2, utg000354l; and AMPH and STARD3, utg000010l. Contigs identified and analyzed in S. punctatus were as follows: TRA/D, QEPC01010493.1; TRB, QEPC01009940.1; and TRG, QEPC01001482.1, QEPC01006510.1, and QEPC01003628.1 in GenBank (https://www.ncbi.nlm.nih.gov). The S. punctatus genome assembly (ASM311381v1, GenBank accession number GCA_003113815.1) was searched to identify scaffolds containing the TRA/D, TRB, and TRG loci. Gallus gallus TCR sequences used to search the S. punctatus whole-genome assembly were as follows: TRA, EF554728.1; TRB, EF554755.1; TRG, NM_001318455.1; and TRD, AF175433.1 in GenBank (https://www.ncbi.nlm.nih.gov).
To identify the TRA and TRB transcripts in T. rugosa, previously published sequences were used (20). The TCR C regions identified were used to identify transcripts in a previously published dataset (23). The outputs were analyzed for V regions and C regions based on conserved motifs. When partial sequences were identified, they were then used to screen for sequences containing full-length V or C regions. Sequences identified were then used to query a PacBio transcriptome with BLASTn in a local database. The same process by which V and C regions were identified in the 454 transcriptome was applied to the PacBio transcriptome. Sequences for T. rugosa TRG and TRD were sought using TRG and TRD genes from Gallus gallus (NM_001318455.1 and AF175433, respectively), Alligator mississippiensis (NW_017708624.1 and NW_017707656.1, respectively), Mus musculus (NC_000079.7 and NC_000080.7, respectively), Alligator sinensis (NW_005842396.1 and MT081963.1, respectively), Actinemys marmorata TRG (ML756264 and VOGN01000000), Podocnemis expansa TRG (ML682116.1), Crocodylus porosus TRG (NW_017728886.1), Chrysemys picta bellii TRG (NW_007281368.1), Platysternon megacephalum TRG (QXTE01000076.1), Carettochelys insculpta TRG (ML683784.1), and Gavialis gangeticus TRG (NW_017729006.1), but TRG and TRD could not be identified.
The S. punctatus transcriptome assembly (GGNQ00000000.1) GenBank accession numbers of all sequences used are in Supplemental Table I. The T. rugosa transcriptome sequences reported in this study have been deposited into GenBank under accession numbers OL311598–OL311653 (https://www.ncbi.nlm.nih.gov).
Annotation and characterization
Non-TCR gene models were predicted using GenSAS (24) with reference sequences for non-mammal vertebrates, and then BLAST was used on all predicted coding sequences against the GenBank database to assign functions. Genomic V and J gene segments were located by comparing scaffolds to cDNA and by identifying the flanking recombination signal sequence. The D segments were identified by searching for flanking recombination signal sequence and by using CDR3 sequences, which represent the V–D–J junctions, from aligning cDNA clones to the genomic sequence. V genes were designated TRAV, TRDV, TRDVH, or TRBV by utilizing three methods of annotation: 1) National Center for Biotechnology Information’s BLASTp or tBLASTn algorithms were used to assess similarity to TCR homologs from various species retrieved from GenBank; 2) transcriptomes were mined to assess the mRNA splicing of V segments to TRAC or TRDC segments; and 3) phylogenetic analysis based on V gene nucleotide alignments from various species using MEGA version X (25). V gene nucleotide sequences corresponding to framework regions 1–3 were aligned using ClustalW (26).
Gene segments were annotated following the IMGT nomenclature (27). Gene segments are numbered according to their location from the 5′ to 3′ end of the locus. V gene designations contain the family number followed by the gene segment number when there is more than one V gene in a family. V gene families were determined by ≥80% nucleotide sequence identity using ClustalW alignments and identity matrices constructed through BioEdit (28). TRAJ segments are numbered from the 3′ to 5′ end of the locus following IMGT and proposed by Koop et al. (29). Supplemental Table II contains the location of TCR gene segments within each scaffold.
Phylogenetic trees based on the nucleotide alignments were constructed using MEGA version X (25) and iTOL (30) and the neighbor-joining method (31). Evolutionary distances were processed using the maximum composite likelihood method (32). GenBank accession numbers for VH and TRDVH phylogenetic analyses are as follows: Homo sapiens, X64147.1, MT078123.1, EF633766.1, MN514921.1, and AM408141.1; Mus musculus, K01569.1, Z37145.1, X03398.1, and NG_007044.1; Ornithorhynchus anatinus, AF381307.1 and JQ664694.1; Alligator sinensis, JQ479335.1, MT081963.1, and EU588358.1; and Monodelphis domestica, EU588365.1, EU588373.1, and EU588383.1 (https://www.ncbi.nlm.nih.gov). A. sinensis sequences were provided by Wang (10).
Recent reports that transcripts encoding TCRγ- and TCRδ-chains could not be found in any squamate reptile imply that snakes and lizards may be the first lineage of jawed vertebrates naturally lacking γδ T cells (18–20). To investigate this further, the genome of T. rugosa was compared with that of S. punctatus. S. punctatus is the only surviving member of the Rhynchocephalia, the order of reptiles closest to the squamates and one that retains the TRD and TRG genes (19) (Figs. 1–3). Using sequences for avian and crocodilian TCR C region genes, we were able to identify TRA and TRB sequences in a splenic transcriptome from two T. rugosa individuals but failed to identify any TRG or TRD transcripts (data not shown).
The lack of TRG and TRD transcripts in the T. rugosa spleen transcriptome, as well as that of other squamates (Refs. 18–20 and this study), raises the question of whether the genes encoding these two receptor chains remain in the genome. Using the TRA transcriptome sequences, we identified the genomic regions encoding the TCRα-chains in T. rugosa and the TRA/D locus in S. punctatus (Figs. 3–5). The syntenic genes immediately flanking the TRA/D locus are well conserved among amniotes (10), which enabled confirmation that orthologous regions were being compared (Fig. 2). The entire TRD region and a significant number of TRAV genes are absent from the T. rugosa locus, relative to the S. punctatus locus. Consistent with this, the genomic region containing the TRA genes is substantially shorter, by ∼400 kb, in T. rugosa than the TRA/D locus in S. punctatus (Figs. 2, 3A, 4, and 5). An analysis comparing germline S. punctatus and T. rugosa TRAV revealed that most clades included sequences from both species, except for one. One clade contained only S. punctatus TRAV sequences, and these were among the most TRD-proximal V genes in the locus (Supplemental Fig. 1). In contrast, all other TRAV clades included both S. punctatus and T. rugosa TRAV sequences. This finding would be consistent with a deletion in the squamate lineage that included not only the TRD genes but some of the proximal TRAV genes as well (Fig. 3).
Searching the T. rugosa genome with the S. punctatus TRGC sequence failed to identify homologs (data not shown). As an alternative strategy to search for a T. rugosa TRG locus, a contig containing the AMPH and STARD3 loci was identified and analyzed (utg000010l). AMPH and STARD3 have conserved synteny with TRG in all amniotes so far examined, with AMPH flanking TRG on the 5′ or V side and STARD3 on the 3′ or C region side (33). A search of the T. rugosa genomic sequence between AMPH and STARD3 also failed to identify sequences resembling TRG genes (data not shown). Indeed, the distance between AMPH and STARD3 is <5 kb, consistent with a large deletion of the genomic region expected to contain TRG in T. rugosa (Fig. 3B). The TRG locus in the S. punctatus genome is spread over at least three contigs; however, it was possible to order these contigs based on their content. In contrast to T. rugosa, the distance between AMPH and STARD3 in S. punctatus would be >150 kb (Fig. 3B). Only four TRGVs were identified in the S. punctatus genome, consistent with what was estimated for this species previously (19).
Furthermore, the deletion associated with the TRG locus in T. rugosa is also present in the Anolis carolinensis and Podarcis muralis genomes, and likely common to all or most squamates (data not shown). In contrast, we were able to identify genes and transcripts encoding all four TCR chains in both the genome and a blood transcriptome database of S. punctatus, further confirming that the rhynchocephalians retained the γδ T cell lineage. Collectively, these results are consistent with the loss of the γδ T cells continuing to be a general feature of squamates and associated with deletions of the genomic regions containing the TRD and TRG genes (18–20).
The absence of the TRG and TRD genes in squamates raises the question of whether there is any evidence of increased, compensatory diversity of αβ T cells in this lineage by increasing complexity of the TRA and TRB genes. This does not appear to be the case. Rather, the T. rugosa locus is less complex than that of S. punctatus with respect to the germline TRAV genes (Table I) (10, 12, 14, 33–35). Indeed, T. rugosa has 86% fewer TRAV genes when compared with S. punctatus. The T. rugosa and S. punctatus TRB loci were also identified based on their flanking genes with conserved synteny in other amniotes and analyzed for their germline content (data not shown). Similar to other squamates analyzed, the T. rugosa TRB locus contains a relatively small number of TRBV genes (18) (Table I). The low TRBV gene complexity appears to be a general feature of lepidosaurs, as S. punctatus also has fewer TRBV genes than many other vertebrates (Table I).
|Species .||.||VA .||JA .||CA .||VD .||VHD .||DD .||JD .||CD .||VB .||DB .||JB .||CB .||VG .||JG .||CG .||References .|
|TCRβ||12||2||10||1||(12, 33, 35)|
|Species .||.||VA .||JA .||CA .||VD .||VHD .||DD .||JD .||CD .||VB .||DB .||JB .||CB .||VG .||JG .||CG .||References .|
|TCRβ||12||2||10||1||(12, 33, 35)|
TRDV genes are included with the TRAV genes.
Allele 1/allele 2.
Xenopus includes sequences from both X. tropicalis and X. laevis.
Several vertebrate lineages have been found to use V gene segments that are more closely related to IgH genes in their TCRδ repertoire (10, 12–14, 16). These TRDVH genes are located within the TRA/D locus. S. punctatus also has TRDVH, supporting the presence of these atypical V genes in the last common ancestor of the Diapsida. The S. punctatus TRA/D locus contains 11 TRDVH genes, and transcripts encoding the TCRδ-chain containing TRDVH were identified in the blood transcriptome (data not shown). Four TRDVH genes are located 3′ of Cδ and appear not to be used. S. punctatus TRDVHs are related to VH from all three ancient VH clans, similar to Chinese alligators (10) (Fig. 6).
γδ T cells, along with αβ T cells and B cells, are the three lineages of lymphocytes that generate clonally diverse Ag receptors via a process of somatic DNA recombination (9). This trichotomy is ancient and predates the evolution of gnathostomes. Three distinct types of lymphocytes that generate clonally diverse Ag receptors also exist in the jawless vertebrates, with features like B cells, αβ T cells, and γδ T cells (36). The genes encoding the TCR chains of αβ and γδ TCRs are also both ancient and relatively conserved (1).
A search of the genome and splenic transcriptome of a species of skink, T. rugosa, identified transcripts and genes encoding the TCRα- and TCRβ-chains. There was no evidence, however, of transcripts or genes encoding TCRγ- or TCRδ-chains. Nor was there evidence of remnants of the TRG and TRD loci in the T. rugosa genome. Rather, the genomic regions expected to contain these genes, based on well-conserved syntenic blocks, appear to contain large deletions. In the case of the TRA/D locus, the deleted region includes some of the TRAV genes in addition to the TRDV and TRDVH genes and the rest of the TRD coding sequences.
The role of gene and genome duplication and expansion in vertebrate evolutionary history is well established (37, 38). Gene loss as well has contributed to vertebrate speciation and adaptation (39). The evidence that T. rugosa’s loss of γδ T cells is accompanied by genomic deletions is noteworthy considering the evidence of loss of key components of the adaptive immune system from the genomes of other vertebrates. For example, some species of anglerfish lack the components necessary for the development of functional B and T cells. In most cases the relevant genes are still present in the genome but have been pseudogenized (40). Similarly, Atlantic cod and apparently all other Gadiformes are missing elements of the T cell arm of their adaptive immune response, specifically MHC class II genes and CD4 (41, 42). No traces of the MHC class II genes were found in the cod genome; however, a CD4 pseudogene is present. In the cases of teleost fishes these gene losses are generally limited to a single species or set of closely related species. In contrast, the loss of γδ T cells in squamates affects a large group of vertebrates encompassing >10,000 species and dates the loss to >150 million years ago (43) (Fig. 1). The genomic deletions found in T. rugosa appear common to squamate species.
The absence of γδ T cells in squamates raises several questions. First, is there evidence that squamate αβ T cells have compensated for the lack of γδ T cells? One prediction might be that the complexity of αβTCR diversity may be increased in squamates. Comparative genomics between S. punctatus and T. rugosa, however, would not support this hypothesis. Rather, squamate TRA and TRB genes are less complex than those of S. punctatus and other amniotes. The TRBV genes in T. rugosa and S. punctatus were also limited when compared with TRAV. Relatively low numbers of TRBV genes appear to be the norm for squamates (18, 19). S. punctatus has fewer TRBV genes than does T. rugosa but it still has a lower complexity in its TRB locus than its TRA locus. Both species appear to have greater numbers of TRBV genes than do other squamates, even while having a less complex TRB locus. In conclusion, there appears to be no evidence of compensation for the loss of γδ T cells by an increasing complexity of germline gene segments. However, we cannot rule out the possibility of increased CDR3 complexity in the αβTCR due to junctional diversity. It is also noteworthy that there was no evidence that the TRB locus in T. rugosa contains V genes related to VH. This is noteworthy because TRB-related genes have been found linked to VH genes in other squamate species (20). In contrast, other tetrapod lineages have acquired VH-like V genes in the TRD locus (10, 12–14, 16). S. punctatus, as we have shown in the present study, also uses γδTCR containing either TRDV or TRDVH in peripheral blood T cells.
There is lower complexity of the TRB locus relative to the TRA locus in most amniotes such as the opossum, alligator, mouse, and human (33, 34, 44). The results from S. punctatus and T. rugosa are consistent with that pattern. During conventional T cell development, the TRB genes are rearranged first and require a minimum of two successful recombination events (9). Once TRB rearrangement is successfully completed, the developing T cell will begin recombination at the TRA locus. The higher complexity of the TRA locus with a greater number of available TRAV and TRAJ genes allows for more chances to achieve a successful rearrangement, resulting in improved probabilities of the T cell completing development with a functional TCR (45).
A significant implication for the loss of γδ T cells is in squamate immunity. In mammals such as humans and mice, αβ T cells make up most circulating T cells, whereas γδ T cells predominate in epithelial sites (7). Many of the roles described for γδ T cells in mammals appear conserved in the few non-mammalian species that have been examined, such as birds (46). One of the populations of γδ T cells in mammalian skin are a specialized dendritic epithelial T cell (DETC) expressing an invariant γδTCR, where they participate in a variety of functions including wound healing and general maintenance of the epidermis (47). Studies in knockout mice have demonstrated that, in the absence of γδ T cells, αβ T cells with a DETC morphology and more limited TCR repertoire appear in the skin (48). In this knockout situation it appears as if αβ T cells attempt to fill the niche in skin left by the missing γδ T cells. However, these skin αβ T cells are short lived and fail to compensate functionally for γδ DETCs (49). γδ T cell knockout mammals display a wide variety of other phenotypes including impaired function in wound healing and mucosal IgA responses, lower neutralizing Ab titers, decreases in bacterial clearance, and overall decreased survival (49–52).
Squamates are a large successful lineage of vertebrates that have adapted to a wide variety of environments. Squamates obviously evolved from an ancestral species that had γδ T cells, but the genes encoding the γδTCR appear to have been lost prior to the divergence of the extant species. They clearly have adapted to not having γδ T cells and are not the equivalent of an artificially generated knockout. Uncovering these adaptations is critical for understanding not only the squamate immune system, but also for the insights provided on the role of γδ T cells in species where they are present. Unfortunately, reptiles have been identified as vertebrate lineages for which immunological studies are particularly lacking (53). Current evidence supports a hypothesis that reptiles, in general, may be more dependent on innate immune mechanisms for defense relative to adaptive immunity (53). It appears unlikely that this is related to the loss of γδ T cells in squamates since non-squamate reptiles, such as turtles and crocodilians, which have γδ T cells, also appear to depend more on innate immune mechanisms (53).
Our knowledge of pathogen challenges is also limited in squamates, although information for T. rugosa is more comprehensive than for most (54, 55). The species and its interactions with parasites, primarily with ectoparasitic ticks, has been under continuous study for 40 y at a site in South Australia (56). In addition, tick-borne rickettsia and protozoan parasites have also been identified in ticks associated with T. rugosa (55). Future directions should focus on how αβ T cells or other components of the immune system may be compensating for the roles normally filled by γδ T cells in other vertebrates. T. rugosa is emerging as a model organism for these explorations.
T cells are a critical cell type to the immune responses of all jawed vertebrates, from sharks to humans. In the present study, we describe a comparative analysis between the genomes of the tuatara and a skink species, revealing genomic deletions and loss of genes encoding the chains of the γδ TCRs in the squamate reptiles. The γδ T cell lineage is ancient and present in all other jawed vertebrate lineages investigated so far, making the squamates the only known group to lack γδ T cells. In addition, the tuatara TCRδ-chains, like those of a growing list of species, use Ab-like V region genes, further demonstrating the evolutionary plasticity of TCRδ.
We give special thanks to Dr. Xifeng Wang, Institute of Zoology, Chinese Academy of Science, Beijing, China, for sharing the alligator TRDVH sequences. We thank the UNM Center for Advanced Research Computing for providing the high-performance computing and largescale storage resources used in this work. We also thank Amy Slender and Robert O’Reilly for extracting RNA and preparing cDNA for T. rugosa lizard samples.
This work was supported in part by National Science Foundation Award IOS-2103367 (to R.D.M.), National Science Foundation Graduate Research Fellowship Award DGE-1939267 (to K.A.M.), and by Australian Research Council Discovery Grant DP200102880 (to M.G.G., R.D.M., and T.B.). The tuatara genome and transcriptome sequencing were undertaken in partnership with Ngātiwai iwi, with financial support from New Zealand Genomics Limited, the Allan Wilson Centre, and the University of Otago to N.J.G. Sequencing of the Tiliqua rugosa genome and transcriptomes was funded by BioPlatforms Australia under the Amphibian and Reptile Genome Initiative and Holsworth Wildlife Research Endowment. Assembly of the T. rugosa genome was supported by computational resources provided by the Australian Government through National Computational Infrastructure under the National Computational Merit Allocation Scheme, the Australian National University Merit Allocation Scheme, and the Australian BioCommons Leadership Share initiative.
The Tiliqua rugosa transcriptome sequences presented in this study have been submitted to GenBank under accession numbers OL311598–OL311653.
The online version of this article contains supplemental material.
The authors have no financial conflicts of interest.