
Comparative analyses suggest that the MHC was derived from a prevertebrate “primordial immune complex” (PIC). PIC duplicated twice in the well-studied two rounds of genome-wide duplications (2R) early in vertebrate evolution, generating four MHC paralogous regions (predominantly on human chromosomes [chr] 1, 6, 9, 19). Examining chiefly the amphibian Xenopus laevis, but also other vertebrates, we identified their MHC paralogues and mapped MHC class I, AgR, and “framework” genes. Most class I genes mapped to MHC paralogues, but a cluster of Xenopus MHC class Ib genes (xnc), which previously was mapped outside of the MHC paralogues, was surrounded by genes syntenic to mammalian CD1 genes, a region previously proposed as an MHC paralogue on human chr 1. Thus, this gene block is instead the result of a translocation that we call the translocated part of the MHC paralogous region (MHCtrans). Analyses of Xenopus class I genes, as well as MHCtrans, suggest that class I arose at 1R on the chr 6/19 ancestor. Of great interest are nonrearranging AgR-like genes mapping to three MHC paralogues; thus, PIC clearly contained several AgR precursor loci, predating MHC class I/II. However, all rearranging AgR genes were found on paralogues derived from the chr 19 precursor, suggesting that invasion of a variable (V) exon by the RAG transposon occurred after 2R. We propose models for the evolutionary history of MHC/TCR/Ig and speculate on the dichotomy between the jawless (lamprey and hagfish) and jawed vertebrate adaptive immune systems, as we found genes related to variable lymphocyte receptors also map to MHC paralogues.

This article is featured in In This Issue, p.1681

The “2R hypothesis” has proposed that the early vertebrate genome experienced two rounds of genome-wide duplications (1). Indeed, there are four paralogous clusters of genes in the genomes of all jawed vertebrates, first studied in humans for homeobox and MHC genes (2, 3). When genes or genetic regions are duplicated, some loci preserve their original function, whereas others are modified (neofunctionalization or subfunctionalization) or may experience differential silencing. Other types of genome modifications may occur, such as translocation of block regions, at times blurring the origins of a particular genetic region.

As mentioned, the MHC was one of the original gene clusters noted for its paralogous regions (or “ohnologues”), found on human chromosomes (chr) 6 (MHC), 1, 9, and 19 (MHC paralogues [MHCpara]) (3, 4). Further analysis using the insulin/relaxin and neurotrophin/neurotrophin receptor family genes revealed that there are additional regions containing paralogous genes in a similar order (5–7), and it has been suggested that the precursors of these regions and MHCpara were syntenic during the preduplication era, but some were translocated over evolutionary time. These detached regions include sections of human chr 12, 14, and 15, and are generally shorter than the original regions; we refer to these detached regions as “minor MHCpara,” and the original four regions as “major MHCpara.”

The MHC harbors many genes involved in adaptive and innate immunity (6, 8). Central to the adaptive immune system, the Ag-presenting MHC class I and class II molecules work in concert with Ag-processing (immunoproteasomes), peptide-transporting (TAP), peptide-editing (DM, TAPBP), and other molecules, to present antigenic peptides recognized by TCR. Precursors of these genes were likely derived from the so-called primordial immune complex (PIC), predating the genome-wide duplications in early vertebrates (9). Indeed, analysis of several invertebrate deuterostome genomes [e.g., amphioxus (Branchiostoma lanceolatum) (10), and a placozoan (Trichoplax adhaerens) (11)] revealed conserved synteny of proteasome and “framework” genes (i.e., nonimmune genes in MHC). To date, and unfortunately, no candidate class I/II genes have been detected in species derived from ancestors predating the jawed vertebrates, and thus most genes strictly involved in adaptive immunity (based on MHC, Ig, TCR) seem to have appeared “suddenly” in a gnathostome ancestor. Because both MHC and MHCpara are derived from a preduplicated precursor region in a common vertebrate ancestor (3, 6, 9), analysis of these regions from different extant vertebrates provides insight into the evolutionary history of the MHC and its precursor.

Previous work on the paralogous regions has focused only on mammals. In this study, we took advantage of the published work in humans and focused on the genome of the amphibian Xenopus. Previous studies showed that the Xenopus genome is relatively stable and preserves some primordial features that were lost in other vertebrates (12), thus serving as a complementary model system to study genome evolution. We used the true diploid Xenopus tropicalis (13) and especially the tetraploid Xenopus laevis (14), in which the genomes have been recently sequenced and analyzed. In combination with comparative genomic analyses, we obtained evidence for the timing of emergence of MHC class I/II and AgR genes. We further propose a model for the evolution of the human chr 1q21.1–23.3 region, including the CD1 genes, and reflect on the dichotomy between the jawed and jawless vertebrate adaptive immune systems.

We examined gene models (i.e., software-generated conceptual translation) in the scaffolds and genome assembly with subsequent manual validation/annotation. Additionally, we performed tblastn to find genes that were overlooked by the gene-finder software at the web portal. Chromosomal location of Xenopus genes was obtained based on the mapped BAC clones using fluorescence in situ hybridization (FISH) methods described elsewhere (15). All information is publicly available through Xenbase (http://xenbase.org) (X. laevis v7.1 and 9.1, X. tropicalis v8 and 9) and the National Center for Biotechnology Information (NCBI) (http://ncbi.nlm.nih.gov). We found inconsistent assemblies among different X. tropicalis versions as well as between X. laevis and X. tropicalis. More extensive mapping has been done with X. laevis chromosomes, and thus the X. laevis genome was largely used for this study. Genomic data from vertebrates other than Xenopus were obtained from various databases in GenBank at NCBI. Gene models from the X. laevis genome are found at NCBI: VJC11310 (ACB47447); VJC1258 (OCT67647); VJC1406 (OCT69143-7); class Ib112 (XP_018111305); class Ib145 (OCT68671); class Ib16004 (XP_018109328). Note that these gene model-based sequences are predicted and thus may not always reflect the RNA sequence. We found that most Ig superfamily (IgSF) domains encoded within a single exon are reliable with occasional inaccurate exon-intron boundaries.

Synteny probability calculation was performed using the method described by Danchin et al. (16); we calculated the binomial probability that the Xenopus regions of interest are in synteny with their human counterparts or the probability that the genes were organized by chance. This probability is calculated using a binomial probability as:

where x is the number of human homologs found in the Xenopus region and p is the proportion of genes in the hypothesized human region (i.e., number of genes divided by the 20,199 total protein-coding genes in the human reference GRCh38 dataset at NCBI). This gives the probability that our selected Xenopus regions have the same complement of genes as humans by chance. To keep the gene criteria consistent, we obtained protein-coding genes from the Xenopus_laevis_v2 dataset at NCBI.

For all reported statistics, we included both hypothetical and duplicated genes in Xenopus to give a conservative estimate of the probability of synteny. Excluding these gene subsets does not change the interpretation and, if anything, lowers the probability of synteny by chance.
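The binomial calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: it assumes the upper-tail form P(X ≥ x) for X ~ Binomial(n, p), with n taken to be the total number of genes in the Xenopus region (the "Total in Xenopus Region" column of Table II). The function name `log10_binomial_tail` is hypothetical. Terms are summed in log space so that very small probabilities are not lost to floating-point underflow or rounding.

```python
import math

# Total human protein-coding genes quoted in the text.
HUMAN_PROTEIN_CODING_GENES = 20_199


def log10_binomial_tail(n, x, p):
    """Return log10 of P(X >= x) for X ~ Binomial(n, p), with 0 < p < 1.

    Assumption: the upper-tail form is used here for illustration.
    """
    # Log of each binomial term C(n, k) * p^k * (1 - p)^(n - k), k = x..n.
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
        for k in range(x, n + 1)
    ]
    # Log-sum-exp: factor out the largest term before summing.
    peak = max(log_terms)
    log_total = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return log_total / math.log(10)


# Inputs from the class Ib16004 row of Table II:
# 35 genes in the human region, 23 homologs among 51 Xenopus genes.
p = 35 / HUMAN_PROTEIN_CODING_GENES
result = log10_binomial_tail(n=51, x=23, p=p)
print(f"p = {p:.2e}, log10 P(X >= 23) = {result:.1f}")
```

The result is returned as a base-10 logarithm, which keeps the comparison between regions meaningful even when the probabilities are far below the resolution of double-precision arithmetic.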

X. laevis is an allotetraploid (4n) species, generated by hybridization of two divergent ancestral diploid (2n) Xenopus species (subgenomes long [L] and short [S]), and thus its genome contains sets of paired, or homeologous, chromosomes (i.e., 1L ∼ 9L and 1S ∼ 9S; n = 18). These two subgenomes have been independently maintained, with no detectable intergenome recombination (14). Genome-wide analysis further revealed that synteny is generally well conserved between L and S chromosomes, but gene loss, when it occurs [often the case for many adaptive immune genes (17)], is much more frequent on S chromosomes (14). Gene content of the L chromosomes is most similar to the genome of the true diploid X. tropicalis. Although most housekeeping genes are present on both chromosomes, most class I (except a few class I–like genes), AgR, and AgR-like genes discussed in this report were diploidized and thus found only on the L chromosomes, and therefore we focused our analyses on the L chromosomes.

The Xenopus MHC was previously mapped by FISH to chr 8 (18) and now is precisely mapped to 8Lq21. To identify Xenopus MHCpara, we used sets of paralogous hallmark genes that were originally used to identify the human MHCpara (huMHCpara) (3) (e.g., notch1, 2, 3, 4; pbx1, 2, 3, 4; rxra, b, g; and complement c3, 4, 5, a2m). Other conserved paralogues such as brd1, 2, 3, 4 were not all detected in the current Xenopus assemblies and thus were excluded from analyses. As in humans, we found the same four sets of clustered paralogous hallmark genes on Xenopus chromosomes: 8Lq21 (MHC), 4Lq24-25, 8Lp11-12, and 3Lq33-34, as well as orthologs of the human minor MHCpara on 1Lq and 7Lp23-24 (Fig. 1, Table I; hallmark genes in red).
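The hallmark-gene approach can be illustrated with a short sketch that groups hallmark genes by chromosome arm and reports arms carrying members of more than one gene family. The gene-to-arm assignments are taken from Table I. The function names and the `min_families=2` threshold are assumptions made for illustration and are not part of the published method.

```python
from collections import defaultdict

# (gene family, gene, X. laevis chromosome arm) as listed in Table I.
# PBX4 is reported as NA in Table I and is therefore omitted.
HALLMARKS = [
    ("notch", "NOTCH4", "8Lq"), ("pbx", "PBX2", "8Lq"),
    ("rxr", "RXRB", "8Lq"), ("complement", "C4", "8Lq"),
    ("notch", "NOTCH2", "4Lq"), ("pbx", "PBX1", "4Lq"),
    ("rxr", "RXRG", "4Lq"),
    ("notch", "NOTCH1", "8Lp"), ("pbx", "PBX3", "8Lp"),
    ("rxr", "RXRA", "8Lp"), ("complement", "C5", "8Lp"),
    ("notch", "NOTCH3", "3Lq"), ("complement", "C3", "3Lq"),
]


def families_by_arm(records):
    """Map each chromosome arm to the set of hallmark families found on it."""
    arms = defaultdict(set)
    for family, _gene, arm in records:
        arms[arm].add(family)
    return dict(arms)


def candidate_paralogons(records, min_families=2):
    """Arms carrying at least `min_families` distinct hallmark families.

    The threshold is a hypothetical choice for this sketch.
    """
    grouped = families_by_arm(records)
    return sorted(arm for arm, fams in grouped.items() if len(fams) >= min_families)


paralogons = candidate_paralogons(HALLMARKS)
print(paralogons)
```

Run on the Table I entries, the sketch returns the four chromosome arms that the text assigns to the MHC and the three major MHCpara.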

FIGURE 1.

MHC class I, AgR, and catalytic proteasome β-subunit genes are found in the human and especially Xenopus major and minor MHC paralogous regions. The locations of MHCpara marker genes correspond well between the human and Xenopus genomes. Two minor MHCpara are also shown because these regions contain significant marker genes (e.g., PSMB5) and other paralogues (e.g., TAPBPL) and thus harbor remnants of the ancestral linkage. Marker, or hallmark, genes are indicated in red, and psmb genes are in light blue. MHC class I/II, AgR, and NCR3 homologs are shown in green, blue, and purple, respectively. A VLR homolog, GP1BB, is also shown with a gray box. Corresponding chromosomes in human and Xenopus are shown side-by-side. NK receptor complexes (NKC, LRC) also map in minor MHCpara. Note that both minor MHCpara shown in this figure are likely derived from the human chr 19 precursor (see Fig. 6).

Table I.
Chromosomal locations of genes in human and Xenopus genomes
MHC and MHCpara

| MHCpara | Gene | Human chr. | X. laevis chr.ᵃ | Scaffold (v7.1) | Position (v7.1)ᵇ | FISH BAC | Position (v9.1)ᵇ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MHC-6 | TAPBP | 6p21.3 | 8Lq14-21 | 50694 | 6,954,053..6,969,886 | 108L10 | 50,739,635..50,755,022 |
| | RXRB | 6p21.3 | 8Lq14-21 | 50694 | 7,102,020..7,117,938 | 106L10 | 50,887,807..50,903,520 |
| | PSMB8 | 6p21.3 | 8Lq21 | 75398 | 274,622..283,843 | 290K18 | 78,508,537..78,523,720 |
| | PSMB9 | 6p21.3 | 8Sq21 | 12933 | 4,797,175..4,812,351 | 044A14 | 78,508,537..78,523,720 |
| | PBX2 | 6p21.3 | 8Lq21 | 75398 | 378,082..396,079 | 114D22 | 51,636,291..51,653,761 |
| | NOTCH4 | 6p21.3 | 8Lq21 | 75398 | 337,685..353,721 | 114D22 | 59,569,525..51,611,344 |
| | C4 | 6p21.33 | 8Lq21 | 75398 | 475,934..520,806 | 114D22 | 51,733,637..51,778,496 |
| | PSMB10 | 16q22.1 | 8Lq21 | 75398 | 524,686..539,832 | 114D22 | 51,782,368..51,796,858 |
| MHCpara-1 | NOTCH2 | 1p13-p11 | 4Lq25 | 78978 | 84,826..126,901 | 055J23 | 110,037,275..110,044,215 |
| | PBX1 | 1q23 | 4Lq24 | 47606 | 5,480,539..5,556,446 | 036M06 | 99,325,939..99,347,128 |
| | RXRG | 1q22-q23 | 4Lq25 | 78978 | 1,407,934..1,480,769 | 055J23 | 111,399,108..111,408,361 |
| MHCpara-9 | NOTCH1 | 9q34.3 | 8Lp12 | 37448 | 2,529,375..2,559,128 | 030B08 | 4,177,800..4,228,940 |
| | RXRA | 9q34.3 | 8Lp | 255149 | 96,949..211,615 | NA | 5,266,355..5,268,209 |
| | PBX3 | 9q33.3 | 8Lp11 | 403228 | 523,205..572,639 | 020L15 | 11,095,878..11,257,315 |
| | PSMB7 | 9q33.3 | 8Lp11-12 | 3586 | 2,248,619..2,282,130 | 227M14 | 9,754,478..9,780,669 |
| | C5 | 9q33.2 | 8Lp | 86205 | 1,227,102..1,317,602 | NA | 5,816,244..5,865,712 |
| MHCpara-19 | NOTCH3 | 19p13.2 | 3Lq33-34 | 171831 | 677,258..734,233 | 079J11 | 125,881,103..125,938,078 |
| | C3 | 19p13.3 | 3Lq34-35 | 175714 | 455,613..739,326 | 322O09 | 134,274,206..134,300,156 |
| | PSMB6 | 17p13.2 | 3Lq35 | 16004 | 50,127..57,691 | 017J04 | 139,511,604..139,519,183 |
| | PBX4 | 19p13.11 | NA | NA | NA | NA | NA |
| MHCpara-14 (minor) | IgLσ | NA | 1Lq12 | 39437 | 417,923..418,230 | 031N23 | 98,280,301..98,295,182 |
| | TRA | 14q11.2 | 1Lq15 | 29869 | 458,946..459,559 | 039F04 | 140,207,982..140,211,379 |
| | TRD | 14q11.2 | 1Lq15 | 272406 | 116,704..184,681 | 130J21 | 140,946,814..140,951,210 |
| | IgHMC | 14q32-33 | 1Lq14-15 | 13576 | 6,811,972..7,160,435 | 312E22 | 139,040,662..139,059,333 |
| | PSMB5 | 14q11.2 | 1Lq14 | 13576 | 6,389,514..6,394,129 | 244A12 | 138,627,523..138,632,499 |
| | IgLλ | 22q11.22 | 1Lq21 | 162663 | 1..140,765 | 159H19 | 153,417,276..153,418,351 |
| MHCpara-12 (minor) | TAPBPL | 12p13.31 | 7Lp23-24 | 79772 | 4,980,784..7,959,403 | 225A12 | 7,950,550..7,960,609 |
| | LAG3 | 12p13.31 | 7Lp23-24 | 79772 | 5,304,485..5,317,904 | 225A12 | 7,593,489..7,606,908 |
| | CD4 | 12p13.31 | 7Lp23-24 | 79772 | 5,359,817..5,371,805 | 225A12 | 7,539,588..7,551,576 |
| | A2M | 12p13.31 | 7Lp24 | 131666 | 1,275,208..1,307,453 | 307G18 | 5,334,645..5,366,890 |
| | CLEC2B | 12p13.31 | 7Lp24 | 131666 | 693,899..709,661 | 307G18 | 5,932,418..5,948,215 |

Class Ia/Ib and AgR genes

| Group | Gene | Human chr. | X. laevis chr.ᵃ | Scaffold (v7.1) | Position (v7.1)ᵇ | FISH BAC | Position (v9.1)ᵇ | Domains |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MHC class I and class I–like | 112 | | 1Lq12 | 72621 | 122,476..126,293 | 085N05 | 102,130,692..102,139,541 | a1,2,3; a1,2; a2 |
| | 145 | | 8Lq25 | 265107 | 1,565,727..1,581,290 | 012C13 | 87,117,299..87,129,697 | a1,2,3 |
| | Class Ia | 6p21.3 | 8Lq21 | 75396 | 164,448..242,219 | 290K18 | 51,482,854..51,498,908 | a1,2,3 |
| | XNC | | 8Lq31-32 | 26819 | 3,427,830..3,826,756 | 156D07 | 110,198,845..110,862,792 | a1,2,3 |
| | 16004 | | 3Lq35 | 16004 | 123,032..130,911 | 017J04 | 139,582,763..139,592,397 | a1,2,3 |
| | CD1 | 1q22-23 | | | | | | a1,2,3 |
| | MR1 | 1q25.3 | | | | | | a1,2,3 |
| | FCGRT | 19q13.33 | | | | | | a1,2,3 |
| | PROCR | 20q11.2 | | | | | | a1,2 |
| | ZAG | 7q22.1 | | | | | | a1,2,3 |
| | ULBP RAET | 6q25 | | | | | | a1,2,3 |
| AgR-like | 1310 | | 8Lp12 | 127590 | 359,968..365,248 | 209G21 | 1,072,952..1,075,438 | VC |
| | 258 | | 8Lq14-21 | 50694 | 22,116..25,167 | 106L10 | 43,808,224..43,815,003 | VC |
| | 406 | Lost? (1q22) | 8Lq31-32 | 115163 | Multigene family 221,846..1,754,674 | 033B12 | 104,468,021..106,003,445 | VC |
| | PTCRA | 6p21.1 | | | | | | C (loss of V?) |
| | IgLκ | 2p12 | 1Lp32-34 | 109418; 3467 | 2,725,506..2,725,994; 177,260..183,220 | 213L05; 146J08 | 9,199,747..9,212,091 | VC |
| | TCRβC | 7q34 | 7Lp23-24 | 230427 | 307,269..307,610 | 191H14 | 315,991..316,317 | VC |
| | TCRγC | 7p14 | 6Lp12-13 | 19169 | 498,099..551,608 | 045F01 | 62,074,212..62,074,523 | VC |
| NKp30 homolog | NKp30 | 6p21.3 | 4Lq25 | 35524 | Multigene family 2,568,835..2,569,428 | 166F02 | 118,024,408..118,452,984 | |
| | XMIV | (6p21.3) | 8Lq21 | 75398 | Multigene family 1,530,600..1,631,611 | 154P18 | 52,754,412..52,854,193 | |

ᵃ Mapping location based on v9.1.

ᵇ Beginning..end of positions in the scaffolds.
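The "beginning..end" position notation used in Table I can be read programmatically. The sketch below is only an illustration: the helper names `parse_span` and `span_length` are hypothetical, and the length calculation assumes that both coordinates are inclusive, which the table does not state.

```python
import re

# Matches "begin..end" where each coordinate may contain thousands separators.
_SPAN = re.compile(r"^\s*([\d,]+)\s*\.\.\s*([\d,]+)\s*$")


def parse_span(text):
    """Parse 'begin..end' into a (begin, end) tuple of integers."""
    match = _SPAN.match(text)
    if match is None:
        raise ValueError(f"not a begin..end span: {text!r}")
    begin, end = (int(part.replace(",", "")) for part in match.groups())
    return begin, end


def span_length(text):
    """Length in base pairs, assuming inclusive coordinates (an assumption)."""
    begin, end = parse_span(text)
    if end < begin:
        # Surface rows whose coordinates look inconsistent instead of
        # silently returning a negative length.
        raise ValueError(f"end precedes begin in {text!r}")
    return end - begin + 1


# TAPBP on scaffold 50694 (v7.1), as listed in Table I.
tapbp_v71 = parse_span("6,954,053..6,969,886")
tapbp_len = span_length("6,954,053..6,969,886")
print(tapbp_v71, tapbp_len)
```

Raising an error when the end coordinate precedes the beginning is a design choice that makes questionable entries visible during parsing.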

Proteasomes are the most abundant proteins in the cytoplasm and are required for cytosolic protein degradation and recycling pathways (19). Eukaryote proteasomes form a barrel-shaped catalytic tunnel with two identical outer rings composed of seven α-subunits and two identical inner rings composed of seven β-subunits. Only three β-subunits (PSMB5 [LMPX], PSMB6 [LMPY], PSMB7 [LMPZ]) are catalytically active. Upon immune stimulation, expression of three β-subunits, PSMB8 (LMP7), PSMB9 (LMP2), and PSMB10 (MECL1), is upregulated; these replace the constitutive subunits PSMB5, PSMB6, and PSMB7, respectively, to form the “immunoproteasome” that generates peptides preferable for class I binding (19). Because some prokaryotes possess only one type of β-subunit, it has been proposed that the genes encoding the catalytically active β-subunits, psmb5, 6, and 7, were generated by cis-duplication in a eukaryote ancestor, likely present in the proto-MHC (20, 21); indeed, β-subunit genes are found in linkage groups with MHC framework genes in preduplicated genomes in lower deuterostomes such as amphioxus (10, 21) and the placozoan T. adhaerens (11). All three immunoproteasome genes psmb8, 9, and 10 are encoded in the MHC of many ectothermic vertebrates (12, 22). In humans, only PSMB8 and PSMB9 are found in the MHC (chr 6), and PSMB10 on human chr 16 is the result of translocation out of the MHC. Likewise, the constitutive proteasome PSMB7 maps on huMHCpara-9 (i.e., huMHCpara chr 9) (light blue boxes in Fig. 1, Table I), but other PSMB genes were proposed to be translocated from their original location to other genomic regions outside MHCpara (20).

We found that Xenopus psmb6 maps to 3Lq35, in the vicinity of c3 and notch3, a region corresponding to huMHCpara-19, and we previously reported that Xenopus psmb10 maps in the MHC class III region (Fig. 1, Table I) (12), suggesting that the translocation of psmb6 and psmb10 occurred after the amphibian–mammal divergence. PSMB5 is found on human chr 14q11.2 in the vicinity of TCRA/D (14q11.2) and near the IgH chain (14q32.33) loci. This synteny is well conserved in Xenopus, with psmb5 on chr 1Lq14-15, near tcra/d (1Lq15), igh (1Lq14-15), and igl (λ and σ) (Fig. 1, Table I). As mentioned above, from the distribution of human insulin-relaxin genes (5), this region of human chr 14 is a genetic fragment originally linked to an MHC precursor, but translocated during vertebrate evolution, and is designated as a minor MHCpara (6, 7, 20) (Fig. 1, Table I). In summary, unlike in humans, all Xenopus psmb genes encoding catalytic proteasome β subunits map to major or minor MHCpara.
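The summary above can be restated as a small lookup structure. The sketch pairs each constitutive catalytic subunit with the immunoproteasome subunit that replaces it, as described in the text. It then records the X. laevis band and paralogous region for each gene as listed in Table I. The variable and function names are hypothetical and serve only to make the bookkeeping explicit.

```python
# Constitutive subunit -> immunoproteasome subunit that replaces it (from the text).
REPLACED_BY = {"PSMB5": "PSMB8", "PSMB6": "PSMB9", "PSMB7": "PSMB10"}

# X. laevis band and assigned region for each catalytic psmb gene,
# copied from Table I as listed there.
XENOPUS_PSMB = {
    "PSMB8": ("8Lq21", "MHC"),
    "PSMB9": ("8Sq21", "MHC"),
    "PSMB10": ("8Lq21", "MHC"),
    "PSMB7": ("8Lp11-12", "MHCpara-9"),
    "PSMB6": ("3Lq35", "MHCpara-19"),
    "PSMB5": ("1Lq14", "MHCpara-14 (minor)"),
}


def unmapped_catalytic_genes(replaced_by, locations):
    """Catalytic psmb genes that have no MHC or MHCpara assignment."""
    catalytic = set(replaced_by) | set(replaced_by.values())
    return sorted(gene for gene in catalytic if gene not in locations)


missing = unmapped_catalytic_genes(REPLACED_BY, XENOPUS_PSMB)
regions = sorted({region for _band, region in XENOPUS_PSMB.values()})
print(missing, regions)
```

An empty `missing` list corresponds to the statement that every catalytic psmb gene in Xenopus maps to a major or minor MHCpara.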

In Xenopus, a single classical class I (class Ia) gene maps to the MHC (23), whereas a cluster of nonclassical class I (class Ib) genes (xnc) (24, 25) was previously mapped to the telomeric region of the MHC chromosome (18). Now we report three additional nonclassical class I genes in the Xenopus genome designated class Ib112, class Ib16004, and class Ib145, based on their original scaffold numbers in version 4.1 (Table I). All three are single-copy genes on L chromosomes with typical class I domain structures, but the deduced amino acid sequences lack the evolutionarily conserved peptide-binding residues found in all classical class Ia molecules (Supplemental Fig. 1A); note that class Ib112 is highly divergent from class Ia (see below). In addition, consistent with their designation as nonclassical class I genes, these three class I genes are monomorphic (data not shown), show tissue-specific expression, and are expressed at much lower levels than class Ia (Supplemental Fig. 1E).

Whereas Xenopus MHC class Ia and the xnc cluster map to 8Lq21 and 8Lq31-32, respectively, the class Ib145 gene maps between the MHC and xnc (green box in Fig. 1, Table I). Based on phylogenetic analyses, the class Ib145 gene is intermediate in similarity to the Xenopus class Ia and class Ib genes (Supplemental Fig. 2). Interestingly, the class Ib145 gene is surrounded by genes mapping to human chr 14q13.2 (Supplemental Table I), near huMHCpara-14. The class Ib16004 gene, most related to the xnc genes (Supplemental Fig. 2), maps very near (only four genes apart) to psmb6 on 3Lq33-34 in an MHCpara (Fig. 1, Table I). The human class Ib gene FCGRT encoding the p51 subunit of the neonatal IgG Fc receptor (FcRn) is found in a similar gene location as Xenopus class Ib16004, but we could not establish orthology between these two genes in phylogenetic analyses or synteny (Supplemental Fig. 2). However, the synteny of the genes from class Ib16004 to psmb6 with human chr 17p13 is conserved (probability by chance: 3.33 × 10⁻¹⁶, Table II), further cementing the ancient class I–proteasome gene linkage. Most likely, this part of the MHCpara was translocated later in the vertebrate lineage.

Table II.
Probabilistic calculation of Xenopus synteny with human for regions of interest
| Region | No. of Genes in Hypothesized Human Regionᵃ | Pᵇ | Homologs in Xenopus Region | Total in Xenopus Region | Probability Human and Xenopus Share Genes by Chance |
| --- | --- | --- | --- | --- | --- |
| VJC11310 | 327 | 1.62 × 10⁻² | 172 | 220 | 3.89 × 10⁻¹⁵ |
| Class Ib112 | 1158 | 5.73 × 10⁻² | 139 | 279 | <1 × 10⁻¹⁶ |
| MHC | 216 | 1.07 × 10⁻² | 88 | 106 | <1 × 10⁻¹⁶ |
| MHC without butyrophilins | 150 | 7.43 × 10⁻³ | 85 | 103 | <1 × 10⁻¹⁶ |
| Class Ib16004 | 35 | 1.73 × 10⁻³ | 23 | 51 | 3.33 × 10⁻¹⁶ |
| GP1BB | 181 | 8.96 × 10⁻³ | 66 | 78 | <1 × 10⁻¹⁶ |

ᵃ Based on the human reference GRCh38, with 20,199 total genome-wide protein-coding genes.

ᵇ Proportion of the human genome found in the hypothesized syntenic region.
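The P column of Table II follows directly from footnotes a and b: each value is the number of genes in the hypothesized human region divided by 20,199. The short sketch below recomputes that column from the gene counts in the table. The dictionary and function names are hypothetical.

```python
# Total human protein-coding genes from footnote a of Table II.
TOTAL_HUMAN_GENES = 20_199

# Number of genes in each hypothesized human region, copied from Table II.
HUMAN_REGION_GENES = {
    "VJC11310": 327,
    "Class Ib112": 1158,
    "MHC": 216,
    "MHC without butyrophilins": 150,
    "Class Ib16004": 35,
    "GP1BB": 181,
}


def region_proportion(n_genes, total=TOTAL_HUMAN_GENES):
    """Proportion of human protein-coding genes found in the region."""
    return n_genes / total


# Format to three significant figures for comparison with the table.
proportions = {
    region: f"{region_proportion(count):.2e}"
    for region, count in HUMAN_REGION_GENES.items()
}
for region, value in proportions.items():
    print(f"{region}: {value}")
```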

Most conspicuously, the Xenopus class Ib112 gene maps between psmb5 and IgL on Xenopus chr 1Lq12 (Fig. 1, Table I), the region corresponding to the minor huMHCpara-14 described above that also contains TCRA/D and IgH/L genes. Consistent with its location on the ancient paralogue, class Ib112, like CD1, clusters outside of all other vertebrate class Ia and class Ib genes in the maximum likelihood phylogenetic tree, and somewhat less so in the neighbor-joining tree (Supplemental Fig. 2). We detected reptilian class I genes orthologous to Xenopus class Ib112 (Fig. 2A) that, where it was possible to examine, also map to this interesting paralogous region (Fig. 2B). Upon closer examination of the Xenopus chr 1L region, we found that class Ib112 is surrounded by genes that map to human chr 19p13 (Supplemental Fig. 3). Conservation of synteny was further evaluated, giving a probability by chance of <1 × 10⁻¹⁶ (Table II). It should be noted that the so-called UT class Ib genes in opossum (26) (also with reptilian orthologs) are also linked to the psmb10 gene in an MHCpara (GenBank accession NC_008801.1: region 685896657-705364100 [www.ncbi.nlm.nih.gov]). In summary, all three Xenopus class Ib genes map to MHCpara most likely derived from the chr 6/19 precursor, and two of them are linked to genes encoding constitutive catalytic proteasome β subunits.

FIGURE 2.

Evolutionarily conserved MHC class Ib112 among lower vertebrates. (A) Alignment of the class Ib112 genes from Xenopus and reptiles. Dots show residues identical to X. tropicalis 112. Dashes show deletions. An asterisk (*), 8, and b denote peptide-binding residues that are evolutionarily conserved among classical class Ia, CD8 binding sites, and β-2 microglobulin binding sites, respectively. Typical conserved amino acid residues for IgSF domains are highlighted in blue. GenBank accession numbers (obtained from ncbi.nlm.nih.gov) of the class Ib112: Chmy (Chelonia mydas: green sea turtle) XP_007069382; Pesi (Pelodiscus sinensis: Chinese soft-shell turtle) XP_014430793, XP_006126776, XP_014430792, XP_014430791, XP_014430790, XP_014430790; Chpib (Chrysemys picta bellii: painted turtle) XP_005313900, XP_008175642; Alsi (Alligator sinensis: Chinese alligator) XP_006037953; Almi (Alligator mississippiensis: American alligator) XP_019343116. (B) Conserved synteny of class Ib112 in amphibians and reptiles. Each box indicates a single gene. Red boxes represent the 112 class Ib genes. The number of genes varies depending on the species, and these genes could only be found in amphibians and reptiles. Data were retrieved from NCBI (www.ncbi.nlm.nih.gov/gene/).


Note that the positions of class Ib16004 and class Ib145 in the phylogenetic trees do not conform well to their ancient origins that we propose (Supplemental Fig. 2). At least in the case of class Ib145, its location on the same chromosome as the xnc and MHC might subject class Ib145 to gene conversion events that blur its age (e.g., the high similarity of class Ia to class Ib145 in the N-terminal region of the α2 domain and low similarity in the rest of the molecule, Supplemental Fig. 1A). Being in a paralogous region on a different chromosome than MHC/XNC, the clustering of class Ib16004 with Xenopus xnc class Ib genes in the trees is difficult to reconcile with its proposed origins at 1R. Considering the numerous class Ib genes in the frog genome (25) we speculate that there may be opportunities for gene conversion or other unknown mechanisms even among nonhomologous chromosomes.

As mentioned above, a large cluster of xnc class Ib genes maps to the telomere of the Xenopus MHC chr 8Lq31-32 (18), which is not assigned as an MHCpara (Figs. 1, 3, Table I, Supplemental Table I). In the MHC of Xenopus and other nonmammalian vertebrates, low numbers (or only one) of class Ia genes (22) are closely linked to the polymorphic psmb and tap genes (27, 28), forming a primordial “class I region” (29). Coevolution among the genes in the class I region has been suggested: strong linkage disequilibrium has been found in bony fish [between psmb and class Ia in medaka (30), and among psmb, tap, and class I in zebrafish (31)], and coevolution has been shown functionally in birds [tap and class Ia (32)]. The XNC loci were likely generated via cis-duplication of MHC class I genes and the subsequent translocation to a telomeric location, perhaps to limit recombination/gene conversion between the single MHC class Ia gene and class Ib (xnc) genes. A similar organization is found for the chicken MHC (B locus), where class Ib along with several class II genes map separately from the MHC in the telomeric region of the same chromosome (Y or Rfp-Y locus) (33) (see below). This secondary region also presumably arose by cis-duplication of MHC genes followed by translocation, but the situation in frogs and chicken is thought to have developed via convergent evolution. We further predict that the splitting of Xenopus class Ib genes from the MHC to the telomere likely allowed expansion of xnc genes and drove neofunctionalization. For example, xnc10-restricted NKT-like cells have been identified in Xenopus (34), and other xnc genes have prospective NKT partners (35, 36).

FIGURE 3.

Human chr 1q21.2–23.3 is likely a translocated MHCpara. (A) Comparative mapping of the Xenopus XNC region (top) and human chr 1q21-23 region (bottom). The gene cluster, including CD1 (purple boxes), mapping to the human chr 1q21-23 region has its counterparts in the Xenopus XNC region. The XNC maps to the telomeric region of the Xenopus MHC chromosome, and this region was proposed to be the result of translocation from the MHC. Other immune genes, such as slamf and fcr-like, are also found in this linkage group, suggesting the ancient linkage of these genes to the PIC (also refer to Fig. 5A). Furthermore, the presence of uninterrupted IgL-like (VJC1406) genes (shown in blue boxes) provides a strong case for ancestral MHC–AgR linkage. Marker genes such as KIRREL are also shown in red boxes. Only relevant genes are shown in this figure, and the complete list of X. laevis genes is provided in Supplemental Table I. The solid bar on the far right end of the Xenopus chromosome indicates the telomere. (B) Conserved synteny of novel AgR-like VJC1406 (PRARP) genes among frog, birds, and reptiles. VJC1406 genes, most related to IgL genes, were found in other vertebrate species besides Xenopus, and synteny is well conserved. Triangles indicate the 5′ to 3′ gene orientation. Red triangles represent AgR IgL-like genes, VJC1406. The number of genes varies depending on the species, and these loci have been lost in humans and bony fish. Although the genes are present in cartilaginous fish, there is no information on synteny. Synteny is consistent with previously published data (73); however, we focused more on the context of particular genes found in MHCtrans.

We found that the XNC region contains many genes mapping to human chromosomal region 1q21.1–23.3 (Fig. 3B, Supplemental Table I), specifically a block region surrounding CD1 genes (dotted box in Fig. 1). Previously, the 1q21.1–23.3 region was proposed to be a part of huMHCpara-1 (37). However, the proposed MHCpara regions are spread broadly over human chr 1, presumably because of a pericentric inversion on this chromosome (more details below), and thus the integrity of the conservation of the huMHCpara-1 has been questioned (37).

CD1 molecules are similar to MHC class Ia in their protein structure, association with β-2 microglobulin, and Ag-presentation capacity (38, 39). CD1 molecules, however, do not present peptide Ags to conventional T cells but rather lipid Ags to unconventional T cells such as NKT cells and γδ T cells, and thus are categorized as class Ib (40). Unlike MHC class Ia, which is expressed ubiquitously, CD1 expression is usually limited to APC, and the CD1 Ag-loading machinery is similar to that of MHC class II (41). It was originally proposed that CD1 genes were generated during 2R and subfunctionalized (42). However, the discovery of cd1 genes in the chicken MHC did not conform well to the 2R hypothesis (43–45). So far, two major hypotheses have been proposed to explain the timing of cd1 emergence and genome evolution. Salomonsen et al. (44) proposed that cd1 was generated by tandem duplication of MHC genes at the primordial state (0R), and that paralogous copies were silenced in all paralogous regions during genome duplications, rather than cd1 being a direct product of 2R. Miller et al. (45) proposed that cd1 may have arisen more recently, and that cd1 genes were later translocated to an MHCpara in mammals. The discovery that cd1 genes map to huMHCpara-19 in the Chinese alligator (46) (Fig. 4, Supplemental Table I) strongly suggests that cd1 arose pre-2R (reviewed in Refs. 47 and 48). Our discovery that the human chr 1q21.1–23.3 region contains genes whose Xenopus counterparts map to the XNC locus suggests a compromise scenario in which the block of human 1q21.1–23.3 genes, including CD1, was the result of a secondary translocation following the intrachromosomal translocation from the MHC (Fig. 4).
One caveat is that the cd1 genes of various bird species are found in linkage groups that are not consistent with each other, and most of them are not in an MHCpara (Supplemental Table I): human chr 1q25 (mallard and swan goose); 9q22.31 (egret, pigeon, crow, finch, manakin, killdeer, falcon, cuckoo, ibis); and 6q22.31 (eagles). If the synteny on 1q25 and 9q22.31 represents the original location, MHC class I could have existed even in the 0R ancestor (Fig. 1).

FIGURE 4.

Hypothetical scenario for the origins of the CD1 region. We propose that the CD1 gene was originally generated by tandem duplication of MHC class I/II precursor genes in the MHC, followed by subfunctionalization. Subsequently, part of the MHC region was translocated and differentially silenced, leaving MHC class I/II genes in the MHC, with CD1 in the translocated MHC region (MHCtrans). This MHCtrans was later translocated into another chromosome, coincidentally an MHCpara (shown here on human chr 1). Dotted boxes indicate silenced/pseudogenes. 2R, second round of whole-genome duplication. See text for further discussion.

In this article, we propose the following scenario (Fig. 4): cd1 was generated by tandem duplication from an MHC class I/II precursor, most likely pre-2R. Subsequently, the class I/II/cd1 genes were cis-duplicated and a block region was translocated to the telomeric region (translocated part of the MHCpara region [MHCtrans]), which allowed expansion of class Ib/cd1 genes. Later, a block region was further translocated to human chr 1q21.1–23.3, coincidentally in huMHCpara-1. During the process, MHC and CD1 loci experienced differential gene loss (loss of MHC class II and CD1 in Xenopus MHCtrans, and loss of MHC genes on human chr 1q21.1–23.3). Finally, expansion of certain genes occurred (class Ib genes [xnc] in Xenopus and CD1 genes in mammalian species including humans). Because most genes mapping to human chr 1q21–23.3 are in the Xenopus XNC region [including KIRREL (49)], whereas all hallmark genes for huMHCpara-1 map to Xenopus 4Lq24-25 with no homologs in both the XNC and 4Lq24-25 regions, translocation seems to be the simplest explanation. Note that the 3′-end of this translocation is at the telomere (Fig. 3A, Supplemental Table I), and the 5′-end contains large clusters of olfactory (OR) and vomeronasal (VNR) genes; both the telomere and repetitive genes may have played a role either in the translocation (especially the telomeric location) or original duplication.

To further examine the evolutionary timing of en bloc translocation of the 1q21.1–23.3 region, we searched for huMHCpara-1 orthologous regions in several representative vertebrates (Fig. 5A). As mentioned earlier, the huMHCpara-1 spreads onto both arms of chr 1, proposed to be partially a result of a pericentric inversion (37). For example, hallmark genes are split onto both arms of chr 1: NOTCH2 maps to 1p13-p11, whereas RXRG and PBX1 map to 1q23.3 (Fig. 5B). Similarly, notch2 maps separately from rxrg and pbx1 in the opossum genome. However, hallmark genes are closely linked in all nonmammalian species (on chr 8 in chicken; on chr 4q25 in Xenopus; and in the elephant shark genome) (Fig. 5B), suggesting that the pericentric inversion must have occurred in a mammalian ancestor. As in Xenopus, orthologous genes on human chr 1q21.1–23.3 are found on a separate chromosome in chicken (chr 25). Therefore, in both chicken and Xenopus, the regions orthologous to 1q21.1–23.3 are found on chromosomes different from those carrying the huMHCpara-1 hallmark genes, and thus it seems likely that the translocation of the 1q21.1–23.3 region occurred after the bird–mammal separation (Fig. 5A). Note that unlike Xenopus, the chicken MHC is not found on chr 25 (rather on chr 16); however, both chr 16 and 25 are microchromosomes, and we predict that these two chromosomes were split during bird evolution. There seem to have been different genome modifications among mammalian species, with multiple chromosomal breakpoints before the rodent/artiodactyla divergence (data not shown).

FIGURE 5.

(A) Origin of the translocated MHC (MHCtrans) region and its subsequent translocation in placental mammals. The region we describe as MHCtrans is found at the telomere of the MHC chromosome in Xenopus, chickens, and marsupials, suggesting that the translocation of MHCtrans to non-MHC chromosomes occurred in placental mammals. MHCtrans contains other immune genes such as SLAMF, FcR-like, and IgL-like (VJC1406), suggesting that ancestors of these genes were present in the primordial MHC (e.g., C2 and VJ-IgSF–encoding genes). MHC and MHCtrans are shown in solid green and dotted red boxes, respectively. (B) Inferring the timing of the p-q split on human chr 1. The chromosomal location of the NOTCH2 gene relative to the other marker genes, RXRG and PBX1, places the evolutionary timing of the p-q split between birds and mammals. The timing of the p-q split also correlates well with the translocation of MHCtrans into the 1q21.1–23.3 region (indicated by the hatched box including the KIRREL gene). MR1 is a nonclassical class I gene that maps to human chr 1, outside of the CD1 region; it is found only in the mammalian lineage, and its evolutionary origin is unknown.

In summary, we propose that the CD1 region in mammals is a result of a translocation event, by chance, into huMHCpara-1, and thus there is no strong evidence of class I genes on MHCpara-1 or -9. This is consistent with our hypothesis that a class I precursor gene may have arisen after 1R on only one of the duplicated chromosomes, chr 6/19 (Figs. 4, 6). Contrary to the existing hypothesis that class II predates class I (50–53), we further propose that class I emerged first in evolution because we have not found MHC class II genes anywhere outside the bona fide MHC or paralogous regions (54).

FIGURE 6.

Emergence and genomic evolution of MHC class I/II and AgR genes. We hypothesize that an AgR precursor was present in the PIC prior to the first round of whole-genome duplication (1R). Subsequently, MHC class I/II arose on the chr 6/19 precursor after 1R. We anticipate that the RAG transposon insertion did not occur until after the second round of genome duplication (2R), separating the VJ exon into separate exons only in genes on the chr 19 precursor. Hallmark genes are indicated in red and psmb genes are in light blue. MHC class I/II, AgR, complement, and NK receptors are shown in green, blue, orange, and purple, respectively.

When did the original MHCtrans (red arrow in Fig. 4) arise in evolution? We found it in amphibians (Fig. 5A), but it may be older. Families of class Ib genes in cartilaginous fish that are currently unmapped (55) may be a part of this original MHCtrans. Besides class I and AgR-like genes (see below) in MHCtrans, other immune-related genes such as fcrl (56, 57) and slamf (58) are also found in this region (Figs. 3, 5). Unlike class I and AgR, however, slamf and fcrl per se are not found in bona fide MHCpara and thus likely emerged soon after 2R in early vertebrates. We further predict that their origin, most likely, is from constant (C) 2–type IgSF precursors that were present in the PIC (e.g., KIR genes found on huMHCpara-19 are also derived from these precursors).

Linkage of TCR- and Ig-like genes in association with the primordial MHC has been previously suggested (59, 60). AgRs bear a rare, specialized C1-type IgSF domain (61) like those found in MHC class I/II, and thus one might predict their linkage to the primordial MHC. Human TCRA/D genes are found near PSMB5 (chr 14q11 in Fig. 1, Table I), also suggesting ancestral linkage of TCR to MHC. In the Xenopus genome, in addition to the close linkage of tcrad-psmb5, the igh locus (62) and igl genes (especially the λ isotype) are closely linked (Xen1q in Fig. 1). These locations strongly support the ancestral linkage of precursor AgR genes to the proto-MHC.

AgRs have a variable (V) domain with a signature IgSF “G” β-strand encoded in a separate element; in the germline of the simplest IgL, the V element encodes strands “A–F” and the J (joining) element encodes the “G” strand (61) (also shown in Supplemental Fig. 1B, 1C). It has been proposed that genes containing a single uninterrupted VJ element (i.e., exon) were present in the primordial MHC, near genes encoding C1-IgSF domains. Genes encoding these VJ and C1 domains likely combined to become AgR precursors (59, 60), and the RAG transposon (63–65) split one of the VJ single genes into separate V- and J- genetic elements (V-J). One candidate for such a precursor is the NCR3 gene encoding NKp30 (66). NCR3 contains a single VJ exon and maps to the MHC in most studied vertebrates (67). In Xenopus, a cluster of ncr3 genes maps to an MHCpara, 4Lq25 (68), whereas there is another set of genes having exactly the same domain structure (xmiv) mapping to the MHC (12) (dark purple boxes in Fig. 1, Supplemental Fig. 3). Whether or not ncr3 is immediately related to the ancestor of the AgR precursor, the xmiv and ncr3 genes are clearly derived from a common VJ precursor gene that was linked to the primordial MHC (Fig. 6, Supplemental Fig. 3) (67). Recently, genes with VJ-C2 structure were discovered in amphioxus (lancelet), an invertebrate deuterostome (69). Whether these genes are immediate relatives of the VJ ancestor or divergent descendants is debatable; however, one of the lancelet VJ-C2 genes maps adjacent to the kirrel gene, which maps next to CD1 genes in human chr 1q (dotted red box in Fig. 1), strongly supporting its relationship to the VJ precursor.

In addition to the previously identified IgH and L chains, and all four types of TCR genes, there are three novel Xenopus genes that encode a single VJ domain and a C1-IgSF domain, like TCR or IgL chains in a “pre-RAG transposon” state. All three genes are found in MHCpara and we designate them VJC1258, VJC1406, and VJC11310 based on their domain structure and scaffold number in ver 4.1 (light purple boxes in Fig. 1, Table I). VJC11310 is a single-copy gene (Supplemental Fig. 1B) mapping to Xenopus MHCpara-8Lp11-12. Preliminary BlastP analysis exhibited high identity with IgL from various vertebrates with highest similarity to the anole lizard (∼4 × 10⁻³¹), and spiny dogfish (shark; 5 × 10⁻²⁵). VJC11310 was previously reported to be a “germline-joined igl chain” (GenBank accession ACB47447 [www.ncbi.nlm.nih.gov]) (70). However, we mapped all three known rearranging IgL isotypes (λ, κ, σ) to Xenopus chr 1, whereas VJC11310 maps to a different MHCpara region (surrounding genes mapping in the huMHCpara-9 [Supplemental Fig. 3]; linkage probability by chance 3.89 × 10⁻¹⁵ [Table II]), making it highly unlikely that VJC11310 is a bona fide IgL. VJC1258 is also a single-copy gene (Supplemental Fig. 1C), maps upstream of the MHC, and is expressed in the Xenopus thymus (by northern blotting, data not shown). BlastP analysis using the VJ domain exhibited highest identity with IgL from various vertebrates with the highest match to coelacanth (4 × 10⁻³¹) and large flying fox (2 × 10⁻³⁰), whereas the C domain matched various cartilaginous fish IgH and IgL with much weaker (higher) E-values ranging from 1 × 10⁻⁹ to 9 × 10⁻⁵. The PreTα (PTCRA) gene, which encodes a single C1-IgSF domain and is so far found only in mammalian species (71), also maps upstream of the human MHC (striped box in Fig. 1). The prediction is that PTCRA originally had a V(J) domain, but it was lost in evolution (72).
It is possible that Xenopus VJC1258 was related to a precursor of preTα before loss of the V(J) domain, but phylogenetic analysis of VJC1258 and all AgR including preTα did not support this scheme (data not shown). Moreover, BlastP analysis using the C domain did not select PreTα in any other species, suggesting VJC1258 is not closely related to preTα. Regardless of their function and orthology to other genes, mapping of these AgR-like genes to all MHCpara strongly supports the idea that an AgR precursor was present at the 0R stage (i.e., PIC) (Fig. 6).

We also mapped a cluster of VJC1406 genes (Supplemental Fig. 1D) to the scaffolds with xnc genes in the MHCtrans region along with the genes mapping to human 1q21.1–23.3 (Fig. 3A, Supplemental Table I). Again, linkage of MHC class I to AgR-like genes is clear. We found VJC1406 orthologs in many species of reptiles, birds, and other species; during preparation of our article, VJC1406 orthologs were reported from chicken and named PRARP. PRARP genes were likely lost in mammals and teleost fish but are present in coelacanth and likely in sharks (73). The authors did not conclude that PRARP were AgR-like genes or MHC associated, but the chicken prarp genes were expressed in lymphocytes and thus potentially have an immune function, and they were proposed as candidates for invasion by the RAG transposon. Regardless of their functions, their synteny is well conserved among different vertebrate species (73) (Fig. 3B). In our study, we found a clear linkage of this gene family to MHC class I genes in the MHCtrans region of lower vertebrates (Fig. 3), further supporting the hypothesis that VJ-IgSF genes were present in the PIC.

In summary, VJ- and C1-IgSF–containing AgR-like genes are present in both major and minor MHCpara regions and MHCtrans, showing that they were present in the PIC before 1R. The consistent linkage of AgR-like and MHC class I genes on chromosomes derived from chr 6/19 after 1R further demonstrates that the presence of AgR precursors in the PIC predates the emergence of bona fide MHC class I genes (Fig. 6).

In a previous study, we proposed a scenario for the evolutionary emergence of TCRD/A and IgH genes (6, 62). In this study, we further examined the genome evolution of the TCRB/G genes. Whereas TCRA and TCRD genes are encoded in the minor huMHCpara-14, TCRB and TCRG genes map at both ends of human chr 7 (Fig. 7A, Table III). Hood and colleagues (74) proposed that this split arrangement is an evolutionarily derived situation, and TCRB and TCRG had been originally closely linked, like the extant TCRA/D genes, but were separated via a pericentric inversion. In Xenopus, tcrb and tcrg are found on different chromosomes (tcrb 7Lq23-24; tcrg 6Lp12-13 [Table III]). However, the Xenopus tcrb gene maps near tapbpl and cd4/lag3 genes, which are found in the NK cell complex (NKC) on human chr 12p13.31 (Figs. 1, 7B, Table III). The NKC is also considered as a minor MHCpara, based on 1) the presence of the marker gene A2M (homolog of C3,4,5) (6); 2) the presence of the TAPBP paralogue, TAPBPL (75) (TAPBP maps to the MHC); 3) mapping of chicken C-type lectin NK receptor genes to the MHC (6, 76, 77), whereas the C-type lectin NK receptor genes map to the mammalian NKC; and 4) studies of neurotrophin gene distribution in jawed vertebrates (7). Thus, tcrb linkage to an MHCpara also suggests an ancestral linkage of TCR precursor genes to the primordial MHC. In contrast, Xenopus tcrg may have been translocated to an unrelated region (chr 6) having no connection to the MHCpara.

FIGURE 7.

(A) TCRβ genes are found in the minor MHCpara. Human chr 12p, harboring the NKC, was identified as a minor MHCpara and contains marker genes like TAPBPL and A2M (shown in red and orange boxes). Although no TCR gene maps in the human NKC, TCRβ genes from many species are closely linked to the region orthologous to the NKC, suggesting the linkage of TCR genes to the PIC. Moreover, the TCRγ gene is also linked to TAPBPL in the opossum genome, further suggesting that a precursor of TCRβ/γ was in the primordial MHC. CD4 and LAG3 (yellow boxes) define the region and contain genes encoding IgSF domains related to MHC. (B) Evolution of TCR and IgH/L from AgR precursors. We predict that a common AgR precursor with an “uninterrupted” VJ exon came together with C1-IgSF domain that was supplied by neighboring genes in the PIC. RAG insertion split the VJ exon into separate V and J exons, D fragments were generated, and became the TCR/IgH/L precursor, all post-2R. During the 2R duplication, this precursor region was further split into two as the precursors of αδ and γβ TCR lineages, consistent with Hood’s hypothesis (74). The TCRα/δ precursor then cis-duplicated and generated IgH/L, as previously suggested (74). The TCR γβ precursor split and was subsequently translocated as separate genes, as detailed in the text. Chromosome numbers are based on human locations. X denotes gene loss; a dot (•) denotes centromere.

Table III.
Chromosomal locations of TCRβ and TCRγ genes in various vertebrates
Species | Chromosome | Gene | Position (a)
Human | 12p13.2 | KLRD1 (NKC) | 10,238,385..10,329,607
 | 12p13.31 | A2M | 9,067,708..9,115,962
 | 12p13.31 | CD4 | 6,789,472..6,820,810
 | 12p13.31 | LAG3 | 6,772,483..6,778,455
 | 12p13.31 | TAPBPL | 6,451,655..6,472,006
 | 7q34 | TCRβ | 142,299,011..142,813,287
 | 7p14.1 | TCRγ | 38,240,024..38,368,055
Pig | | klrd1 | 61,583,868..61,596,985
 | | a2m | 65,274,903..65,320,342
 | | cd4 | 66,326,568..66,353,856
 | | lag3 | 66,364,099..66,369,484
 | | tapbpl | 66,649,711..66,658,562
 | 18 | tcrβ | 7,715,206..7,823,795
 | | tcrγ | 119,542,537..119,635,982
Mouse | | a2m | 121,636,166..121,679,238
 | | cd4 | 124,864,693..124,888,248
 | | lag3 | 124,904,359..124,912,434
 | | tapbpl | 125,223,927..125,231,923
 | | klrd1 | 129,588,092..129,598,775
 | | tcrβ | 40,891,296..41,558,371
 | 13 | tcrγ | 19,178,042..19,356,476
Opossum | | a2m | 104,682,643..104,771,506
 | | cd4 | 108,220,454..108,260,998
 | | lag3 | 108,170,654..108,179,156
 | | klrk1 | 113,517,720..113,533,133
 | | tcrβ | 205,270,812..205,335,586
 | | tapbpl | 290,987,908..290,993,624
 | | tcrγ | 283,848,252..283,942,577
Chicken | | a2m | 76,229,983..76,255,770
 | | tapbpl | 76,876,884..76,889,825
 | | lag3 | 77,194,590..77,202,789
 | | cd4 | 77,208,503..77,219,970
 | | tcrβ | 78,071,772..78,072,534
 | | klrdr1 | 78,423,947..78,430,724
 | | tcrγ | 49,292,467..49,295,949
Turkey | | tcrβ | 74,734,696..74,742,685
 | | lag3 | 75,575,531..75,581,610
 | | cd4 | 75,588,055..75,599,611
 | | tapbpl | 75,900,408..75,911,755
 | | a2m | 79,842,550..79,855,332
 | | tcrγ | 47,636,597..47,652,020
Salmon | | tapbpl | 10,225,084..10,231,543
 | | klrd1 | 24,161,287..24,177,629
 | | cd4-2 | 30,978,314..30,984,887
 | | cd4-1 | 30,986,632..31,013,703
 | | a2m | 108,156,058..108,189,009
 | | tcrβ | 3,348,168..3,354,302
 | 20 | tcrγ | 9,074,301..9,083,342
Zebrafish | 16 | tapbpl | 9,899,183..9,911,977
 | 16 | cd4-1 | 12,021,001..12,055,289
 | 16 | cd4-2 | 12,057,069..12,072,262
 | 16 | clec | 29,030,785..29,042,169
 | 15 | a2m | 21,178,237..21,196,748
 | 17 | tcrβ | 48,395,034..48,401,797 (C)
 | | tcrγ | 31,873,021..31,902,832 (V)
(a) Beginning..end of positions in chromosomes.
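As a rough illustration of how the coordinates in Table III can be read for linkage, the following Python sketch computes the gaps between neighboring human chr 12 genes listed in the table. The function name `neighbor_gaps` and the choice to measure each gap from the end of one gene interval to the start of the next are assumptions made for illustration only; this is not the linkage analysis performed in this study.

```python
# Human chr 12 entries copied from Table III: gene -> (start, end) in bp.
HUMAN_CHR12 = {
    "TAPBPL": (6_451_655, 6_472_006),
    "LAG3": (6_772_483, 6_778_455),
    "CD4": (6_789_472, 6_820_810),
    "A2M": (9_067_708, 9_115_962),
    "KLRD1": (10_238_385, 10_329_607),
}


def neighbor_gaps(genes):
    """Return (left_gene, right_gene, gap_bp) for genes ordered by start.

    Hypothetical helper: the gap is the distance from the end of the left
    interval to the start of the right one, clamped at 0 for overlaps.
    """
    ordered = sorted(genes.items(), key=lambda item: item[1][0])
    gaps = []
    for (left, (_, left_end)), (right, (right_start, _)) in zip(ordered, ordered[1:]):
        gaps.append((left, right, max(0, right_start - left_end)))
    return gaps


if __name__ == "__main__":
    for left, right, gap in neighbor_gaps(HUMAN_CHR12):
        print(f"{left} -> {right}: {gap:,} bp")
```

Under these assumptions, the five human markers span roughly 3.9 Mb of chr 12p13, with LAG3 and CD4 lying about 11 kb apart.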

We decided to further examine the linkage status of TCRB and TCRG genes in other vertebrate genomes. Other mammals (e.g., pig, mouse), besides humans, have a linkage of TCRB to NKC genes (Fig. 7A, Table III). Linkage of tcrb to the NKC is also seen in birds (e.g., chicken and turkey). Linkage of tcrb to the NKC has not been documented in bony fish: In the primitive bony fish, spotted gar, tcrb is linked to genes on human chr 14q24.1 and 15q15 on LG7, whereas a2m and tapbpl map to LG26. Synteny of tcrg is conserved among vertebrate species; like Xenopus, tcrg was found on a separate chromosome in all nonprimate species. However, in opossum, tcrg is linked to tapbpl, suggesting a remnant linkage of tcrg to NKC.

In summary, the combined data favor the existing hypothesis that TCRB and TCRG were indeed originally linked in minor huMHCpara-12, followed by a chromosome split to human chr 7 and secondary translocation of block regions containing TCRG (Fig. 7B). Alternatively, TCRB and TCRG were differentially silenced after translocation from their original location. In either scenario, the splitting up of the two genes and subsequent translocation(s) were involved in positioning tcrb and tcrg at either end of human chr 7.

Based on the distribution of the orthologous genes found on Xenopus chr 1q (Supplemental Fig. 3), we speculate that huMHCpara-12 split from huMHCpara-14. Also, a block region containing the iglλ gene (human chr 22q11) is derived from huMHCpara-14 (linkage probability by chance <1 × 10⁻¹⁶ [Table II]). Therefore, our analysis suggests that all rearranging AgR are likely derived from the huMHCpara-19 precursor. Invasion of the RAG transposon likely happened on huMHCpara-19 after 2R, splitting the VJ element into separate V and J elements, and the various pairs of AgR genes are suggested to have been generated via cis duplications. This theme is discussed further below (Fig. 8).

FIGURE 8.

Dichotomy of vertebrate adaptive immune system. Because VLR homologs identified by Pancer (i.e., LRR–carboxy-terminal containing genes) were also mapped in MHCpara regions, we anticipate that a VLR precursor was present at least in the chr 6/19 common ancestor. We argue that the MHC/TCR/Ig system emerged and expanded in the jawed vertebrates soon after 2R as a consequence of the RAG transposon, and the VLR system was superseded (see text).

We have conducted a genome survey for loci involved in adaptive immunity and propose hypotheses for the origins of the PIC (Fig. 6). We also uncovered evidence of an en bloc translocation of the loci surrounding the CD1 genes (Figs. 4, 5A). Finally, we provide compelling evidence for the timing of the emergence of MHC class I(/II) and AgR in a gnathostome ancestor (Figs. 6, 7B) and have uncovered nonrearranging AgR-like genes in MHCpara that may be related to the Ig/TCR ancestor.

It has been previously predicted that AgR precursor genes were linked to the proto-MHC and translocated later in evolution (59, 60, 78). To address this hypothesis, we mapped AgR/AgR-like genes on Xenopus chromosomes and uncovered several nonrearranging genes with structures similar to TCR and IgL chains: a single uninterrupted VJ-type IgSF domain followed by a C1-IgSF domain. It has also been speculated (60) that modern AgRs were generated by recruitment of C1-IgSF in the preadaptive immune complex followed by the RAG transposon splitting a VJ gene into V- and J- genetic elements (V-J). Thus, extant VJ-IgSF–containing genes are potentially descendants of such precursor genes (69, 73). Like other immune genes directly involved in Ag recognition, all AgR-like genes described in this report are diploidized in the tetraploid X. laevis, and therefore they likely play roles in immunity (18, 73). As mentioned above, NCR3, another gene encoding a VJ-type domain, maps to the MHC in humans and other vertebrates, including sharks (M.E. Janes, L. Du Pasquier, M.F. Flajnik, and Y. Ohta, manuscript in preparation) (Fig. 1), and an amphioxus VJ gene (69) linked to a kirrel homolog further supports the hypothesis that the AgR precursor was present in the PIC at 0R.

Mapping of Ig and TCR genes in several vertebrates to MHCpara indicates that all extant AgR seem to be derived from an ancestral chr 19 paralogue. This suggests that an uninterrupted VJ element was split by the RAG transposon, and after gene duplication, one duplicate acquired a diversity (D) element, generating paired receptor genes (74). Hood et al. (79) suggested an ancestral VJ homodimer, which, after the RAG transposon invasion and gene duplication, gave rise to a heterodimeric receptor. As proposed by Davis and Bjorkman (80), the original receptor may have been TCR α/β-like, because the RAG rearrangement break at CDR3 makes the most sense for an MHC-restricted AgR (i.e., the most diverse part of the AgR binds the true Ag, a peptide or another original type of Ag, in the MHC groove). We previously proposed (59) that the original AgR was derived from NK-like receptors that recognized MHC-like molecules encoded in the PIC or the proto-MHC, and we now provide evidence for such candidate receptors. Subsequent duplication of the paired TCR genes and translocation may have relieved the pressure of MHC restriction, allowing the duplicated receptor to bind free Ags, like γ/δ TCR today. Another duplication may have occurred in cis on huMHCpara-14, as previously suggested (62), generating IgH/L from the neighboring TCRA/D pair; the two sets of loci (TCRA/D and IgH/L) are still linked in extant vertebrates, including Xenopus (62).

We also identified novel class I genes and mapped them in MHCpara derived from the chr 6/19 precursor after 1R. Our analyses suggest that MHC class I likely arose after the first round of genome duplication rather than prior to 1R (Fig. 6). The previous proposals of an earlier origin (43–45) were partially supported by the presence of CD1 genes on huMHCpara-1. In contrast, we present evidence that the 1q21.1–23.3 region, including the CD1 genes, was secondarily translocated from another location, which itself was translocated from the MHC (MHCtrans) (red arrow in Fig. 6); thus, the presence of CD1 on huMHCpara-1 was likely the result of a chance event and not a genome-wide duplication. There is, however, an alternative explanation: duplicates of both MHC and MHCtrans may have been generated at both loci, on chr 1 and chr 6, but differentially silenced during 2R. We think this scenario is unlikely because some housekeeping genes would have remained in other MHCpara as homologs, as we commonly see in the tetraploid X. laevis genome compared with the diploid X. tropicalis (14). The KIRREL homologs KIRREL 2 (19q13.12) and KIRREL 3 (11q24.2) are found in major and minor huMHCpara (68, 78), whereas KIRREL maps to human chr 1q23.1 and, in Xenopus, to MHCtrans. Furthermore, kirrel maps adjacent to notch in the Drosophila genome, presumably an ancestral linkage (16). Although this is only one example, the distribution of KIRREL genes adds another layer of support to our hypothesis that the MHCtrans was initially translocated from the MHC (Fig. 5A). The presence of a cd1 gene in Chinese alligator on huMHCpara-19 (46) further suggests that CD1 emerged after 1R but before 2R and was differentially silenced in reptiles and birds (Fig. 4). Regardless of the precise timing of CD1’s emergence, we propose that class II arose later and may have co-opted the CD1 pathway of Ag presentation. We found no class II genes outside of the MHC.

The overarching hypothesis is that all constituents/domains of current adaptive (and some innate) immune genes were genetically linked in the PIC (9), which predated the MHC (6), and these PIC components were “mixed and matched” to generate the precursors of modern immune genes (9), especially the VJ and C1-IgSF domains that are fundamental components of the adaptive immune system (e.g., Igs, TCR, MHC class I/II, B2M) (81–83). It was previously predicted that Ig/TCR/MHC precursor genes originated in the MHC based on preliminary evidence (6, 60). In addition to MHCpara, genes linked in the MHCtrans region also provide an indication of the primordial linkage of AgR/MHC; as mentioned above, other genes, like KIRREL, FcRL, and SLAMF, map to MHCtrans, corresponding to the human 1q21.1–23.3 region (Figs. 3A, 5A, Supplemental Table I). Therefore, other domains such as C2-IgSF (building blocks of FcRL and SLAMF) and B30.2 (building block for butyrophilin) (11) were also present in the PIC and likely used as raw material to generate new sets of immune genes. In addition, the synteny of SLAMF and CD1 genes may be another example of functional clustering, because SLAM family members are involved in NKT cell development in the thymus (84).

Finally, we also speculate on the dichotomy between the jawless and jawed vertebrate adaptive immune systems. Leucine-rich repeat (LRR) domain-containing variable lymphocyte receptor (VLR) genes are rearranging adaptive immune genes unique to jawless vertebrates (lamprey and hagfish) (85). LRR domains are also present in many other proteins such as TLR (86, 87), which are predicted to be encoded in the PIC because toll is linked to MHC paralogous hallmark genes in Drosophila (16). Pancer identified three VLR homologous genes based on the presence of an LRR carboxy-terminal domain (88), and, surprisingly, we found all three genes mapping to MHCpara regions: GP1BB is closely linked to IgLλ on human chr 22q11 (Figs. 1, 6) and Xenopus chr-1q (Supplemental Fig. 3) (linkage probability by chance <1 × 10⁻¹⁶ [Table II]); Xenopus gp1ba and gp9 could not be mapped, but human GP1BA maps closely to PSMB6 on human chr 17p13.2, and GP9 maps on human chr 3q21.3, a region also designated as minor MHCpara (60). Both GP1BB and GP1BA were mapped on chromosomes derived from huMHCpara-19. This unexpected result strongly suggests that the precursor of VLR genes was also in the PIC or an ancestral MHCpara. We have searched the lamprey and hagfish genomes for synteny of the VLR genes but could not map any linked genes. Better assembly of the lamprey and hagfish genomes could provide genetic evidence for further confirmation. Depending on the precursor of human chr 3, the VLR predecessor could have been present either at 1R or 0R (PIC). In either scenario, our model predicts that VLR predates the emergence of rearranging IgSF-containing AgR. At this point, we have no working hypothesis for why VLRs would be encoded in the MHCpara besides the basic idea that many immune gene families seem to have been conceived in these regions.

Gene families expanded and underwent neofunctionalization [e.g., globin genes (89)] in early jawed vertebrates shortly after 2R, and these changes were perpetuated in the gnathostome lineage. In contrast, the jawless fish either maintained the primordial state or evolved novel globin genes (89). We suggest that such a major dichotomy occurred for the immune system as well (Fig. 8): adaptive immunity likely emerged in the jawless vertebrates in the first “Big Bang” with major features such as clonal selection of lymphocytes bearing somatically generated Ag receptors, emergence of the thymus, and appearance of lymphocyte subsets (90). In our scenario, as opposed to a model proposing parallel evolution of VLR and Ig/TCR systems, the VLR system emerged during the first Big Bang and then was superseded by the Ig/TCR system after invasion of a VJ-IgSF gene by the RAG transposon at 2R. As previously suggested, RAG-mediated rearrangement provides a distinct advantage over APOBEC-mediated recombination in that the CDR3 loop can vary widely in size (91), accommodating either a rich adaptive repertoire or one that is more innate in nature. We suggest that the RAG transposon invasion at 2R was the innovative event that initiated a second Big Bang of adaptive immunity, resulting in the emergence of immunoproteasomes, emergence and expansion of AgR, and the first appearance of SLAM family members, all of which likely occurred on the chr 6 and chr 19 ancestral paralogues. Other features of the gnathostome adaptive immune system, such as emergence of secondary lymphoid tissues, expansion of cytokine and chemokine networks, and appearance of a complex thymic architecture, also occurred over a short period of evolutionary time, in some cases under the influence of genes mapping to MHC paralogous regions, e.g., TNF (92) and B7 family members (68).

We thank Hanover Matz and Dr. Louis Du Pasquier for critical reading of the manuscript and advice on the nonrearranging AgR-like genes.

This project was supported by National Institutes of Health Grants AI140326-26 and AI02877 to Y.O. and M.F.F.

The online version of this article contains supplemental material.

Abbreviations used in this article:

chr, chromosome; FISH, fluorescence in situ hybridization; huMHCpara, human MHCpara; IgSF, Ig superfamily; L, long; LRR, leucine-rich repeat; MHCpara, MHC paralogue; MHCtrans, translocated part of the MHC paralogous region; NCBI, National Center for Biotechnology Information; NKC, NK complex; PIC, primordial immune complex; S, short; VLR, variable lymphocyte receptor.

1. Ohno, S. 1970. Evolution by Gene Duplication. Springer-Verlag, New York.
2. Lundin, L. G. 1993. Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in man and the house mouse. Genomics 16: 1–19.
3. Kasahara, M. 1997. New insights into the genomic organization and origin of the major histocompatibility complex: role of chromosomal (genome) duplication in the emergence of the adaptive immune system. Hereditas 127: 59–65.
4. Darbo, E., E. G. Danchin, M. F. McDermott, P. Pontarotti. 2008. Evolution of major histocompatibility complex by “en bloc” duplication before mammalian radiation. Immunogenetics 60: 423–438.
5. Olinski, R. P., L. G. Lundin, F. Hallböök. 2006. Conserved synteny between the Ciona genome and human paralogons identifies large duplication events in the molecular evolution of the insulin-relaxin gene family. Mol. Biol. Evol. 23: 10–22.
6. Flajnik, M. F., M. Kasahara. 2010. Origin and evolution of the adaptive immune system: genetic events and selective pressures. Nat. Rev. Genet. 11: 47–59.
7. Hallböök, F. 1999. Evolution of the vertebrate neurotrophin and Trk receptor gene families. Curr. Opin. Neurobiol. 9: 616–621.
8. Horton, R., L. Wilming, V. Rand, R. C. Lovering, E. A. Bruford, V. K. Khodiyar, M. J. Lush, S. Povey, C. C. Talbot Jr., M. W. Wright, et al. 2004. Gene map of the extended human MHC. Nat. Rev. Genet. 5: 889–899.
9. Abi Rached, L., M. F. McDermott, P. Pontarotti. 1999. The MHC big bang. Immunol. Rev. 167: 33–44.
10. Danchin, E. G., P. Pontarotti. 2004. Towards the reconstruction of the bilaterian ancestral pre-MHC region. Trends Genet. 20: 587–591.
11. Suurväli, J., L. Jouneau, D. Thépot, S. Grusea, P. Pontarotti, L. Du Pasquier, S. Rüütel Boudinot, P. Boudinot. 2014. The proto-MHC of placozoans, a region specialized in cellular stress and ubiquitination/proteasome pathways. J. Immunol. 193: 2891–2901.
12. Ohta, Y., W. Goetz, M. Z. Hossain, M. Nonaka, M. F. Flajnik. 2006. Ancestral organization of the MHC revealed in the amphibian Xenopus. J. Immunol. 176: 3674–3685.
13. Hellsten, U., R. M. Harland, M. J. Gilchrist, D. Hendrix, J. Jurka, V. Kapitonov, I. Ovcharenko, N. H. Putnam, S. Shu, L. Taher, et al. 2010. The genome of the Western clawed frog Xenopus tropicalis. Science 328: 633–636.
14. Session, A. M., Y. Uno, T. Kwon, J. A. Chapman, A. Toyoda, S. Takahashi, A. Fukui, A. Hikosaka, A. Suzuki, M. Kondo, et al. 2016. Genome evolution in the allotetraploid frog Xenopus laevis. Nature 538: 336–343.
15. Uno, Y., C. Nishida, C. Takagi, N. Ueno, Y. Matsuda. 2013. Homoeologous chromosomes of Xenopus laevis are highly conserved after whole-genome duplication. Heredity 111: 430–436.
16. Danchin, E. G. J., L. Abi-Rached, A. Gilles, P. Pontarotti. 2003. Conservation of the MHC-like region throughout evolution. Immunogenetics 55: 141–148.
17. Du Pasquier, L., J. Schwager, M. F. Flajnik. 1989. The immune system of Xenopus. Annu. Rev. Immunol. 7: 251–275.
18. Courtet, M., M. Flajnik, L. Du Pasquier. 2001. Major histocompatibility complex and immunoglobulin loci visualized by in situ hybridization on Xenopus chromosomes. Dev. Comp. Immunol. 25: 149–157.
19. Tanaka, K. 2013. The proteasome: from basic mechanisms to emerging roles. Keio J. Med. 62: 1–12.
20. Kasahara, M., M. Hayashi, K. Tanaka, H. Inoko, K. Sugaya, T. Ikemura, T. Ishibashi. 1996. Chromosomal localization of the proteasome Z subunit gene reveals an ancient chromosomal duplication involving the major histocompatibility complex. Proc. Natl. Acad. Sci. USA 93: 9096–9101.
21. Abi-Rached, L., A. Gilles, T. Shiina, P. Pontarotti, H. Inoko. 2002. Evidence of en bloc duplication in vertebrate genomes. Nat. Genet. 31: 100–105.
22. Kelley, J., L. Walter, J. Trowsdale. 2005. Comparative genomics of major histocompatibility complexes. Immunogenetics 56: 683–695.
23. Shum, B. P., D. Avila, L. Du Pasquier, M. Kasahara, M. F. Flajnik. 1993. Isolation of a classical MHC class I cDNA from an amphibian. Evidence for only one class I locus in the Xenopus MHC. J. Immunol. 151: 5376–5386.
24. Flajnik, M. F., M. Kasahara, B. P. Shum, L. Salter-Cid, E. Taylor, L. Du Pasquier. 1993. A novel type of class I gene organization in vertebrates: a large family of non-MHC-linked class I genes is expressed at the RNA level in the amphibian Xenopus. EMBO J. 12: 4385–4396.
25. Edholm, E. S., A. Goyos, J. Taran, F. De Jesús Andino, Y. Ohta, J. Robert. 2014. Unusual evolutionary conservation and further species-specific adaptations of a large family of nonclassical MHC class Ib genes across different degrees of genome ploidy in the amphibian subfamily Xenopodinae. Immunogenetics 66: 411–426.
26. Krasnec, K. V., A. T. Papenfuss, R. D. Miller. 2016. The UT family of MHC class I loci unique to non-eutherian mammals has limited polymorphism and tissue specific patterns of expression in the opossum. BMC Immunol. 17: 43.
27. Nonaka, M., C. Yamada-Namikawa, M. F. Flajnik, L. Du Pasquier. 2000. Trans-species polymorphism of the major histocompatibility complex-encoded proteasome subunit LMP7 in an amphibian genus, Xenopus. Immunogenetics 51: 186–192.
28. Ohta, Y., S. J. Powis, R. L. Lohr, M. Nonaka, L. Du Pasquier, M. F. Flajnik. 2003. Two highly divergent ancient allelic lineages of the transporter associated with antigen processing (TAP) gene in Xenopus: further evidence for co-evolution among MHC class I region genes. Eur. J. Immunol. 33: 3017–3027.
29. Nonaka, M., C. Namikawa, Y. Kato, M. Sasaki, L. Salter-Cid, M. F. Flajnik. 1997. Major histocompatibility complex gene mapping in the amphibian Xenopus implies a primordial organization. Proc. Natl. Acad. Sci. USA 94: 5789–5791.
30. Tsukamoto, K., M. Sakaizumi, M. Hata, Y. Sawara, J. Eah, C. B. Kim, M. Nonaka. 2009. Dichotomous haplotypic lineages of the immunoproteasome subunit genes, PSMB8 and PSMB10, in the MHC class I region of a teleost medaka, Oryzias latipes. Mol. Biol. Evol. 26: 769–781.
31. McConnell, S. C., K. M. Hernandez, D. J. Wcisel, R. N. Kettleborough, D. L. Stemple, J. A. Yoder, J. Andrade, J. L. de Jong. 2016. Alternative haplotypes of antigen processing genes in zebrafish diverged early in vertebrate evolution. Proc. Natl. Acad. Sci. USA 113: E5014–E5023.
32. Kaufman, J. 2015. Co-evolution with chicken class I genes. Immunol. Rev. 267: 56–71.
33. Miller, M. M., R. L. Taylor Jr. 2016. Brief review of the chicken major histocompatibility complex: the genes, their distribution on chromosome 16, and their contributions to disease resistance. Poult. Sci. 95: 375–392.
34. Edholm, E. S., L. M. Albertorio Saez, A. L. Gill, S. R. Gill, L. Grayfer, N. Haynes, J. R. Myers, J. Robert. 2013. Nonclassical MHC class I-dependent invariant T cells are evolutionarily conserved and prominent from early development in amphibians. Proc. Natl. Acad. Sci. USA 110: 14342–14347.
35. Edholm, E. S., M. Banach, J. Robert. 2016. Evolution of innate-like T cells and their selection by MHC class I-like molecules. Immunogenetics 68: 525–536.
36. Edholm, E. S., M. Banach, K. Hyoe Rhoo, M. S. Pavelka Jr., J. Robert. 2018. Distinct MHC class I-like interacting invariant T cell lineage at the forefront of mycobacterial immunity uncovered in Xenopus. Proc. Natl. Acad. Sci. USA 115: E4023–E4031.
37. Kasahara, M. 1999. The chromosomal duplication model of the major histocompatibility complex. Immunol. Rev. 167: 17–32.
38. Calabi, F., C. Milstein. 1986. A novel family of human major histocompatibility complex-related genes not mapping to chromosome 6. Nature 323: 540–543.
39. Martin, L. H., F. Calabi, C. Milstein. 1986. Isolation of CD1 genes: a family of major histocompatibility complex-related differentiation antigens. Proc. Natl. Acad. Sci. USA 83: 9154–9158.
40. Zajonc, D. M. 2016. The CD1 family: serving lipid antigens to T cells since the Mesozoic era. Immunogenetics 68: 561–576.
41. Jayawardena-Wolf, J., A. Bendelac. 2001. CD1 and lipid antigens: intracellular pathways for antigen presentation. Curr. Opin. Immunol. 13: 109–113.
42. Kasahara, M., J. Nakaya, Y. Satta, N. Takahata. 1997. Chromosomal duplication and the emergence of the adaptive immune system. Trends Genet. 13: 90–92.
43. Maruoka, T., H. Tanabe, M. Chiba, M. Kasahara. 2005. Chicken CD1 genes are located in the MHC: CD1 and endothelial protein C receptor genes constitute a distinct subfamily of class-I-like genes that predates the emergence of mammals. Immunogenetics 57: 590–600.
44. Salomonsen, J., M. R. Sørensen, D. A. Marston, S. L. Rogers, T. Collen, A. van Hateren, A. L. Smith, R. K. Beal, K. Skjødt, J. Kaufman. 2005. Two CD1 genes map to the chicken MHC, indicating that CD1 genes are ancient and likely to have been present in the primordial MHC. Proc. Natl. Acad. Sci. USA 102: 8668–8673.
45. Miller, M. M., C. Wang, E. Parisini, R. D. Coletta, R. M. Goto, S. Y. Lee, D. C. Barral, M. Townes, C. Roura-Mir, H. L. Ford, et al. 2005. Characterization of two avian MHC-like genes reveals an ancient origin of the CD1 family. Proc. Natl. Acad. Sci. USA 102: 8674–8679.
46. Yang, Z., C. Wang, T. Wang, J. Bai, Y. Zhao, X. Liu, Q. Ma, X. Wu, Y. Guo, Y. Zhao, L. Ren. 2015. Analysis of the reptile CD1 genes: evolutionary implications. Immunogenetics 67: 337–346.
47. Flajnik, M. F., J. F. Kaufman, P. Riegert, L. Du Pasquier. 1984. Identification of class I major histocompatibility complex encoded molecules in the amphibian Xenopus. Immunogenetics 20: 433–442.
48. Rogers, S. L., J. Kaufman. 2016. Location, location, location: the evolutionary history of CD1 genes and the NKR-P1/ligand systems. Immunogenetics 68: 499–513.
49. Donoviel, D. B., D. D. Freed, H. Vogel, D. G. Potter, E. Hawkins, J. P. Barrish, B. N. Mathur, C. A. Turner, R. Geske, C. A. Montgomery, et al. 2001. Proteinuria and perinatal lethality in mice lacking NEPH1, a novel protein with homology to NEPHRIN. Mol. Cell. Biol. 21: 4829–4836.
50. Hughes, A. L., M. Nei. 1993. Evolutionary relationships of the classes of major histocompatibility complex genes. Immunogenetics 37: 337–346.
51. Kaufman, J. F., C. Auffray, A. J. Korman, D. A. Shackelford, J. Strominger. 1984. The class II molecules of the human and murine major histocompatibility complex. Cell 36: 1–13.
52. Kaufman, J. 2018. Unfinished business: evolution of the MHC and the adaptive immune system of jawed vertebrates. Annu. Rev. Immunol. 36: 383–409.
53. Dijkstra, J. M., T. Yamaguchi. 2019. Ancient features of the MHC class II presentation pathway, and a model for the possible origin of MHC molecules. Immunogenetics 71: 233–249.
54. Flajnik, M. F., C. Canel, J. Kramer, M. Kasahara. 1991. Which came first, MHC class I or class II? Immunogenetics 33: 295–300.
55. Bartl, S., M. A. Baish, M. F. Flajnik, Y. Ohta. 1997. Identification of class I genes in cartilaginous fish, the most ancient group of vertebrates displaying an adaptive immune response. J. Immunol. 159: 6097–6104.
56. Guselnikov, S. V., T. Ramanayake, A. Y. Erilova, L. V. Mechetina, A. M. Najakshin, J. Robert, A. V. Taranin. 2008. The Xenopus FcR family demonstrates continually high diversification of paired receptors in vertebrate evolution. BMC Evol. Biol. 8: 148.
57. Guselnikov, S. V., T. Ramanayake, J. Robert, A. V. Taranin. 2009. Diversity of the FcR- and KIR-related genes in an amphibian Xenopus. Front. Biosci. 14: 130–140.
58. Guselnikov, S. V., P. P. Laktionov, A. M. Najakshin, K. O. Baranov, A. V. Taranin. 2011. Expansion and diversification of the signaling capabilities of the CD2/SLAM family in Xenopodinae amphibians. Immunogenetics 63: 679–689.
59. Flajnik, M. F., M. Kasahara. 2001. Comparative genomics of the MHC: glimpses into the evolution of the adaptive immune system. Immunity 15: 351–362.
60. Du Pasquier, L., I. Zucchetti, R. De Santis. 2004. Immunoglobulin superfamily receptors in protochordates: before RAG time. Immunol. Rev. 198: 233–248.
61. Williams, A. F., A. N. Barclay. 1988. The immunoglobulin superfamily--domains for cell surface recognition. Annu. Rev. Immunol. 6: 381–405.
62. Parra, Z. E., Y. Ohta, M. F. Criscitiello, M. F. Flajnik, R. D. Miller. 2010. The dynamic TCRδ: TCRδ chains in the amphibian Xenopus tropicalis utilize antibody-like V genes. Eur. J. Immunol. 40: 2319–2329.
63. Schatz, D. G., M. A. Oettinger, D. Baltimore. 1989. The V(D)J recombination activating gene, RAG-1. Cell 59: 1035–1048.
64. Oettinger, M. A., D. G. Schatz, C. Gorka, D. Baltimore. 1990. RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science 248: 1517–1523.
65. Agrawal, A., Q. M. Eastman, D. G. Schatz. 1998. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature 394: 744–751.
66. Pende, D., S. Parolini, A. Pessino, S. Sivori, R. Augugliaro, L. Morelli, E. Marcenaro, L. Accame, A. Malaspina, R. Biassoni, et al. 1999. Identification and molecular characterization of NKp30, a novel triggering receptor involved in natural cytotoxicity mediated by human natural killer cells. J. Exp. Med. 190: 1505–1516.
67. Ohta, Y., M. F. Flajnik. 2015. Coevolution of MHC genes (LMP/TAP/class Ia, NKT-class Ib, NKp30-B7H6): lessons from cold-blooded vertebrates. Immunol. Rev. 267: 6–15.
68. Flajnik, M. F., T. Tlapakova, M. F. Criscitiello, V. Krylov, Y. Ohta. 2012. Evolution of the B7 family: co-evolution of B7H6 and NKp30, identification of a new B7 family member, B7H7, and of B7's historical relationship with the MHC. Immunogenetics 64: 571–590.
69. Chen, R., L. Zhang, J. Qi, N. Zhang, L. Zhang, S. Yao, Y. Wu, B. Jiang, Z. Wang, H. Yuan, et al. 2018. Discovery and analysis of invertebrate IgVJ-C2 structure from Amphioxus provides insight into the evolution of the Ig superfamily. J. Immunol. 200: 2869–2881.
70. Wu, Q., Z. Wei, Z. Yang, T. Wang, L. Ren, X. Hu, Q. Meng, Y. Guo, Q. Zhu, J. Robert, et al. 2010. Phylogeny, genomic organization and expression of lambda and kappa immunoglobulin light chain genes in a reptile, Anolis carolinensis. Dev. Comp. Immunol. 34: 579–589.
71. Del Porto, P., L. Bruno, M. G. Mattei, H. von Boehmer, C. Saint-Ruf. 1995. Cloning and comparative analysis of the human pre-T-cell receptor alpha-chain gene. Proc. Natl. Acad. Sci. USA 92: 12105–12109.
72. Saint-Ruf, C., K. Ungewiss, M. Groettrup, L. Bruno, H. J. Fehling, H. von Boehmer. 1994. Analysis and expression of a cloned pre-T cell receptor gene. Science 266: 1208–1212.
73. Fu, Y., Z. Yang, J. Huang, X. Cheng, X. Wang, S. Yang, L. Ren, Z. Lian, H. Han, Y. Zhao. 2019. Identification of two nonrearranging IgSF genes in chicken reveals a novel family of putative remnants of an antigen receptor precursor. J. Immunol. 202: 1992–2004.
74. Glusman, G., L. Rowen, I. Lee, C. Boysen, J. C. Roach, A. F. Smit, K. Wang, B. F. Koop, L. Hood. 2001. Comparative genomics of the human and mouse T cell receptor loci. Immunity 15: 337–349.
75. Du Pasquier, L. 2000. Relationships among the genes encoding MHC molecules and the specific antigen receptors. In MHC Evolution, Structure and Function. L. Du Pasquier, M. Kasahara, eds. Springer-Verlag, Tokyo, p. 53–65.
76. Trowsdale, J. 2001. Genetic and functional relationships between MHC and NK receptor genes. Immunity 15: 363–374.
77. Kaufman, J., S. Milne, T. W. Göbel, B. A. Walker, J. P. Jacob, C. Auffray, R. Zoorob, S. Beck. 1999. The chicken B locus is a minimal essential major histocompatibility complex. Nature 401: 923–925.
78. Du Pasquier, L. 2004. Speculations on the origin of the vertebrate immune system. Immunol. Lett. 92: 3–9.
79. Hood, L., M. Kronenberg, T. Hunkapiller. 1985. T cell antigen receptors and the immunoglobulin supergene family. Cell 40: 225–229.
80. Davis, M. M., P. J. Bjorkman. 1988. T-cell antigen receptor genes and T-cell recognition. [Published erratum appears in 1988 Nature 335: 744.] Nature 334: 395–402.
81. Du Pasquier, L., I. Chrétien. 1996. CTX, a new lymphocyte receptor in Xenopus, and the early evolution of Ig domains. Res. Immunol. 147: 218–226.
82. Du Pasquier, L. 2002. Several MHC-linked Ig superfamily genes have features of ancestral antigen-specific receptor genes. Curr. Top. Microbiol. Immunol. 266: 57–71.
83. Ohta, Y., T. Shiina, R. L. Lohr, K. Hosomichi, T. I. Pollin, E. J. Heist, S. Suzuki, H. Inoko, M. F. Flajnik. 2011. Primordial linkage of β2-microglobulin to the MHC. J. Immunol. 186: 3563–3571.
84. Godfrey, D. I., S. Stankovic, A. G. Baxter. 2010. Raising the NKT cell family. Nat. Immunol. 11: 197–206.
85. Pancer, Z., C. T. Amemiya, G. R. Ehrhardt, J. Ceitlin, G. L. Gartland, M. D. Cooper. 2004. Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey. Nature 430: 174–180.
86. Anderson, K. V., L. Bokla, C. Nüsslein-Volhard. 1985. Establishment of dorsal-ventral polarity in the Drosophila embryo: the induction of polarity by the toll gene product. Cell 42: 791–798.
87. Anderson, K. V., G. Jürgens, C. Nüsslein-Volhard. 1985. Establishment of dorsal-ventral polarity in the Drosophila embryo: genetic studies on the role of the toll gene product. Cell 42: 779–789.
88. Rogozin, I. B., L. M. Iyer, L. Liang, G. V. Glazko, V. G. Liston, Y. I. Pavlov, L. Aravind, Z. Pancer. 2007. Evolution and diversification of lamprey antigen receptors: evidence for involvement of an AID-APOBEC family cytosine deaminase. Nat. Immunol. 8: 647–656.
89. Hoffmann, F. G., J. C. Opazo, J. F. Storz. 2012. Whole-genome duplications spurred the functional diversification of the globin gene superfamily in vertebrates. Mol. Biol. Evol. 29: 303–312.
90. Flajnik, M. F. 2018. A cold-blooded view of adaptive immunity. Nat. Rev. Immunol. 18: 438–453.
91. Hsu, E. 2011. The invention of lymphocytes. Curr. Opin. Immunol. 23: 156–162.
92. Collette, Y., A. Gilles, P. Pontarotti, D. Olive. 2003. A co-evolution perspective of the TNFSF and TNFRSF families in the immune system. Trends Immunol. 24: 387–394.

The authors have no financial conflicts of interest.
