Abstract
Gnathostome adaptive immunity is defined by the Ag receptors, Igs and TCRs, and the MHC. Cartilaginous fish are the oldest vertebrates with these adaptive hallmarks. We and others have unearthed nonrearranging Ag receptor-like genes in several vertebrates, some of which are encoded in the MHC or in MHC paralogous regions. One of these genes, named UrIg, was detected in the class III region of the shark MHC and encodes a protein with typical V and C domains such as those found in conventional Igs and TCRs. As no transmembrane region was detected in gene models or cDNAs, the protein does not appear to act as a receptor. Unlike some other shark Ig genes, the UrIg V region shows no evidence of RAG-mediated rearrangement, and thus it is likely related to other V genes that predated the invasion of the RAG transposon. The UrIg gene is present in all elasmobranchs and evolves conservatively, unlike Igs and TCRs. Also, unlike Ig/TCR, the gene is not expressed in secondary lymphoid tissues, but mainly in the liver. Recombinant forms of the molecule form disulfide-linked homodimers, which is the form also detected in many shark tissues by Western blotting. mAbs specific for UrIg identify the protein in the extracellular matrix of several shark tissues by immunohistochemistry. We propose that UrIg is related to the V gene invaded by the RAG transposon, consistent with the speculation of emergence of Ig/TCR within the MHC or proto-MHC.
This article is featured in Top Reads, p. 905
Introduction
Immunoglobulins and TCRs are the Ag receptors that coordinate adaptive immunity in jawed vertebrates (gnathostomes) (1). Ig/TCR genes are generated by RAG-dependent gene rearrangement events during lymphocyte ontogeny. Igs and most γ/δ TCRs recognize free Ag whereas α/β TCRs recognize peptide or lipid Ags bound in the grooves of MHC class I and class II proteins (2). It is generally agreed that Igs and TCRs were generated from a common ancestral Ag receptor (3–5), but speculation abounds on the nature of such a common ancestor because all living gnathostomes have all three types of Ig superfamily (IgSF)–based Ag receptors, and jawless fish and invertebrates lack such receptors.
Cartilaginous fishes are in the oldest vertebrate group with rearranging Ag receptors and the MHC (6). Other hallmarks of adaptive immunity such as primary (thymus) and secondary (spleen) lymphoid tissues, activation-induced cytidine deaminase (AID)–driven somatic hypermutation, and diverse cytokine and chemokine networks, among others, also arose in cartilaginous fish. The only glaring omissions of shark adaptive immunity are class-switch recombination (classically arose in amphibians, although sharks also have a form of switch) (7, 8), germinal centers (arose in reptiles/birds) (9), and lymph nodes (arose in mammals) (10). However, sharks have some primordial features of their adaptive immune system that were lost in most higher vertebrates (and all primates) such as the cluster (split) organization of Ig H chain genes (11), linkage of β2-microglobulin to the MHC (12), somatic hypermutation of TCRα genes during thymic development (13), “germline joining” of V, D, and J segments that can generate functional Ig genes in the genome (14–18), single V domain–containing H chains that make dimers but do not associate with L chains (15, 19, 20), and usage of IgH V regions in TCRs (21, 22). Such ancestral features highlight the shark as an attractive model for immune studies.
Ig/TCR genes arose by an invasion of an IgSF exon by the RAG transposon early in gnathostome evolution (23–25). The IgSF domain used by Ig/TCR is a so-called “VJ domain” in which the last (G) strand of the molecule bears a Gly-X-Gly motif in the N-terminal part of the strand involved in dimerization and a less conserved V(l)TVT motif in the C-terminal part (26, 27). There are not many genes besides Ig/TCR with this motif in the genome, with most of them having probably derived from an ancestral, nonrearranging VJ domain. A recent review compiled all of the reported animal VJ domains, most of which are involved in immunity (27). In addition, the C IgSF domains used by Ig/TCR are likewise special, so-called C1 domains that are the most compact IgSF domains also with distinguishing motifs: FYP in the B strand and C-V-H in the F strand (26, 28). Besides Ig/TCR, C1 domains are found in MHC class I, MHC class II, β2-microglobulin, and a few other molecules involved in immunity (29). Interestingly, several VJ and C1 domain genes are encoded in the MHC or in MHC paralogous regions, prompting the idea from several groups (including ours) that Ig/TCR and class I/II arose from a proto-MHC region (28, 30–32).
While examining the shark MHC we uncovered a nonrearranging Ig-like gene in the class III region that converges on mammalian IgG, near where other VJ exons are found (33). This was an unexpected result and not one based on hypothesis testing, but nevertheless is consistent with an MHC origin of Ag receptors. In this study, we report the basic features of the gene and molecule, which likely serves an innate immune or structural role in sharks. We speculate on this molecule’s evolutionary significance in relationship to Ag receptor emergence.
Materials and Methods
Animals
Wild-caught nurse sharks (Ginglymostoma cirratum) were maintained in artificial seawater at ∼28°C in indoor tanks at the Institute of Marine and Environmental Technology (Baltimore, MD). Animals were anesthetized with MS222 (0.1%) before blood was drawn from the caudal vein with 1000 U/ml heparin reconstituted in shark-modified PBS (SPBS); samples were then spun at 300 × g for 10 min to isolate blood plasma and buffy coats. Animals were euthanized according to protocol, and major organs were harvested in SPBS. All procedures were conducted in accordance with University of Maryland School of Medicine Institutional Animal Care and Use Committee protocols.
Database searches
While searching for MHC genes in databases from various cartilaginous fish genomes at the National Center for Biotechnology Information Web site (https://www.ncbi.nlm.nih.gov), we discovered the UrIg gene. The presence of UrIg was further confirmed by BLASTp and tblastn searches against various vertebrate species including the nurse shark transcriptomics database.
Phylogenetic tree analysis
The UrIg V and C IgSF domains were separately aligned against different sets of IgSF-containing genes (e.g., Igs, TCRs) based on the predicted evolutionary origin and domain similarities using ClustalW. Phylogenetic trees were then constructed using the bootstrapping neighbor-joining method (34) with 500 runs.
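The resampling idea behind bootstrap support can be illustrated with a minimal Python sketch. It is not the analysis pipeline used here: the toy sequences, the function names, and the simple "closer than" comparison that stands in for neighbor-joining tree construction are all assumptions made for illustration. Only the number of runs (500) is taken from the text.

```python
import random

def resample_columns(alignment, rng):
    """Draw alignment columns with replacement to build one bootstrap replicate."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}

def mismatches(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def support_for_pair(alignment, pair, outsider, n_runs=500, seed=0):
    """Fraction of replicates in which the pair is closer to each other
    than the first member is to the outsider (a stand-in for tree building)."""
    rng = random.Random(seed)
    first, second = pair
    hits = 0
    for _ in range(n_runs):
        rep = resample_columns(alignment, rng)
        if mismatches(rep[first], rep[second]) < mismatches(rep[first], rep[outsider]):
            hits += 1
    return hits / n_runs

# Hypothetical toy alignment (not UrIg data).
TOY = {"seqA": "ACDEFGHIKL", "seqB": "ACDEFGHIKV", "seqC": "TCDAYGHLRV"}
support = support_for_pair(TOY, ("seqA", "seqB"), "seqC")
print(f"bootstrap support for seqA+seqB: {support:.2f}")
```

In a real analysis each replicate would be passed to a neighbor-joining routine, and support for a node would be the fraction of replicate trees that contain it.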
RT-PCR
First-strand cDNA was made from 500 ng of total RNA from various nurse shark “NJ” tissue using a SuperScript IV first-strand synthesis kit (Invitrogen) following the manufacturer’s protocol. RT-PCR was performed using GoTaq master mix (Promega) with a 2-min hot start at 95°C, followed by 35 cycles of denaturation (at 95°C) for 45 s, annealing (see below for each gene) for 45 s, and extension (at 72°C) for 60 s, and then a 5-min final extension at 72°C. The primers used to examine the gene expression were as follows: UrIg (C1–C3 IgSF domains) 5′-CCGGAAGAACATCTCGCTGCT-3′ and 5′-CCGGAAGAACATCTCGCTGCT-3′ (at 58°C annealing temperature); nucleoside-diphosphate kinase (NDPK; control) 5′-AACAAGGAACGAACCTTC-3′ and 5′-TCACTCATAGATCCAGTC-3′ (at 50°C annealing temperature). The PCR amplicons were visualized on 1% agarose gel.
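As a planning aid, the programmed hold times of the cycling protocol above can be tallied with a short Python sketch. It is illustrative only: the variable names are our own, and ramp times between steps are ignored because they depend on the thermocycler.

```python
# Hold times taken from the RT-PCR program described above (seconds).
HOT_START_S = 2 * 60
CYCLES = 35
DENATURE_S = 45
ANNEAL_S = 45
EXTEND_S = 60
FINAL_EXTENSION_S = 5 * 60

def total_hold_time_s():
    """Sum of programmed holds; instrument ramp times are not included."""
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return HOT_START_S + CYCLES * per_cycle + FINAL_EXTENSION_S

total_s = total_hold_time_s()
total_min = total_s / 60
print(f"{total_s} s ({total_min:.1f} min) of programmed holds")  # 5670 s (94.5 min)
```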
UrIg recombinant expression and purification
The four-domain UrIg molecule (V-C1-C2-C3) was cloned into the phCMV3 vector for mammalian expression, with an N-terminal murine Igκ signal peptide and C-terminal His tag. Protein was produced by transient transfection of expi293F cells (Thermo Fisher Scientific) following the manufacturer’s standard protocols. Protein was purified from the medium by affinity chromatography with Ni Sepharose excel histidine-tagged protein purification resin (Cytiva). After binding, resin was washed with 20 mM Tris (pH 7.5) and then with 500 mM imidazole, 20 mM Tris, 500 mM NaCl (pH 7.5) for elution. Protein was then dialyzed overnight at 4°C against 1× PBS/1 mM EDTA. After dialysis, NDSB-201 (MilliporeSigma) was added to a final concentration of 200 mM and protein was concentrated to ∼1 mg/ml before size-exclusion chromatography on a Superdex 200 16/60 column (Cytiva) with running buffer Dulbecco’s PBS. Peak fractions were pooled and concentrated, and samples were examined by SDS-PAGE and used for immunization of mice according to Institutional Animal Care and Use Committee protocols.
mAb production
mAbs were produced as previously described (35). Briefly, mice were immunized s.c. with 50 μg of UrIg protein emulsified in IFA and boosted 3 wk later with 50 μg in IFA. Three days later, splenocytes were fused with myeloma line X63. Two weeks later, hybridomas were tested for reactivity by ELISA on 96-well plates coated with recombinant UrIg (1 μg/ml). Positive clones were expanded, and supernatants were used for immunohistochemistry and Western blotting.
Cloning and production of trimeric collagen adhesin 35
Collagen adhesin (CNA)35 synthetic DNA with optimized codons for Escherichia coli (Integrated DNA Technologies) was cloned in-frame with the human collagen XVIII trimerization domain sequence (36) into the pET23-His-Trx-thr vector using BamHI and EcoRI sites (37). The resultant construct encoded a thrombin-cleavable, His-tagged thioredoxin domain added to the N terminus of the collagen XVIII trimerization domain (18TD) fused with CNA35. The plasmid sequence was verified by Sanger sequencing.
The protein was expressed in the BL21(DE3) strain of E. coli using 1 mM IPTG induction for 3 h at 37°C. Cells were collected by centrifugation and disrupted by ultrasonication in 20 mM Tris-HCl (pH 8) on ice. Insoluble material was removed by centrifugation at 10,000 × g at 4°C for 30 min and supernatant was adjusted to include 50 mM sodium phosphate (pH 8), 150 mM NaCl, and 5 mM imidazole. The His-tagged chimera protein was affinity purified using a Ni-NTA column (Qiagen) by elution using the same buffer supplemented with increasing imidazole concentrations. The fractions with the protein of interest were pooled and dialyzed against 50 mM Tris-HCl, 100 mM NaCl (pH 8.3), after which 10 mM CaCl2 and 0.01% sodium azide were added to the dialyzed protein sample to facilitate thrombin cleavage and prevent bacterial growth, respectively. Thrombin (4 U/ml) was added every week for a total of 4 wk at room temperature to complete the cleavage. His-tagged thioredoxin was separated from (18TD-CNA35)3 using a Ni-NTA column (Qiagen), but this time the trimeric CNA35 was found in the flowthrough. The (18TD-CNA35)3 was dialyzed against 20 mM Tris-HCl (pH 8), loaded onto a Q-Sepharose column (Cytiva), and eluted at ∼10–20 mM NaCl gradient. The fractions were pooled and subjected to size-exclusion chromatography on a Superdex S200 Increase column (Cytiva) equilibrated with the coupling buffer (100 mM sodium carbonate [pH 8.8]).
Conjugation with Alexa Fluor 488 N-hydroxysuccinimide ester (Thermo Fisher Scientific) was performed at 2 mg/ml protein concentration at 4°C overnight using 1 mg of dye/10 mg of protein. The labeled protein was separated from unreacted dye by desalting into PBS buffer. The stock solution was adjusted to 1 mg/ml. The working dilution for staining was 1:1000.
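The labeling ratio and working dilution above reduce to simple arithmetic, sketched here in Python. The sketch assumes that "1:1000" means one part stock in 1000 parts total volume, and the function names are illustrative rather than part of any published protocol.

```python
# Values quoted in the conjugation protocol above.
STOCK_MG_PER_ML = 1.0          # labeled-protein stock
DILUTION_FACTOR = 1000         # assumption: 1 part stock in 1000 parts total
PROTEIN_MG_PER_ML = 2.0        # protein concentration during labeling
DYE_MG_PER_MG_PROTEIN = 1 / 10 # 1 mg of dye per 10 mg of protein

def working_conc_ug_per_ml(stock_mg_per_ml, dilution_factor):
    """Concentration after dilution, converted from mg/ml to ug/ml."""
    return stock_mg_per_ml * 1000 / dilution_factor

def dye_needed_mg(reaction_volume_ml):
    """Dye mass required for a labeling reaction of the given volume."""
    return reaction_volume_ml * PROTEIN_MG_PER_ML * DYE_MG_PER_MG_PROTEIN

working = working_conc_ug_per_ml(STOCK_MG_PER_ML, DILUTION_FACTOR)
dye_for_5_ml = dye_needed_mg(5.0)
print(f"working concentration: {working} ug/ml; dye for 5 ml: {dye_for_5_ml} mg")
```

Under these assumptions the staining reagent is used at about 1 μg/ml, and a 5-ml labeling reaction (10 mg of protein) takes 1 mg of dye.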
Immunohistochemistry
Fresh-frozen nurse shark tissues in Tissue-Tek OCT compound (Sakura) were cryosectioned at 7-μm thickness and briefly fixed in cold acetone. Slides could be stored at −80°C for several days. Slides were thawed, rehydrated in PBS for 5 min, and made permeable in PBS supplemented with 0.05% Tween 20 (PBST). Slides were blocked with 10% FCS in PBS for 45 min at 4°C, then incubated with mAb supernatant for 5 min at 4°C. After washing with PBST, slides were incubated with the secondary Ab, goat anti-mouse IgG–Alexa Fluor 488, for 45 min at 4°C, washed, and mounted with ProLong gold with DAPI mounting media. Images were taken with a Nikon Eclipse E800 and analyzed with Mosaic software.
For double immunofluorescence staining with CNA and mAb, kidney sections (7 μm) were mounted on glass slides, air dried for 15–30 min at room temperature, and fixed in acetone at −20°C for 10 min. Sections were washed for 5 min with 50 mM Tris/150 mM NaCl/0.1% Tween 20 buffer (TBST). Blocking was performed with 10% goat serum for 1 h (Invitrogen, 50062Z) followed by incubation with primary anti-UrIg mAb O-1 (1:500 dilution in 1% goat serum TBST, Fig. 7) overnight in a cold room. Following three 15-min washes with TBST, secondary Ab conjugated with Alexa Fluor 568 and CNA conjugated with Alexa Fluor 488 were incubated on sections for 1.5 h at room temperature. Sections were then washed and mounted with antifade mounting solution with DAPI. Images were taken with a Nikon Eclipse Ti microscope and analyzed with GIMP software.
Western blotting
Nurse shark tissues were dissected from animals in SPBS and minced into small pieces. Approximately 500 mg of tissue was dissociated with the frosted ends of microscope slides in 5 ml of 2% Nonidet P-40/PBS lysis buffer containing protease inhibitors. Lysates were kept on ice for 30 min and then were cleared of nucleic acid and debris by centrifugation at 600 × g for 15 min. Then, 30 μl of lysates was mixed with an equal volume of Laemmli sample buffer and subjected to SDS-PAGE. Gels were transferred to polyvinylidene difluoride as described (38) and cut into strips for incubation with 500 μl of mAb supernatant or antisera at 1:1000. Bands were revealed with a Vectastain ELITE ABC kit (Novus Biologicals) using precipitable diaminobenzidine as substrate.
UrIg modeling
AlphaFold-Multimer (39) was used to prepare a structural model for the full UrIg sequence. The relaxed predicted structure with highest confidence was then submitted to the GLYCAM-Web server (40) for addition of carbohydrate for N-linked glycan sites. Any modeled glycans with steric clashes were then adjusted manually to avoid clashing using Coot (41).
Results
UrIg is an MHC-linked IgH gene
Cartilaginous fish databases were scanned for MHC-linked genes in the National Center for Biotechnology Information Web site. One gene in the MHC class III region of catshark (Scyliorhinus canicula) was annotated as a V region of the Igκ L chain (LOC119976294), but upon inspection it had a VJ exon followed by a C1 exon. The gene is encoded near the cluster of TNF genes, not far from other MHC-linked genes encoding VJ domains, such as the NK receptor NCR3 (33). Using this gene fragment, we searched other shark genomic and transcriptomic databases and found the ortholog in various shark genomes as well as full-length transcriptomes (GIWU01140354 and GIWU01140356) from a nurse shark liver database. In most cartilaginous fish species, the gene is composed of a leader segment, one VJ exon, followed by three C domain exons (Fig. 1A–D). Upon examination of genomic sequences including nurse shark (position 341176–354343 in JAHRHZ010000005 (Supplemental Fig. 1)), we found that the leader exon is “split” as is found for all Ig/TCR V genes (a short leader exon followed by an intron and then a short leader segment 5′ of the VJ exon, Fig. 1A), and each IgSF domain is encoded by a separate exon (Fig. 1B–D, Supplemental Fig. 1). No transmembrane exons were found in the genome, nor were any transcripts found with a transmembrane region. Furthermore, there is no secretory tail at the end of the C3 exon and thus no identifiable cryptic splice site that would allow for alternative splicing, as is found for all vertebrate H chain mRNA (42). Besides the VC1C2C3 mRNA transcript, there is a shorter transcript in which the C2 domain exon is spliced out (Supplemental Fig. 2). In all species in which larger genomic contigs were available, the gene mapped to the class III region of MHC (Fig. 1E).
The VJ domain bears the classic YYC in the F strand and GX(C)G and LTVK motifs in the G strand (Fig. 1A). Residues that interact between VH and VL are not well conserved in the UrIg V domain, suggesting that the V-UrIg may not form a closed dimer (bold residues below the alignment and in the IgM/IgW sequences in Fig. 1A). All three C domains are card-carrying C1 IgSF domains bearing the cardinal FYP and C-V-H motifs described in the Introduction (Fig. 1B). The overall four-domain structure suggests that this molecule is a type of Ig and therefore we named it UrIg (Ig-original). Unlike Ig/TCR, there is no cysteine in the A strand of the first C domain to make a disulfide bond with an L chain cysteine (also true of IgNAR [43]), so if the molecule associates with L chains it must occur via noncovalent bonding. There is only one free cysteine in the entire molecule, found in the VJ domain GXG motif in the G strand, which may form a disulfide bond between UrIg H chains (Fig. 1A, Cys-113 in red). The molecule has six potential N-linked glycosylation sites (underlined in Fig. 1A, 1B). From these sequence analyses we predicted that UrIg forms a disulfide-linked homodimer, perhaps with free V domains as for IgNAR and camelid IgG (discussed below) (44). This speculation is tested below with recombinant UrIg and with UrIg found in nurse shark tissues.
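Potential N-linked glycosylation sites are conventionally recognized by the sequon N-X-S/T, where X is any residue except proline. A minimal Python sketch of such a scan follows. The text does not state which tool was used to identify the sites, so this is only an illustration, and the toy sequence and function name are hypothetical (not UrIg).

```python
import re

# N-X-S/T sequon with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein_seq):
    """Return 1-based positions of the Asn in each N-X-S/T (X != P) sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]

# Hypothetical toy sequence: sequons at N3 (NVS), N12 (NGT), and N17 (NAS);
# N8 is followed by proline and is therefore skipped.
toy_seq = "MKNVSAANPTQNGTLLNAS"
sites = find_sequons(toy_seq)
print(sites)  # [3, 12, 17]
```

A sequon marks only a potential site; whether it is occupied must be determined experimentally.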
The entire UrIg protein sequence is only 30% identical to other shark IgH genes such as IgM (11, 14), IgNAR (19), and IgW (45–47), suggesting that UrIg diverged from other Ag receptors long ago. Domain-by-domain bioinformatics (e.g., BLAST) searches showed that the VJ domain is equally related to Vs of IgL chains, TCRs, and IgH chains with no clear orthology, which is confirmed by the phylogenetic tree (Fig. 2A). The UrIg VJ domain is more like Ig/TCR V domains than the other VJ gene encoded in the class III region of the MHC, NKp30 (encoding gene: NCR3), which forms the root of the VJ tree (33, 48).
Phylogenetic trees of the C domains (Fig. 2B) showed that the UrIg C1 domain weakly clusters with (and may be ancestral to) the IgM C1 domains of nurse shark and the holocephalan Hydrolagus as well as the nurse shark IgW C1 domain. Interestingly, the UrIg C2 domain clustered weakly with (and may also be ancestral to) the C-terminal domains of IgM (C4), IgNAR (C5), and IgW (C6). The UrIg C3 domain was not closely related to any IgH/IgL/TCR domain cluster. In summary, although the domains of UrIg are clearly VJ and C1 domains similar to those seen in Ag receptors, there is no high relatedness to any known Ig or TCR. Furthermore, although the bootstrap values are very low, a weak case can be made that UrIg domains are older than all other cartilaginous fish (and thus all other vertebrate) Ig/TCR domains.
We briefly point out other features of the C domain tree that were noted previously and are now reinforced by the addition of the UrIg domains. The relationships between the IgNAR, IgW, and IgM C domains (45, 47), as well as duplication by convergent evolution of the C1 domains from C2 domains in IgNAR (19) and the H chain–only class of Hydrolagus IgM (15, 20), are supported by this new tree.
UrIg is found in all elasmobranchs and evolves conservatively
Nurse shark UrIg was used as bait to search for the gene in all other cartilaginous fish databases and was detected in all elasmobranchs (sharks and rays) (Fig. 3). The alignment demonstrated that UrIg is highly conserved in all species (60–97% overall; 70–95% VJ exon), reaching back at least 300 million years. Note that the VJ domain is more conserved than the C domains, and that “CDR3” has been particularly well conserved during 300 million years of evolution. Such conservation suggests that UrIg is under strong negative selection, perhaps for binding to an evolutionarily conserved epitope. These characteristics are clearly unlike bona fide Igs and even other IgSF molecules and suggest that UrIg serves an innate immune or even a nonimmune function in sharks.
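Percent identity between two aligned sequences can be computed as sketched below in Python. The denominator convention varies between tools and is not specified in the text. This sketch assumes that columns with a gap in either sequence are excluded, and the toy sequences are hypothetical rather than UrIg data.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over aligned columns with no gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not compared:
        return 0.0
    identical = sum(x == y for x, y in compared)
    return 100.0 * identical / len(compared)

# Hypothetical aligned toy sequences.
ref = "ACDEFGHIKV"
close = "ACDEFGHIKL"    # one mismatch in ten columns
distant = "ACD-YGHLKV"  # one gap column, two mismatches in nine compared columns

print(percent_identity(ref, close))              # 90.0
print(round(percent_identity(ref, distant), 1))  # 77.8
```

Applying a function of this kind separately to the VJ columns and to the full alignment gives per-domain and overall figures of the sort quoted above.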
UrIg is expressed in the liver and not primary or secondary lymphoid tissues
Igs are expressed at high levels in the spleen and epigonal of cartilaginous fish, and at lower levels in the gut, liver, and kidney (49, 50). TCRs are also expressed highly in the spleen and thymus, as in all other vertebrates. In contrast, UrIg is expressed at high levels in the liver and somewhat lower levels in the kidney, with no or little mRNA expression in other tissues, including immune organs (Fig. 4). It is not known whether the gene is expressed in hepatocytes or in infiltrating liver leukocytes.
Modeling shows UrIg to converge on mammalian IgG
Bona fide Igs are dimers of IgH/IgL dimers. The H chains are disulfide bonded to each other and to L chains. There is, however, only one free cysteine in UrIg, in the G strand of the VJ domain (Fig. 1A, Cys-113), suggesting that H chain dimers form but there is no disulfide-bonded L chain. The relaxed AlphaFold-predicted structure of UrIg with the highest confidence placed the V-V domains so that a disulfide bond would form between Cys residues 113 and 113′ (marked red in Fig. 5A). In contrast, if the V-V dimer domains were modeled as a standard Fab or L chain dimer of V domains, the free Cys would be on the distal sides of that dimer, so it is unlikely that the V-V will pair like a typical Fab. The AlphaFold model of the C2–C3 region predicts an Fc-like (CH2–CH3) arrangement of the domains, and the N-linked glycan site at position 291 in UrIg (penultimate sugar in Fig. 1C) is very close to that of position 297 in a human IgG Fc, where the equivalent glycan is attached. In mammals this glycan is crucial for IgG effector functions (51). Thus, we predict that the UrIg C2–C3 domains will pair similar to mammalian IgG C2–C3. The other glycans (underlined in Fig. 1A, 1B) are modeled to be exposed to the solvent, most conspicuously at the unique hinge region between the VJ and C1 domain; this sugar likely protects UrIg from protease attack (note that UrIg has a hinge-like region at the N terminus of the C1 domain [APSVSPPL]).
The V domain was modeled as well, with amino acid side chains on the CDR (Fig. 5B). Note the large number of aromatic residues in the loops that are modeled to interact (two Phes in CDR3, Trp in CDR2, three Tyrs). We emphasize that this is a model and we are not certain whether the CDR3 points up or folds over and perhaps buries those aromatic residues.
Recombinant UrIg forms a secreted disulfide-linked dimer
The UrIg four-domain H chain, with a C-terminal His-tag, was produced in mammalian expi293F cells. That UrIg was secreted readily into the supernatant suggested that it folded correctly and likely does not have to associate with another partner, for example, Ig L chains. The calculated protein molecular mass for this construct was 50,730 g/mol. There are six potential N-linked glycosylation sites per monomer (one in V, three in C1, two in C2, Fig. 1A–C), suggesting an approximate molecular mass increase between 9,000 and 12,500, depending on the occupancy and type of carbohydrate at each site. Thus, we expected a monomer to have a molecular mass of ∼60,000–63,500 and a dimer of ∼120,000–127,000. As mentioned, there is one Cys in the V domain (residue 113, sequential numbering) that modeling predicts to be surface exposed, and thus the possibility of dimerization via this residue. SDS-PAGE of purified UrIg showed a band slightly larger than 100,000 for the nonreduced sample and between 50,000 and 75,000 for the reduced sample (Fig. 6A). The nonreduced sample eluted in size-exclusion chromatography was close to the molecular mass standard, γ-globulin (∼158,000) (Fig. 6B). Thus, the secreted, recombinant UrIg forms a disulfide-linked dimer.
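The expected-mass reasoning above can be checked with a few lines of Python. The protein mass, the glycan range, and the number of sites are taken from the text; the function and variable names are illustrative, and the per-site figure assumes the glycan mass is spread evenly over all six sites.

```python
PROTEIN_MASS = 50_730               # calculated polypeptide mass of the construct
GLYCAN_MASS_RANGE = (9_000, 12_500) # estimated total increase from N-linked glycans
N_SITES = 6                         # potential N-linked sites per monomer

def with_glycans(protein_mass, glycan_range):
    """Add the low and high glycan estimates to the polypeptide mass."""
    low, high = glycan_range
    return protein_mass + low, protein_mass + high

monomer = with_glycans(PROTEIN_MASS, GLYCAN_MASS_RANGE)     # (59730, 63230)
dimer = tuple(2 * m for m in monomer)                       # (119460, 126460)
per_site = tuple(g / N_SITES for g in GLYCAN_MASS_RANGE)    # average mass per site

print(monomer, dimer, [round(p) for p in per_site])
```

These values agree with the rounded ranges quoted above to within about 1% and correspond to roughly 1,500 to 2,100 per occupied site.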
Natural nurse shark UrIg also forms a dimer and is present in many tissues
The purified recombinant UrIg was used to immunize mice to produce antisera and mAbs. A panel of mAbs was generated and tested on the recombinant protein by ELISA. The antisera and several of the mAbs recognized the recombinant UrIg molecule at ∼120 kDa under nonreducing conditions by Western blotting, and the antisera, but not the mAbs, also recognized the reduced UrIg at 60 kDa (Fig. 7).
Tissue lysates from all nurse shark tissues were prepared and tested for the presence of UrIg. Although mRNA expression is mostly limited to the liver (Fig. 4), we were able to detect UrIg protein in many tissues under nonreducing conditions at 120 kDa (Fig. 8). Interestingly, in the five brain tissues, that is, telencephalon (cerebrum), diencephalon (optic lobes), mesencephalon (midbrain), metencephalon (cerebellum), and myelencephalon (medulla oblongata), UrIg was found predominantly as a monomer.
UrIg is present in the extracellular matrix of many shark tissues
It was a puzzle that UrIg is expressed as mRNA by the liver, but the protein can easily be detected in many tissues by Western blotting (Fig. 8). We tested the exact tissue localization by immunohistochemistry of the spleen and found that the mAbs and antisera stained in a “stick-like” pattern around the splenic white pulp and around the vasculature, as expected of staining of the extracellular matrix (ECM) (Fig. 9A). A mAb specific for IgNAR was used to show the staining of lymphocytes in the white pulp (Fig. 9B). Other tissues that were positive on the Western blot, such as gill (Fig. 9D) and others (not shown), also showed stick-like staining of defined areas. To test whether the mAbs truly stained ECM, we costained kidney sections with an UrIg-specific mAb O-1 (in red, see Western blot for mAb O-1 reactivity in Fig. 7) and a reagent that detects collagen (CNA in green).
Most UrIg staining was observed in the anterior and middle portions of kidneys surrounding tubules (Fig. 9F). No staining was detected in the glomeruli (not shown). Staining was extracellular surrounding tubules and apparently forming rod-like structures with an orientation perpendicular to the tubular cell membrane. To visualize collagen matrix, we developed a trimeric form of a natural collagen binding protein (CNA35) conjugated to a fluorescent dye (see Materials and Methods). CNA35 monomer itself was demonstrated to have much better specific binding to collagen than existing fluorescent techniques currently used for collagen visualization (52, 53). Generation of multivalent CNA35 dendrimers (2–4 mers) by native chemical ligation remarkably enhanced the affinity and attenuated the dissociation kinetics (54). Inspired by these multivalent CNA35 probes, we designed our own system for recombinant production of trimeric CNA35 in bacteria. The CNA35 sequence was genetically fused to a trimerization domain of collagen XVIII (18TD), which has a picomolar trimerization potential (55). The resulting hybrid protein molecule (18TD-CNA35)3 was conjugated to Alexa Fluor 488 and used for costaining of tissues. The UrIg (red) is seen “stitching” layered collagen structures in shark kidneys (Fig. 9F–H). Note that in all tissues the UrIg is found associated with collagen, but in defined areas that require further study.
The data, taken together, suggest that the UrIg expressed by the liver is transported by the blood or by other cells (note that PBLs [WBCs] were Western blot positive; Fig. 8) throughout the body and deposited in the ECM.
Discussion
UrIg converges on IgG and likely predates Ig/TCR emergence
The UrIg sequence suggested that the molecule is an IgH chain that made a disulfide-linked dimer via a unique S-S bond and is not associated with Ig L chains. This prediction was borne out in the production of the recombinant molecule and by Western blotting of shark tissues. An AlphaFold model predicts that the three C1 domains form a structure highly reminiscent of IgG, even at the level of the canonical glycan in the UrIg (and IgGH) C2 domain that provides flexibility to the Fc region and is vital for effector function (51).
As mentioned in the Introduction, cartilaginous fish IgH genes are in the so-called “cluster organization” with multiple V-D-D-J-C genes that undergo rearrangement within a cluster during B cell ontogeny (11, 56). The phenomenon of germline joining is the apparent consequence of RAG activity in germ cells that can initiate rearrangement within these clusters (14–17). Most of the clusters having such germline joining that have been detected result in pseudogenes, but there are two examples of functional IgH and IgL genes generated by this mechanism in the nurse shark, and one example in Hydrolagus. In all cases, it is clear that the process was “recent,” as the joined genes are closely related to non–germline-joined clusters; furthermore, in the two nurse shark joins there are few bases in the CDR3-encoding regions suggestive of little TdT activity in the germline-joined events in germ cells. UrIg, in contrast, is highly conserved in all elasmobranchs, going back 300 million years (Fig. 3). The CDR3 of UrIg is quite long and relatively hydrophobic. Furthermore, the C domains are not closely related to any of the shark Ig C regions of IgM, IgNAR, or IgW, unlike the three germline-joined cartilaginous fish Ig genes (Fig. 2). Thus, although it is possible that there was a germline-joining event prior to the emergence of cartilaginous fish (or elasmobranchs), it is more likely that UrIg emergence preceded the development of rearranging Ig/TCR; this hypothesis is also suggested by the trees, in which the UrIg C domains seem ancestral. Using the same logic, it is unlikely that UrIg resulted from a reverse-transcribed IgH gene (retroposon) that inserted into the genome (also the transcript would have had to retain all of the introns).
While the modeling of the C domains was relatively straightforward, this was not true of the VJ domains. We think it most likely that the Cys-113 in the center of the Gly-X-Gly motif of the G strand, almost unique among VJ domains (27), provides the disulfide bond covalently linking the UrIg chains. However, do the UrIg V domains join together similar to VH/VL to make a conventional binding site or do the two domains bind Ag independently, similar to IgNAR (43)? The conserved amino acid residues known to interact between VH/VL and TCRα/β to facilitate dimer formation are not well conserved in the UrIg VJ, but the AlphaFold model does suggest a V-V association. Further studies of the natural UrIg will allow discrimination of these two possibilities.
What is the function of UrIg?
UrIg is transcribed in the liver yet is found as protein in ECM throughout the shark body (Figs. 8, 9). First, is UrIg produced by hepatocytes or infiltrating hematopoietic cells? We should be able to address this question with in situ hybridization studies in combination with immunohistochemistry using existing mAbs. So far, in situ hybridization has not been sensitive enough to detect the UrIg-expressing cells in the liver, perhaps because of overall low expression by hepatocytes. Second, it will be a challenge to understand how UrIg is transported to the ECM throughout the body, yet with existing reagents we should be able to address this question in the future as well. Lastly, and most importantly, what is the function of UrIg in the ECM? Does it perform an innate immune function such as acting as a secreted pattern recognition receptor, or could it have an ECM supporting function? Based on its MHC linkage in the class III region, we propose that it does have an innate immune function; such a function would also make sense if UrIg (or an UrIg-like molecule) were co-opted to function in the adaptive immune system after the RAG transposon invasion.
UrIg is encoded in the MHC and might be related to the Ig/TCR precursor
Based on the presence of VJ and C1 domains in the MHC and MHC paralogous regions, Du Pasquier et al. (26) suggested that Ig/TCR emerged from the MHC or MHC precursor. Subsequently, other groups, including our own, also provided evidence for such a scenario (27, 30–32). UrIg is encoded in the class III region of the MHC, near the TNF gene cluster (57) and close to other VJ single-exon genes such as NCR3 (Fig. 1E). The MHC class III region is well known for the presence of genes involved in innate immunity and inflammatory responses, and immune genes in the class III region clearly were present in the proto-MHC prior to the emergence of adaptive immunity (28, 58, 59). Thus, we think it plausible that UrIg is related, perhaps most closely, to the VJ gene that was invaded by the RAG transposon. Although UrIg is clearly Ig-like in overall structure, the VJ domain is not more closely related to any one of VH, VL, TCRα, or TCRβ, suggesting that it is either highly derived, or it emerged before the split of Ig and TCR. To date, we have not detected UrIg in agnathans or deuterostomes that arose from ancestors prior to the genome-wide duplications that occurred in vertebrates (30). There are, however, VJ domains present in agnathans and invertebrates, yet little is known about their genetic history or functions (26, 60–62). Studies of these molecules, and further studies of the structure and ligand identification of UrIg, will be crucial in piecing together the origins of adaptive immunity.
Note added in proof
While our article was being revised, another article reported the same shark sequence, Kondo et al. (64).
Disclosures
The authors have no financial conflicts of interest.
Acknowledgments
We thank Erik Cruz for technical support and Marc Elslinger for help with the AlphaFold modeling.
Footnotes
This work was supported by National Institute of Allergy and Infectious Diseases Grants R01AI140326 and R01AI170844.
The online version of this article contains supplemental material.