“Two Genes, One Polypeptide Chain–Fact or Fiction” was the title of a major symposium at an American Association of Immunologists meeting in the early 1970s (1). What is impossible for today’s young immunologists to imagine is that for about 20 years, from the discovery of the “Todd Phenomenon” in about 1963 (the finding that rabbit allotypes, which were thought to be encoded by V regions, were shared by at least two if not three Ig classes) until the publication of the Early et al. paper (2), the notion that two genes, let alone three or four, could encode any polypeptide chain was heretical. Yet, all immunologists believed that two genes encoded a single Ig: a V gene and a C gene.
In the 1970s, a series of papers was published that began to call into doubt the notion that a single gene encoded a variable region. In the book The Antibody Enigma, which Tom Kindt and I published in 1984, we referred to these theories of diversity under the rubric “the Maverick Solutions” (3).
For entirely different reasons, Tom Kindt and I had come to the conclusion that “there has to be more than a single gene per variable region.” Tom argued that because the rabbit allotypes were located in the “N terminal 100 amino acids” of the H chain, and because the C-terminal part of the rabbit H chain had an amino acid sequence that was quite independent of the N-terminal portion, the best way to explain this was “two genes per variable region.” In a similar vein, data my laboratory had collected taught two things. First, the phylogenetically associated residues (those amino acids at particular positions that distinguished one species from another) were confined to the region “up to position 100 of the heavy chain.” Second, we had found Abs derived from unrelated people to have identical amino acid sequences in the region beyond position 100, but very different before position 100. We had therefore suggested that the simplest explanation was that there were two (or three) genes encoding a variable region, and another gene encoding the constant regions. In 1975, Tom Kindt and I published a paper, “Antibody Diversity: Can More than One Gene Encode Each Variable Region?” (4). Fig. 3 of that paper is reproduced below as Fig. 1. Yet, all the evidence was indirect. This was before the era of molecular biology. The direct proof would come a year later, when at the Cold Spring Harbor Antibody meeting, Tonegawa presented his finding that V and C rearranged between embryonic and adult B cells (5). Within 2 years, his lab, Phil Leder’s lab, and others had decisively shown that there were “two genes per variable region” (V and J in L chains). The Early et al. Cell paper extended this to the H chain. Finally, there it was: there was a V, a D, and a J gene segment, and that explained it all. 
The V region gene was the location of the phylogenetically associated residues and the rabbit allotypes, the D segment was the location of the Id and explained the “third hypervariable region of the heavy chain,” and the J segment explained why the amino acid sequence of the major portion of the V region did not correlate with the C-terminal dozen or so amino acids.
The Early et al. paper set the course for the solution to the Ab problem. Suddenly it became clear: with a limited number of V, D, and J gene segments (say 100, 10, and 5), one could get 5,000 different chains (100 × 10 × 5). With the junctional diversity already evident, the 5,000 could become 50,000; and with H and L chain pairing (50,000 × 50,000 = “infinity”), there were more specificities than anyone thought we needed.
It led the way to explaining the power of somatic hypermutation. Somatic hypermutation had to be the explanation for the variation seen in the first and second hypervariable regions.
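The arithmetic in this paragraph can be written out as a short sketch. The segment counts are the illustrative ones given in the text; the tenfold junctional factor is an assumption inferred from the jump from 5,000 to 50,000, and the names are hypothetical. Following the text's simplification, the same 50,000 figure is used for both H and L chains.

```python
# Illustrative segment counts from the text ("say 100, 10, and 5").
N_V, N_D, N_J = 100, 10, 5

# Assumed multiplier: the text's step from 5,000 to 50,000 implies
# roughly a tenfold contribution from junctional diversity.
JUNCTIONAL_FACTOR = 10


def combinatorial_chains(n_v: int, n_d: int, n_j: int) -> int:
    """Number of chains obtainable by free mixing of V, D, and J segments."""
    return n_v * n_d * n_j


chains = combinatorial_chains(N_V, N_D, N_J)   # 100 x 10 x 5
diversified = chains * JUNCTIONAL_FACTOR       # with junctional diversity
paired = diversified * diversified             # H x L pairing, as in the text

print(f"combinatorial chains:   {chains:,}")
print(f"with junctional factor: {diversified:,}")
print(f"H x L pairs:            {paired:,}")
```

On these numbers the “infinity” of the text is about 2.5 billion pairings, a finite figure but far more than the handful of germline segments would suggest.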
In one fell swoop, the Early paper explained many things. It explained the germline origin of rearranged VH genes: V, D, and J could mix and match, and as such, a limited amount of germline DNA could give rise to multiple complete V genes. It described the 5′ to 3′ order of the genes as V, D, and J, with the Cμ region further downstream (this explained the genetic crosses that had found recombination between V and C in certain species). It highlighted that the D segment was even more extraordinary than anyone could have predicted: diversity could be generated at two junctions, i.e., the VD and the DJ. All of these factors, including the 12/23 rule, the prediction of enzymes that would be involved in the recombination events, etc., followed from this one landmark paper.
Additionally, the Early et al. paper had a profound impact on the evolving notion of recombination not only in immunology, but in all of biology. Certainly the 12/23 rule had been discussed in the previous 3 years by the labs studying L chains, but the Early paper, especially with the discovery of the D segment, established that recombination in the immune system was special: directed by conserved heptamer/nonamers with the spacing dictating the orientation and specificity of recombination. It was now evident why V did not recombine with J in the H chain. One vs two turns of the DNA helix helped explain both of these issues. Because the three families of Ig genes had diverged over 500 million years ago, before the origin of vertebrates, the implication was that the mechanism of this form of recombination was ancient. The paper essentially predicted the discovery of the RAG enzymes years later, as well as the myriad of DNA-binding proteins that we now know are involved in this central process of receptor generation.
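The logic of the 12/23 rule, and why it keeps V from joining directly to J in the H chain, can be illustrated with a minimal sketch. The spacer assignments below are the standard textbook ones for the H chain locus and are an assumption here, because the text above does not list them; the table and function names are hypothetical.

```python
# Spacer length (bp) of the recombination signal on each side of a segment.
# Assumed textbook assignment for the H chain locus; not given in the text.
H_CHAIN_SPACERS = {
    ("V", "3'"): 23,  # signal downstream of V
    ("D", "5'"): 12,  # signal upstream of D
    ("D", "3'"): 12,  # signal downstream of D
    ("J", "5'"): 23,  # signal upstream of J
}


def can_join(upstream: str, downstream: str, spacers=H_CHAIN_SPACERS) -> bool:
    """Apply the 12/23 rule: joining requires one 12-bp and one 23-bp spacer."""
    pair = {spacers[(upstream, "3'")], spacers[(downstream, "5'")]}
    return pair == {12, 23}


for up, down in [("V", "D"), ("D", "J"), ("V", "J")]:
    verdict = "allowed" if can_join(up, down) else "forbidden"
    print(f"{up} to {down}: {verdict}")
```

Under these assumed spacers, V to D and D to J each pair a 12 with a 23, whereas V to J would pair two 23-bp spacers, which is the restriction described above.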
Within a year, the 30-year argument, germline vs somatic, was over. To my memory, neither Lee Hood (whose lab had provided this critical paper, in my view the most important contribution Lee Hood has so far made in his illustrious career) nor Mel Cohn nor Marty Weigert spoke of the germline or the somatic mutation theories again!