The Wu and Kabat (1) paper in 1970 was a classic that gave us our first real view of Ab diversity. It assembled most of the amino acid sequence data then available from Ig and Ab light chains, established metrics for mapping variability across these polypeptide chains, analyzed the structural constraints imposed by this variability, and then speculated about a possible mechanism for generating Ab diversity based on the patterns of variability. Let me provide a personal view of the historical context of this paper.
The first year of medical school at Hopkins, 1960, was fascinating for me, as I started to learn the human biology that was to provide a framework for the rest of my career. What I found most intriguing were the theories of how the diversity of Abs was generated. In 1897, Paul Ehrlich suggested that each Ab-producing cell made all Abs and expressed them on the cell surface, and Ag pulled complementary Abs off the membrane and stimulated their synthesis. The first theory of Ab diversity was therefore a selectionistic theory in which Ag triggered the synthesis of preformed Abs. In the 1930s and 1940s, Linus Pauling and Felix Haurowitz argued that the Ab protein was a template molecule that could be folded into many different shapes by the interaction with Ag and that once this template-directed folding occurred, the Ab shape and hence specificity was retained. This was termed an instructionistic theory, where Ag served to instruct Ab folding. In the 1950s and 1960s, Macfarlane Burnet and Joshua Lederberg argued that each Ab-producing cell had the capacity to synthesize a single molecular species of Ab and that the Ag played the role of a mitogen to trigger the proliferation of cells producing Abs complementary to the Ag. This theory, the clonal selection hypothesis, proposed that Ag induced both the clonal expansion of functional Ab-producing cells and the creation of an expanded population of Ab memory cells that could rapidly expand upon future exposure to Ag. It thus also explained the long-mysterious process of immunological memory.
Several observations in the 1960s made the selectionistic theories attractive to me. First, Igs or Abs were extremely chemically heterogeneous, and it was not obvious why this would be so under the instructionistic hypothesis. Second, Christian Anfinsen and his colleagues had just begun to demonstrate that the primary protein sequence of a polypeptide directs its three-dimensional folding. If sequence-determined folding was a generality, then the template theory of Ag-directed folding of Abs was incorrect. In thinking about selectionistic theories in the 1960s, many of us concluded that there were two fundamental mechanisms for generating Ab polypeptide diversity: either it could be encoded by many different germline genes, the germline theory, or it could arise from a few genes by somatic diversification mechanisms, the somatic theory. Several further features of Abs became clear in the early 1960s. First, there were two types of Ab polypeptide chains: light and heavy. Second, there were many different classes of Abs with distinct general effector functions encoded by different types of heavy chain, and light chains fell into two distinct types. Third, in 1965 Bill Dreyer and Claude Bennett (2) proposed that Ab polypeptides were encoded by two different genes, variable and constant, that were joined during development. Hence, the question of Ab diversity focused on the mechanism(s) for generating variable region diversity.
In medical school, I (and others) became intrigued by the possibility that an analysis of the diversity present in Ab polypeptides would reflect the nature of the mechanism used to generate the diversity. At this time only protein sequences, not gene sequences, could be determined. Hence, I was excited by the prospect of analyzing the homogeneous Bence-Jones proteins (myeloma light chains) purified from the urine or myeloma proteins (complete Igs) isolated from the blood of mice in which tumors could be chemically induced, as well as from humans with spontaneously arising multiple myeloma, a cancer of Ab-producing cells. This is why I went to Caltech to work for my Ph.D. degree with Bill Dreyer in 1963. Later, it became possible to purify and determine the amino acid sequences of purified specific Abs. Many different groups began sequencing Igs and Abs in the mid and late 1960s, focusing initially on the smaller light chains.
Wu and Kabat in 1970 assembled for the first time all of the existing amino acid sequences of Ig chains, analyzed the variability of 77 light chains and a few heavy chains by alignment of their similar sequence features, and then examined three aspects of this variability and proposed a specific somatic theory of Ab diversity. First, Wu and Kabat commented perceptively on the structural roles various conserved amino acids played, such as several totally conserved glycines. Second, they also noted that a number of conserved light chain residues in mouse were different from human and suggested that this argued against a germline theory of Ab diversity, for how could multiple genes evolve to have species-specific conserved residues? (We later went on to discuss how, apart from functional selection, multigene families could indeed evolve in a residue-conserved, species-specific manner through gene conversion or gene expansion and contraction (3).) Third, they created a metric of variability at each amino acid residue position (variability = the number of different amino acids at a given position divided by the frequency of the most common amino acid at that position) and plotted this variability across the 107 residues of the light chain variable region. This diversity figure demonstrated two stretches of extreme variability also containing all of the sequence gaps, residues 24–34 and 89–97, and one stretch of modest variability, residues 50–56. Because a disulfide bridge joined the two extreme hypervariable regions, it was postulated that these regions were joined in three-dimensional space and constituted the Ag-binding or complementarity-determining site or region (CDR) of the Ab light chain. This was later verified by three-dimensional analyses of Ab protein crystals that suggested residues 50–56 also participated in this site. This mode of analysis was later applied successfully to the heavy chain and to the TCR variable regions with similar results. Immunologists used this pioneering concept of CDRs and this diversity metric thereafter.
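For readers who want to see the variability metric at work, here is a minimal sketch in Python. It follows the definition given above: the number of different amino acids at a position divided by the frequency of the most common one. The function names, the toy alignment, and the choice to ignore gap characters ("-") are my own illustrative assumptions, not details taken from the 1970 paper.

```python
from collections import Counter


def wu_kabat_variability(column):
    """Variability at one aligned position.

    `column` holds the residues seen at that position, one per sequence.
    Assumption for this sketch: '-' marks an alignment gap and is ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0  # nothing observed at this position
    counts = Counter(residues)
    # Frequency of the most common amino acid, as a fraction of sequences.
    top_frequency = counts.most_common(1)[0][1] / len(residues)
    # Number of different amino acids divided by that frequency.
    return len(counts) / top_frequency


def variability_profile(aligned_sequences):
    """Variability at every position of a set of equal-length aligned sequences."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*...) turns rows (sequences) into columns (positions).
    return [wu_kabat_variability(col) for col in zip(*aligned_sequences)]


if __name__ == "__main__":
    # Hypothetical five-residue fragments, invented for illustration only.
    toy_alignment = ["DIQMT", "DIVMT", "EIVLT", "DIQLS"]
    for position, value in enumerate(variability_profile(toy_alignment), start=1):
        print(position, round(value, 2))
```

An invariant position scores 1, and a position at which all 20 amino acids occur equally often scores 400, so the hypervariable stretches stand out as peaks when the profile is plotted against residue number.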
Wu and Kabat also suggested that the CDRs were encoded by episomes, small stretches of DNA sequence. They suggested that a large repertoire of episomes existed for each CDR and that different episomes could be joined into a few germline genes in a combinatorial manner, thus creating a somatic mechanism for generating Ab diversity. In fact, these suppositions turned out to be incorrect in the broader context of how Ab diversity was generated. However, what was fascinating about the various theories of generating Ab diversity was that each of them turned out to be somewhat correct. There were indeed multiple germline genes in most species: tens to hundreds of variable genes for the various Ig gene families. Recombination did play a major role in generating diversity at the third CDR because much diversity was generated in the process of recombinatorially joining the various gene segments (variable and joining for the light chain, and variable, joining, and diversity for the heavy chain). Finally, somatic mutation did play an important role after antigenic stimulation in generating variable gene (region) diversity that could be selected by Ag (the affinity maturation of the immune response). Therefore, in a real sense, most everyone was right about the mechanisms of Ab diversity. Furthermore, nature demonstrated the power of Darwinian evolution (diversification followed by selection) in the somatic differentiation of Ab-producing cells. Moreover, it was Wu and Kabat who articulated so clearly back in 1970 our initial glimpse of the dimensions of the fascinating process of Ab diversification and its implications for the generation of variability, structure-function relationships, and even evolutionary considerations.