Somatic mutations are not distributed randomly throughout Ab V region genes. A sequence-specific target bias is revealed by a defined hierarchy of mutability among di- and trinucleotide sequences located within Ig intronic DNA. Here we report that the di- and trinucleotide mutability preference pattern is shared by mouse intronic JH and Jκ clusters and by human VH genes, suggesting that a common mutation mechanism exists for all Ig V genes of both species. Using di- and trinucleotide target preferences, we performed a comprehensive analysis of human and murine germline V genes to predict regional mutabilities. Heavy chain genes of both species exhibit indistinguishable patterns in which complementarity-determining region 1 (CDR1), CDR2, and framework region 3 (FR3) are predicted to be more mutable than FR1 and FR2. This prediction is borne out by empirical mutation data from nonproductively rearranged human VH genes. Analysis of light chain genes in both species also revealed a common, but unexpected, pattern in which FR2 is predicted to be highly mutable. While our analyses of nonfunctional Ig genes accurately predict regional mutation preferences in VH genes, observed relative mutability differences between regions are more extreme than expected. This cannot be readily accounted for by nascent mRNA secondary structure or by a supplemental gene conversion mechanism that might favor nucleotide replacements in CDR. Collectively, our data support the concept of a common mutation mechanism for heavy and light chain genes of mice and humans with regional bias that is qualitatively, but not quantitatively, accounted for by short nucleotide sequence composition.

During T cell-dependent immune responses, Ab V genes of activated B cells are modified by a somatic mutation process that indirectly exerts a strong influence on recruitment of B cells into the memory compartment by producing structural and functional alterations in V region domains of surface receptor Abs. Despite the importance of somatic mutagenesis to immune and autoimmune processes, its mechanism has proved to be conceptually elusive. Somatic mutations occur within variable region gene segments (1, 2, 3, 4, 5, 6), but are selectively absent from exons encoding constant domains (7, 8, 9). Transcription seems to be necessary, but not sufficient, for targeting mutagenesis (10, 11, 12). A link to transcription is supported by the influence of promoter location on the distribution of mutations along a segment of DNA (13). However, mutagenesis is not strictly dependent on Ig-specific promoters or coding sequences, since replacing either does not dramatically alter mutation frequency (13, 14, 15, 16). A correlation between predicted RNA secondary structure and intrinsic mutability was observed in one study of an artificial mutation substrate under the control of Ig κ regulatory elements (17). On the basis of these data, Storb et al. (17) proposed a model in which a mutator travels with the transcription machinery and stalls at points of nascent RNA secondary structure, thereby leading to higher frequencies of mutagenesis in adjacent DNA. κ intron and 3′ enhancers play an essential role in the mutation process as revealed in transgenic studies (15, 18, 19, 20), yet the analogous heavy chain gene enhancers appear to be insufficient for full mutagenesis (21, 22, 23, 24). These differences raise the possibility that different mutation mechanisms may operate on heavy and light chain V region genes.

Somatic mutations are not distributed randomly throughout V region genes. In part this is due to selection pressures that favor alteration of the complementarity-determining region (CDR). However, an intrinsic target bias of the mutation mechanism has been revealed by the nonrandom nature of mutations within sequences that are not influenced by cellular selection processes (8, 25, 26). Lebecque and Gearhart (27) observed that particular types of mutations occurred more frequently than others, leading them to deduce that the mechanism possesses a strand bias. Rogozin and Kolchanov (28) observed a higher than average frequency of mutations in RGYW and TAA nucleotide sequences. Similarly, Betz et al. (29) observed that AGY serine codons in a passenger transgene mutated at an inordinately high frequency. This led them to propose that CDR are more intrinsically mutable than FR because of differences in AGY vs TCN serine codon use by segments encoding the two types of regions (30). However, somatic mutations often occur in nucleotide sequences that are not included in these motifs, and this led us to examine the relative mutability of all di- and trinucleotides. Our analysis was confined to mutations within Ig intronic sequences that presumably are not subject to the indirect, but substantial, influences of selection (31). The results of this work revealed a hierarchy of mutability among all di- and trinucleotide sequences. In addition, the results helped to refine previously proposed motifs, identifying AGC, for example, as the mutable component of the AGY motif. Defining intrinsic mutability provides clues to the mechanism of mutation and is important to interpretations concerning Ag-driven selection in immune and autoimmune processes that are often drawn on the basis of mutational distribution within V genes.

In this manuscript we compare relative intrinsic mutabilities of di- and trinucleotides in heavy and light chains and V genes of mice and humans in search of evidence for common or distinct mutation mechanisms. In addition, we used di- and trinucleotide mutability preferences in a comprehensive analysis of germline V region gene sequences to predict intrinsic regional mutabilities of CDR and FR. We compared these predictions with empirical mutation data from nonproductively rearranged human VH genes to determine whether regional mutability indexes based upon di- and trinucleotide composition alone could predict the pattern and relative extent of mutational accumulation in different segments of Ab V genes. Finally, we examined the relationship between predicted RNA secondary structure and observed regional mutability. Our results support the idea that all Ig genes in mice and humans mutate by a common mechanism and that di- and trinucleotide sequence composition alone can predict regional mutation patterns, but in a nonquantitative manner. The quantitative discrepancy is not obviously resolved by taking predicted nascent secondary RNA structure into consideration.

For regional mutability predictions of human V genes, sequences were extracted from the V BASE index (32) which contains all known human germline heavy and light chain V genes. When sequence discrepancies due to potential mistakes or polymorphisms occurred, the Tomlinson et al. (33) sequence was used. Murine Ig V gene sequences were extracted from the ABG germline directory of mouse Ig sequences, organized by the Ab Group of Instituto de Biotecnologia, Universidad Nacional Autonoma de Mexico (34). In those few cases where identical amino acid sequences were specified by different nucleotide sequences, only one sequence was selected for analysis. Because complete sequence information was not available for all V genes of any strain, the known sequences from all strains were combined in our analyses. Altogether, 47, 40, and 29 human heavy, κ, and λ genes as well as 72 and 42 murine heavy and κ genes were included in the analyses.

Empirical mutation frequencies were determined using somatically mutated, but unselected, Ig sequences derived from several sources. Mutations in murine intronic sequences for the JH and Jκ clusters that flank assembled V gene exons were analyzed using raw data (A/J + literature + autoimmune) obtained by Smith et al. (31). Mutations in human VH genes were analyzed from sequences of nonproductively rearranged VH genes obtained by Dorner et al. (35) and Dunn-Walters and Spencer (36). In all cases insertions and deletions were not included in the analyses, and nonproductive rearrangements were defined by out-of-frame J segments or junctional stop codons. We also analyzed somatic mutations in a κ transgenic construct containing a 100-nt insertion with repeated EcoRV and PvuII restriction sites under the control of κ regulatory elements (EPS insert) (17). Forty-six such mutated transgene sequences (27,278 nt) containing 135 mutations were obtained from data reported by Storb et al. (17).

All unmutated germline sequences were manipulated and analyzed with MacVector version 5.0 software (Oxford Molecular Group, Beaverton, OR). Sequences were divided into FR and CDR according to both the Kabat (37) and Chothia (38, 39, 40) definitions; the former is based on sequence variability, and the latter is based on location of structural loop regions. In each sequence file used in the dinucleotide analysis, one extra nucleotide was included from adjacent sequence at each end of the region under consideration. Similarly, sequence files for trinucleotide analyses included two extra nucleotides at each end. Mutated nonproductive human VH sequences were treated identically, except that 10 nt were deleted from the 3′ end of FR3 to ensure that discrepancies with germline sequence were due to somatic mutagenesis as opposed to junctional diversification during V segment assembly (41, 42). The transgene of Storb et al. (17) was divided into segments of 100 nt in a manner that left the artificial EPS insertion intact in one file. Collectively, 6507 files containing regional sequence data for V genes were generated for analysis.
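
The flanking rule described above (one extra nucleotide per side for dinucleotides, two for trinucleotides) can be sketched as follows. This is only an illustration: the function name `region_with_flanks`, the 0-based half-open coordinates, and the toy sequence are assumptions and do not come from the original MacVector workflow.

```python
def region_with_flanks(gene: str, start: int, end: int, k: int) -> str:
    """Return gene[start:end] extended by k - 1 nucleotides on each side.

    k = 2 adds one flanking nucleotide per side (dinucleotide analysis);
    k = 3 adds two per side (trinucleotide analysis). Flanks are clipped
    at the ends of the available sequence.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    pad = k - 1
    left = max(0, start - pad)
    right = min(len(gene), end + pad)
    return gene[left:right]


# Toy sequence (hypothetical); the region of interest is gene[3:6] == "TAC".
gene = "ACGTACGTAC"
dinucleotide_file = region_with_flanks(gene, 3, 6, k=2)
trinucleotide_file = region_with_flanks(gene, 3, 6, k=3)
```

With these flanks, every nucleotide of the region falls inside the same number of overlapping k-mers (k of them), which is what allows oligonucleotides to be counted in all frames of reference.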

Di- and trinucleotide observed/expected (obs/exp) mutability indexes were calculated as previously described (31). Briefly, the number of times a given oligonucleotide within a segment of DNA contained a mutation was divided by the number of times the oligonucleotide was expected to be mutated for a mechanism with no bias. Mutability indexes are normalized for the di- and trinucleotide compositions of unmutated templates covering the precise regions for which mutational data were analyzed.
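
A minimal sketch of an obs/exp index of this kind is given below. It assumes one plausible reading of the description: a mutation is credited to every overlapping k-mer that contains the mutated position, and the expected count for a k-mer is its share of all k-mer occurrences multiplied by the total number of credited hits. The exact bookkeeping of Smith et al. (31) may differ, and the function name and toy data are hypothetical.

```python
from collections import Counter


def kmer_mutability_indexes(template: str, mutated_positions: list[int], k: int) -> dict[str, float]:
    """Observed/expected mutability index for each k-mer in an unmutated template.

    mutated_positions holds 0-based template positions, one entry per mutation
    (a position may appear more than once if it mutated in several sequences).
    """
    mutations_at = Counter(mutated_positions)
    occurrences: Counter = Counter()
    hits: Counter = Counter()
    for i in range(len(template) - k + 1):
        kmer = template[i:i + k]
        occurrences[kmer] += 1
        # Credit every mutation that falls inside this window.
        hits[kmer] += sum(mutations_at[p] for p in range(i, i + k))
    total_occurrences = sum(occurrences.values())
    total_hits = sum(hits.values())
    if total_hits == 0:
        return {}
    return {
        kmer: hits[kmer] / (total_hits * occurrences[kmer] / total_occurrences)
        for kmer in occurrences
    }


# Toy example (hypothetical data): a single mutation at the G of AGCTT.
indexes = kmer_mutability_indexes("AGCTT", [1], k=2)
```

Under this normalization a k-mer that mutates exactly as often as its abundance predicts receives an index of 1.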

Regional mutability based solely on DNA sequence was predicted using di- and trinucleotide obs/exp mutability indexes (A/J and literature) defined by Smith et al. (31). The mutability index of a di- or trinucleotide is a normalized measure of its tendency to mutate, where a value of 1 indicates the average mutability for di- or trinucleotides. The predicted mutability index for a region was calculated by determining the number of times each di- or trinucleotide occurred within each region (file) of each gene (regardless of frame of reference) and multiplying by its mutability index. The resulting products for the 16 dinucleotides or 64 trinucleotides were summed and then divided by the total number of di- or trinucleotides in the region (file) under consideration. That is, di- and trinucleotide mutabilities were summed in all frames of reference. The composite mutability index predicted for a type of region, for example nucleotide sequences encoding human VH FR1, was determined by summing all di- or trinucleotide products (occurrences × mutability index) and dividing this number by the sum of all di- or trinucleotide occurrences in the region for all such sequences in the database. When predicting regional mutabilities for human VH genes within the databases of nonproductively rearranged VH genes, calculations were weighted proportionally to the number of times a mutated version of a given gene appeared in the database. Microsoft Excel 98 and version 4.0 (Redmond, WA) were used for database management and calculations.
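
The regional and composite calculations described above reduce to an occurrence-weighted average of k-mer indexes. The sketch below assumes each region string already carries its flanking nucleotides. For illustration it uses the combined murine JH/Jκ dinucleotide indexes reported in Table I, which are not the exact "A/J and literature" set used for the predictions; the function names and toy sequences are hypothetical, and the weighting by gene frequency is omitted.

```python
# Combined murine JH/Jκ dinucleotide mutability indexes from Table I
# (illustrative stand-in for the indexes actually used for prediction).
DINUCLEOTIDE_INDEX = {
    "GC": 2.03, "TA": 1.84, "AC": 1.53, "AA": 1.47,
    "AT": 1.42, "AG": 1.17, "CT": 1.06, "GT": 1.01,
    "TG": 0.74, "CA": 0.72, "CC": 0.71, "GA": 0.65,
    "GG": 0.56, "TT": 0.49, "TC": 0.45, "CG": 0.40,
}


def predicted_index(region: str, kmer_index: dict[str, float], k: int) -> float:
    """Mean mutability index over all overlapping k-mers of one region."""
    kmers = [region[i:i + k] for i in range(len(region) - k + 1)]
    return sum(kmer_index[kmer] for kmer in kmers) / len(kmers)


def composite_index(regions: list[str], kmer_index: dict[str, float], k: int) -> float:
    """Pooled index for one type of region across many genes.

    Sums occurrences x index over all sequences, then divides by the
    total number of k-mer occurrences.
    """
    total = 0.0
    count = 0
    for region in regions:
        for i in range(len(region) - k + 1):
            total += kmer_index[region[i:i + k]]
            count += 1
    return total / count


single = predicted_index("AGCT", DINUCLEOTIDE_INDEX, k=2)
pooled = composite_index(["AGCT", "GG"], DINUCLEOTIDE_INDEX, k=2)
```

Note that the composite value is pooled over occurrences rather than averaged over per-gene values, so longer regions contribute proportionally more.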

The empirical relative mutabilities (observed mutability index) for each region of the nonproductively rearranged human VH genes and the murine κ transgene were determined by dividing the number of mutations per nucleotide for a region by the number of mutations per nucleotide for the entire gene. In essence, this gives the obs/exp mutability ratio for a region, where the expected frequency is the average frequency of mutations for the whole gene. For a given subregion, such as nucleotide sequences encoding VH FR1, the observed composite mutability index was calculated as the number of mutations per nucleotide in all VH FR1 of the database divided by the number of mutations per nucleotide for the entire length (FR1, CDR1, FR2, CDR2, FR3) of all V regions within the database.
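
As a sketch of this calculation, the function below takes pooled mutation counts and nucleotide lengths per region and returns each region's mutation rate relative to the rate for the whole V gene. The function name and the counts are hypothetical and are not taken from the data sets analyzed here.

```python
def observed_regional_indexes(regions: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Observed mutability index per region.

    regions maps a region name to (number of mutations, number of nucleotides),
    pooled over all sequences in the database. The index is the region's
    mutations per nucleotide divided by the mutations per nucleotide of the
    entire gene (all listed regions combined).
    """
    total_mutations = sum(m for m, _ in regions.values())
    total_nucleotides = sum(n for _, n in regions.values())
    gene_rate = total_mutations / total_nucleotides
    return {name: (m / n) / gene_rate for name, (m, n) in regions.items()}


# Hypothetical pooled counts for two regions only.
example = observed_regional_indexes({"FR1": (10, 100), "CDR1": (10, 20)})
```

An index above 1 marks a region that accumulated more mutations per nucleotide than the gene average, and an index below 1 marks one that accumulated fewer.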

Six-base intervals of the nonproductive human VH genes were defined by aligning all the sequences while adjusting for CDR length variability. Predicted folding energy of nascent mRNA was used as a measure of secondary structure stability. Members of the mutated nonproductive human VH gene databases from Dorner et al. and Dunn-Walters and Spencer (35, 36) were divided into 51-nt intervals in steps of 6 nt. The mRNA folding energy for each interval was calculated by the online version of Mfold (43) and reported in kilocalories per mole, where stability increases as the number becomes more negative. For each 6-nt segment, we determined the mRNA folding energy by averaging the folding energies of the 51-nt interval whose 3′ end coincided with the 6-base segment under consideration and the two immediately flanking 51-nt intervals. This analysis correlates a 6-nt segment with upstream RNA secondary structure that has the potential to influence the polymerase complex. Calculations for some 6-nt segments in FR1 could not be performed due to lack of leader sequence information.
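
The windowing and averaging steps can be sketched as below. The folding energies themselves come from Mfold and are not computed here; the list `energies` is a hypothetical stand-in with one value per 51-nt window. How each window is aligned to its 6-nt segment is an assumption of this sketch: window `j` is taken as the one whose 3′ end coincides with the segment, and windows `j - 1` and `j + 1` as its immediate neighbors.

```python
from typing import Optional


def window_starts(sequence_length: int, width: int = 51, step: int = 6) -> list[int]:
    """0-based start positions of all full-width windows, advancing by `step`."""
    return list(range(0, sequence_length - width + 1, step))


def smoothed_energy(energies: list[float], j: int) -> Optional[float]:
    """Average folding energy of window j and its two flanking windows.

    Returns None when a flanking window is unavailable, e.g. near the 5' end
    where leader sequence is missing.
    """
    if j - 1 < 0 or j + 1 >= len(energies):
        return None
    return (energies[j - 1] + energies[j] + energies[j + 1]) / 3


starts = window_starts(69)
energies = [-3.0, -6.0, -9.0, -4.5]  # hypothetical kcal/mol values, one per window
middle = smoothed_energy(energies, 1)
edge = smoothed_energy(energies, 0)
```

Returning `None` at the edges mirrors the statement that some FR1 segments could not be evaluated.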

Continuous data are summarized as the mean and SEM. Mean ratios were compared between regions using one-way ANOVA. The percentages of nucleotides with mutations were compared with the expected percentages under the assumption of uniform mutation rates by χ² tests for comparing observed to expected proportions. Pearson correlation coefficients were used to evaluate associations between continuous variables. Simple linear regression with confidence intervals and tests of hypotheses on the intercepts and slopes were used to evaluate linear relationships between continuous variables. All tests of hypotheses were considered significant at an α level of 0.05, except that tests for significance of mutation indexes were performed at the 0.01 level.
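
A χ² comparison of observed to expected proportions can be sketched as follows. The construction of the two categories (mutated and unmutated nucleotides) and all counts are assumptions made for illustration; the text does not specify how each test was set up. The critical values are the standard ones for one degree of freedom.

```python
CHI2_CRITICAL_DF1 = {0.05: 3.841, 0.01: 6.635}  # standard table values, 1 degree of freedom


def chi_square_statistic(observed: list[float], expected: list[float]) -> float:
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 1000 nucleotides, 30 mutated, uniform expectation of 1%.
n_nucleotides = 1000
observed = [30, n_nucleotides - 30]
expected = [0.01 * n_nucleotides, 0.99 * n_nucleotides]
statistic = chi_square_statistic(observed, expected)
significant_at_001 = statistic > CHI2_CRITICAL_DF1[0.01]
```

Comparing the statistic with a tabulated critical value avoids needing a χ² distribution function, at the cost of not reporting an exact p value.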

If a common mechanism generates somatic mutations within Ig heavy and light chain V genes, then the two should reveal a common di- and trinucleotide target bias. To test this idea, we calculated normalized obs/exp mutability indexes for all di- and trinucleotides within mutated JH and Jκ intronic DNA from a panel of murine hybridomas (31). The index is a measure of the tendency to mutate, where a value of 1 is the average. As shown in Fig. 1,A and Table I, a comparison of dinucleotide mutability indexes in the two regions revealed a close agreement. This is most clearly seen in a simple regression analysis, which yielded a slope and y-intercept of 1 and 0, respectively, within a 95% confidence interval. Similar results and statistical significance were obtained upon comparative analyses of trinucleotide mutability indexes for the two regions of DNA (Fig. 1,B and Table II).
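
The comparison can be reproduced in outline from the dinucleotide indexes listed in Table I. The sketch below fits an ordinary least-squares line and computes the Pearson correlation; the choice of JH as the x variable is an assumption, and the confidence intervals used in the original analysis are not computed here.

```python
from math import sqrt

# Dinucleotide mutability indexes from Table I, in table order (GC, TA, ..., CG).
JH = [2.15, 1.87, 1.49, 1.55, 1.40, 1.22, 1.08, 0.89,
      0.73, 0.71, 0.81, 0.82, 0.64, 0.52, 0.48, 0.21]
JK = [1.88, 1.87, 1.57, 1.48, 1.42, 1.16, 1.00, 1.11,
      0.74, 0.72, 0.54, 0.53, 0.44, 0.47, 0.40, 0.50]


def least_squares(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Return (slope, intercept, Pearson r) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


slope, intercept, r = least_squares(JH, JK)
```

A slope near 1 with an intercept near 0 indicates that the two regions share not only the same ranking of dinucleotides but also similar magnitudes of bias.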

FIGURE 1.

Common hierarchy of mutability among di- and trinucleotides in JH and Jκ intronic DNA. Mutability indexes (obs/exp) for all dinucleotides (A) and trinucleotides (B) were compared.

Table I.

Human and murine dinucleotide mutability indexes

| Dinucleotide | Murine JH/Jκ intronic (a): no. of mutations | Murine JH/Jκ intronic (a): mutability index | Murine JH intronic (a): no. of mutations | Murine JH intronic (a): mutability index | Murine Jκ intronic (a): no. of mutations | Murine Jκ intronic (a): mutability index | Human VH (b): no. of mutations | Human VH (b): mutability index |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GC | 114 | 2.03* | 60 | 2.15* | 54 | 1.88* | 160 | 1.72* |
| TA | 169 | 1.84* | 64 | 1.87* | 105 | 1.87* | 124 | 2.03* |
| AC | 114 | 1.53* | 50 | 1.49* | 64 | 1.57* | 108 | 1.20 |
| AA | 226 | 1.47* | 78 | 1.55* | 148 | 1.48* | 86 | 1.43* |
| AT | 143 | 1.42* | 66 | 1.40* | 77 | 1.42* | 88 | 1.36* |
| AG | 170 | 1.17 | 70 | 1.22 | 100 | 1.16 | 181 | 1.50* |
| CT | 129 | 1.06 | 70 | 1.08 | 59 | 1.00 | 105 | 0.89 |
| GT | 113 | 1.01 | 41 | 0.89 | 72 | 1.11 | 100 | 1.13 |
| TG | 100 | 0.74* | 49 | 0.73 | 51 | 0.74 | 63 | 0.56* |
| CA | 72 | 0.72* | 34 | 0.71 | 38 | 0.72 | 87 | 0.77 |
| CC | 42 | 0.71 | 28 | 0.81 | 14 | 0.54 | 96 | 0.76* |
| GA | 84 | 0.65* | 46 | 0.82 | 38 | 0.53* | 63 | 0.60* |
| GG | 71 | 0.56* | 44 | 0.64* | 27 | 0.44* | 87 | 0.66* |
| TT | 69 | 0.49* | 34 | 0.52* | 35 | 0.47* | 40 | 1.01 |
| TC | 48 | 0.45* | 27 | 0.48* | 21 | 0.40* | 60 | 0.61* |
| CG |  | 0.40 |  | 0.21 |  | 0.50 | 28 | 0.55* |
| Mutations | 835 |  | 381 |  | 454 |  | 738 |  |
| Nucleotides | 92,228 |  | 37,769 |  | 54,459 |  | 14,811 |  |
| Mut. Freq. (%) | 0.91 |  | 1.01 |  | 0.83 |  | 4.98 |  |

a A/J + autoimmune + literature sequences from Smith et al. (31).

b Sequences from Dorner et al. (35) and Dunn-Walters and Spencer (36).

* Statistically significant by χ² test at p = 0.01.

Table II.

Human and murine trinucleotide mutability indexes

| Trinucleotide | Murine JH/Jκ intronic (a): no. of mutations | Murine JH/Jκ intronic (a): mutability index | Murine JH intronic (a): no. of mutations | Murine JH intronic (a): mutability index | Murine Jκ intronic (a): no. of mutations | Murine Jκ intronic (a): mutability index | Human VH (b): no. of mutations | Human VH (b): mutability index |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AGC | 80 | 2.80c | 34 | 3.08c | 46 | 2.68c | 148 | 2.50c |
| TAC | 50 | 2.71c | 19 | 2.85c | 31 | 2.70c | 78 | 2.33c |
| GTA | 80 | 2.41c | 16 | 2.15c | 64 | 2.62c | 68 | 2.62c |
| GCT | 75 | 2.19c | 36 | 2.43c | 39 | 2.02c | 101 | 2.18c |
| ATA | 55 | 1.81c | 18 | 1.61 | 37 | 1.97c | 32 | 1.67c |
| AAC | 62 | 1.74c | 26 | 1.97c | 36 | 1.64c | 32 | 1.52 |
| TAG | 65 | 1.69c | 24 | 1.57 | 41 | 1.80c | 39 | 2.54c |
| CTA | 63 | 1.65c | 29 | 1.70c | 34 | 1.61c | 43 | 1.62c |
| TGC | 38 | 1.57c | 26 | 1.72c | 12 | 1.22 | 11 | 0.52 |
| ATG | 77 | 1.56c | 37 | 1.75c | 40 | 1.43 | 18 | 0.84 |
| GCA | 43 | 1.56c | 19 | 1.45 | 24 | 1.66 | 38 | 1.26 |
| AAG | 95 | 1.54c | 27 | 1.39 | 68 | 1.67c | 59 | 1.29 |
| ACT | 69 | 1.50c | 38 | 1.64c | 31 | 1.34 | 39 | 1.24 |
| AAT | 82 | 1.49c | 34 | 1.54 | 48 | 1.48c | 21 | 1.83c |
| TAA | 78 | 1.49c | 27 | 1.82c | 51 | 1.41 |  | 2.22 |
| TAT | 41 | 1.43 | 20 | 1.34 | 21 | 1.50 | 46 | 1.53c |
| AAA | 109 | 1.36c | 30 | 1.39 | 79 | 1.42c | 10 | 0.80 |
| GAT | 55 | 1.36 | 24 | 1.37 | 31 | 1.37 | 36 | 1.13 |
| CAA | 48 | 1.26 | 18 | 1.30 | 30 | 1.27 | 40 | 1.12 |
| ACA | 40 | 1.17 | 12 | 0.81 | 28 | 1.46 | 30 | 0.74 |
| GAA | 66 | 1.11 | 30 | 1.15 | 36 | 1.08 | 42 | 1.14 |
| GCC | 20 | 1.09 | 17 | 1.26 |  | 0.53 | 37 | 0.71 |
| ATC | 29 | 1.05 | 19 | 1.50 | 10 | 0.67 | 58 | 1.69c |
| GGT | 38 | 1.01 | 19 | 0.75 | 19 | 1.36 | 30 | 0.91 |
| CTT | 43 | 0.98 | 29 | 1.27 | 14 | 0.65 | 15 | 0.85 |
| GTT | 43 | 0.95 | 20 | 1.29 | 23 | 0.79 | 27 | 1.81c |
| AGA | 66 | 0.94 | 27 | 1.05 | 39 | 0.90 | 21 | 0.59 |
| ACC | 23 | 0.91 |  | 0.73 | 15 | 1.06 | 56 | 1.18 |
| GTG | 40 | 0.91 | 13 | 0.73 | 27 | 1.04 | 43 | 0.89 |
| CAT | 26 | 0.90 | 18 | 1.11 |  | 0.61 | 20 | 0.82 |
| ATT | 40 | 0.90 | 23 | 0.89 | 17 | 0.88 | 35 | 1.54c |
| CTG | 50 | 0.89 | 30 | 0.89 | 20 | 0.83 | 65 | 0.73c |
| AGT | 50 | 0.88 | 13 | 0.74 | 37 | 0.97 | 58 | 1.20 |
| TGT | 54 | 0.87 | 19 | 0.74 | 35 | 0.97 | 36 | 1.02 |
| TTA | 31 | 0.86 | 17 | 1.06 | 14 | 0.70 | 34 | 1.69c |
| CAC | 19 | 0.77 |  | 0.65 | 11 | 0.88 | 34 | 0.79 |
| CTC | 34 | 0.75 | 15 | 0.63 | 19 | 0.87 | 17 | 0.37c |
| CCT | 23 | 0.74 | 16 | 0.79 |  | 0.59 | 41 | 0.71 |
| GGC | 21 | 0.74 | 11 | 0.75 | 10 | 0.71 | 24 | 0.74 |
| TGG | 37 | 0.69 | 24 | 0.82 | 13 | 0.52 | 42 | 0.56c |
| CCA | 24 | 0.68 | 12 | 0.67 | 12 | 0.67 | 48 | 0.88 |
| ACG |  | 0.64 |  | 0.80 |  | 0.64 | 24 | 1.46 |
| CAG | 36 | 0.61c | 22 | 0.76 | 14 | 0.46c | 82 | 1.21 |
| AGG | 37 | 0.59c | 23 | 0.71 | 14 | 0.45c | 26 | 0.68 |
| CCC |  | 0.59 |  | 0.74 |  | 0.20 | 15 | 0.37c |
| CCG |  | 0.58 |  | 0.32 |  | 0.80 | 24 | 0.62 |
| CGG |  | 0.57 |  | 0.61 |  | 0.54 | 14 | 0.57 |
| GAG | 33 | 0.56c | 13 | 0.57 | 20 | 0.56c | 51 | 0.98 |
| TCT | 40 | 0.55c | 24 | 0.61 | 16 | 0.47c | 24 | 0.57c |
| TGA | 35 | 0.55c | 16 | 0.51c | 19 | 0.58 | 11 | 0.28c |
| TTG | 30 | 0.55c | 17 | 0.59 | 13 | 0.50 |  | 0.50 |
| TCA | 28 | 0.53c | 17 | 0.67 | 11 | 0.40c | 35 | 0.77 |
| GTC | 23 | 0.52c | 13 | 0.46c | 10 | 0.57 | 20 | 0.46c |
| GGG | 34 | 0.50c | 26 | 0.70 |  | 0.25c | 39 | 0.64c |
| TTT | 35 | 0.48c | 18 | 0.61 | 17 | 0.40c |  | 0.95 |
| GGA | 26 | 0.43c | 19 | 0.66 |  | 0.22c | 55 | 0.76 |
| TTC | 18 | 0.42c |  | 0.42 | 10 | 0.42c | 17 | 0.73 |
| CGT |  | 0.39 |  | 0.00 |  | 0.46 | 15 | 0.99 |
| TCC | 11 | 0.38c |  | 0.42 |  | 0.32 | 33 | 0.65c |
| GAC | 11 | 0.33c |  | 0.33c |  | 0.32c |  | 0.24c |
| TCG |  | 0.17 |  | 0.00 |  | 0.26 |  | 0.32 |
| CGA |  | 0.00 |  | 0.00 |  | 0.00 |  | 0.57 |
| CGC |  | 0.00 |  | 0.00 |  | 0.00 | 15 | 0.57 |
| GCG |  | 0.00 |  | 0.00 |  | 0.00 |  | 0.43 |
| Mutations | 835 |  | 381 |  | 454 |  | 738 |  |
| Nucleotides | 92,228 |  | 37,769 |  | 54,459 |  | 14,811 |  |
| Mut. Freq. (%) | 0.91 |  | 1.01 |  | 0.83 |  | 4.98 |  |

a A/J + autoimmune + literature sequences from Smith et al. (31).

b Sequences from Dorner et al. (35) and Dunn-Walters and Spencer (36).

c Statistically significant by χ² test at p = 0.01.

Using sequences obtained by Dorner et al. (35) and Dunn-Walters and Spencer (36), we similarly calculated dinucleotide mutability indexes for 730 mutations located within 60 nonproductively rearranged human VH genes. Again, there was close agreement between dinucleotide mutability indexes for human VH genes and those of murine intronic JH and Jκ DNA (Fig. 2,A and Table I). This was demonstrated by linear regression, which yielded a slope and y-intercept of 1 and 0, respectively, within a 95% confidence interval. In all cases GC and TA were the preferred targets. The trinucleotide mutability indexes were also in close agreement, as shown in Fig. 2,B and Table II. This consistent hierarchy of di- and trinucleotide mutability in all analyzed Ig sequences supports the idea that a common mutation mechanism acts on heavy and light chain genes of both species regardless of whether DNA is located within an intron or an exon encoding a V region domain.

FIGURE 2.

Common hierarchy of mutability among di- and trinucleotides in V genes of humans and mice. Mutability indexes (obs/exp) for all dinucleotides (A) and trinucleotides (B) were compared. The upper limit of the 95% confidence interval of the slope of the linear regression for trinucleotides is slightly <1 (0.96).

To further investigate the consistency of sequence-specific mutability, we calculated the mutability index for each position of all di- and trinucleotides in the murine JH and Jκ and human VH sequences. As shown in Table III, the JH and Jκ as well as the murine and human Ig sequences possessed a striking consistency of mutational bias at the level of individual positions. Besides providing further support for a common mutation mechanism, these data indicate that positions within di- and trinucleotides are targeted to differing degrees (29, 44).

Table III.

Human and murine di- and trinucleotide mutability indexes by position

| Group | Oligonucleotide | Murine JH/Jκ intronic (a): first position | Murine JH/Jκ intronic (a): second position | Murine JH/Jκ intronic (a): third position | Murine JH intronic (a): first position | Murine JH intronic (a): second position | Murine JH intronic (a): third position | Murine Jκ intronic (a): first position | Murine Jκ intronic (a): second position | Murine Jκ intronic (a): third position | Human VH (b): first position | Human VH (b): second position | Human VH (b): third position |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Most mutable | GC | 2.06c | 1.99c |  | 1.87c | 2.44c |  | 2.22c | 1.53 |  | 1.95c | 1.48c |  |
|  | TA | 1.18 | 2.50c |  | 0.99 | 2.74c |  | 1.32 | 2.42c |  | 1.67c | 2.40c |  |
|  | AC | 1.53c | 1.53c |  | 1.61 | 1.37 |  | 1.47 | 1.67c |  | 1.31 | 1.09 |  |
|  | AA | 1.25 | 1.69c |  | 1.19 | 1.91c |  | 1.32 | 1.64c |  | 0.93 | 1.92c |  |
|  | AT | 1.76c | 1.07 |  | 1.87c | 0.94 |  | 1.66c | 1.18 |  | 1.70c | 1.02 |  |
| Least mutable | GA | 0.70 | 0.61c |  | 0.89 | 0.75 |  | 0.56c | 0.50c |  | 0.46c | 0.74 |  |
|  | GG | 0.33c | 0.78 |  | 0.35c | 0.94 |  | 0.29c | 0.59 |  | 0.48c | 0.83 |  |
|  | TT | 0.59c | 0.40c |  | 0.62 | 0.43c |  | 0.56c | 0.37c |  | 0.91 | 1.11 |  |
|  | TC | 0.41c | 0.49c |  | 0.36c | 0.61 |  | 0.46c | 0.34c |  | 0.51c | 0.71 |  |
|  | CG | 0.40 | 0.40 |  | 0.43 | 0.00 |  | 0.40 | 0.60 |  | 0.47c | 0.63 |  |
| Most mutable | AGC | 1.68 | 3.36c | 3.36c | 1.90 | 2.72c | 4.62c | 1.57 | 3.84c | 2.62c | 1.47 | 3.44c | 2.58c |
|  | TAC | 1.46 | 3.74c | 2.93c | 1.35 | 4.50c | 2.70 | 1.57 | 3.40c | 3.14c | 2.51c | 2.24c | 2.24c |
|  | GTA | 2.71c | 1.08 | 3.43c | 2.02 | 0.40 | 4.04c | 3.07c | 1.35 | 3.43c | 2.89c | 1.27 | 3.70c |
|  | GCT | 2.97c | 2.80c | 0.79 | 3.25c | 3.45c | 0.61 | 2.79c | 2.33c | 0.93 | 3.83c | 2.01c | 0.71 |
|  | ATA | 1.08 | 2.07c | 2.27c | 0.81 | 1.61 | 2.42 | 1.28 | 2.39c | 2.23c | 0.47 | 2.19c | 2.35c |
| Least mutable | GGA | 0.40c | 0.50 | 0.40c | 0.52 | 0.83 | 0.62 | 0.28 | 0.19 | 0.19 | 0.58 | 0.62 | 1.07 |
|  | TTC | 0.62 | 0.28c | 0.35 | 0.62 | 0.16 | 0.47 | 0.63 | 0.38 | 0.25 | 1.16 | 0.51 | 0.51 |
|  | CGT | 0.59 | 0.59 | 0.00 | 0.00 | 0.00 | 0.00 | 0.69 | 0.69 | 0.00 | 0.99 | 1.39 | 0.60 |
|  | TCC | 0.52 | 0.31 | 0.31 | 0.53 | 0.53 | 0.18 | 0.47 | 0.00 | 0.47 | 0.41 | 0.65 | 0.89 |
|  | GAC | 0.09c | 0.27 | 0.63 | 0.17 | 0.33 | 0.50 | 0.00 | 0.19 | 0.76 | 0.16c | 0.40 | 0.16c |

a A/J + autoimmune + literature sequences from Smith et al. (31).

b Sequences from Dorner et al. (35) and Dunn-Walters and Spencer (36).

c Statistically significant by χ² test at p = 0.01.

To address the importance of DNA sequence in directing somatic mutation, we first comprehensively analyzed di- and trinucleotide sequence compositions of murine and human germline V region genes. Human and murine germline-encoded genes were divided into regions encoding FR and CDR based on the definitions of Kabat (37) and Chothia (38, 39, 40). For each region in every category, we calculated a composite obs/exp mutability index. Our analysis includes all available unmutated human and murine Ig sequences: 47, 40, and 29 human heavy, κ, and λ genes and 72 and 42 murine heavy and κ Ig genes, respectively, organized into 4780 regional sequence files.

The composite regional mutability indexes predicted for murine and human VH genes are shown in Fig. 3. There is an excellent agreement between results obtained with VH genes of the two species. With obs/exp mutability indexes >1, CDR1 and CDR2 in both species are predicted to mutate the most when normalized for length. In contrast, FR1 and FR2 are predicted to mutate the least, while FR3 is predicted to mutate at an intermediate level. The regional patterns calculated using the Kabat and Chothia CDR and FR definitions are similar. Although modest in extent, the predicted regional mutability differences are remarkably consistent among individual VH genes (data not shown) and regardless of whether they are calculated using di- or trinucleotide mutability indexes. This finding, together with the large number of regions examined and consistency between species, suggests that the predicted regional mutability differences are not due to chance alone. Statistical analysis of the data supports this interpretation (two-way ANOVA for comparing means with p < 0.0001).

FIGURE 3.

Predicted regional mutability in human and murine germline VH genes. VH genes were divided into Kabat (A) and Chothia (B) regional definitions. Regional mutability was calculated on the basis of di- and trinucleotide compositions and their mutability indexes. The data are reported as the obs/exp mutation ratio ± SEM for each region. Error bars are obscured in most cases by symbols.

Analysis of di- and trinucleotide compositions in light chains yielded an unexpected regional mutability prediction in which FR2 is comparable to CDR1 and CDR2 (Fig. 4). Nearly identical regional trends were seen for human Vκ, human Vλ, and murine Vκ genes, with a remarkable consistency among individual light chain genes (data not shown). Murine Vλ genes were excluded because their numbers were insufficient to permit statistical analysis. As with the heavy chain genes, there was good agreement between results obtained using di- and trinucleotide mutability indexes according to either the Kabat or Chothia regional definition. Predicted regional differences were statistically tested as before and were significant (p < 0.0001).

FIGURE 4.

Predicted regional mutability in human and murine germline VL genes. Germline VL genes were divided into Kabat (A) and Chothia (B) regional definitions. The data are reported as described in Fig. 3.

Our predictions of regional mutabilities are based solely on di- and trinucleotide composition. To assess the importance of these short sequences in directing mutation into a longer segment of DNA, we compared the regional predictions to empirical data from somatically mutated V genes. To exclude potential bias due to cellular selection, only somatic mutations within nonproductively rearranged V genes were analyzed. Only two available sets of human VH sequences satisfied this important criterion, and no comparable data were available for light chain genes. Empirical regional mutability indexes were determined by dividing the actual number of mutations in each region by the expected quantity, assuming a nondiscriminatory mechanism would randomly distribute mutations throughout the VH genes. Observed regional mutabilities in these VH genes were compared with predicted regional mutabilities for germline-encoded correlates, such that the germline database was weighted according to the frequency with which each gene was found in the database of mutated genes. As shown in Fig. 5, the results of these comparisons revealed a generally good agreement between predicted and observed mutability patterns. This was true for results obtained using both di- and trinucleotide mutability indexes and according to the Kabat and Chothia regional definitions. However, the observed differences in mutability among regions were more extreme than predicted. For example, among the VH genes divided by the Kabat definition, CDR1 mutated 2.8 times more frequently than FR1, while it was predicted to mutate only 1.5 and 1.2 times more frequently on the basis of di- and trinucleotide compositions, respectively. Thus, our predictions were qualitatively, but not quantitatively, accurate.
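
The observed/expected calculation and the frequency weighting of germline predictions can be sketched as follows. This is an illustration under stated assumptions: the expected count for a region is taken to be the total number of mutations multiplied by the region's share of the total sequence length, and the function names, gene names, counts, and lengths are all hypothetical.

```python
from typing import Dict


def observed_expected_index(
    mutations: Dict[str, int], lengths: Dict[str, int]
) -> Dict[str, float]:
    """Observed mutations per region divided by the number expected if
    mutations were spread uniformly over all positions (assumption)."""
    total_mut = sum(mutations.values())
    total_len = sum(lengths.values())
    result = {}
    for region, length in lengths.items():
        expected = total_mut * length / total_len
        result[region] = mutations[region] / expected if expected else float("nan")
    return result


def weighted_prediction(
    per_gene: Dict[str, Dict[str, float]], gene_counts: Dict[str, int]
) -> Dict[str, float]:
    """Average per-gene predicted indexes, weighting each germline gene by
    the number of mutated sequences derived from it."""
    total = sum(gene_counts.values())
    regions = next(iter(per_gene.values())).keys()
    return {
        r: sum(per_gene[g][r] * n for g, n in gene_counts.items()) / total
        for r in regions
    }


# Invented counts and lengths (nt) for illustration only.
toy_obs = observed_expected_index(
    {"FR1": 20, "CDR1": 20, "FR3": 60},
    {"FR1": 100, "CDR1": 20, "FR3": 80},
)
toy_weighted = weighted_prediction(
    {"geneA": {"CDR1": 1.4, "FR1": 0.8}, "geneB": {"CDR1": 1.0, "FR1": 1.0}},
    {"geneA": 3, "geneB": 1},
)
print(toy_obs, toy_weighted)
```

Dividing one region's index by another's gives fold differences of the kind quoted above for CDR1 relative to FR1.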

FIGURE 5.

Comparison of predicted and observed regional mutability in human VH genes. Somatically mutated, nonproductively rearranged, human VH genes described by Dorner et al. (35) and Dunn-Walters and Spencer (36) were separated into regions according to Kabat (A) and Chothia (B) definitions. Observed mutability indexes were determined and compared with expected mutability indexes calculated for germline correlates, as described in the text. Data are reported as described in Fig. 3.

To explain the quantitative discrepancy, we considered the possibility that some somatic mutations might be derived by gene conversions involving unrearranged donor VH genes. Such a mechanism might increase the proportion of mutations in segments of rearranged VH DNA encoding CDR because VH germline sequence diversity is greatest in CDR-encoded DNA. With this in mind, we compared the observed mutational distribution within a κ transgene construct (10) to the mutational distribution predicted on the basis of di- and trinucleotide composition. The transgene includes a 100-nt segment of DNA that contains repeated EcoRV/PvuII restriction sites and that is unrelated to V region gene DNA. Work by Storb et al. (17) had indicated that this segment of DNA mutated disproportionately to its trinucleotide composition. The results in Fig. 6 confirm this and reveal that, as with the human VH genes, we were able to qualitatively predict regional mutability preferences from di- and trinucleotide composition, but as before, the observed differences were more extreme than predicted. These quantitative differences between predicted and observed mutations in an artificial sequence are therefore unlikely to be due to mutagenesis by gene conversion.

FIGURE 6.

Predicted and observed regional mutability in a κ transgene. The sequence data are from the report by Storb et al. (17) using a κ transgene with an artificial insert containing repeated EcoRV/PvuII restriction sites. Observed and predicted mutability indexes for the artificial insert (608–707) and murine κ Ig sequences were determined as described in the text. Data are reported as described in Fig. 3.

To address the possibility that mRNA secondary structure amplifies mutagenesis in CDR, we adopted an approach used by Storb et al. (17) to predict the stability of mRNA secondary structure within 51-nt-long regions at six-nt steps for germline correlates of the mutated VH genes described above. Fig. 7,A shows a comparison of average predicted mRNA secondary structure and mutation distribution for 60 nonproductively rearranged human VH genes encompassing four families. Segments encoding CDR1 and CDR2 were most mutable, and substantial mRNA secondary structure occurred 5′ of each. However, for CDR1, mRNA stability peaked about 30 bases before the peak of mutation, whereas for CDR2, peaks of mRNA stability and mutation nearly coincided. Furthermore, while FR3 was substantially mutated, a distinct decrease in the predicted mRNA stability occurred within this region. Finally, FR1 possessed considerable mRNA secondary structure yet contained few somatic mutations. A correlation coefficient of 0.09 and a linear regression with a p value of 0.55 indicated no significant linear correlation between observed mutability and the position of predicted stability of mRNA secondary structure (data not shown). To further investigate the relationship between mutational distribution and mRNA secondary structure, we analyzed two VH genes for which the most mutational data were available (235 and 172 mutations in 18 and 12 sequences for 4–34 and 5–51, respectively). As shown in Fig. 7, C and D, there was no obvious linear correlation between predicted mRNA secondary structure and mutation frequency. However, it is intriguing that the pattern of mRNA secondary structure stability was somewhat conserved even among VH genes belonging to different families (Fig. 7 B).

FIGURE 7.

Observed mutation distribution in human VH genes compared with predicted mRNA secondary structure: A, composite; B, composite mRNA stability among human VH families; C, gene 4–34; and D, gene 5–51. The predicted stability of mRNA secondary structure for 51-nt intervals at steps of 6 nt was determined using the online version of Mfold (line ± SEM) (43). For each 6-nt segment, we determined the mRNA folding energy by averaging the folding energies of the 51-nt interval whose 3′ end coincided with the 6-base segment under consideration and the two immediately flanking intervals. This analysis correlates a 6-nt segment with upstream RNA secondary structure that has the potential to influence the polymerase complex. Kabat and Chothia CDR are indicated.
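
The windowing and averaging scheme described in this legend can be sketched as follows. This is a minimal illustration rather than the procedure actually run: `fold_energy` is a hypothetical caller-supplied function returning a folding energy for a window (for example, a value obtained separately from Mfold), the toy energy used below is a stand-in with no physical meaning, and averaging over fewer than three intervals at the sequence ends is an assumption.

```python
from typing import Callable, List, Tuple


def smoothed_fold_energy(
    seq: str,
    fold_energy: Callable[[str], float],
    width: int = 51,
    step: int = 6,
) -> List[Tuple[int, int, float]]:
    """Assign each `step`-nt segment the mean folding energy of the window
    whose 3' end coincides with it and of the two flanking windows.

    Returns (segment_start, segment_end, mean_energy) tuples with
    half-open, 0-based coordinates.
    """
    starts = range(0, len(seq) - width + 1, step)
    energies = [fold_energy(seq[s:s + width]) for s in starts]
    result = []
    for i, s in enumerate(starts):
        # Window i plus its immediate neighbours (fewer at the ends).
        neighbours = energies[max(0, i - 1):i + 2]
        seg_end = s + width
        result.append((seg_end - step, seg_end, sum(neighbours) / len(neighbours)))
    return result


def toy_energy(window: str) -> float:
    """Placeholder 'energy': minus the number of G bases (not a real model)."""
    return -float(window.count("G"))


toy_profile = smoothed_fold_energy("A" * 51 + "G" * 18, toy_energy)
print(toy_profile)
```

Because each segment is paired with windows that lie upstream of it, the resulting profile relates a position to RNA structure that could form behind the polymerase, which is the comparison made in the text.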

We report results of a comparative analysis of di- and trinucleotide target preferences in human and mouse Ab V genes as well as an extensive analysis of predicted and observed regional mutability patterns in human VH genes. The shared hierarchy of di- and trinucleotide target mutabilities among H and L chain V genes in mice and humans suggests that a common mutation mechanism acts on all Ig genes of both species. This is supported by the excellent agreement between actual regional mutability patterns in human VH genes and mutability patterns in human VH genes predicted using di- and trinucleotide mutability indexes derived from sequences of mouse Jκ and JH intronic DNA. While not entirely surprising, the consistency in the mutation bias is an important result because distinct regulatory elements differentially control developmental expression of H and L chain genes and because the intron and 3′ L chain enhancers are essential for full mutation, while analogous H chain enhancers are not sufficient (15, 18, 19, 20, 21, 22, 23, 24). We cannot entirely exclude the possibility that distinct mutation mechanisms operate on H and L chain genes and that the two only share elements that generate sequence-specific bias. For example, observed somatic mutations are ones that have been fixed and propagated by B cells. Accordingly, sequence-specific mutation bias could be due to a bias in nucleotide repair rather than to a bias in the initial mutagenic event. Nevertheless, our interpretation is the most straightforward given the nature and consistency of the results.

Predicted and observed higher mutability of DNA encoding VH CDR vs FR suggests that evolution has carved the VH sequence to optimize beneficial effects of somatic mutation while minimizing potential detrimental effects during Ag-driven somatic diversification. This view is consistent with the observations of Chang and Casali (45) that codon use by CDR is such that random mutations will produce amino acid changes at an inordinately high frequency and with a similar finding by Kepler (46), who also took into consideration intrinsic codon mutability. Kepler’s results show the same general mutability trend as ours, except in FR2 of light chains, where he found no differential localization of synonymous codons that preferentially mutate to produce amino acid changes, while we predicted this region to be highly mutable (46). These previous studies have been limited to in-frame codon analysis without evidence that the mutation mechanism has a frame of reference. Our analysis is thus unique in that all di- and trinucleotides were considered regardless of reading frame.

On the basis of di- and trinucleotide composition, the sequences encoding light chain FR2 were predicted to mutate more frequently than those for CDR2. This was true for humans and mice and for κ and λ light chain genes. At present, there are insufficient sequence data from nonproductively rearranged light chain genes to permit a test of this unexpected result. However, quantitative differences between predicted and actual regional mutabilities in VH genes suggest that other undefined factors may influence mutational accumulation. Thus, it will be interesting to determine whether regional trends in light chain mutability follow predictions based only on short sequence composition.

We considered several possible explanations for the quantitative discrepancy between predicted and observed regional mutabilities. The mutation mechanism might recognize oligonucleotide motifs longer than trinucleotides. However, this is difficult to reconcile with the consistent hierarchy of di- and trinucleotide mutation indexes observed for different types of V genes in both coding and noncoding regions for two species as well as the consistent positional mutation preferences among the different sequence types. We have conducted a more limited regional predictive analysis using tetranucleotide sequences that have been proposed to be most mutable by other investigators (28, 47), including RGYW, as well as sequences composed of overlapping mutation-prone trinucleotides (AGCT, GTAC, CTAC, GCTA, TAGC, GTAG). The results of this analysis failed to resolve the quantitative discrepancy between predicted and observed regional mutabilities (data not shown). Furthermore, using alignment algorithms on the entire human database of germline V genes, we have been unable to identify a new motif that is preferentially located in regions of high or low mutability. Again without success, we tested the possibility that only some triplets were targets for mutagenesis by performing the predictive analysis with selected trinucleotides of highest mutability with or without regard to reading frame (data not shown). Finally, the quantitative discrepancy between predicted and observed regional mutation in the κ transgene containing the artificial insert argues against a supplemental mutagenesis mechanism involving a recombinational process with unrearranged donor V genes.
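
A tetranucleotide analysis of this kind depends on locating motif occurrences within each region. The sketch below counts overlapping matches of a motif written in IUPAC codes (R = A or G, Y = C or T, W = A or T) and reports a per-nucleotide density for each region. The function names and the toy sequence are hypothetical, and motif density is only one plausible way to summarize the counts; it is not presented as the scoring used in this study.

```python
import re
from typing import Dict, Iterable, Tuple

# IUPAC nucleotide codes needed for the motifs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}


def count_motif(seq: str, motif: str) -> int:
    """Count overlapping occurrences of an IUPAC motif in seq."""
    pattern = "".join(IUPAC[c] for c in motif.upper())
    # The lookahead makes each match zero-width so overlaps are counted.
    return sum(1 for _ in re.finditer("(?=" + pattern + ")", seq.upper()))


def motif_density(
    seq: str,
    regions: Dict[str, Tuple[int, int]],
    motifs: Iterable[str],
) -> Dict[str, float]:
    """Motif occurrences per nucleotide for each half-open (start, end) region."""
    motifs = list(motifs)
    return {
        name: sum(count_motif(seq[start:end], m) for m in motifs) / (end - start)
        for name, (start, end) in regions.items()
    }


# Toy sequence for illustration only.
toy_motif_seq = "CCCCCCCCAGCTAGCT"
toy_density = motif_density(
    toy_motif_seq, {"FR": (0, 8), "CDR": (8, 16)}, ["RGYW"]
)
print(toy_density)
```

Occurrences that span a region boundary are not counted in this sketch.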

Storb et al. (17) found a region of secondary structure stability in RNA located 40 nt downstream of a highly mutable segment in their κ transgene. From this observation they proposed that mutations are directed into a region by pausing of the RNA polymerase due to secondary structures within the nascent RNA, and the DNA sequence is responsible for the fine specificity of the mutation mechanism. However, our observations are not consistent with this model, in that we found no linear correlation between areas of observed high mutability in human VH genes and positions of predicted RNA secondary structure stability. Yet, we cannot rule out the possibility that RNA secondary structure plays a more elusive and complex role than we were able to detect. We note, for example, that there appears to be a somewhat recurrent pattern of predicted secondary structure in mRNA for VH genes, even among members of different families (Fig. 7 B). If very short or long segments of RNA secondary structure were influencing mutagenesis, these would not necessarily be detected by our analysis, which was predicated on stability within 51-base-long segments. Our analysis does not take into consideration the possibility that secondary substructures within this length interval or superstructures formed by spatially separated regions of RNA could be involved in mutagenesis. As suggested by other investigators, other components, such as palindromes and repeats (28, 48, 49), could be acting together with oligonucleotide motifs to direct mutations. Palindromes and repeats are abundant in the most mutable region of the murine artificial transgene, but are virtually absent from adjacent regions. This distribution might account for the more extreme discrepancies between observed and expected regional mutabilities found for the artificial transgene compared with those found for human VH genes.

Our findings are perhaps most consistent with the accumulation model of Gearhart and Bogenhagen (50) and observations by O’Brien et al. (9), who found clustered somatic mutations within Ig κ transgenes. Collectively, the results suggest that once a region of DNA is somatically mutated, it is more likely to be targeted for mutation again. If duplex melting were required for mutagenesis and if heteroduplex DNA remained unrepaired for a significant period of time, mutations might enhance further localized mutagenesis. This could explain why di- and trinucleotide motifs can accurately predict the pattern, but not the number, of mutations within a region. Despite quantitative issues, our results indicate that short nucleotide sequences play an important role in distributing somatic mutations within V genes.

We thank Drs. Dorner and Spencer for supplying their databases, and Prasanna Jena, Andy Liu, Chris Snyder, Amanda Guth, Diana Smith, Bristol Sorenson, and Xianghua Zhang for their insights and for critically reading the manuscript.

1

This work was supported by National Institutes of Health Grant RO1AI39563.

3

Abbreviations used in this paper: CDR, complementarity-determining region; FR, framework; obs/exp, observed/expected.

1. Rothenfluh, H. S., L. Taylor, A. L. Bothwell, G. W. Both, E. J. Steele. 1993. Somatic hypermutation in 5′ flanking regions of heavy chain antibody variable regions. Eur. J. Immunol. 23:2152.
2. Rada, C., A. Gonzalez-Fernandez, J. M. Jarvis, C. Milstein. 1994. The 5′ boundary of somatic hypermutation in a Vκ gene is in the leader intron. Eur. J. Immunol. 24:1453.
3. Both, G. W., L. Taylor, J. W. Pollard, E. J. Steele. 1990. Distribution of mutations around rearranged heavy-chain Ab variable-region genes. Mol. Cell. Biol. 10:5187.
4. Rogerson, B. J. 1994. Mapping the upstream boundary of somatic mutations in rearranged immunoglobulin transgenes and endogenous genes. Mol. Immunol. 31:83.
5. Roes, J., K. Huppi, K. Rajewsky, F. Sablitzky. 1989. V gene rearrangement is required to fully activate the hypermutation mechanism in B cells. J. Immunol. 142:1022.
6. Weber, J. S., J. Berry, T. Manser, J. L. Claflin. 1991. Position of the rearranged Vκ and its 5′ flanking sequences determines the location of somatic mutations in the Jκ locus. J. Immunol. 146:3652.
7. Hackett, J., Jr, B. J. Rogerson, R. L. O’Brien, U. Storb. 1990. Analysis of somatic mutations in κ transgenes. J. Exp. Med. 172:131.
8. Sharpe, M. J., C. Milstein, J. M. Jarvis, M. S. Neuberger. 1991. Somatic hypermutation of immunoglobulin κ may depend on sequences 3′ of Cκ and occurs on passenger transgenes. EMBO J. 10:2139.
9. O’Brien, R. L., R. L. Brinster, U. Storb. 1987. Somatic hypermutation of an immunoglobulin transgene in κ transgenic mice. Nature 326:405.
10. Klotz, E. L., J. Hackett, Jr, U. Storb. 1998. Somatic hypermutation of an artificial test substrate within an Ig κ transgene. J. Immunol. 161:782.
11. Peters, A., U. Storb. 1996. Somatic hypermutation of immunoglobulin genes is linked to transcription initiation. Immunity 4:57.
12. Winter, D. B., N. Sattar, J. J. Mai, P. J. Gearhart. 1997. Insertion of 2 kb of bacteriophage DNA between an immunoglobulin promoter and leader exon stops somatic hypermutation in a κ transgene. Mol. Immunol. 34:359.
13. Tumas-Brundage, K., T. Manser. 1997. The transcriptional promoter regulates hypermutation of the Ab heavy chain locus. J. Exp. Med. 185:239.
14. Fukita, Y., H. Jacobs, K. Rajewsky. 1998. Somatic hypermutation in the heavy chain locus correlates with transcription. Immunity 9:105.
15. Yelamos, J., N. Klix, B. Goyenechea, F. Lozano, Y. L. Chui, A. Gonzalez Fernandez, R. Pannell, M. S. Neuberger, C. Milstein. 1995. Targeting of non-Ig sequences in place of the V segment by somatic hypermutation. Nature 376:225.
16. Azuma, T., N. Motoyama, L. E. Fields, D. Y. Loh. 1993. Mutations of the chloramphenicol acetyl transferase transgene driven by the immunoglobulin promoter and intron enhancer. Int. Immunol. 5:121.
17. Storb, U., E. L. Klotz, J. Hackett, K. Kage, G. Bozek, T. E. Martin. 1998. A hypermutable insert in an immunoglobulin transgene contains hotspots of somatic mutation and sequences predicting highly stable structures in the RNA transcript. J. Exp. Med. 188:689.
18. Goyenechea, B., N. Klix, J. Yelamos, G. T. Williams, A. Riddell, M. S. Neuberger, C. Milstein. 1997. Cells strongly expressing Igκ transgenes show clonal recruitment of hypermutation: a role for both MAR and the enhancers. EMBO J. 16:3987.
19. Klix, N., C. J. Jolly, S. L. Davies, M. Bruggemann, G. T. Williams, M. S. Neuberger. 1998. Multiple sequences from downstream of the Jκ cluster can combine to recruit somatic hypermutation to a heterologous, upstream mutation domain. Eur. J. Immunol. 28:317.
20. Betz, A. G., C. Milstein, A. Gonzalez-Fernandez, R. Pannell, T. Larson, M. S. Neuberger. 1994. Elements regulating somatic hypermutation of an immunoglobulin κ gene: critical role for the intron enhancer/matrix attachment region. Cell 77:239.
21. Sohn, J., R. M. Gerstein, C. L. Hsieh, M. Lemer, E. Selsing. 1993. Somatic hypermutation of an immunoglobulin μ heavy chain transgene. J. Exp. Med. 177:493.
22. Giusti, A. M., T. Manser. 1993. Hypermutation is observed only in Ab H chain V region transgenes that have recombined with endogenous immunoglobulin H DNA: implications for the location of cis-acting elements required for somatic mutation. J. Exp. Med. 177:797.
23. Tumas-Brundage, K. M., K. A. Vora, T. Manser. 1997. Evaluation of the role of the 3′α heavy chain enhancer [3′α E(hs1, 2)] in Vh gene somatic hypermutation. Mol. Immunol. 34:367.
24. Taylor, L. D., C. E. Carmack, D. Huszar, K. M. Higgins, R. Mashayekh, G. Sequar, S. R. Schramm, C. C. Kuo, S. L. O’Donnell, R. M. Kay, et al. 1994. Human immunoglobulin transgenes undergo rearrangement, somatic mutation and class switching in mice that lack endogenous IgM. Int. Immunol. 6:579.
25. Gonzalez-Fernandez, A., S. K. Gupta, R. Pannell, M. S. Neuberger, C. Milstein. 1994. Somatic mutation of immunoglobulin lambda chains: a segment of the major intron hypermutates as much as the complementarity-determining regions. Proc. Natl. Acad. Sci. USA 91:12614.
26. Betz, A. G., M. S. Neuberger, C. Milstein. 1993. Discriminating intrinsic and antigen-selected mutational hotspots in immunoglobulin V genes. Immunol. Today 14:405-411.
27. Lebecque, S. G., P. J. Gearhart. 1990. Boundaries of somatic mutation in rearranged immunoglobulin genes: 5′ boundary is near the promoter, and 3′ boundary is approximately 1 kb from V(D)J gene. J. Exp. Med. 172:1717.
28. Rogozin, I. B., N. A. Kolchanov. 1992. Somatic hypermutagenesis in immunoglobulin genes. II. Influence of neighbouring base sequences on mutagenesis. Biochim. Biophys. Acta 1171:11.
29. Betz, A. G., C. Rada, R. Pannell, C. Milstein, M. S. Neuberger. 1993. Passenger transgenes reveal intrinsic specificity of the Ab hypermutation mechanism: clustering, polarity, and specific hot spots. Proc. Natl. Acad. Sci. USA 90:2385-2388.
30. Wagner, S. D., C. Milstein, M. S. Neuberger. 1995. Codon bias targets mutation. Nature 376:732 (Lett.).
31. Smith, D. S., G. Creadon, P. K. Jena, J. P. Portanova, B. L. Kotzin, L. J. Wysocki. 1996. Di- and trinucleotide target preferences of somatic mutagenesis in normal and autoreactive B cells. J. Immunol. 156:2642.
32. Tomlinson, I. 1998 December 15. V Base Index. http://www.mrc-cpe.cam.ac.uk/imt-doc/restricted/ok.html
33. Tomlinson, I. M., G. Walter, J. D. Marks, M. B. Llewelyn, G. Winter. 1992. The repertoire of human germline VH sequences reveals about fifty groups of VH segments with different hypervariable loops. J. Mol. Biol. 227:776.
34. Instituto de Biotecnología, Universidad Nacional Autónoma de México. 1998 December 15. ABG: Germline gene directories of mouse. http://www.ibt.unam.mx/∼almagro/V_mice.html.
35. Dorner, T., H. P. Brezinschek, R. I. Brezinschek, S. J. Foster, R. Domiati-Saad, P. E. Lipsky. 1997. Analysis of the frequency and pattern of somatic mutations within nonproductively rearranged human variable heavy chain genes. J. Immunol. 158:2779.
36. Dunn-Walters, D. K., J. Spencer. 1998. Strong intrinsic biases towards mutation and conservation of bases in human IgVH genes during somatic hypermutation prevent statistical analysis of antigen selection. Immunology 95:339.
37. Kabat, E. A., T. T. Wu, H. M. Perry, K. S. Gottesman, C. Foeller. 1991. Sequences of Proteins of Immunological Interest. U.S. Department of Health and Human Services, U.S. Government Printing Office, Washington, D.C.
38. Chothia, C., A. M. Lesk, E. Gherardi, I. M. Tomlinson, G. Walter, J. D. Marks, M. B. Llewelyn, G. Winter. 1992. Structural repertoire of the human VH segments. J. Mol. Biol. 227:799.
39. Tomlinson, I. M., J. P. Cox, E. Gherardi, A. M. Lesk, C. Chothia. 1995. The structural repertoire of the human Vκ domain. EMBO J. 14:4628.
40. Williams, S. C., J. P. Frippiat, I. M. Tomlinson, O. Ignatovich, M. P. Lefranc, G. Winter. 1996. Sequence and evolution of the human germline Vλ repertoire. J. Mol. Biol. 264:220.
41. Nadel, B., A. J. Feeney. 1995. Influence of coding-end sequence on coding-end processing in V(D)J recombination. J. Immunol. 155:4322.
42. Nadel, B., A. J. Feeney. 1997. Nucleotide deletion and P addition in V(D)J recombination: a determinant role of the coding-end sequence. Mol. Cell. Biol. 17:3768.
43. Zuker, M. 1998 November 11. Mfold. http://mfold1.wustl.edu/∼mfold/rna/form1.cgi.
44. Milstein, C., M. S. Neuberger, R. Staden. 1998. Both DNA strands of Ab genes are hypermutation targets. Proc. Natl. Acad. Sci. USA 95:8791.
45. Chang, B., P. Casali. 1994. The CDR1 sequences of a major proportion of human germline Ig VH genes are inherently susceptible to amino acid replacement. Immunol. Today 15:367.
46. Kepler, T. B. 1997. Codon bias and plasticity in immunoglobulins. Mol. Biol. Evol. 14:637.
47. Dunn-Walters, D. K., A. Dogan, L. Boursier, C. M. MacDonald, J. Spencer. 1998. Base-specific sequences that bias somatic hypermutation deduced by analysis of out-of-frame human IgVH genes. J. Immunol. 160:2360.
48. Golding, G. B., P. J. Gearhart, B. W. Glickman. 1987. Patterns of somatic mutations in immunoglobulin variable genes. Genetics 115:169.
49. Kolchanov, N. A., V. V. Solovyov, I. B. Rogozin. 1987. Peculiarities of immunoglobulin gene structures as a basis for somatic mutation emergence. FEBS Lett. 214:87.
50. Gearhart, P. J., D. F. Bogenhagen. 1983. Clusters of point mutations are found exclusively around rearranged Ab variable genes. Proc. Natl. Acad. Sci. USA 80:3439.