## Abstract

To gain insight into the molecular causes and functional consequences of allelic inclusion of TCR α-chains, we develop a computational model for thymocyte selection in which the signal that determines cell fate depends on surface expression. Analysis of receptor pairs on selected dual TCR cells reveals that allelic inclusion permits both autoreactive TCR and receptors not in the single TCR cell repertoire to be selected. However, in comparison with earlier theoretical studies, relatively few dual TCR cells display receptors with high avidity for thymic ligands because their α-chains compete aggressively for the β-chain, which hinders rescue from clonal deletion. This feature of the model makes clear that allelic inclusion does not in itself compromise central tolerance. A specific experiment based on modulation of TCR surface expression levels is proposed to test the model.

Somatic recombination of noncontiguous gene segments encoding variable portions of TCR α- and β-chains results in a diverse repertoire of specificities for Ag. Productive rearrangement of both β loci is rare (1, 2, 3). In contrast, such allelic inclusion is common at the α locus, and about a quarter of peripheral T cells can in principle express two receptors (2, 4, 5). This violation of the “one cell, one receptor” rule has long been seen as a challenge to the clonal selection theory (6, 7, 8) and speculated to be a cause of autoimmunity (9, 10).

Lending credibility to this idea, dual transgenic TCR cells that express an autoreactive receptor at low levels have been observed to escape deletion and kill specifically in response to self Ag both in vitro (9) and in vivo (10). If clones that react to self Ag were restricted to such artificial situations and did not arise naturally at significant rates, allelic inclusion in itself would not compromise thymic education. One theoretical study suggested that the majority of dual TCR cells carry autoreactive receptors (11), but a more recent experiment paints a different picture. Cells with transgenic TCR specific for a foreign Ag (human collagen IV peptide presented by I-A^{s}) required coexpression of an endogenous receptor to be positively selected but then proliferated in response to that foreign Ag in the periphery (12). Thus, allelic inclusion could on balance improve immunity by expanding the repertoire of tolerant TCR that recognize foreign Ag (8, 12).

To assess whether dual TCR cells breach central tolerance, we go beyond earlier theoretical treatments of allelic inclusion (2, 11, 13) to develop a consistent model for single and dual TCR cells that yields specific, experimentally testable predictions. An important advance of the model is that the signal that determines cell fate varies with TCR surface expression. This consideration is found to enrich the fraction of dual TCR cells with receptors that exhibit relatively low avidity for self Ag. In other words, allelic inclusion is not inherently at odds with thymic education. Based on these results and available experimental data, we conclude that autoimmunity stems from coincident peripheral features or events that permit otherwise tolerant receptors to react to self Ag.

## Materials and Methods

As noted in the Introduction, the goal of the present study is to develop a model that treats single and dual TCR cells in a consistent fashion to investigate the causes and consequences of allelic inclusion. Below, we outline the model (Fig. 1), followed by the method used to simulate its dynamics (Fig. 2); parameters are summarized in Table I.

| Parameter | Description | Value |
|---|---|---|
| *l* | Bit string length | 24 |
| *k* | Energy scale | 1.6 |
| *E*_{0} | Energy zero point (in number of bit matches) | 12 |
| *N*_{self} | Number of self Ag bit strings | 1 |
| *N*_{cell} | Initial number of cells | 10^{5} |
| *L*_{cut} | Lower selection cutoff | 2.0 |
| *H*_{cut} | Higher selection cutoff | 3.0 |
| *P*_{R} | Probability of rearranging the same allele as in the previous round | 0.0 |
| *P*_{P} | Probability of productive rearrangement | 0.3 |
| *P*_{D} | Probability of deletion | 0.7 |
| *f*_{α} | Weight of the α-chain in determining the total TCR-pMHC interaction | 0.7 |
| *P*_{div} | Probability of division following rearrangement | 0.3 |
| *N*_{α} | Number of α-chains | 20,000 |
| *N*_{β} | Number of β-chains | 10,000 |
| *N*_{R} | Maximum number of allowed rearrangements | 10 |
| *L*_{MHC} | Number of TCR-pMHC bit matches to be considered MHC restricted | 4 |
| *P*_{α} | Probability that each bit in an α-chain is a 1 | 0.5 |
| *P*_{β} | Probability that each bit in a β-chain is a 1 | 0.5 |

### Molecular interactions

Because the purpose of the model is to gain insight by identifying trends rather than to reproduce details for particular TCR heterodimers, strings of binary variables (bits) are used to encode molecular properties (Fig. 1). Similar schemes have been used in other contexts (14, 15, 16). In the present study, separate bit strings, each of length *l*, are used for the β-chain and each productively rearranged α-chain. The number of bit matches determines the strength of interaction between two molecules. In other words, each pair of corresponding binary variables with the same value (0 or 1) contributes a unit of favorable binding free energy. Half of the bits in the α- and β-chains (*l*/2 of each) are used to evaluate the stability of the TCR heterodimer interface. The remainder of the bits in the α- and β-chains are used to determine the matches to the peptide and MHC (pMHC), each of which is a string of *l*/2 bits as indicated in Fig. 1. Equivalent results could be obtained by directly assigning αβ and TCR-pMHC pairs affinities according to a fixed distribution, but the discrete nature of the bit strings facilitates analysis and exposition of the model.
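The bit string encoding described above can be illustrated with a minimal Python sketch. The function names (`random_chain`, `bit_matches`, `split_chain`) are hypothetical, and the choice of which half of a chain contacts the partner chain and which half contacts the pMHC is an assumption made only for illustration; the source specifies only that each half has *l*/2 bits.

```python
import random

L = 24  # bit string length l (Table I)

def random_chain(length=L, p_one=0.5, rng=random):
    """Return a random bit string as a list of 0/1 integers."""
    return [1 if rng.random() < p_one else 0 for _ in range(length)]

def bit_matches(a, b):
    """Count positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(1 for x, y in zip(a, b) if x == y)

def split_chain(chain):
    """Split a chain into (interface half, pMHC-contact half).

    Assumption: the first l/2 bits form the alpha-beta interface and the
    remaining l/2 bits contact the pMHC. The ordering is arbitrary.
    """
    half = len(chain) // 2
    return chain[:half], chain[half:]
```

For example, `bit_matches(split_chain(alpha)[0], split_chain(beta)[0])` would give the number of αβ interface matches for two chains `alpha` and `beta`.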

In the model, the relative importance of the α- and β-chains is treated as an adjustable parameter, and the total interaction between a TCR and a pMHC is an average of the α- and β-chain-pMHC bit matches weighted by *f*_{α} and (1 − *f*_{α}), respectively. Effectively, *f*_{α} controls the stability of the signal in successive rounds of rearrangement. When *f*_{α} = 0, a TCR-pMHC interaction is determined entirely by the β-chain, which is fixed for a given cell; when *f*_{α} = 1, a TCR-pMHC interaction is determined entirely by the α-chain, which can change subsequently. However, even when *f*_{α} = 0, the signal that controls cell fate still varies to some extent because the α-chain gene configuration modulates TCR expression as discussed below.
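As a sketch of the weighting just described, the total TCR-pMHC interaction can be written as a convex combination of the two chains' bit matches. The function names are hypothetical, and the inputs are assumed to be the pMHC-contact halves of each chain, each the same length as the pMHC bit string.

```python
def count_matches(a, b):
    """Count positions at which two equal-length bit strings agree."""
    return sum(1 for x, y in zip(a, b) if x == y)

def tcr_pmhc_matches(alpha_contact, beta_contact, pmhc, f_alpha=0.7):
    """Weighted average of alpha- and beta-chain bit matches with one pMHC.

    f_alpha = 0 makes the interaction depend only on the (fixed) beta-chain;
    f_alpha = 1 makes it depend only on the (rearrangeable) alpha-chain.
    """
    m_alpha = count_matches(alpha_contact, pmhc)
    m_beta = count_matches(beta_contact, pmhc)
    return f_alpha * m_alpha + (1.0 - f_alpha) * m_beta
```

With the default *f*_{α} = 0.7 from Table I, an α-chain that matches all 12 pMHC bits paired with a β-chain that matches none yields 0.7 × 12 = 8.4 weighted matches.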

### Rearrangement

Only cells at the double-positive stage are considered. Thus, each cell is assigned a bit string for the β-chain that remains fixed throughout the simulation (Fig. 2). Those cells that have neither been selected nor deleted continue to rearrange their α-chain genes until the maximum number of rearrangements allowed (*N*_{R}), which acts as a surrogate for time, is exhausted. Before the first rearrangement, both chromosomes are equivalent; after the first rearrangement, the same chromosome as in the previous round is picked again with probability *P*_{R}.

Each rearrangement has a one in three chance that it is in-frame, but the actual likelihood that a functional protein can be made is lower due to the presence of pseudogenes and the possibility of generating a stop codon (13). Here, we take the probability that a rearrangement is productive (*P*_{P}) to be 0.3. This choice does not affect the composition of the dual TCR cell population with respect to receptor pairs, but the overall fraction of selected cells that exhibit allelic inclusion increases linearly with *P*_{P} (data not shown), as previously found analytically (13). If the rearrangement is productive, the α-chain is assigned a bit string randomly with a given composition, and the distribution of molecules on the surface is determined as described below. Otherwise, the number of rearrangements is incremented, and a new rearrangement is attempted.
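A single rearrangement attempt, as described in this section, might be sketched as follows. The function `rearrange_once` and its data layout are hypothetical. In particular, the source does not state what happens to an allele's earlier productive rearrangement when a later attempt on the same allele is nonproductive; the sketch assumes it is lost.

```python
import random

L = 24              # bit string length l (Table I)
P_PRODUCTIVE = 0.3  # P_P (Table I)
P_REPEAT = 0.0      # P_R (Table I)

def rearrange_once(alleles, last_allele, p_productive=P_PRODUCTIVE,
                   p_repeat=P_REPEAT, p_one=0.5, rng=random):
    """Perform one alpha-chain rearrangement attempt.

    alleles: list of two entries, each None (no productive alpha-chain)
             or a list of 0/1 bits.
    last_allele: index (0 or 1) rearranged in the previous round, or None
                 before the first rearrangement.
    Returns (new_alleles, index_of_allele_rearranged).
    """
    if last_allele is None:
        idx = rng.randrange(2)      # both chromosomes are equivalent
    elif rng.random() < p_repeat:
        idx = last_allele           # edit the same allele again
    else:
        idx = 1 - last_allele       # switch to the other allele
    new_alleles = list(alleles)
    if rng.random() < p_productive:
        new_alleles[idx] = [1 if rng.random() < p_one else 0
                            for _ in range(L)]
    else:
        # Assumption: a nonproductive attempt leaves no alpha-chain here.
        new_alleles[idx] = None
    return new_alleles, idx
```

In a full simulation this step would be followed by the surface expression, signaling, and selection calculations, and repeated up to *N*_{R} times per cell.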

### Division

After each rearrangement, a cell divides with probability *P*_{div}. The probability of division is close to that used in earlier theoretical studies (11) and leads to an average number of thymic divisions (about four) close to that observed in experiments (17). Following division, both daughter cells are allowed the number of rearrangements remaining for the parent cell at the time of division. Simulations without division yielded consistent behavior.

### Cell surface expression

TCR heterodimers composed of specific α- and β-chains vary markedly in their efficiency of surface expression (18, 19, 20, 21). Both signal-dependent and independent mechanisms have been suggested to influence the population of molecules displayed by a cell (19, 21, 22, 23, 24, 25, 26). The former is discussed below; the latter concerns the kinetics of making, trafficking, and assembling the various TCR components. The approach used in the present study is not specific to a particular molecular scenario. Given the bit strings for the α- and β-chains, TCR surface expression is determined from the number of αβ bit matches and an effective temperature that sets the energy scale. The latter controls the degree to which α-chains of different affinities for the β-chain compete and was chosen so that expression levels in single TCR cells varied by approximately an order of magnitude in accord with experimental observations (19).

The weighting scheme is equivalent to solving the system of equations corresponding to the equilibrium distribution for the association reaction(s)

$$\alpha_i + \beta \;\overset{K_{eq}}{\rightleftharpoons}\; \alpha_i\beta \tag{1}$$

where *i* = 1 if there is only one productive α rearrangement and *i* = 1 and 2 if there are two. Mathematically, *K*_{eq} = exp[*k*(*E − E*_{0})], where *E* is the number of αβ bit matches for the pair in question and the other parameters are described in Table I.

In a given simulation, the total numbers of α- and β-chains are fixed. It was assumed that the total number of α-chains was the same in single and dual TCR cells; qualitatively similar results were obtained in simulations in which such homeostasis was not maintained and dual TCR cells made twice as many α-chains as single TCR cells.
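The equilibrium calculation can be sketched numerically. The code below assumes mass-action association of each productive α-chain with a shared β-chain pool, with *K*_{eq} = exp[*k*(*E* − *E*_{0})] expressed per molecule, and assumes that the fixed α-chain total is split equally between the two alleles in a dual TCR cell. The function name and the bisection solver are illustrative choices, not the authors' implementation.

```python
import math

def surface_expression(interface_matches, n_alpha=20000, n_beta=10000,
                       k=1.6, e0=12):
    """Equilibrium numbers of alpha_i-beta heterodimers.

    interface_matches: list of one or two alpha-beta bit-match counts E_i.
    Returns a list with the number of alpha_i-beta pairs for each chain.
    """
    n_chains = len(interface_matches)
    if n_chains not in (1, 2):
        raise ValueError("expected one or two productive alpha-chains")
    # Assumption: the alpha-chain total is shared equally among alleles.
    a_tot = [n_alpha / n_chains] * n_chains
    k_eq = [math.exp(k * (e - e0)) for e in interface_matches]

    def bound(free_beta):
        # From x_i = K_i * (A_i - x_i) * b  =>  x_i = A_i K_i b / (1 + K_i b)
        return [a * kq * free_beta / (1.0 + kq * free_beta)
                for a, kq in zip(a_tot, k_eq)]

    # Bisection on free beta b: b + sum_i x_i(b) = n_beta (monotonic in b).
    lo, hi = 0.0, float(n_beta)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + sum(bound(mid)) > n_beta:
            hi = mid
        else:
            lo = mid
    return bound(0.5 * (lo + hi))
```

Because both α-chains draw on the same β-chain pool, the chain with more interface matches captures a disproportionate share of the β-chains, which is the competition effect referred to in the text.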

Whether homeostatic mechanisms maintain total TCR surface expression at fixed levels has not been studied for endogenous receptors. However, it is clear that TCR formed from different transgenic α-chains paired with the same transgenic β-chain are expressed at different levels on the surface of single TCR cells even when the mRNA is transcribed at the same level (19). As such, we have allowed the total surface expression to vary according to Equation 1. We also investigated the effects of keeping the overall surface expression at a fixed level and only varying the relative expression of receptors on dual TCR cells, and almost all features of the model are robust to this alternative scheme. The only significant difference is that varying the average stability of the αβ heterodimer interface has no effect on the extent of allelic inclusion because fixing the total TCR expression prevents variation in the average signal (data not shown).

### Down-regulation

Recent experiments suggest that phenotypic allelic exclusion in cells with two productive rearrangements derives largely from preferential down-regulation of TCR in a signal-dependent manner (25, 26). Because it is thought that this mechanism of regulating expression initiates at the single-positive stage (25, 26, 27), we did not include it in our model of double-positive thymocytes. However, to ensure the robustness of the results, we performed simulations in which we removed TCR with either higher or lower numbers of bit matches to the pMHC before comparing the signal with the selection cutoffs. Regardless of whether a linear or nonlinear (threshold) rule was used for determining the extent of TCR down-regulation, the average signal became lower, but the overall trends remained the same (data not shown). The control of TCR surface expression in mature T cells in the periphery is beyond the scope of the present study and will be discussed elsewhere.

### Signaling

T cell responses to Ag are correlated with the affinity between the TCR and pMHC (28, 29). In the model, the signal that determines cell fate (*s*) derives from the surface expression of TCR and their bit matches with self pMHC:

$$s = \frac{1}{M}\sum_{i} N_{\alpha i}\, s_{i,\max} \tag{2}$$

Here, *s*_{i,max} is the weighted average number of matches between the components of the *i*-th receptor and the best matching self pMHC, *N*_{αi} is the number of α_{i}β TCR expressed on the surface (either *N*_{α1} = 0 or *N*_{α2} = 0 in a single TCR cell), and *M* is a constant normalization factor chosen such that the signal varies between zero and the maximum number of possible bit matches for any pair of molecules (*l*/2). Because the length of the bit strings used here requires using a small number of self pMHC in the simulations (from 1 to 10), the bits that represent the MHC are the same for all peptides to reflect the fact that the number of different MHC expressed in an individual is very small in comparison with the size of the self Ag repertoire in the thymus. Only the best matching pMHC is used to ensure that the average signal increases with the number of self Ags presented in the thymus. Although data suggest that T cells can integrate signals in time (30) and over several ligands (31, 32), including such features in the model would require treating time and TCR phosphorylation in a more explicit fashion, which would preclude simulating a sufficient number of selection events for meaningful repertoire statistics.

If we associate the number of TCR-pMHC bit matches with the strength and quality of signal (varying monotonically from a strongly antagonizing signal to a strongly activating one) rather than directly with the free energy of binding, this integration scheme captures aspects of observed cross-antagonism (31, 32, 33) and dilution effects (9, 10) in a simple way. TCR with fewer bit matches with the pMHC are able to rescue ones with more from negative selection. Use of an alternative rule that weighted signals nonlinearly such that the TCR that better matched the pMHC dominated the signal would favor dual TCR cells bearing low avidity receptors over those bearing high avidity ones to an even greater degree than observed below.
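The dilution effect can be made concrete with a short sketch. It assumes the signal is the surface-weighted sum of each receptor's best match, divided by a constant *M*, which is the form implied by the definitions of *s*_{i,max}, *N*_{αi}, and *M* in the text. Taking *M* equal to the number of β-chains (the largest possible number of surface TCR) is an additional assumption, as is the function name.

```python
def selection_signal(receptors, norm=10000):
    """Surface-weighted signal for a cell.

    receptors: list of (n_surface, s_max) pairs, one per productive
               alpha-chain, where s_max is the weighted number of bit
               matches with the best matching self pMHC.
    norm: constant normalization factor M (assumed here to equal N_beta).
    """
    return sum(n * s_max for n, s_max in receptors) / norm

# A receptor with many pMHC matches, alone on the surface:
alone = selection_signal([(5000, 10.0)])                    # 5.0
# The same receptor sharing the beta-chain pool with a weaker partner:
diluted = selection_signal([(2000, 10.0), (3000, 2.0)])     # 2.6
```

With the cutoffs of Table I (2.0 and 3.0), the first cell would be subject to deletion, whereas the second falls inside the selection window even though it carries the same high-match receptor.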

### Selection

After each rearrangement, the signal is calculated as described above and compared with the thresholds for positive and negative selection. Cells with signals between the lower and upper cutoffs are positively selected, and statistics concerning their receptor pairs and cell surface distributions are accumulated. Those with signals above the upper cutoff are deleted with probability *P*_{D}. The remainder of cells are subjected to another round of rearrangement or terminated as described above. We did not allow multiple rearrangements between tests, which corresponds to assuming that rearrangement is infrequent relative to the time it takes for a T cell to scan the self Ag repertoire; choosing whether to rearrange or test stochastically would yield qualitatively similar results so long as the kinetics of the former process are slower than the latter. We explored a wide range of possible thresholds for positive and negative selection and present data for reasonable, representative values (Table I, unless otherwise specified).
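The selection rule can be summarized in a few lines. This is a minimal sketch with a hypothetical function name; treating the cutoffs themselves as inside the selection window is an assumption, since the source does not specify how boundary values are handled.

```python
import random

def decide_fate(signal, l_cut=2.0, h_cut=3.0, p_delete=0.7, rng=random):
    """Return 'selected', 'deleted', or 'continue' for one selection test.

    Cells whose signal lies between the cutoffs are positively selected.
    Cells above the upper cutoff are deleted with probability p_delete.
    All other cells go on to another round of rearrangement (or are
    terminated elsewhere once their rearrangements are exhausted).
    """
    if signal > h_cut:
        return "deleted" if rng.random() < p_delete else "continue"
    if signal >= l_cut:
        return "selected"
    return "continue"
```

Note the asymmetry: a cell below the lower cutoff always survives to rearrange again, whereas a cell above the upper cutoff survives only with probability 1 − *P*_{D}.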

### Exhaustive enumeration

To confirm that the number of cells examined in the stochastic simulation was sufficient, all possible gene configurations were enumerated for up to two rearrangements, and the probabilities of selecting single and dual TCR cells were determined exactly. Good agreement was obtained. Direct comparison for larger numbers of rearrangements is computationally costly because it is necessary to enumerate all possible cell histories rather than restricting attention to the populations and possible transitions at any one time.

## Results

We present data from stochastic simulations of the bit string model for selection of single and dual TCR cells, and the latter are functionally classified according to their receptor pairs. We then show how the overall extent of genotypic allelic inclusion varies with each parameter in the model to identify means of experimentally verifying the results.

### Population analysis

The bit string model allows us to examine the gene configuration and surface TCR distribution of all selected cells to assess the functional consequences of allelic inclusion. Because the mechanisms that lead to autoimmunity are unknown, we chose not to label receptors as “autoreactive” or “repertoire expanding.” Rather, we classified each receptor on a dual TCR cell according to whether the signal of a hypothetical single TCR cell carrying only that receptor would be below the lower cutoff (denoted B for “below”), inside the range for selection (I for “inside”), or above the upper cutoff (A for “above”); the B type TCR were further subdivided as discussed below. It is important to stress that, given these definitions, not only does the maximum number of bit matches between a TCR and all pMHC in a simulation influence how it is counted, but so do its other properties. In particular, the surface expression modulates the signal in a single TCR cell to some extent, so that poor αβ pairing efficiency can cause a receptor with many bit matches with the self pMHC to be labeled B.

There are six possible pairs of B, I, and A type receptors, but the only significant ones are I/I, I/B, I/A, and A/B. Dual TCR cells that carry a B type TCR have the potential to expand the immune repertoire, but will do so only if MHC restricted, which we take to be the case when the number of matches between the TCR and MHC exceeds a fixed threshold (four of six bits for the data presented). Using this definition, we further divide the A/B and I/B cells according to whether their B receptor is MHC-restricted or ignorant. Thus, the pairs of interest are I/I, I/A, A/B-restricted, A/B-ignorant, I/B-restricted, and I/B-ignorant.
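The classification scheme above can be expressed as a small sketch. The function names are hypothetical; the inputs are assumed to be the signal that a hypothetical single TCR cell carrying only that receptor would produce and the number of matches between the receptor and the MHC bits. A receptor is treated as MHC restricted when it matches at least four MHC bits, following the four-of-six criterion used for the data presented.

```python
def classify_receptor(single_signal, mhc_matches,
                      l_cut=2.0, h_cut=3.0, l_mhc=4):
    """Label one receptor as 'A', 'I', 'B-restricted', or 'B-ignorant'."""
    if single_signal > h_cut:
        return "A"
    if single_signal >= l_cut:
        return "I"
    return "B-restricted" if mhc_matches >= l_mhc else "B-ignorant"

def pair_label(label_1, label_2):
    """Combine two receptor labels into a pair label such as 'I/B-ignorant'.

    Labels are ordered I, then A, then B, matching the pair names
    used in the text (I/I, I/A, A/B, I/B).
    """
    ordered = sorted([label_1, label_2], key=lambda t: "IAB".index(t[0]))
    return "/".join(ordered)
```

For instance, a cell carrying one receptor above the upper cutoff and one below the lower cutoff that matches only three MHC bits would be counted as A/B-ignorant.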

Significant numbers of cells with genotypic allelic inclusion are observed for all six categories (Tables II and III). At first glance, the fact that there are cells with two I type receptors might seem surprising because a cell with only one would be selected, which would preclude rearrangement of the second allele. However, I/I pairs can arise from the following sequence of events. A T cell initially makes a productive rearrangement that corresponds to a TCR with a signal outside the selection cutoffs (an A or B type receptor). The second allele then rearranges to a bit string encoding an I type receptor, but it is insufficient to overwhelm the already present A or B type receptor to pull the signal inside the cutoffs. The cell makes another productive rearrangement and, this time, replaces the original A or B type receptor with an I type receptor; it is thus selected with an I/I gene configuration.

| *L*_{cut} | *H*_{cut} | Total | Single | Dual | Fraction | I/I | I/A | A/B RE | A/B IG | I/B RE | I/B IG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.4 | 2.6 | 2,285,609 | 102,462 | 29,724 | 0.23 | 0.02 | 0.03 | 0.37 | 0.53 | 0.02 | 0.03 |
| 2.2 | 2.8 | 2,052,583 | 237,547 | 66,994 | 0.22 | 0.03 | 0.03 | 0.26 | 0.37 | 0.14 | 0.18 |
| 2.0 | 3.0 | 1,806,303 | 343,578 | 82,656 | 0.19 | 0.04 | 0.03 | 0.15 | 0.20 | 0.25 | 0.34 |
| 1.5 | 3.5 | 1,346,363 | 486,133 | 81,608 | 0.14 | 0.05 | 0.02 | 0.04 | 0.05 | 0.32 | 0.52 |
| 1.0 | 4.0 | 1,078,696 | 502,827 | 73,505 | 0.13 | 0.03 | 0.01 | 0.01 | 0.01 | 0.42 | 0.54 |
| 0.5 | 4.5 | 801,610 | 487,512 | 41,414 | 0.08 | 0.05 | 0.00 | 0.00 | 0.00 | 0.35 | 0.60 |

Table II. Rows are ordered from the most restrictive to the most permissive cutoffs. Single and Dual give the numbers of single and dual TCR cells selected from an initial population of 100,000 cells with a division probability of 0.3; Total gives the resulting number of cells tested for selection. The final six columns give the fraction of dual TCR cells with the indicated receptor pair. TCR were classified as MHC restricted (RE) if they matched at least four of six MHC bits and ignorant (IG) otherwise. Results shown were obtained without down-regulation.

| *N*_{self} | Total | Single | Dual | Fraction | I/I | I/A | A/B RE | A/B IG | I/B RE | I/B IG |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1,806,303 | 343,578 | 82,656 | 0.19 | 0.04 | 0.03 | 0.15 | 0.20 | 0.25 | 0.34 |
| 2 | 1,677,651 | 318,514 | 81,671 | 0.20 | 0.03 | 0.03 | 0.17 | 0.24 | 0.22 | 0.31 |
| 3 | 1,591,009 | 286,261 | 79,518 | 0.22 | 0.03 | 0.03 | 0.20 | 0.28 | 0.18 | 0.28 |
| 4 | 1,517,339 | 259,272 | 75,321 | 0.23 | 0.02 | 0.03 | 0.22 | 0.32 | 0.15 | 0.25 |
| 5 | 1,499,703 | 253,206 | 72,920 | 0.22 | 0.03 | 0.04 | 0.23 | 0.33 | 0.14 | 0.24 |
| 6 | 1,473,722 | 240,862 | 69,849 | 0.23 | 0.03 | 0.04 | 0.24 | 0.36 | 0.12 | 0.23 |
| 7 | 1,449,073 | 234,676 | 68,274 | 0.23 | 0.03 | 0.04 | 0.24 | 0.37 | 0.10 | 0.22 |
| 8 | 1,443,710 | 232,095 | 67,633 | 0.23 | 0.03 | 0.04 | 0.25 | 0.38 | 0.10 | 0.21 |
| 9 | 1,424,267 | 226,848 | 65,940 | 0.23 | 0.03 | 0.04 | 0.25 | 0.39 | 0.09 | 0.20 |
| 10 | 1,404,655 | 224,142 | 63,691 | 0.22 | 0.03 | 0.04 | 0.26 | 0.39 | 0.09 | 0.20 |

Table III. Rows correspond to increasing numbers of self Ag bit strings (*N*_{self}). Columns are the same as in Table II.

In Fig. 3, we show that allelic inclusion does not breach central tolerance. TCR on selected dual TCR cells produce signals that span the full range below the lower cutoff but only a limited range above the upper cutoff (Fig. 3, *b* and *c*). To understand this behavior, it is helpful to keep in mind that, in the model, the probability of selection depends on a weighted average of signals from cell surface TCR (Equation 2). The higher the signal associated with a TCR in the first place, the better its α-chain must pair with the β-chain. As a result, a strongly autoaggressive receptor is difficult to mask and tends to result in clonal deletion before successive rearrangements can produce a tolerizing partner. Consistent with this idea, as the probability of negative selection (*P*_{D}) increases, the population of I/A cells shifts to I/B and I/I (which then mostly derive from I/B) (data not shown). Indeed, in Fig. 3, *b* and *c*, most of the I type receptors are close to the upper cutoff because they are from I/B rather than I/A pairs, which mandates that their signals be sufficiently high to overcome dilution for positive selection.

In Table II, we illustrate the effects of making selection more permissive. When the cutoff range is narrow, receptors are unlikely to be classified as I, and the dual TCR cells are dominated by A/B pairs. As the range widens (but remains centered on the same value), the population shifts primarily to the I/B category because I type receptors are more prevalent and negative selection eliminates most cells with I/A pairs. In general, the fact that cells with signals above the upper cutoff are much more likely to be deleted than those with signals below the lower cutoff leads to fewer dual TCR cells with A type than B type receptors, many of which are MHC restricted. The number of I/I cells is always small because, in the model, the only route to their formation is through editing an I/A or I/B gene configuration as described above.

To explore the effects of increasing the average signal relative to the selection cutoffs, we vary the number of self-derived peptides (Table III). In these simulations, the bits that represent the MHC are the same for all peptides to reflect the fact that the number of different MHC expressed in an individual is very small in comparison with the size of the self Ag repertoire in the thymus. As the number of peptides increases, the population of B type receptors shifts from MHC restricted to ignorant. To understand this trend, it is important to keep in mind how the signal is computed in the model. For each TCR, the number of matches with an entire pMHC bit string (rather than separate peptide and MHC components) is determined, and the maximum over all the pMHC is used in the surface-weighted average to compute the signal. When the number of peptides is large, it is very likely that a TCR will match one well; consequently, only receptors that are not MHC restricted will tend to have sufficiently few matches with full pMHC bit strings to be classified as B type TCR. Because the overall signal is derived from the surface expression and the number of bit matches of the TCR with peptides and MHC, changing any parameter to raise one of these three quantities (for example, increasing the number of β-chains or improving αβ pairing efficiency) requires in general that the other two be lower on average for the signal to fall within the selection cutoffs.

### Fraction of dual TCR cells

Calculation of experimentally observable trends is important for validating the model and its predictions concerning receptor pairs. Consequently, we varied each parameter in the model over a wide range of values to determine its influence on a measurable quantity such as the overall extent of allelic inclusion. The results are best understood in terms of the number of selection attempts as a dual TCR cell and the average signal level, each of which is discussed in turn.

Three parameters in the model determine the number of times a cell is typically tested for selection with two productively rearranged α-chain loci: the maximum number of rearrangements (*N*_{R}), the probability of deletion (*P*_{D}), and the probability of editing the last allele rearranged (*P*_{R}). The first two variables control the overall number of rearrangements, while the last influences the likelihood of allelic inclusion in any given round. Because the probability of selection as a dual TCR cell is roughly constant each time the signal is computed from two TCR and compared with the cutoffs, increases in such events translate to increases in the fraction of selected cells that exhibit genotypic allelic inclusion. Increasing *N*_{R} or decreasing *P*_{D} boosts the fraction of selection attempts which are made with two productive rearrangements and thus the fraction of dual TCR cells (Fig. 4 and data not shown), consistent with results of earlier theoretical studies (2, 11, 13). As *P*_{R} becomes closer to unity, the same allele is rearranged more often, which decreases the likelihood that a cell has two productive rearrangements at any time (data not shown).

In contrast, the extent of genotypic allelic inclusion varies nonmonotonically with factors that increase the average signal of the cells tested for selection. These include the total number of β-chains, the pairing efficiency of α- and β-chains, and the number of self peptides. The first two factors control the overall TCR surface expression; the third directly affects the number of TCR-pMHC bit matches used to compute the signal. Raising the average signal increases the number of cells with signals above the lower cutoff but also those with signals above the upper one. Thus the numbers of both single and dual TCR cells first increase and then decrease with average signal (Fig. 5). What is not intuitive is that the fraction of selected cells that exhibit allelic inclusion is highest at intermediate values.

This result can be understood by considering low, medium, and high signal scenarios. When the signal is low, positive selection is unlikely, and cells undergo multiple rearrangement attempts but almost never succeed in being selected. As the signal is raised, cells still tend to go through more than one round, but the probability of forming a receptor that rescues the cell is increased. For sufficiently high signal levels, most TCR that match the pMHC are either positively or negatively selected by themselves, so there are very few opportunities for selection with two productively rearranged α-chain loci. In other words, the nonmonotonic variation in the fraction of dual TCR cells derives from a competition between boosting the probability that a dual TCR cell is selected once formed and limiting the sampling of gene configurations with two productively rearranged α-chain loci as the TCR signals vary over their full range in the model.

### Simulated transgenic experiment

The discussion above reveals a previously unanticipated dependence of the fraction of selected cells that exhibit allelic inclusion on average signal levels, which depend on TCR surface expression. In this section we propose a specific experiment to test this prediction.

Although protein knockouts can be used to manipulate TCR surface expression (25), the interpretation of such experiments is complicated by the fact that the connectivity of the signaling network is effectively changed. A more direct approach is to limit the availability of TCR components. Using a binary transgenic strategy (34, 35), Labrecque et al. (36) generated mice with T cells in which the number of OT-1 TCR α-chains could be modulated in a dose-dependent manner by tetracycline treatment. Because we want to probe allelic inclusion of endogenous α-chains, it is desirable to put a different TCR component under the control of the tetracycline-responsive transactivator. Doing so avoids the need to perturb the loci of interest and the possibility of selecting triple TCR cells (observed in Ref. 23). Either CD3, the ζ-chain, or the β-chain on an endogenous β^{0} background (again, to avoid generating cells with more than two receptors combinatorially) could be modulated in double-positive thymocytes.

Because phenotypic and genotypic allelic inclusion are inherently correlated, surface staining with V_{α}-specific reagents could be used as the readout if care were taken to account for overall differences in surface expression. However, it would be better to sequence selected clones. Not only would the extent of genotypic allelic inclusion be unambiguous, but the J_{α} gene segments, which are used almost sequentially (37, 38), could be identified for a rough estimate of the number of rearrangements. The latter would allow the data to be connected with the underlying competition between boosting selection probability and limiting the number of attempts as a dual TCR cell.

Because only the α- and β-chains are treated explicitly in the model, we simulated the experiment by varying the number of β-chain molecules for populations of cells uniform with respect to the β-chain bit string. The results are essentially identical to those obtained with many different (endogenous) β-chains (Fig. 4). In comparing Fig. 4 with experimental data, it is important to keep in mind that the full range shown might not be accessible in animals. If the vast majority of cells fail positive selection, only the left (increasing) part of the curve will be observed; the opposite will be true if most cells fail negative selection. Thus the experiment can actually provide information about the degree of autoreactivity in the preselection repertoire as well. Any variation of the fraction of dual TCR cells with overall TCR surface expression would constitute a successful prediction of the model because this means of manipulating allelic inclusion has not been suggested elsewhere to the best of our knowledge. In contrast, a lack of variation in the extent of allelic inclusion would support the idea that selection is dominated by the affinity for self of the most recently formed TCR and not cell surface expression levels, as implied by earlier models (2, 11, 13).
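The interpretation rules in the preceding paragraph can be written as a small helper for reading a measured dose-response series. The Python sketch below is only an illustration: the function name `classify_trend`, the tolerance `tol`, and the example numbers are hypothetical, and a real analysis would need proper error estimates rather than a fixed threshold.

```python
def classify_trend(fractions, tol=0.01):
    """Coarsely classify how the dual TCR fraction varies with TCR surface expression.

    `fractions` holds the fraction of dual TCR cells measured at increasing
    levels of overall TCR surface expression. `tol` is a hypothetical threshold
    below which differences are treated as noise.
    """
    if len(fractions) < 2:
        raise ValueError("need measurements at two or more expression levels")

    if max(fractions) - min(fractions) <= tol:
        return "flat"  # no detectable dependence on surface expression

    peak = max(range(len(fractions)), key=lambda i: fractions[i])
    rise = fractions[peak] - fractions[0]   # change from lowest expression to the maximum
    fall = fractions[peak] - fractions[-1]  # change from the maximum to highest expression

    if rise > tol and fall > tol:
        return "peaked"      # both sides of the curve are accessible
    if rise > tol:
        return "increasing"  # only the left part of the curve is observed
    if fall > tol:
        return "decreasing"  # only the right part of the curve is observed
    return "other"           # e.g. a dip in the middle; not covered by the text


# Readings suggested by the text for each label (paraphrased).
INTERPRETATION = {
    "peaked": "full nonmonotonic range accessible",
    "increasing": "consistent with most cells failing positive selection",
    "decreasing": "consistent with most cells failing negative selection",
    "flat": "consistent with selection dominated by the most recently formed TCR",
    "other": "not anticipated by the model",
}

if __name__ == "__main__":
    label = classify_trend([0.05, 0.20, 0.33, 0.17, 0.02])  # made-up numbers
    print(label, "->", INTERPRETATION[label])
```

The classifier compares only the first, last, and maximal values, so it ignores smaller features of the curve. It is meant to make the logic of the proposed readout explicit, not to replace a statistical test.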

## Discussion

In the present study, we introduced a model for thymocyte selection in which the signal that determines cell fates depends on the cell surface distribution of Ag receptors. Analysis of receptor pairs on selected dual TCR cells indicates that TCR with low avidity for self-derived ligands in the thymus (due to either receptor surface number or pMHC interactions) are more readily rescued than those with high avidity, which results in the relative abundance of the former in the periphery. The total fraction of selected cells that exhibit allelic inclusion reflects a balance between boosting the probability of positive selection and limiting the number of rearrangements as the signal increases on average. This feature of the model leads to the prediction that the fraction of dual TCR cells in the periphery varies with overall TCR surface expression, and a specific experimental means of testing this idea was proposed.

To treat large numbers of cells, it was necessary to use a highly reduced representation of T cell signaling. We modeled TCR cross-antagonism (33) and down-regulation (39) in ad hoc fashions and ignored possible cooperative signal amplification mechanisms (40). Partially decoupling αβ pairing, down-regulation, and signaling makes the current simulation approach straightforward to interpret, but, as further quantitative data become available, it will be important to study the interplay of these processes with stochastic methods like those used in Ref. 40.

Nevertheless, the model in the present work goes far beyond earlier theoretical studies of allelic inclusion in T cells (2, 11, 13), which neglected variations in the probability of selection for cells with at least one productively rearranged allele. This simplifying assumption allowed Mason (13) to show analytically that an increase in the number of rearrangements results in an increase in the fraction of dual TCR cells; Mehr and coworkers (11) elaborated on this scheme with stochastic simulations. In these studies, receptors that persist from previous rearrangements do not influence the chance of clonal selection, which corresponds to a physical picture in which nonselectable TCR are always totally hidden even at the double-positive stage. As a result, the avidities of these TCR are unrestricted by selection. This feature, together with the assumption that 67% of TCR in the preselection repertoire react strongly to self, inflates the potential of dual TCR cells to trigger autoimmunity in Ref. 11. In our model, both TCR are considered in evaluating the signal, and, as the selection cutoffs become more permissive, the α-chains of those TCR with signals above the upper cutoff become increasingly difficult to outcompete; thus cells with such receptors tend to react strongly to self in the thymus and are deleted.
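The effect of assuming a fixed selection probability can be seen in a small Markov-chain calculation. The Python sketch below is written in the spirit of such schemes but is not a reimplementation of Refs. 11 or 13. It assumes (hypothetically) that in each round one of the two α alleles, chosen at random, rearranges and is productive with probability `p_productive`, that a newly productive receptor is selected with the fixed probability `p_select` regardless of any receptor persisting on the other allele, and that unselected productive rearrangements persist until their allele is targeted again. The function name and both parameter values are illustrative choices.

```python
def dual_fraction_fixed_selection(max_rounds, p_productive=1 / 3, p_select=0.5):
    """Fraction of selected cells with two productive alleles (toy fixed-probability scheme).

    m[k] is the probability mass of not-yet-selected cells with k productive
    alpha alleles. Selection depends only on the newest receptor, so a
    persisting receptor never changes the chance of selection.
    """
    p, s = p_productive, p_select
    m = [1.0, 0.0, 0.0]  # every cell starts with no productive allele
    single = 0.0         # selected with one productive allele
    dual = 0.0           # selected with two productive alleles

    for _ in range(max_rounds):
        # Mass whose untargeted allele is nonproductive / productive this round.
        other_empty = m[0] + 0.5 * m[1]
        other_full = 0.5 * m[1] + m[2]

        # New rearrangement is productive and passes selection.
        single += p * s * other_empty
        dual += p * s * other_full

        keep = p * (1.0 - s)  # productive but not selected: receptor persists
        lose = 1.0 - p        # nonproductive: targeted allele carries no receptor
        m = [
            lose * other_empty,
            keep * other_empty + lose * other_full,
            keep * other_full,
        ]

    total = single + dual
    return dual / total if total > 0 else 0.0


if __name__ == "__main__":
    for rounds in (1, 2, 3, 5, 10):
        print(rounds, round(dual_fraction_fixed_selection(rounds), 4))
```

With the default parameters the fraction is zero when only one round is allowed and increases with every additional round. This is the qualitative behavior attributed to the fixed-probability schemes above, and it contrasts with the nonmonotonic dependence obtained when both receptors contribute to the signal.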

As discussed in *Materials and Methods*, allelic exclusion can be maintained at a phenotypic rather than genotypic level. Both signal-dependent and signal-independent mechanisms have been suggested to allow one TCR to be functionally dominant (19, 21, 22, 23, 24, 25, 26). Data support the idea that both under- (23) and overstimulated TCR (39, 41) are down-regulated after the double-positive stage (25, 26, 27). To the extent that the speculation that TCR outside the selection cutoffs become hidden (25) is true, T cells will be further tolerized.

How then does allelic inclusion relate to autoimmunity? It has been shown that dual transgenic TCR cells that are tolerant in vivo despite low expression of an autoreactive receptor kill specifically in response to self Ag following activation of the other TCR in vitro (9). Experiments with a transgenic animal model demonstrate that dual TCR cells displaying low but measurable amounts of a receptor that reacts to Ags expressed ubiquitously in the hemopoietic system can escape deletion and induce autoimmune diabetes irrespective of the specificity of the second TCR when Ags relevant to the first are expressed in pancreatic tissue (10). Such cells could cause damage by continuously releasing cytokines, responding to an aberrant environment, or stochastically modulating the levels of TCR and other molecules relevant to activation. The last of these possibilities is of particular interest given the relatively recent demonstration that exposure to a viral superantigen can induce mature T cells to rearrange Ag receptor loci in the periphery (42, 43, 44). This observation suggests that a dual TCR cell can unmask a weakly autoaggressive TCR by eliminating a tolerizing partner. Evaluating this scenario and the impact it will have on different backgrounds requires a better understanding of the molecular factors that link TCR signaling, surface expression, and rearrangement. Nevertheless, these data are consistent with the idea that allelic inclusion does not in itself compromise thymic education. Rather, coincident peripheral features or events are required for self Ag to activate otherwise tolerant receptors, which would account for why autoimmunity manifests in such a wide range of ways.

## Acknowledgments

We thank Arup Chakraborty, Natan Dotan, and Esther Witsch for helpful discussions and critical readings of the manuscript.

## Disclosures

The authors have no financial conflict of interest.

## Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked *advertisement* in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
