Accessibility control of V(D)J recombination at Ag receptor loci depends on the coordinate activities of transcriptional enhancers and germline promoters. Recombination of murine Tcrd gene segments is known to be regulated, at least in part, by the Tcrd enhancer (Eδ) situated in the Jδ2-Cδ intron. However, there has been little characterization of promoters and other cis-acting elements that are activated by or collaborate with Eδ and that might function to regulate Tcrd gene recombination events. We now describe a strong promoter that is tightly associated with the murine Dδ2 gene segment. EMSAs reveal that upstream stimulatory factor 1, Runx1, c-Myb, lymphoid enhancer binding factor 1, NF1, and E47 all interact with this promoter in vitro. Of these, upstream stimulatory factor 1, Runx1, and c-Myb appear necessary for full promoter activity in transiently transfected cells. Moreover, the same three factors were found to interact with the promoter in vivo by chromatin immunoprecipitation. We suggest that these factors play important roles as Eδ-dependent regulators of Dδ2 accessibility in vivo. Consistent with the established roles of c-Myb and Runx factors in Eδ function, we detected low level, enhancer-independent activity of the Dδ2 promoter in transient transfection experiments. We speculate that the Dδ2 promoter may play a role as a weak, enhancer-independent regulator in vivo, and might contribute to residual Tcrd rearrangement in Eδ−/− mice.
Antigen receptor genes are formed by the somatic rearrangement of variable (V), diversity (D), and joining (J) gene segments during the development of T and B lymphocytes. The mechanics of V(D)J recombination are essentially identical in the two lymphoid lineages and at each of the TCR (Tcra, Tcrb, Tcrg, Tcrd) and Ig (Igh, Igk, Igl) loci. The recombination reaction is initiated by two lymphoid-specific proteins (RAG-1 and RAG-2) that recognize and cleave at conserved recombination signal sequences (RSS) that flank each of the gene segments, and is then completed by an array of DNA repair proteins (1). Despite this conserved mechanism, the recombination reaction is differentially regulated at the various loci as a function of cell lineage and developmental stage (2).
Several lines of evidence argue that the specificity of rearrangement is regulated at the level of gene segment accessibility (2, 3). Chromatin presents a barrier to RAG recognition and cleavage at RSSs, and lineage- and developmental stage-specific modulation of this barrier is thought to allow RAG proteins to access the appropriate RSSs. Although a precise molecular understanding of accessibility is still lacking, changes in histone acetylation, histone methylation, and DNA demethylation are all thought to contribute (3, 4).
It has long been appreciated that the transcription of unrearranged gene segments correlates with their potential for recombination (2, 3). A large body of data obtained through gene targeting or manipulation of recombination substrates has shown that transcriptional enhancers and germline promoters cooperate to control chromatin accessibility and V(D)J recombination as a function of lymphocyte lineage and developmental stage. Enhancers seem to influence locus accessibility over long distances; for example, deletion of the Tcrb enhancer (Eβ) (5, 6), the Tcra enhancer (Eα) (7), and the intronic and 3′ Igk enhancers (8) results in essentially complete blocks to V(D)J recombination at the relevant loci. In contrast, promoter elements seem to influence chromatin structure in a more local fashion, and often affect the choice of gene segment use within a locus. Accordingly, deletion of the T early α promoter abolishes rearrangement only of Jα segments located within a 15-kb region at the 5′ end of the Jα array (9), and elimination of the Dβ1 promoter (PDβ) impairs rearrangement of the Dβ1-Jβ1 cluster, but does not affect rearrangement of the Dβ2-Jβ2 cluster (10).
The Tcra Tcrd locus spans 1.6 megabases and contains V, D, J, and C gene segments that encode either TCRα or TCRδ chains (11). The locus is organized such that Tcrd and Tcra rearrangements are mutually exclusive, with two Dδ segments, two Jδ segments, and Cδ lying between the Vα and Jα segments. Tcrd, like Tcrb and Tcrg, rearranges early during T cell development, at the double-negative (DN) stage. Tcra rearrangement occurs later, at the double-positive stage, and occurs with deletion of rearranged Tcrd genes. Two enhancers are known to regulate this locus, the Tcrd enhancer (Eδ) and Eα. Eδ is important both for germline transcription through the Cδ region and for V(D)J recombination of Tcrd gene segments in DN thymocytes, but seems unimportant for the transcription of rearranged TCRδ genes in γδ T lymphocytes (12). Eα is required for germline transcription through the Jα array and for Vα to Jα recombination in double-positive thymocytes, and also promotes transcription of mature recombined Tcrd and Tcra genes in γδ and αβ T lymphocytes, respectively (7).
Although roles for Eδ in Tcrd locus recombination and transcription have been clearly established, there has been little characterization of promoters and other cis-acting elements that are activated by or collaborate with Eδ. By analogy with T early α and PDβ, such promoters might function directly as local, Eδ-dependent regulators of accessibility. Moreover, the analysis of Eδ-deleted mice revealed significant levels of residual partially (VD, DD, and DJ) and fully (VDJ) rearranged Tcrd genes, suggesting the presence of unidentified cis-acting elements that function redundantly with Eδ to target Dδ and Jδ segment recombination (12). Yet only a single report characterizing weak promoter activity associated with the Jδ1 gene segment has appeared previously (13).
In this study, we describe a strong promoter that is tightly associated with the murine Dδ2 gene segment. We show that c-Myb, Runx1, and upstream stimulatory factor 1 (USF1) interact with this promoter both in vitro and in vivo, and that all three factors appear necessary for full promoter activity in vivo. We suggest that these factors may play important roles as Eδ-dependent regulators of Dδ2 accessibility in vivo. Moreover, because we detected low level, enhancer-independent activity of the Dδ2 promoter in reporter assays, we speculate that the Dδ2 promoter may also play a role as a weak, Eδ-independent regulator in vivo.
Materials and Methods
The 5′-RACE was performed using a GeneRacer kit (Invitrogen Life Technologies), according to the manufacturer’s instructions. A Cδ-specific 3′ primer (5′-CAGGAACCGTAGTCTCCTCATGTC-3′) was used for touchdown PCR, as described. PCR products were purified by agarose gel electrophoresis and were cloned using a TOPO TA cloning kit for sequencing (Invitrogen Life Technologies). Clones were sequenced using a Model 3730 DNA Analyzer (Applied Biosystems).
Equimolar amounts of complementary, single-stranded oligonucleotides were annealed in 10 mM Tris-HCl, pH 7.4, 1 mM EDTA, and 50 mM NaCl by boiling for 5 min and slow cooling to room temperature. Annealed oligonucleotides were end labeled with [α-32P]deoxynucleotide triphosphates and the Klenow fragment of DNA polymerase. They were then gel purified, eluted overnight at room temperature in elution buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, and 50 mM NaCl), and were isolated by filtration.
Nuclear extracts were prepared and EMSAs were performed as described previously (14, 15), with some modifications. For the binding reaction, 5–10 μg of Jurkat or MOLT13 nuclear extract was incubated with 4–20 fmol of labeled, annealed oligonucleotides in 25 μl of 10 mM Tris-HCl, pH 7.6, 50 mM NaCl, 1 mM DTT, 1 mM EDTA, 200 μg/ml BSA, 2% glycerol, and (except when indicated) 2 μg of poly(dI-dC). Following a 20-min incubation at 4°C, 3 μl of loading buffer was added and the samples were electrophoresed through a 4.5% nondenaturing polyacrylamide gel at 350 V for 3.5 h. Gels were dried and exposed to x-ray film.
Cold competition was performed by incubating nuclear extracts with a 50- to 150-fold molar excess of unlabeled binding site for 20 min before adding the corresponding labeled probe. Ab inhibition/supershift experiments were performed by similar preincubation with 1 μg of specific Ab. Abs specific for USF1 (sc229), Max (sc765), Myb (sc517), T cell factor 1 (TCF1) (sc8589), lymphoid enhancer binding factor 1 (LEF1) (sc8592), Runx1 (sc8563), as well as control rabbit and goat IgG were purchased from Santa Cruz Biotechnology.
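As a quick check on the quantities involved in cold competition, the following minimal sketch computes the range of unlabeled competitor implied by 4–20 fmol of labeled probe and a 50- to 150-fold molar excess. The function name `competitor_range_fmol` is hypothetical and used only for illustration.

```python
def competitor_range_fmol(probe_fmol, fold_excess):
    """Return (min, max) fmol of unlabeled competitor.

    probe_fmol  -- (low, high) amount of labeled probe in fmol
    fold_excess -- (low, high) molar excess of unlabeled competitor
    """
    lowest = min(probe_fmol) * min(fold_excess)
    highest = max(probe_fmol) * max(fold_excess)
    return lowest, highest


# Ranges taken from the binding and competition conditions described above.
low, high = competitor_range_fmol(probe_fmol=(4, 20), fold_excess=(50, 150))
print(f"{low}-{high} fmol of unlabeled competitor per reaction")
```

Under these conditions the competitor amount spans roughly 0.2 to 3 pmol per reaction.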
The previously characterized transcription factor binding sites used were: Moloney murine leukemia virus (MoMLV) Runx, 5′-ATCTGTGGTAAGCA-3′; δE3 Runx, 5′-AGCAATGCATGTGGTTTCCAA-3′; δE3 Runxm, 5′-AGCAATGCATGACCTTTCCAA-3′; 1×Myb, 5′-TAGGAATAACGGAAT-3′; δE3 Myb, 5′-TTCCAACCGTTAATGCTAGA-3′; δE3 Mybm, 5′-TTCCAAGCTTTAATGCTAGA-3′; TCRα LEF, 5′-CGTAGGGCACCCTTTGAAGCTCTCCC-3′; and TCRα LEFm, 5′-CGTAGGGCACAATTTCAAGCTCTCCC-3′. Top and bottom strand oligonucleotides were synthesized with 5′ GATC overhangs, except for TCRα LEF and TCRα LEFm, which were synthesized with 5′ TCGA overhangs. Oligonucleotides containing the NF1 binding site (sc-2553) or its mutated form (sc-2554) were purchased from Santa Cruz Biotechnology.
Dδ2 and Jδ1 fragments tested in the luciferase assay were obtained by PCR using Pfx polymerase (Invitrogen Life Technologies) and plasmid pTAE7.5 (16) as a template. Primer sequences are available on request. PCR fragments were extracted with phenol:chloroform (1:1) and ethanol precipitated. Fragments were then digested with the appropriate restriction enzymes, electrophoresed through agarose gels, and purified using a Qiaex II gel extraction kit (Qiagen). Fragments were then ligated into the polylinker upstream of the Firefly luciferase gene of plasmids pXPG (17) or pXPGEα (A. Hawwari and M. Krangel, unpublished observations). pXPGEα carries a 380-bp human Eα fragment (nt 1062946–1063326 of GenBank file NG001332) inserted into the EcoRI site of pXPG downstream of the luciferase reporter gene. Mutations were introduced into the Dδ2 fragment (−238/+27) by overlap extension PCR, as described (18). The structures of all constructs were confirmed by DNA sequence analysis. All plasmids were purified using a maxiprep plasmid purification kit (Qiagen).
The Jurkat T cell line was transfected with purified plasmid DNA using Superfect (Qiagen), following the manufacturer’s recommendations. Briefly, 3 × 10⁶ cells were transfected in 6-cm dishes with 5 μg of test plasmid in 5 ml of RPMI 1640 (Invitrogen Life Technologies) supplemented with 10% FBS, penicillin, and streptomycin. As a control, 40 ng of the Renilla luciferase expression vector pTK-RL was included in each transfection. Transfected cells were harvested at 24 h and were assayed for luciferase activity using the Dual-Luciferase Reporter System (Promega).
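The dual-reporter design lends itself to a simple normalization: Firefly signal is divided by the cotransfected Renilla signal, and each construct is then expressed as a percentage of a reference construct. The sketch below illustrates that arithmetic under the assumption that this is how the raw readings are combined; the function names and the luminometer readings are hypothetical.

```python
def normalized(firefly, renilla):
    """Firefly signal corrected for transfection efficiency via Renilla."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def percent_of_standard(sample, standard):
    """Express a construct as a percentage of a reference construct.

    sample, standard -- (firefly, renilla) readings for each transfection
    """
    return 100.0 * normalized(*sample) / normalized(*standard)


# Hypothetical readings (arbitrary light units), for illustration only.
standard_construct = (50000.0, 1000.0)
truncated_construct = (18000.0, 900.0)
print(percent_of_standard(truncated_construct, standard_construct))
```

Normalizing before taking the ratio keeps differences in transfection efficiency between dishes from being read as differences in promoter strength.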
Chromatin immunoprecipitation was conducted with modifications of the protocol outlined in the Upstate Biotechnology Chromatin IP Assay Kit. RAG-2−/− thymocytes were centrifuged at 1300 × g and were washed and resuspended in cold PBS at 10⁷ cells/ml. Cells were cross-linked by adding formaldehyde to 1% (w/v) and incubating for 10 min on ice, and the reaction was stopped by adding glycine to 0.125 M and incubating for 5 min at 23°C. After a PBS wash, 10⁷ cells were centrifuged, washed, and resuspended in 1 ml of 5 mM PIPES, pH 8.0, 85 mM KCl, 0.5% Nonidet P-40, 10 μg/ml leupeptin, 0.1 mM PMSF, and 0.1 mM benzamidine for a 5-min incubation at 4°C. After centrifugation, the nuclei were suspended in 100 μl of 50 mM Tris-HCl, pH 8.0, 10 mM EDTA, 1% SDS, 10 μg/ml leupeptin, 0.1 mM PMSF, and 0.1 mM benzamidine, and were broken by vigorous pipetting. The volume was then adjusted to 1 ml with 10 mM Tris-HCl, pH 8.0, 1 mM EDTA, and the suspension was sonicated using a Model 550 Sonic Dismembrator (Fisher Scientific), alternating 15 s on and 20 s off for 10 cycles while the sample was immersed in an ice/water bath. Chromosomal DNA was reduced to an average size of 300–500 bp as determined by agarose gel analysis. The composition of the chromatin solution was then adjusted by addition of Tris-HCl, pH 8.0, Triton X-100, and NaCl to 25 mM, 1.1% (v/v), and 170 mM, respectively. Chromatin was precleared by incubation for 3 h at 4°C with 100 μl of a 50% salmon sperm DNA/protein A-agarose slurry (Upstate Biotechnology). Precleared chromatin corresponding to 3 × 10⁶ thymocytes was incubated for 16 h at 4°C with 10 μg of specific or isotype-matched control Abs, followed by 70 μl of protein A-agarose slurry for an additional 30 min at 4°C. The supernatant was saved as the unbound fraction.
Immunoprecipitates were washed by rocking for 5 min at 4°C once each with the following buffers (containing protease inhibitors): 1) 0.01% SDS, 1.1% Triton X-100, 1.2 mM EDTA, 20 mM Tris-HCl, pH 8.0, 170 mM NaCl; 2) 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris-HCl, pH 8.0, 500 mM NaCl; 3) 1% Nonidet P-40, 1% deoxycholic acid, 100 mM Tris-HCl, pH 8.0, 500 mM LiCl; and twice with 4) 10 mM Tris-HCl, pH 8.0, 1 mM EDTA. DNA/Protein/Ab complexes were eluted from the protein A-agarose by rocking twice for 15 min at 23°C in 250 μl of 50 mM NaHCO3, 1% SDS. After reversal of cross-links and deproteination, the presence of the Dδ2 promoter was assessed using the primers D2P F1 (5′-TAAGTACCCAGGCAAGTCTG-3′) and D2P R2 (5′-ACGGTTCTTCACCCTGCAGT-3′) for PCR, and radiolabeled probe D2P (5′-GGCTTTGTATCACGTGTCTCTG-3′) for detection. The presence of Oct2 was assessed using primers OCTF (5′-TGGAGGAGCTGGAACAGTTT-3′) and OCTR (5′-TGTTTGGACCTTGGCATCTTTG-3′) for PCR, and radiolabeled probe OCT (5′-GCACCTTCAAGCAACGCCGCA-3′) for detection. PCRs were conducted using as templates 0.02% of unbound material and 4% of the bound material (and successive 1/3 dilutions of each). PCR conditions were: 5 min at 95°C, followed by 25 cycles of 25 s at 95°C, 25 s at 60°C, 30 s at 72°C, and a final extension step of 2 min at 72°C. The products were resolved by 1.5% agarose gel electrophoresis, were blotted to a nylon membrane, and were probed with appropriate radiolabeled oligonucleotides.
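The template amounts used for PCR can be laid out explicitly. The sketch below lists the fraction of each sample present at every step of a 1/3 dilution series, starting from 0.02% of the unbound and 4% of the bound material. The number of dilution steps is not stated above, so the value of three used here is an assumption, and `dilution_series` is a hypothetical helper.

```python
from fractions import Fraction


def dilution_series(start_percent, steps, factor=Fraction(1, 3)):
    """Percent of a sample used as template at each step of a serial dilution."""
    start = Fraction(start_percent)
    return [start * factor**i for i in range(steps)]


# Starting fractions are from the text; steps=3 is an assumed value.
bound = dilution_series("4", steps=3)
unbound = dilution_series("0.02", steps=3)

for b, u in zip(bound, unbound):
    # At every step the bound template represents a 200-fold larger share
    # of its fraction than the unbound template does.
    print(f"bound {float(b):.4f}%  unbound {float(u):.5f}%  ratio {b / u}")
```

Because both series are diluted by the same factor, the 200-fold difference in sampled fraction is constant across the series and must be accounted for when bound and unbound signals are compared.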
Structure of germline Tcrd transcripts
To identify promoters responsible for germline transcription through the Dδ-Jδ-Cδ region, we initially used 5′-RACE to map the 5′ ends of Cδ transcripts present in thymocytes of RAG-2−/− mice. Because these mice lack recombinase activity, thymocyte development is arrested at the DN stage (in which Tcrd transcription is readily detected) and the Tcrd locus is retained in germline configuration. Using an antisense Cδ primer (Fig. 1,A) in a RACE protocol designed to selectively amplify full-length products, we identified a doublet of ∼700–900 bp by agarose gel electrophoresis (Fig. 1,B). The upper and lower bands were separately excised, cloned, and sequenced to determine their structures. Most clones obtained from the lower band reflected initiation within and upstream of Jδ1, whereas most clones obtained from the upper band reflected initiation within the Dδ2 gene segment (Fig. 1,C). Two clones reflected initiation upstream of Jδ2 (Fig. 1,D). Sequence analysis of Jδ1 RACE products revealed that transcripts initiated over a broad region and were spliced from Jδ1 to Cδ exon 1, as expected. Analysis of Dδ2 RACE products revealed a tight cluster of initiation sites. These products were only slightly larger than the Jδ1 RACE products due to a splicing event that removes 709 bp from the region between Dδ2 and Jδ1 (Fig. 1, A and C). Further RACE experiments identified low levels of Dδ2 transcripts that displayed spliced Cδ exons, but retained the Dδ2-Jδ1-intervening sequence. We never found clones containing Dδ1 sequences spliced to Cδ directly or to Jδ segments and then to Cδ, although several potential splice donor motifs can be found downstream of the Dδ1 segment.
We could not identify TATA boxes (consensus 5′ TATAAA and variants) (19) at appropriate locations upstream of the Dδ2, Jδ1, and Jδ2 start sites. Moreover, the tightly clustered Dδ2 initiation sites are not associated with a consensus initiator sequence 5′-PyPyA(+1)N(T/A)PyPy-3′ (20) or with GC-rich elements.
A strong promoter associated with Dδ2
We sought to identify enhancer-dependent promoters by cloning fragments into a luciferase reporter plasmid (pXPG) containing Eα, followed by transient transfection into the human T cell line Jurkat. Without a promoter, the vector yields only very low level luciferase activity (Fig. 2 A). However, Dδ2 fragment −836 to +27 (+1 being the most 5′ start site) conferred a ∼100-fold increase in activity. Plasmids containing progressively 5′-deleted promoter fragments maintained similarly high level luciferase activity up to nt −148. Further truncation to −110 and −62 reduced activity to 60 and 40%, respectively, of the −238/+27 fragment (used throughout as a standard of full activity). Further deletion to −46 abolished all promoter activity. A very similar activity profile was obtained when the same promoter fragments were tested in a reporter plasmid containing Eδ (data not shown). Because the results were not influenced by the choice of enhancer, we conducted all subsequent assays in Eα-based reporters.
We noticed the presence of a perfect consensus sequence for a downstream promoter element, (A/G)G(A/T)(C/T)(G/A/C) (21), located at the appropriate distance (+28 to +32) from the Dδ2 cluster of transcription starts. However, the inclusion of this element in the promoter fragment (−238 to +51) had no impact on luciferase activity (Fig. 2 A). A 3′ truncation of the promoter up to position −2 (−238/−2) eliminated all identified transcription start sites and yielded a 45% drop in luciferase activity. Activity was practically abolished by additional truncation to −63.
In summary, fragment −148 to +27 contains all elements necessary for full promoter activity. A critical element between −62 and −46 collaborates with downstream elements to provide 40% of luciferase activity. Other elements between −148 and −62 cooperate to provide full potency to the promoter, although these elements cannot function in the absence of the downstream elements (−238/−63Eα).
We were unable to identify strong promoter activity that could account for germline Jδ1 transcripts. Fragment −416 to +128 (+1 being the most 5′ start site) yielded luciferase activity similar to that conferred by a plasmid with Eα alone (Fig. 2 B). Activity increased slightly with progressive truncation to −235 and to −4, but the activity of the latter was still only ∼6% of the activity of the Dδ2 promoter. Thus, the Jδ1 promoter may rely on additional distal elements not included in the test constructs.
Myb, LEF, Runx, NF1, E47, and USF bind to the Dδ2 promoter in vitro
The functionally important region of the Dδ2 promoter contains potential binding sites for an array of ubiquitous and tissue-restricted transcription factors, including NF1 (−125 denoting the 5′ border of the motif), GATA (−104), Myb (−124 and −120), Runx (−106 and −96), basic helix-loop-helix (bHLH) (−75 and −51), and TCF/LEF (−107 and −38) (Fig. 3,A). Of note, the majority of these sites carry one or more substitutions in the human sequence (Fig. 3 B). We generated duplex oligonucleotides spanning the various sites within the murine sequence and conducted EMSAs using Jurkat and MOLT13 nuclear extracts to assess protein binding to these sites in vitro.
Runx proteins are a family of transcriptional regulators that contain a conserved 128-aa DNA binding domain with homology to Drosophila Runt (22). Consensus binding sites for Runx factors (PuACCPuCA) have been identified in the enhancers and promoters of many T cell-specific genes, including Tcra, Tcrb, Tcrg, Tcrd, Cd3e, and Gzmb (23). As shown in Fig. 4, a radiolabeled probe that spans the −96 Runx site generated four major nucleoprotein complexes after incubation with Jurkat T cell nuclear extract (lane 1). Three of these complexes were specifically inhibited by competition with a 75-fold excess of unlabeled probe (lane 2), but not with an unrelated binding site (lane 6) or a −96 Runx site containing a mutated Runx motif (lane 3). Similarly, formation of these complexes was inhibited by competition with known Runx binding sites present within the MoMLV long terminal repeat and δE3 element of Eδ (24) (lanes 4 and 7), but not with a mutant δE3 site (lane 5). The presence of Runx protein in these complexes was confirmed by incubation of nuclear extract with a Runx1-specific Ab. All three bands were supershifted by anti-Runx1, whereas control IgG had no effect (lanes 8 and 9). These findings indicate that Runx1 binds specifically to the Dδ2 promoter in vitro.
We identified two overlapping sites with the potential to bind Myb (Fig. 3,A). One (TAACTG) conforms precisely to the consensus PyAAC(G/T)G, whereas the other has two mismatches that should make it less favorable for Myb binding (GAACTA). A radiolabeled probe containing both Myb sites (Dδ2 Myb) generated a major complex that was competed by an excess of unlabeled probe, but not by an unrelated site (Fig. 5 A, lanes 1, 3, and 6). This complex is also partially competed by a previously characterized binding site for v-Myb (25) (1×Myb; lane 2). A Dδ2 binding site carrying a mutation of the consensus Myb site (Dδ2Mybm1) was still able to compete formation of this complex (lane 4), although competition was slightly less effective than with the wild-type binding site. Competition was substantially impaired only after mutations were introduced into both Myb sites (Dδ2Mybm2) (lane 5).
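The comparison of the two candidate Myb sites with the consensus can be reproduced with a short mismatch count. This is a minimal sketch in which the consensus is written with IUPAC degenerate codes (Y for pyrimidine, K for G or T); the helper name `mismatches` is hypothetical.

```python
# Subset of IUPAC nucleotide codes needed for the consensus used here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "N": "ACGT",
}


def mismatches(site, consensus):
    """Count positions at which a site departs from a degenerate consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(
        base not in IUPAC[code]
        for base, code in zip(site.upper(), consensus.upper())
    )


MYB_CONSENSUS = "YAACKG"  # Myb consensus: Py-A-A-C-(G/T)-G

for site in ("TAACTG", "GAACTA"):
    print(site, mismatches(site, MYB_CONSENSUS))
```

TAACTG scores zero mismatches and GAACTA scores two (the 5′ G and the 3′ A), in line with the description of the sites.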
We further confirmed the specificity of protein binding to the Dδ2 Myb sites using the radiolabeled 1×Myb binding site as a probe. The 1×Myb probe generated two complexes after incubation with Jurkat nuclear extract (Fig. 5,B). Of these, the slow mobility complex was specific (Fig. 5,B, lanes 1 and 2) and appeared to reflect binding of c-Myb (25). Consistent with this, complex formation was inhibited by unlabeled δE3 with a wild-type Myb binding site, but not by δE3 with a mutated Myb binding site (15) (lanes 3 and 4). Significantly, as in the experiments using a radiolabeled Dδ2 probe, wild-type and singly mutated Dδ2 competitors inhibited binding of c-Myb to the 1×Myb probe, whereas the doubly mutated competitor did not (Fig. 5 B, lanes 5–7). These results imply that c-Myb can bind to either of two overlapping sites within the Dδ2 promoter in vitro.
Adjacent to the Myb binding sites are overlapping binding sites with the potential to bind Runx, TCF/LEF, and GATA factors (−113AACCACTTTGATAG−99) (Fig. 3,A). For these experiments, we used nuclear extract from γδ T cell line MOLT13 because it contained greater LEF-binding activity than equivalent Jurkat extracts. Incubation of MOLT13 extract with a probe spanning the composite motif, from position −117 through −97 (C/L), generated three complexes: a doublet of relatively slow mobility and a single species of somewhat faster mobility (Fig. 6,A, lane 1). All three species were partially competed by an excess of unlabeled C/L (lane 5). The lower band of the doublet appears to reflect binding of LEF1, for the following reasons. First, unlabeled C/L with a mutated LEF site fails to inhibit formation of this complex (lane 6). Second, complex formation is specifically inhibited by a wild-type TCRα LEF site (26), but not by a mutant version of this site (lanes 3 and 4) or a MoMLV Runx binding site (lane 2). Third, complex formation was specifically inhibited by anti-LEF1, but not by anti-TCF1 or control IgG (lanes 7–9). Consistent with this, use of the TCRα LEF binding site as a probe generated a specific complex (Fig. 6 B, lanes 1–3) whose formation was inhibited by Dδ2 C/L, but by none of three variants of C/L with distinct LEF binding site mutations (lanes 4–7). We found no evidence for LEF binding at a second consensus LEF binding motif at position −38 (data not shown).
We could not identify either the upper band of the doublet or the fast mobility species (Fig. 6 A). These species appear unrelated to LEF1 or TCF1 based on the inability of specific Abs to inhibit formation of these complexes (lanes 7–9). Similarly, Abs specific for Runx1 and Runx3, the predominant Runx factors in T cells, failed to inhibit or supershift these complexes (data not shown). Consistent with this, the T at the 3′ end of the candidate Runx site (AACCACT) is highly unfavorable for the binding of known Runx proteins (27). Moreover, competition experiments using radiolabeled Dδ2 C/L and an unlabeled Eβ GATA binding site (28) or radiolabeled Eβ GATA and unlabeled Dδ2 C/L yielded no evidence for GATA binding (data not shown).
We identified a consensus NF1 binding site (−125TGAactaacTGCCA−112) that straddles the defined Myb binding sites (Fig. 3,A). A 40-bp Dδ2 probe (C′) spanning the NF1 motif through the defined Runx site (−128 to −86) generated a series of complexes upon incubation with Jurkat nuclear extract (Fig. 7, lane 1). One clearly reflects Runx1 binding to the defined motif at −96, as evidenced by specific competition with the MoMLV Runx site (lane 4) and efficient supershift by anti-Runx1 (lane 5). A complex of slightly slower mobility appears to reflect NF1 binding, because competition was observed by a consensus NF1 binding site, but not by a binding site with a mutation in the NF1 motif (lanes 2 and 3).
Two E box binding sites for bHLH proteins (CANNTG) are present in close proximity at −75 and −51 of the Dδ2 promoter (Fig. 3,A). We generated two independent probes (E1 and E2) to analyze protein binding at these sites. The E1 sequence CACCTG can be recognized by members of the ubiquitous class A bHLH proteins as dimers with either class A or tissue-specific bHLH proteins (29). Because E47/HEB heterodimers are the predominant E protein heterodimers in thymocytes (30), we asked whether anti-E47 inhibited any of the complexes formed with a radiolabeled E1 probe (Fig. 8 A). Among the several complexes whose formation was inhibited by incubation with unlabeled E1, one was also inhibited by anti-E47 (lanes 1, 3, and 4). Hence, an E protein dimer containing E47 can bind to the Dδ2 E1 site in vitro. Based on inhibition with unlabeled MoMLV Runx site, another specific complex formed with the E1 probe was attributed to Runx1 binding (lane 2). This results from overlap of the E1 probe with the −96 Runx site.
The E2 sequence CACGTG can be recognized by bHLH-zipper proteins, including Myc/Mad/Max proteins, USF, transcription factor E3, and transcription factor EB (29). Given the sequence surrounding the E2 core, this site would be unlikely to bind the Myc/Max complex, but would favor instead USF binding (31, 32). Incubation of radiolabeled E2 probe with Jurkat nuclear extract generated a series of complexes, three of which were specific as judged by competition with an excess of unlabeled E2, but not unlabeled E1 (Fig. 8 B, lanes 1, 2, and 4). Competition by E2 was abolished by a mutation in the E box (lane 3). Moreover, when the radiolabeled mutant binding site was used as a probe, only the nonspecific complexes were generated (data not shown). Bona fide USF binding sites are known to form a similar set of three complexes, composed of homo- and heterodimers of USF1 and USF2 (33). Indeed, incubation with anti-USF1, but not control IgG, supershifted all three complexes (lanes 5 and 6). No supershift was observed using anti-Max (data not shown). Because all of the specific complexes formed with the E2 probe were supershifted by anti-USF1, it appears that USF is by far the predominant E2-binding protein in Jurkat nuclear extract. We cannot rule out the possibility that additional bHLH or bHLH-zipper proteins might interact with this site using extracts from other cell sources.
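E box cores of the form CANNTG can be located in any sequence with a simple pattern search. The sketch below is illustrative only; `find_e_boxes` and `reverse_complement` are hypothetical helper names. The example input is the D2P detection oligonucleotide listed in Materials and Methods, which contains a CACGTG core.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Lookahead so that overlapping E boxes would also be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_e_boxes(seq):
    """Return (0-based offset, core) for every CANNTG match on the given strand.

    CANNTG is its own reverse complement as a pattern, so scanning one
    strand finds the E boxes on both.
    """
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq.upper())]


d2p_probe = "GGCTTTGTATCACGTGTCTCTG"  # D2P oligonucleotide from Materials and Methods
print(find_e_boxes(d2p_probe))

# The E2 core is a perfect palindrome, whereas the E1 core is not.
print(reverse_complement("CACGTG") == "CACGTG")
print(reverse_complement("CACCTG") == "CACCTG")
```

The palindromic CACGTG core is the form recognized by the bHLH-zipper proteins discussed above, whereas CACCTG reads as CAGGTG on the opposite strand.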
Regulation of the Dδ2 promoter by Myb, Runx, and USF
To assess the contributions of the various protein binding sites to Dδ2 promoter function, we generated versions of pXPGEα containing mutant versions of the −238/+27 promoter fragment. Mutation of either the −96 Runx binding site or the E2 USF binding site dramatically impaired promoter function, with luciferase activity reduced to 29 and 21% of the wild-type fragment, respectively (Fig. 9,A, −96RunxmEα and E2mEα). Mutation of the −120 Myb site reduced the promoter activity by 35% (Fig. 9 A, Mybm3Eα). Because simultaneous mutation of both Myb sites produced no greater inhibition (Mybm2Eα), the consensus 3′ motif, but not the nonconsensus 5′ motif, appears to be functional in vivo. Moreover, because the double, but not the single mutation disrupts the NF1 site, NF1 binding appears to be irrelevant for promoter function as well.
We tested the activities of three different mutations in the −107 LEF site (Fig. 3,A). LEFm1 (CTTTG to CATAG) and LEFm2 (CTTTG to CTACG) failed to influence promoter activity, whereas LEFm3 (CTTTG to CTGGG) reduced luciferase activity by 28% (Fig. 9,A). However, because all three mutants disrupt LEF binding (Fig. 6,B), it seems likely that LEFm3 influences promoter function by disrupting the binding of a protein other than LEF. Mutation of the GATA site (CTTTGATAG to CTTTGTCAG) had minimal effect on promoter function (Fig. 9,A, GATAmEα), consistent with our failure to detect GATA binding to this site in vitro. Mutation of the E1 site also had no effect on promoter function (Fig. 9 A, E1mEα), suggesting that protein binding to this site is irrelevant in vivo. Interestingly, however, mutation of the noncanonical −106 Runx site reduced luciferase activity by 45% (−106RunxmEα). The effects of LEFm3 and −106Runxm suggest the functional importance of one or more additional proteins that are currently undetected by EMSA.
In vivo promoter occupancy by Myb, Runx, and USF
We asked whether the proteins implicated in Dδ2 promoter function were associated with the promoter in DN thymocytes in vivo. Thymocytes from RAG-2−/− mice were treated with paraformaldehyde to cross-link protein-DNA complexes. Lysates were then sonicated to reduce the DNA fragments to an average size of ∼500 bp. Following immunoprecipitation with Abs specific for Myb, Runx1, and USF1, DNA was purified from the Ab-bound and unbound fractions, and enrichment of Dδ2 promoter sequences in the bound fraction was assessed by PCR (Fig. 10,A). PCR analysis at a site in exon 7 of the B cell-specific Oct2 gene served as a negative control. Relative to both IgG and Oct2 controls, we detected specific binding of all three proteins to the Dδ2 promoter (5- to 15-fold enrichment over binding to Oct2) (Fig. 10). Together with the functional data, these results implicate c-Myb, Runx1, and USF1 as bona fide regulators of Dδ2 promoter activity in vivo.
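One way to express enrichment relative to a control locus is sketched below. The quantification procedure is not described in detail here, so this is an assumed calculation rather than the method actually used: band signals are corrected for the fraction of each sample used as PCR template (4% of bound, 0.02% of unbound material), and the resulting bound-to-unbound ratio at the Dδ2 promoter is divided by the same ratio at Oct2. All function names and signal values are hypothetical.

```python
def bound_to_unbound(bound_signal, unbound_signal,
                     bound_pct=4.0, unbound_pct=0.02):
    """Bound/unbound signal ratio, corrected for the template fractions."""
    if unbound_signal <= 0:
        raise ValueError("unbound signal must be positive")
    return (bound_signal / bound_pct) / (unbound_signal / unbound_pct)


def fold_enrichment(target, control):
    """Enrichment at a target locus relative to a control locus.

    target, control -- (bound_signal, unbound_signal) for each amplicon
    """
    return bound_to_unbound(*target) / bound_to_unbound(*control)


# Hypothetical band intensities (arbitrary units), for illustration only.
d2_promoter = (900.0, 300.0)
oct2_exon7 = (100.0, 400.0)
print(round(fold_enrichment(d2_promoter, oct2_exon7), 1))
```

Dividing by the control locus removes the contribution of nonspecific recovery by a given Ab, so values can be compared across immunoprecipitations.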
Enhancer-independent activity of the Dδ2 promoter
Myb and Runx proteins play important roles as coregulators of several lymphoid- and myeloid-specific enhancers (15, 34, 35, 36, 37). Therefore, we asked whether the promoter displayed significant activity in the absence of an exogenous enhancer by testing the activities of enhancerless versions of various promoter constructs by transient transfection (Fig. 9,B). Dδ2 promoter fragments −836/+27, −537/+27, and −238/+27 displayed luciferase activities that ranged from 10 to 6% of the −238/+27Eα standard. Similar to the constructs containing Eα, the truncation from −238/+27 to −62/+27 that removes sequences upstream of the USF site reduced activity by 50%. Moreover, inclusion of the downstream promoter element (−238/+51) had no effect on activity, whereas the 3′ truncation that eliminates the USF and transcription start sites (−238/−63) virtually abolished activity. Similarly, promoter activity was abolished by mutation of the USF and Runx sites, and by LEFm3, but not LEFm2 (Fig. 9 C). We conclude that the Dδ2 promoter displays low level, enhancer-independent activity that depends on a set of transcription factors similar or identical to that required for promoter function in the presence of an exogenous enhancer. That the activity of −836/+27 is reproducibly greater than that of −238/+27 may suggest the involvement of additional upstream sequences.
Previous studies have established transcriptional promoters as critical, local cis-acting regulators of accessibility for V(D)J recombination. In this study, we identify and characterize a strong promoter that is tightly associated with the murine Dδ2 gene segment. Transcription initiates at a cluster of sites embedded within the Dδ2 gene segment, and is critically dependent on the binding of USF1 and Runx to sites immediately upstream of the 5′ RSS. c-Myb and at least one additional uncharacterized factor contribute to promoter function as well. We suggest that these factors play critical roles in Dδ2 accessibility and recombination in developing thymocytes in vivo.
Although transcriptional activation by USF is not fully understood, it is known to occur through multiple mechanisms. USF can bind to both E boxes and initiator elements (38, 39). In addition to containing a classical trans activation domain (40), USF can potentiate transcription by interacting with and stabilizing the binding of other DNA-binding proteins (39). Moreover, it can act as a dock to recruit basal transcription factors, thereby facilitating the formation of the preinitiation complex in the apparent absence of a TATA box or initiator element (38, 41, 42). Because the Dδ2 promoter lacks both elements, we suggest that USF functions, at least in part, to recruit the basal transcriptional machinery. USF binding is also associated with an 80° bend in the DNA (41), suggesting that it could facilitate interactions among distant factors. Because of its central position in the Dδ2 promoter and its critical role in promoter function, we suggest that USF coordinates the assembly of a multiprotein complex on the promoter, functioning as a bridge between the core promoter and upstream activators.
Runx proteins activate transcription as part of multiprotein complexes in which interactions with other DNA-binding proteins are essential to potentiate both DNA binding and trans activation (43, 44, 45). In several instances, Runx protein function has been shown to depend critically on c-Myb (15, 34, 35, 36). Synergy with c-Myb appears not to require direct interactions between the two factors, but has been hypothesized to rely on their ability to jointly recruit coactivators such as p300 and CBP. Like Eδ, the Dδ2 promoter contains functional binding sites for both factors. However, because the −110/+27 fragment, with Runx, USF, and transcription start sites, displays an activity that is elevated over that of −62/+27, with USF and transcription start sites only, it appears that Runx1 can function independently of c-Myb in this context. The partial effects of Myb site mutations are consistent with this notion.
It is interesting that USF alone provides for substantial promoter function in −62/+27 (Fig. 2A), whereas promoter function of a longer fragment with a Runx site mutation is low (Fig. 9A). One possibility is that transcription in the absence of upstream sequences depends on trans activation by USF itself, whereas in the presence of upstream sequences this function is obscured. Instead, intact promoter function would rely on trans activation provided by Runx1 and c-Myb, with USF potentiating their communication with basal transcription factors. In the absence of Runx1, interactions between upstream factors and the basal machinery would occur, but would be nonproductive, yielding an activity that is reduced as compared with the USF-driven truncated promoter fragment.
Surprisingly, human Dδ3 (homologue of murine Dδ2) displays conserved Myb and LEF motifs, but lacks both the USF and Runx sites that appear critical for murine promoter function (Fig. 3B). We therefore asked whether germline transcripts initiate at or near the human Dδ3 element in vivo. To do so, we performed RACE analysis of DN thymocytes of RAG-2−/− mice carrying a human Tcrd gene minilocus driven by Eδ (46). Although we could readily detect transcripts initiating upstream of Jδ1 or Jδ3, we failed to detect initiation at sites associated with Dδ3 (J. Carabana, unpublished observations), even though we adjusted our RACE strategy so that transcript detection did not depend on a splicing event similar to that reported in this work. Consistent with this, we could detect only low level promoter activity in this region by transient transfection. These results point to differences in regulation at the murine and human Tcrd loci.
Although the murine PDβ and Dδ2 promoters bear little or no overall structural homology, it is striking that in both cases proximal promoter elements are very closely abutted to the RSSs. The PDβ TATA box is situated within the 5′ RSS, and transcripts initiate at sites within the 3′ RSS (47, 48). Given this architecture, the assembly of a preinitiation complex and the initiation of transcription could provide a powerful mechanism to perturb nucleosome structure and accessibility at the Dβ1 RSSs. Indeed, some form of Eβ-dependent nucleosome disruption at this site has already been documented (49). Because germline Dδ2 transcripts initiate within the Dδ2 gene segment, there is potential for a similar nucleosomal disruption across the Dδ2 RSSs. The nature of such disruption and its role in recombinase access will be important issues to address in future studies.
Eδ deletion resulted in significantly reduced germline transcription and V(D)J recombination at the murine Tcrd locus (12). Nevertheless, germline transcription was still detectable at a level ∼10% of that in wild-type mice, and incomplete rearrangements involving Dδ2 occurred on the majority of alleles. These data suggest the activity of an additional enhancer, or an enhancer-independent promoter, that can stimulate both germline transcription and accessibility for V(D)J recombination on Eδ-deleted alleles. Because similar behavior was observed for alleles lacking both Eδ and Eα, residual activity cannot be attributed to Eα. Our observation that Dδ2 promoter activity depends on some of the same factors found previously to be responsible for Eδ activity raised the possibility that the promoter might be able to provide some enhancer-independent function in vivo. Using transfected plasmid substrates, we found that USF1, Runx1, c-Myb, and other factors provide for at least low level transcription in the absence of a distal enhancer. It remains to be seen whether they can function in a similar fashion in the context of native chromatin in vivo.
The authors have no financial conflict of interest.
We thank Abbas Hawwari and Iratxe Abarrategui for critical review of the manuscript.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This work was supported by National Institutes of Health Grant GM41052 (to M.S.K.).
Abbreviations used in this paper: RSS, recombination signal sequence; bHLH, basic helix-loop-helix; DN, double negative; Eα, Tcra enhancer; Eβ, Tcrb enhancer; Eδ, Tcrd enhancer; MoMLV, Moloney murine leukemia virus; PDβ, promoter Dβ; USF1, upstream stimulatory factor 1; LEF1, lymphoid enhancer binding factor 1; TCF1, T cell factor 1.