Abstract
Notch signaling activates T lineage differentiation from hemopoietic progenitors, but relatively few regulators that initiate this program have been identified, e.g., GATA3 and T cell factor-1 (TCF-1) (gene name Tcf7). To identify additional regulators of T cell specification, a cDNA library from mouse Pro-T cells was screened for genes that are specifically up-regulated in intrathymic T cell precursors as compared with myeloid progenitors. Over 90 genes of interest were identified, and 35 of 44 tested were confirmed to be more highly expressed in T lineage precursors relative to precursors of B and/or myeloid lineage. To a remarkable extent, however, expression of these T lineage-enriched genes, including zinc finger transcription factor, helicase, and signaling adaptor genes, was also shared by stem cells (Lin−Sca-1+Kit+CD27−) and multipotent progenitors (Lin−Sca-1+Kit+CD27+), although down-regulated in other lineages. Thus, a major fraction of these early T lineage genes are a regulatory legacy from stem cells. The few genes sharply up-regulated between multipotent progenitors and Pro-T cell stages included those encoding transcription factors Bcl11b, TCF-1 (Tcf7), and HEBalt, Notch target Deltex1, Deltex3L, Fkbp5, Eva1, and Tmem131. Like GATA3 and Deltex1, Bcl11b, Fkbp5, and Eva1 were dependent on Notch/Delta signaling for induction in fetal liver precursors, but only Bcl11b and HEBalt were up-regulated between the first two stages of intrathymic T cell development (double negative 1 and double negative 2) corresponding to T lineage specification. Bcl11b was uniquely T lineage restricted and induced by Notch/Delta signaling specifically upon entry into the T lineage differentiation pathway.
The circulating population of mature T lymphocytes is constantly regenerated as hemopoietic progenitors leave the bone marrow (BM) and home to the thymus where development and maturation occur (1, 2, 3). Cell-intrinsic regulatory factors that are up-regulated in a lineage-specific way play dominant roles in lineage choice of hemopoietic precursors. In RBC development, GATA1 acts as a central mediator of erythroid gene expression, and it is known that B cells are instructed specifically by early B cell transcription factor and Pax5. These transcription factors can be considered “master regulators” because, for each cell type, loss of the transcription factor causes a selective block of the developmental pathway while gain of function of the transcription factor can accelerate differentiation into the lineage. Equivalent “master regulators” of T cell development have not yet been found. More than eight known transcription factors are essential for T cell development (reviewed in Refs. 4 and 5), but none of these exhibits the ability to instruct or accelerate entry into the T cell program.
Two transitions in early T cell development are of particular interest for lineage choice mechanisms: the onset of T lineage gene expression (“specification”), and the final exclusion of any fate except a T cell fate (“commitment”). Both transitions occur among intrathymic, early T lineage cell populations (Pro-T cells), which are still negative for the mature T cell markers CD4 and CD8 and do not yet express TCRs. These double-negative (DN) cells within the mouse thymus are divided into four stages on the basis of their expression of the surface markers Kit, CD25, and CD44 (reviewed in Refs. 6, 7, 8). The first thymocyte population, DN1, maintains much lineage plasticity and, under special conditions, is capable of producing macrophages, NK cells, or dendritic cells, with a minute subset apparently capable of generating B cells as well (9, 10, 11, 12). It is not yet clear whether DN1 cells are distinct from prethymic precursors in gene expression pattern. However, as cells enter the next stage, DN2, they express sharply increased levels of Pro-T cell genes such as those encoding pTα, CD3ε, CD25, Rag1, and IL-7Rα (CD127) (10, 13, 14, 15, 16), and some rearrangement begins at the DJβ and VJγ TCR loci (17, 18). In hemopoietic precursors (derived from fetal liver) that are differentiating in vitro in response to Notch/Delta signaling, the first-appearing DN2 phenotype cells display the same dramatic increase in expression of these genes (19). DN2 cells have undergone “specification” but are not yet committed to the T lymphocyte pathway; a high proportion of DN2 cells are still able to differentiate into NK cells, macrophages, or dendritic cells (10, 20, 21, 22, 23). At the DN3 stage, thymocytes stop dividing, further increase expression of the Pro-T differentiation genes as well as Notch target genes (24), and undergo extensive TCR rearrangements. Only at this stage do they become committed to a T cell fate in vivo.
Cells only progress beyond the DN3 stage through successful TCR gene rearrangement and TCR-dependent selection, at which time they graduate from Pro-T cell status, to give rise to up to five types of T cells: αβ CD4, αβ CD8, γδ, NKT, or regulatory T cells.
Of all these developmental transitions, surprisingly little is known about the stages encompassing “T lineage specification,” that is, the DN1 to DN2 transition. The regulatory participants in these early stages have not been sufficiently characterized to explain the outcome, although Notch/Delta signaling plays a role (19, 25). Traditional methods to identify all the transcription factors that play key roles in early stages of T lineage specification have found limited success, in part because transcription factors are typically present in low copy numbers. T cell precursors at the earliest stages are represented in vivo at any one time by tiny numbers of cells, providing very limiting material for standard microarray analysis. In addition, small fold changes in transcription factor abundance or changes in transcription factor ratios may generate dramatic shifts in cell state. Identification of truly novel regulatory factors and novel isoforms of known factors has also been limited by the comprehensiveness of the microarrays used (26, 27, 28, 29) and by the microarray’s nucleotide probe design.
To circumvent these problems, we used a subtractive hybridization technique (30) to probe a mouse Pro-T cell cDNA macroarray library. The subtractive technique allowed us to enrich a Pro-T cell probe for message not shared by progenitor/premyeloid cells and to identify genes (known and previously unknown) that might be specifically up-regulated during the initiation of the T lineage program. Clones selected by the subtraction were sequenced and their patterns of expression characterized in detail using sensitive, quantitative real-time PCR (qRT-PCR) on a range of highly purified cell populations. The Pro-T cell library was generated by random priming of mRNA from SCID thymocytes, which consist of DN1–3 stage Pro-T cells and NK precursors (31). The resulting macroarray library represents the actual spectrum of transcripts in the DN2 and DN3 cell populations (those immediately preceding lineage commitment) and also contains multiple clones of genes with abundant transcripts, providing opportunities to sample alternate splice variants.
Our subtraction protocol has identified genes that are specifically enriched in T-lineage as opposed to early B or myeloid lineage precursors. Enriched genes include novel transcription factor candidates, chromatin remodeling factors, RNA binding molecules and helicases, a select group of signaling molecules and adaptors, and novel or functionally uncharacterized genes. In this study, we present the resulting expression profiles of >35 of these genes expressed during T lineage specification. A remarkable feature of the whole ensemble of these Pro-T cell genes is the high frequency of “legacy” genes that are expressed strongly in both stem cells and Pro-T cells, although down-regulated elsewhere. In this context, the very select group of regulatory genes that are specifically induced coincidentally with T lineage specification takes on an unexpected significance.
Materials and Methods
Mice
C57BL/6J, B6.CB17-Prkdcscid/SzJ (B6-scid) (The Jackson Laboratory) and B6-Rag2null mice (originally from E. Palmer, Basel Institute, Basel, Switzerland) were bred and maintained in specific pathogen-free facilities at Caltech.
Thymus and BM samples were taken from animals 5–7 wk old.
cDNA library
The C.B-17-scid thymocyte, random-primed, cDNA library was constructed in the pSPORT1 vector (Invitrogen Life Technologies), and was arrayed and spotted at high density onto Hybond-N+ nylon filters (Amersham Biosciences) using the Q-BOT robot (Genetix) as described previously (31).
Cell populations for library generation and library screening
The two types of cells used as sources of RNA for the subtraction protocol were a bulk population of Pro-T cells and a population of progenitor/premyeloid cells. To obtain large numbers of Pro-T cells in the DN1-DN3 stages, we took advantage of the Rag2 knockout mouse, in which thymocyte development arrests at DN3. In the wild-type mouse thymus, DN3 cells account for only 1% of thymocytes, but even without sorting, Rag2 knockout thymocytes consist of 90% DN3 Pro-T cells, with the remaining cells being DN1, DN2, NK, or thymic stromal cells. Because these cells were not pure Pro-T cells, we refer to them in the text as “Pro-T plus.” A progenitor/premyeloid population was obtained from Lin−Kit+ BM cells matured in culture toward a myeloid fate. Specifically, BM cells that were Kit+Gr1−CD11b−Ter119−CD19− from Rag2ko mice were cultured for 42 h at 37°C in 5% CO2 in IMDM with 10% heat-inactivated FCS supplemented with IL-3 (100 μl of WEHI-3B cell supernatant/ml medium), stem cell factor (Kit ligand) (100 μl of BHK-MKL cell supernatant/ml medium), and 10 ng/ml rIL-6 (PeproTech). At the time of harvest, the morphological appearance of the cells ranged from undifferentiated, blast-like cells to mature granulocytes.
Cell populations for quantitative RNA expression analysis
In addition to the subtraction populations, sorted cell populations were obtained for analysis by qRT-PCR. Two hemopoietic progenitor populations, a Pro-B population, and a population of myeloid cells were all sorted from the BM of Rag2ko mice. LSK CD27− cells were Kit+Sca-1+CD27−Gr1−CD11b−Ter119−CD19−. LSK CD27+ cells were Kit+Sca-1+CD27+Gr1−CD11b−Ter119−CD19−. Pro-B cells were CD19+Kit+/−Gr1−CD11b−Ter119−. Sorted BM myeloid cells were Gr1+CD11b+Kit−Ter119−.
Initial expression screening made use of the unsorted Rag2ko thymocytes called Pro-T plus. For more detailed analysis, five populations of DN Pro-T cells were purified from C57BL/6 mouse thymi by cell sorting, essentially as described previously (13, 24). Each of these DN subsets is CD4−CD8−CD3ε−Ter119−F4/80− and Gr1−. DN1 cells are Kit+CD44+CD25−. DN2 cells are Kit+CD44+CD25+. DN3a cells are KitlowCD44−CD25+CD27−. DN3b cells are CD44−CD25+CD27+, and DN4 cells are CD44−CD25−. All Abs used in this study were obtained from eBioscience or BD Pharmingen.
Thymocytes and BM cells were obtained from animals immediately after euthanasia. Cells were incubated in CBSS (5.4 mM KCl, 0.3 mM Na2HPO4, 0.4 mM KH2PO4, 4.2 mM NaHCO3, 137 mM NaCl, and 5.6 mM d-glucose (pH 7.4))/1% BSA with clone 2.4G2 anti-CD32/CD16 (FcγRIII/II) supernatant on ice for 10 min, followed by washing and addition of Abs for staining. Cells stained with biotin-conjugated Abs were washed through a layer of FCS before staining with streptavidin-PECy5 (eBioscience). Stained cells were sorted using FACS Aria cell sorter (BD Immunocytometry Systems). All sorted fractions were reanalyzed immediately for purity and all fractions used here were at least 96% pure.
Preparation of fetal liver cells as input for OP9-DL1 and OP9-control cultures is described below.
Generation of subtractive probe
A Pro-T plus unselected cDNA probe and a T lineage-enriched subtracted probe were generated from the pool of enriched Pro-T cells (∼1 × 10^7 cells; Pro-T plus) and the population of myeloid and multipotent progenitors (∼9 × 10^6 cells) described above. RNA was isolated from each population of cells by the Qiagen RNeasy Lipid Minikit (Qiagen), and the RNA was analyzed for purity and integrity by Agilent Bioanalyzer and RNA 6000 Nanochips (Agilent). Messenger RNA was isolated from total RNA by the Ambion Poly(A) Purist Kit (Ambion). The mRNA was also evaluated for purity and concentration by the Bioanalyzer.
The subtraction method was adapted from Rast et al. (30). cDNA was synthesized using enzymes and buffers from a Clontech Marathon cDNA synthesis kit (BD Biosciences), but with the LT7 random-BT primer: 5′-(biotin)-CGGAGGTAATACGACTCACTATAGGGAGNNNNNN-3′ (34 nt). Qiagen PCR purification columns were used to purify samples between first-strand and second-strand synthesis stages.
Different linkers are used for Selectate (the Pro-T plus-derived cDNA) or Driver (the progenitor/myeloid-derived cDNA) to avoid nonspecific subtraction. Linkers contain a 3′ dideoxy residue to prevent filling in of overhang, and a 5′ phosphate for blunt ended ligation to the cDNA. The Selectate linker sequences were 5′-GGGTGCTGTATTGTGTACTTGAACGGGCGGCCGCA-3′ and 3′-dideoxy-CGCCCGCCGGCGT-P-5′. The Driver linker sequences were 5′-GCCAACGTATGTAAGGTTGAGTTCCGGGCAGGT-3′ and 3′-dideoxy-CCCGTCCA-P-5′. Linkers were annealed by placing the linker pairs in a 1:1 molar ratio at a concentration totaling 1 μg/μl in 10 mM Tris (pH 7.9) and 100 mM NaCl, heating in a heating block to 95°C for 5 min, then turning off the block and allowing the linkers to anneal as the block cools to room temperature (∼30 min). Ligation efficiency was evaluated by comparing their electrophoresis migration on a 2% agarose gel relative to the untreated mixture. These linkers were ligated to either the Selectate or Driver cDNA for 16 h at 16°C with DNA ligase.
cDNA with linkers attached was PCR amplified with LT7 primer (5′-CGGAGGTAATACGACTCACTATAGG-3′) and a primer specific for either the Selectate linker (5′-GGGTGCTGTATTGTGTACTTGAACG-3′) or for the Driver linker (5′-GCCAACGTATGTAAGGTTGAGTTCC-3′) to produce 600 ng of product. The resulting Selectate was size selected for 300- to 500-bp product by electrophoresis of the PCR in an agarose gel, excising the appropriate region of the gel, and electroeluting cDNA from the gel. The electroeluate was precipitated and resuspended in 50 μl of water or T low E (10 mM Tris and 0.1 mM EDTA, pH 7.8). Driver product was precipitated and resuspended in 16 μl of water.
Size-selected Selectate was amplified by PCR (primers listed above), and 1 μg of Selectate was set aside for production of the unsubtracted library probe. Selectate (3 μg) was subjected to single-strand purification by Dynal Streptavidin beads, according to the manufacturer’s instructions (Invitrogen Life Technologies). RNase-free technique was used from this point forward for both Selectate and Driver. The Ambion MEGAshortscript kit (Ambion) was used according to the manufacturer’s instructions to transcribe the Driver cDNA into RNA. Single-stranded Selectate DNA (200 ng) from Pro-T plus cells was mixed with Driver RNA (30 μg) in a 10-μl total volume, denatured at 95°C, iced, then hybridized at 65°C for 40 h. Double-stranded and single-stranded products were separated by hydroxylapatite chromatography (30, 32), and the eluate containing the single-stranded product was desalted and concentrated. The single-stranded product was used to manufacture the subtracted, radioactive probe by Ambion Maxiscript kit using Amersham 800 Ci/mmol 32P-UTP.
Subtractive hybridization protocol
Macroarray filters were sequentially hybridized with cDNA from Pro-T plus Rag2ko thymocytes, stripped of probe, then hybridized with subtracted probe, i.e., probe enriched for mRNA that was not shared by the progenitor/premyeloid population. Hybridization intensity for each probe was measured by a PhosphorImager (Molecular Dynamics, GE Healthcare) using BioArray software (Genetix). Representative data are shown in Fig. 1.
Subtractive hybridization to macroarrays to identify Pro-T cell enriched cDNAs. A and B, Phosphorimages of a single region of a macroarray blot are shown, demonstrating enrichment of two cDNAs in subtracted probe. A was hybridized by unsubtracted Pro-T plus probe. B was hybridized with subtracted probe. cDNA clones are each represented by spot pairs. Note that two spot pairs, indicated by arrows, are much darker in the subtracted probe relative to the mostly unchanged background clones. The enriched clones represented by these spot pairs were sequenced and identified as linker for activation of T cells (LAT) and IL-2 receptor α (IL2Ra). C, The results of the subtractive enrichment of one of the four blots that comprise the whole arrayed library are shown here on a log scale. Each dot on the graph represents the change in hybridization intensity between unsubtracted and subtracted probes for a single clone in the macroarray library. Note the skew of the data reflects the ability of this method to measure enrichment but not depletion between two probe samples. Three geometric SDs from the mode are indicated by the black lines and four geometric SDs by the gray lines. The number of clones in each region of the graph is denoted on the right.
On the macroarray blot, each clone is applied in duplicate as a “spot pair” with unique position and arrangement (see Fig. 1, A and B). Pixel intensity for each spot pair was averaged, and systematic differences in hybridization intensity across the blots were minimized by the application of a whitening filter, much like the linear optimal filter or Wiener filter (33, 34). Its formula is:
where Φ(f) is the whitening filter, S(f) is the signal in Fourier space, and N(f) is the noise in Fourier space. A Wiener filter is often used to filter random, usually small-scale, noise from data leaving mostly the large-scale correlations; this is its normal use in image analysis. However, because our clones were randomly placed on the blots, the only correlations expected in our data are the relatively small-scale ones between spot pairs. Hence, all large-scale correlations are likely due to systematic noise such as inhomogeneous probing or washing of the blots. Therefore, we used the whitening filter to remove such correlations (J. E. Moore, unpublished results).
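For orientation, the textbook linear optimal (Wiener) filter, to which the whitening filter is compared above, is usually written in the same symbols as shown below. This is the standard reference form only. The whitening filter applied to the blots is described as "much like" it, so its exact expression may differ; reading S(f) and N(f) as Fourier amplitudes whose squared moduli are power spectra is an assumption.

```latex
% Standard Wiener (optimal) filter, given for reference only.
% The whitening filter used in this study may differ in form.
\[
  \Phi_{\mathrm{Wiener}}(f) \;=\; \frac{|S(f)|^{2}}{|S(f)|^{2} + |N(f)|^{2}}
\]
```

In this form the filter passes frequencies where signal dominates noise and suppresses the rest. In the blot analysis, the components to be removed are the large-scale correlations attributed to uneven probing or washing.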
The ratio of the average spot-pair hybridization intensity after subtraction to that before subtraction was termed enrichment: enrichment = (intensity for subtracted probe) ÷ (intensity for the unsubtracted probe). The logarithms of the enrichments for 73,728 spot pairs were calculated, and a clone was deemed to be significantly enriched when the logarithm of its enrichment was more than three SDs above the mode of its blot (Fig. 1C). Clones more than four SDs above the mode were selected for special attention (see Table I). The modes were calculated by a process called “estimating the rate of an inhomogeneous Poisson process by Jth waiting times” (33), briefly outlined here. For each blot, the logarithms of the enrichments were ordered. A window size, J, was chosen, which for these calculations was 1/24 the number of spot pairs on a blot, or 768; other reasonable values of J do not appreciably change the estimates. An integer, I, was chosen so as to minimize the difference between the Ith and (I + J)th logarithms, and the mode was estimated by averaging these.
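The mode estimate and the SD cutoff described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, base-10 logarithms are assumed, and, because the text does not say how the SD was computed, the spread is taken here as the root-mean-square deviation about the estimated mode.

```python
import math


def log_enrichment(subtracted, unsubtracted):
    """Log10 of (subtracted intensity / unsubtracted intensity) for one spot pair."""
    return math.log10(subtracted / unsubtracted)


def estimate_mode(values, window):
    """Estimate the mode as the centre of the tightest window of ordered values.

    Finds the index I that minimises sorted[I + window] - sorted[I] and
    returns the average of those two values.
    """
    xs = sorted(values)
    if not 0 < window < len(xs):
        raise ValueError("window must be between 1 and len(values) - 1")
    best = min(range(len(xs) - window), key=lambda i: xs[i + window] - xs[i])
    return (xs[best] + xs[best + window]) / 2.0


def flag_enriched(log_enrichments, window, n_sd=3.0):
    """Flag clones whose log enrichment exceeds mode + n_sd * SD.

    Assumption: SD is the RMS deviation of all values about the mode.
    """
    mode = estimate_mode(log_enrichments, window)
    sd = math.sqrt(
        sum((x - mode) ** 2 for x in log_enrichments) / len(log_enrichments)
    )
    return [x > mode + n_sd * sd for x in log_enrichments]
```

For a blot with 18,432 spot pairs (73,728 divided over four blots), `window` would be 768, matching the 1/24 fraction stated in the text; calling `flag_enriched` with `n_sd=4.0` would correspond to the stricter cutoff.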
Clones selected by this method were sequenced from both the 5′ and the 3′ ends at the Institute for Systems Biology using standard procedures designed for the Applied Biosystems 3730XL sequencer. Each sequence was filtered by RepeatMasker and analyzed by Blastn, Blastx, and BLAT searches of National Center for Biotechnology Information (NCBI), University of California, Santa Cruz (UCSC), and Ensembl databases to identify matches to known genes or genomic sequences (latest searches: build m36 of mouse genome). Of the 1164 clones submitted, 1046 good sequences were obtained.
GOToolBox analysis
The GO-Stats function of the GOToolBox website (35) was used to perform a hypergeometric analysis of statistically relevant over- or under-represented terms within our data set as compared with the Mouse Genome Informatics database of genes. The Benjamini & Hochberg correction for multiple testing was applied. Selected results of searches in the categories of Biological Processes and Cellular Components are reported herein.
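The kind of test described above can be illustrated with a minimal sketch of a hypergeometric over-representation p-value followed by the Benjamini & Hochberg adjustment. This is not the GOToolBox implementation, whose internals are not described here; the function names and argument layout are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a set of n,
    drawn without replacement from N reference genes of which K carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini & Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Here `k` would be the number of subtraction-enriched genes carrying a given GO term, `n` the size of the submitted gene set, and `K` and `N` the corresponding counts in the reference database; one p-value is computed per term and the whole list is then passed to `benjamini_hochberg`.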
Bioinformatic databases
The following databases were used: www.ncbi.nlm.nih.gov, http://genome.ucsc.edu, www.ensembl.org/Mus_musculus, http://crfb.univ-mrs.fr/GOToolBox/index.php, and www.informatics.jax.org.
Coculture of fetal liver cells with OP9 cells
Hemopoietic progenitors cocultured with BM stromal cells (OP9 cell line) will develop into B lymphocytes in vitro. When OP9 stromal cells are transfected to stably express the Notch ligand Delta-like 1 (OP9-DL1), progenitor cells will develop into T lymphocytes in coculture (36). Mouse fetal liver cells (containing hemopoietic progenitors) were cocultured with OP9-control or OP9-DL1 cells exactly as described previously (19). In short, Kit+Lin− (Lin = Gr1, Ter119, F4/80, CD19) cells from day 14 to 14.5 mouse embryo livers were obtained by FACS sorting. Kit+Lin− fetal liver cells were cocultured with OP9 control cells or with OP9-DL1 cells (36). Cocultures were harvested for RNA analysis by forceful pipetting at indicated time points. In some experiments, to test the effect of delayed addition or withdrawal of Notch signals, progenitor cells were transferred to secondary cultures at day 4. OP9-control and OP9-DL1 cocultures were harvested, Kit+CD27+Lin− cells were isolated from each culture by sorting, and these were each split and used to seed fresh monolayers of OP9 control and OP9-DL1 cells, to be harvested at the indicated later time points.
To compare the time courses of T lineage differentiation from distinct precursor subsets, Kit+Lin− fetal liver cells were fractionated into Kit+Sca-1+ (“LSK”), Kit+Sca-1lowCD27+Flk2/Flt3 (CD135)+IL-7Rα(CD127)− (“Flk+”), and Kit+Sca-1lowCD27+CD135+CD127+ (“CLP-like”) subsets, as described elsewhere (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication). These subsets were then cocultured with OP9-control or OP9-DL1 cells for 1–7 days before harvesting for RNA.
Quantitative real-time PCR
qRT-PCR was performed on diluted samples of cDNA using SYBR Green PCR Master Mix in an ABI PRISM 7700 Sequence Detector (Applied Biosystems). In all figures comparing expression levels of multiple genes, the measurements for all genes shown were conducted on the same cDNA samples. The ΔCt method was used for all expression measurements, with a fixed threshold to enable direct comparison between test genes and the GAPDH standard. Primers were designed, using Primer3 software (37), to have optimal melting temperatures and to cross introns. The primers were BLAST tested for gene specificity before being synthesized (Operon Biotechnologies). Each primer pair was evaluated for acceptable dose-response titration slopes and amplification. Primer sequences are as follows: Ablim, forward, CTGGCAGCTCAGAGGAGTTC, and reverse, CGCAGCTGGGATGATAATG; Aff3, forward, CAACAGAGAGCAGCGCAACA, and reverse, CCCGTCTCCATATTGCACACTT; AI449175, forward, GCTCCTTCCCAGAAGACTCTC, and reverse, TCAGGCTCTTCAAAATGGTCTT; Akap8, forward, AAATTGAGAAACGGCGTCAG, and reverse, AATGTGCGGCTTCAATCTTT; β-actin, forward, ACACCCGCCACCAG, and reverse, TACAGCCCGGGGAG; Bat2, forward, ATACTGCCACAAGCCGAAAG, and reverse, TCAGGTCCACTCCACTGTCA; Bcl11a, forward, GTCTGCACACGGAGCTCTAA, and reverse, CACTGGTGAATGGCTGTTTG; Bcl11b, forward, GGGCGATGCCAGAATAGAT, and reverse, GGTAGCCTCCACATGGTCAG; Crsp7, forward, ATGGTGGCAGTGTTGGAAGT, and reverse, GGTTTTCTTGCGGACATCAT; Ctdsp1, forward, CCAGTGAACAATGCGGACTT, and reverse, CCCATTCGCTGTAGGAACTC; Ddx17, forward, AGACAAAGAGGCGCTGTGAT, and reverse, CCTTTCCAGATCGGAACTCA; Ddx19b, forward, GCCAAGTAGAGCCTGCAAAC, and reverse, ACTTGCCCATCTGCTCAATC; Deltex1, forward, GAGGATGTGGTTCGGAGGTA, and reverse, CCCTCATAGCCAGATGCTGT; Deltex3L, forward, CGGACACCTACGAGGTGAAG, and reverse, TTTCCAGGACAATGGTCACA; Eva1, forward, TCACAGCCCTTTGTCCTACA, and reverse, AGTTAGCGCATCTCCCACAG; FgfrL1, forward, TGCAAATACCATGGGCTACA, and reverse, GCTTGTGGATGACGATGAAG; Fkbp5, forward, AACGAAGGAGCAACGGTAAA, and reverse, AATCGGAATGTCGTGGTCTT; FUS, forward, CAGCAACGAGCTGGAGACTG, and reverse, TCTGGCTTAGGTGCCTTACACTG; GAPDH, forward, ACTCCACTCACGGCAAATTCA, and reverse, GCCTCACCCCATTTGATGTT; GATA3, forward, GAGGTGGTGTCGCATTCCAA, and reverse, TTTCACAGCACTAGAGACCCTGTTA; Gpr56, forward, TTGCAGCAGCTTAGCAGGTA, and reverse, GTCTCCCAGGAAGCTCACAG; Grap, forward, GTGTGACGAGCAACCACTGA, and reverse, TCCACAACTTCCACGATGTC; Heb-alternate, forward, GTGCTTATCCTGTCCCTGGAATG, and reverse, TGGCTTGGGAGATGGGTAAC; Heb-canonical, forward, GAGAAGAAGACCGCTCCATGAT, and reverse, TGGCTTGGGAGATGGGTAAC; Helz, forward, TGATGGGCTATTTGGGTGTT, and reverse, CTGGAGGGCCATGTCATAGT; Huwe1, forward, GGTTGCTGCCACAGCTATTT, and reverse, CACCAACCTTTGCTGGAGAT; Ldb1, forward, TGAAGTTGGCTCCACCTTAGT, and reverse, GCTCCTTCGGCGAGTACAG; MLL1, forward, TGCCCATAGCCCAT, and reverse, TCTGTGAATGAGGC; MLL2, forward, GTGCAGCAGAAGATGGTGAA, and reverse, AGAGCAGCCAGCAGGTCTAA; Myb, forward, AGCGGGAATCGGATGAATCT, and reverse, GAGCAGAAGAAGTTTCCCGATTT; Mxd4, forward, CCGAACAACAGGTCTTCACA, and reverse, CGCTTCAGAAGGCTCAGAGT; Prss16, forward, CCCAAACAAGGGTGGTTAGA, and reverse, CTTGGCCAGTTCTGTGTTGA; Ptpn7, forward, CTTACACGCTGGACGCTACA, and reverse, TCCAGGTCTTCAGGGTTGAC; PU.1, forward, GCGCTGGCACCTTTTTGTAT, and reverse, CAATAATTTTACTTGTCTTTAGTGGTTA; Rab2, forward, TGCCAAGACTGCGTCTAATG, and reverse, GCTGAGGGCCAATTTTAATG; Rabgap1, forward, CCTCCCAGTGGTTCCTTACA, and reverse, GGGCGACATTAAAGATGACAC; Scl, forward, CAACAACAACCGGGTGAAGA, and reverse, ATTCTGCTGCCTCCATCGTT; Senp2, forward, TAAGGTTCTCGGCACCATTC, and reverse, GGCTGGGATCTCATCAGTGT; Spatial, forward, GACACAAGAGGCAGCCTACAG, and reverse, GGATGCACCAGGAGGACTT; Tcf7 (aka T cell factor-1 (TCF-1)), forward, CAAGGCAGAGAAGGAGGCTAAG, and reverse, GGCAGCGCTCTCCTTGAG; Tmem131, forward, GCCCTCCCTAGACCCAACTG, and reverse, GCTTCCAAGTAGGCTGTTCCA; Trim44, forward, TCTGTGTCCTGTGTCCAGTCATT, and reverse, CAGTCCACCGGAATCTTTGC; Zcchc11, forward, TGACAGTGCTTCAGGGATTG, and reverse, TAGCCTCTGCTCAGGTGTCA; Zfp27, forward, TTTTTGCCAGCAGCAGATAG, and reverse, CTGCACCACATCCCGATAG; Zfp30, forward, TGCCTACGAGAGGGATCTGT, and reverse, CCTTGTTCCAACAGGGTGA; and Zfp109, forward, GCTGCTCAGAGGAAGCTGTA, and reverse, CCCCAGTGAAAGGCATCTTA.
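As a worked illustration of the ΔCt method mentioned above, the sketch below converts threshold cycles into expression relative to the GAPDH standard. It assumes ideal amplification (product doubling each cycle), which the text does not state, and the function name is hypothetical.

```python
def relative_expression(ct_gene, ct_reference):
    """Expression of a test gene relative to the reference (e.g., GAPDH).

    delta_ct = Ct(gene) - Ct(reference); each extra cycle needed to reach
    the fixed threshold is taken to mean a 2-fold lower starting amount
    (assumption: 100% amplification efficiency).
    """
    delta_ct = ct_gene - ct_reference
    return 2.0 ** (-delta_ct)
```

For example, a gene that crosses the threshold five cycles after GAPDH would be scored at 2^−5, or about 3% of the GAPDH level, under this assumption.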
Data display as heat maps
The heat maps were generated in the Excel program by arranging expression data in a table with the genes forming the rows and the conditions forming the columns. For each gene, its expression data is normalized by dividing by the geometric mean of that gene’s maximum and minimum expressions. All of the normalized values between 1/
Results
Subtractive screening of arrayed library
To identify previously uncharacterized genes that might act during the earliest stages of the T cell developmental program, we performed a hybridization screen for T-lineage-enriched transcripts in a macroarrayed cDNA library of ∼70,000 clones from mouse Pro-T (DN1-DN3) and pre-NK cells. This library, generated in our lab (31), had yielded novel and informative Pro-T cell transcripts before (38) and provided an opportunity to recover unannotated genes as well as alternative transcripts that might not be represented in microarrays. To establish a baseline, the library was initially probed with Rag2ko thymocyte cDNA (Pro-T plus; because this mutation prevents β-selection, this population is primarily DN3 cells). It was then probed with “subtracted probe,” consisting of Rag2ko thymocyte cDNA from which message shared by a myeloid-biased progenitor population was subtracted. T lineage specific cDNAs were those that hybridized with specifically increased intensities to the subtraction-enriched probe (Fig. 1). Clones thus identified as enriched (see Materials and Methods) were sequenced and mapped to their coordinates in the mouse genomic sequence. More than 1000 sequences were analyzed to retrieve genes specific to early T lineage cells.
An early indication of the robustness of the subtraction was evidenced by the fact that 348 clones, one third of the enriched clones, were found to represent genes already known to be up-regulated in or unique to Pro-T cells (Table I). One of these genes, Tcf7 (TCF-1), encodes a transcription factor with known essential roles in T cell development (39, 40, 41) while the others encode pre-TCR and TCR components, signaling molecules (Lck and LAT), the mutagenic DNA polymerase DNTT (terminal deoxynucleotidyl transferase), and distinctive cell surface markers of Pro-T cells. We excluded from consideration 217 clones that represented ribosomal RNA, 51 clones of mitochondrial origin, and 120 clones with significant alignments to short or long interspersed nuclear elements (SINEs or LINEs). Also, 154 of the clones (24%) aligned to unidentified RIKEN sequences in the databases or were not significantly similar to any known sequence in the bacterial or animal NCBI database. (These sequences have been reported and are presented in Supplementary Table I.) The 92 genes represented by the remaining clones are the focus of this report.
Candidate genes for early T cell function
Table I lists the transcripts that were identified as enriched in Pro-T plus cells relative to premyeloid cells by 3 (†) or 4 (‡) SDs above the mode. These genes were identified by high-quality sequence-matches (typically >500 bp, all were >100 bp) to documented exons. In addition, select matches that include intronic or immediately flanking sequences are listed, as long as they did not include SINE or LINE homologies, because novel alternative splicing, polyadenylation, and promoter use isoforms would also be of interest.
Ninety of the 92 genes listed in Table I were submitted to GOToolBox (35) for classification by Gene Ontology. Hypergeometric statistical analysis was performed using the GOToolBox GO-Stats function. Only Spatial and Prss16 were omitted from this analysis, for reasons described below. Select GOToolBox results are listed in Table II. Among the subtraction-enriched transcripts, those encoding transcriptional regulators were markedly over-represented relative to the Mouse Genome Informatics (MGI) database (p < 4 × 10^−6). Also significantly overrepresented in the enriched data set were transcripts predicted to encode components of the ubiquitin cycle (p = 0.0012), Wnt receptor signaling components (p < 3 × 10^−6), and proteins with a nuclear localization (p < 4 × 10^−7). Wnt signaling and Notch signaling components as well as transcriptional regulators generally were of interest because of the critical roles of these signaling pathways in early T cell development (42, 43, 44, 45). These results suggested that our subtraction-enriched clones could be a rich source of potential regulatory genes for the early stages of T cell development.
Statistical overrepresentation of genes in the subtraction (a)
Regulation of transcription (p < 4 × 10^−6)
Aff3 | Foxp1 | Ncor1 | Tardbp
Baz2a | Lass5 | Notch1 | Tcf12
Bcl11b | Lef1 | Notch3 | Tcf7
Ccnk | Mxd4 | Rab2 | Zfp109
Ddef1 | Myb | Runx1 | Zfp30
Ubiquitin cycle (p = 0.0012)
Fbxw4 | Rad18 | Trim39 | Uble1a
Huwe1 | Senp2 | Ube2l3 |
Wnt receptor signaling (p < 3 × 10^−6)
Csnk1e | Ldb1 | Senp2 | Tcf7
Fbxw4 | Lef1 | |
Nuclear localization (p < 4 × 10^−7)
Aff3 | Exosc10 | Mxd4 | Tardbp
Akap8 | Fkbp5 | Myb | Tcf12
Bat2 | Foxp1 | Ncor1 | Tcf7
Baz2a | Gps1 | Prpf4b | Uble1a
Bcl11b | Huwe1 | Rad18 | Wbp11
Ctdsp1 | Lass5 | Rbm4 | Zcchc11
Dde1 | Ldb1 | Runx1 | Zfp27
Ddx19b | Lef1 | Senp2 | Zfp30
The 90 genes selected by the subtraction protocol were analyzed by the GO-Stats function of the GOToolBox web site for statistically significant overrepresentation of Biological Processes or Cellular Components relative to the Mouse Genome Informatics database.
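The overrepresentation p-values above come from a hypergeometric test. As a minimal sketch of that kind of calculation (not the GOToolBox implementation), the upper-tail hypergeometric probability can be computed as shown here. The function name and the toy counts are hypothetical and are not values from this study.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return favorable / total


# Toy example with hypothetical counts: a reference set of 10 genes,
# 4 of them annotated with the term, and a selected set of 3 genes
# in which at least 2 carry the annotation.
p_value = hypergeom_upper_tail(population=10, annotated=4, drawn=3, hits=2)
```

In a real analysis the reference set would be the annotated genome (here, the MGI database) and the selected set would be the 90 submitted genes; a small p-value indicates that the term appears more often in the selected set than random sampling would predict.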
To verify the enrichment predicted by the subtractive screen, a selection of genes identified as up-regulated by the screen was analyzed for expression by qRT-PCR analysis of a sorted Pro-B cell population from Rag2ko mice and the two populations of cells used in the subtractive hybridization, i.e., Rag2ko thymocytes (Pro-T plus), and cultured Rag2ko progenitor/premyeloid cells. Expression analyses also included Nulp and Nfe2 (data not shown), identified early in the study by a less stringent criterion. Of 23 genes tested from the set shown in Table I (qRT-PCR “A”), for all except Ctdsp1, Was (Wasp), Rab2, and Senp2, the qRT-PCR results showed higher expression in the Pro-T plus population as compared with progenitor/premyeloid cells (data not shown). These results justified a higher resolution analysis of the expression patterns of the enriched genes.
Comparison of gene expression patterns between Pro-T and Pro-B cells and multilineage progenitors
The significance of new candidate regulatory genes for T cell lineage determination could be quite different depending on their expression pattern in the above cell populations. We identified three general categories of expression: 1) inherited from a stem-cell precursor, a category we termed “legacy”; 2) expressed in a general “pan-lymphoid” pattern; and 3) actually induced in developing precursors through a T lineage-specific process. We analyzed the patterns of expression of 43 genes from the screen in sorted populations of wild-type mouse hemopoietic cells. By using gene-specific qRT-PCR, we were able to compare expression quantitatively in highly purified, sorted cells from very small populations. Pro-T plus cell populations were compared not only with sorted Gr-1+Mac-1+ myeloid cells and CD19+ Pro-B cells from Rag-knockout BM, but also with sorted populations of enriched hemopoietic stem/progenitor cells (Lin−Kit+Sca-1+CD27−) and multipotent lymphomyeloid progenitors (Lin−Kit+Sca-1+CD27+) (19, 46). These results are presented in Fig. 2 as qRT-PCR graphs and in Fig. 3 as a clustered heat map.
Quantitative real-time PCR comparison of selected gene expression levels in prethymic progenitors, Pro-T cells, and other hemopoietic lineages. The patterns of expression of 38 genes selected by the subtractive screen were analyzed in five hemopoietic cell populations as indicated by the key at the bottom of the figure. Gene expression relative to GAPDH, averaged from three biological replicates, is graphed on a log scale. Error bars indicate plus and minus one geometric SD. The patterns of expression of subtraction-identified genes (B and C) are compared with those of six landmark genes (A). (See Materials and Methods for primers and details of acquisition of cell populations.) The same samples were used for all measurements shown.
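The geometric error bars described in this legend can be illustrated with a minimal sketch. It assumes that replicate values are averaged in log space and that the geometric SD is the antilog of the sample SD of the log values; the legend does not specify either detail, and the function name and example values are hypothetical.

```python
import math
import statistics


def geometric_summary(values):
    """Return (geometric mean, lower bar, upper bar) for positive values."""
    logs = [math.log10(v) for v in values]
    center = statistics.mean(logs)
    spread = statistics.stdev(logs)  # assumption: sample SD of the log values
    gmean = 10 ** center
    gsd = 10 ** spread  # geometric SD is a multiplicative factor
    return gmean, gmean / gsd, gmean * gsd


# Hypothetical expression levels relative to GAPDH from three replicates.
replicates = [0.001, 0.01, 0.1]
gmean, lower, upper = geometric_summary(replicates)
```

Because the geometric SD is a factor rather than an additive offset, the bars are symmetric about the mean on a log-scale axis, which is why this summary suits data graphed on a log scale.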
Heat map of gene expression levels in hemopoietic lineages. The data depicted in graphs in Fig. 2 were used to construct a clustered heat map. The log average of the maximum and minimum expression level for each gene was set to mid-range (yellow). Each color step indicates a 3-fold change in expression level from blue (lowest expression level) to red (highest expression level). Population averages with SDs are shown in Fig. 2.
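The color assignment described in this legend can be sketched as follows. The sketch assumes that each value is rounded to the nearest whole color step; the legend does not state how values between steps are binned, and the function name and numbers are hypothetical.

```python
import math


def color_step(value, gene_min, gene_max, fold_per_step=3.0):
    """Signed number of color steps separating `value` from mid-range.

    The midpoint is the log average (geometric mean) of the gene's
    minimum and maximum expression; each step is a 3-fold change.
    """
    midpoint = math.sqrt(gene_min * gene_max)
    return round(math.log(value / midpoint) / math.log(fold_per_step))


# Hypothetical gene whose expression spans 1 to 81 (midpoint 9).
steps = [color_step(v, 1, 81) for v in (1, 9, 27, 81)]
```

Step 0 corresponds to the mid-range color (yellow), negative steps move toward blue, and positive steps move toward red. Because the midpoint is set separately for each gene, colors compare populations within a gene rather than absolute levels between genes.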
The genes that we selected from Table I to test for expression pattern included those encoding known transcription factors and chromatin modifying proteins Bcl11b, HEB (aka Tcf12, two distinct promoter isoforms tested), MLL1, MLL2, Mxd4, Myb, and four likely zinc finger transcriptional repressors (Zfp109, Zfp27, Zfp30, and AI449175), as well as known or suspected transcriptional modulatory factors Aff3, Crsp7, Ablim, Ctdsp1, and Ldb1. We also tested genes encoding RNA-binding proteins and helicases Ddx17, Ddx19, FUS, and Helz; zinc finger factors with other less-characterized roles such as Trim44 and Zcchc11; signaling molecules and adaptors such as Gpr56, Grap, Rabgap1, Rab2, Akap8, and Fkbp5; the E3 ubiquitin ligase Huwe1 (Ureb1); potential Wnt signaling modulator Senp2, the protein tyrosine phosphatase Ptpn7 (He-PTP), and Deltex3-like (Dtx3L) (B lymphoma and BAL-associated protein, Rhysin-2), a RING finger ubiquitin ligase related to the Notch-induced protein Deltex1 (Dtx1). Several other genes of unknown function, such as Bat2, Tmem131 (RW1 and Neg), and Eva1, were also included in the analysis based on their high representation among the subtraction-enriched cDNA clones.
The populations used for this analysis were validated by analyses of regulatory landmarks for the stem cell to T cell transition (Fig. 2,A), namely, the genes encoding the stem cell transcription factor SCL/Tal1; the T cell transcription factor GATA3; the myeloid transcription factor PU.1; Bcl11a, the Bcl11b relative that is required for B cell development; and the direct Notch target gene Deltex1, which encodes an E3 ubiquitin ligase (47). These showed the expected patterns of expression for the cell populations. SCL was expressed highly in the progenitor subsets but not the others; PU.1 was expressed highly in the progenitors and Pro-B cells and was further enriched in myeloid cells, but down-regulated in the Pro-T cells; GATA3 was up-regulated specifically in the Pro-T cells; and Bcl11a was highest in the Pro-B cells but specifically down-regulated in the Pro-T and myeloid cells.
As shown in Fig. 2, the majority of the subtraction-selected genes were verified to show at least 2-fold more expression in Pro-T cells than in the sorted BM myeloid cells (yellow or white vs pink bars, Fig. 2), and in most cases the difference was at least 10-fold (note the log scale in Fig. 2). The exceptions were Senp2, Ctdsp1, and Rab2, which showed weak enrichment if any. Trappc2l had <3-fold enrichment. The genes that were T-enriched showed various patterns of expression relative to Pro-B cells (blue bars) and to the stem and multipotent progenitor cells (green bars). However, as will be described below, remarkably few of these genes were truly T lineage restricted.
Dominance of “legacy genes” and pan-lymphoid genes in Pro-T vs premyeloid-enriched gene set
Many of the genes that were differentially expressed between Pro-T and premyeloid cells were expressed at similar levels in Pro-T cells and in Pro-B cells (≤2× difference and/or within error), implying functions shared in early T and B lineage development. These pan-lymphoid genes include Aff3 (LAF4), Crsp7, Mll1, Mll2, Mxd4, Zfp27, Ddx17, Trim44, Zcchc11, Gpr56, Grap, and Akap8. Although boundaries between classes are not sharply defined, Myb, FUS, Ablim, Huwe1, and Bat2 could be considered pan-lymphoid as well. All of these genes except Ablim and Grap were also expressed at similar levels in the multilineage LSK CD27+ precursors, implying that their lymphoid function may be inherited from a pluripotent precursor. These similarities are evident in the heat map shown in Fig. 3.
Genes specifically up-regulated as part of the T lineage developmental choice would be expected to be expressed more highly in Pro-T cells than in either Pro-B or BM myeloid cells (>2×), and a number of genes were found to have this pattern. However, even within this set, the majority had expression levels in one or both of the progenitor populations (LSK CD27− and LSK CD27+) similar to (within 2×) that found in the Pro-T cell population. These genes include Zfp109, Zfp30, Ldb1, Rabgap1, Ptpn7, and Ddx19b. Genes such as Myb, AI449175, and Helz were also expressed most highly in the Pro-T cell samples, but the magnitudes of their up-regulation relative to stem and progenitor cells were only 2–3×. None of these transcripts were as T lineage enriched as GATA3 or Tcf7 (Fig. 2 A) or even as the canonical form of HEB (HEBcan). These newfound genes therefore are not specifically induced during T lineage specification, but instead represent multipotent precursor legacy genes that T lineage cells continue to express, even while other lineages down-regulate them.
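The fold-change criteria used in the preceding paragraphs can be summarized in a short sketch. This is a simplification under stated assumptions: it uses a fixed 2-fold cutoff, ignores measurement error (which the text also takes into account), and returns only the first matching category even though a pan-lymphoid gene can also be a legacy gene. The function name, dictionary keys, and expression values are hypothetical.

```python
def classify(expr, fold=2.0):
    """Assign a coarse expression category from relative expression levels.

    `expr` maps a population name to a positive expression level.
    Keys (hypothetical): 'pro_t', 'pro_b', 'myeloid',
    'lsk_cd27_neg', 'lsk_cd27_pos'.
    """
    t = expr["pro_t"]

    def within(a, b):
        # True when a and b differ by no more than `fold` in either direction.
        return max(a, b) / min(a, b) <= fold

    if t < fold * expr["myeloid"]:
        return "not T-enriched"
    if within(t, expr["pro_b"]):
        return "pan-lymphoid"
    if t > fold * expr["pro_b"]:
        progenitors = (expr["lsk_cd27_neg"], expr["lsk_cd27_pos"])
        if any(within(t, p) for p in progenitors):
            return "legacy"
        return "T lineage induced"
    return "unclassified"


# Hypothetical expression profiles, in arbitrary units.
shared = {"pro_t": 100, "pro_b": 80, "myeloid": 1,
          "lsk_cd27_neg": 50, "lsk_cd27_pos": 90}
inherited = {"pro_t": 100, "pro_b": 10, "myeloid": 1,
             "lsk_cd27_neg": 20, "lsk_cd27_pos": 60}
induced = {"pro_t": 100, "pro_b": 1, "myeloid": 1,
           "lsk_cd27_neg": 0.5, "lsk_cd27_pos": 1}
```

The order of the checks mirrors the text: enrichment over myeloid cells is established first, then similarity to Pro-B cells, and only then is similarity to the progenitor populations used to separate legacy genes from genes induced within the T lineage.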
Genes specifically up-regulated in T lineage precursors
Against this background, the T lineage specificity of a select group of genes from our screen stood out (Figs. 2 and 3). These included transcripts of three genes encoding transcription factors, Tcf7 (TCF-1), Bcl11b, and HEBalt (the alternative promoter use form of HEB); the RING finger protein Dtx3L; the signaling adaptor protein Fkbp5; and two products of unknown function, Tmem131 (RW1, Neg) and Eva1 (epithelial V Ag). Tmem131 and Fkbp5 were up-regulated by slightly less than an order of magnitude (9-fold) from precursors (Fig. 3, green to gold). HEBalt levels in Pro-T cells were much higher than those in LSK cells, but were also up-regulated substantially in Pro-B cells, in agreement with a previous report (48). Bcl11b was unusual for the magnitude and specificity of its up-regulation (Fig. 3, dark blue to red), even greater than the up-regulation of known T lineage factor Tcf7 (TCF-1, Fig. 3, blue to orange) and comparable to that of Deltex1 (Fig. 3, dark blue to red). These genes are investigated in more detail below.
Subtraction-enriched thymic stromal genes and a gene with shared lymphoid and stromal expression
Because mRNAs for the library construction and the subtraction protocol were obtained from nonsorted Rag2ko and SCID thymocytes with some contamination by stromal epithelial cells, our screen would be expected to enrich for stromal-specific cDNAs as well as for thymic lymphocyte-specific ones. Two genes selected by the subtraction, encoding the serine protease Prss16 and Spatial (Titest, 1700021K02Rik), were also specifically expressed in the Pro-T plus population (Fig. 2,C and data not shown). However, their transcripts were not found in sorted hemopoietic populations (Fig. 4,A), in accord with their annotation as stromal specific genes. Unlike Spatial or Prss16, a third “stromal” annotated gene, Eva1 (epithelial V-like Ag 1), was verified to be expressed in sort-purified DN3 cells (Fig. 4,A). Eva1, thought to be a homotypic adhesion molecule and previously found only in thymic stroma, liver, and other epithelial tissues (49), was expressed within the T lineage in a stage-specific and transient way, beginning at the DN2 stage and peaking at the preselection DN3a (24) stage (Fig. 4,B). It is possible that Eva1 mediates homotypic adhesion interactions between thymocytes and thymic stroma. Expression of Eva1 by DN3 cells would have easily been overlooked in immunohistochemical assays (49) because the percentage of DN3 cells among thymocytes in the wild-type thymus is low (∼1%). We found nine noncanonical transcripts for Eva1, a benefit of the macroarray library, that appear to encode at least four novel transcripts with previously unreported exons or promoter regions (Fig. 4,C and Supplemental Table I).8 Regulation of Eva1 is potentially interesting because the Eva1 gene is located on chromosome 9 in the only significant physical cluster of Pro-T cell genes identified in our study. It is immediately adjacent to cd3e (within 32 kb) and within 161 kb of Mll1, which flanks the cd3g/cd3d/cd3e cluster on the other side (data not shown).
Stromal-specific genes and a gene shared by stromal cells and thymic lymphocytes. A, Expression of three genes previously considered to be exclusive to the thymic stroma was measured in sort-purified DN3 thymocytes (denoted DN3) and in unsorted DN3-enriched cells with stromal contamination, Pro-T plus. Although Pro-T plus samples indicate high levels of expression of Eva1, Spatial, and Prss16, only Eva1 is expressed equally in sort-purified DN3 cells. B, The pattern of Eva1 expression was measured in LSK prethymic cells, Pro-B and myeloid cells, and in DN thymocyte subsets from the earliest stages through β-selection (see text and Fig. 4 for details). Up-regulation of Eva1 occurs at DN2, peaks at DN3a, and declines after β-selection. Eva1 mRNA is present at low levels in Pro-B cells and is found at background levels in myeloid cells. C, Structures of novel Eva1 transcripts identified in this study and aligned with the mouse genome on chromosome 9 by BLASTn (red blocks identified as “Non-canonical sequences” in Ensembl genome browser). Nine of them are depicted in this Ensembl alignment. The GenBank accession numbers for these expressed sequence tags are EL773010, EL773011, EL773012, EL773013, EL773014, EL773015, EL773016, EL773017, and EL773018. These novel isoforms all affect promoter use and/or splicing patterns at the 5′ end of the Eva1 gene. For sequences of these novel Eva1 cDNAs, see Supplemental Table I.8
Stage-specific onsets of regulatory gene expression in T lineage precursors
The potential roles of the few T lineage-specific transcription factors and signaling molecules identified in Fig. 2 should depend on the developmental stages at which they are induced. Tcf7 (TCF-1) has been extensively studied (29, 30, 31), and its expression has been shown to increase gradually through the DN1–DN4 progression (8), but the other genes are less well characterized. To determine the timing of up-regulation of these T lineage-biased genes, we analyzed their expression in five subpopulations of T cell precursors sorted from wild-type mouse thymus (DN1, DN2, DN3a, DN3b, and DN4), in direct comparison with the two subpopulations of hemopoietic progenitors (Lin−Kit+Sca-1+CD27− and Lin−Kit+Sca-1+CD27+), Pro-B cells, and the sorted BM myeloid population used above. This comparison spans the range of early T lineage milestones: entry into the thymus during the transition to the DN1 stage; “specification” at the DN1 to DN2 transition; “commitment” and proliferation arrest at the DN2 to DN3a transition; and β-selection or γδ-selection, via DN3b and DN4 intermediates (13, 24, 50).
The results are shown in Fig. 5 as qRT-PCR graphs, and Fig. 6 shows DN subsets ± LSK population expression results for a more extensive set of genes in heat map form. For developmental reference standards, we measured GATA3 expression as a model T lineage-specific positive regulator (Fig. 5,A); Myb expression as a key regulator used by both multipotent progenitor cells and Pro-T cells (Fig. 5,B); and PU.1 and SCL as progenitor-cell regulators that are shut off precipitously between the DN1 and DN3 stages (Fig. 5, G and H). The Notch signaling target gene Deltex1 (Dtx1) was also analyzed (Fig. 5,L). Deltex1 was not transcribed in the two hemopoietic progenitor populations (see Fig. 5,M, detection thresholds), but its expression was up-regulated ∼100-fold above background levels at the DN1 and DN2 stages, in agreement with the critical role of Notch signaling in T cell specification. Interestingly, it showed a further up-regulation at the DN3a stage to >2000-fold over the background (Fig. 5 L), suggesting a second discrete phase of Notch activity (51).
Distinct stage-specific regulation of different Pro-T-specific genes: fine-scale developmental regulation of Pro-T cell-enriched genes encoding Bcl11b (J), HEBalt (K), Dtx3L (F), and Fkbp5 (E) and novel zinc finger factor genes Zfp109 (D) and Zfp30 (C) is compared with that of reference genes encoding GATA3 (T lineage) (A), Myb (legacy) (B), PU.1 (G), SCL (H), and Bcl11a (progenitor- and non-T lineage) (I), and Dtx1 (Notch target) (L). Gene expression was analyzed by qRT-PCR in nine hemopoietic cell populations as described in Fig. 4 B and graphed on a log scale relative to GAPDH expression. Cell populations and their relationships to T cell development are shown in the schematic (bottom). They include LSK CD27− stem cell-like hemopoietic progenitors (dark green bars), LSK CD27+ multipotent progenitors (light green), sorted DN T cells from earliest, DN1 stage through β-selection to DN4 stage (orange, gold, yellow, pale yellow, and white), and Pro-B cells (blue) and BM myeloid cells (pink). Populations of each cell type were purified on three separate dates, and the expression levels for the genes analyzed are reported here as the average ± geometric SD among the three independent biological replicates.
Heat maps of LSK and/or DN subset expression of select genes. The gene expression heat maps were generated as was the map shown in Fig. 3. The expression data shown as graphs in Fig. 5 were combined with data from additional genes and are presented here as a heat map. B, Expression of 10 additional genes in DN subsets (relative to β-actin) is shown as a heat map with the classic Notch target gene HES1 as a reference (geometric mean levels from two independent biological series). The sorted DN populations used for this panel were from different preparations than those used for A and Fig. 5 (E.-S. David-Fung, data not shown).
Only two of the regulatory factors in our study were primarily induced during the DN1 to DN2 transition. One was the promoter-use variant of HEB known as HEBalt (Fig. 5 K). Its dramatic up-regulation was followed by an ∼10-fold decline after the DN3a stage, in agreement with a previous report (48), and consistent with the early hit-and-run positive function this basic helix-loop-helix factor variant appears to play in T cell development (48).
The gene with the most singular pattern of expression in our analysis encodes Bcl11b, a zinc finger factor that usually acts as a transcriptional repressor (52, 53, 54). Bcl11b appeared strictly T lineage-specific relative to stem and progenitor populations and, in contrast to HEBalt, was expressed at only trace levels in Pro-B and myeloid cells (Fig. 5,J). In addition, unlike all the other T lineage genes, Bcl11b transcripts showed little expression in DN1 cells, but increased 500-fold between the DN1 and DN2 stages, with only a fewfold further up-regulation to the DN3a stage (Fig. 5,J). The magnitude of this increase dwarfed the increase seen in GATA3 expression over the same interval (Fig. 5,A). Unlike HEBalt, Bcl11b expression then remained fairly level through the DN4 stage, and its expression continued in peripheral T cells (52) (data not shown). Bcl11b up-regulation was accompanied by the reciprocal down-regulation of its relative, Bcl11a, which was strongly expressed in progenitor cells and non-T cells but down-regulated by two orders of magnitude between the DN1 and DN3 stages (Fig. 5 I). This analysis implies that the ratio of Bcl11b to Bcl11a in thymocytes shifts dramatically during progression from DN1 to DN4, by over four orders of magnitude, a finding that is particularly relevant in light of reports that chromosomal translocations affecting Bcl11b expression are found in ∼20% of pediatric T cell acute lymphoblastic leukemias (55, 56).
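The estimate of a shift of over four orders of magnitude follows from combining the two changes quoted above. As a rough worked estimate, let f denote the unspecified "fewfold" further increase in Bcl11b up to DN3a (f is a symbol introduced here, assumed only to satisfy f ≥ 1), and take "two orders of magnitude" as roughly 100-fold:

```latex
\frac{(\mathrm{Bcl11b}/\mathrm{Bcl11a})_{\text{late DN}}}
     {(\mathrm{Bcl11b}/\mathrm{Bcl11a})_{\mathrm{DN1}}}
\approx
\underbrace{500 \times f}_{\text{Bcl11b increase}}
\times
\underbrace{100}_{\text{Bcl11a decrease}}
= 5 \times 10^{4}\, f
\;\geq\; 5 \times 10^{4} \;>\; 10^{4}
```

The increase in Bcl11b and the decrease in Bcl11a multiply because the ratio has one gene in the numerator and the other in the denominator.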
Other signaling genes and transcription factor genes selected by the screen showed less dramatic increases with earlier or later peaks of expression. Dtx3L, encoding a putative interaction partner of Deltex1, roughly paralleled Deltex1 (Fig. 5, F and L) and Eva1 (Fig. 4,B) in expression in T lineage populations. Showing little expression in the stem/progenitor cells, Dtx3L was already detectably up-regulated at the DN1 stage and increased to a peak in the DN3a stage. Like several Notch pathway targets (24), Dtx3L was sharply down-regulated after β-selection in the DN3b and DN4 stages. FK506-binding protein 5 (Fkbp5) was already expressed at significant levels in the multilineage precursor populations, but its T lineage-specific up-regulation also reached a peak at the DN3 stage, albeit with changes of lower amplitude (Fig. 5 E). Fkbp5 is a modulator of the glucocorticoid receptor (57). These signaling molecules and cell surface receptors thus appear to be most strongly expressed at a stage coinciding with T lineage commitment, cell cycle arrest, and TCR gene rearrangement, but after initial T lineage specification has begun.
The genes with similar levels in prethymic progenitor cells and Pro-T plus cell fractions (Fig. 2) confirmed their “legacy” patterns of expression by the continuity and constancy of their expression patterns throughout the early DN stages. Myb expression showed little change from prethymic stages throughout the Pro-T stages, with only a gentle increase from DN1 to DN3 and a steeper drop after β-selection (Fig. 5,B). Two relatively novel KRAB-domain zinc finger transcription factors with “legacy” patterns of expression, Zfp30 and Zfp109, were up-regulated between the LSK CD27− and LSK CD27+ stages of prethymic differentiation and continued their expression through the DN1 to DN3 stages with a decrease after β-selection (Fig. 5, C and D).
Gene expression analyses in these DN and BM subsets were also conducted for Helz, Ddx19b, and Tmem131 (Fig. 6,A), and further analysis of DN thymocyte subsets was performed on Aff3, Grap, Ldb1, Mll1, Mll2, Trim44, Atxn2l, Tcf7, and FUS, in comparison with the Notch target gene HES1 (Fig. 6 B). None of these matched the T lineage specification-associated induction of Bcl11b. Aff3 actually declined steadily from the DN2 to the DN4 stage after an early plateau. Tcf7, FUS, Grap, Helz, Ldb1, Mll1, Mll2, Tmem131, and Trim44 remained steady or increased gently to the DN3 stage, with a decline thereafter, but the range of expression was narrow. Ddx19b followed the same pattern, after an initial drop between the LSK CD27+ prethymic stage and DN1 stage. In a companion study of >80 Pro-T cell-expressed transcription factors (E.-S. David, G. Buzi, L. Rowen, R. Butler, R. A. Diamond, M. K. Anderson, and E. V. Rothenberg, manuscript in preparation), only Bcl11b demonstrated T lineage specificity and >100-fold up-regulation at the DN1 to DN2 transition.
Bcl11b induction by Notch/Delta signaling
Notch/Delta signaling induces expression of the known T lineage regulatory genes GATA3 and Tcf7 with a characteristic time course in fetal liver-derived hemopoietic precursors (19, 58, 59). OP9 stromal cells normally support B cell differentiation of hemopoietic precursors, but OP9 cells engineered to express the Notch ligand Delta-like1 (OP9-DL1 cells) support T cell development. Time course analysis of hemopoietic precursor cells in coculture with OP9 control or OP9-DL1 cells provides a second way to look at the earliest events involved in T lineage specification, separable from any technical issues about the correct identification and purification of precursor subsets. We therefore cultured fetal liver-derived hemopoietic precursors on OP9-DL1 or OP9-control stroma and compared the expression kinetics of Bcl11b with those of Tcf7, Deltex1, the T lineage gene CD3ε, Eva1, and legacy or pan-lymphoid transcription factor genes as shown in Fig. 7. Samples were obtained as described previously (19), representing 2-day intervals in a time course of 10 days of culture overall.
Temporal regulation of Pro-T and legacy genes during induction of T cell specification by Notch/Delta signaling. Time courses of gene expression were analyzed by qRT-PCR in fetal liver cells cocultured with either OP9 control cells or OP9 cells expressing the Notch ligand DL1. The results of four culture conditions are shown. Top panels, Schematic of the experiment. Lower panels, Quantitation of gene expression in time course samples by qRT-PCR. Fetal liver-derived precursors (day 0) were cultured for up to 10 days total with OP9-control stromal cells (nonpermissive for T lineage) or OP9-DL1 stroma (T lineage inducing). To distinguish gene expression responses that respond to continuous Notch/DL1 signaling from those that do not, cells retaining progenitor phenotype (Kit+CD27+) after 4 days of culture were repurified from both kinds of cultures, then each sample was split and used to seed both OP9-control and OP9-DL1 secondary cultures. Samples from days 6–10 were generated either from cells that were replated on the original type of stroma or from duplicate samples that were switched to the opposite stromal type, as indicated by the color code: DL1 to DL1 (navy blue = continuous T lineage inducing), DL1 to control (magenta = abortive exposure to T lineage inducing), control to control (aqua = T lineage nonpermissive, B lineage inducing), and control to DL1 (yellow = delayed exposure to T lineage inducing). Expression levels of the indicated genes are graphed relative to GAPDH levels on a log scale.
Fig. 7 shows that Bcl11b was strongly and specifically up-regulated in fetal liver cells in response to culture with OP9-DL1 (navy blue line) but not when cultured with OP9-control cells that do not express Notch ligand (turquoise line). The kinetics of Bcl11b induction were strictly dependent on the timing of exposure to DL1. When cells were initiated in culture on OP9-control and then shifted to OP9-DL1 after a delay of 4 days (see Materials and Methods), Bcl11b induction was also delayed (Fig. 7, yellow line). In contrast, induction of Bcl11b through these stages depended on continued signaling from Notch/DL1 interaction, because when the cells were removed from OP9-DL1 to OP9-control after 4 days, Bcl11b expression was down-regulated (Fig. 7, magenta line). The dependence of Bcl11b induction on Notch/Delta signaling was comparable to that of Deltex1 (Fig. 7), but its responses were temporally blunted. Deltex1 reached a plateau within 4 days of stimulation (Fig. 7, navy line) and was immediately turned on or off in response to the addition or removal of DL1 (Fig. 7, yellow and magenta lines), whereas Bcl11b required more than 6 days of stimulation to reach a plateau and showed slower induction and de-induction in response to changes in DL1 stimulation (Fig. 7). This pattern would be consistent with more complex regulatory requirements than Notch signaling alone, or a longer mRNA half-life than Deltex1, or both. Bcl11b showed the same temporal pattern as Tcf7 (Fig. 7) (19), but with even greater T lineage restriction of expression. Thus, Bcl11b is an integral part of a T lineage-specific regulatory program that is induced in the first stages of response to Notch/Delta signaling in hemopoietic progenitors.
For most of the other genes identified as T cell specific in this screen, the OP9-DL1 response kinetics of fetal liver progenitors (Fig. 7) gave results consistent with their steady-state expression patterns in adult prethymic cells and DN thymocytes (Figs. 5 and 6). Fkbp5 (Fig. 7) and Tmem131 (data not shown) were up-regulated in response to Notch/Delta signaling, in agreement with their identification as Notch target genes (28), but with a very shallow change in magnitude. Eva1 was also induced in a Notch/Delta dependent way. The legacy genes Myb and Mll1 showed virtually unchanging expression from fetal liver progenitor throughout the differentiation time course, in the presence or absence of DL1, in keeping with their shared use in T, B, and stem cells.
The OP9 kinetic assays did discriminate between Bcl11b and HEBalt usage in T vs B lineage differentiation. HEBalt is expressed in B lineage as well as T lineage precursors (48), but in the adult, in vivo-derived populations its expression appeared to be strongly biased toward the T lineage (Fig. 2B). In fetal liver precursors differentiating in vitro, however (Fig. 7), HEBalt was induced in the absence of DL1 (turquoise line, magenta line; B cell conditions) only a fewfold less strongly than in the presence of DL1 (navy line, yellow line; T cell conditions). It also appeared to be expressed at substantial levels in the fetal liver-derived starting populations (Fig. 7), as in the BM LSK populations (Fig. 2B). In contrast, Bcl11b expression was completely dependent on the T lineage differentiation conditions. Thus, our screen identifies Bcl11b as a singularly specific early component of the T cell program in vivo and in vitro.
Kinetics of Bcl11b induction depend on developmental state of prethymic precursors
The exponential increase of Bcl11b RNA expression over a 4- to 6-day period, as shown in Fig. 7, suggested that in addition to the Notch-dependent process inducing Bcl11b transcription, the frequency of cells competent to express the gene may also be increasing or that Bcl11b may exert a positive feedback effect on its own expression. We therefore tested whether the kinetics of Bcl11b induction under these conditions were dependent on the developmental status of the input cells. We took advantage of the fact that distinct subsets of prethymic precursors in the fetal liver progress to the DN2 stage with faster or slower kinetics in the OP9-DL1 system: CLP-like (Lin−Kit+CD27+CD135+CD127+) cells differentiate faster, while stem-like LSK cells (Lin−Kit+Sca-1+) show a lag (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication), and “Flk+” cells (Lin−Kit+Sca-1lowCD27+CD135+CD127−) give intermediate responses. Fig. 8 shows that none of these populations express detectable Bcl11b initially (“0” time points). When cocultured with OP9-DL1, Bcl11b is rapidly up-regulated in the CLP-like cells, to levels approaching maximal within 2 days, while the LSK cells take 7 days to reach the same level and the Flk+ cells require at least 3 days (Fig. 8, □). These kinetics are in excellent agreement with the time it takes for each population to generate DN2 and later stage cells in vitro (Fig. 8, line graphs) (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication). Thus, the duration of Notch-Delta signaling required to turn on Bcl11b depends on the initial developmental state of the responding cells.
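One way to make a comparison of induction kinetics across populations explicit is to compute, for each time course, the first sampled day on which expression reaches a chosen fraction of its maximum. The sketch below assumes such a threshold-based summary; the function name, the 50% threshold, and all numerical values are hypothetical placeholders chosen only to mimic the ordering described in the text (CLP-like fastest, LSK slowest), and they are not measurements from this study.

```python
from typing import List, Optional, Tuple

def days_to_fraction_of_max(
    time_course: List[Tuple[float, float]], fraction: float = 0.5
) -> Optional[float]:
    """Return the first sampled day at which expression reaches
    `fraction` of the maximum observed in the time course.

    `time_course` is a list of (day, relative_expression) pairs sorted by day.
    Returns None for an empty time course.
    """
    if not time_course:
        return None
    peak = max(value for _, value in time_course)
    threshold = fraction * peak
    for day, value in time_course:
        if value >= threshold:
            return day
    return None

# Hypothetical Bcl11b/GAPDH ratios on OP9-DL1 (placeholders, not real data).
placeholder_courses = {
    "CLP-like": [(0, 1e-6), (1, 1e-4), (2, 8e-3), (3, 1e-2), (5, 1e-2), (7, 1e-2)],
    "Flk+": [(0, 1e-6), (2, 1e-3), (3, 6e-3), (5, 1e-2), (7, 1e-2)],
    "LSK": [(0, 1e-6), (2, 1e-5), (3, 1e-4), (5, 1e-3), (7, 1e-2)],
}

induction_days = {
    name: days_to_fraction_of_max(course) for name, course in placeholder_courses.items()
}
```

The same summary could be applied to the percentage of cells past the DN1 stage, so that the two readouts are compared on a common time axis.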
FIGURE 8. Impact of developmental preprogramming on kinetics of Bcl11b induction by Notch/Delta signaling. Expression of Bcl11b in cells differentiating from three populations of hemopoietic progenitors was compared by qRT-PCR at time points for up to 7 days on either OP9 control stroma (▪) or OP9-DL1 stroma (□). Bcl11b expression relative to GAPDH is graphed on a log scale (left axis). For each progenitor population, the percentage of cells cocultured on OP9-DL1 (□) that had progressed past the DN1 stage of T cell development at the indicated time points is also graphed on a linear scale (line graphs, right axis). CLP-like = common lymphoid progenitor-like cells. LSK = stem-like cells. Flk+: Lin−Kit+Sca-1−CD27+Flk2/Flt3+ multipotent progenitors. Bcl11b expression under conditions that promote B cell development (▪, OP9 control) remained at very low/background levels, whereas Bcl11b expression in cells cultured under T cell-promoting conditions (□, OP9-DL1) was up-regulated dramatically by day 2 of coculture. Results shown are from one of two independent experiments that gave similar results. The rate at which each progenitor population matures down the T lineage is most rapid for CLP-like cells, slowest for LSKs, and intermediate for Flk+ cells. The rate at which each population proceeds down the T cell pathway in response to Notch signaling, as indicated by the appearance of DN2 and later cells, is tightly correlated with the level of Bcl11b expression.
Discussion
In this study, we have used a gene discovery approach to search for new regulatory factors that participate with GATA3, Tcf7 (TCF-1), and Notch signaling in initiating T lineage specification. The method used was focused on broad-scale de novo identification of gene transcripts that are specifically up-regulated in early T lineage cells relative to other hemopoietic progenitors. This screen yielded a number of candidate transcription factors and signaling molecules that, previously, have been uncharacterized or only circumstantially linked with T cell development. Using quantitative RT-PCR with highly purified cell populations, we have tracked expression of many of the candidate regulatory genes in detail across the transition from stem cell to committed T lineage cell. This analysis has generated two main results. First, against expectation, it has established the preponderance of legacy genes in early T cell development. Most of the regulatory genes that showed preferential expression in T lineage precursors, as compared with B or myeloid lineage cells, actually represent a direct, continuous, quantitatively stable inheritance from pluripotent hemopoietic progenitors. Second, just two new regulatory factors have emerged from this screen with early T lineage up-regulation comparable to that of Tcf7: HEBalt, a functionally distinct promoter variant of HEB, which is also used in early B cells, and Bcl11b, the zinc finger transcription factor and tumor suppressor, which is fully T lineage specific.
This study is distinguished from related studies of early T cell development in part by its focus on the transition from prethymic progenitors (specifically LSK CD27− and LSK CD27+ from BM and Lin−Kit+CD27+ fetal liver precursors) into the first intrathymic stages. Although several other studies have compared populations from DN2 through β-selection, the earlier transitions have remained more obscure. Furthermore, the de novo, gene cloning approach of our work identified a number of genes not previously studied in T cell development, including those encoding transcriptional regulators Aff3, MLL2, Zfp27, Zfp109, and Zfp30, RNA-binding proteins Helz, FUS and Ddx19b, signaling component Grap, the immunophilin Fkbp5, the Lim-binding protein Ldb1, as well as uncharacterized products like Eva1, Tmem131, and Trim44 (Table III). A comparison of our findings and the results of microarray studies by Hoffmann et al. (27), Tabrizifard et al. (26), Dik et al. (28), and Lee et al. (29) shows general agreement where results for comparable samples were given, but also highlights the impact of microarray chip comprehensiveness and probe design. Microarray data from these sources regarding the genes in Fig. 2 are presented in Table III. Hoffmann et al. (27) and Tabrizifard et al. (26) studied T cell development in the mouse, but the Hoffmann work did not examine prethymic or DN1 populations and Tabrizifard et al. (26) started only with DN1 cells and reported results only for transcription factors. These studies were constrained by the limits of their microarrays as shown by the dashes in Table III (not included on the chip used) and possibly also by detection threshold issues (Table III – “n/r”, gene expression not reported). Dik et al. (28) evaluated gene expression in human hemopoietic cells roughly equivalent to the LSK CD27+ through DN transitions and extended to the late CD4 and CD8 single-positive T cell stages. 
These authors also noted selective up-regulation of Bcl11b at a point in human T cell development similar to our murine data. However, they found Bcl11a expression reinduced in CD4+ ISP cells, representing a later T lineage population, and this was not supported by our findings. The recent work by Lee et al. (29) used a highly comprehensive human microarray chip but their analysis focuses on the later transitions, from β-selection to naive CD4+ T cell. Without early thymocyte and hemopoietic populations for comparison, the Lee paper answers different questions about gene expression in T cell development.
Table III. Summary of microarray results from recent reports^a
Gene | Tabrizifard et al., Petrie lab (24) | Hoffmann et al., Melchers lab (25) | Dik et al., Staal lab (26) | Lee et al., McCune lab (27) |
---|---|---|---|---|
Microarray | Affymetrix Mouse MG-U74A chip | Affymetrix Mouse Mu11k A&B | Affymetrix Human U95Av2 | Affymetrix Human U133 A&B |
Ablim | —^b | — | — | n/r |
Aff3 | — | n/r | n/r | n/r |
AI449175 | — | — | — | — |
Akap8 | n/r | n/r | n/r | n/r |
Bat2 | n/r | n/r | Flat | n/r |
Bcl11b | — | — | Peaks at an equivalent developmental stage | Enriched vs stroma but no up-regulation between ITTP and DP |
Bcl11a | — | — | Peaks at ISP CD4+ | Flat |
Crsp7 | — | — | — | n/r |
Ctdsp1 | — | — | — | n/r |
Ddx17 | n/r | n/r | Flat | n/r |
Ddx19b | — | n/r | n/r | n/r |
Deltex1 | n/r | n/r | ? | n/r |
Dtx3L | — | — | — | n/r |
Eva1 | n/r | — | n/r | n/r |
Fgfrl1 | — | — | — | n/r |
Fkbp5 | n/r | n/r | Higher in ISP CD4+ | Slow rise from ITTP to CB4 |
FUS | n/r | n/r | n/r | n/r |
GATA3 | Peaks at DN3 | Rises through CD4+ | High at SP CD4+ | Rises from ITTP to peak at DP |
Gpr56 | n/r | — | Drops at SP CD4+ | Drops from ITTP to DP |
Grap | — | — | n/r | n/r |
HEBalt | — | — | — | — |
HEBcan | Rises through smDP | n/r | Highest at DP CD3− | n/r |
Helz | — | n/r | n/r | n/r |
Huwe1 | n/r | n/r | n/r | n/r |
Ldb1 | Peaks at smDP | n/r | n/r | n/r |
MLL1 | Peaks at DN2 | n/r | n/r | n/r |
MLL2 | — | — | n/r | n/r |
Mxd4 | n/r | n/r | Only high at UCB | n/r |
Myb | Peak at DN2 and again at smDP | Peak at DN3 | Peaks at CD34+38+1a+ | 6-fold drop between DP and SP CD4+ |
Ptpn7 | — | — | High at DP CD3- and again at SP CD8+ | n/r |
Rab2 | Peaks at smDP | n/r | High at DP CD3+ | n/r |
Rabgap1 | — | — | n/r | n/r |
SenP2 | — | — | — | n/r |
Tcf7 | Rises through smDP | n/r | Peaks at DP CD3+ and DP CD3− | Enriched vs stroma but flat between ITTP and DP |
Tmem131 | n/r | — | n/r | n/r |
Trappc2l | n/r | — | — | n/r |
Trim44 | — | — | Flat | n/r |
Zcchc11 | — | — | n/r | n/r |
Zfp109 | — | — | — | — |
Zfp27 | — | n/r | — | — |
Zfp30 | n/r | n/r | — | n/r |
^a Expression of the panel of genes identified in Fig. 2 as reported by four recent microarray-based bioinformatics articles. The first two columns of the table summarize investigations of mouse hematopoiesis. Small double-positive thymocytes (smDP) represent the development stage subsequent to DN4. CD4+ indicates a population of naive CD4+ thymocytes. The third and fourth columns summarize analyses of human T lymphopoiesis. Human ISP CD4+ cells are roughly equivalent to the mouse DN3 stage (78, 79). Single-positive CD4+ (SP CD4+) are naive CD4+ thymocytes. DP CD3− and DP CD3+ human thymocytes are similar to mouse double-positive thymocytes. UCB in the Dik et al. study refers to “stem cell-like” CD34+ cells from umbilical cord blood, presumed to correspond to a mouse LSK CD27− population. The ITTP population referred to in the Lee et al. article is similar to the human ISP CD4+ population (and to mouse DN3 cells), and CB4 indicates a population of naive CD4+ T cells obtained from umbilical cord blood.
^b —, The Affymetrix microarray used by the authors does not have a probe for the gene; n/r, the authors did not report expression for the gene either because it was not the focus of the study or because expression fell below the limits of detection; Flat, gene expression was reported but is consistent throughout the populations evaluated by the authors.
Identification of additional T lineage-specific regulators is significant because the known T lineage-specific transcription factors, GATA3, Tcf7, and HEB (canonical form), do not appear to increase in mRNA expression sharply enough to account for the dramatic onset of T lineage differentiation gene expression, during the DN1 to DN2 “specification” transition (10, 13, 15). In vitro, a 100-fold up-regulation of differentiation genes encoding CD3ε and pTα at the DN1 to DN2 transition is accompanied by only a 2- to 4-fold greater expression of GATA3 and Tcf7, with no detectable increase in known Notch target gene expression to suggest enhanced Notch signaling (24). Therefore, any additional factors that might provide a gain in T lineage-specific regulatory function during the DN1 to DN2 transition would be of great interest as combinatorial participants in the lineage specification process. A priori, it was assumed that many transcription factors would be up-regulated in this interval as the cells began T lineage differentiation, and that the challenge would be to detect a subset with functional importance. The results instead revealed remarkable continuities between the multilineage stem and progenitor cells and the Pro-T cells.
The strategy we used to select the genes of interest was deliberately designed to enable us to recover legacy genes as well as strictly T lineage-specific ones. This decision was based on the evidence for low-level “multilineage priming” of stem cells for expression of genes used in other hemopoietic lineages (60, 61). The surprise was that the Pro-T cell-enriched genes identified by this approach were so dominated by legacy genes, many of them virtually unchanging in their levels of expression from the multipotent progenitor to the Pro-T cell state. It is tempting to speculate that this inherited assemblage of regulatory factors may contribute to the remarkable maintenance of developmental plasticity in intrathymic Pro-T cells until just before β-selection (10, 12, 17, 22, 62, 63, 64, 65) (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication). Where genetic evidence is available, it confirms that legacy genes like Myb and Mll1 are indeed required for T lineage differentiation (66, 67, 68, 69, 70, 71), but they are not induced de novo in this process. In addition to transcription factors, transcripts encoding signaling molecules, adaptors, phosphatases, predicted RNA-binding molecules, helicases, and others identified in our screen were all found to be shared with prethymic BM LSK cells (summarized in Fig. 9). These legacy genes include many molecules that could play roles in signaling cascades that trigger differentiation as well as in functions shared by multipotent progenitors and early T cells, and they will be of substantial interest for further study.
Legacy genes as well as T lineage-specific genes contribute to the early T cell regulatory state. This Venn diagram represents legacy genes as those genes that are maintained with similar expression in hemopoietic progenitors and in early T cells, but which are down-regulated in early myeloid development. A selection of legacy genes is listed in the overlapping area between progenitor and T development. The genes found in this study that are most specifically up-regulated at a part of the T lineage program (i.e., expression in Pro-T plus is at least 2-fold that of LSK CD27+ and the SDs do not overlap) are listed in the yellow region of the Venn.
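The selection rule stated in this legend (at least 2-fold higher expression than in LSK CD27+ cells, with non-overlapping SDs) can be sketched as a small filter. The sketch assumes that "the SDs do not overlap" means the mean ± 1 SD intervals of the two populations are disjoint, and that replicate measurements are available for each population; the function name and the replicate values in the usage lines are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence

def sharply_upregulated(
    pro_t: Sequence[float], lsk_cd27_pos: Sequence[float], min_fold: float = 2.0
) -> bool:
    """Apply the 2-fold, non-overlapping-SD rule to replicate expression values.

    Each sequence holds replicate expression values (relative to GAPDH)
    and needs at least two entries so that a sample SD can be computed.
    """
    mean_t, mean_p = mean(pro_t), mean(lsk_cd27_pos)
    sd_t, sd_p = stdev(pro_t), stdev(lsk_cd27_pos)
    fold_ok = mean_t >= min_fold * mean_p
    # Assumed interpretation: lower bound of Pro-T lies above upper bound of LSK CD27+.
    intervals_disjoint = (mean_t - sd_t) > (mean_p + sd_p)
    return fold_ok and intervals_disjoint

# Hypothetical replicate values for illustration only.
clear_induction = sharply_upregulated([10.0, 12.0, 11.0], [1.0, 1.2, 0.9])
legacy_like = sharply_upregulated([2.0, 2.2, 2.1], [1.5, 1.6, 1.4])
noisy_induction = sharply_upregulated([1.0, 9.0, 2.0], [1.0, 2.0, 1.5])
```

Requiring both conditions keeps genes with a large but noisy mean difference out of the up-regulated set, which is why the third hypothetical case is rejected despite its fold change.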
The only two genes that emerged from this screen showing sharp up-regulation during T lineage specification both turned out to encode transcription factors. The alternate promoter use form of HEB, HEBalt, was previously identified in this arrayed cDNA library on the basis of its enrichment in Pro-T cells relative to peripheral lymphocytes (4, 31, 48). In the T lineage, its expression is confined to stages before β-selection, it is positively regulated by both Notch signaling and canonical HEB, and its expression is down-regulated sharply if lymphoid precursors are diverted to a myeloid fate (48, 72, 73). Most recently, HEBalt has been shown to play a hit-and-run accelerating role in early T cell development, distinct from that of canonical HEB (48). There is circumstantial evidence from conditional expression studies with the basic helix-loop-helix transcription factor E2A that increases in effective E-protein activity could provide some of the rate-limiting inductive function for T lineage differentiation genes (69, 71). E2A itself appears to be regulated mostly through a dramatic increase at the protein level between the DN1 and DN2 stages (74), even though its RNA levels remain relatively constant. The basis for this effect is not yet known. Thus, it is intriguing that the sharp increase in HEBalt mRNA expression could provide a functionally distinctive interaction partner (48) for E2A protein, or a decoy partner for its Id-family antagonist, in the same developmental interval. An E2A stabilization role would be consistent with the accelerating function that HEBalt has recently been shown to provide for entry into the T lineage pathway (48).
The highly lineage-specific, >100-fold up-regulation of Bcl11b in the DN2 stage (Fig. 5 and Ref. 55) stands alone as the most remarkable regulatory discontinuity we found to mark the entry into the T lineage program. In a study of human lymphopoiesis, Dik et al. (28) also report dramatic up-regulation of human Bcl11b at a stage similar to the murine LSK CD27+ to DN1 transition. Although already known to be expressed in a T lineage-specific way, Bcl11b was not expected to represent such a rarity. The tight coupling between the timing of Bcl11b up-regulation and the capability of a precursor cell to begin T cell differentiation indicates that this gene responds to developmental stage-specific regulatory inputs in addition to Notch signaling. Continued Bcl11b expression is also closely correlated with maintenance of T lineage identity: when DN2 and DN3 stage thymocytes are diverted to the myeloid pathway by forced expression of PU.1, Bcl11b is one of the genes that is sharply down-regulated (72). This makes the regulatory system controlling Bcl11b a uniquely powerful indicator for the mechanism of T lineage specification. The key question raised by our findings is what role Bcl11b may play during T lineage specification as this gene undergoes its own rapid induction. Steady-state analysis of germline Bcl11b knockout mice has shown only that it is essential for normal passage from the DN3 to the DP stage (75) and has not identified any function yet in the DN1-DN2 transition. It is notable, however, that in the absence of Bcl11b, DN3 cells are quite abnormal and incapable of differentiation to the DP stage even when a TCR transgene is supplied or p53 checkpoint control is removed (76, 77). Indeed, it is now known that abnormalities of Bcl11b expression are found in ∼20% of childhood T cell acute lymphoblastic leukemias (55, 56). Thus, Bcl11b activity during the DN2 stage, when it is first up-regulated, may guide the generation of normal DN3 cells. 
Bcl11b could contribute to normal T cell development either by acting directly on T lineage genes through its newly reported positive regulatory activity (52), or by participating in the repression of residual antagonists of the T cell program, such as PU.1 or SCL/Tal-1, to enforce T lineage commitment.
In summary, this study has defined about ninety genes enriched in early T cell precursors as compared with precursors embarking on myeloid differentiation, but of ∼40 tested, the majority are already highly expressed in prethymic multilineage precursors and merely sustained in a T lineage-specific way. Using detailed quantitative expression profiling in phenotypically defined subsets and in kinetic analyses of differentiation triggered by Notch/Delta signaling, one gene emerges as most intimately linked with the T lineage specification process. This gene, Bcl11b, has had its functions only hinted at in previous work and should provide a rich field for future study.
Acknowledgments
We thank Drs. Mark Leid (Oregon State University) and Dorina Avram (Albany Medical College) for stimulating discussions about Bcl11b and for generously sharing data before publication; John Cortese (Georgia Tech Research Institute) for helpful advice on terminology; Mary Yui (California Institute of Technology) for providing independent, confirmatory RNA samples; Rochelle Diamond and Stephanie Adams (California Institute of Technology) for expert advice and help with flow cytometry; Scott Bloom (Institute for Systems Biology) for excellent sequencing; Rob Butler, Robin Condie, Natasha Bouey, and Ruben Bayon (California Institute of Technology) for excellent mouse care; and Shirley Pease and the Genetically Engineered Mouse Service staff (California Institute of Technology) for timed pregnant mice.
Disclosures
The authors have no financial conflicts of interest.
Gene | Total Clones | Enrichment | qRT-PCR | Comments |
---|---|---|---|---|
Exon matched | | | | |
Genes known to be up-regulated in T development | | | | |
IL-2ra | 124 clones | ‡ | | IL-2R, α chain. Regulation of T cell proliferation |
CD3g | 61 clones | ‡ | | CD3 Ag, γ chain. Component of T cell complex |
Tcrg | 53 clones | ‡ | | TCR γ chain |
CD3e | 34 clones | ‡ | | CD3 ε chain |
Tcrb | 22 clones | ‡ | | TCR β chain |
Tcf7 (TCF-1) | 18 clones | ‡ | A,b B | Transcription factor 7, aka TCF-1, T cell specific |
Thy1 | 17 clones | ‡ | | Thymus cell Ag 1, θ |
Lat | 5 clones | ‡ | | Linker for activation of T cells, receptor-signaling protein |
Dntt | 5 clones | ‡ | | Deoxynucleotidyltransferase, terminal. Also known as Tdt |
Ly6a | 5 clones | ‡ | | Lymphocyte Ag 6 complex, locus A, aka Sca-1 |
Lck | 3 clones | ‡ | | Lymphocyte protein tyrosine kinase is involved in regulation of the TCR-signaling pathway |
Dpp4 | 2 clones | ‡ | | Dipeptidylpeptidase 4, aka CD26 |
Cd24a | | ‡ | | HSA or Ly52 |
RNA binding | | | | |
Ddx17 | | † | B | DEAD box helicase protein 17, p72, reported to regulate transcription |
Ddx19b | | ‡ | B, C | DEAD box helicase protein 19b |
FUS | 2 clones | † | A, B | Pigpen or Fusion derived from t(12;16) malignant liposarcoma, transcriptional activator |
Helz | | † | B, C | Helicase with zinc finger domain |
Other zinc finger proteins | | | | |
Rad18 | | ‡ | | Ring-type zinc finger protein, DNA repair activity |
Trim44 | | ‡ | A, B | Tripartite motif-containing 44, B-box domain |
Lim-related proteins | | | | |
Ablim1 | | ‡ | B | Actin-binding LIM protein 1, deletions are linked to cancer |
Ctdsp1 | | ‡ | A, B | A small phosphatase with a NLI, nuclear LIM interacting, domain |
Ldb1 | 2 clones | ‡ | A, B | LIM domain-binding 1, may interact with Scl in thrombopoiesis |
Protein metabolism, signaling, kinases, and ubiquitination | | | | |
Akap8 | | ‡ | B | A kinase (PRKA) anchor protein 8 |
Apoe | 2 clones | ‡ | | Apolipoprotein E, lipid transport and Ca2+ homeostasis |
Atxn2l | | † | A, B | Ataxin 2 spinocerebellar ataxia type II, may be part of cytokine-signaling system |
Ccnk | | † | | Cyclin K, may play a dual role in regulating CDK and RNA polymerase II activities |
Ddef1 | | † | | Development and differentiation enhancing 1, predicted to have transcriptional activity |
Dpp8 | | ‡ | | Dipeptidylpeptidase 8 |
Dtx3l | 2 clones | ‡ | C | Deltex3L, RING finger protein, similar to rhysin2 |
Fgfrl1 | | † | B | Fibroblast growth factor receptor-like 1 is involved in the negative regulation of cell proliferation |
Fkbp5 | | † | A, B, C | FK506 binding protein 5, a novel T cell-specific immunophilin capable of calcineurin inhibition. Low in mature T cells |
Gpr56 | 3 clones | ‡ | B | An atypical G protein-coupled receptor, binds to tissue transglutaminase and acts in brain cortical development |
Gps1 | | ‡ | | G protein pathway suppressor 1, subunit 1 of the COP9 signalosome |
Grap | | † | A, B | GRB2-related adaptor protein, GRB2 may interact with LAT by the SH2 domain. It is a negative regulator of the Erk pathway |
Huwe1 | | ‡ | A, B | HECT, UBA and WWE domain containing 1 (aka Ureb1) is an E3 ubiquitin protein ligase that inhibits the activity of tumor suppressor p53 protein |
Prkag3 | | † | | Protein kinase, AMP-activated, γ3 noncatalytic subunit, regulates skeletal muscle glycogen content |
Prpf4b | | † | | Pre-mRNA processing factor 4 homolog B, a putative kinase |
Prss16 # | 2 clones | ‡ | B# | Thymus-specific serine protease precursor Prss16, a protein of the thymic stroma |
Ptpn7 | | ‡ | B | Protein tyrosine phosphatase, non-receptor type 7, a MAPK-specific protein tyrosine phosphatase. Same as HEPTP |
Rab2 | | ‡ | A, B | A G protein of the RAS oncogene family |
(Table continues)
Fkbp5 | † | A, B, C | FK506 binding protein 5, a novel T cell-specific immunophilin capable of calcineurin inhibition. Low in mature T cells. | |||||
Gpr56 | 3 clones | ‡ | B | An atypical G protein-coupled receptor, binds to tissue transglutaminase and acts in brain cortical development | ||||
Gps1 | ‡ | G protein pathway suppressor 1, subunit 1 of the COP9 signalosome | ||||||
Grap | † | A, B | GRB2-related adaptor protein, GRB2 may interact with LAT by the SH2 domain. It is a negative regulator of the Erk pathway. | |||||
Huwe1 | ‡ | A, B | HECT, UBA and WWE domain containing 1 (aka Ureb1) is an E3 ubiquitin protein ligase that inhibits the activity of tumor suppressor p53 protein | |||||
Prkag3 | † | Protein kinase, AMP-activated, γ3 noncatalytic subunit, regulates skeletal muscle glycogen content | ||||||
Prpf4b | † | Pre-mRNA processing factor 4 homolog B, a putative kinase | ||||||
Prss16 # | 2 clones | ‡ | B# | Thymus-specific serine protease precursor Prss16, a protein of the thymic stroma | ||||
Ptpn7 | ‡ | B | Protein tyrosine phosphatase, non-receptor type 7, a MAPK-specific protein tyrosine phosphatase. Same as HEPTP | |||||
Rab2 | ‡ | A, B | A G protein of the RAS oncogene family | |||||
(Table continues) |
Rabgap1 | † | A, B | RAB GTPase-activating protein 1 | |||||
Senp2 | ‡ | B | SUMO/sentrin-specific protease, enhancer of Wnt | |||||
Tgfbr2 | ‡ | TGF, β receptor II | ||||||
Trappc2l | † | B | Trafficking protein particle complex 2-like, formerly 1810017G16Rik | |||||
Ube2l3 | 2 clones | ‡ | Ubiquitin-conjugating enzyme E2L3, an E2 ubiquitin protein ligase | |||||
Uble1a | ‡ | Ubiquitin-like 1-activating enzyme E1A (SUMO-1-activating enzyme) | ||||||
Was (Wasp) | † | A | Wiskott-Aldrich syndrome protein, may function as a signal transduction adaptor downstream of Cdc42 | |||||
Wbp11 | † | WW domain binding protein 11, has protein phosphatase type 1 regulator activity, and is involved in RNA splicing | ||||||
Zcchc11 | ‡ | B | Zinc finger, CCHC domain containing 11, reported to interact with TIFA to modulate TLR signaling | |||||
Regulators of transcription | ||||||||
AI449175 | ‡ | A, B | Kruppel-like zinc finger protein that has a KRAB domain | |||||
Aff3 | ‡ | A, B | Laf4, implicated in T cell leukemia | |||||
Crsp7 | ‡ | A, B | Cofactor required for Sp1 transcriptional activation | |||||
Lef1 | ‡ | Lymphoid enhancer binding factor 1, positive regulator of transcription | ||||||
Mll1 | ‡ | A, B | Myeloid/lymphoid or mixed-lineage leukemia 1, trithorax family protein | |||||
Mll2 | ‡ | A, B | Myeloid/lymphoid or mixed-lineage leukemia 2, trithorax family protein | |||||
Mxd4 | ‡ | B | Max dimerization protein 4, bHLH protein and predicted transcription factor | |||||
Notch1 | 8 clones | ‡ | Notch gene homolog 1 is a positive regulator of transcription | |||||
Notch3 | 3 clones | ‡ | Notch gene homolog 3 is involved in the negative regulation of cell differentiation | |||||
Runx1 | ‡ | Runt-related transcription factor 1, AML1, has DNA-dependent transcriptional activity | ||||||
Tardbp | 2 clones | † | TAR DNA-binding protein, predicted to regulate transcription | |||||
Tcf12 (HEB) | 4 clones | ‡ | A, B, C | HEB, bHLH protein, transcriptional activator | ||||
Zfp109 | † | A, B, C | KRAB box zinc finger protein, predicted to regulate transcription | |||||
Zfp27 | † | A, B | KRAB box zinc finger protein, predicted to regulate transcription | |||||
Zfp30 | † | A, B, C | KRAB box zinc finger protein, predicted to regulate transcription | |||||
1700021K02Rik # | 2 clones | ‡ | A, B# | A stromal protein also known as Spatial, stromal protein associated with thymi and lymph nodes, a possible transcription factor | ||||
Nucleotide binding | ||||||||
Exosc10 | ‡ | Exosome component 10, a nucleolar protein with exonuclease activity | ||||||
Gcc1 | ‡ | Golgi-coiled coil 1 contains a GRIP domain | ||||||
Electron transport | ||||||||
Txn2 | ‡ | Thioredoxin 2, a mitochondrial protein involved in electron transport | ||||||
Molecular function unknown | ||||||||
Bat2 | 4 clones | ‡ | B | HLA-B associated transcript 2, a novel MHC class III-encoded protein | ||||
Eva1 | 18 clones | ‡ | A, B | Epithelial V-like Ag 1, possible homotypic adhesion molecule, also expressed on thymic stroma. | ||||
Mpv17 | ‡ | May be integral to the mitochondrial membrane | ||||||
Tmem131 | 3 clones | ‡ | A, B | RW1 or Neg, predicted to contain 2 transmembrane domains. On chromosome 1, immediately adjacent to Zap70. | ||||
6030458C11Rik | ‡ | A RIKEN cDNA of unknown function | ||||||
Intron with or without exon | ||||||||
Regulators of transcription and other zinc fingers | ||||||||
Baz2a | ‡ | Bromodomain adjacent to zinc finger domain 2A or TIP5, transcription termination factor I-interacting protein 5 | ||||||
Bcl11b (CTIP2) | † | B, C | CTIP2, transcription factor required for T cell development | |||||
Foxp1 | † | Forkhead box protein P1, a possible transcriptional repressor | ||||||
Lass5 | ‡ | A | Longevity assurance homolog 5, homeobox domain | |||||
Myb | ‡ | A, B, C | Myeloblastosis oncogene, Myb is a SANT domain transcription factor | |||||
Ncor1 | † | Nuclear receptor corepressor 1, transcriptional repressor reported to bind Runx1 | ||||||
Runx1 | ‡ | B | Runt-related transcription factor 1, AML1 | |||||
Rbm4 | ‡ | Reported to be involved in nuclear mRNA splicing, via spliceosome. | ||||||
Trim39 | ‡ | Tripartite motif protein 39, has a four-helical cytokine domain | ||||||
Calcium channel proteins | ||||||||
Cacnb2 | † | Calcium channel, voltage-dependent, β2 subunit has high voltage-gated calcium channel activity | ||||||
Kcnn4 | ‡ | Potassium intermediate/small conductance calcium-activated channel, subfamily N, member 4, positive regulator of protein secretion. | ||||||
Protein metabolism, signaling, kinases, and ubiquitination | ||||||||
Arpp21 | † | Protein phosphatase type I regulator activity | ||||||
Csnk1e | ‡ | Casein kinase 1, ε, involved in Wnt signaling by DVL1 | ||||||
Fbxw4 | ‡ | F-box and WD-40 domain protein 4, domain structure suggests it may be involved in ubiquitin cycle and/or Wnt signaling | ||||||
Gna11 | ‡ | Guanine nucleotide binding protein, α11. Possibly involved in G protein-coupled receptor protein complex | ||||||
Saps1 | † | SAPS domain family, member 1, formerly KIAA1115 | ||||||
Taok2 | ‡ | TAO kinase 2, contains a serine/threonine protein kinase domain | ||||||
Molecular function unknown | ||||||||
5730419I09RIK | ‡ | May have synaptotagmin and calcium-dependent lipid-binding domains | ||||||
Tm6sf1 | ‡ | Transmembrane 6 superfamily member 1, integral to the membrane | ||||||
Malat1 | ‡ | A, B, C | Metastasis associated lung adenocarcinoma transcript 1 (noncoding RNA) |
Genes identified by the subtraction protocol as enriched in the Pro-T plus population relative to a myeloid/progenitor population are listed in Table I. Genes identified by clones with canonical exon matches are listed separately from genes identified by clones with matches to noncoding or intron and exon regions. Two genes selected by the subtraction that were found in thymic stroma but not in developing T cells are indicated by #. One clone was identified for each gene unless otherwise noted. Most genes listed were enriched by 4 SD above the geometric mean (‡), and some were enriched by 3 SD (†). Selected genes were analyzed by qRT-PCR in various cell population sets, coded A, B, and C. Set A consists of nonsorted Pro-T plus cells, sorted Pro-B cells, and nonsorted progenitor/premyeloid cells. Set B consists of sorted DN3 cells, stem-like progenitor cells (LSK CD27−), later multipotent progenitors (LSK CD27+), sorted Pro-B cells, and sorted BM myeloid cells (see also Fig. 2). Set C includes LSK CD27− and LSK CD27+ progenitors as in set B, as well as sorted DN thymocyte subsets from the earliest DN1 stage through β-selection to the DP stage (see Fig. 5).
Key: A, unsorted Pro-T plus, cultured progenitor/premyeloid and sorted Pro-B cells (data not shown); B, sorted DN3, LSK CD27+, LSK CD27−, Pro-B, and BM myeloid populations (Fig. 2); and C, B populations with sorted DN1, DN2, DN3a, DN3b, and DN4 Pro-T subpopulations (Fig. 5). #, Present in thymic stroma (see Fig. 4); ‡, genes up-regulated by four SDs; and †, 3 SD up-regulation.
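The enrichment marks in the key can be illustrated with a minimal Python sketch. It is not the analysis pipeline used for Table I. It assumes that the SD is computed on log-transformed enrichment ratios, so that the mean of the logs corresponds to the geometric mean. The function name `enrichment_marks` and the input dictionary of per-clone ratios are hypothetical.

```python
import math
import statistics


def enrichment_marks(ratios):
    """Assign '‡' (>= 4 SD), '†' (>= 3 SD), or '' to each clone.

    `ratios` maps a clone name to a positive enrichment ratio.
    Assumption: thresholds are measured in SD units above the
    geometric mean, i.e. above the mean of the log ratios.
    """
    logs = {name: math.log(r) for name, r in ratios.items()}
    mu = statistics.fmean(logs.values())   # log of the geometric mean
    sd = statistics.pstdev(logs.values())  # SD on the log scale
    marks = {}
    for name, value in logs.items():
        # Distance above the geometric mean in SD units (0 if no spread).
        z = (value - mu) / sd if sd > 0 else 0.0
        if z >= 4:
            marks[name] = "‡"
        elif z >= 3:
            marks[name] = "†"
        else:
            marks[name] = ""
    return marks
```

Under these assumptions, a clone must lie far out in the upper tail of the log-ratio distribution to receive either mark, which is consistent with only a small subset of the screened clones appearing in Table I.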
Footnotes
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This work was supported by grants (to E.V.R.) from the National Science Foundation (MCB-9983129) and National Institutes of Health U.S. Public Health Service (R01 CA90233 and R01 CA98925), by National Institutes of Health U.S. Public Health Service awards K08 AI054699 (to C.C.T.) and F32 AI068366 (to J.E.M.), and by the DNA Sequencer Royalty Fund at the California Institute of Technology.
Abbreviations used in this paper: BM, bone marrow; DN, double negative; Dtx3L, Deltex3-like; pTα, pre-TCRα; qRT-PCR, quantitative real-time PCR; TCF-1, T cell factor-1.
The online version of this article contains supplemental material.