Abstract
ARID3a is a DNA-binding protein important for normal hematopoiesis in mice and for in vitro lymphocyte development in human cultures. ARID3a knockout mice die in utero with defects in both early hematopoietic stem cell populations and erythropoiesis. Recent transcriptome analyses in human erythropoietic systems revealed increases in ARID3a transcripts, implicating potential roles for ARID3a in human erythrocyte development. However, ARID3a transcript levels do not faithfully reflect protein levels in many cells, and the functions and requirements for ARID3a protein in those systems have not been explored. We used the erythroleukemic cell line K562 as a model to elucidate functions of ARID3a protein in early human erythropoiesis. ARID3a knockdown in hemin-stimulated K562 cells resulted in lack of fetal globin production and modifications in gene expression. Temporal RNA sequencing data link ARID3a expression with the important erythroid regulators GATA1, GATA2, and KLF1. Ablation of ARID3a using CRISPR-Cas9 further demonstrated that it is required to maintain chromatin structures associated with erythropoietic differentiation potential. These data demonstrate that the ARID3a protein is required for early erythropoietic events and provide evidence that ARID3a functions are required for proper maintenance of appropriate chromatin structures.
Introduction
The intricate network of transcription factors (TFs) that drives hematopoiesis and erythropoiesis, including development of specific subpopulations, has not been fully elucidated (1). The erythroid-specific globin gene cluster is regulated by an upstream cis-regulatory region, the locus control region (LCR) (2, 3), that binds specific TFs, allowing accessibility to the embryonic and adult globin genes in a developmentally controlled fashion (4–6). Decreased levels of hemoglobin gene transcripts and maturational arrest of erythroid lineages in thalassemia are associated with defects in expression of the TFs GATA1, GATA2, and KLF1, as well as with changes in chromatin accessibility of enhancer regions for the globin genes (7), indicating the importance of these factors in erythropoiesis. Transcripts for AT-rich interaction domain 3a (ARID3a) were previously found to increase significantly throughout primitive erythropoiesis in a mouse model (8), implicating ARID3a as a potential regulator of hematopoietic reprogramming in combination with GATA1 and GATA2 through motif associations (9–11), possibly through binding to distal enhancer regions (12). However, others found discrepancies in protein versus transcript levels during human erythropoiesis, particularly for GATA1 (13), and we found that ARID3a transcripts do not always correlate with protein levels in mature hematopoietic cell subsets (14). Recently, coimmunoprecipitation liquid chromatography–mass spectrometry data, using the megakaryoblastic cell line CMK, revealed that ARID3a and GATA1 act in concert for proper regulation of megakaryopoiesis (9), but it is unclear whether ARID3a is required for early erythropoiesis in human cells. Therefore, it is critical to assess requirements for individual TFs at the protein level during hematopoietic events.
ARID3a was originally discovered for its ability to increase Ig transcription in B cells (15–17) and is a member of a large family of proteins, many of which have important roles as epigenetic regulators (18). Modulation of ARID3a levels in cord blood results in skewing of lineage fate decisions (13) and changes in the developmental plasticity within hematopoietic stem cells (19). ARID3a can both suppress and enhance individual gene expression in a cell type–specific fashion (20, 21). We previously reported that ARID3a-deficient mice die in utero at embryonic day 12.5 because of failed erythropoiesis (22), suggesting that ARID3a could be important for erythropoiesis in mice. However, ARID3a knockout (KO) embryos also exhibit a 90% reduction in hematopoietic stem cell numbers (22). It is currently unclear if failed erythropoiesis resulted directly from requirements for ARID3a during erythropoiesis or from earlier hematopoietic progenitor defects.
Our earlier data indicated that the human monomyelocytic cell line K562 constitutively expresses ARID3a protein (23). This human cell line has long been used as a model for erythrocyte and myeloid lineage development and can be induced with exogenous stimuli to differentiate into erythroid cells that express high levels of embryonic and fetal globin genes (13, 24–27). Therefore, we used this model system to determine whether ARID3a protein is required for human globin gene expression and to explore how ARID3a might mediate alterations in gene expression in those cells. Furthermore, we used temporal transcriptome analyses and integrated chromatin accessibility data from assay for transposase-accessible chromatin sequencing (ATAC-seq) to determine how ARID3a affects gene expression patterns during induction of erythroid lineage differentiation. These results identify new lineage-specific functions for ARID3a in human erythropoiesis.
Materials and Methods
Cell culture and transfection
K562 (CCL-243; American Type Culture Collection) cells were plated in triplicate at 1 × 10⁵ per well in six-well plates with RPMI 1640 + 7.5% FCS overnight at 37°C prior to treating cells with 0.04 mM hemin (Sigma-Aldrich), as previously described (28). Cells were harvested at 24, 48, and 72 h, and viabilities were assessed using trypan blue exclusion. To evaluate erythroid lineage differentiation, cells were stained with benzidine to detect globin expression, as reported previously (29). Briefly, cells were resuspended in 25 μl PBS and stained at a 1:1 ratio with benzidine solution made with 30% fresh hydrogen peroxide. At least 200 cells were evaluated per replicate. Lentiviruses expressing short hairpin RNA (shRNA) specific for ARID3a, or an unrelated control shRNA, both of which coexpress GFP to allow visualization of infected cells, were purchased from GeneCopoeia (Rockville, MD) and used at a multiplicity of infection of 0.6 to 1.0, as previously described (30). The ARID3a sequence targeted was GCAGTTTAAGCAGCTCTA from exon 2 and does not react with other ARID family members (30). K562 cells were infected with virus 30 min to 3 h prior to stimulation with hemin in the presence of 8 μg/ml polybrene, following our previous work (30). Lentiviral transduction efficiency was assessed via GFP expression using a Zoe Fluorescent Imager (Bio-Rad Laboratories) on day 2 and was typically >70%.
Flow cytometry
Abs to the erythroid precursor marker CD71 (transferrin receptor), conjugated to allophycocyanin-Cy7 (catalog no. 33410; BioLegend), and to the erythrocyte marker CD235a (glycophorin A), conjugated to PE-Cy7 (catalog no. 349112; BioLegend), were used for surface staining to evaluate erythroid lineage differentiation. Appropriate isotype controls from BioLegend were used for gating. Myeloid lineage differentiation was evaluated using the surface markers CD24 allophycocyanin (catalog no. 311118; BioLegend) and CD33 PE-Cy5 (catalog no. 303406; BioLegend). Following surface marker staining, cells were fixed with fixation buffer (catalog no. 420801; BioLegend), permeabilized with Foxp3/Transcription Factor Staining Buffer Set (catalog no. 00552300; Invitrogen eBioscience), and stained for ARID3a with goat anti-human ARID3a peptide-specific Ab, as we described previously (31). Donkey anti-goat IgG PE (catalog no. Pl31860; Invitrogen) was used as the secondary Ab. Data were collected on a Stratedigm S1200Ex, and data postprocessing and analysis were performed using FlowJo (Tree Star) software version 10.
RNA sequencing and analyses
Total RNA from triplicate samples treated with and without hemin, ARID3a shRNA, and/or scrambled shRNA was isolated using NucleoSpin RNA XS kits (catalog no. 740902.50; Macherey-Nagel). RNA concentrations were measured with an Implen NanoPhotometer. RNA integrity numbers were obtained using an Agilent 2200 TapeStation. Library construction was performed as described previously (14). Briefly, the Ovation RNA-Seq v2 (NuGEN Technologies) kit was used to generate sequencing libraries. Paired-end (2 × 50 bp) sequencing was performed on a NovaSeq platform. Fastq files were demultiplexed, and sequencing adapters were removed using Cutadapt (32). We created a Bowtie (33) index based on the University of California, Santa Cruz, knownGene (34) transcriptome and aligned paired-end reads directly to this index using Bowtie2. The average sequence depth was 21M reads, with an average of 83% of reads mapping to the hg38 genome assembly. Next, we ran RSEM v1.3.0 (35) using default parameters to obtain transcripts per million (TPM) values for each gene. Genes with expression values of TPM >1 in half of the samples were retained, leaving 11,869 transcripts for downstream analyses. Differential gene expression was analyzed using DESeq2 v3.5 (36). Differentially expressed genes (DEGs; false discovery rate [FDR] < 0.05) with fold changes (FC) ≥2 were used for Ingenuity Pathway Analysis (QIAGEN). Hierarchical clustering (Euclidean) was performed on DEGs (FDR < 0.05), and heatmaps were generated with the pheatmap package in R. Principal component analysis was performed in R using the prcomp function.
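The two filtering rules described above (retain genes with TPM >1 in half of the samples; select DEGs with FDR < 0.05 and FC ≥2) can be illustrated with a minimal Python sketch. This is not the analysis code used in the study, which relied on RSEM and DESeq2. The function names, the reading of "half" as "at least half," the symmetric treatment of up- and downregulation, and the toy values are all assumptions made for illustration.

```python
import math

def filter_expressed(tpm_by_gene, min_tpm=1.0, min_fraction=0.5):
    """Keep genes whose TPM exceeds min_tpm in at least min_fraction of samples.

    Assumption: "half of the samples" is read as "at least half".
    """
    kept = {}
    for gene, values in tpm_by_gene.items():
        n_pass = sum(1 for v in values if v > min_tpm)
        if n_pass >= min_fraction * len(values):
            kept[gene] = values
    return kept

def select_degs(results, fdr_cutoff=0.05, min_fold_change=2.0):
    """Return sorted genes with FDR < cutoff and |log2 FC| >= log2(min_fold_change).

    `results` maps gene -> (log2 fold change, FDR), as a DE tool would report.
    """
    min_log2fc = math.log2(min_fold_change)
    return sorted(
        gene
        for gene, (log2fc, fdr) in results.items()
        if fdr < fdr_cutoff and abs(log2fc) >= min_log2fc
    )

# Toy values for illustration only (hypothetical, not data from the study).
toy_tpm = {
    "GENE_A": [5.0, 3.2, 0.4, 2.1],  # 3 of 4 samples above 1 TPM: kept
    "GENE_B": [0.2, 0.0, 1.5, 0.9],  # 1 of 4 samples above 1 TPM: dropped
    "GENE_C": [1.2, 0.8, 4.0, 0.5],  # exactly half above 1 TPM: kept
}
toy_results = {
    "GENE_A": (1.8, 0.001),  # significant and more than 2-fold
    "GENE_C": (0.7, 0.010),  # significant but less than 2-fold
}

expressed = filter_expressed(toy_tpm)
degs = select_degs({g: toy_results[g] for g in expressed})
print(sorted(expressed), degs)
```

Working on the log2 scale keeps the fold-change cutoff symmetric, so a 2-fold decrease is treated the same way as a 2-fold increase.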
ARID3a KO
Genome editing of ARID3a was performed via CRISPR/Cas9 mutation of the K562 cell line contracted through Synthego (Redwood City, CA). Briefly, modified guide RNA ARID3a-932711 (5'-CCTCGTAAGTCCAGTCGCCG-3' [TGG]-PAM) targeting exon 3 was chosen to be specific for ARID3a. A bulk KO sample of greater than 70% KO was then single-cell sorted via flow cytometry for isolation of homozygous ARID3a KO clones. Sixty-six clones visually confirmed to have only one cell per well after sorting were allowed to grow, and 61 clones were screened by flow cytometry for ARID3a protein expression. Eight of nineteen clones selected by flow cytometry were then selected as being wild type (WT) or potentially homozygous KO, and levels of ARID3a expression were confirmed by Western blotting using a commercial ARID3a Ab (mouse monoclonal IgG catalog no. sc-398367; Santa Cruz Biotechnology). Homozygous colonies and WT colonies were used for ATAC-seq analyses.
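As a simple sanity check on the guide sequence quoted above, the following Python sketch verifies its length, its GC content, and that the stated PAM fits an NGG pattern. The NGG pattern assumes a Streptococcus pyogenes Cas9 nuclease, which the text does not specify; the function names are hypothetical and the check is illustrative only.

```python
import re

# Guide and PAM as quoted in the text above.
GUIDE = "CCTCGTAAGTCCAGTCGCCG"
PAM = "TGG"

def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def is_ngg_pam(pam):
    """True if the PAM matches NGG (assumption: SpCas9-type PAM)."""
    return re.fullmatch(r"[ACGT]GG", pam.upper()) is not None

guide_length = len(GUIDE)
guide_gc = gc_fraction(GUIDE)
pam_ok = is_ngg_pam(PAM)
print(guide_length, guide_gc, pam_ok)
```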
Western blotting
For protein detection, total cell extracts from 1 × 10⁶ cells were resuspended in 50 µl of Laemmli sample buffer containing 5% 2-ME. Following 5 min of boiling at 90°C, 10 µl of extract was loaded onto a precast Mini-PROTEAN TGX gel (catalog no. 456-1093; Bio-Rad Laboratories), and transfer was done for 1 h onto 0.2-µm nitrocellulose (Bio-Rad Laboratories). Membranes were blocked in 1% gelatin in TBST for 1 h at room temperature, as previously described (37). Blots were probed overnight for ARID3a and actin with mouse anti-ARID3a and rabbit anti–β-actin, respectively. Following incubation with primary Ab, blots were washed three times for 10 min with TBST and probed with secondary Ab for 1 h at room temperature. The secondary Abs were goat anti-mouse IgG for ARID3a and rabbit anti-goat IgG for actin. Blots were then washed three times for 10 min with TBST. Proteins were detected using the AP conjugate substrate kit (catalog no. 170-6432; Bio-Rad Laboratories).
ATAC-seq and analyses
ATAC-seq libraries were generated from WT and ARID3a−/− K562 clones treated with or without hemin. Duplicate samples of 30,000 cells were washed in cold PBS, pelleted by centrifugation, and lysed using cold lysis buffer (10 mM Tris-HCl [pH 7.4], 10 mM NaCl, 3 mM MgCl₂, and 0.1% IGEPAL CA-630). Nuclei were collected by centrifugation, and the pellet was resuspended in 50 µl transposase reaction mix (25 µl 2× TD buffer, 2.5 µl transposase [Illumina Nextera FC121-1030 TDE1 and TD buffer], and 22.5 µl nuclease-free water). The transposition reaction was incubated at 37°C for 30 min. Samples were cleaned using a MinElute kit (QIAGEN) following the manufacturer’s protocol and eluted in 10 µl Buffer EB. Library construction was done by PCR in a reaction mix containing 25 µl 2× NEBNext PCR Master Mix (New England Biolabs), 10 µl transposed sample, and 5 µl primer mix (1.25 µM each of Nextera XT adapter1 and adapter2 primers). PCR conditions were 72°C for 5 min, 98°C for 30 s, and 11 cycles (98°C for 10 s, 68°C for 30 s, and 72°C for 1 min), ending with 72°C for 5 min. PCR-amplified sequencing libraries were cleaned with AMPure Beads (Beckman Coulter). Library quality was determined by analysis on an Agilent TapeStation.
For each sample, 25–99 million 50-bp paired-end reads were obtained on an Illumina NextSeq sequencer. All data processing steps were performed within the Partek Flow Genomics Analysis software. Fastq files were processed, and both sequencing primers and Nextera transposase adapters were removed using Cutadapt. Trimmed reads were aligned to the GRCh38 (hg38) reference genome using Bowtie2 v2.2.5 with the --very-sensitive preset and a maximum fragment size of 2000 bp (-X 2000) (33). Low-quality (-Q 30) and duplicate reads, as well as reads mapping to the Encyclopedia of DNA Elements (ENCODE) project blacklist, mitochondrial DNA, and rRNA-encoding genes, were removed. MACS2 (38) was used to call peaks for duplicate samples using the parameters -q 0.05, --nolambda, --slocal 1000, --llocal 10000, -m 5 50, --shift 0, --extsize 200, and --fe-cutoff 1.0. The ATAC peaks of pooled replicate samples were annotated to genomic regions such as transcription start sites (TSS), introns, and exons using RefSeq version 89. Each peak was annotated in relation to these genomic elements and may have multiple gene annotations. DESeq2 v3.5 was run on the ATAC peaks to identify differential chromatin accessibility in WT (n = 8) versus ARID3a−/− (n = 8) samples with an FDR cutoff of 0.05 (36).
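For readers who wish to reproduce the alignment and peak-calling settings outside Partek Flow, the listed parameters map onto standard Bowtie2 and MACS2 command-line flags. The Python sketch below only assembles those command lines; it does not run them. The file names and index prefix are hypothetical placeholders, and any options not stated in the text (for example, input format or genome size) are left at the tools' defaults, which is an assumption.

```python
import shlex

def bowtie2_command(index_prefix, fastq_r1, fastq_r2, sam_out):
    """Paired-end Bowtie2 call with the preset and fragment size given in the text."""
    return [
        "bowtie2", "--very-sensitive", "-X", "2000",
        "-x", index_prefix,
        "-1", fastq_r1, "-2", fastq_r2,
        "-S", sam_out,
    ]

def macs2_command(bam_in, sample_name):
    """MACS2 callpeak call using the parameters listed in the text."""
    return [
        "macs2", "callpeak",
        "-t", bam_in, "-n", sample_name,
        "-q", "0.05",
        "--nolambda",
        "--slocal", "1000", "--llocal", "10000",
        "-m", "5", "50",
        "--shift", "0", "--extsize", "200",
        "--fe-cutoff", "1.0",
    ]

# Hypothetical file names, for illustration only.
bt = bowtie2_command("GRCh38_index", "sample_R1.fastq.gz",
                     "sample_R2.fastq.gz", "sample.sam")
mc = macs2_command("sample.filtered.bam", "sample")

print(shlex.join(bt))
print(shlex.join(mc))
```

Keeping the commands as argument lists means they can be passed directly to `subprocess.run` without shell quoting issues, should one choose to execute them.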
Statistics
Data for viability, benzidine stain, and time course were plotted, and all statistical analyses were performed using Prism (GraphPad) version 7. A one-way ANOVA was used for comparisons of multiple groups, followed by Tukey posttest for multiple comparison corrections. All statistical tests and corresponding p values are stated in the figure legends. Any p values <0.05 were considered significant.
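To make the one-way ANOVA concrete, the following Python sketch computes the F statistic from between-group and within-group sums of squares. It is an illustration only, not the Prism analysis used here: it omits the p value and the Tukey posttest, and the input numbers are hypothetical toy values, not measurements from this study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy values (hypothetical), three groups of three replicates each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f_stat, df_b, df_w)
```

A large F indicates that differences among group means are large relative to replicate-to-replicate scatter; the pairwise posttest then identifies which groups differ.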
Results
ARID3a knockdown inhibits globin production and expression of erythrocyte markers
The human K562 erythroleukemia cell line was treated in vitro with hemin for 5 d to allow visible production of red, fetal hemoglobin-producing cells (Fig. 1A). Although cells treated with an irrelevant control shRNA (Fig. 1A, right) resembled those treated with hemin only, cells that received ARID3a shRNA showed no obvious red cells (Fig. 1A, middle). Time course analyses indicated near maximal production of globin was achieved in hemin-treated cells by day 3 of treatment, and reduced numbers of globin-producing cells were apparent as early as day 1 after the inhibition of ARID3a (Fig. 1B). Robust inhibition of globin expression was also observed in ARID3a-inhibited samples on day 3 in multiple separate experiments (Fig. 1C). Viabilities and cell numbers (not shown) were equivalent on day 3 in all cultures (Fig. 1D), suggesting ARID3a inhibition did not cause cell death or inhibition of cell division. These data suggest that ARID3a is necessary for globin production.
ARID3a is required for hemin-induced fetal globin production and erythroid maturation.
K562 cells were treated with hemin with and without prior transduction of cells with lentivirus expressing ARID3a-specific shRNA or unrelated shRNA control virus. (A) Hemin-stimulated K562 were stained with benzidine to visualize dark-colored globin production. (B) A representative time course experiment (n = 3) shows percentages of globin-producing cells counted microscopically. (C) Cumulative data from five individual experiments show percentages of benzidine positive cells on day 3 of culture. *p < 0.0001, one-way ANOVA. (D) Percentages of viable K562 cells counted via trypan blue exclusion on day 3 are shown. Flow cytometry of cells stimulated as indicated on day 3 of culture shows surface staining of proteins associated with erythrocyte (E) and monocyte (F) lineage differentiation. (G) Flow cytometric histograms presented in normalized mode indicate numbers of cells expressing intracellular ARID3a (shaded peaks) compared with isotype controls (dotted lines). Solid vertical lines depict peak intensities of control unstimulated cells for comparison. (H) Percentages of ARID3a and the surface markers shown in (E) and (F) are presented for three experiments.
To further assess the effects of ARID3a inhibition on erythroid differentiation in this model, we performed flow cytometry to evaluate the presence of known erythrocyte markers (5). Erythrocyte lineage markers CD71 (TFRC) and CD235a (GYPA) were enhanced as expected by hemin treatment, whereas cells treated with ARID3a-specific shRNA, with or without hemin stimulation, more closely resembled unstimulated cells with respect to expression of these surface markers (Fig. 1E). Similarly, although hemin stimulation resulted in increased expression of the monocyte markers CD33 and CD24, cells treated with ARID3a-specific shRNA more closely resembled untreated cells (Fig. 1F). The control shRNA-treated cells expressed surface markers similar to those of hemin-treated cells. Flow cytometric analyses of intracellular ARID3a protein levels on day 3 of culture confirmed that ARID3a inhibition resulted in less ARID3a protein (Fig. 1G). Results for three independent experiments are quantified in Fig. 1H. Together, these data suggest that the ARID3a protein is required for early erythrocyte lineage differentiation in hemin-stimulated K562 cells.
ARID3a inhibition of hemin-stimulated cells results in downregulation of genes associated with erythroid differentiation
To further explore the block in erythroid differentiation in ARID3a-inhibited samples, a time course RNA sequencing (RNA-seq) experiment was performed over 3 d in K562 cells treated with and without hemin and with or without ARID3a inhibition. Triplicate samples from each treatment condition were sequenced, and differential expression analyses were performed to identify genes affected by ARID3a inhibition. Principal component analysis revealed that untreated samples cluster away from hemin-treated samples on both days 2 and 3 (Fig. 2). ARID3a-inhibited samples clustered more closely to untreated samples by day 3. Day 2 ARID3a-inhibited samples were closely clustered between untreated and hemin-treated samples (Fig. 2), indicating perturbed differentiation at this early time point.
ARID3a inhibition alters transcription profiles of hemin-induced K562 cells.
A principal component analysis of K562 cells treated with hemin for 2 or 3 d, with and without ARID3a inhibition, is shown from two different three-dimensional views. Individual dots represent triplicate cultures sequenced on days 2 and 3. Cell treatments and days are labeled.
Differential expression analyses were performed on untreated versus hemin-stimulated cells at each time point (n = 3, FDR < 0.05, FC > ±1.5) to identify genes important for erythroid lineage differentiation. There were 846, 2228, and 1055 DEGs on days 1, 2, and 3, respectively, in hemin-treated compared with untreated controls. Hierarchical clustering of the DEGs highlights the differences in transcriptomes over time and shows temporal expression of genes induced or repressed by hemin treatment (Fig. 3). Observations confirmed the differential expression of key genes involved in erythroid lineage differentiation (8), such as the hemoglobin genes α-globin (HBA1 and HBA2) and β-globin (HBG1 and HBG2), NFE2, TAL1, and KLF1, although they were not all differentially expressed on all 3 d (Fig. 3). These data agree with previous transcriptome analyses of hemin-stimulated K562 cells using microarrays (28, 39, 40), validating our model system. Pathway analyses identified erythrocyte development among the top pathways (Table I). Transcription, mRNA splicing, nuclear export, and autoimmune pathways were enriched in hemin-treated samples by day 2, as well as histone and nucleosome processes. These pathways were also enriched at day 3, indicating that most of the changes at the level of transcription were evident by day 2.
Hemin induces differential gene expression of globin-associated genes within 3 d.
Heatmaps of DEGs from triplicate cultures of untreated versus hemin-treated cells are shown at three time points (days 1–3) after hemin stimulation. Numbers of DEGs (FDR < 0.05, FC ≥ ±1.5, n = 3) are indicated below the heatmaps. Select genes previously associated with erythrocyte differentiation are indicated.
GO analysis of DEGs induced by hemin
GO Terms | Term ID | Adjusted p Value | Number of DEGs |
---|---|---|---|
Erythrocyte differentiation | GO: 0030218 | 1.69E−06 | 86 |
Erythrocyte homeostasis | GO: 0034101 | 5.53E−06 | 95 |
Regulation of transcription, DNA templated | GO: 0006355 | 2.96E−07 | 45 |
mRNA splicing | GO: 0000398 | 4.21E−11 | 41 |
Nuclear export | GO: 0051168 | 1.32E−06 | 121 |
ID, identifier.
To identify genes affected by ARID3a inhibition, we performed differential expression analysis on triplicate hemin-stimulated samples treated with a scrambled shRNA or ARID3a shRNA, with a focus on day 2 (Fig. 4A). ARID3a-suppressed cultures revealed strong attenuation of hemin-induced transcriptional activation of GATA2, HEMGN, LDB1, and ZFP361. Quantification of TPM values for select erythroid genes shows the effects of ARID3a inhibition (Fig. 4B). The expression of the erythroid differentiation marker CD71 (TFRC) examined in Fig. 1 was also significantly reduced upon ARID3a inhibition. Additionally, both α- and β-globin genes were significantly reduced within the first 2 d, suggesting that the majority of changes in gene expression occurred within the first 2 d.
ARID3a inhibition alters hemin-induced gene expression.
(A) A hierarchically clustered heatmap of DEGs in hemin-treated cells with and without ARID3a inhibition from triplicate cultures (each column) is shown at day 2. Total numbers of DEGs (FDR < 0.05, FC ≥ ±1.5, n = 3) are given at the panel bottoms, and select genes are indicated. (B) TPM values of key erythroid genes quantify the effects of both hemin stimulation and ARID3a inhibition. FDR values are displayed above conditions to indicate statistical significance determined by DESeq2.
Further analyses of differential gene expression at day 2 by Venn diagram indicate overlapping DEGs affected by each treatment condition (Fig. 5A). Hemin induction affected more genes than were affected by ARID3a inhibition. These analyses identified 227 overlapping DEGs upregulated by hemin treatment and downregulated by ARID3a inhibition (Fig. 5A). Pathway analyses of this gene list identified systemic lupus erythematosus signaling, mRNA processing (Gene Ontology [GO]: 0006396), RNA splicing (GO: 0008380), and chromatin binding (GO: 0003682) pathways (Fig. 5B). Identification of overrepresented TF binding sites within −2 kbp to +500 bp of the promoters of the 227 genes induced by hemin and repressed by ARID3a shRNA showed significant enrichment in genes with binding sites for YY1, PAX3, SIX6, ATF1, and ARID3B. Essential TFs for erythropoiesis (GATA1, GATA2, KLF1, and NFE2) and their cofactors/mediators (MED1 and LDB1) were all inhibited on day 2 in samples treated with hemin and shRNA. A list of the 35 most significant DEGs affected by hemin stimulation and those repressed by ARID3a inhibition is given in Table II. Among the top DEGs repressed by ARID3a inhibition, the majority were TFs, microRNAs, or other small nuclear RNAs involved in splicing. In addition, when the 227 genes affected by both hemin and ARID3a were analyzed for inferred protein associations by Ingenuity Pathway Analysis, networks with functions related to cell cycle, cell death and survival, and cell morphology were identified. Interestingly, these analyses identified ARID3a in the top network as a TF associated with FOS and YY1 using both hemin-stimulated and ARID3a shRNA-inhibited DEGs (Fig. 5D, 5E). Together, these data suggest that ARID3a is important for appropriate regulation of factors required for early erythropoiesis.
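The overlap logic behind the Venn analysis (genes induced by hemin whose induction is lost under ARID3a shRNA) amounts to a set intersection. The Python sketch below illustrates it with a handful of log2 FC values copied from Table II; it is not the authors' analysis code. It assumes that FDR filtering has already been applied and that a negative log2 FC in the hemin versus hemin + ARID3a shRNA comparison means lower expression with the shRNA.

```python
import math

# FC cutoff of 1.5 expressed on the log2 scale.
LOG2_FC_CUTOFF = math.log2(1.5)

# log2 FC values taken from Table II (a small subset, for illustration).
hemin_vs_untreated = {
    "TXNIP": 5.24, "HSPA5": -2.35, "HERPUD1": -1.99, "SERPINH1": -1.60,
}
shrna_vs_hemin = {
    "TXNIP": -4.15, "HSPA5": 2.05, "HERPUD1": 1.83,
    "SERPINH1": 1.20, "HEMGN": -1.95,
}

def upregulated(table):
    """Genes whose log2 FC is at or above the cutoff."""
    return {g for g, fc in table.items() if fc >= LOG2_FC_CUTOFF}

def downregulated(table):
    """Genes whose log2 FC is at or below the negative cutoff."""
    return {g for g, fc in table.items() if fc <= -LOG2_FC_CUTOFF}

# Induced by hemin, but reduced when ARID3a is inhibited.
induced_then_blocked = upregulated(hemin_vs_untreated) & downregulated(shrna_vs_hemin)
# Repressed by hemin, but higher when ARID3a is inhibited.
repressed_then_restored = downregulated(hemin_vs_untreated) & upregulated(shrna_vs_hemin)

print(sorted(induced_then_blocked), sorted(repressed_then_restored))
```

In this subset, HEMGN does not enter either intersection because it is not listed among the top hemin-affected genes in Table II, which shows why the overlap depends on both comparisons passing their thresholds.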
ARID3a is required for expression of a number of genes and pathways important for erythropoiesis.
(A) A Venn diagram indicates numbers of DEGs on day 2 of treatment with and without hemin and ARID3a shRNA (FDR < 0.05, FC ≥ 1.5, n = 3). (B) GO analyses indicate pathways important for the 227 genes that are affected by ARID3a inhibition and hemin induction. (C) The most highly represented TF binding motifs of the 227 overlapping genes are shown. Relative size of text in binding motifs suggests relative occurrence frequency. (D and E) Network analyses of the 227 DEGs in (A) reveal related genes. Log2 FC values from the untreated versus hemin comparison were overlaid onto the top network identified by Ingenuity Pathway Analysis (IPA) using the 227 overlapping genes. Red color indicates upregulated genes, and green color indicates downregulated genes.
Most significantly DEGs on day 2
Untreated versus Hemin | | | Hemin versus Hemin + ARID3a shRNA | | |
---|---|---|---|---|---|
Gene | Log2 FC | Adj p Value | Gene | Log2 FC | Adj p Value |
TXNIP | 5.24 | 6.25E−15 | SH3BGR | 2.29 | 4.02E−07 |
OSGIN1 | 3.68 | 3.63E−16 | HSPA5 | 2.05 | 7.51E−10 |
HBZ | 3.52 | 6.64E−16 | HERPUD1 | 1.83 | 4.88E−11 |
AKR1C1 | 3.01 | 1.89E−14 | AGR2 | 1.75 | 2.36E−09 |
SQSTM1 | 2.99 | 1.22E−24 | ALDH1A1 | 1.61 | 7.07E−07 |
MCM5 | 2.74 | 1.07E−18 | NFE4 | 1.43 | 3.92E−12 |
HBA2 | 2.54 | 2.68E−15 | SEC24D | 1.40 | 3.58E−07 |
HBE1 | 2.47 | 3.58E−21 | BEX2 | 1.29 | 4.11E−06 |
NQO1 | 2.41 | 2.22E−32 | CREG1 | 1.28 | 2.85E−06 |
HBG2 | 2.12 | 2.33E−15 | LCP1 | 1.22 | 4.33E−07 |
GCLM | 2.08 | 5.83E−19 | SERPINH1 | 1.20 | 5.91E−06 |
PPP1R15A | 2.05 | 2.15E−13 | RTN4 | 1.16 | 3.14E−08 |
FTL | 2.02 | 3.63E−16 | ACOT13 | 1.15 | 6.60E−07 |
HBA1 | 1.98 | 3.73E−16 | TDP2 | 1.11 | 1.49E−07 |
HBG1 | 1.94 | 1.02E−11 | TPM4 | 1.11 | 4.53E−08 |
TXNRD1 | 1.52 | 9.58E−12 | CTSD | 1.11 | 1.78E−08 |
FTH1 | 1.47 | 4.70E−12 | PGD | 1.05 | 3.26E−06 |
CREM | 1.41 | 1.28E−14 | ACTB | 1.02 | 2.38E−06 |
TXN | 1.20 | 6.53E−14 | CTSB | 1.02 | 1.39E−06 |
PRKCSH | −1.18 | 3.70E−16 | SND1 | 0.96 | 8.32E−07 |
FKBP2 | −1.44 | 1.47E−11 | COPA | 0.89 | 2.56E−07 |
UCA1 | −1.52 | 6.71E−13 | GSR | 0.88 | 2.85E−06 |
SERPINH1 | −1.60 | 1.47E−11 | ANXA5 | 0.88 | 1.47E−10 |
PDIA3 | −1.63 | 2.291E−19 | SEC61A1 | 0.80 | 1.54E−06 |
DNAJB11 | −1.69 | 4.11E−14 | CALU | 0.80 | 1.71E−09 |
AC068631.2 | −1.95 | 9.98E−12 | SSX1 | 0.77 | 1.23E−07 |
HSP90B1 | −1.97 | 3.99E−19 | NQO2 | 0.73 | 5.36E−06 |
HERPUD1 | −1.99 | 2.64E−14 | PSMC1 | 0.68 | 1.35E−06 |
NMU | −2.01 | 3.65E−11 | SEM1 | 0.61 | 4.33E−07 |
HYOU1 | −2.07 | 7.01E−19 | TNNI3 | −1.04 | 8.32E−07 |
PDIA6 | −2.19 | 2.54E−17 | VARS | −1.06 | 2.82E−07 |
MANF | −2.24 | 8.93E−16 | LINC01029 | −1.10 | 1.26E−06 |
PDIA4 | −2.35 | 1.43E−14 | MARCKSL1 | −1.14 | 4.84E−07 |
HSPA5 | −2.35 | 7.91E−14 | HEMGN | −1.95 | 7.51E−10 |
CALR | −2.57 | 9.34E−17 | TXNIP | −4.15 | 9.81E−11 |
Adj, adjusted.
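The Log2 FC column can be read as a linear fold change by raising 2 to the listed value. The short sketch below does this arithmetic for the two TXNIP entries in the table. It assumes that a positive value means higher expression in the second-named condition of each comparison, and the function name `log2fc_to_fold` is illustrative only. The table prints a typographic minus sign, which must be typed as an ASCII hyphen in code.

```python
def log2fc_to_fold(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear fold change (a ratio)."""
    return 2.0 ** log2_fc

# Values copied from the table above.
txnip_hemin = log2fc_to_fold(5.24)    # untreated versus hemin
txnip_shrna = log2fc_to_fold(-4.15)   # hemin versus hemin + ARID3a shRNA

print(f"TXNIP with hemin: about {txnip_hemin:.1f}-fold")
# For a negative log2 FC, the reciprocal gives the size of the decrease.
print(f"TXNIP with ARID3a shRNA: about {1 / txnip_shrna:.1f}-fold decrease")
```

A log2 FC of 1 therefore corresponds to a doubling and a value of −1 to a halving.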
Chromatin accessibility is altered in ARID3a KO K562 cells
Because of the large number of genes affected by ARID3a inhibition, including histone subunits, we hypothesized that, like other ARID family members, ARID3a might function to maintain important chromatin domains for erythropoiesis. To explore this hypothesis, CRISPR-Cas9 gene editing of K562 cells was used to generate clones with biallelic inactivation of ARID3a for ATAC-seq analyses (Fig. 6). Single-guide RNAs targeting exon 3, which codes for the extended DNA-binding domain specific to ARID3 family members, were used to generate genomic deletions of ARID3a (Fig. 6A). Bulk deleted clones were then single-cell sorted, and individual clones were screened for deletion by PCR (not shown), flow cytometry, and Western blotting (Fig. 6B, 6C). Two clones (BH and AL) selected via reduced intracellular staining were confirmed to exhibit no detectable ARID3a protein via Western blotting. In addition, flow cytometry indicated that hemin stimulation of the ARID3a KO clones did not induce expression of the surface markers CD71 and CD235a (Fig. 6D), two gene products that were previously identified as being reduced by ARID3a shRNA inhibition (Figs. 1 and 4), further validating those data.
Homozygous ARID3a KO K562 clones were generated.
(A) Schematic diagram of the single-guide RNA (sgRNA) cut site used for CRISPR-Cas9 deletion, which removes part of the extended-ARID DNA-binding domain. (B) Single-cell–sorted K562 clones were analyzed for protein expression of ARID3a by Western blotting of a WT and two KO clones, BH and AL, using 100,000 cells per lane. Actin was used to confirm protein loading in each lane (lower panel). (C) Flow cytometry for intracellular ARID3a confirmed KO of ARID3a, as shown for clone BH versus the WT clone. (D) Representative flow cytometry data from hemin-stimulated and untreated clones were used to evaluate erythroid lineage markers in KO clone BH versus the WT clone. Quadrant gates were set according to isotype controls.
These two KO clones and two WT clones were split into duplicate cultures and used with and without hemin treatment for ATAC-seq analyses. Mixed model analysis performed using Partek Genomics Suite identified 504 genomic regions with differential chromatin accessibility in ARID3a WT versus ARID3a KO K562 cells with and without hemin treatment (Fig. 7A), of which 271 mapped near genes (−1000 to +100 bp of any TSS) (Fig. 7B). A large percentage of differentially accessible sites were also in intergenic regions (Fig. 7B). Unsupervised hierarchical clustering of all 504 regions indicated that samples grouped by genotype (WT or KO) irrespective of hemin treatment and showed both increased and decreased regions of accessibility associated with ARID3a deficiency (Fig. 7C). Alterations in chromatin accessibility associated with ARID3a KO were particularly evident in large intragenic regions, as demonstrated in Fig. 7D, and may represent enhancers. A list of the top 10 regions that were most significantly up- or downregulated between WT and ARID3a KO clones is shown in Fig. 7E, and eight of those are intragenic regions of unknown function. Homer analysis of the promoter regions of genes with increased chromatin accessibility identified NFATC1, ZIC2, and GATA3 motifs (Fig. 7F). The same analysis performed on regions with decreased accessibility identified enrichment of ARNT/AHR, MTF1, and TEAD motifs (Fig. 7G). Pathway analysis of differentially accessible regions (DARs) increasing in accessibility in ARID3a KO cells compared with WT cells identified erythropoietin-mediated signaling and thrombin signaling as the top pathways, consistent with roles for ARID3a in erythropoietic functions (Table III). BCR signaling was also identified as an enriched pathway, consistent with previous data indicating multiple roles for ARID3a in B lymphocyte development and function (31).
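The assignment of a DAR to a gene using the −1000 to +100 bp window around a TSS can be illustrated with a small strand-aware overlap check. This is only a sketch of the idea and not the Partek Genomics Suite procedure used in the study. The `Tss` class, the function names, the simple overlap criterion, and all coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tss:
    chrom: str
    pos: int      # TSS coordinate
    strand: str   # "+" or "-"
    gene: str

def promoter_window(tss: Tss, upstream: int = 1000, downstream: int = 100):
    """Return the inclusive (start, end) window around a TSS, honoring strand."""
    if tss.strand == "+":
        return tss.pos - upstream, tss.pos + downstream
    # On the minus strand, "upstream" lies at higher coordinates.
    return tss.pos - downstream, tss.pos + upstream

def genes_near_region(chrom: str, start: int, end: int, tss_list):
    """List genes whose promoter window overlaps the region [start, end]."""
    hits = []
    for tss in tss_list:
        if tss.chrom != chrom:
            continue
        w_start, w_end = promoter_window(tss)
        if start <= w_end and end >= w_start:
            hits.append(tss.gene)
    return hits

# Hypothetical coordinates, for illustration only.
tss_list = [
    Tss("chr1", 10_000, "+", "GENE_A"),
    Tss("chr1", 50_000, "-", "GENE_B"),
]
print(genes_near_region("chr1", 9_200, 9_400, tss_list))    # upstream of GENE_A
print(genes_near_region("chr1", 50_500, 50_700, tss_list))  # upstream of GENE_B
print(genes_near_region("chr1", 30_000, 30_300, tss_list))  # no TSS nearby
```

Regions that return an empty list under a rule like this correspond to the DARs that did not map near any gene TSS.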
ARID3a is required to maintain chromatin landscapes.
Two homozygous ARID3a KO and WT clones with and without hemin treatment were subjected to ATAC-seq analyses in duplicate cultures. (A) A volcano plot shows DARs between WT and ARID3a KO clones (FDR < 0.05, Log FC > 2). Dark red dots represent individual genes with FDR < 0.05 (y-axis) and Log FC > 2 (x-axis). (B) The genomic distribution of differentially accessible ATAC peaks in WT versus ARID3a KO clones is shown. The human genome was partitioned into seven bins relative to RefSeq genes. TSS, transcriptional start site; TTS, transcriptional termination site; CDS, coding sequence; UTR, untranslated region. (C) Unsupervised hierarchical clustering of data from duplicate KO and WT clones with and without hemin generates a heatmap of DARs that reveals both increased and decreased accessible regions. (D) ATAC peaks for the two most differentially accessible regions (both intragenic) are shown. (E) The top 10 upregulated and downregulated DARs are shown. (F) Homer analysis shows enrichment of potential TF binding sites within DARs with increased or (G) decreased accessibility in WT versus KO clones. Relative size of text in the presented binding motifs suggests relative occurrence frequency.
GO analysis of DARs identified by ATAC-seq
GO Terms | Term ID | Adj. p Value | Number of DEGs |
---|---|---|---|
Erythropoietin-mediated signaling | GO: 0038162 | 7.75E−06 | 10 |
Thrombin signaling | GO: 0015057 | 6.30E−05 | 26 |
BCR signaling | GO: 0050852 | 1.11E−04 | 21 |
Regulation of NFAT signaling | GO: 0070884 | 3.34E−04 | 3 |
PI3K/AKT signaling | GO: 0014065 | 2.74E−03 | 6 |
Adj., adjusted; ID, identifier.
Comparison of the 158 DEGs affected by ARID3a inhibition (Fig. 4) with the 271 DARs associated with specific genes (Fig. 7) identified only nine genes present in both datasets (Fig. 8A, 8B), for which ARID3a-dependent promoter accessibility might be directly correlated with alterations in gene transcription. GO analysis of the nine overlapping genes revealed associations with TNF-mediated signaling, protein folding, erythrocyte differentiation, and protein degradation (Fig. 8C). Many of the ARID3a-associated DARs did not map near a gene TSS (Fig. 7) but may involve regions that function distally as enhancers, as occurs with ARID3a in the IgH locus (17, 41, 42). Together, these data suggest that ARID3a-associated effects on transcription are likely to be mediated through multiple mechanisms not limited to alterations in promoter accessibility.
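The overlap comparison described above amounts to intersecting two gene lists. The sketch below shows that operation on placeholder gene names and then expresses the reported counts (9 shared genes, 158 DEGs, 271 gene-associated DARs) as percentages. The function name is illustrative, and treating each of the 271 gene-associated DARs as one distinct gene is an assumption made only for this arithmetic.

```python
def overlap_summary(deg_genes, dar_genes):
    """Intersect two gene lists and report the shared fraction of each."""
    degs, dars = set(deg_genes), set(dar_genes)
    if not degs or not dars:
        raise ValueError("both gene lists must be non-empty")
    shared = degs & dars
    return {
        "shared": sorted(shared),
        "fraction_of_degs": len(shared) / len(degs),
        "fraction_of_dar_genes": len(shared) / len(dars),
    }

# Placeholder gene names, not the genes from the study.
example = overlap_summary(["G1", "G2", "G3", "G4"], ["G3", "G4", "G5"])
print(example["shared"])

# Counts reported in the text.
print(f"{9 / 158:.1%} of DEGs and {9 / 271:.1%} of gene-associated DARs overlap")
```

The small shared fraction is consistent with the interpretation that most ARID3a-associated transcriptional changes are not explained by promoter accessibility alone.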
Overlap analysis of DEGs from RNA-seq and DARs from ATAC-seq data.
(A) A Venn diagram indicates the number of common genes when comparing the 158 DEGs from Fig. 4 with DARs identified as being within 1000 bp of the TSS of a known gene. (B) The nine genes identified as significant in both RNA-seq and ATAC-seq data are listed. (C) Pathways identified from GO analyses with those nine genes are shown.
ARID3a deficiency directly affects chromatin regions associated with globin gene regulation
The globin LCR is an intergenic region located far upstream of the fetal globin genes that is critical for developmental regulation of those genes (43). This LCR showed reduced accessibility in ARID3a KO clones compared with WT clones, particularly after hemin treatment (Fig. 9A). ENCODE chromatin immunoprecipitation sequencing data (44) identified ARID3a binding sites in this region (Fig. 9A), in many cases overlapping GATA1, GATA2, and TAL1 binding sites. The erythroid-specific TF loci for NFE2 and TAL1 were also significantly less accessible in ARID3a KO clones than in WT cells (Fig. 9B, 9C). Quantification of ATAC peaks from the five hypersensitive sites within the globin LCR region and within the NFE2 and TAL1 loci is shown in Fig. 9D and 9E. Together, these data suggest that ARID3a is required to maintain appropriate chromatin configurations for globin gene expression and erythropoietic differentiation in K562 cells.
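Peak quantification of this kind is commonly expressed as log2 counts per million (CPM), which scales the reads in a peak by sequencing depth before taking the logarithm. The following is a minimal sketch of that calculation, not the exact normalization used in the study. The pseudocount `prior_count`, the function name, and the read counts are all assumptions for illustration.

```python
import math

def log2_cpm(count: float, library_size: float, prior_count: float = 0.5) -> float:
    """Log2 counts per million for one peak in one sample.

    prior_count is a small pseudocount (an assumption here) that avoids
    taking the logarithm of zero for peaks with no reads.
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    cpm = (count + prior_count) / library_size * 1e6
    return math.log2(cpm)

# Hypothetical read counts in one peak, with equal sequencing depth.
wt = log2_cpm(count=600, library_size=30_000_000)
ko = log2_cpm(count=150, library_size=30_000_000)
print(f"WT {wt:.2f}, KO {ko:.2f}, difference {ko - wt:.2f} log2 units")
```

Because the values are on a log2 scale, a difference of −1 between KO and WT corresponds to roughly half the normalized signal in the KO sample.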
ARID3a inhibition alters chromatin accessibility of regulatory regions important for erythropoiesis.
(A) Chromatin accessibility peaks of the β-globin locus and the LCR in WT and ARID3a KO clones stimulated with and without hemin are shown (two clones per group were analyzed in duplicate, FDR < 0.05, Log FC > 2). Gray bars indicate TF binding sites for ARID3a, GATA1, GATA2, and TAL1 identified by ENCODE in K562 cells. Dotted rectangles show ARID3a binding sites in all four conditions. Chromatin accessibility peaks for two other TFs required for erythropoiesis, NFE2 (B) and TAL1 (C), are shown for WT and ARID3a KO clones. Log2 counts per million (CPM) values were quantified for effects due to ARID3a depletion in the LCR region (D) and in sites near the NFE2 and TAL1 genes (E), as indicated by blue stars in (B) and (C). **FDR < 0.005.
Discussion
In this study, we demonstrated that ARID3a protein is required for hemin-induced erythrocyte lineage differentiation of the human K562 model cell line. Knockdown of ARID3a with shRNA resulted in a visible reduction in globin production and downregulation of erythroid lineage surface markers CD235a (GYPA) and CD71 (TFRC). RNA-seq analysis revealed that ARID3a is necessary for the expression of genes important for hemin-induced differentiation. Furthermore, our data reveal that ARID3a deficiency results in alterations in chromatin landscapes that contribute to erythropoiesis and suggest that ARID3a, like other ARID family members, is important for epigenetic regulation of chromatin accessibility. These data indicate a previously unappreciated role for ARID3a in human erythropoiesis and are the first, to our knowledge, to document genomic sites altered by depletion of ARID3a in this process.
We previously found that ARID3a KO embryos exhibited profound defects in erythropoiesis at day 12.5 of gestation (22). Consistent with our observations, Kingsley et al. (8) found ARID3a transcripts were enriched 22-fold in primitive erythropoiesis in the mouse. Additional studies in human erythroid progenitors indicated that the ARID3a gene locus was differentially methylated during early erythropoiesis with higher expression in fetal erythroblasts (45). More recently, the ARID3a locus was identified to be differentially methylated in human primary basophilic erythroblasts (46). Further, others linked ARID3a with the erythroid master regulators GATA1 and TAL1 in human erythropoietic studies that determined TF landscapes of enhancers in erythroid progenitors (6, 47), but these studies did not directly examine effects due to ARID3a. Our data confirm downregulation of globin genes (HBA1, HBA2, and HBZ) in ARID3a-deficient cells and reveal that ARID3a deficiency leads to blocks in differentiation and repression of key erythroid-specific TFs (GATA1, GATA2, KLF1, and NFE2) and genes encoding critical cofactors (MED1, LDB1, and CCAR1) for hemoglobin expression. α-Globin genes (HBA1, HBA2, and HBZ), components of mediator complexes important for erythropoiesis (MED1), and histone subunits were among the 227 genes induced by hemin and downregulated by ARID3a inhibition, suggesting ARID3a has a role in globin transcription or mediates cofactor binding to the TSS of erythroid-specific genes through epigenetic mechanisms. Our studies extend previous data that were limited to transcript analyses and definitively demonstrate a requirement for ARID3a protein in human erythroid development in this model cell line.
We identified 158 differentially regulated genes associated with ARID3a inhibition at day 2 in this system (Fig. 4). These genes showed significant enrichment of binding sites for the TFs GATA1 and GATA2. Both GATA1 and GATA2 are critical regulators of erythropoiesis (48–51). GATA2 is expressed in erythroid precursors (52), and as GATA1 levels increase, GATA2 is replaced by GATA1 at many sites throughout the genome, a process called GATA switching (53, 54). Studies of enhancer turnover in CD34+ cells suggested that ARID3a could be associated with the GATA2-to-GATA1 switch, raising the possibility that ARID3a could be involved in epigenetic modifications in those cells (8). ENCODE data of K562 cells showed considerable overlap between GATA1, GATA2, and ARID3a binding sites in many genes important for erythropoiesis, suggesting ARID3a may function with those factors, either as a TF or as an epigenetic regulator mediating opening/closing of chromatin in enhancer/promoter regions. Knockdown of ARID3a with shRNA in GATA1-mutated cells revealed a block in both megakaryocytic and erythroid differentiation and revealed that 65% of predicted ARID3a binding sites in K562 cells overlap with GATA1 sites (9). Indeed, the globin locus and other differentially regulated genes exhibit close proximity of binding sites for ARID3a, GATA1, GATA2, and TAL1 (Fig. 9). This raises the possibility that ARID3a could be a part of the transcription machinery, which also contains GATA1, to drive erythroid-specific gene programs. Moreover, the GATA switch is mediated by positioning of the polycomb subunit EZH2 (55). EZH2 was significantly downregulated upon inhibition of ARID3a in our studies, and C.F. Webb’s unpublished observations suggest it may interact directly with ARID3a in K562 cells. Further studies will be needed to explore whether ARID3a is important for the GATA switch through regulation of or interactions with EZH2 and other associated TFs.
Our ATAC data revealed that ARID3a deficiency results in both increases and decreases in chromatin accessibility in both coding and noncoding regions of the genome. Indeed, unsupervised hierarchical clustering of hemin-stimulated and unstimulated KO and WT clones suggests that hemin stimulation did not dramatically alter chromatin landscapes (Fig. 7C). Rather, the presence or absence of ARID3a defined the majority of the chromatin alterations. These data suggest that ARID3a functions to establish chromatin landscapes necessary for erythroid differentiation, as observed by decreases in chromatin accessibility in erythroid-specific enhancer regions in ARID3a KO clones, including the LCR that is essential for the developmentally controlled expression of embryonic, fetal, and adult globin genes. EHMT1 adds repressive H3K9me2 marks to the LCR region (56) and showed 2-fold increased accessibility in ARID3a KO clones. EHMT1 also adds repressive H3K9me2 marks at the γ-globin locus in human adult erythroid cells, thereby reducing expression of both γ-globin and fetal globin (56). Future studies will be required to determine how ARID3a contributes to these effects. However, our data suggest that ARID3a may participate in mediation of multiple epigenetic events necessary for erythropoiesis and that these events may require context-dependent transcriptional activities.
It is likely that ARID3a functions in coordination with other epigenetic factors to mediate its effects. Although ARID3a contains additional DNA-binding specificity not observed in other ARID family members, it does not contain obvious epigenetic regulatory domains associated with many of the other ARID family members (57). Identifying the proteins associated with ARID3a will likely be necessary to fully understand its functions. Those proteins are likely to interact in cell type–specific fashions, perhaps through activation of enhancers. Of the 504 DARs identified by ATAC-seq, 233 were located in intergenic regions with unknown function. Moreover, there were 12 intergenic regions among the top 20 DARs (Fig. 7). The importance of intergenic regions is emphasized by a study on the superenhancer-derived RNA, alncRNA-EC7/Bloodlinc, which is required for terminal erythropoiesis and RBC production (42). Our ATAC data show reduced accessibility of this enhancer region. It is not currently possible to distinguish which of these intergenic regions with altered chromatin accessibility directly contribute to erythropoietic functions versus other hematopoietic events. For example, some of these regions may be important in other hematopoietic cells for which ARID3a is linked to disease activity, such as lupus (27, 58). Histone subunits (HIST1H2BN, HIST1H3B, HIST1H3H, and HIST1H4J), chromatin remodelers (SAT2B), heme biosynthesis enzymes (ALAS1), and genes implicated in systemic lupus erythematosus signaling pathways (PPP1R15A and TROVE2) were repressed by ARID3a shRNA on day 2. Thus, it is not possible from these data alone to elucidate the specific functions of ARID3a-regulated regions in this or other cell types.
Limitations of this study include the use of the K562 transformed cell line, which may not faithfully mimic all aspects of fetal globin expression in primary erythropoietic progenitors. In addition, although our data suggest that ARID3a is required for normal transcription levels and to maintain normal chromatin configurations in these cells, our data do not suggest that ARID3a alone is sufficient to mediate all of the alterations observed. It is likely that additional proteins associate with ARID3a to mediate these effects. In addition, some of the ARID3a-associated effects we observed may be indirect effects due to alterations of expression of other TFs and/or epigenetic regulators.
Understanding how hemoglobin expression and erythropoiesis are regulated is critical for the development of new therapeutics for diseases such as sickle cell disease and thalassemia. Further elucidation of how ARID3a functions in erythropoiesis and other hematopoietic events could lead to development of new therapeutic agents for blood disorders. Together, these data expand our knowledge of the importance of ARID3a in hematopoiesis and particularly in erythroid lineage development and define new functional and regulatory roles for ARID3a.
Data availability
RNA-seq and ATAC-seq data are publicly available through the Gene Expression Omnibus National Center for Biotechnology Information database at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE131649 under the accession number GSE131649.
Acknowledgements
We thank the Clinical Genomics Core Facility at the Oklahoma Medical Research Foundation, the Flow Cytometry Core Facility at the University of Oklahoma Health Sciences Center, and the Stephenson Cancer Center at the University of Oklahoma Health Sciences Center for core support. We also thank Ken Jones for helpful discussions.
Footnotes
This work was supported by the National Institutes of Health (AI123951 and AI118836 to C.F.W. and T32 AI007633 to J.G.).
The sequences presented in this article have been submitted to Gene Expression Omnibus of the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE131649.
Abbreviations used in this article
- ARID3a: A + T rich binding protein 3a
- ATAC: assay for transposase-accessible chromatin
- ATAC-seq: assay for transposase-accessible chromatin sequencing
- DAR: differentially accessible region
- DEG: differentially expressed gene
- ENCODE: Encyclopedia of DNA Elements
- FC: fold change
- FDR: false discovery rate
- GO: Gene Ontology
- KO: knockout
- LCR: locus control region
- RNA-seq: RNA sequencing
- shRNA: short hairpin RNA
- TPM: transcript per million
- TSS: transcription start site
- WT: wild type
References
Disclosures
The authors have no financial conflicts of interest.