Abstract
Cytidine deamination of nucleic acids underlies diversification of Ig genes and inhibition of retroviral infection, and thus, it would appear to be vital to host defense. The host defense properties of cytidine deamination require two distinct but homologous cytidine deaminases—activation-induced cytidine deaminase and apolipoprotein B-editing cytidine deaminase, subunit 3G. Although cytidine deamination has clear benefits, it might well have biological costs. Uncontrolled cytidine deamination might generate misfolded polypeptides, dominant-negative proteins, or mutations in tumor suppressor genes, and thus contribute to tumor formation. How cytidine deaminases target a given nucleic acid substrate at specific sequences is not understood, and what protects cells from uncontrolled mutagenesis is not known. In this paper, I shall review the functions and regulation of activation-induced cytidine deaminase and apolipoprotein B-editing cytidine deaminase, subunit 3G, and speculate about the basis for site specificity vis-à-vis generalized mutagenesis.
Cytidine deamination was discovered as a mechanism of editing mRNA transcripts in eukaryotes (RNA editing) (1, 2). RNA editing modifies transcripts in many organisms and thus contributes to expanding the number of gene products without the generation of new genes. Deamination of cytidine (C) to uridine (U) in RNA may create start codons, thus originating new open reading frames; it may introduce premature stop codons, producing truncated forms of proteins; it may open interrupted reading frames, giving rise to novel proteins; or it may modify base pairing in RNA molecules to create alternative secondary structures that could alter RNA stability.
One well-studied type of C-to-U RNA editing is catalyzed by the apolipoprotein (apo) B RNA-editing cytidine deaminase (APOBEC)1. APOBEC1 edits apo B mRNA at a specific site by deaminating cytidine-6666 to uridine, thus converting a glutamine codon 2153 (CAA) into a premature stop codon (UAA), producing a truncated form of apo B (apo B-48) (1, 3, 4, 5). The truncated apo B is synthesized in the small intestine and secreted in the triglyceride-rich chylomicrons that carry dietary fat (6). A nonedited apo B mRNA is produced in the liver, where it generates apo B-100, which is a component of the very-low-density and low-density lipoproteins (6) that mediate the transport of endogenously produced lipids.
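The consequence of this single-base edit for the reading frame can be illustrated with a minimal Python sketch. The RNA fragment and the function names are hypothetical and chosen only for illustration; the only element taken from the text is the conversion of a CAA codon into UAA.

```python
# Minimal sketch: C-to-U editing converts a glutamine codon (CAA) into a stop codon (UAA).
# The flanking codons are invented for illustration; only CAA -> UAA comes from the text.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def deaminate(rna, position):
    """Return a copy of rna in which the C at the 0-based position is replaced by U."""
    if rna[position] != "C":
        raise ValueError("only cytidine can be deaminated to uridine")
    return rna[:position] + "U" + rna[position + 1:]


def codons_until_stop(rna):
    """Read codons from position 0 and stop before the first in-frame stop codon."""
    codons = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


unedited = "AUGGAUCAAUUUGGC"      # hypothetical frame: AUG GAU CAA UUU GGC
edited = deaminate(unedited, 6)   # index 6 is the C of the CAA codon

print(codons_until_stop(unedited))
print(codons_until_stop(edited))
```

In this toy frame the unedited transcript is read through all five codons, whereas the edited transcript terminates after the second codon, which mirrors how editing yields the shorter apo B-48.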
Cytidine deamination and diversification of the Ig genes
Activation-induced cytidine deaminase (AID) was discovered by Muramatsu et al. (7) as an APOBEC1 homolog with cytidine deaminase properties in stimulated B cell lines. Muramatsu et al. (8) showed that AID is necessary for somatic hypermutation and class switch recombination, because AID−/− B lymphocytes do not undergo class switch recombination and fail to accumulate mutations upon Ag stimulation. Mutations in the human AID gene causing lack of function underlie one type of hyper-IgM syndrome. In this syndrome, B cells fail to switch from IgM to other isotypes, and somatic hypermutation of the Ig V regions does not occur (9). This syndrome seemingly connects AID with Ig isotype class switch and somatic hypermutation. Arakawa et al. (10) demonstrated that AID is also required for Ig gene conversion in chicken B cell lines. Thus, AID participates in three different processes that contribute to the diversification of Abs: somatic hypermutation, isotype class switch recombination, and gene conversion.
Somatic hypermutation involves the introduction of nontemplated mutations in the V region of the Ig genes at a rate that is one million times greater than the basal rate (11); gene conversion consists of the introduction of templated mutations in the V region of Ig genes by copying into the active gene nucleotide stretches from existent template sequences that remain unaltered (12). Class switch recombination changes the C region of Ig H chain genes and thus the effector functions of Abs (13). Class switch recombination consists of the ligation of the Ig H chain V region to C region exons downstream of Cμ and Cδ following looping out and deleting the intervening DNA (13).
Given the diverse impact of AID on Ig genes, the question has emerged of whether AID is an RNA-editing enzyme (8) like APOBEC1, or whether it might directly deaminate dC to dU in the DNA (14). The discovery that AID mutates dC residues in Escherichia coli DNA (14) and in ssDNA in vitro (15, 16, 17) suggests that AID may mutate ssDNA and not RNA in B cells. Faili et al. (17) found mutations in one DNA strand, in the absence of de novo RNA synthesis, suggesting that AID might induce mutagenesis directly without RNA editing. Bransteitter et al. (15) suggested that RNA molecules inhibit AID function in vitro, because the addition of RNase greatly enhances deamination of dC residues in ssDNA molecules. However, RNA may still be required for AID function. Thus, Chaudhuri et al. (18) found that AID function requires transcription and suggested that transcription may generate RNA-DNA hybrids and a secondary structure recognized by the enzyme. Alternatively, AID may associate with the transcription machinery itself, which could target AID to transcribed genes (19, 20).
How cytidine deamination by AID promotes biochemically distinct mechanisms of nucleic acid modification in B cells is incompletely understood. Petersen-Mahrt et al. (14) proposed that cytidine deamination by AID introduces a U opposite to a G in DNA, thus creating a mismatch. If the mismatch is not repaired, then one of the daughter cells will have a C/G to T/A transition, because U does not pair with G. In the absence of repair, each cytidine deamination event would yield a transition in half of the cellular progeny, provided that the cells survive in the presence of the DNA mismatch. Repair of the mismatch may promote further mutations, because mismatch repair promotes nucleotide excision, which in hypermutating B cells is followed by error-prone resynthesis (14). Such “misrepair” of mismatches may explain how mismatch repair promotes somatic hypermutation (21). Mismatch repair machinery may also contribute to the survival of B cells in the presence of mismatches induced by cytidine deamination (22, 23).
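The segregation of an unrepaired dU-dG mismatch into daughter cells can be sketched in Python as follows. This is a deliberately simplified model under stated assumptions: the five-base duplex is hypothetical, the two strands are written aligned position by position (strand polarity is ignored), and uracil is assumed to template adenine during replication.

```python
# Minimal sketch: an unrepaired dU:dG mismatch gives a C/G -> T/A transition
# in one of the two daughter duplexes after one round of replication.

# Base chosen by the polymerase opposite each template base; U templates A, as T does.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_strand(template):
    """Return the newly synthesized partner strand, aligned position by position."""
    return "".join(COMPLEMENT[base] for base in template)


def replicate(top, bottom):
    """Semiconservative replication: each parental strand templates a new partner."""
    daughter_1 = (top, copy_strand(top))
    daughter_2 = (copy_strand(bottom), bottom)
    return daughter_1, daughter_2


top, bottom = "GACGT", "CTGCA"             # hypothetical duplex
deaminated_top = top[:2] + "U" + top[3:]   # dC at index 2 becomes dU

d1, d2 = replicate(deaminated_top, bottom)
print(d1)  # daughter templated by the deaminated strand
print(d2)  # daughter templated by the intact strand
```

The daughter copied from the deaminated strand carries A opposite U (a T/A pair once U is replaced by T), whereas the daughter copied from the intact strand retains the original C/G pair.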
Another path to mutations involves excision of uracil, in the dU-dG mismatch, by a DNA glycosylase (24), creating an abasic site. Repair of the abasic site, like repair of mismatches, requires single-strand nicks followed by error-prone DNA synthesis. Error-prone DNA synthesis may introduce any base opposite to G, thus originating both transitions (G/C to A/T) and transversions (G/C to T/A or to C/G). DNA glycosylase contributes to the generation of mutations following cytidine deamination, because uracil DNA glycosylase (UNG)-deficient DT40 chicken cell lines (24) or mice (25) have an increased number of transition mutations, and UNG-deficient mice also have deficient isotype class switch (25). Ig gene conversion is also decreased by inhibiting uracil-DNA glycosylase in chicken DT40 cells (26). In humans, deficiency of UNG causes hyper-IgM syndrome (27), suggesting that cytidine deamination of the Ig genes may be resolved predominantly by excision of uracil.
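The distinction between transitions and transversions used above can be made explicit with a short Python sketch. The function name is illustrative; the classification itself follows the standard definition (a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, and a transversion exchanges one class for the other).

```python
# Minimal sketch: classify a single-base substitution as a transition or a transversion.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref, alt):
    """Return 'transition' or 'transversion' for the substitution ref -> alt."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for ref, alt in [("C", "T"), ("G", "A"), ("G", "T"), ("G", "C")]:
    print(ref, "->", alt, classify(ref, alt))
```

Under this definition, replication across an unexcised uracil gives only transitions at C/G pairs, whereas error-prone synthesis across an abasic site can give either class.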
Although substitution of U for C in DNA explains much of the genesis of mutations, it does not explain class switch recombination, which requires DNA double-strand breaks (28, 29, 30). Double-strand breaks might be formed following nucleotide excision at staggered single-stranded breaks (31) and might also be required for somatic hypermutation (32, 33). Papavasiliou et al. (34), Zan et al. (35), and Bross et al. (32) suggested that Ig DNA double-strand breaks exist before cytidine deamination, and that AID may modify pre-existing DNA ends (35). Indeed AID might be recruited by the DNA ends and in this way may target specific sequences.
Cytidine deamination and defense against retroviruses
APOBEC3G (CEM15) was isolated by Sheehy et al. (36) as an anti-HIV factor homologous to APOBEC1, the antiviral action of which is overcome by the HIV viral infectivity factor (Vif). APOBEC3G deaminates dC to dU in the retroviral minus strand following reverse transcription (cDNA) (37, 38, 39). APOBEC3G mutates retroviral cDNA by creating uracil bases in the place of cytidines. Uracil-containing cDNA may be cleaved by UNG, causing degradation of viral cDNA, defective production of early gene products, and impaired integration of provirus in the host genome, thus decreasing virus infectivity (J. Gonçalves, unpublished observations) (40, 41). APOBEC3G may also generate viral variants and in this way help viruses escape the immune response of the host. The deaminase function of APOBEC3G confers protection against retroviruses, because substitutions in the deaminase domain of APOBEC3G (defined by analogy with AID and APOBEC1) (Fig. 1) cause substantial increases in viral infectivity coincident with decreased dC-to-dT mutations and decreased activity in free nucleotides (37, 38).
FIGURE 1. CLUSTALW alignment of the human APOBEC1 (P41238), AID (Q9G2XT), and APOBEC3G (AAH24268) protein sequences. APOBEC3G contains a duplicated active site, linker, and pseudo-active site, and to facilitate the alignment, it was cropped in N-terminal (N) fragment 1–214, and C-terminal (C) fragment 215–384. Identities are shown in bold, boxed, and on gray background. The different domains are identified based on homology to APOBEC1. The properties common to cytidine deaminases are indicated as follows: zinc ligand amino acids (∗), residue that mediates proton shuttling during catalysis (•). AID and APOBEC3G retain features of APOBEC1, such as a short insert between the two zinc ligand residues (boxed) and two aromatic residues that are implicated in RNA binding (▾).
The mechanism of cytidine deamination
Cytidine deamination by APOBEC1 begins with activation of zinc-bound water in the active center of the enzyme. The N3 on cytidine is then protonated by the zinc-bound hydroxide to form ammonia. This results in the conversion of cytidine to uracil plus ammonia (42). The catalytic center of the cytidine deaminases has three zinc ligands that, in APOBEC1, are histidine in position 61 (H61), cysteines in positions 93 and 96 (C93 and C96), and a residue that mediates proton shuttling during catalysis, glutamic acid in position 63 (E63) (Fig. 1).
The amino acid sequences of AID and APOBEC3G provide clues to their function. Fig. 1 shows the alignment of human APOBEC1, AID, and APOBEC3G, and depicts putative structural domains according to APOBEC1: α helix, active site, linker, and pseudo-active site domains. Three of the four domains are duplicated in APOBEC3G. To facilitate comparison, the APOBEC3G sequence was divided into N-terminal (APOBEC3G-N, residues 1–214) and C-terminal (APOBEC3G-C, residues 215–384) domains (Fig. 1). The cytidine deamination properties of AID and APOBEC3G were deduced from homology with APOBEC1 and from the presence of key catalytic residues (asterisks in Fig. 1). Because AID and APOBEC3G are thought to work on DNA, it is curious that they have two conserved aromatic amino acids that may confer RNA binding properties (arrows) (Fig. 1). The presence of these key residues suggests that AID and APOBEC3G may bind RNA even though they mediate cytidine deamination of DNA substrates (14). Zaim and Kierzek (43) recently suggested that AID might lack an N-terminal α helix and a pseudo-active domain and thus deviate structurally from APOBEC1. Zaim and Kierzek (43) proposed that the active site of AID may be contained in a globular domain spanning from aa 1 to 167, and that, in contrast to APOBEC1, AID may deaminate the substrate in monomeric form.
cis-Acting and trans-acting factors of cytidine deamination
Because cytidine deamination of nucleic acids generates mutations that are potentially tumorigenic, cytidine deamination must be regulated to avoid cancer. Considering how APOBEC1 function is regulated may help understand how this function is accomplished in the case of AID and APOBEC3G. APOBEC1 is the member of the family with the most precise target. APOBEC1 deaminates one cytidine residue in an mRNA molecule that is >14,000 residues long (44, 45). APOBEC1 RNA editing depends on cis-acting elements (contained in the mRNA sequence) and on trans-acting elements. The cis-acting elements comprise a 30-nt sequence flanking the edited base and two other sequences located distal 5′ and 3′ to the edited C. The proximal cis-elements include an 11-nt mooring sequence located 4–6 nt downstream of the edited C (5′-UGAUCAGUAUA-3′) and a consensus binding motif (UUUN(A/U)U) located 3 nt downstream of the edited base (44). The region containing the proximal cis-elements is thought to form a stem loop that situates the cytidine to be deaminated at the apex of the loop (44, 45). Two APOBEC1 monomers embrace the loop, causing the catalytic domains of one of the monomers to be in the right position (44). The distal cis-elements, although not required for cytidine deamination, increase the efficiency of the catalysis (44) (Table I).
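The positional relationship between the mooring sequence and the edited C can be expressed as a simple search, sketched below in Python. Several assumptions are made for illustration only: the RNA fragment is hypothetical, the function name is invented, and "4–6 nt downstream" is interpreted as a spacer of 4 to 6 nucleotides between the candidate C and the first base of the mooring sequence. The sketch considers primary sequence only and ignores the stem loop and the trans-acting factors.

```python
# Minimal sketch: find C residues lying a fixed spacer upstream of the mooring sequence.

MOORING = "UGAUCAGUAUA"  # 11-nt mooring sequence quoted in the text


def candidate_edit_sites(rna, min_spacer=4, max_spacer=6):
    """Return sorted 0-based positions of C residues separated from a mooring
    sequence by min_spacer..max_spacer nucleotides (assumed interpretation)."""
    sites = set()
    start = rna.find(MOORING)
    while start != -1:
        for spacer in range(min_spacer, max_spacer + 1):
            pos = start - spacer - 1
            if pos >= 0 and rna[pos] == "C":
                sites.add(pos)
        start = rna.find(MOORING, start + 1)
    return sorted(sites)


fragment = "GAUACAAUUUGAUCAGUAUAUUA"  # hypothetical fragment with one mooring sequence
print(candidate_edit_sites(fragment))
```

In this fragment the search returns a single candidate, the C at index 4, which is separated from the mooring sequence by four nucleotides.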
Table I. Distinctive features of AID, APOBEC3G, and APOBEC1
Enzyme | Target Nucleic Acid | Target Sequence | Function |
---|---|---|---|
AID | Ig DNA (RNA?) | 5′-dA/dT-dA/dG(pu)-dC-dC/dT(py)-3′ | Somatic hypermutation; isotype class switch; Ig gene conversion |
APOBEC3G/CEM15 | Retroviral cDNA | 5′-dC/dT(py)-dC-dC-3′ | Decreases retroviral infectivity |
APOBEC1 | apo B mRNA | Mooring sequence 5′-UGAUCAGUAUA-3′ located 4–6 nt downstream and a consensus-binding motif (UUUN(A/U)U) located 3 nt downstream of the edited C | Truncated apo B, apo B-48 |
Abbreviations: py, pyrimidine; pu, purine; C, deaminated cytidine.
The cis-acting elements for AID do not evidently direct AID catalysis to a unique site. AID preferentially targets WRCY (where W is A or T, R is a purine, and Y is a pyrimidine) somatic hypermutation hot spots (46, 47). Less clear are the cis-acting elements that target cytidine deamination to the switch regions, even though Sμ, Sα, and Sε switch regions include repeated AGCTC motifs that match the WRCY sequences (31). Tertiary stem loop structures may direct AID deamination of cytidines to switch regions. In fact, DNA tertiary stem or R loop structures, perhaps generated during transcription, may serve as intermediates for class switch recombination (48, 49, 50). Transcription associated with somatic hypermutation (51, 52) may also form tertiary structures in the nontranscribed strand, creating a susceptible target for AID (Table I).
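A scan for WRCY motifs, as defined above, can be sketched in a few lines of Python. The test sequence and the function name are hypothetical; the sketch scans only the strand as written and says nothing about whether a given motif is actually targeted in vivo.

```python
# Minimal sketch: locate WRCY motifs (W = A/T, R = A/G, Y = C/T) in a DNA sequence.
import re

# A lookahead is used so that overlapping motifs are all reported.
WRCY = re.compile(r"(?=([AT][AG]C[CT]))")


def wrcy_hotspots(dna):
    """Return (0-based position of the C, motif) for every WRCY occurrence."""
    return [(m.start() + 2, m.group(1)) for m in WRCY.finditer(dna.upper())]


seq = "GGAGCTCAACTTT"  # hypothetical sequence containing one AGCTC repeat
print(wrcy_hotspots(seq))
```

For this sequence the scan reports the C at index 4 (within AGCT, part of the AGCTC repeat) and the C at index 9 (within AACT).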
The cis-acting elements for APOBEC3G include a consensus sequence (5′-dC/dT(py)-dC-dC-3′) in the minus strand of retroviruses (cDNA). APOBEC3G deaminates preferentially the cytidine closest to the 3′ end of this consensus (37, 38, 39). Reverse transcription of viral RNA may also yield tertiary nucleic acid structures specifically recognized by the enzyme. This possibility awaits investigation (Table I).
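The consensus and the preference for its 3′-most cytidine can be illustrated with a Python sketch that marks each such cytidine in a minus-strand sequence and converts it to uracil. The sequence and function names are hypothetical, and the sketch assumes, unrealistically, that every consensus site is deaminated.

```python
# Minimal sketch: deaminate the 3'-most C of every 5'-(C/T)-C-C-3' site in minus-strand cDNA.
import re

# A lookahead is used so that overlapping consensus sites are all found.
CONSENSUS = re.compile(r"(?=[CT]CC)")


def apobec3g_targets(minus_strand):
    """Return 0-based positions of the 3'-most C in every py-C-C occurrence."""
    return [m.start() + 2 for m in CONSENSUS.finditer(minus_strand.upper())]


def deaminate_targets(minus_strand):
    """Return the strand with every target C replaced by U (all sites assumed edited)."""
    bases = list(minus_strand.upper())
    for pos in apobec3g_targets(minus_strand):
        bases[pos] = "U"
    return "".join(bases)


cdna = "ATCCGACCCTG"  # hypothetical minus-strand fragment, written 5' to 3'
print(apobec3g_targets(cdna))
print(deaminate_targets(cdna))
```

Targets are computed on the original sequence before any base is changed, so an edit at one site does not remove an overlapping site from consideration.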
Sequence specificity and tertiary structure of nucleic acids may not suffice to target cytidine deamination, because APOBEC1, AID, and APOBEC3G cause dC-to-dT transition mutations in multiple sites of E. coli DNA (14, 46, 53). Thus, trans-acting factors may target cytidine deamination, thus regulating catalysis. APOBEC1 is regulated by cofactors. By itself, APOBEC1 does not edit apo B mRNA (44). The minimal unit that catalyzes cytidine deamination of apo B mRNA is composed of a homodimeric APOBEC1 and an APOBEC1 competence factor (44). Other cofactors modulate the activity and specificity of the core enzyme, provide additional RNA binding surface, or have regulatory properties and are not required for cytidine deamination per se (44).
Cytidine deamination by AID may also require cofactors. AID cofactors may determine whether cytidine deamination gives rise to somatic hypermutation, class switch recombination, or gene conversion, and the site of action. Consistent with this concept, Doi et al. (54) showed that AID-dependent switch recombination requires protein synthesis, suggesting that perhaps a cofactor is needed. Barreto et al. (55) and Ta et al. (56) found that some AID mutants allow somatic hypermutation to occur but do not allow class switch recombination. One possible explanation for the bifurcated function of AID is that mutant and wild-type AID associate with different sets of cofactors and that the cofactors target catalysis to different substrates or cause different outcomes following cytidine deamination. Another explanation is that association with cofactors promotes cell survival in the presence of cytidine deamination.
Trans-acting factors also regulate APOBEC3G. Cytidine deamination of retroviral cDNA by APOBEC3G is suppressed by the Vif produced by HIV (57, 58). Vif impairs both the translation and the stability of APOBEC3G mRNA and promotes the degradation of APOBEC3G by the 26S proteasome, depleting APOBEC3G from cells, thus blocking antiviral activity of APOBEC3G (40, 59, 60, 61, 62). Sheehy et al. (36) and Khan et al. (63) found that APOBEC3G is packed into virions, suggesting that the enzyme plus the viral RNA direct APOBEC3G catalysis on viral cDNA. Because mutation of viral cDNA determines viral infectivity, expression of APOBEC3G by the host cell restricts HIV infection, and this restriction depends on Vif (40, 57, 64). Whether there is a eukaryote counterpart to Vif to limit APOBEC3G expression is an important question, because APOBEC3G is found in a variety of tissues besides spleen and PBL, including testis and ovary (65).
Cell-specific expression of cytidine deaminases
Cell-specific expression restricts cytidine deamination in the organism. APOBEC1 is expressed mainly by intestinal epithelial cells, where it edits the mRNA encoding apo B (1), whereas AID is expressed mainly by activated B cells (7). Human AID expression is induced by IL-4 and repressed by Abs directed against CD45, suggesting that stimuli that activate B cells induce expression of AID (66). In fact, transcription of the AID gene is regulated by factors, such as PAX5 (19) and E factors (67), that enhance transcription of other B cell-specific genes and by factors that repress B cell gene expression, the inhibitor of differentiation (Id) factors, Id2 (19) and Id3 (67). In contrast, APOBEC3G, although not restricted to a single cell type, is mainly found in spleen, gonads, and peripheral blood leukocytes (65). Whether or not expression of APOBEC3G can be induced is not known.
Intracellular localization and trafficking of cytidine deaminases
The function of nuclear factors is often regulated by limiting nuclear localization. For example, nuclear localization of PMS2, a mismatch repair protein necessary for efficient somatic hypermutation (21), is limited by dimerization to another mismatch repair protein, MLH1 (68).
Nuclear localization is regulated in part by size. Proteins smaller than 20–30 kDa diffuse freely through the nuclear pores and thus distribute equally in the nucleus and cytoplasm (69). APOBEC1 at 27 kDa (70) and AID at 24 kDa (7) should therefore diffuse freely into the nucleus. However, Yang and Smith (71) and Rada et al. (72) found APOBEC1 and AID predominantly in the cytoplasm. Retention in the cytoplasm could be due to association with other soluble proteins forming high-molecular-mass complexes or with fixed components of the cytoplasm or membranes. Consistent with these possibilities, Yang and Smith (71) showed that the C-terminal 24 aa of APOBEC1, which mediate homodimerization, also contribute a cytoplasmic retention domain. Disruption or neutralization of this domain of APOBEC1 leads to nuclear localization of APOBEC1 (71).
What exactly retains APOBEC1 in the cytoplasm is unknown. However, Xnf 7 may provide a model of regulating nuclear localization. Xnf 7 is retained in the cytoplasm by interaction with an anchor protein that is regulated by phosphorylation (73). If the interaction of Xnf 7 with cytoplasmic anchors is disrupted, Xnf 7 is directed to the nucleus. Chester et al. (74) suggested that release of APOBEC1 from its cytoplasmic anchor causes nuclear import of APOBEC1 associated with a cofactor (APOBEC1 competence factor). Chester et al. (74) showed that APOBEC1 shuttles between the nucleus and the cytoplasm using noncanonical nuclear localization signals at the N-terminal end and a nuclear export sequence at the C-terminal end. Apo B RNA editing requires nuclear localization of APOBEC1 (75), and nuclear shuttling may protect the edited RNA from degradation in the cytoplasm (74).
AID, like APOBEC1, localizes predominantly in the cytoplasm. Ito et al. (76) recently found that AID may shuttle between the nucleus and the cytoplasm. Early regulatory events appear to release AID from an unknown cytoplasmic anchor, allowing it to enter the nucleus and initiate cytidine deamination (M. Cascalho and X. Wu, unpublished observation). Regulation of nuclear localization may be essential to limit the availability of AID substrates.
Inappropriate cytidine deamination and tumor formation
Inappropriate expression of cytidine deaminases may be tumorigenic. Consistent with this, inappropriate or dysregulated expression of cytidine deaminases is often found in tumors (2). Inappropriately expressed APOBEC1 edits AU-rich sequences found in the 3′-untranslated regions of RNAs, where these sequences promote rapid RNA degradation. Thus, APOBEC1 modifies the RNA encoding c-myc, increasing its stability. In certain neurofibromatosis tumors, APOBEC1 edits the neurofibromatosis gene-1, producing an early stop codon (77).
Inappropriate expression of AID increases the mutation rate of genes that are not normal targets of AID (78). Constitutive expression of an AID transgene causes point mutations in the TCR, in c-myc genes, and in T cell lymphomas. AID is constitutively expressed in B cell neoplasms, including human germinal center non-Hodgkin lymphomas, non-germinal center non-Hodgkin B cell lymphomas (79), and chronic lymphocytic leukemia (80). Inappropriate expression of AID may also promote tumorigenesis in other cell types, because AID transgenic mice develop microadenomas of the bronchiolar lining (78).
Because purified APOBEC3G deaminates dC to dU in ssDNA substrates in vitro (39) and converts dC into dU in E. coli (53), one might predict that inappropriate expression of APOBEC3G would mutate nonretroviral nucleic acid. APOBEC3G is expressed in human colorectal adenocarcinoma and Burkitt lymphoma (65), suggesting that it may contribute to tumorigenesis, perhaps by inappropriate editing of transcripts or by mutating DNA. Further study is needed to determine whether APOBEC3G mutates eukaryotic cellular DNA or edits RNA, and, if so, what the targets are.
Conclusion
AID and APOBEC3G contribute to host defense by deaminating cytidines in nucleic acids encoding Igs or in retroviral genomes. AID and APOBEC3G may deaminate dC to dU directly in the genomic DNA or in the viral cDNA, respectively. Whether AID and APOBEC3G deaminate cytidines in mRNA is still controversial. Because AID and APOBEC3G introduce mutations at multiple sites, diversification of Abs and control of retroviruses may come at the cost of an increased risk of tumor formation. Our next challenge is to discover how cytidine deamination is controlled. This may enable the design of novel strategies for the prevention of viral dissemination.
Footnotes
Work in the author’s laboratory is supported by Grant AI 48602 from the National Institutes of Health.
Abbreviations used in this paper: apo, apolipoprotein; APOBEC, apo B RNA-editing cytidine deaminase; AID, activation-induced cytidine deaminase; UNG, uracil DNA glycosylase; Vif, viral infectivity factor; Id, inhibitor of differentiation.