
University of Groningen Known and unknown functions of TET dioxygenases: the potential of inducing DNA modifications in Epigenetic Editing Chen, Hui


Academic year: 2021


Known and unknown functions of TET dioxygenases: the potential of inducing DNA modifications in Epigenetic Editing

Chen, Hui

DOI:

10.33612/diss.168496242

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2021

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Chen, H. (2021). Known and unknown functions of TET dioxygenases: the potential of inducing DNA modifications in Epigenetic Editing. University of Groningen. https://doi.org/10.33612/diss.168496242

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.


Chapter 6

Summary & general discussion and future perspectives


SUMMARY

In this thesis, we explored locus-specific targeted DNA demethylation of epigenetically silenced genes through epigenetic editing. First, we targeted various candidates with potential "demethylase" activity in the A2780, Skov3 and H134s ovarian cancer cell lines. These candidates included the TET dioxygenase family members TET1 and TET3, the cytosine deaminases Aid and Apobec1, and the nucleotide excision repair (NER) pathway component Gadd45a. Full-length coding sequences of Gadd45a, Aid and Apobec1, and truncated coding sequences of TET1 and TET3 (including the catalytic domains), were fused with the coding sequences of the ICAM-1 (CD54) and EpCAM (Up2, also named ZFB) zinc fingers (ZFs) to form ZF-effector domain (ZF-ED) constructs. Next, we confirmed by immunoprecipitation followed by Western blot analysis that all ZF-ED constructs expressed fusion proteins of the expected size. Subsequently, we assessed their effects on the level of DNA methylation at the target genes and on target gene expression. Targeting of Aid, Apobec1 and Gadd45a did not result in significant demethylation at CpG sites in the target region, despite detectable expression of the ZF-EDs. Interestingly, TET1 and TET3 induced significant demethylation at a CpG site in the target region of the ICAM-1 promoter, even though the transduction efficiency was very low (Chapter 2).

This suggested to us that TET dioxygenases can indeed be promising inducers of targeted demethylation. Therefore, we further tested the targeted demethylation effect of all three TET family members in the human ovarian cancer cell line A2780. At the same time, to overcome the problem of low transduction efficiency, we sorted the successfully transduced cells to ensure that the cells used for analysis expressed the ZF-TETCD construct. We showed demethylation of targeted CpG sites in the ICAM-1 and EpCAM promoters for targeted TET2 and, to a lesser extent, for TET1, but not for TET3. Interestingly, we also observed a small but significant increase in ICAM-1 transcription after expression of TET2CD, but not after expression of the TET2CD mutant. In contrast to the data obtained for ICAM-1, we did not observe re-activation of EpCAM transcription by Up2-TET2CD. These data indicate that induction of target gene expression might be obtained by targeted TET2, but that the activation of transcription depends on the location of the demethylated target CpG sites in the target gene promoter and/or on other promoter characteristics. To address genome-wide effects of the targeted TET fusions, we analyzed the methylation level of the LINE-1 repeat sequence (17% of the human genome). No DNA demethylation was detected at the three core CpG sites of the LINE-1 promoter after treatment with any of the targeted candidate effector domains. To our knowledge, we were the first to actually induce TET-mediated DNA demethylation at a hypermethylated site of interest, and Chapter 3 describes an interesting approach for further studying the mechanism of TET-induced DNA demethylation in endogenous chromatin contexts.

At present, studies on the catalytic properties and physiological functions of TET dioxygenases focus on mammalian contexts. However, TET homologues can also be found in lower eukaryotes; for example, eight TET homologues have been identified in Chlamydomonas reinhardtii (C. reinhardtii). Yet the amino acid residues that interact with 2-OG in the C. reinhardtii TET homologues are not identical to those in mammalian TETs, suggesting that their catalytic properties may differ from those of mammalian TETs. To better understand the evolutionary history of TET dioxygenases and to provide new insights into the mechanism of DNA demethylation, we explored the catalytic properties and physiological functions of TET homologues in the more ancient organism C. reinhardtii. We first discovered that, in vitro, CrTET1 (also named 5-methylcytosine modifying enzyme, CMD1) can convert 5mC into two novel, stereoisomeric DNA modification products (named 5-glyceryl-methylcytosine, 5gmC). It was therefore necessary to obtain CMD1 mutants to further verify the catalytic reaction and to reveal the physiological functions of CMD1 in C. reinhardtii. However, effective targeted gene editing tools for nuclear genes have been lacking, which greatly limits the study of genes in C. reinhardtii. Therefore, we first optimized a gene editing method based on Cas9-gRNA ribonucleoproteins (RNPs) and developed a strategy for co-selection of a target gene mutant and an MAA7 mutant. MAA7 encodes the tryptophan synthase β-subunit, which catalyzes the last step of tryptophan synthesis and produces tryptophan from serine and indole. C. reinhardtii becomes resistant to 5-fluoroindole (5-FI) when a severe disruption, e.g. a frameshift mutation, occurs in MAA7. Using the co-selection strategy, we obtained CMD1 mutants and mutants of VTC2, the gene responsible for the production of vitamin C (VC) in C. reinhardtii (Chapter 4).

Subsequently, using these mutants, we confirmed that CMD1 can modify 5mC in the C. reinhardtii genome and that VC is used as the source of the glyceryl group added to 5mC. This also suggests that small-molecule metabolites can be directly involved in DNA modification and may be involved in transcriptional regulation. In CMD1 mutants, the content of 5gmC decreased significantly while the content of 5mC increased, suggesting that the production of 5gmC is related to demethylation of 5mC. Phenotypically, we found that the tolerance of CMD1 mutants to strong light was significantly reduced, and that non-photochemical quenching (NPQ) capacity and electron transfer rate (ETR) were also significantly reduced, indicating that the CMD1 mutant has an obvious defect in its photoprotection mechanism. Subsequently, we explored the molecular mechanism of this defect and found that the expression level of the NPQ core factor Light-harvesting complex stress-related 3 (LHCSR3) was significantly reduced in the CMD1 mutant. Further bisulfite sequencing showed that the methylation levels of sequences upstream of LHCSR3 were significantly increased in the CMD1 mutant, inhibiting the expression of this gene and causing the photoprotection defect. Our study thus reveals a new eukaryotic DNA base modification and its involvement in a functionally conserved but mechanistically divergent DNA demethylation pathway for the epigenetic regulation of photosynthesis (Chapter 4).

In Chapter 4, we demonstrated the isolation of mutants of two non-phenotypic target genes, CMD1 and VTC2, by CRISPR-driven gene editing combined with a co-selection strategy. However, the efficiency of obtaining CMD1 mutants was very low: we identified only one mutant among 986 5-FI-resistant colonies. By targeting the FKB12, FTSY and IFT46 genes, we then found that, in the co-selection strategy, the editing efficiency of target genes mediated by Cas9-gRNA RNPs is positively related to their transcriptional level. As an alternative, improving the accuracy with which candidate mutants are screened from the cell population could also improve the isolation efficiency of mutants for genes with a low transcriptional level, including CMD1. Therefore, we developed a microhomology-mediated integration of donor DNA and a targeted integration-dependent two-step screening process, enabling us to effectively isolate the desired mutants from cell populations and pools of resistant colonies. Through this strategy, we identified 10 CMD1 mutants among 120 Hygromycin B-resistant colonies, increasing the efficiency about 80-fold compared with the co-selection strategy (Chapter 5).
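The roughly 80-fold figure follows directly from the two colony counts reported above. The short calculation below is only an illustrative check of that arithmetic; the variable names are ours and do not come from the experimental chapters.

```python
from fractions import Fraction

# Colony counts as reported in the summary above.
co_selection = Fraction(1, 986)   # 1 CMD1 mutant among 986 5-FI-resistant colonies
two_step = Fraction(10, 120)      # 10 CMD1 mutants among 120 Hygromycin B-resistant colonies

# Ratio of the two isolation efficiencies.
fold_increase = two_step / co_selection

print(f"co-selection efficiency: {float(co_selection):.2%}")
print(f"two-step screening efficiency: {float(two_step):.2%}")
print(f"fold increase: {float(fold_increase):.1f}")
```

The ratio is 986/12, i.e. slightly above 82, which is consistent with the "about 80-fold" stated in the text.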

We further found that these CMD1 mutants could be divided into two types. One type integrated a single complete donor DNA at the Cas9 cleavage site and was named the CMD1 “one copy” mutant, while the other type integrated two donor DNAs, joined by a linker DNA fragment, at the Cas9 cleavage site and was named the CMD1 “two copies” mutant. Moreover, DNA sequencing of the 5' and 3' junctions of the insertion sites showed no base deletions. This indicates that, for all CMD1 mutants, the integration event was a microhomology-mediated and recombination-dependent knock-in of donor DNA. In addition, our screenings showed that CrTET2, a paralogue of CMD1, does not exist in the CC-5325 strain, which will be helpful in further revealing the functions of CMD1 and 5gmC. This approach opens up a new avenue for improving the mutant isolation efficiency for candidate genes with a low transcription level and thereby helps to reveal their functions. As such, this study provides an effective platform for targeted knock-out or knock-in of any gene of interest by combining current methods with the previously demonstrated co-selection strategy (Chapter 5).

GENERAL DISCUSSION

Based on the existing experimental results and conclusions, the viewpoints that need further discussion and future research directions are presented as follows.

Targeted DNA demethylation by Aid and Apobec1

Evidence coupling DNA deamination with the base excision repair (BER) pathway came from studies in zebrafish embryos. Demethylation of injected DNA fragments, as well as of the whole genome, occurs in a time-dependent manner that coincides with the up-regulation of AID/APOBEC genes (1). Knockdown of AID results in loss of neurons in 1-day embryos, due at least in part to increased CpG methylation at the neurod2 promoter (1). Based on the zebrafish model, it was proposed that removal of 5mC by AID/APOBEC1 is a two-step process. First, 5mC is deaminated to thymine (T) by Aid or Apobec1, which leads to the formation of a G/T mismatch in the genome. Then, the G/T mismatch is recognized and excised by one of several DNA glycosylases, including TDG, and the resulting AP site is repaired to an unmethylated cytosine by the BER pathway. However, two studies in Aid and Apobec1 knockout mouse models showed that, although the expected defects in B cell maturation and immune deficiency arose because of the canonical function of these enzymes in immune cells, the knockout mice were still viable and fertile (2, 3). This suggests that Aid and Apobec1 play only a limited role in the active erasure of 5mC marks as observed in the male pronucleus of the zygote. Nevertheless, studies in mouse primordial germ cells (PGCs), in heterokaryons of fused mouse embryonic stem cells (ESCs) and human fibroblasts, and in induced pluripotent stem cells (iPSCs) support a potential role for Aid- or Apobec1-mediated 5mC deamination in DNA demethylation (4, 5, 6).

Based on the above evidence, we tested Aid and Apobec1 as potential effector domains for targeted DNA demethylation. Flow cytometry showed that the transduction efficiencies of the CD54/Up2-Aid and CD54/Up2-Apobec1 expression constructs were over 70%, and expression of the CD54/Up2-Aid and CD54/Up2-Apobec1 fusion proteins in host cells was further confirmed by Western blot. Nevertheless, neither Aid nor Apobec1 induced significant demethylation at any CpG site in the target region (Chapter 2). There may be two reasons for this. First, Aid and Apobec1 act efficiently only on single-stranded, not double-stranded, DNA (7, 8). Fusing Aid or Apobec1 to zinc finger proteins may therefore be inefficient for targeted demethylation, because zinc finger proteins bind only to double-stranded target DNA sequences and cannot open the DNA. Second, in vitro enzyme activity analysis showed that the activity of AID/APOBEC on a 5mC substrate is 10- to 20-fold lower than on their canonical substrate, unmethylated cytosine (9).


To overcome the first obstacle, the zinc finger protein can be replaced as the DNA binding domain by a Cas9 mutant without nuclease activity (also known as deactivated Cas9, dCas9). In the CRISPR/Cas9 system, Cas9 proteins and guide RNAs (gRNAs) form Cas9-gRNA ribonucleoprotein (RNP) complexes, which scan the genome for the protospacer-adjacent motif (PAM) sequence, while part of the 5' terminal sequence of the gRNA (usually about 20 nt) attempts to bind to the DNA sequence immediately upstream of the PAM. When the two match perfectly, the conformation of Cas9 changes and its nuclease activity is activated. Subsequently, Cas9 generates a double-stranded DNA (dsDNA) break three bases upstream of the PAM sequence. This feature is widely used for targeted insertion of DNA into host genomes. During binding, an R-loop is formed when the gRNA pairs with its complementary DNA sequence, which opens the dsDNA at the target site. By exploiting this feature of Cas9, recently developed base editors (BEs), which combine a cytidine deaminase with Cas9, have been successfully applied to perform targeted base editing, including C-to-T edits (10, 11, 12, 13).
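The targeting logic described above (a PAM, an approximately 20 nt protospacer immediately upstream of it, and a cut three bases upstream of the PAM) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a design tool: the PAM pattern NGG (as for the commonly used Streptococcus pyogenes Cas9) is our assumption and is not specified in the text, only the given strand is scanned, and the function name and returned fields are hypothetical.

```python
import re

def find_cas9_sites(seq, spacer_len=20):
    """List candidate target sites on the given strand.

    Assumes an NGG PAM (an assumption of this sketch) and a protospacer of
    spacer_len nucleotides directly upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream for a full protospacer
        sites.append({
            "protospacer": seq[pam_start - spacer_len:pam_start],
            "pam": match.group(1),
            # Blunt cut three bases upstream of the PAM: the break lies
            # between seq[cut_index - 1] and seq[cut_index].
            "cut_index": pam_start - 3,
        })
    return sites

# Toy sequence (hypothetical): 4 nt flank, 20 nt protospacer, PAM, 2 nt flank.
example = "TTTT" + "ACGT" * 5 + "AGG" + "TT"
for site in find_cas9_sites(example):
    print(site)
```

A complete search would also scan the reverse complement strand; that is omitted here to keep the sketch short.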

However, cytosines in CpG dinucleotides outside CpG islands are usually methylated in mammalian cells (14), and methylation of cytosine strongly inhibits the cytidine deamination catalyzed by the Apobec1 and Aid deaminases. Regarding this second obstacle, it has been reported that the deamination activity of APOBEC3A on 5mC is significantly higher than that of Aid and Apobec1 (15). Therefore, APOBEC3A could replace Aid or Apobec1 as an effector domain for targeted DNA demethylation mediated by 5mC deamination. Indeed, a recent study has shown that APOBEC3A-conjugated base editors (BEs) can mediate efficient C-to-T base editing in regions with high methylation levels and GpC dinucleotide content (16). Therefore, to determine whether targeted demethylation can be achieved by deamination, APOBEC3A should be fused to the CRISPR/Cas9 system, which can open dsDNA. Future studies are also needed to clarify the exact biochemical mechanism of AID/APOBEC-mediated DNA demethylation.

Targeted DNA demethylation by Gadd45a

In addition to BER, another DNA damage repair pathway, the nucleotide excision repair (NER) pathway, has been proposed to play a role in DNA demethylation. The NER pathway usually repairs damaged DNA with bulky lesions caused by exposure to chemicals or radiation (17). In 2007, the Niehrs group first reported that growth arrest and DNA-damage-inducible protein 45a (Gadd45a) is involved in DNA demethylation through the NER pathway (18). Later studies in zebrafish and mammalian cells also suggested that Gadd45a may play a role in inducing active DNA demethylation (19, 20). However, two subsequent independent studies questioned the role of Gadd45a in active DNA demethylation. In the first study, the Pfeifer group carried out a series of experiments similar to those of the Niehrs group, but found no evidence that Gadd45a contributes to DNA demethylation (21). The second study, in a Gadd45a knockout mouse model, showed that loss of Gadd45a function affected neither site-specific nor global DNA methylation levels (22). In line with these negative findings, we did not find a targeted demethylation effect of Gadd45a in our study (Chapter 2). For a long time, it was not clear how the demethylation process is initiated and whether Gadd45a is directly involved. Recently, a study reported that the demethylation previously suggested to be directly induced by GADD45A depends on its binding to R-loops and the subsequent recruitment of TET1 (23). It may be that R-loops are not formed in the ZF target region of the ICAM-1 or EpCAM promoter in our study. In addition, even if R-loops are formed, the expression of TET in cancer cell lines is very low (24, 25, 26), so it is likely that no TET1 was recruited to the target sites and demethylation could not be initiated. As an alternative, co-transduction of GADD45A and TET1 may be a good option for targeted DNA demethylation. However, “TET1 only” should be used as a control to determine whether GADD45A combined with TET1 has a better targeted demethylation effect than TET1 alone.

Targeted DNA demethylation by TET family proteins

In mammalian cells, early efforts to identify active DNA demethylation mechanisms focused on testing various known DNA modifying proteins, such as DNA repair enzymes, cytosine deaminases and DNA glycosylases. However, these studies did not lead to a widely accepted consensus on a universal mechanism. In addition, the proposed candidate factors usually induce DNA demethylation only to a limited extent and only in a very specific biological environment. More importantly, in many cases these results could not be confirmed and supported by follow-up studies (27, 28). A breakthrough occurred in 2009, when an oxidized derivative of 5mC, 5-hydroxymethylcytosine (5hmC), was rediscovered and confirmed to exist in the mammalian genome. More importantly, TET dioxygenases were identified as responsible for the catalytic conversion of 5mC to 5hmC in the presence of the cofactors Fe2+ and 2-OG (29, 30). Interestingly, three subsequent studies revealed that TET proteins can further oxidize 5hmC to 5-formylcytosine (5fC) and 5-carboxycytosine (5caC) (31, 32, 33). These findings shed further light on the mechanism of DNA demethylation. On the one hand, the two higher oxidative derivatives, 5fC and 5caC, can be removed passively by replication-dependent dilution, since no protein factor is known that can maintain 5fC and 5caC on the daughter strand after DNA replication (34). On the other hand, the Xu group found that TDG can specifically recognize and remove 5caC, and that the resulting AP site can be restored to an unmethylated cytosine via the BER pathway, thus realizing active DNA demethylation in the true sense (31). At present, although several mechanisms independent of TET dioxygenases have been proposed to be involved in active DNA demethylation (35, 36), the route via TET proteins together with the TDG-BER pathway has been widely accepted. As TET dioxygenase-mediated iterative oxidation of 5mC plays an important role in DNA demethylation, it is reasonable to believe that TET family members have a bright future as effector domains in targeted DNA demethylation.
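The active route summarized above can be written down as a simple chain of state transitions. The sketch below only encodes the steps named in this section (iterative TET oxidation, removal of 5caC by TDG, and BER restoring unmethylated cytosine); it is a bookkeeping illustration with hypothetical names, not a kinetic or mechanistic model.

```python
# Iterative oxidation steps attributed to TET dioxygenases in the text.
TET_OXIDATION = {"5mC": "5hmC", "5hmC": "5fC", "5fC": "5caC"}

# Only the TDG substrate named in this section is listed here.
TDG_SUBSTRATES = {"5caC"}

def active_demethylation_path(base="5mC"):
    """Return the sequence of states from `base` to unmethylated cytosine."""
    path = [base]
    # TET-mediated oxidation continues until no further oxidation step exists.
    while path[-1] in TET_OXIDATION:
        path.append(TET_OXIDATION[path[-1]])
    # TDG excises the oxidized base, leaving an AP site that BER repairs to C.
    if path[-1] in TDG_SUBSTRATES:
        path.extend(["AP site", "C"])
    return path

print(" -> ".join(active_demethylation_path()))
```

The passive route (replication-dependent dilution of 5fC and 5caC) is not represented, because it depends on cell division rather than on a fixed sequence of enzymatic steps.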

Since the C-terminal catalytic domain of all three TET family members has the ability to locate to the nucleus and oxidize 5mC (30, 37, 38), we fused the DNA sequence encoding the catalytic domain of TET proteins with the DNA sequence encoding gene-specific zinc finger protein. This way the large size of ZF-ED fusion protein is reduced to improve the expression efficiency, while the N-terminal non-catalytic domain of TET which binds DNA is also excluded. In the initial experiment, we observed that only TET1CD had a slight but significant demethylation effect on only one CpG site in the target region of ICAM-1 promoter (Chapter 2). Because of their large size, the transduction efficiency of TET1CD and TET3CD is lower than that of Aid, Apobec1, and Gadd45a. This suggests that a significant demethylation might not be detected due to the presence of a large number of cells that had not been successfully transduced to express TET1CD or TET3CD, which dilute the effect of the transduced cells.

To overcome the problem that targeted demethylation could not be detected at the target CpG sites of the ICAM-1 promoter because of low transduction efficiency, we introduced FACS sorting of the transduced cells in subsequent experiments to ensure that the cells used for analysis expressed the CD54-TETCD or ZFB-TETCD construct. Thanks to this optimization, we detected a significant demethylation effect of CD54-TET2CD on three CpG sites in the target region of the ICAM-1 promoter. CD54-TET1CD had a significant demethylation effect on one CpG site, while CD54-TET3CD had no significant demethylation effect on the three tested target CpG sites (Chapter 3). Thus, among the three TET family members tested, TET2CD showed the strongest demethylation ability. Consistent with our observations, TET2CD also showed the strongest ability to oxidize 5mC to 5hmC in an in vitro experiment (37). This may reflect different affinities of the TET catalytic domains for 5mC DNA substrates. For TET3CD, immunoprecipitation combined with Western blotting showed that its expression level was the lowest of the three TET members, so whether it has a targeted demethylation effect needs further clarification. In fact, Zeisberg et al. recently reported the development of a high-fidelity CRISPR/Cas9-based gene-specific dioxygenase, obtained by fusing an endonuclease-deactivated high-fidelity Cas9 (dHFCas9) to the catalytic domain of TET3 (TET3CD). Their results showed that the dCas9-TET3CD and dHFCas9-TET3CD fusion proteins induced targeted demethylation of the RASAL1 promoter, both in vitro and in vivo (39).

For the other target gene, EpCAM, ZFB-TET2CD showed only slight but significant demethylation at one CpG site immediately downstream of the zinc finger protein binding sequence, and no significant demethylation at the other two CpG sites. ZFB-TET1CD and ZFB-TET3CD showed no significant demethylation at any of the three target CpG sites (Chapter 3). We believe that, under the same targeting strategy, the difference in demethylation between ICAM-1 and EpCAM is due to a difference in chromatin accessibility between the two targeted regions. Four arguments support this view. First, the CD54-TETCD constructs targeting ICAM-1 and the ZFB-TETCD constructs targeting EpCAM showed the same transduction efficiency and expression level in the host cell line A2780, which excludes the possibility that the difference in demethylation was due to differences in transduction efficiency or fusion protein expression level. Second, in A2780 cells, the average methylation level of the five CpG sites in the CD54-TETCD target region was 71%, while that of the three CpG sites in the ZFB-TETCD target region was 94%. This indicates that the ZFB-TETCD target region on the EpCAM promoter is in a more hypermethylated state, which might reflect a denser chromatin structure (40, 41, 42). A dense chromatin structure significantly reduces the accessibility of target sequences to DNA binding proteins, including transcription factors (43, 44). Thus, the expressed ZFB-TETCD fusion proteins cannot effectively bind to their target sequence, with the result that the target CpG sites are not effectively demethylated. Third, as a positive control, CD54-VP64 induced an average 450-fold increase in the transcription level of the target gene ICAM-1 and significant demethylation at four target CpG sites (2 CpG sites in the ZF binding region and 2 CpG sites in the target region), while ZFB-VP64 induced no significant transcriptional activation of the target gene EpCAM and only slight demethylation at one target CpG site in the ZF binding region. The observed demethylation at the binding site reflects inaccessibility of the ZF-bound genomic DNA for (or competition with) the maintenance DNA methyltransferase DNMT1, due to steric hindrance by ZF binding. Indeed, flow cytometry and Western blotting showed that both the ZF-only and the ZF-VP64 constructs were transduced and expressed with high efficiency. Because no DNA demethylation was detected at target CpG sites outside the ZF binding region on the ICAM-1 promoter, and because VP64 is a small domain (about 7 kDa), the additional VP64-dependent demethylation might be a secondary effect of the re-activation of ICAM-1 transcription. Therefore, since both CD54-VP64 and ZFB-VP64 are highly expressed in A2780 cells, and the transcription activator VP64 has no preference for DNA substrates (45), the different activation of the two target genes must be due to a clear difference in the DNA binding efficiency of the zinc finger proteins for their target sequences. Fourth, the engineered zinc finger CD54 was obtained by screening and further optimization (46), while the zinc finger ZFB was obtained by conventional modular design (47, 48). In conclusion, these results suggest that the effect of targeted DNA demethylation depends not only on the selected effector domain, but also on the DNA binding domain and on the methylation level and chromatin accessibility of the target genes.
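Average methylation levels such as the 71% and 94% quoted above are derived from per-CpG methylation calls. As a minimal sketch of how such values can be computed from bisulfite sequencing reads, the code below relies on the general principle that bisulfite treatment converts unmethylated cytosine so that it is read as T, while methylated cytosine is still read as C. The reference and reads are invented toy data, the reads are assumed to be already aligned and of full reference length, and the function name is hypothetical; this is not the analysis pipeline used in Chapter 3.

```python
def cpg_methylation(reference, reads):
    """Return {position: methylated fraction} for every CpG in the reference.

    Reads are assumed to be bisulfite-converted, aligned without gaps, and
    on the same strand as the reference (assumptions of this sketch).
    """
    ref = reference.upper()
    cpg_positions = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    levels = {}
    for pos in cpg_positions:
        calls = [read[pos].upper() for read in reads if len(read) > pos]
        methylated = calls.count("C")    # C retained: cytosine was methylated
        unmethylated = calls.count("T")  # C read as T: cytosine was unmethylated
        total = methylated + unmethylated
        levels[pos] = methylated / total if total else None
    return levels

# Toy data (hypothetical): two CpG sites, four reads.
reference = "ACGTTCGA"
reads = ["ACGTTCGA", "ATGTTCGA", "ATGTTTGA", "ACGTTCGA"]

levels = cpg_methylation(reference, reads)
called = [v for v in levels.values() if v is not None]
average = sum(called) / len(called)
print(levels, f"average = {average:.1%}")
```

Averaging the per-site fractions, as done in the last lines, gives one region-level value comparable in form to the percentages reported for the two target regions.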

Possible DNA binding domains for targeted DNA demethylation

Epigenetic editing systems rely on two components: an engineered, protein-based DNA-binding domain (DBD) and an effector domain (ED) that either has a specific catalytic activity to modify the targeted chromatin or acts as a scaffold to recruit other factors, such as the transcription machinery (49, 50, 51). Commonly employed DBDs include zinc finger proteins (ZFPs) and transcription activator-like effectors (TALEs), although these seem to be outperformed by the type II clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 system (52). ZFPs and TALEs are both DNA binding domains that rely on protein/DNA interaction, and the peptides that specifically interact with target DNA sequences are produced by engineered modular design. In contrast, in the CRISPR/Cas9 system, whether it concerns a fully active Cas9 or an inactivated Cas9, binding to the target site depends on complementary base pairing of the single-stranded guide RNA (gRNA) with the target DNA sequence.

C2H2 zinc finger proteins (ZFPs) were the first reported DNA binding proteins that interact with a target DNA sequence in a modular and predictable way. Each zinc finger module consists of about 30 amino acids and can specifically recognize and bind three base pairs in the target DNA sequence. In principle, the specificity of ZFPs for target DNA sequence recognition comes from a part of the alpha helix, also called the “recognition domain”, which binds to the major groove of the DNA double helix (53). In 2002, the Pabo group was the first to use the ZFP system as a DNA binding domain for epigenetic editing (54). They engineered a zinc finger protein by modular design to specifically target the VEGF-A gene and fused it with a histone methyltransferase. The results showed that the level of H3K9 methylation increased at the target site and that the expression of VEGF-A was repressed (54). This finding long went unnoticed (49), but following the rise in interest in epigenetics, the number of studies targeting endogenous genes for epigenetic editing has taken off since 2013. Since then, modular ZFPs have been fused with histone methyltransferases (HMTs) (55) and DNA methyltransferases (DNMTs) to repress the expression of target genes (56, 57, 58, 59, 60). Alternatively, ZFPs can be used to activate target gene expression by fusing them with a transcriptional activator or with a potential DNA demethylase, as demonstrated in this thesis. A commonly used activation domain is VP16, a viral activation domain that recruits the Pol II transcriptional machinery (61, 62). Moreover, a tetramer of VP16 domains, named VP64, has been fused to gene-specific ZFPs to activate endogenous target genes by targeting their promoters, and showed a stronger activation effect (63). We also fused VP64 with zinc finger proteins, including one that specifically targets ICAM-1 (46), and successfully up-regulated ICAM-1 expression in ovarian cancer cell lines, resulting in a reduced growth rate of the cancer cells (64). Mechanistically, VP64 not only recruits the RNA polymerase transcriptional machinery more strongly than VP16, but also recruits remodeling factors and has been linked to the deposition of activating histone marks, including H3K27ac and H3K4me, resulting in increased chromatin accessibility (65, 66, 67).

In our study, CD54-VP64 also showed a very strong activation effect on the target gene ICAM-1 (a 500-fold increase compared with the control). In contrast, CD54-TET2CD increased the expression of ICAM-1 only 2.5-fold (Chapter 3). From a clinical point of view, we believe that the more physiological transcriptional activation of target genes induced by TET2CD-targeted demethylation has an advantage over that mediated by VP64. Because the DNA sequence encoding VP64 comes from a viral genome and does not naturally exist in the human genome, it may cause immunity problems. To continuously activate transcription of a target gene, the DNA sequence encoding VP64 must be integrated into the genome of the host cells to ensure its long-term presence. This may expose the host cells to long-term uncontrolled risks and may have unpredictable negative effects. In contrast, the use of effector domains with specific functions that might induce longer-term effects and that naturally exist in the human genome (like TET enzymes) will minimize such negative impacts and uncontrollable risks.

In contrast to ZFPs, the specificity of Transcription-Activator-Like Effectors (TALEs) for target DNA sequence recognition comes from the central repeat domain. This domain consists of repeats of 33 to 35 amino acids. The 12th and 13th positions of each repeat form the repeat variable di-residue (RVD), which determines binding to a single nucleotide (68, 69). In engineered TALEs, the amino acids in the RVD can be exchanged using a simple code to program sequence specificity. Using a modular design method similar to that for ZFPs, TALEs have also been designed and fused with epigenetic effector domains or transcriptional modifiers to target the promoters of candidate genes and induce transcriptional activation (65, 70, 71, 72) or repression (73, 74). In addition to targeting promoters, TALEs have also been reported to target and modify chromatin at enhancers of target genes, e.g. by fusing a DNA methyltransferase or a histone acetylase/methylase with a gene-specific TALE to regulate gene expression via epigenetic editing (75, 76, 77). Furthermore, a recent report showed that the fusion of gene-specific TALEs with TET dioxygenases can effectively induce DNA demethylation at the target site through iterative oxidation of 5mC, thereby activating the expression of epigenetically silenced target genes by active DNA demethylation (78).
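The "simple code" by which RVDs specify single nucleotides can be illustrated with a lookup table. The text does not list the code itself, so the mapping used below (NI for A, HD for C, NN for G, NG for T) is the commonly cited canonical RVD code and is supplied here as an assumption; NN is also known to tolerate A, so real designs may prefer other RVDs for G. The function name is hypothetical.

```python
# Commonly cited canonical RVD code (an assumption of this sketch; the
# thesis text only states that a simple code exists).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

def design_rvds(target):
    """Return one RVD per nucleotide of the target sequence, 5' to 3'."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported nucleotides: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]

# Toy target sequence (hypothetical).
print("-".join(design_rvds("TACGTA")))
```

Because each repeat reads one nucleotide, the number of repeats equals the length of the target sequence, in contrast to zinc finger modules, which each read three base pairs.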

ZFPs and TALEs have been widely used as effective targeting platforms that are fused to a variety of epigenetic effector domains to modify specific chromatin sites. However, because targeting relies on protein/DNA interactions, the DNA binding domain must be designed de novo for each specific target sequence. This makes engineering tedious and limits the large-scale application of ZFPs and TALEs. A major breakthrough came with the advent of the CRISPR/Cas9 system, which consists of two components. The first component is the endonuclease Cas9, which recognizes a PAM sequence and cleaves three bases upstream of it to produce a blunt-ended double-strand break (DSB). The second component is a short single-stranded guide RNA (gRNA). The two components first form a gRNA-Cas9 ribonucleoprotein (RNP) complex, and Cas9 then cleaves at a specific target site under the guidance of the gRNA (79). Target sequence recognition in the CRISPR/Cas9 system is based on RNA/DNA base pairing, which avoids the tedious de novo design and engineering of protein DNA binding domains. This makes the system highly efficient, specific, universal and easy to use, and greatly simplifies programmable DNA targeting (80, 81, 82, 83). The success of targeted gene editing in the field of genetics prompted researchers to apply the CRISPR/Cas9 system to epigenetic editing as well. The crystal structure of the complex formed by the gRNA-Cas9 RNP and its substrate DNA revealed that the RuvC and HNH domains are necessary for the endonuclease activity of Cas9. By mutating the core active sites in the RuvC and HNH domains, a deactivated Cas9 (dCas9) without endonuclease activity was obtained that retains its ability to recognize and bind PAM sequences on DNA. Therefore, in theory, dCas9 can be fused with specific functional epigenetic domains and guided to specific genomic sites by a sequence-specific gRNA. dCas9 then mediates the interaction between the epigenetic domain and the target chromatin without causing a double-strand break.
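The targeting rule described above (a PAM, a protospacer directly upstream of it, and a cut three bases upstream of the PAM) can be sketched in a few lines. The sketch assumes the NGG PAM and 20-nt protospacer commonly used for SpCas9, neither of which is specified in the text. It scans only the strand that is passed in and does not score efficiency or off-targets; the function name is hypothetical.

```python
import re


def find_spcas9_sites(seq, spacer_len=20):
    """List candidate SpCas9 target sites on the given strand (5'->3')."""
    seq = seq.upper()
    sites = []
    # A lookahead keeps overlapping PAMs (e.g. "AGGG" holds two NGG motifs).
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream for a full protospacer
        sites.append({
            "protospacer": seq[pam_start - spacer_len:pam_start],
            "pam": match.group(1),
            # The blunt cut falls between seq[cut_index - 1] and seq[cut_index],
            # i.e. three bases upstream of the PAM.
            "cut_index": pam_start - 3,
        })
    return sites


for site in find_spcas9_sites("A" * 20 + "TGGCC"):
    print(site)
```

For dCas9 fusions the same search identifies binding sites; the cut position is then irrelevant, because no break is introduced.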

The first applications of CRISPR/Cas9 beyond genome editing were shown by fusing dCas9 with VP16 or KRAB, a transcriptional activator and repressor with known functions, respectively, to regulate gene transcription at specific sites (84, 85, 86, 87, 88). The utility of dCas9 was further expanded by fusing it with epigenetic effector domains for targeted histone modifications (89, 90, 91, 92, 93), DNA methylation (94, 95, 96, 97, 98), and interaction between ncRNAs and chromatin (99). In addition to the applications described above, several studies have reported engineered fusions of dCas9 with the TET1 catalytic domain (TET1CD) to investigate the impact of targeted hydroxylation of methylated CpGs for various aims in diverse contexts, including active DNA demethylation in post-mitotic neurons in murine brains, cell-fate conversion in ESCs, and reactivation of tumor suppressor genes in cancer cells (100, 101, 102, 103).

In this chapter, we describe and discuss three DNA binding domains: ZFPs, TALEs and dCas9, each with its own merits and pitfalls. From a practical point of view, ZFPs are limited by interactions between the individual fingers, which may affect targeting efficiency. In addition, the genomic and chromatin context surrounding the target site may affect the fidelity of target sequence recognition by ZFPs, resulting in off-target effects (104). However, ZFPs are the oldest of the three platforms and have been well studied clinically as a genome editing system, without adverse effects. Indeed, the first application in a clinical trial was recently performed for the treatment of Hemophilia and Hunter Syndrome by in vivo injection of viruses encoding the ZF-nuclease fusions and the correction fragment (105, 106, 107). From the perspective of immunogenicity, zinc finger-based DNA binding domains mimic the zinc finger transcription factors that naturally exist in humans, which may give them an advantage in human cells over TALEs and dCas9, which are derived from bacteria. Furthermore, when fused to the same epigenetic domain (ED) and targeted to the same DNA sequence, a ZF-ED fusion protein has a smaller molecular weight than the corresponding TALE or dCas9 fusion, which might improve expression efficiency. More importantly, also for clinical application in vivo, the small size of the ZF in fusion proteins allows larger parts of effector domains to be fused. In terms of target specificity, TALEs seem to perform better than ZFPs and dCas9 (112), as several independent studies support that TALEs are not affected by the sequence and chromatin surrounding the target site (108, 109, 110, 111). Indeed, some studies show that ZFPs and CRISPR/Cas9 systems have significant off-target activity in some applications (113, 114, 115), although other reports claim that they show good specificity (116, 117, 118).

Compared with ZFPs and TALEs, the CRISPR/dCas9 system has the advantages of avoiding the de novo design of DNA binding domains and of being suitable for high-throughput applications. As described above, in dCas9-dependent targeted epigenetic editing, the complex recognizes its target through Watson/Crick base pairing between the single-stranded gRNA and the target DNA sequence. There is therefore no need for the de novo design of a DNA binding domain, which is necessary for ZFPs and TALEs and which results in tedious work and unpredictable targeting effects. In addition, retargeting the CRISPR/Cas9 complex requires only a new guide RNA. As a result, multiple sites in the genome can be targeted, which provides a convenient, high-throughput method in mammalian cells to study the relationship between chromatin modification state and gene expression, as well as for functional screening. More detailed experiments are needed to study whether the stably bound dCas9 complex has unwanted effects on chromatin state and modifications near its target (119). Nevertheless, based on the unique advantages discussed above, the dCas9-ED system is generally expected to become the first choice for many future epigenetic editing applications.

TET homologues in lower eukaryotes

TET proteins belong to the superfamily of 2-OG/Fe2+-dependent dioxygenases, which can catalyze several different types of reactions (120, 121, 122, 123, 124). Although the catalytic properties and physiological functions of TET dioxygenases have been studied in mammals, the first clues to TET dioxygenase function came from studies of the lower eukaryote Trypanosoma brucei (T. brucei). The nuclear genome of T. brucei contains a special modified base, base J, which is produced in two steps: first, J binding protein 1/2 (JBP1/2) oxidizes T to 5hmU, and base J is then produced under the catalysis of J glucosyl-transferase (JGT) (125). Using JBP1/2 as a template, the Rao Lab found that TETs, highly conserved homologues of JBP1/2, exist in the human genome, and further revealed that TETs have catalytic activity in oxidizing 5mC to 5hmC. In addition to mammals, TET homologues can also be found in many lower eukaryotes (126). Cheng et al. found eight Tet homologues in Naegleria gruberi (N. gruberi) and confirmed that NgTet1 has catalytic activity similar to that of mammalian TETs. Crystal structure data showed that NgTet1 flips out 5mC, which then occupies its active center, and that 5mC interacts with NgTet1 by planar stacking, hydrogen bonding and van der Waals forces (127). In addition, we found eight TET/JBP homologues in C. reinhardtii by BLAST search, but their amino acid residues that interact with 2-OG are not conserved compared with mammalian TETs. This suggests that the TET/JBP homologues in C. reinhardtii may have catalytic properties and physiological functions that differ from those in mammals. We therefore set out to reveal the biological functions of the C. reinhardtii TET homologues by studying their catalytic properties and reaction mechanism, and to further understand how the functions of the conserved TET/JBP domains have diverged among species during evolution (Chapter 4).

CMD1 uses vitamin C (VC) as a co-substrate rather than α-ketoglutarate (2-OG)

In Chapter 4, we identified CMD1, a TET homologue in C. reinhardtii that catalyzes the conversion of 5mC into a new DNA modification, 5-glyceryl-methylcytosine (5gmC). In our initial sequence alignment, we had found that the amino acid residues interacting with 2-OG were not conserved in the eight C. reinhardtii TET (CrTET) homologues compared with mammalian TETs. This suggested that these TET homologues may differ from mammalian TETs in catalytic properties and reaction mechanism. Beyond our expectation, however, follow-up experiments showed that vitamin C, instead of 2-OG, is directly involved in the CMD1-mediated catalytic reaction as a co-substrate.

Vitamin C, also known as ascorbic acid, has important physiological functions in animals and plants due to its redox properties. It acts as a cofactor in enzymatic reactions, as a scavenger of intracellular free radicals, as an electron acceptor in electron transport and as a substrate in the synthesis of intracellular substances (128). Most enzymatic reactions that involve VC as a cofactor are catalyzed by monooxygenases or dioxygenases whose reaction centers mainly contain ferrous or cuprous ions. As a reducing agent, VC can keep these ions in the reduced state and thereby enhance enzyme activity (129). In the mammalian TET-mediated oxidation of 5mC, VC greatly enhances the enzymatic activity of TET and participates in biological functions regulated by TET, such as the promotion of somatic cell reprogramming by TET1 (130). In most enzymatic reactions involving VC, 2-OG acts as the main electron donor, while VC acts only as a cofactor and does not directly chelate the metal ion. Our analysis of the mechanism of the CMD1 enzymatic reaction deepens this understanding of the role of VC. In the reaction mediated by CMD1, VC not only acts as an electron donor that directly replaces 2-OG in chelating Fe(II), but also acts as the donor of the glyceryl group that directly modifies 5mC. We have thus found a new function of VC in organisms.

At present, the Smirnoff-Wheeler pathway may be the only pathway for VC synthesis in plant cells (131). Previous reports have shown that knockdown of C. reinhardtii VTC2 (CrVTC2), a key VC synthase in this pathway, resulted in a 90% reduction in intracellular VC content and a slower rate of formation, as well as a lower chlorophyll concentration and a more sensitive response to environmental stimuli (132). We therefore speculated that a complete knockout of CrVTC2 might abolish VC production and lead to the death of C. reinhardtii. However, the seven CrVTC2 mutants with frameshift mutations survived in conventional TAP medium under continuous light and did not show any abnormalities, which does not seem consistent with previous reports (132). Furthermore, we found that the 5gmC content in all CrVTC2 frameshift mutants was even lower than that in CMD1 mutants, which confirmed that VC is directly involved in the production of 5gmC in C. reinhardtii. However, 5gmC did not disappear completely. There may be two reasons for this: either VTC2 is not the only route to VC synthesis and VC can also be produced in other ways, or small molecules other than VC can substitute for it in the formation of 5gmC. Measuring the VC content of CrVTC2 mutants by high-resolution mass spectrometry may be necessary to conclude whether CrVTC2 is an essential gene in the VC synthesis pathway. Nevertheless, the direct involvement of VC in the production of the 5gmC modification confirms that small metabolite molecules can be directly involved in epigenetic regulation.

5gmC remains in the genome of CMD1 mutant

In the CMD1 mutant with CC-125 background, the 5gmC content was only about 60% lower than in the wild type (Chapter 4). This result indicates that, besides CMD1, other TET homologues can probably catalyze the conversion of 5mC to 5gmC. By aligning the amino acid sequences of the eight TET homologues in C. reinhardtii, we found that the amino acid sequence of CrTET2 is identical to that of CMD1 (CrTET1) from amino acid 110 to the last amino acid. More surprisingly, the genomic sequence of CrTET2 on chromosome 15 is identical to that of CMD1 on chromosome 12. The difference is that, according to predictions in the Phytozome database, the transcription start site (TSS) of CrTET2 is located further downstream, resulting in a protein that is 109 amino acids shorter than CMD1. However, we cannot rule out the possibility that, to compensate for an introduced CMD1 mutation, CrTET2 is transcribed from the position corresponding to the TSS of CMD1 to produce a CrTET2 protein with the same function as CMD1. In our efforts to improve the efficiency of obtaining CMD1 mutants, we were surprised to find that CrTET2 does not exist in the nuclear genome of CMD1 mutants with CC-5325 background (Chapter 5). This finding rules out any compensatory effect of CrTET2 in CMD1 mutants with CC-5325 background, which makes these mutants helpful for further studying the functions of 5gmC and CMD1. Apart from CrTET2, CrTET3 is most similar to CMD1 in amino acid sequence. Although purified CrTET3 did not catalyze the conversion of 5mC to 5gmC in vitro, this may be due to inappropriate reaction conditions. Therefore, we cannot rule out the possibility that CrTET3 can catalyze the conversion of 5mC to 5gmC in C. reinhardtii.

The role of CMD1 and 5gmC in Chlamydomonas reinhardtii

In Chapter 4, we found that CMD1 and its catalytic product 5gmC participate in the regulation of photoprotection by regulating the expression of the target gene LHCSR3 in C. reinhardtii. In general, the regulatory effect of an epigenetic modification is broad and not limited to a specific site of the genome. We therefore speculate that target genes regulated by CMD1 also participate in other biological processes. Two lines of evidence support this hypothesis. Firstly, mass spectrometry showed that the 5gmC content differs between stages of the C. reinhardtii life cycle, suggesting that CMD1 might play a regulatory role in gametogenesis and zygote formation. Secondly, RNA-seq showed that the expression of some mitosis- and metabolism-related genes was altered in the CMD1 mutant, which further indicates that CMD1 might be involved in the regulation of the cell cycle and metabolic processes (Chapter 4). In the future, identifying the other proteins involved in 5gmC production and knocking them out to obtain strains completely depleted of 5gmC will greatly promote research on the functions of CMD1 and 5gmC in C. reinhardtii.

For 5gmC, an important issue to be clarified is whether it is an intermediate product of 5mC demethylation or a stable epigenetic mark with unique physiological functions. We demonstrated that the 5mC content in CMD1 mutants increased significantly, both genome-wide and at specific sites such as LHCSR3. These results indicate that CMD1 catalyzes the conversion of 5mC to 5gmC and thereby participates in DNA demethylation. How 5gmC is further removed to result in true demethylation of 5mC needs further clarification. At present, there are two widely accepted routes for removing modified DNA bases. First, TET dioxygenases mediate active and passive demethylation of 5mC in mammalian cells. In the active demethylation pathway, 5fC and 5caC (produced by iterative TET oxidation of 5mC) are recognized and excised by TDG, and the site is then repaired to unmethylated cytosine via the BER pathway. In passive demethylation, modified bases such as 5hmC, 5fC and 5caC are “erased” by dilution during DNA replication. Secondly, in higher terrestrial plants such as Arabidopsis thaliana, 5mC can be directly recognized and excised by the glycosylase ROS1 (133). Although the specific mechanism by which CMD1 induces DNA demethylation is still unknown, it may proceed through either of the two demethylation pathways described above. We searched for homologues of TDG and ROS1 and found one TDG homologue and two ROS1 homologues in C. reinhardtii. Further studies are needed to reveal whether these homologues indeed have the capacity to erase 5gmC.
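Replication-dependent dilution can be made concrete with a deliberately simple model. It assumes that a modified base is never restored on the newly synthesized strand and that no active removal takes place, so the modified fraction of strands halves with every round of replication. Both assumptions, and the function name, are illustrative only and are not taken from the experimental chapters.

```python
def remaining_fraction(initial_fraction, replications):
    """Fraction of modified sites left after a number of replication rounds,
    assuming no maintenance of the mark and no active removal."""
    return initial_fraction * 0.5 ** replications


for n in range(5):
    print(n, remaining_fraction(1.0, n))
```

Under these assumptions a mark is diluted quickly in dividing cells but persists in non-dividing cells, which is why active, glycosylase-dependent removal matters most where replication is absent.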

CMD1 as a potential effector domain factor in epigenetic editing

Because CMD1 can catalyze the reaction of 5mC with VC to produce 5gmC, and thus might act as an epigenetic modifier, we believe that CMD1 can be used as a potential effector domain to regulate gene expression by epigenetic editing. Although we have not yet detected 5gmC in the native mammalian genome, our unpublished data show that 5gmC can be detected in the genome of cultured HEK 293T cells upon overexpression of CMD1 when VC is added to the culture medium (data not shown). This suggests that, in the presence of VC, CMD1 can catalyze 5gmC production from 5mC in the context of the mammalian genome. We further examined the effect of 5gmC on gene expression using a dual luciferase reporter system. The results showed that both 5mC and 5gmC significantly inhibited the expression of the reporter gene. However, when TET2 or TDG was overexpressed at the same time, the expression of the reporter gene on the 5mC plasmid was up-regulated about 30-fold, while the reporter gene on the 5gmC plasmid could not be activated (data not shown), since 5gmC cannot be oxidized by TET2 and subsequently removed by TDG. In addition, since 5gmC is not a natural modification in mammalian cells, it is likely that no enzyme exists there that removes 5gmC. These results suggest that the transcriptional inhibition induced by 5gmC is more stable and persistent than that induced by 5mC. Therefore, by fusing dCas9 with CMD1 and targeting CMD1 to specific CpG sites in the genome under the guidance of a specific gRNA, target gene expression could be effectively inhibited by converting 5mC to 5gmC, especially in non-dividing cells.

Targeted gene editing in Chlamydomonas reinhardtii

C. reinhardtii is a eukaryotic unicellular green alga that has been widely used as a model organism for basic biological processes and for the production of bio-products (134, 135). The haploid genome of C. reinhardtii vegetative cells facilitates the isolation of homozygous mutants (136). Recently, a genome-wide Chlamydomonas mutant library project was reported that covers 83% of the genes (137). However, half of the insertion mutations are located in introns and UTRs, which may affect the expression of the host genes but will not result in loss of function. More importantly, for a proportion of genes (less than 17%), insertion mutants have not yet been isolated. For the eight recently identified TET homologues, we did not find insertion mutants in the mutant library. This illustrates the lack of effective targeted gene editing tools for nuclear genes, which greatly limits the study of genes in C. reinhardtii.

At present, there are three methods for targeted gene editing: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9). In mammalian cells, all three methods have been proven to produce mutations at targeted sites in candidate genes, but ZFNs and TALENs, with their high cost, time-consuming production and low efficiency, have gradually been replaced by CRISPR/Cas9, the third generation of gene editing tools. Initially, however, the CRISPR/SpCas9 system delivered as plasmids showed extremely low efficiency in target gene editing in C. reinhardtii (138). There may be three reasons for this low efficiency. First, exogenous genes are expressed very inefficiently in C. reinhardtii. Secondly, C. reinhardtii has highly active endogenous nucleases, which can randomly cut the exogenous plasmid DNA delivered into cells, resulting in the integration of incomplete expression cassettes into the genome and further reducing the expression efficiency of the exogenous genes (139). Thirdly, even if SpCas9 could be expressed effectively, the accumulation of SpCas9 protein caused by its continuous expression would be lethally toxic to C. reinhardtii, so that edited mutants could not be obtained (138, 140). The breakthrough came in 2016, when two studies reported that Cas9-gRNA RNPs pre-assembled in vitro could be delivered effectively to C. reinhardtii, overcoming the cytotoxicity caused by continuous Cas9 expression while improving the editing efficiency of endogenous target genes (141, 142). However, the efficiency of obtaining mutants is still very low, and mutants cannot be isolated effectively without a phenotype or selectable marker. This may be due to two reasons: on the one hand, it is still difficult to accurately predict highly efficient gRNA target sites; on the other hand, differences in genomic microenvironment and repair preferences among organisms can also lead to differences in editing efficiency (119).

Therefore, efforts to improve the efficiency of CRISPR/Cas9-mediated gene editing are needed. One aspect of this is to improve the efficiency of isolating target gene mutants in species with low editing efficiency. A notable example is the concept of co-transformation or co-CRISPR based on gain-of-function alleles, proposed in Caenorhabditis elegans (C. elegans). In addition to the gene of interest, a second, unrelated gene is targeted whose mutation produces a visible phenotype. Assuming that transfection delivers both gRNA constructs, mutants of the gene of interest are more easily found among the mutants with the visible phenotype. Through this co-transformation or co-CRISPR strategy, the efficiency of obtaining mutants of target genes without a known phenotype was indeed improved in C. elegans (143, 144). Subsequently, this method also significantly increased the efficiency of obtaining target gene mutants in Drosophila melanogaster and in some human cell lines with low editing efficiency (145, 146, 147). These results suggest that when two different loci in the genome are targeted simultaneously through CRISPR/Cas9, editing events at the two loci are correlated (147).

Therefore, we first optimized the target gene editing method based on Cas9-gRNA RNPs and developed a strategy for co-selection of the target gene mutant and the MAA7 mutant. Using this co-selection strategy, we then isolated mutants of two target genes without a selectable phenotype, CMD1 and VTC2, by CRISPR-driven gene editing (Chapter 4). The efficiency of obtaining CMD1 mutants, however, was still very low: we identified only one mutant among 986 5-FI resistant colonies (Chapter 4). The likely reason is that the transcription level of CMD1 is very low in the CC-125 strain. Indeed, we found that the editing efficiency of a target gene mediated by Cas9-gRNA RNPs is positively related to its transcriptional level in the co-selection strategy (Chapter 5). We tried treating C. reinhardtii with the DNA methyltransferase inhibitor 5-azadeoxycytidine (5-aza), but this failed to improve the editing efficiency of target genes with low transcription levels, even though the 5mC content of the genome was reduced by half. At present, there is no good way to increase the transcription level of endogenous genes in C. reinhardtii. Improving the accuracy with which candidate mutants are screened from the cell population could therefore also improve the isolation efficiency of mutants for target genes with low editing efficiency, including CMD1. We therefore developed a microhomology-mediated integration of donor DNA and a targeted integration-dependent two-step screening process, enabling us to effectively isolate the desired mutants from cell populations and pools of resistant colonies. Through this strategy, we identified 10 CMD1 mutants with CC-5325 background among 120 Hygromycin B resistant colonies, an increase in efficiency of about 80-fold compared with the co-selection strategy (Chapter 5). As such, the methods described in Chapter 5 and the co-selection strategy demonstrated in Chapter 4 provide an effective platform for targeted knock-out or knock-in of any gene of interest.
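The fold improvement quoted above follows directly from the colony counts. The short calculation below is a sketch that uses only the numbers given in the text (1 mutant among 986 colonies versus 10 mutants among 120 colonies); the function and variable names are hypothetical. Because the two screens used different strain backgrounds and selection markers, the ratio is a rough comparison rather than a controlled measurement.

```python
def isolation_efficiency(mutants, colonies_screened):
    """Fraction of screened resistant colonies that carry the desired mutation."""
    return mutants / colonies_screened


co_selection = isolation_efficiency(1, 986)   # Chapter 4, CC-125 background
two_step = isolation_efficiency(10, 120)      # Chapter 5, CC-5325 background
fold = two_step / co_selection

print(f"co-selection: {co_selection:.2%}")
print(f"two-step screening: {two_step:.2%}")
print(f"fold improvement: {fold:.0f}")
```

The ratio comes out slightly above 80, consistent with the rounded figure reported in the text.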

FUTURE PERSPECTIVES

In Chapter 2 and Chapter 3, we fused zinc finger proteins, as programmable DNA binding domains, to several potential demethylases and assessed their targeted demethylation effects as potential demethylation inducers for epigenetic editing. Although TET2 can induce demethylation at target CpG sites and re-activate target gene expression, there is still room for improvement in both the level of demethylation and the re-activation of target gene expression. Furthermore, whether other potential demethylation effector domains can induce targeted demethylation needs to be addressed using improved experimental strategies.

To this end, future experimental settings can benefit from the following four approaches. Firstly, dCas9 could be used as the DNA binding domain and fused with APOBEC3A to further explore the possibility of inducing targeted demethylation via the deamination pathway. On the one hand, the deamination activity of APOBEC3A towards 5mC is stronger than that of AID and Apobec1. On the other hand, the advantage of using dCas9 as the DNA binding domain is that it opens the double-stranded DNA at the target region, which facilitates the action of APOBEC3A on target CpG sites, since APOBEC3A has significantly higher deamination activity for 5mC on ssDNA than on dsDNA. However, recent studies have reported that the cytosine base editor BE3 (APOBEC1-nCas9-UGI) has significant off-target effects in both the genome and the transcriptome (148, 149, 150, 151). It is therefore necessary to carefully examine whether off-target effects occur in the genome or transcriptome when dCas9-APOBEC3A is used for targeted demethylation. Secondly, dCas9-GADD45A and dCas9-TET1CD or dCas9-TET2CD could be co-targeted to the same region. As mentioned in the previous discussion, demethylation induced by GADD45A depends on the further recruitment of TET1 to its binding site (23). Co-targeting GADD45A and TET1CD to the target sequence may therefore induce demethylation of 5mC more effectively. Thirdly, dCas9-TET1CD or dCas9-TET2CD could be co-targeted with dCas9-TDG to the same region. TET-mediated oxidative demethylation is mainly accomplished through active demethylation, which depends on TDG-coupled BER, and passive demethylation, which depends on DNA replication. In theory, therefore, targeting TET1CD or TET2CD together with TDG will enhance active demethylation through the TDG-coupled BER pathway, in addition to the passive demethylation that results from DNA replication. Fourthly, a low-complexity region of unknown function exists in the double-stranded beta helix (DSBH) domain of all three TET dioxygenases and may be related to the regulation of the activity and localization of the TET proteins (152). Our unpublished data show that TET3 and TET3CD have stronger oxidation activity towards 5mC after deletion of this low-complexity region. We could therefore target dCas9-TET2CD lacking the low-complexity region to the target region and use its enhanced oxidation activity to oxidize more 5mC to 5hmC, 5fC or 5caC. In theory, the proportion of subsequent passive or active demethylation will increase accordingly.

In Chapter 4, we identified a TET homologue named CMD1 in the model organism C. reinhardtii, which catalyzes the conversion of 5mC into a new DNA modification, 5gmC. We further found that the catalytic reaction does not depend on 2-OG, the essential cofactor of mammalian TETs, but on VC, which participates in the reaction as a co-substrate. We successfully obtained CMD1 mutants and VTC2 (the gene encoding a VC synthase) mutants through the co-selection strategy based on Cas9-gRNA RNPs. Subsequently, using the CMD1 mutant, we revealed the molecular mechanism by which CMD1 and 5gmC regulate photoprotection in C. reinhardtii. However, three aspects remain to be studied further.

To answer this, we first need to detect the presence of 5gmC in CMD1 mutants with CC-5325 background. As demonstrated in Chapter 5, the paralogous gene CrTET2 does not interfere in CMD1 mutants with CC-5325 background, which will be very helpful for further studying the biological functions of CMD1 and 5gmC. Secondly, if 5gmC is an intermediate product of 5mC demethylation, how is it erased from the genome? To answer this question, besides the passive, replication-dependent removal of 5gmC, the active erasure of 5gmC mediated by specific glycosylases needs to be explored. For example, we found homologues of TDG and ROS1 in C. reinhardtii. On the one hand, these candidate proteins can be expressed in and purified from E. coli, and dsDNA containing the 5gmC modification can be used as a substrate to test whether they can erase 5gmC in vitro. On the other hand, the genes encoding these candidate factors can be mutated using the co-selection strategy developed in this thesis (Chapter 4). It would then be possible to test whether 5gmC accumulates in the genome of these mutants, and thus whether the candidate factors indeed erase 5gmC in C. reinhardtii. Thirdly, because the transcriptional inhibition induced by 5gmC is expected to be more stable and persistent, dCas9 could be fused with CMD1 and targeted, under the guidance of a specific gRNA, to the promoter of an abnormally expressed oncogene, where conversion of 5mC to 5gmC could strongly inhibit target gene expression.

In Chapter 5, we set out to improve the efficiency of isolating mutants of target genes with low transcriptional levels using a co-selection strategy. Towards this aim, we developed a microhomology-mediated, recombination-dependent donor DNA integration strategy and a targeted integration-dependent mutant screening process. Applying this strategy significantly improved the mutant isolation efficiency for CMD1, a target gene with a low transcription level. In a search for further genes that may encode DNA modification enzymes, we identified three potential genes encoding 6mA methyltransferases in C. reinhardtii. However, the transcription levels of these three candidate genes are very low, and no corresponding mutants are available in the Chlamydomonas mutant library. The gene targeting strategy and mutant screening process developed in Chapter 5 will therefore help to obtain mutants of these three potential 6mA methyltransferases, to verify whether the candidates really have 6mA methyltransferase activity, and to reveal their physiological functions. Furthermore, an HA tag sequence can be knocked in at the 3' end of these candidate genes by this method. The resulting tagged proteins will facilitate direct purification of the complex with 6mA methyltransferase activity from C. reinhardtii and clarify which factors are involved in this enzymatic reaction, which will further help to elucidate the catalytic mechanism.

To our knowledge, the report in Chapter 3 was the first to actually induce TET-mediated DNA demethylation at a hypermethylated site of interest, providing new avenues for therapeutic intervention, and some applications in mouse disease models have clearly demonstrated the clinical relevance of this method (39, 101, 153). To obtain better therapeutic effects in the future, however, the in vivo delivery efficiency of these high-molecular-weight constructs still needs to be improved. In Chapter 4, our study not only identified a novel eukaryotic DNA base modification, 5-glyceryl-methylcytosine (5gmC), which is derived from vitamin C and catalyzed by a more ancient TET homolog, CMD1, but also revealed how CMD1 participates in photosynthesis by regulating the methylation level of the upstream region of LHCSR3, a gene critical for preventing photooxidative damage induced by high light. As potential applications, CMD1 and its catalytic product 5gmC could be used to inhibit the expression of oncogenes or to distinguish 5mC from 5hmC in DNA sequencing.

This thesis adds to the rapid developments in the field of genetic and epigenetic engineering. Applied across bacterial and eukaryotic models to address both fundamental and translational questions, as was also done in this thesis, versatile CRISPR tools have truly revolutionized biological research and might benefit patients in the future.

REFERENCES

1. Rai, K., Huggins, I.J., James, S.R., Karpf, A.R., Jones, D.A. and Cairns, B.R. (2008) DNA demethylation in zebrafish involves the coupling of a deaminase, a glycosylase, and Gadd45. Cell, 135, 1201-1212.

2. Muramatsu, M., Kinoshita, K., Fagarasan, S., Yamada, S., Shinkai, Y. and Honjo, T. (2000) Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell, 102, 553–563.

3. Revy, P., Muto, T., Levy, Y., Geissmann, F., Plebani, A., Sanal, O., Catalan, N., Forveille, M., Dufourcq-Labelouse, R., Gennery, A., Tezcan, I., Ersoy, F., Kayserili, H., Ugazio, A.G., Brousse, N., Muramatsu, M., Notarangelo, L.D., Kinoshita, K., Honjo, T., Fischer, A. and Durandy, A. (2000) Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell, 102, 565–575.

4. Popp, C., Dean, W., Feng, S., Cokus, S.J., Andrews, S., Pellegrini, M., Jacobsen, S.E. and Reik, W. (2010) Genome-wide erasure of DNA methylation in mouse primordial germ cells is affected by AID deficiency. Nature, 463, 1101-1105.

5. Bhutani, N., Brady, J.J., Damian, M., Sacco, A., Corbel, S.Y. and Blau, H.M. (2010) Reprogramming towards pluripotency requires AID-dependent DNA demethylation. Nature, 463, 1042–1047.

6. Kumar, R., DiMenna, L., Schrode, N., Liu, T.C., Franck, P., Munoz-Descalzo, S., Hadjantonakis, A.K., Zarrin, A.A., Chaudhuri, J., Elemento, O. and Evans, T. (2013) AID stabilizes stem-cell phenotype by removing epigenetic memory of pluripotency genes. Nature, 500, 89–92.

7. Pham, P., Bransteitter, R., Petruska, J. and Goodman, M.F. (2003) Processive AID-catalysed cytosine deamination on single-stranded DNA simulates somatic hypermutation. Nature, 424, 103-107.

8. Bransteitter, R., Pham, P., Scharff, M.D. and Goodman, M.F. (2003) Activation-induced cytidine deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase. Proc. Natl. Acad. Sci. USA, 100, 4102–4107.

9. Nabel, C.S., Jia, H., Ye, Y., Shen, L., Goldschmidt, H.L., Stivers, J.T., Zhang, Y. and Kohli, R.M. (2012) AID/APOBEC deaminases disfavor modified cytosines implicated in DNA demethylation. Nat. Chem. Biol., 8, 751–758.

10. Komor, A.C., Kim, Y.B., Packer, M.S., Zuris, J.A. and Liu, D.R. (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533, 420-424.

11. Nishida, K., Arazoe, T., Yachie, N., Banno, S., Kakimoto, M., Tabata, M., Mochizuki, M., Miyabe, A., Araki, M., Hara, K.Y., Shimatani, Z. and Kondo, A. (2016) Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science, 353(6305), pii: aaf8729.

12. Zong, Y., Wang, Y., Li, C., Zhang, R., Chen, K., Ran, Y., Qiu, J.L., Wang, D. and Gao, C. (2017) Precise base editing in rice, wheat and maize with a Cas9-cytidine deaminase fusion. Nat Biotechnol., 35, 438-440.

13. Kim, K., Ryu, S.M., Kim, S.T., Baek, G., Kim, D., Lim, K., Chung, E., Kim, S. and Kim, J.S. (2017) Highly efficient RNA-guided base editing in mouse embryos. Nat Biotechnol., 35, 435-437.

14. Schübeler, D. (2015) Function and information content of DNA methylation. Nature, 517, 321–326.

15. Schutsky, E.K., Nabel, C.S., Davis, A.K.F., DeNizio, J.E. and Kohli, R.M. (2017) APOBEC3A efficiently deaminates methylated, but not TET-oxidized, cytosine bases in DNA. Nucleic Acids Res, 45, 7655-7665.

16. Wang, X., Li, J., Wang, Y., Yang, B., Wei, J., Wu, J., Wang, R., Huang, X., Chen, J. and Yang, L. (2018) Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. Nat Biotechnol, 36, 946-949.

17. Sancar, A., Lindsey-Boltz, L. A., Unsal-Kacmaz, K. and Linn, S. (2004) Molecular mechanisms of mammalian DNA repair and the DNA damage checkpoints. Annu. Rev. Biochem, 73, 39–85.

18. Barreto, G., Schäfer, A., Marhold, J., Stach, D., Swaminathan, S.K., Handa, V., Döderlein, G., Maltry, N., Wu, W., Lyko, F. and Niehrs, C. (2007) Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature, 445, 671–675.

19. Rai, K., Huggins, I.J., James, S.R., Karpf, A.R., Jones, D.A. and Cairns, B.R. (2008) DNA demethylation in zebrafish involves the coupling of a deaminase, a glycosylase, and Gadd45. Cell, 135, 1201–1212.

20. Schmitz, K.M., Schmitt, N., Hoffmann-Rohrer, U., Schäfer, A., Grummt, I. and Mayer, C. (2009) TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol. Cell, 33, 344–353.

21. Jin, S.G., Guo, C. and Pfeifer, G.P. (2008) GADD45A does not promote DNA demethylation. PLoS Genet, 4, e1000013.

22. Engel, N., Tront, J.S., Erinle, T., Nguyen, N., Latham, K.E., Sapienza, C., Hoffman, B. and Liebermann, D.A. (2009) Conserved DNA methylation in Gadd45a-/- mice. Epigenetics, 4, 98–99.

23. Arab, K., Karaulanov, E., Musheev, M., Trnka, P., Schäfer, A., Grummt, I. and Niehrs, C. (2019) GADD45A binds R-loops and recruits TET1 to CpG island promoters. Nat Genet, 51, 217-223.

24. Lian, C.G., Ceol, C., Wu, F., Larson, A., Dresser, K., Xu, W., Tan, L., Hu, Y., Zhan, Q., Lee, C.W., Hu, D., Lian, B.Q., Kleffel, S., Yang, Y., Neiswender, J., Khorasani, A.J., Fang, R., Lezcano, C., Duncan, L.M., Scolyer, R.A., Thompson, J.F., Kakavand, H., Houvras, Y., Zon, L.I., Mihm, M.C. Jr., Kaiser, U.B., Schatton, T., Woda, B.A., Murphy, G.F. and Shi, Y.G. (2012) Loss of 5-hydroxymethylcytosine is an epigenetic hallmark of melanoma. Cell, 150, 1135–1146.

25. Yang, H., Liu, Y., Bai, F., Zhang, J.Y., Ma, S.H., Liu, J., Xu, Z.D., Zhu, H.G., Ling, Z.Q., Ye, D., Guan, K. and Xiong, Y. (2013) Tumor development is associated with decrease of TET gene expression and 5-methylcytosine hydroxylation. Oncogene, 32, 663–669.

26. Thienpont, B., Steinbacher, J., Zhao, H., D'Anna, F., Kuchnio, A., Ploumakis, A., Ghesquière, B., Van Dyck, L., Boeckx, B., Schoonjans, L., Hermans, E., Amant, F., Kristensen, V.N., Peng Koh, K., Mazzone, M., Coleman, M., Carell, T., Carmeliet, P. and Lambrechts, D. (2016) Tumour hypoxia causes DNA hypermethylation by reducing TET activity. Nature, 537, 63–68.

27. Ooi, S.K. and Bestor, T.H. (2008) The colorful history of active DNA demethylation. Cell, 133, 1145–1148.

28. Wu, S.C. and Zhang, Y. (2010) Active DNA demethylation: many roads lead to Rome. Nat. Rev. Mol. Cell Biol., 11, 607–620.

29. Kriaucionis, S. and Heintz, N. (2009) The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science, 324, 929–930.

30. Tahiliani, M., Koh, K.P., Shen, Y., Pastor, W.A., Bandukwala, H., Brudno, Y., Agarwal, S., Iyer, L.M., Liu, D.R., Aravind, L. and Rao, A. (2009) Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science, 324, 930–935.

31. He, Y.F., Li, B.Z., Li, Z., Liu, P., Wang, Y., Tang, Q., Ding, J., Jia, Y., Chen, Z., Li, L., Sun, Y., Li, X., Dai, Q., Song, C.X., Zhang, K., He, C. and Xu, G.L. (2011) Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science, 333, 1303–1307.

32. Ito, S., Shen, L., Dai, Q., Wu, S.C., Collins, L.B., Swenberg, J.A., He, C. and Zhang, Y. (2011) Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science, 333, 1300–1303.

33. Pfaffeneder, T., Hackner, B., Truss, M., Münzel, M., Müller, M., Deiml, C.A., Hagemeier, C. and Carell, T. (2011) The discovery of 5-formylcytosine in embryonic stem cell DNA. Angew. Chem. Int. Ed. Engl., 50, 7008–7012.

34. Guo, F., Li, X., Liang, D., Li, T., Zhu, P., Guo, H., Wu, X., Wen, L., Gu, T.P., Hu, B., Walsh, C.P., Li, J., Tang, F. and Xu, G.L. (2014) Active and passive demethylation of male and female pronuclear DNA in the mammalian zygote. Cell Stem Cell, 15, 447-458.

35. Wu, H. and Zhang, Y. (2014) Reversing DNA methylation: mechanisms, genomics, and biological functions. Cell, 156, 45–68.

36. Bochtler, M., Kolano, A. and Xu, G.L. (2017) DNA demethylation pathways: additional players and regulators. Bioessays, 39, 1–13.
