Design, synthesis, and evaluation of polycomb reader protein Cbx7 antagonists


by

Chakravarthi Simhadri M.Sc., Andhra University, 2006

B.Sc., Andhra University, 2004

A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY in the Department of Chemistry

ã Chakravarthi Simhadri, 2017 University of Victoria


Supervisory Committee

Design, synthesis, and evaluation of polycomb reader Cbx7 antagonists

by

Chakravarthi Simhadri M.Sc., Andhra University, 2006

B.Sc., Andhra University, 2004

Dr. Fraser Hof, Department of Chemistry

Supervisor

Dr. Cornelia Bohne, Department of Chemistry

Departmental Member

Dr. Lisa Rosenberg, Department of Chemistry

Departmental Member

Dr. Caroline Cameron, Department of Biochemistry and Microbiology

Outside Member


Abstract

Supervisory Committee

Dr. Fraser Hof, Department of Chemistry

Supervisor

Dr. Cornelia Bohne, Department of Chemistry

Departmental Member

Dr. Lisa Rosenberg, Department of Chemistry

Departmental Member

Dr. Caroline Cameron, Department of Biochemistry and Microbiology

Outside Member

Writer, eraser, and reader proteins are three classes of proteins/enzymes that add, remove, and recognize post-translational modifications (PTMs) on histone tails, respectively. The orchestrated action of these protein classes controls the dynamic state of chromatin and influences gene expression. Dysregulation of these proteins is often associated with disease conditions. All three classes have been targeted with small molecule inhibitors for various disease conditions. This is a promising area of research for developing therapeutics for various clinical conditions.

I worked on the methyllysine reader protein Cbx7, which belongs to the polycomb group of proteins. Cbx7 is a chromodomain-containing protein that uses its chromodomain to recognize methyllysine partners such as H3K27me3. Aberrant expression of Cbx7 is observed in several cancers, including prostate, breast, colon, and thyroid cancers. Hence, targeting Cbx7 with potent and selective inhibitors would be beneficial for therapeutic intervention in Cbx7-associated diseases.

Here I report my work on the design, synthesis, and evaluation of Cbx7 inhibitors. In this work, we identified several potent and selective inhibitors of Cbx7 and published the first-in-class antagonists for Cbx7. A few of these inhibitors were tested in cancer stem cell models. Finally, I propose future work for targeting Cbx7 and other chromodomain-containing proteins.


List of Tables

Table 1.1. Selected methyl reader inhibitors with their binding affinities toward their targets

Table 1.2. L3MBTL1/L3MBTL3 inhibitors

Table 2.1. Designed trimethyllysine mimics and their fragment properties

Table 2.2. Binding data of Cbx7 with consensus binding motif peptides of SETDB1

Table 2.3. IC50 binding data of peptide-fragment hybrids of Cbx7–H3K27me3

Table 2.4. Literature binding data of L3MBTL1 inhibitors towards L3MBTL1 and Cbx7

Table 2.5. IC50 binding data of compounds arise from different basic amine modifications on 3-bromo nicotinamide/arylsulfonamide scaffolds towards Cbx7–H3K27me3

Table 2.6. IC50 binding data of the rigid linker compounds towards Cbx7–H3K27me3

Table 2.7. IC50 binding data of regioisomers of bromo nicotinamide/arylsulfonamide scaffolds towards Cbx7–H3K27me3

Table 2.8. IC50 binding data of compounds with multiple modifications on nicotinamide and sulfonamide scaffolds towards Cbx7–H3K27me3

Table 2.9. Mass spectrometry data for the peptide-fragment hybrids

Table 3.1. H3K27me3–Cbx7 disruption by Ac-FALKme3S-NH2 analogs bearing Leu(–1) replacements

Table 3.2. H3K27me3–Cbx7 disruption by Ac-FALKme3S-NH2 analogs bearing Ala(–2) replacements

Table 3.3. H3K27me3–Cbx7 disruption by Ac-FALKme3S-NH2 analogs bearing Ser(+1) replacements

Table 3.4. H3K27me3–Cbx7 disruption by Ac-FALKme3S-NH2 analogs bearing Phe (–

Table 3.5. H3K27me3–Cbx7 disruption by Ac-FALKme3S-NH2 analogs bearing Phe(–3) replacements

Table 3.6. H3K27me3–Cbx7 and H3K27me3–Cbx8 disruption by Ac-FALKme3S-NH2 analogs bearing multiple functional group replacements

Table 3.7. ITC-derived thermodynamic binding data of selected compounds for Cbx7 and Cbx8

Table 3.8. ITC-derived thermodynamic binding data of compound 1 and 3.62 for Cbx7 and Cbx1 proteins

Table 3.9. ITC-derived thermodynamic binding data of compound 1 and 3.62 for Cbx7 and Cbx4

Table 3.10. Mass spectrometry details of Leu(–1) replacement compounds (Table 3.1 compounds)

Table 3.11. Mass spectrometry details of Ser(+1) replacement compounds (Table 3.3 compounds)

Table 3.12. Characterization data of all final compounds

Table 4.1. Roles of Cbx7/Cbx8 in breast tumorigenesis

Table 4.2. Binding data of selected compounds chosen for biological studies

Table 4.3. Binding affinities of 1st-generation and 2nd-generation dye-labeled probes with Cbx proteins as determined by direct FP

Table 4.4. IC50 data of biotinylated compounds for the disruption of Cbx–compound 4.2 complex

Table 4.5. IC50 data of para-bromobenzamide replacements of the scaffold 4.1 for the disruption of Cbx–compound 4.2 complex

Table 4.6. IC50 data of N-alkylated lysine replacements based on scaffold 4.1 for the disruption of Cbx–compound 4.2 complex

Table 4.7. IC50 data of the compounds, that have both modifications of N-terminal benzamide and N-alkylated lysine replacements onto the scaffold 4.1, arose from disruption of Cbx–compound 4.2 complex

Table 4.8. ITC Binding data for compound 4.11 with different Cbx proteins.

Table 4.9. Characterization data of all Chapter 4 compounds


List of Figures

Figure 1.1. Nucleosome, PTMs on H3 tail, and a cartoon representation of epigenetic writer, eraser, and reader proteins.

Figure 1.2. Different methylation states of lysine and arginine at physiological pH.

Figure 1.3. Examples of two types of recognition modes that are commonly found in the complexes of Royal family of methyllysine readers with their binding partners.

Figure 1.4. Schematic of peptide- and protein-domain microarrays.

Figure 1.5. Histone peptide microarray probed with Cbx7 suggests consensus binding motif for designing alternative binding peptides to Cbx7.

Figure 1.6. Protein-domain microarray results indicate MBT domain containing proteins CGI-72 and L3MBTL1 bind H3K4me1 mark, and the chromodomain containing proteins CDY and HP1 bind H3K9me3 mark.

Figure 1.7. Schematic of FP assay principle and selected literature examples to demonstrate the determination of Kd and IC50 from direct FP and competitive FP.

Figure 1.8. Schematic of AlphaScreen and selected literature examples to demonstrate the determination of Kd app from cross-titration and IC50 from competitive AlphaScreen.

Figure 1.9. Schematic of an ITC instrument and demonstration of ITC readout of the binding event, EED–UNC5115.

Figure 1.10. Full-length L3MBTL1 architecture and X-ray co-crystal structure of UNC669 with L3MBTL1.

Figure 1.11. Full-length L3MBTL3 architecture and X-ray co-crystal structure of UNC1215 with L3MBTL3.

Figure 2.1. NMR derived structure of H3K27me3 bound Cbx7 and illustration of two important binding pockets of Cbx7.

Figure 2.2. Two literature examples that use fragment assembly approach to create potent inhibitors.

Figure 2.3. Selected aromatic cage binding ligands that are retrieved from the four-point pharmacophore model screen across the PDB database contain a variety of chemical moieties.

Figure 2.4. Design of Cbx7 inhibitors using peptide-fragment approach.

Figure 2.5. Determination of IC50 binding data of selected peptide-hybrids and a control compound for the disruption of Cbx7–H3K27me3 complex by competitive FP assay.

Figure 2.6. X-ray co-crystal structure of UNC669 with L3MBTL1.

Figure 2.7. Design of small molecule inhibitors of Cbx7 by adapting L3MBTL1 antagonists as scaffolds.

Figure 2.8. Schematic of STD NMR principle.

Figure 2.9. Qualitative comparison of STD NMR spectra of compounds 2.7J and 2.7G with 2.7L, and 2.13G with 2.13L.

Figure 2.10. Competitive STD NMR titration of the 2.13L–Cbx7 with a potent inhibitor entry 62.

Figure 3.1. Interactions of Kme3S of H3K27me3 with Cbx7 in 2L1B; and numbering to the residues of the parent compound, Ac-FALKme3S-NH2.

Figure 3.2. Competitive FP assay and ITC data of the parent peptide Ac-FALKme3S-NH2.

Figure 3.3. Design of the Ac-FALKme3S-NH2 from the backbone of H3K27me3 peptide and interpreting the key molecular recognition elements of the complex Ac-FALKme3S-NH2–Cbx7.

Figure 3.4. X-ray co-crystal structure of Ac-FAYKme3S-NH2–Cbx7 complex and analysis of structural cues of the complex to design potent and selective inhibitors for Cbx7.

Figure 3.5. Binding data of Cbx7–H3K27me3 complex with and without Mg2+ in the FP buffers.

Figure 3.6. The final set of peptidic Cbx7 inhibitors bearing multiple functional group replacements.

Figure 4.1. Selected Cbx7 inhibitors and corresponding control compounds for biological studies.

Figure 4.2. Treatment of MMTV-myc-TS cells with Cbx7/Cbx8 inhibitors inhibited the TS growth, and the Western blot data analysis indicated that the inhibitors competed for chromatin bound-Cbx7 and Cbx8.

Figure 4.3. Structural similarities and differences among the chromodomains of HP1 and polycomb paralogs.

Figure 4.4. Design of potent and selective Cbx7 inhibitors with N-terminus and Kme3 substitutions on the parent scaffold (4.1).

Figure 4.5. The structure of the probe peptide used for second-generation competitive FP assays.

Figure 4.6. Binding data of compound 4.1 against a panel of Cbx proteins.

Figure 4.7. Microarray results indicate that biotinylated compounds bind two types of chromodomain families, Cbx and CDY.

Figure 4.8. Analytical LC-MS traces of the purified compounds, 4.1 and 4.14.

Figure 4.9. The general mechanism of Mtt deprotection reaction on-resin.

Figure 4.10. Monitoring of diethylation reaction on the lysine NH of the peptide attached on-resin using LR-ESI-MS.

Figure 4.11. Understanding the binding affinities of inhibitors using molecular model.

Figure 5.1. Chemical structures of selected peptidic Cbx antagonists and X-ray co-crystal structures of UNC3866 with Cbx7 and Cbx8.

Figure 5.2. Co-crystal structures of Cbx7 with MS37452, MS351, and Ac-FAYKme3S-NH2 (1.1) reveal small molecules occupy fewer binding subsites of Cbx7 in comparison to peptidic inhibitors.


List of Schemes

Scheme 2.1. Synthetic routes to prepare modifications of basic amine and altering aliphatic linker of nicotinamide/sulfonamide scaffolds.

Scheme 2.2. Synthesis of Suzuki-coupled products of nicotinamide/sulfonamide scaffolds.

Scheme 2.3. Synthetic route to prepare fragment B.

Scheme 2.4. Synthesis of fragment C.

Scheme 2.5. Synthesis of fragment D.

Scheme 2.6. Synthesis of fragment E.

Scheme 2.7. Synthesis of fragment F.

Scheme 3.1. Synthetic routes to prepare Chapter 3 peptidomimetic inhibitors.

Scheme 4.1. Synthesis of 3.59 and 3.11-unme.

Scheme 4.2. Synthetic routes to prepare peptidomimetics 4.1 and 4.3–4.13.


Abbreviations

Abu 2-aminobutyric acid

Ac acetyl

Ac2O acetic anhydride

AlphaScreen amplified luminescent proximity homogeneous assay screen

Arg arginine

aRme2 asymmetric dimethylarginine

BCLAF1 BCL2-associated transcription factor 1

BPTF bromodomain PHD finger transcription factor

Bn-H3K9me1 biotinylated-H3K9me1

BSA bovine serum albumin

Cbx chromobox homolog

Cbx1 chromobox homolog 1

Cbx2 chromobox homolog 2

Cbx3 chromobox homolog 3

Cbx4 chromobox homolog 4

Cbx5 chromobox homolog 5

Cbx6 chromobox homolog 6

Cbx7 chromobox homolog 7

CDY chromodomain Y chromosome

CHD4 chromodomain-helicase-DNA binding domain protein 4

CH3I methyl iodide

Chromodomain chromatin organization modifier domain

CpG cyclopentyl glycine

D-score druggability-score

DCM dichloromethane

DIPEA N,N-diisopropylethylamine

DMF dimethylformamide

DMSO dimethylsulfoxide

DNA deoxyribonucleic acid

DNMT3A DNA methyltransferase 3 Alpha

dPc Drosophila polycomb protein

E2F1 E2F transcription factor 1

ERα estrogen receptor alpha

EED embryonic ectoderm development protein

ESI electrospray ionization

EtOAc ethyl acetate

EZH2 enhancer of zeste homolog 2

FDA The Food and Drug Administration

FITC fluorescein isothiocyanate

Fmoc fluorenylmethyloxycarbonyl

FP fluorescence polarization


GPCRs G protein-coupled receptors

H2A histone-2A

H2B histone-2B

H3 histone-3

H4 histone-4

H3K4me3 trimethylated lysine 4 on the histone-3 tail

H3K27me3 trimethylated lysine 27 on the histone-3 tail

H3K36me3 trimethylated lysine 36 on the histone-3 tail

H3K79me3 trimethylated lysine 79 on the histone-3 tail

H4K20me3 trimethylated lysine 20 on the histone-4 tail

HBTU N,N,N’,N’-tetramethyl-O-(1H-benzotriazol-1-yl)uronium hexafluorophosphate

HCTU O-(1H-6-chlorobenzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate

HDAC histone deacetylase

HP1 heterochromatin protein 1

HPLC high-performance liquid chromatography

HOBt 1-hydroxybenzotriazole

HR-ESI-MS high resolution electrospray ionization mass spectrometry

HTS high-throughput screen

ITC isothermal titration calorimetry

K lysine

Kme1 monomethyllysine

Kme2 dimethyllysine

Kme3 trimethyllysine

Kd dissociation constant

LR-ESI-MS low resolution electrospray ionization mass spectrometry

L3MBTL1 lethal (3) malignant brain tumor-like protein 1

L3MBTL3 lethal (3) malignant brain tumor-like protein 3

MBT domain malignant brain tumor domain

MBTD1 MBT domain containing protein 1

MD simulations molecular dynamics simulations

P phosphorylation

Pc polycomb

PDB protein data bank

PEG polyethylene glycol

Ph phenyl

PHD plant homeodomain

PHF20L1 PHD finger protein 20-like protein 1

PRC polycomb repressive complex

PRC1 polycomb repressive complex-1

PRC2 polycomb repressive complex-2

PTM post-translational modification

PyBOP benzotriazol-1-yl-oxytripyrrolidinophosphonium hexafluorophosphate


PWWP Pro-trp-trp-pro domain

NF-κB nuclear factor kappa-light-chain-enhancer of activated B cells

NMR nuclear magnetic resonance spectroscopy

R arginine

Rb retinoblastoma

Rme1 monomethylarginine

RNA ribonucleic acid

RNAi RNA interference

r.t. room temperature

SAR structure-activity relationship

SETDB1 histone-lysine-N-methyltransferase SETDB1 protein

sRme2 symmetric dimethylarginine

shRNA small hairpin RNA

siRNA small interfering RNA

SPF30 splicing factor spf30

STAT3 signal transducer and activator of transcription 3

STD NMR saturation-transfer difference NMR

TLC thin layer chromatography

TFA trifluoroacetic acid

TRIM33 tripartite motif-containing 33 protein

WD40 tryptophan-aspartic acid repeat domain

WDR5 WD repeat-containing protein 5


Acknowledgements

Om Namah Shivaya

I respectfully bow to the creator of the universe, Lord Shiva, for giving me a splendid role and purpose.

First, I would like to express my deepest gratitude to my supervisor Dr. Fraser Hof for giving me an opportunity as a Ph.D. student in his lab and always being motivating and encouraging throughout my entire Ph.D. studies. Especially, I thank Fraser for his continuous support and invaluable guidance during the tough times of my research work and also in the writing of my manuscripts/dissertation. I greatly expanded my critical thinking and scientific rigor under Fraser’s mentorship. I have truly been blessed with good karma from past/present life, which paved the way to work with the brilliant Dr. Hof.

I would like to thank my committee members, Dr. Cornelia Bohne, Dr. Lisa Rosenberg, Dr. Caroline Cameron, and Dr. Dustin Maly for their valuable feedback on my dissertation. I am also very grateful to Dr. Jeremy Wulff for providing amazing guidance during my courses and research work; and let's also hope for favorable reference letters...

I would like to thank the UVic technical staff, UVic TA coordinator, and great people I worked with. Thank you so much Dr. Ori Granot and Chris Barr for your inputs, which made my work faster. Monica Reimer and Dr. Peter Marrs, you certainly made my TAing life a whole lot easier. Also, I owe a great debt of gratitude to the people of my beloved motherland Bharat (India), Dr. Pusuluri Srinivas (GVK Biosciences), Dr. Bal Raju (Inogent laboratories Ltd), Dr. Anima Boruah (Aurigene discovery Technologies), K. Chinnam Naidu, and other beautifully hearted industry colleagues for instilling the encouragement and confidence that allowed me to enter my Ph.D. studies.

I would like to thank my peers and fellow students. Thank you very much Alok, Ronan, Raghav, Manoj, Meagan, Mehrave, and Natasha for editing parts of my dissertation. My special thanks to Alok, for thoroughly editing nearly the entire dissertation; I sincerely appreciate your patience and hard work. I thank all past and current Hof group members for creating a friendly and enjoyable working environment; you all made my days joyous. Special mentions to Rebecca, Kevin, Graham, and Mike for sharing their valuable scientific expertise. I extend my gratitude and best wishes to my workplace sisters, Sara Tabet and Aman, and my in-lab biochemistry tutors, Amar and Janessa. Thank you to the Wulff group (Natasha, Emma, Jason, Mike, Jun, Rhonda, Corey, Genny, Aiko, Anuj, and others) for the fun and friendly chats.

Most importantly, I want to extend warmest regards to my wonderful family. Mom, Dad, Srinu annayya, Prakash uncle, Jayashree pinni, Srinu bava, in-laws (Hari Prasad mavayya and Kalyani atthayya), and Ram Prasad mavayya, without your unconditional love and moral support I definitely wouldn’t have fulfilled my educational dreams. I am truly indebted and grateful to all of you. Mom, Dad, and Prakash uncle, you are truly inspirational and I know that you are all very proud of me! I also want to thank my like-minded sister Kalyani, and extended family, including Sweta vadina and other cousins, for their kindness unto me.

To my beautiful wife Bhargavi and my adored son Renesh Chandra (“Raja babu”), my love for both of you extends far beyond the love I have for chemistry. Words cannot express how thankful I am to both of you for your unwavering support. Raja babu, the moment we knew of you, we instantly fell deeply in love with you. Your nanna and amma’s unconditional love is always with you my son! Even though many months went by when we were continents apart, the immeasurable love that we share kept me focused throughout our educational journey.


Dedication

This dissertation is dedicated to my beloved family and my uncle Prakash's family for their sacrifices and extensive support throughout my studies and life.


Chapter 1. Overview of targeting epigenetic methyl readers with chemical antagonists

Portions of the work (section 1.4.2) I present in this chapter have been adapted from a publication where I made contributions as first author.

Simhadri, C.; Gignac, M. C.; Anderson, C. J.; Milosevich, N.; Dheri, A.; Prashar, N.; Flemmer, R. T.; Dev, A.; Henderson, T. G.; Douglas, S. F.; Wulff, J. E.; Hof, F., Structure–Activity Relationships of Cbx7 Inhibitors, Including Selectivity Studies against Other Cbx Proteins. ACS Omega 2016, 1 (4), 541–551.

Link to the original paper: http://dx.doi.org/10.1021/acsomega.6b00120

“Adapted with permission. Copyright (2016) American Chemical Society.” “This is an unofficial adaptation of an article that appeared in an ACS publication. ACS has not endorsed the content of this adaptation or the context of use.”

1.1 Introduction

1.1.1 Writer, eraser, and reader proteins control chromatin architecture

One of the key mechanisms of controlling chromatin is through PTMs on histones. Chromatin, the compacted state of genetic information (DNA) in the nucleus of a cell, is an array of repeating units called nucleosomes. A nucleosome consists of ~146 base pairs of DNA wrapped around an octamer of core histones; the octamer is made of a tetramer of histone-3–histone-4 (H3–H4) and two histone-2A–histone-2B (H2A–H2B) dimers (Figure 1.1).1 Nucleosomes contain protruding unstructured histone N-terminal tails, which are often post-translationally modified. These PTMs are called epigenetic marks as they control gene expression without changing the underlying DNA sequence. Several PTMs have been identified on different residues of core histones, including methylation, acetylation, phosphorylation, ubiquitylation, sumoylation, glycosylation, and others (Figure 1.1a).1

Writer, eraser, and reader proteins are three classes of proteins/enzymes that add, remove, and recognize PTMs on histone tails, respectively (Figure 1.1b). The orchestrated action of these protein classes defines chromatin structure and dynamics and influences gene expression by various mechanisms. Abnormalities and dysregulation of these protein classes are often associated with disease conditions.2 Pharmacological intervention against these protein classes has been a growing subject of research. Some of the writers (lysine methyltransferases and acetyltransferases) and erasers (histone demethylases and deacetylases) are classified as validated therapeutic targets. Four different small molecule inhibitors targeting these protein classes have become FDA-approved drugs, and several more are in ongoing clinical trials.3 Readers, particularly acetyllysine reader proteins (bromodomains), have been successfully targeted with small molecule inhibitors, and several bromodomain inhibitors are the subject of ongoing clinical trials for diverse malignancies, inflammation, atherosclerosis, and diabetes.3c, 4 Despite the relatively high abundance of structural data and biological information on methyllysine reader proteins, development of chemical inhibitors against them has lagged behind that of other PTM-related protein classes.


Figure 1.1. Nucleosome, PTMs on H3 tail, and a cartoon representation of epigenetic writer, eraser, and reader proteins. a) Left: nucleosome, which is composed of DNA and histone octamer. Histone octamer is made of a tetramer of H3-H4 and two H2A-H2B dimers. All the four core histones have protruding tails and are the subjects of several PTMs. Right: PTMs on histone-3 tail are highlighted. PTMs including methylation (Me), acetylation (Ac) and phosphorylation (P) are specified on different lysine (K) and arginine (R) residues of histone-3. Methyl marks in green and blue are to indicate the positions of transcriptional repression and activation, respectively. General reader proteins of the PTMs are given in the figure. b) Cartoon depiction of writer, eraser and reader proteins on the H3 tail. The structural picture of the nucleosome is generated using a protein data bank (PDB) code 1AOI.5

1.1.2 Methylation status of lysine/arginine affects the recruitment of readers

Methylation can occur on multiple sites on histones and as different side-chain PTM states. Methyl marks are installed by methyltransferase enzymes on the side chains of basic amino acid residues, and the marks are removed by demethylase enzymes. Side chains of lysine (K) residues can be mono-, di-, or tri-methylated at the ε-amino group (Figure 1.2). Arginine (R) side chains can be mono-, symmetric di-, or asymmetric di-methylated on their guanidinyl group (Figure 1.2). Methylation on lysine/arginine residues of non-histone proteins such as p53, transcription factors, E2F1, Rb, ERα, NF-κB, STAT3 and histone/DNA writer enzymes (KMT1C/DNMT1) has also been reported.6

Figure 1.2. Different methylation states of lysine and arginine at physiological pH. a) Lysine (K) can be mono-, di-, or tri-methylated on the ε-amino group. b) Arginine (R) can be mono-, symmetric di-, or asymmetric di-methylated on the guanidinyl group.

Specific lysine methylation patterns, including the methylation status and position of a particular amino acid that is methylated, drive different biological outcomes such as activation or repression of gene expression.7 For example, trimethylated lysine 4 on histone-3 (H3K4me3), H3K36me3, and H3K79me3 marks result in transcriptional activation, whereas H3K9me3, H3K27me3, and trimethylated lysine 20 on histone-4 (H4K20me3) result in transcriptional repression (Figure 1.1a). In a normal cell, epigenetic regulators (writers and erasers) control methylation status at any given site. Multiple lines of evidence suggest that aberrant lysine methylation is linked to diseases such as ageing and cancer.1c How can one or more small methyl groups dictate such large biological outcomes? A methyl PTM works as a signal to recruit suitable reader protein complexes. The resulting downstream protein–protein interactions influence the accessibility of DNA to transcription factors, which in turn regulate gene expression.8
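The mark notation used in this paragraph (histone, residue, position, methylation state) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names `HistoneMark`, `parse_mark`, and `outcome` are hypothetical helpers, the lookup table contains only the six trimethyllysine marks named above, and the symmetric/asymmetric distinction for dimethylarginine is not modeled.

```python
import re
from typing import NamedTuple, Optional


class HistoneMark(NamedTuple):
    """Hypothetical container for one methyl mark, e.g. H3K27me3."""
    histone: str       # core histone: "H2A", "H2B", "H3", or "H4"
    residue: str       # "K" (lysine) or "R" (arginine)
    position: int      # residue number on the histone
    methyl_count: int  # number of methyl groups (1, 2, or 3)


# Transcriptional outcomes for the marks named in the text (Figure 1.1a).
# Marks not discussed in the text are deliberately absent.
TRANSCRIPTIONAL_OUTCOME = {
    "H3K4me3": "activation",
    "H3K36me3": "activation",
    "H3K79me3": "activation",
    "H3K9me3": "repression",
    "H3K27me3": "repression",
    "H4K20me3": "repression",
}

_MARK_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)me([123])$")


def parse_mark(name: str) -> HistoneMark:
    """Split a mark name such as 'H3K27me3' into its components."""
    match = _MARK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized methyl mark: {name!r}")
    histone, residue, position, count = match.groups()
    if residue == "R" and int(count) == 3:
        # Arginine is only mono- or di-methylated (Figure 1.2b).
        raise ValueError("arginine cannot be trimethylated")
    return HistoneMark(histone, residue, int(position), int(count))


def outcome(name: str) -> Optional[str]:
    """Return the outcome stated in the text, or None if not covered."""
    parse_mark(name)  # validates the name before the lookup
    return TRANSCRIPTIONAL_OUTCOME.get(name)
```

For example, `parse_mark("H3K27me3")` yields histone H3, residue K, position 27, and three methyl groups, and `outcome("H3K27me3")` returns the repression outcome described above.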

1.1.3 Common structural motifs of methyllysine readers

More than 200 methyl lysine/arginine reader proteins have been identified in the human proteome.9 Methyllysine readers are categorized into three major families based on the structural characteristics of the domain that is responsible for binding methylated lysine residues on histones.8b These families are plant homeodomain (PHD) containing proteins, Trp-Asp 40 (WD40) repeat domain containing proteins, and the so-called Royal family, which is composed of the subfamilies of chromodomain (chromatin organization modifier domain), Tudor, Pro-Trp-Trp-Pro (PWWP), Agenet, and malignant brain tumor (MBT) domain containing proteins. A recent review described the physicochemical basis for recognition between methyllysine reading domains and their binding partners.8a

Despite other structural differences, almost all methyllysine readers use a binding pocket called an “aromatic cage” to recognize the methylated lysine side chains of histones. The notable exceptions to this are PHD domains of CHD4 and TRIM33, which do not have an aromatic cage in their structures.10 An aromatic cage can be defined as a lining of 2–4 electron-rich aromatic residues arranged in the shape of a binding pocket. The interactions between the aromatic cage of a methyllysine reader and its cognate methyllysine-containing protein have been characterized as a combination of cation-π interactions, van der Waals interactions, and hydrophobic contacts.11 At physiological pH, all three methyllysine states carry a positive charge (Figure 1.2). The cation-π interactions take place between the positive charge on the methylammonium cation and the partial negative charge on the faces of the aromatic residues.8a Favorable hydrophobic and van der Waals interactions arise when the hydrophobic methylammonium group binds to the aromatic cage and displaces the high-energy waters that occupy it. In general, methyllysine readers achieve specificity in recruiting suitable methyllysine-containing protein partners by discriminating among the properties of the different lysine methylation states, such as size, hydrophobicity, distribution of positive charge, and hydrogen bond donating capability (Figure 1.2). They also recognize and bind residues located near the methyllysine using structural elements adjacent to the aromatic cage. The size and composition of the methyllysine binding pocket of a reader protein play a critical role in accommodating different methylation states. For example, the PHD domain of BPTF can accommodate the higher methylation state (Kme3), whereas the binding pocket of L3MBTL1, which is a Kme1/Kme2 reader, is too small to accommodate a Kme3 side chain (Figure 1.3). Reader proteins that read lower methylation states (Kme1/Kme2) contain one or two acidic residues that are spatially close (~6 Å) to the aromatic cage. The electrostatic interaction and hydrogen bond between the carboxylate side chain of the acidic residue and a free NH of the methylammonium group give directionality and specificity to recognition.8a, 12

Substrates containing the higher methylation state, Kme3, bind to methyl readers with wide and shallow binding pockets; this is called a surface-recognition mode (Figure 1.3a).8a In this mode, the trimethylammonium group of the substrate binds the aromatic cage, and the neighboring substrate residues participate in favorable interactions with the residues that are present in the shallow binding pocket of the protein (see electrostatic surface view in Figure 1.3a). The lower methylation states, Kme1 and Kme2, bind to methyl readers with a narrow and deep binding pocket, described as a cavity-insertion mode (see electrostatic surface view in Figure 1.3b and Figure 1.3c).8a In this mode, the methyllysine side chain participates in cation-π and hydrophobic interactions with the aromatic cage of the reader protein, along with a salt-bridge interaction with acidic residues (Asp/Glu). Other residues of the substrate are less important for binding affinity.


Figure 1.3. Examples of two types of recognition modes that are commonly found in the complexes of Royal family of methyllysine readers with their binding partners. Three co-crystal structures of methyllysine reader–natural binding partner complexes are presented in the format: protein domain with binding partner, corresponding aromatic cage with methyllysine side chain, and electrostatic surface view of the binding pocket. The protein is shown in either cartoon or surface representation and the ligand in sticks. Hydrogen bonds are depicted as yellow dots. a) PHD type-2 domain of BPTF–H3K4me3 (PDB code 2F6J). b) MBT domain 2 of L3MBTL1–H4K20me2 (PDB code 2RJF). c) MBT domain 2 of L3MBTL1–Kme1 (PDB code 2RHY). L3MBTL1 contains three MBT domains in its structure, but for clarity MBT domains 1 and 3 are omitted in b) and c).

1.1.4 Chemical vs. genetic inhibition of reader protein functions

As the recognition of a specific methyl mark by a particular reader protein is crucial for a defined biological outcome, inhibiting reader proteins is a possible route to therapeutic intervention for the recognition-linked diseases. Understanding the physiological basis for the recognition would be helpful to create inhibitors and chemical probes. In my thesis, I use “inhibitors” as a general term for peptidic or non-peptidic synthetic small molecules that block a biological target. “Chemical probes” have been defined as small molecules that inhibit a biological target and that have certain characteristics: development through reliable structure-activity relationship (SAR) studies, <100 nM biochemical potency, >30-fold selectivity over the proteins that are closely related to the target, known selectivity data against general off-target proteins and drug discovery related protein families, and proven on-target cellular effects (modulation of a relevant phenotype) with a proven mechanism of action in cells at <1 µM concentration.13 High-quality chemical probes for a specific reader domain would reveal the target’s functional roles in a cell of interest and validate its relevance to a disease phenotype and its therapeutic exploitability with small molecules.13
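The chemical probe criteria listed above can be written as a simple checklist. The following Python sketch is illustrative only: the `Candidate` record and its field names are hypothetical, the use of a Kd value as the measure of biochemical potency is an assumption, and fold selectivity is taken here as the ratio of the Kd for the closest related protein to the Kd for the target.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the text for a high-quality chemical probe.
MAX_BIOCHEMICAL_POTENCY_NM = 100.0  # potency must be below 100 nM
MIN_FOLD_SELECTIVITY = 30.0         # selectivity must exceed 30-fold
MAX_CELLULAR_CONC_UM = 1.0          # cellular effect must occur below 1 uM


@dataclass
class Candidate:
    """Hypothetical record of the data collected for one compound."""
    name: str
    target_kd_nm: float             # Kd against the intended target (nM)
    closest_relative_kd_nm: float   # Kd against the closest related protein (nM)
    cellular_active_conc_um: float  # concentration giving on-target effect (uM)
    has_sar: bool                   # supported by a reliable SAR study
    has_off_target_panel: bool      # selectivity data on general off-targets
    has_cellular_mechanism: bool    # mechanism of action shown in cells


def fold_selectivity(c: Candidate) -> float:
    """Ratio of off-target Kd to target Kd; larger means more selective."""
    return c.closest_relative_kd_nm / c.target_kd_nm


def unmet_probe_criteria(c: Candidate) -> List[str]:
    """List the criteria from the text that the candidate does not meet."""
    unmet = []
    if not c.has_sar:
        unmet.append("SAR")
    if not c.target_kd_nm < MAX_BIOCHEMICAL_POTENCY_NM:
        unmet.append("potency")
    if not fold_selectivity(c) > MIN_FOLD_SELECTIVITY:
        unmet.append("selectivity")
    if not c.has_off_target_panel:
        unmet.append("off-target data")
    if not (c.has_cellular_mechanism
            and c.cellular_active_conc_um < MAX_CELLULAR_CONC_UM):
        unmet.append("cellular activity")
    return unmet
```

A compound with invented values of 50 nM against the target and 2000 nM against its closest relative would be 40-fold selective and would pass the potency and selectivity checks; these numbers are made up for illustration and do not describe any compound in this thesis.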

Pharmacological inhibition of a target is fundamentally different from RNAi-mediated inhibition/silencing of the target.14 RNA-mediated silencing by either siRNA or shRNA has been used to generate a loss-of-function phenotype of a target of interest in cells or whole organisms. Both techniques temporarily reduce expression of the whole target (protein/enzyme) and provide no precise control over the individual domains of the target. In contrast, pharmacological inhibition with an inhibitor or chemical probe can selectively modulate a particular domain of interest while keeping the target protein physically intact. For example, a selective inhibitor of a methyllysine reader should bind only the methyllysine reading domain and inhibit its interaction with its cognate partner in cells. As all methyllysine readers are multi-domain proteins and each domain exhibits distinct functions, removal of the whole protein by genetic silencing might lead to different biological consequences (scaffolding effects) than would be produced by a reader domain-targeting drug. Thus, in the case of reader proteins, pharmacologic and genetic inhibition approaches can induce different phenotypes. Pharmacological inhibition is therefore more relevant to drug development against reader proteins than RNAi-mediated silencing. At the same time, the two approaches complement each other in validating the target’s biological significance, especially in distinguishing scaffolding effects from effects that arise from the reader function itself.

1.2 Goals of this chapter

The goals of this chapter are to present

1. An overview of selected biochemical and biophysical assays that determine binding strengths of reader protein–ligand interactions.

2. A case study of the discovery of a chemical probe UNC1215 for MBT domain containing reader L3MBTL3.

3. My thesis hypothesis.

Conventions used throughout my thesis: i) The terms “antagonist” and “inhibitor” are used interchangeably, as both have the same meaning in the context of methyllysine binders. ii) Unless otherwise mentioned, all binding assays were performed with only the recombinant reader domains of the specified proteins. Thus, all protein names mentioned in the discussion of these assay results refer only to the reader domains of those proteins. For example, the binding assays for L3MBTL1 were performed with the recombinant MBT tandem repeat domain of L3MBTL1, but in the discussion of the binding assay results I refer to it as “L3MBTL1” instead of “MBT tandem repeat domain of L3MBTL1”.

1.3 Biochemical and biophysical assays of reader protein–ligand interactions

Depending upon the information sought (qualitative and/or quantitative), a number of biochemical and biophysical techniques are available to determine the binding affinity of a ligand for its target. In general, quantitative techniques report “how strong” or “how fast” numerically, so that datasets can be compared quantitatively.15 Qualitative techniques, in contrast, provide initial observations on binding, and the results must be validated further with a suitable biophysical technique to assess the binding quantitatively. Biophysical studies of methyl reader proteins have largely relied on fluorescence polarization (FP) assays, the amplified luminescent proximity homogeneous assay screen (AlphaScreen), and isothermal titration calorimetry (ITC). In the next sub-sections, I describe the design principle of qualitative microarrays, as well as FP, AlphaScreen, and ITC, using one literature example for each technique.

1.3.1 Microarrays

Microarrays are commonly used in a high-throughput screening (HTS) fashion to find molecular targets for small molecules, or small molecules for molecular targets, across different types of biomolecular interactions.16 Here I describe the design principles of peptide microarrays and protein-domain microarrays, each with one literature example focused on defining qualitative binding interactions between reader domains and their recognition partners. Both types of microarray replace hundreds of individual pair-wise in vitro recognition experiments with a single experiment. The design principles of peptide and protein microarrays are very similar: peptides or proteins are immobilized, probed with the protein or peptide of interest, and positive binding interactions are detected using western blot methods (Figure 1.4). Researchers have developed different chemical methods to immobilize a peptide/protein onto a membrane and several detection techniques that use immuno-reactive tags and conjugates.16

Peptide microarrays are a valuable tool for identifying which peptides, in which PTM states, are bound by a given reader protein. They can also be used to determine SAR for binding of a given reader protein. Candidate peptides, including those with different PTM marks, can be either synthesized directly on a modified cellulose membrane or synthesized separately and then manually spotted onto the membrane (Figure 1.4a). After the peptides are fixed on the membrane, the membrane is blocked with BSA or milk protein to reduce non-specific interactions. This is followed by incubation with the recombinant protein of interest, which carries an immuno-reactive tag, and then by extensive washing with buffer to remove excess unbound protein. The membrane is then incubated with a suitable primary antibody, washed, and incubated with a secondary antibody carrying either a fluorescent label or a chemiluminescent tag to visualize positive binding interactions (Figure 1.4a).

Figure 1.4. Schematic of peptide- and protein-domain microarrays. a) Peptide microarray: Peptides with various PTMs, distinctly methylated, and unmethylated peptides are immobilized on a modified cellulose membrane in 4 row x 4 column format. Steps: i) blocking with BSA/milk protein, ii) incubation with protein of interest, iii) incubation with 1° antibody, iv) incubation with fluorophore-tagged 2° antibody, and v) detection with a suitable detector. Between each step, extensive buffer washes are required. b) Protein-domain microarray: eight protein domains are spotted as duplicates on a nitrocellulose membrane in 4 row x 4 column format. Steps: i) and v) are same as above, and vi) incubation with fluorescent-labeled binding partner of interest. Green dots represent peptide/protein spotted areas whereas gray dots indicate positive binding interactions. The figure is adapted from a publication.17

Figure 1.4 image labels: array spots are marked K, Kme1, Kme2, Kme3, KAc, Rme1, and SPh. Legend: HIS = His6-tagged protein of interest; the remaining symbols denote the peptide with Kme1 mark, the linker, the fluorophore-conjugated biotinylated peptide, the 1° antibody, and the fluorophore-tagged 2° antibody.

Figure 1.5. Histone peptide microarray probed with Cbx7 suggests a consensus binding motif for designing alternative binding peptides for Cbx7. A peptide microarray containing 228 peptides on a blot was screened against the chromodomain of Cbx7. The blue rectangle indicates that peptides made by substituting the alanine at A25 (the (–2) position relative to Kme3) with any other amino acid do not bind to Cbx7. The red circle and green ellipsoid represent the A24F replacement peptide and the A24T replacement peptide, respectively. Adapted with permission from Reference 18.

In the example in Figure 1.5, peptide microarray data reveal variants of the H3K27me3 peptide sequence that can bind to the chromodomain of chromobox homolog 7 (Cbx7).18 Cbx7 is a chromodomain-containing methyl reader protein that recognizes the H3K27me3 mark (see below). The peptide array consists of the H3K27me3 peptide (residues 21–33 of histone 3, with Kme3 at position 27) and peptides generated by substituting each of the residues upstream and downstream of Kme3 in the H3K27me3 sequence (Figure 1.5, horizontal series) with the 20 L-amino acids (Figure 1.5, vertical series). The resulting 228 peptides were synthesized directly on a polyethylene glycol (PEG) coated cellulose membrane using an automated high-throughput peptide synthesizer. This microarray was probed with the His6-tagged chromodomain of Cbx7, followed by Western blot detection with an anti-His6 antibody. Positive binding interactions showed up as black spots. The peptide microarray data revealed several clues for designing novel alternative peptides for Cbx7.

Figure 1.5 legend: H3 sequence (aa 21–33) = Ac-ATKAARKme3SAPATG-OH; A24F replacement peptide = Ac-ATKFARKme3SAPATG-OH.

Ala25, the (–2) residue relative to Kme3, is crucial for binding, as no other amino acid was tolerated at this position (see the blue rectangle in Figure 1.5). Binding of H3K27me3 improves when Ala24, the (–3) residue relative to Kme3, is substituted with hydrophobic residues (C/I/L/F/W/Y/V). In the same way, the microarray provided qualitative observations for each position of the H3K27me3 peptide sequence, showing which residues are tolerated or improve binding. Thus the consensus binding motif A(R/I/L/F/Y/V)Kme3(S/T) was derived. These observations later helped in designing peptidic inhibitors of Cbx7 (Chapter 3 of this thesis).19 Note that spot intensities in the blot do not necessarily correlate quantitatively with binding affinity.
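The way such a positional-scanning library is laid out can be illustrated with a short enumeration. This is a sketch under stated assumptions, not the synthesis protocol of the cited work: it assumes that the Kme3 position is held fixed and that only the 19 non-wild-type residues are counted at each of the other 12 positions, which reproduces the 228 peptides mentioned in the text. The function and variable names are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard L-amino acids, one-letter codes

# H3 residues 21-33; "Kme3" marks trimethyllysine at position 27.
WILD_TYPE = ["A", "T", "K", "A", "A", "R", "Kme3", "S", "A", "P", "A", "T", "G"]
FIRST_RESIDUE = 21


def positional_scan(wild_type, first_residue):
    """Return {variant_name: sequence} for every single-residue substitution.

    The Kme3 position is kept fixed, and substitutions identical to the
    wild-type residue are skipped (an assumption made in this sketch).
    """
    variants = {}
    for i, original in enumerate(wild_type):
        if original == "Kme3":
            continue
        for aa in AMINO_ACIDS:
            if aa == original:
                continue
            mutant = list(wild_type)
            mutant[i] = aa
            name = f"{original}{first_residue + i}{aa}"  # e.g. A24F
            variants[name] = "".join(mutant)
    return variants


library = positional_scan(WILD_TYPE, FIRST_RESIDUE)
print(len(library))      # 12 positions x 19 substitutions
print(library["A24F"])   # phenylalanine at the (-3) position relative to Kme3
```

Naming each variant by wild-type residue, position, and replacement (for example A24F) matches the notation used in the Figure 1.5 caption.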

Protein-domain microarrays are a powerful technique for profiling the specificity and selectivity of a peptide or small molecule against hundreds of protein/enzyme domains. The general steps in making and using a protein microarray are: spot the recombinant protein domains of interest on a nitrocellulose-coated glass slide, block with BSA/milk protein, incubate with a solution of the fluorescently labeled binding partner of interest, wash to remove excess molecules, and detect (Figure 1.4b). One limitation of this technique is that the binding pocket or the folding of a spotted protein domain may be altered, in which case no positive binding interaction is detected (a false negative result).

To identify H3K4me1 and H3K9me3 binding reader domains, chromatin-associated protein domains that are annotated as PTM reader proteins were arrayed on a nitrocellulose-coated glass chip and incubated with Cy3-streptavidin bound biotin-H3K4me1 and Cy3-streptavidin bound biotin-H3K9me3 peptides, respectively (Figure 1.6).20 Positive binding interactions showed up as green fluorescent spots. The protein-domain microarray results indicate novel binding interactions. H3K4me1 binds to the MBT domain containing proteins CGI-72 and L3MBTL1. The H3K9me3 mark binds chromodomain-containing proteins such as CDY1 and the HP1 proteins HP1β (Cbx1), HP1γ (Cbx3), and HP1α (Cbx5). In addition to the binding data, clear discrimination between methylation states by the methyl reader domains was observed.


Figure 1.6. Protein-domain microarray results indicate that the MBT domain containing proteins CGI-72 and L3MBTL1 bind the H3K4me1 mark, and the chromodomain containing proteins CDY and HP1 bind the H3K9me3 mark. a) Layout of the protein-domain microarray. Nitrocellulose coated glass slide (right) composed of 12 grids in a 2 row x 6 column pattern (left); each grid features spotting areas for the array in 5 row x 5 column format (right). Each grid was arrayed with duplicates of the protein domains and one negative control, GST alone (M). b) List of all 109 arrayed GST fusion protein domains. The protein domains are grouped by family and each fusion protein is indicated with its position on the slide. c) Probing the array with Cy3-labeled H3K4me1 peptide (top) and Cy3-labeled H3K9me3 peptide (below). Positive binding interactions (green dots) are the MBT domain containing proteins CGI-72 (orange rectangle) and L3MBTL1 (yellow rectangle), and the chromodomain containing proteins CDY1 and HP1 proteins (white square). The Cy3-labeled peptides were prepared by treating the biotinylated peptides with commercially available Cy3-streptavidin, followed by incubation with biotin-agarose beads to remove free Cy3-streptavidin label.21 The figure is adapted from Reference 20; link to the copyright: https://s100.copyright.com/CustomerAdmin/PLF.jsp?ref=4192628b-f61b-419d-a0cc-f8c4aa3fe7c1

1.3.2 Fluorescence Polarization assay

FP is a powerful technique for studying a variety of biomolecular interactions including protein-protein, protein-small molecule, protein-DNA, and others. FP has been adapted in several HTS campaigns to find suitable small molecules for several drug targets.22 The FP assay is based on two physical principles:23 i) Excitation of an immobile fluorophore with polarized light leads to emission of polarized light whose intensity is higher in the plane of the incident excitation light than in the two perpendicular planes.23b ii) If a fluorophore tumbles during its excited-state lifetime, the emitted light is depolarized to some degree, and the remaining polarization is inversely related to the speed of fluorophore tumbling. The speed of tumbling depends in turn on the size of the fluorophore (Figure 1.7a). FP is one measure of the polarization of the emitted light and is defined in Equation 1.1, where I∥ and I⊥ are the measured intensities of emitted light in the parallel and perpendicular planes, respectively, with respect to the polarization of the incident excitation light. When the fluorescent ligand is bound to a high molecular weight protein, the resulting higher molecular weight complex tumbles more slowly and consequently emits more polarized light in the same plane as the incident excitation light. This increase in I∥ results in a higher FP value (Figure 1.7a and Equation 1.1). Unlabeled molecules, unless they contain natural fluorophores, do not emit polarized light and hence give no FP signal (Figure 1.7a). Thus any FP value obtained from a molecular binding event is a weighted average of the FP signal from the protein-bound fluorescent ligand fraction (high FP value) and the free fluorescent ligand fraction (low FP value).23c

Figure 1.7. Schematic of FP assay principle and selected literature examples to demonstrate the determination of Kd and IC50 from direct FP and competitive FP. a) FP assay principle. A: Tumbling of free and bound species during excitation of the binding assay sample. Arrows indicate speed of tumbling. The free fluorescent ligand tumbles faster and its emission is more depolarized; the (protein + fluorescent ligand) complex tumbles more slowly and its emission remains polarized; the protein alone gives no emission. B: Detection of polarized emission in parallel and perpendicular planes (I∥ and I⊥) through a dual emission beam splitter with horizontal and vertical polarization filters. The figure is adapted from Reference.23a b) Left: Kd of compound 3-FAM to EED is 1.1 µM as determined by direct FP assay (plot annotation: Kd = 1.1 ± 0.2 µM; axes: normalized mP vs Log[EED] µM). Right: IC50 of UNC5115 for EED–compound 3-FAM is 3.9 µM as determined by competitive FP assay (plot annotation: IC50 = 3.9 ± 0.9 µM; axes: normalized mP vs Log[compound] µM). The plots also carry the label 0.05 µM [compound 3-FAM]. Figure 1.7b is “Adapted with permission from24. Copyright (2017) American Chemical Society.”

FP = (I|| - I^)/(I|| + I^) (Equation 1.1)

This physical property of fluorescent ligands has led to the use of FP in assays that determine the binding affinities of fluorescent ligands (direct FP) and the relative binding affinities of unlabeled ligands/inhibitors (competitive FP) toward a target. A typical direct FP assay is carried out in a multi-well plate reader: varied concentrations of protein are titrated into a fixed concentration of fluorescent probe, and a binding isotherm is generated by plotting the average measured FP values (in millipolarization units, mP) against the corresponding protein concentrations. The dissociation constant Kd is obtained from the binding isotherm.
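The shape of a direct FP binding isotherm can be sketched with a simple one-site model. This is an illustration under explicit assumptions, not the fitting procedure of the cited work: it assumes 1:1 binding, that free protein is approximately equal to total protein, and that the fluorescence intensity of the probe does not change on binding. The FP values for the free and bound probe are hypothetical; only the Kd of 1.1 µM comes from the literature example discussed in this section.

```python
def fraction_bound(protein_uM: float, kd_uM: float) -> float:
    """One-site binding, assuming free protein ~ total protein."""
    if protein_uM < 0 or kd_uM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_uM / (kd_uM + protein_uM)


def observed_fp(protein_uM: float, kd_uM: float,
                fp_free_mP: float, fp_bound_mP: float) -> float:
    """Observed FP as the weighted average of free and bound probe signals."""
    fb = fraction_bound(protein_uM, kd_uM)
    return fb * fp_bound_mP + (1.0 - fb) * fp_free_mP


KD_UM = 1.1           # Kd of the literature example (compound 3-FAM with EED)
FP_FREE_MP = 50.0     # hypothetical FP of the free probe
FP_BOUND_MP = 250.0   # hypothetical FP of the fully bound probe

for protein in (0.0, 0.11, 1.1, 11.0, 110.0):
    fp = observed_fp(protein, KD_UM, FP_FREE_MP, FP_BOUND_MP)
    print(f"{protein:7.2f} uM protein -> {fp:6.1f} mP")
```

In this model the signal is halfway between the free and bound values when the protein concentration equals Kd, which is why Kd can be read from the midpoint of the isotherm.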

A competitive FP experiment can determine the relative binding affinity (IC50) of an unlabeled inhibitor that competes with the fluorescent ligand. Titrating varied concentrations of the inhibitor into a mixture containing fixed concentrations of the fluorescent ligand and the protein provides the data points for a binding curve. The inhibitor’s IC50, which is the concentration of inhibitor needed to displace 50% of the fluorescent ligand from the complex, is obtained by curve fitting. In the selected literature example, the Kd of fluorescein-labeled compound 3 (compound 3-FAM) for the polycomb reader domain EED was determined as 1.1 µM by direct FP assay, and the IC50 of the inhibitor UNC5115 for EED was determined as 3.9 µM by competitive FP (Figure 1.7b).24 Competitive FP was conducted with the fluorescent probe 3-FAM instead of a labeled natural binding partner because higher-potency fluorescent ligands widen the range of measurable potencies for inhibitors.25
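How an IC50 is read from a competition curve can be illustrated with synthetic data. The sketch below is a simplification and not the analysis used in the cited study: it generates an idealized sigmoidal curve with a Hill slope of 1 (an assumption) around an IC50 of 3.9 µM, then estimates the half-maximal point by interpolating on a log concentration axis rather than by full non-linear curve fitting.

```python
import math


def hill_response(conc, ic50, top=100.0, bottom=0.0, slope=1.0):
    """Idealized normalized signal for a competitive displacement curve."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** slope)


def interpolate_ic50(concs, responses, top=100.0, bottom=0.0):
    """Estimate IC50 by linear interpolation in log10(concentration)."""
    half = bottom + (top - bottom) / 2.0
    points = list(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= half >= r_hi and r_lo != r_hi:
            frac = (r_lo - half) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("responses do not cross the half-maximal level")


# Synthetic titration (uM) around the literature IC50 of 3.9 uM.
concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
signal = [hill_response(c, ic50=3.9) for c in concs]
estimate = interpolate_ic50(concs, signal)
print(f"interpolated IC50 ~ {estimate:.1f} uM")
```

Interpolation only approximates the true midpoint, and the error grows when the titration points are widely spaced, which is one reason curve fitting is preferred in practice.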

1.3.3 AlphaScreen assay

AlphaScreen is a versatile bead-based technique that has been successfully applied to study numerous types of biomolecular interaction including protein-protein, protein-DNA, and others.26 The technique involves donor beads, acceptor beads, and immuno-reactive tags or strong-affinity molecular recognition partners such as biotin-streptavidin, Ni+2 with His6-tagged proteins, and glutathione with GST-tagged proteins.26 The binding biomolecules are captured on donor/acceptor beads via the corresponding immuno-reactive tag conjugates or strong-affinity partners with which the beads are pre-coated. Donor beads are embedded with photosensitizer (phthalocyanine) molecules that convert ambient oxygen into singlet oxygen upon irradiation with 680 nm light. The released singlet oxygen can travel up to 200 nm in solution. When acceptor beads are within this distance, the singlet oxygen triggers a cascade of chemical reactions with the chemicals embedded in the acceptor beads (thioxene, anthracene, and rubrene) and produces a luminescence signal at 520–620 nm as the readout (Figure 1.8a shows the schematic of AlphaScreen where the donor and acceptor beads carry strong-affinity partners). The intensity of the signal (alpha counts per second) corresponds to the binding affinity. Because the readout is based on triggering a cascade reaction, and because each donor/acceptor bead is embedded with a high concentration of the key chemicals, the signal is intrinsically amplified; this allows the AlphaScreen assay to detect binding between partners at very low concentrations (femtomolar range) in solution. The beads show avidity effects, in which hundreds of similar binding sites on each bead type are brought together. The resultant binding is therefore the cooperative sum of binding from multiple pairs of binding partners. This makes AlphaScreen suitable for detecting low-affinity interactions. However, the measured affinities register as apparently high binding affinities (Kd app). The Kd app values are therefore not quantitative, but binding affinities (IC50) obtained from competition experiments in AlphaScreen are quantitative.

In general, commercially available donor/acceptor beads have reactive aldehyde surfaces that can readily be conjugated with the tag conjugate of interest. Thus, different donor/acceptor beads can be designed to suit the biomolecular interaction in the chosen study.

Herein I describe an AlphaScreen assay that uses the commercially available AlphaScreen Histidine (Ni+2) detection kit to determine the binding affinity of a protein–peptide pair (His6-L3MBTL1–Bn-H3K9me1), followed by determination of the binding affinity of the L3MBTL1 antagonist, compound 13, which competes with binding of His6-L3MBTL1 to Bn-H3K9me1 (Figure 1.8b).27 The two biomolecules, L3MBTL1 purified with a His6-tag (His6-L3MBTL1) and the binding partner H3K9me1 peptide synthesized as a biotin conjugate (Bn-H3K9me1), were incubated in a suitable buffer to reach binding equilibrium. Ni+2 chelate acceptor beads and streptavidin-coated donor beads were then added and incubated. Optimal concentrations of 50 nM for His6-L3MBTL1 and 150 nM for Bn-H3K9me1 were chosen for the competition titration with the L3MBTL1 antagonist, compound 13 (Figure 1.8b).27 At these concentrations of protein and peptide, Kd app ~ 60 nM was determined. Increasing concentrations of compound 13 progressively disrupted binding of Bn-H3K9me1 to His6-L3MBTL1 and led to a reduction in Alpha counts, which is shown as % inhibition. The relative binding affinity (IC50) of antagonist 13 was determined from the resulting binding isotherm.
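The conversion from raw Alpha counts to % inhibition can be sketched as follows. The text states only that the reduction in Alpha counts is shown as % inhibition; the specific normalization against an uninhibited control and a background control used here is a common convention that is assumed for illustration, and the count values are invented.

```python
def percent_inhibition(counts: float, uninhibited: float, background: float) -> float:
    """Normalize Alpha counts to % inhibition.

    0 % corresponds to the uninhibited protein-peptide signal and 100 % to
    the background signal (for example, beads without the binding pair).
    """
    window = uninhibited - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (1.0 - (counts - background) / window)


# Invented counts for a titration of increasing antagonist concentration.
UNINHIBITED = 80000.0
BACKGROUND = 2000.0
for counts in (80000.0, 60500.0, 41000.0, 21500.0, 2000.0):
    print(f"{counts:8.0f} counts -> {percent_inhibition(counts, UNINHIBITED, BACKGROUND):5.1f} %")
```

Plotting these values against the logarithm of antagonist concentration gives the curve from which the IC50 is read.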

Figure 1.8. Schematic of AlphaScreen and selected literature examples to demonstrate the determination of Kd app from cross-titration and IC50 from competitive AlphaScreen. a) Schematic of AlphaScreen that uses streptavidin coated donor beads and Ni+2 chelated acceptor beads. See main text for explanation. Generally, each donor bead is coated with 100s of streptavidin molecules and each acceptor bead is chelated with 100s of Ni+2 ions. To maintain clarity in the figure, one streptavidin molecule and 10 Ni+2 ions are shown. The figure is adapted from Reference.27a b) Left: Cross titration between His6-L3MBTL1 and Bn-H3K9me1. Concentrations of His6-L3MBTL1 and Bn-H3K9me1 used for titration in a grid pattern are from 0 to 200 nM and 0 to 1666 nM, respectively. The selected optimal concentrations are 50 nM for the protein and 150 nM for the biotinylated probe. Right: Determination of IC50 for compound 13. Figure 1.8b Left is from Reference27a and Right is “Adapted with permission from27b. Copyright (2010) American Chemical Society.”

Figure 1.8 plot annotations: panel b axes are Alpha counts vs Bn-H3K9me1 (nM) and L3MBTL1 (nM), and % inhibition vs [compound 13] (µM); IC50 = 17 ± 0.13 µM.

Due to inherent limitations of competitive FP and AlphaScreen assays, the inhibitory constants (IC50) derived from these assays are not considered true binding constants (Kd). The results from these assays depend on the experimental conditions, so the binding affinities of a set of inhibitors can be compared only when the inhibitors are tested under the same conditions. These affinity constants are good for ranking compounds during iterative medicinal chemistry on a target, but the results for potent compounds should be confirmed with secondary assays such as ITC, which is a label-free and avidity-free technique (see below).
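One way to see why an IC50 depends on assay conditions is the Cheng–Prusoff-type relation for simple competitive binding, IC50 = Ki (1 + [L]/Kd), where [L] and Kd refer to the labeled probe. This relation is not taken from the thesis and is named here only as an illustration; it assumes a single site, reversible competition, and negligible depletion of probe and inhibitor by the protein, conditions that FP and AlphaScreen assays do not necessarily satisfy. The Ki value below is hypothetical.

```python
def apparent_ic50(ki_uM: float, probe_uM: float, probe_kd_uM: float) -> float:
    """Cheng-Prusoff-type relation: IC50 = Ki * (1 + [probe] / Kd_probe)."""
    if ki_uM <= 0 or probe_kd_uM <= 0 or probe_uM < 0:
        raise ValueError("Ki and Kd must be > 0 and probe concentration >= 0")
    return ki_uM * (1.0 + probe_uM / probe_kd_uM)


KI_UM = 1.0         # hypothetical true inhibition constant
PROBE_KD_UM = 1.1   # Kd of the fluorescent probe (value from the FP example)

# The same inhibitor gives different IC50 values at different probe levels.
for probe in (0.05, 1.1, 11.0):
    print(f"[probe] = {probe:5.2f} uM -> IC50 = {apparent_ic50(KI_UM, probe, PROBE_KD_UM):.2f} uM")
```

Under these idealized assumptions the same inhibitor appears weaker as the probe concentration rises, which is why only IC50 values measured under identical conditions should be ranked against one another.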

1.3.4 Isothermal Titration Calorimetry (ITC)

ITC is a label-free biophysical technique that provides a direct measurement of the thermodynamic parameters (change in enthalpy ΔH, change in entropy ΔS, stoichiometry (n), and binding constant/dissociation constant (Kd)) of a biomolecular binding reaction in a single experiment.28 The ITC design principle relies on accurate measurement of the heat evolved during a biomolecular interaction; heat release (exothermic) or heat absorption from the surroundings (endothermic) is a universal characteristic of any biomolecular reaction. An ITC instrument consists of one reference cell and one sample cell inside an adiabatic jacket and, outside the jacket, a syringe that dispenses known amounts of titrant into the sample cell (Figure 1.9a). Before the experiment starts, the reference cell, sample cell, and syringe are filled with water/buffer, protein, and ligand (titrant), respectively. In a typical ITC experiment, 20–30 ligand aliquots are titrated into the protein at constant temperature and at specific time intervals, and the heat effect (heat release/absorption) of each discrete injection is measured. During the titration, to keep the biomolecular reaction at constant temperature (isothermal) and to measure the heat effects accurately, the instrument coordinates a constant power supply (<1 mW) unit located at the reference cell, a feedback power unit located at the sample cell, and a calibrating heater located at the sample cell, which contains a sensitive thermopile that measures the temperature difference between the reference cell and the sample cell (Figure 1.9a). For each injection, the heat effect (in µcal) is calculated from the power (in µcal/sec) that the feedback unit supplies to keep the sample cell and the reference cell at the same temperature (ΔT ~ 0). As the amount of ligand in each injection is known, the enthalpy change (ΔH per injection, in heat per mole of ligand) upon the addition of each aliquot can be calculated. The thermodynamic data for the biomolecular reaction are then obtained by generating a binding isotherm, in which the ΔH value of each data point is plotted as a function of the corresponding molar ratio ([ligand]/[protein]), and fitting it by non-linear least squares to a suitable binding model that gives the best-fit values of ΔH, the association constant (Ka), and the stoichiometry (n). The dependent variables of the biomolecular reaction, free energy (ΔG) and change in entropy (ΔS), can be derived from Equation 1.2, and the binding affinity Kd can be calculated from Ka (i.e., Kd = 1/Ka). In the following literature example, the ligand UNC5115 binds to EED with 1:1 binding stoichiometry and a binding affinity (Kd) of 1.14 µM.24 The experiment and data analysis were performed as described above (Figure 1.9b).
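The step from feedback power to heat per mole of injectant can be sketched numerically. The following is an illustration with invented numbers, not instrument software: it assumes a baseline-subtracted power trace for a single injection, integrates it by the trapezoidal rule, and ignores corrections such as the heat of dilution.

```python
def integrate_peak(times_s, power_ucal_per_s):
    """Trapezoidal integral of feedback power over time, giving heat in ucal."""
    if len(times_s) != len(power_ucal_per_s):
        raise ValueError("time and power traces must have the same length")
    heat = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        heat += 0.5 * (power_ucal_per_s[i] + power_ucal_per_s[i - 1]) * dt
    return heat


def molar_heat_kcal_per_mol(heat_ucal, injection_volume_uL, ligand_conc_mM):
    """Heat per mole of injected ligand (1 ucal = 1e-9 kcal)."""
    moles_injected = (injection_volume_uL * 1e-6) * (ligand_conc_mM * 1e-3)
    return heat_ucal * 1e-9 / moles_injected


# Invented trace for one exothermic injection (negative differential power).
times = [0.0, 10.0, 20.0, 30.0, 40.0]    # seconds
power = [0.0, -0.8, -0.4, -0.2, 0.0]     # ucal/s, baseline already subtracted

heat = integrate_peak(times, power)                       # ucal
dh = molar_heat_kcal_per_mol(heat, injection_volume_uL=2.0, ligand_conc_mM=1.0)
print(f"heat = {heat:.1f} ucal, {dh:.1f} kcal/mol of injectant")
```

Repeating this for every injection and plotting the results against the molar ratio gives the binding isotherm that is fitted to obtain ΔH, Ka, and n.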

Figure 1.9. Schematic of an ITC instrument and demonstration of ITC readout of the binding event, EED–UNC5115. a) Schematic of ITC, showing the water/buffer in the reference cell, the protein in the sample cell, the ligand in the syringe, the adiabatic shield, the constant power supply, and the feedback power (ΔT = 0). Annotations in the figure are adapted from publication.28b b) ITC readout of binding of EED to compound UNC5115 (Kd = 1.14 ± 0.14 µM). Plot axes: µcal/sec vs time (min), and kcal/mol of injectant vs molar ratio; the fitted curve is annotated with ΔH, stoichiometry (n), and slope = Ka. Figure 1.9b is “Adapted with permission from24. Copyright (2017) American Chemical Society.”

$$\Delta G = -RT \ln K_a = \Delta H - T\Delta S \qquad \text{(Equation 1.2)}$$
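As a worked example, the free energy of binding can be computed from a dissociation constant using the standard relation ΔG = −RT ln Ka with Ka = 1/Kd. The sketch below uses the Kd of 1.14 µM from the literature example; the temperature of 298.15 K and the ΔH value are assumptions made for illustration, since neither is stated in the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return dG = -RT ln(Ka) in kcal/mol, with Ka = 1/Kd."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    ka = 1.0 / kd_molar
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(ka)


def minus_t_delta_s(delta_g: float, delta_h: float) -> float:
    """Rearranged from dG = dH - T dS: returns -T dS = dG - dH."""
    return delta_g - delta_h


dg = delta_g_from_kd(1.14e-6)       # Kd = 1.14 uM, assumed T = 298.15 K
assumed_dh = -10.0                  # hypothetical dH in kcal/mol
print(f"dG   = {dg:.2f} kcal/mol")
print(f"-TdS = {minus_t_delta_s(dg, assumed_dh):.2f} kcal/mol")
```

At the assumed temperature, a Kd near 1 µM corresponds to a ΔG of about −8.1 kcal/mol; each tenfold improvement in Kd makes ΔG more negative by RT ln 10, roughly 1.4 kcal/mol.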

1.4 Chemical inhibitors of epigenetic reader proteins

Despite increasing evidence of methyl readers’ participation in several diseases and knowledge of the druggability of the readers,29 efforts to make chemical probes for methyl readers are still in their infancy for several reasons: 1) Many of the readers contain shallow/flat binding pockets. 2) Most of the readers bind their natural partners with weak affinities; consequently, dye-conjugated partners also bind weakly to their reader targets. HTS efforts to find potent small molecule hits have therefore not been successful, owing to the weak binding affinity of the dye-conjugated binding partner–reader target pair. 3) High structural similarity within each reader family further complicates the development of selective inhibitors.

To date, only 8 methyl readers have been targeted with small molecule inhibitors (Table 1.1). Herein, I describe a case study of the discovery of the chemical probe UNC1215 for the MBT domain containing protein L3MBTL3.

Table 1.1. Selected methyl reader inhibitors with their binding affinities toward their targets (structure drawings not reproduced)

Inhibitor    Target (reader domain): potency (assay)    Reference
UNC1215      L3MBTL3 (MBT): 120 nM (ITC)                Reference30
Entry 11     Cbx7 (chromo): 1.8 µM (ITC)                Reference19
Entry 5      Cbx6 (chromo): 0.9 µM (FP assay)           Reference31
CF16         Pygo2 (PHD): 7.3 mM (NMR)                  Reference32
UNC2170      53BP1 (Tudor): 2.2 µM (ITC)                Reference33
UNC4991      CDYL2 (chromo): 0.64 µM (ITC)              Reference34