MicroRNA expression and functional analysis in Hodgkin lymphoma

Yuan, Ye

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2019

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Yuan, Y. (2019). MicroRNA expression and functional analysis in Hodgkin lymphoma. University of Groningen.

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

CHAPTER 4

A high-throughput microRNA gain-of-function screen in Hodgkin lymphoma

Ye Yuan1,3, Joost Kluiver1, Fubiao Niu1, Jasper Koerts1, Debora de Jong1, Bea Rutgers1, Sem Penninkhof1, Martijn Terpstra2, Arjan Diepstra1, Lydia Visser1, Klaas Kok2, Anke van den Berg1

Departments of 1Pathology and Medical Biology, 2Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands. 3Institute of Clinical Pharmacology of the Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang Province, China

Abstract

In this preliminary study, we determined the effects of miRNA overexpression on Hodgkin lymphoma (HL) cell growth using a high-throughput, next-generation sequencing (NGS)-based approach.

A virus pool was made with 40 miRNA overexpression constructs. The effectiveness of the overexpression constructs was validated by small RNA sequencing of virus pool-infected HEK-293T cells. Two HL cell lines were infected in duplicate and construct abundance was followed by NGS over a period of 3 weeks. Constructs with altered abundance were identified by (1) using cutoff values based on the mean read count ratio between day 5 and day 13 or day 21, plus or minus 2× the SD, of the barcoded empty vector experiment as determined in Chapter 3 and (2) using the Tukey IQR test on slopes calculated from the read count ratios over time.

An increase in expression level of at least 2-fold was observed for one or both of the two strands for 67% of the constructs in HEK-293T cells. Using the first data analysis approach, miR-141 was found to be depleted at day 21 compared to day 5 in at least 3 of 4 infections, whereas none of the constructs were increased with this cutoff. Using the second approach, miR-19b-1 was found to be increased in at least 3 of 4 infections. Our previously published small RNA sequencing data revealed significantly lower miR-141-3p expression levels in HL cell lines compared to GC-B cells, whereas miR-19b-3p levels were not different. Validation by GFP competition assay revealed an increase in GFP+ cells for miR-19b-1, consistent with the high-throughput screen, whereas the validation for miR-141 failed due to low infection efficiency.

Thus, miR-19b-1 enhances cell growth in HL, while miR-141 may repress cell growth. Further validation to fully explore the pathogenetic relevance of these miRNAs is required.

Introduction

Deregulated expression of miRNAs has been linked to the development of multiple cancers [1], including B-cell lymphoma. [2, 3] In Hodgkin lymphoma (HL), the first report on a deregulated miRNA showed enhanced expression of miR-155, which was processed from the B cell integration cluster (BIC) transcript. [4] After that, several other studies showed that multiple miRNAs are deregulated in HL. [5] Several oncogenic miRNAs have been reported in HL. The miR-17/106b seed family targets CDKN1A (encoding the p21 protein) in HL, and inhibition of this seed family resulted in a block in cell cycle progression. [6] Other miRNAs with oncogenic properties in HL include miR-9, which targets the plasma cell differentiation gene PRDM1 [7] and HuR, which controls cytokine production [8]; and the miR-96, miR-182 and miR-183 cluster, which represses FOXO1, a gene inducing growth arrest and apoptosis in HL cell lines. [9] The only reported tumor suppressor miRNA in HL is miR-135a, which was shown to target JAK2. Overexpression of miR-135a in HL led to reduced JAK2 levels and, as a downstream consequence, also to reduced Bcl-xL levels. [10] Thus, it is clear that miRNAs play essential roles in HL. However, little is known about the relevance of the vast majority of miRNAs with aberrant expression levels in HL.

Here, we applied a high-throughput miRNA overexpression screen to study the effect of miRNAs on the growth of HL cell lines. To validate the overexpression of the mature miRNAs of the corresponding overexpression constructs, we performed small RNA sequencing on cells infected with the miRNA overexpression pool. Two HL cell lines were infected in duplicate with the same pool to identify constructs for which the abundance changed over time. Two methods were applied to identify miRNAs that affect HL cell growth upon overexpression, one based on the barcoded empty vector experiment as described in Chapter 3 and one using the variance in slopes calculated from the read count ratios. Finally, specific miRNA constructs were selected for individual validation by GFP competition assays.

Materials and methods

Design of the construct pools

The miRNA overexpression constructs were partly obtained from SBI (Palo Alto, CA, USA) and partly custom made by amplification of the stem-loop region using primer sets amplifying a region of 481 to 525 bp (Table S1). For pCDH-miR-19b-1 and pCDH-miR-27a, two copies of the stem-loop fragment were cloned into the vector to obtain an insert size similar to that of the other constructs. These uniform insert sizes were essential, as shown in the pilot experiment described in Chapter 3B. Inserts were ligated into the EcoRI and NotI restriction sites of the lentiviral pCDH-EF1-MCS-IRES-GFP vector (SBI) and confirmed by Sanger sequencing. A total of 41 constructs were generated, including one negative control (pCDH-NC, a random 511 bp part of RFP), 5 miRNAs within the top 10 most abundantly expressed in HL, and 8 miRNAs with decreased and 6 miRNAs with increased expression levels compared to germinal center (GC)-B cells, based on small RNA sequencing in Chapter 2. The remaining constructs were added based on availability, despite not having significantly altered expression levels in HL. MiR-7-18763 was a novel miRNA identified as being downregulated in HL (Chapter 2). The 39 pCDH constructs encoding known miRNAs can potentially generate 78 mature miRNAs, i.e. both a 3p and a 5p strand. Four of the 78 possible strands are not annotated in miRBase (http://www.mirbase.org/). A plasmid mix was prepared for the generation of viral particles; for part of the constructs, the DNA input was increased or decreased based on low or high read counts at day 5 in the pilot study described in Chapter 3B (Table S1).

Cell culture, lentivirus production, infection and sorting

Culturing of HL and HEK-293T cells as well as lentiviral particle production was performed as described in Chapter 3. For the high-throughput screen, 8 million cells were infected with a maximal infection percentage of 10%. At day 5, day 13 and day 21 after infection, a total of >10 million cells were prepared for sorting, whereas the remainder of the cells (>2 million) was used to continue the culture. GFP+ cells were sorted on the MoFlo sorter using a 70-µm nozzle (BD Biosciences, San Jose, CA, USA).

Small RNA library preparation, sequencing and data analysis

Small RNA libraries were generated from around 1 µg total RNA isolated from empty vector (EV) or pCDH pool-infected HEK-293T cells using the NEXTflex™ Small RNA Sequencing Kit v3 (Bioo Scientific, Austin, TX, USA). Total read counts were normalized to 1,000,000. Unique miRNAs with an average read count of at least 5 per sample were included in the downstream analysis. Effectiveness of the constructs was determined by calculating the fold change (FC) in pCDH pool-infected HEK-293T cells as compared to EV-infected HEK-293T cells. An FC of at least 2 was considered effective overexpression, and an FC of at least 1.5 intermediate efficiency. It should be noted that with our approach each construct is present in only a limited percentage of the cells, so overexpression levels will be limited.
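The normalization and fold-change classification described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the low-count filter is interpreted here as the mean of the pool and EV samples, and the handling of miRNAs with zero reads in the EV sample is an assumption.

```python
def normalize_to_total(counts, target=1_000_000):
    """Scale raw read counts so that they sum to `target` (1,000,000 in the text)."""
    total = sum(counts.values())
    return {name: n * target / total for name, n in counts.items()}


def classify_overexpression(pool, ev, min_mean=5.0):
    """Label each miRNA by its fold change in pool- versus EV-infected cells.

    `pool` and `ev` hold normalized read counts. miRNAs whose mean count
    over the two samples is below `min_mean` are skipped (assumed reading
    of the "average read count of at least 5 per sample" filter).
    """
    labels = {}
    for name, pool_reads in pool.items():
        ev_reads = ev.get(name, 0.0)
        if (pool_reads + ev_reads) / 2 < min_mean:
            continue  # too few reads for a reliable ratio
        # Assumption: absent from EV counts as an unbounded fold change.
        fc = pool_reads / ev_reads if ev_reads > 0 else float("inf")
        if fc >= 2:
            labels[name] = "effective"
        elif fc >= 1.5:
            labels[name] = "intermediate"
        else:
            labels[name] = "not overexpressed"
    return labels
```

Applied per strand (5p and 3p), such labels would allow a construct to be scored as overexpressing both, one or neither of its mature miRNAs.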

DNA isolation and Polymerase chain reaction (PCR)

Genomic DNA was isolated using the salt/chloroform extraction method. DNA concentration was measured with the NanoDrop™ 1000 Spectrophotometer (Thermo Fisher Scientific Inc., Waltham, MA, USA) and the quality was checked on a 1% agarose gel. Triplicate PCRs were performed using the AmpliTaq DNA Polymerase kit (Thermo Fisher) with 400 ng genomic DNA (representing ~67,000 cells) as input. A universal forward primer (5’-CTGGGAAATCACCATAAACG-3’) with a unique 9 nt sample ID (Table S2) was used for each PCR in combination with reverse primer pCDH-R+2bp (5’-CCAAGCGGCTTCGGCCAGTAACGTT-3’) for all L540 samples and with pCDH-R+3bp (5’-TCCAAGCGGCTTCGGCCAGTAACGTT-3’) for all KM-H2 samples. All PCR amplifications were performed in one run to reduce experimental variation. PCR products were analyzed on a 2% agarose gel and mixed in equal amounts based on band intensities.
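The stated equivalence between 400 ng genomic DNA and ~67,000 cells can be reproduced with a short calculation. The sketch below assumes roughly 6 pg of DNA per diploid human cell, a commonly used approximation that is not stated in the text; HL cell lines may deviate from a diploid DNA content, so the result is an estimate only.

```python
# Assumption (not from the text): ~6 pg genomic DNA per diploid human cell.
PG_PER_DIPLOID_CELL = 6.0


def cell_equivalents(dna_ng, pg_per_cell=PG_PER_DIPLOID_CELL):
    """Approximate number of cells represented by `dna_ng` nanograms of genomic DNA."""
    return dna_ng * 1000.0 / pg_per_cell  # 1 ng = 1000 pg


print(round(cell_equivalents(400)))  # 66667, i.e. ~67,000 cells
```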

Next generation sequencing

Library preparation of the PCR product mixes was done as described in Chapter 3. Paired-end sequencing was performed using the MiSeq™ (Illumina, San Diego, CA, USA). After next generation sequencing, the reads were assigned to a sample using the sample ID, followed by alignment against the pCDH constructs. The alignment was performed using BWA (version 0.7.12; https://github.com/lh3/bwa) and processing of the reads was done with SAMtools (version 1.3; http://www.htslib.org/) [11]. We normalized total read counts per sample to 10,000, based on the approximate number of GFP+ cells in the PCR for the sample with the lowest GFP% after sorting (15.6%). Constructs with an average read count of less than 40 (i.e. 4 constructs) were not included in the final analysis because of the relatively broad variation in read counts between triplicate PCRs, which makes the calculations to detect changes over time unreliable. We calculated the average read counts of the triplicate PCRs per sample and calculated the fold change of the read counts at day 13 and day 21 relative to day 5 for each independent infection. For the first analysis strategy, constructs with a possible effect on cell growth were determined based on a fold change below or above the threshold as defined in Chapter 3A for the barcoded empty vector experiments, i.e. average ratio plus or minus 2× the SD (1.36 decrease and 1.38 increase in abundance).
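The first analysis strategy can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis: the function names are hypothetical, the cutoffs are interpreted as a 1.36-fold decrease (fold change ≤ 1/1.36) and a 1.38-fold increase (fold change ≥ 1.38), and a construct is called when at least 3 of the 4 independent infections agree, as in the Results.

```python
def mean(values):
    return sum(values) / len(values)


def fold_change(day5_triplicate, later_triplicate):
    """Fold change of the mean triplicate-PCR read count relative to day 5."""
    return mean(later_triplicate) / mean(day5_triplicate)


def call_construct(fold_changes, fold_decrease=1.36, fold_increase=1.38, min_hits=3):
    """Classify one construct from its fold changes in the independent infections.

    Assumed interpretation: "1.36 decrease" means FC <= 1/1.36 and
    "1.38 increase" means FC >= 1.38.
    """
    depleted = sum(fc <= 1.0 / fold_decrease for fc in fold_changes)
    enriched = sum(fc >= fold_increase for fc in fold_changes)
    if depleted >= min_hits:
        return "depleted"
    if enriched >= min_hits:
        return "enriched"
    return "unchanged"
```

Lowering `min_hits` to 2 corresponds to the less strict criterion of 2 out of 4 infections.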

For the second analysis strategy, the ratios of the read counts at day 13 and day 21 relative to day 5, minus/plus 1, were plotted and the slope of a trend line with its intercept forced to 0 was calculated using MATLAB (version 6.1, The MathWorks Inc., Natick, MA, 2000). An adapted Tukey IQR method with a lower cutoff of Q1 - (1 × IQR) and an upper cutoff of Q3 + (1 × IQR) was applied to all slopes to identify constructs that behaved as outliers in the population.
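A minimal Python sketch of this slope-based strategy is given below; the original analysis was done in MATLAB, so this is a re-expression under assumptions, not the authors' code. It assumes that the transformed value is the ratio minus 1 (so that day 5 maps to 0), that time is expressed in days since day 5, and that quartiles are computed with the "inclusive" interpolation of Python's `statistics.quantiles`; other quartile definitions may give slightly different cutoffs. All function names are hypothetical.

```python
import statistics


def slope_through_origin(xs, ys):
    """Least-squares slope of y = b * x, i.e. a trend line with intercept 0."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def construct_slope(ratio_day13, ratio_day21):
    """Slope of (ratio - 1) against days since day 5 (assumed transformation)."""
    days = [0, 8, 16]  # day 5, day 13, day 21
    values = [0.0, ratio_day13 - 1.0, ratio_day21 - 1.0]
    return slope_through_origin(days, values)


def tukey_outliers(slopes, k=1.0):
    """Adapted Tukey fences: below Q1 - k*IQR is depleted, above Q3 + k*IQR is enriched."""
    q1, _, q3 = statistics.quantiles(list(slopes.values()), n=4, method="inclusive")
    iqr = q3 - q1
    calls = {}
    for name, slope in slopes.items():
        if slope < q1 - k * iqr:
            calls[name] = "depleted"
        elif slope > q3 + k * iqr:
            calls[name] = "enriched"
    return calls
```

Using `k=1.0` instead of the conventional 1.5 narrows the fences, so more constructs are flagged as outliers than with the standard Tukey rule.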

GFP competition assay

HL cells were infected with lentiviral particles that were generated using one of the selected miRNA constructs, aiming at an infection percentage of 10-40%. After infection, these cells were cultured for 22 days and the percentage of GFP+ cells was monitored three times per week by flow cytometry (BD Biosciences, San Jose, CA, USA). The percentage of GFP+ cells at day 6 was set to 100%. Significant changes were assessed as described in Chapter 2. P-values <0.05 were considered statistically significant in comparisons between a specific miRNA construct and the empty vector (EV) as a control.
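The normalization of the GFP competition assay can be illustrated with a short sketch. The function name and the example percentages are hypothetical; the statistical test described in Chapter 2 is not reproduced here.

```python
def relative_gfp(timecourse, reference_day=6):
    """Express GFP+ percentages relative to the reference day, which is set to 100%.

    `timecourse` maps day after infection -> measured percentage of GFP+ cells.
    """
    reference = timecourse[reference_day]
    return {day: 100.0 * pct / reference for day, pct in timecourse.items()}


# Hypothetical measurements for one construct (not data from this study).
example = {6: 20.0, 13: 25.0, 22: 30.0}
print(relative_gfp(example))  # {6: 100.0, 13: 125.0, 22: 150.0}
```

Values rising above 100% over time would indicate a growth advantage of the infected (GFP+) cells, and values falling below 100% a disadvantage, provided the EV control normalized in the same way stays stable.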

Results

Validation of the miRNA overexpression constructs

Overexpression of the mature miRNAs of the lentiviral miRNA overexpression pool was tested by small RNA sequencing of pCDH pool-infected HEK-293T cells. The percentage of GFP+ cells was 66% for the pCDH pool-infected cells and 98% for the EV-infected cells at day 6 after infection. Four of 78 possible mature miRNAs were not reported in miRBase (http://www.mirbase.org/) and these were also not identified in the small RNA sequencing data. The remaining 75 mature miRNAs were derived from 39 constructs. For 9 constructs, a >2-fold increased expression level was observed for both strands (5p and 3p) in the virus pool-infected cells as compared to EV-infected cells. For 17 constructs, a >2-fold increased level was observed for one of the two strands (5p or 3p). The remaining 13 constructs did not show 2-fold increased expression levels for either the 3p or 5p strand (Figure 1). Thus, the majority of the constructs (26/39 = 67%) led to increased mature miRNA levels in HEK-293T cells. There was no obvious correlation between successful induction of higher miRNA levels and endogenous expression levels in EV-infected cells. As these data were obtained in a mixed cell population with only a proportion of the cells being infected with a specific construct, we anticipate that the actual performance of the constructs is underestimated.

NGS and data analysis of the high-throughput screen

The same viral pool was used to infect HL cell lines KM-H2 and L540 following the experimental workflow as shown in Figure 2A. Results of the GFP sorting and PCR amplifications are illustrated in Figure 2B and 2C as well as summarized in Table 1. The GFP percentages after sorting were relatively low in L540 (ranging from 15.6% to 30.0%) when compared to KM-H2 (ranging from 62.3% to 70.8%). Total read counts per sample varied between 6,209 and 12,732 in KM-H2 and between 7,274 and 20,571 in L540 (Table 1). After normalization of total reads to 10,000, the average read counts per construct ranged from 5 to 1,652 (Table S3). Four constructs had average read counts of less than 40 in the pCDH infected HL cells and were therefore excluded from further analyses.
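The normalization target of 10,000 follows from the PCR input: ~67,000 cells at the lowest post-sort GFP percentage of 15.6% correspond to roughly 10,450 GFP+ cells. The normalization and low-count filter can be sketched as below; the function names are hypothetical, and averaging over all samples of the screen is an assumed reading of "average read counts".

```python
def normalize_sample(counts, target=10_000):
    """Scale the read counts of one sample so that they sum to `target`."""
    total = sum(counts.values())
    return {construct: n * target / total for construct, n in counts.items()}


def average_per_construct(samples):
    """Mean normalized read count per construct across a list of samples."""
    constructs = samples[0].keys()
    return {c: sum(s[c] for s in samples) / len(samples) for c in constructs}


def reliable_constructs(raw_samples, min_mean=40.0):
    """Keep constructs whose average normalized read count is at least `min_mean`."""
    normalized = [normalize_sample(s) for s in raw_samples]
    averages = average_per_construct(normalized)
    return {c: avg for c, avg in averages.items() if avg >= min_mean}
```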

A comparison of insert sizes and percentages of read counts between the 1st screen described in Chapter 3B and the 2nd screen described in this chapter is shown in Table S4. The interquartile range was much wider in the 1st screen (0.15% to 2.15%) than in the 2nd screen (1.43% to 2.81%) (Figure S1). Thus, the adapted DNA input together with the more similar insert sizes resulted in less variation in read counts.

Identification of miRNA overexpression constructs with altered abundance

For the first analysis approach, we used the cutoff values as determined for the EV-BC screen as described in Chapter 3A (average ratio plus or minus 2× the SD, i.e. 1.36 decrease and 1.38 increase in abundance). None of the constructs were depleted in at least 3 of 4 infections at day 13, and miR-141 was the only construct depleted from the pool at day 21 in at least 3 of 4 infections (Figure 3). None of the constructs showed a consistent increase in abundance at either day 13 or day 21. Using less strict criteria, i.e. 2 out of 4 infections, seven additional constructs showed a change in abundance over time: miR-23a, miR-26a, miR-30a, miR-34a, miR-429 and let-7a were depleted and miR-19b-1 was enriched. For the miR-146a construct, an opposite pattern was found in two of the infections: enriched in L540 and depleted in KM-H2. No changes were observed for the negative control construct.

For the second analysis strategy, which was based on slopes, miR-19b-1 was the only construct that was enriched over time in at least 3 of the 4 infections. None of the constructs were consistently depleted in 3 out of 4 infections. With the less strict criterion of 2 out of 4 infections, another 4 miRNAs, i.e. miR-26a, miR-30a, miR-34a and miR-141, were depleted over time. In addition, the same opposite pattern as with the first strategy was found for miR-146a. No change was observed for the negative control construct.

Re-analysis of our previously published small RNA sequencing data [12] indicated high levels of miR-19b-3p, followed by moderate levels of miR-141-3p and no expression of the 5p strands of both constructs. The levels of miR-19b-3p were similar in HL cell lines and GC-B cells, while miR-141-3p levels were significantly lower in HL cell lines (Figure 5).

Validation of high-throughput screening results with GFP competition assay for individual miRNA overexpression constructs

To validate the effects of the high-throughput screen, we performed GFP competition assays for miR-19b-1 and miR-141 overexpression constructs in HL cell lines. A representative GFP gate setting of pCDH-miR-19b-1 infected KM-H2 and L540 is shown in Figure 6. A consistent increase in GFP positive cells was observed in KM-H2, but not in L540 (Figure 6). For miR-141 we could not reach a sufficiently high GFP% (>10%) for a reliable GFP competition assay in either of the cell lines.
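The readout of a GFP competition assay can be expressed relative to the first measurement, as in the legend of Figure 6 where the GFP percentage at the first day of measurement is set to 1. The sketch below illustrates this; the function names and the example percentages are hypothetical, and only the 10% minimum starting percentage comes from the text.

```python
def is_reliable_start(first_gfp_percentage, minimum=10.0):
    """Check whether the starting GFP% exceeds the minimum for a reliable assay."""
    return first_gfp_percentage > minimum


def relative_gfp(gfp_percentages):
    """Express each GFP% relative to the first measurement, which becomes 1."""
    first = gfp_percentages[0]
    if first <= 0:
        raise ValueError("first GFP percentage must be positive")
    return [p / first for p in gfp_percentages]


# Hypothetical GFP percentages measured over time for one infection.
series = [20.0, 25.0, 30.0]
relative = relative_gfp(series)  # values above 1 mean GFP+ cells gained ground
```

A relative value that rises over time suggests a growth advantage for cells carrying the construct, while a falling value suggests a disadvantage.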

Discussion

In this project, we studied the effects of miRNA overexpression on HL cell growth using next generation sequencing. The overall results were disappointing with only minor effects for two miRNAs.

The first data analysis strategy was based on an independent infection with a barcoded empty vector library, and in the second strategy we selected constructs based on the distribution of the slope fold changes within the experiment. An advantage of the first strategy is that it reveals how much variation can be expected from a pool infection that should not have effects on cell growth. Such experiments should give a clear indication of the number of false positive hits that can be expected using high throughput screens in the cell lines of interest. A disadvantage of this first strategy is that it is based on an independent infection experiment, which in our set-up also used a different lentiviral vector. This might not be representative of the actual high throughput screening experiment. Moreover, it costs considerable time and money to do an independent high throughput screen for each cell line. An advantage of the second approach is that it does consider experimental variation within the actual high throughput screen. This is important because of the experimental set-up and normalization procedures: strong negative effects of one or more miRNAs might lead to false positive effects and vice versa. Due to the low number of miRNA constructs with altered slopes, we cannot draw clear conclusions, but we do consider the second approach to be more reliable.

A concern for a high throughput screen like ours is that some of the constructs might not induce appropriate overexpression and thus preclude identification of potential HL relevant miRNA candidates. We validated miRNA overexpression for 67% of the constructs by small RNA sequencing of pCDH pool infected HEK-293T cells. Although a 2-fold induction may seem rather mild, one should keep in mind that this is measured within a mixed population of infected cells in which only a proportion of the cells are infected with a specific construct. Failure to induce notable overexpression levels for some of the constructs might be related to the relatively low GFP expression levels observed for this lentiviral vector. This makes appropriate gating of GFP+ cells hard and might lead to loss of cells infected with constructs with an overall lower GFP level. To avoid this problem, at least partly, we performed the control experiment in HEK-293T cells, which are easy to infect. In HL cells, it is hard to get a high GFP+ cell percentage, which might lead to an underestimation of the efficiency of the constructs in a mixed population of cells. Validation of appropriate overexpression by individual infection experiments followed by sorting of GFP+ cells, although more reliable, is quite labor intensive and reduces the power of a high throughput screen. The extent of ectopic miRNA overexpression might depend on endogenous miRNA levels as well as on miRNA processing efficiencies. Although we did not see a clear association between endogenous levels and fold increase upon pool infection in HEK-293T cells, several proteins have been described that regulate biogenesis of miRNAs, and expression of such proteins might vary between HEK-293T cells and HL cells. [13] Altogether, we consider the at least 2-fold overexpression in pool infected HEK-293T cells for 67% of the constructs to be an overall acceptable efficiency.
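The point that a mild fold change in a mixed population can correspond to a strong induction per infected cell can be made concrete with a simple dilution model. The sketch below is illustrative only and rests on assumptions not stated in the text: every cell has the same endogenous miRNA level, a fraction of the cells carries the construct of interest, and the remaining cells are unaffected. The function names and the fraction of 1/40 (one construct out of a pool of 40 at equal representation) are hypothetical.

```python
def pooled_fold(per_cell_fold, infected_fraction):
    """Fold change measured in the whole population under the dilution model."""
    return 1.0 + infected_fraction * (per_cell_fold - 1.0)


def per_cell_fold(pooled_fold_change, infected_fraction):
    """Fold change per infected cell implied by a pooled measurement."""
    if not 0.0 < infected_fraction <= 1.0:
        raise ValueError("infected_fraction must be in (0, 1]")
    return 1.0 + (pooled_fold_change - 1.0) / infected_fraction


# Hypothetical case: one construct carried by 1 in 40 cells, 2-fold in the pool.
implied = per_cell_fold(2.0, 1 / 40)  # roughly 41-fold in the cells that carry it
```

Under these assumptions the pooled measurement understates the per-cell effect by a factor that grows as the infected fraction shrinks.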

In comparison to our preliminary screen described in Chapter 3B, we added 11 constructs and modified the insert sizes of 8 of the constructs. We hypothesized that similar insert sizes would result in a more homogeneous distribution of the reads over all constructs. In addition, we adapted the DNA input for 15 constructs that had a relatively high or low read count in the initial screen. This indeed reduced variation in read counts, although the distribution was still not optimal for some of the constructs. The remaining differences might have been caused by differences in the efficiency of virus production and in GFP expression levels, both of which might introduce a construct dependent bias in sorting efficiency. One of the potential problems is that during virus production the lentiviral transcript can already form the miRNA hairpin structure and thus be processed before it is encapsulated in viral particles. Such differences in virus production efficiency were also observed in individual GFP competition assays using different miRNA overexpression constructs, with starting GFP percentages ranging from 1% to 31%.

MiR-141 was identified with the first analysis strategy and miR-19b-1 with the second strategy. When we used a less strict selection criterion (changes in at least 2 of 4 infections) for the first strategy, another seven constructs were noted as being changed, i.e. miR-23a, miR-26a, miR-30a, miR-34a, miR-429 and let-7a decreased and miR-19b-1 increased. In the second strategy, 4 additional constructs were decreased in 2 of 4 infections, i.e. miR-26a, miR-30a, miR-34a and miR-141. This less strict cutoff results in a higher consistency between the two methods, including the identification of miR-19b-1 and miR-141 with both analysis methods. Next to these miRNAs, miR-146a showed with both analysis strategies a consistent decrease in KM-H2 and a consistent increase in L540. In our previously published small RNA-seq data, no difference in miR-146a expression was observed in HL compared to GC-B cells [12]. However, miR-146a-5p was found to be increased in cHL tissue (n=32) compared to reactive lymphadenopathy tissue samples (n=60). [14] For 9 of the 11 constructs that changed consistently with the less strict criteria, an at least 2-fold increase in expression level was observed for at least one of the two strands (5p and 3p) in pool infected HEK-293T cells. The two exceptions were miR-26a and let-7a. Both miR-26a-5p and let-7a-5p were highly abundant in HEK-293T cells, which might explain why we did not observe a further increase in the virus pool infected HEK-293T cells. The endogenous levels of the other 9 miRNAs were much lower than those of miR-26a-5p and let-7a-5p.

Seven of the miRNA constructs included in the pool had a decreased expression in HL compared to GC-B cells based on small RNA sequencing data. The abundance of one of these miRNAs, miR-141, was decreased in 2 or 3 out of 4 infections (depending on the strategy) in our screen. MiR-141-3p inhibits proliferation and differentiation of human stromal stem cells. [15] Furthermore, it suppresses growth of prostate cancer stem cells and metastasis formation by targeting Rho GTPase family members (for example, CDC42, CDC42EP3, RAC1 and ARPC5) and the stem cell molecules CD44 and EZH2. [16] It also functions as a tumor suppressor through targeting of ZEB2 and HGFR in colorectal cancer cells. [17] MiR-141 overexpression was effective in decreasing cell growth and promoting migration and invasion of triple-negative breast cancer cells. [18] So, our data support a potential tumor suppressive role for miR-141 in HL, but further studies are required.

In the validation experiments using individual GFP competition assays, we observed an increase of GFP+ cells over time for the miR-19b-1 construct in KM-H2, but not in L540. MiR-19b-1 is a member of a larger seed family. For the other family members we did not have constructs in our pool. Moreover, their expression levels in HL were overall lower than that of miR-19b. It has been reported that miR-19b promotes tumor growth and metastasis via targeting TP53 [19], promotes breast tumorigenesis by suppressing PTPRG [20] and promotes cell proliferation and migration in lung cancer by targeting the tumor suppressor MTUS1 [21]. So our findings are consistent with an oncogenic role of miR-19b-1 in HL.

In conclusion, our high throughput screen indicated one miRNA with a possible tumor suppressor role in HL and one with oncogenic activity. Further improvement of the technical procedures, a longer follow-up time and possibly switching to another vector with a stronger and/or separate promoter for the GFP protein are likely to aid effective identification of miRNAs involved in the regulation of HL cell growth.

Figure 1. Validation of miRNA overexpression in pCDH pool infected HEK-293T cells. Read counts of each miRNA in the EV infected sample (indicated with a - sign; left white and black bars) compared to the pCDH pool infected sample (indicated with a + sign; right white and black bars), grouped by fold change. (A) Constructs with a more than 2-fold increase for both the 3p- and 5p-strands of the mature miRNAs; (B) constructs with a more than 2-fold increase for either the 3p- or the 5p-strand; (C) constructs with a less than 2-fold increase for both strands. White bars indicate the 5p and black bars the 3p strand of each miRNA.

Figure 2. An overview of the high-throughput screen of miRNA overexpression constructs. (A) Schematic representation of the workflow. The DNA fragments of the miRNA stem-loop with flanking sequences were cloned into the EcoRI and NotI restriction sites of the pCDH vector. KM-H2 and L540 cells were infected with the lentiviral particles. Genomic DNA was isolated from GFP+ cells sorted at different time points. The inserts were amplified and subjected to next generation sequencing. (B) An example of the sorting results in KM-H2 cells. GFP+ cells were sorted at day 5, day 13 and day 21 from duplicate infections. (C) Agarose gel electrophoresis of the PCR products of pCDH infected samples. Sizes of pCDH PCR products range from 572 to 602 bp. PCR products were mixed for next generation sequencing based on band intensities.

Figure 3. Identification of miRNA overexpression constructs that affect growth of HL cells using the fold change analysis strategy. The miRNA overexpression constructs were sorted from low to high based on average reads per construct. Fold changes of day 13 and day 21 relative to day 5 were calculated based on normalized read counts in the first (upper panels) and second (lower panels) infections for (A) KM-H2 and (B) L540 cell lines. The dotted lines indicate the upper and lower boundaries based on the average ±2× the SD of EV-BC abundance changes as shown in Chapter 3A. Normalized read counts per construct are shown in Table S3. Arrows indicate constructs for which the abundance changed in at least 3 of 4 infections.

Figure 4. Identification of miRNA overexpression constructs that affect cell growth of HL cells using slope-based analysis strategy. The miRNA overexpression constructs were sorted from low to high based on average reads per construct. Slopes were calculated for (A) KM-H2 and (B) L540 cell lines based on read count ratio of day 13 / day 5 and of day 21 / day 5, forcing the trend line to zero. The dotted lines indicate the upper and lower boundaries based on interquartile range. Black dots indicate pCDH constructs that changed over time and arrows indicate pCDH constructs that changed consistently in at least 4 out of 6 infections.

Figure 5. Expression levels of miR-19b-3p and miR-141-3p in HL cell lines and GC-B cells. Expression levels of (A) miR-19b-3p and (B) miR-141-3p in HL cell lines and GC-B cells based on previously published small RNA sequencing data [12]. Significant differences were determined by a Mann-Whitney test. *P < 0.05. RPM: read counts per million.

Figure 6. Validation of the effects of miR-19b-1 overexpression on HL cell growth. Comparison of (A) the high throughput screen, using fold change of read counts in the 1st and 2nd infection in KM-H2 and L540, with (B) individual GFP competition assays for miR-19b-1. The GFP percentage was measured triweekly for 22 days and the percentage at the first day of measurement (day 6) was set to 1. NGS represents next generation sequencing data and validation represents GFP competition assay results. Representative FACS results of pCDH-miR-19b-1 infection at day 4 for (C) KM-H2 and (D) L540. The gates were set based on non-infected cells; the separation between GFP+ and GFP- cells is suboptimal and might have caused small variations in GFP percentages.

Table 1. An overview of the pCDH pool infected samples and NGS read counts before normalization. The PCR columns give mapped read counts and percentages.

| Infection | Day | GFP% | Sorted cells | PCR 1 (%) | PCR 2 (%) | PCR 3 (%) |
|---|---|---|---|---|---|---|
| KM-H2 1st | D5 | 62.6% | 217,650 | 6,704 (94.5%) | 6,748 (95.3%) | 12,427 (95.1%) |
| KM-H2 1st | D13 | 62.3% | 511,180 | 10,790 (93.6%) | 11,883 (93.3%) | 9,699 (94.1%) |
| KM-H2 1st | D21 | 70.8% | 700,000 | 7,519 (92.8%) | 8,542 (93.2%) | 8,962 (91.7%) |
| KM-H2 2nd | D5 | 66.9% | 205,560 | 6,209 (94.7%) | 6,820 (94.5%) | 6,990 (93.7%) |
| KM-H2 2nd | D13 | 68.0% | 441,240 | 11,956 (93.9%) | 9,608 (93.2%) | 12,732 (92.7%) |
| KM-H2 2nd | D21 | 70.8% | 700,000 | 12,215 (92.2%) | 8,920 (92.0%) | 10,165 (92.1%) |
| L540 1st | D5 | 20.7% | 254,120 | 13,821 (96.5%) | 12,358 (96.2%) | 919 (97.0%) |
| L540 1st | D13 | 16.1% | 330,160 | 19,807 (95.5%) | 19,498 (95.6%) | 20,571 (96.0%) |
| L540 1st | D21 | 18.9% | 450,000 | 16,166 (95.7%) | 7,294 (96.0%) | 18,097 (95.7%) |
| L540 2nd | D5 | 30.0% | 343,800 | 7,274 (95.4%) | 7,780 (95.6%) | 7,677 (95.8%) |
| L540 2nd | D13 | 15.6% | 373,030 | 10,300 (95.3%) | 13,911 (94.7%) | 8,159 (95.6%) |
| L540 2nd | D21 | 18.1% | 450,000 | 11,841 (95.4%) | 14,766 (95.3%) | 14,945 (95.3%) |

Supplementary Figure

Supplementary Figure 1. Percentages of read counts per construct for the 1st and 2nd screen. The lines represent the median. In the 2nd screen the read counts per construct were much more

Supplementary Tables

Table S1. Overview of the miRNA overexpression constructs included in the virus pool for the high-throughput screen. The 5p-strand and 3p-strand columns give the possible relevance to HL.

| No. | pCDH construct | Note | 5p-strand | 3p-strand |
|---|---|---|---|---|
| 1 | miR-7-18763 | 5 | Decreased | Not in miRBase |
| 2 | miR-9 | 1 | Increased | Increased |
| 3 | miR-21 | 5 | Top 10 | Not diff. expr. |
| 4 | miR-23a | 2 | Increased | Increased |
| 5 | miR-24 | | Increased | Increased |
| 6 | miR-26a | 2 | Not diff. expr. | Not diff. expr. |
| 7 | miR-26b | 2 | Not diff. expr. | Not diff. expr. |
| 8 | miR-27a | 2 | Increased | Increased |
| 9 | miR-27b | 4 | Not diff. expr. | Top 10 |
| 10 | miR-28 | 1 | Decreased | Decreased |
| 11 | miR-29a | 2 | Not diff. expr. | Not diff. expr. |
| 12 | miR-29b | 1 | Not diff. expr. | Not diff. expr. |
| 13 | miR-29c | | Not diff. expr. | Not diff. expr. |
| 14 | miR-30a | 3 | Not diff. expr. | Decreased |
| 15 | miR-34a | 3 | Not diff. expr. | Not diff. expr. |
| 16 | miR-101 | | Increased | Not diff. expr. |
| 17 | miR-106b | 5 | Not diff. expr. | Not diff. expr. |
| 18 | miR-125a | 1 | Not diff. expr. | Not diff. expr. |
| 19 | miR-125b | 1 | Not diff. expr. | Not diff. expr. |
| 20 | miR-141 | | Not expressed | Decreased |
| 21 | miR-142 | 3 | Top 10 | Not diff. expr. |
| 22 | miR-146a | 1 | Not diff. expr. | Not expressed |
| 23 | miR-148a | | Decreased | Decreased |
| 24 | miR-150 | 2 | Decreased | Decreased |
| 25 | miR-151a | 5 | Not diff. expr. | Not diff. expr. |
| 26 | miR-155 | 2 | Not diff. expr. | Increased |
| 27 | miR-181a-2 | 3 | Top 10 | Decreased |
| 28 | miR-181b-2 | | Not diff. expr. | Not diff. expr. |
| 29 | miR-19b-1 | 2 | Not expressed | Not diff. expr. |
| 30 | miR-200a | 5 | Not expressed | Not expressed |
| 31 | miR-200c | 5 | Not expressed | Not diff. expr. |
| 32 | miR-205 | 5 | Not diff. expr. | Not expressed |
| 33 | miR-221 | 4 | Not diff. expr. | Not diff. expr. |
| 34 | miR-222 | 1 | Not expressed | Not diff. expr. |
| 35 | miR-363 | 1 | Not expressed | Decreased |
| 36 | miR-429 | 5 | Not in miRBase | Not expressed |
| 37 | miR-449a | 5 | Not diff. expr. | Not in miRBase |
| 38 | miR-486 | 5 | Not diff. expr. | Not diff. expr. |
| 39 | miR-577 | | Decreased | Not in miRBase |
| 40 | let-7a | 1 | Top 10 | Increased |

Notes: 1, DNA input was 4-fold increased; 3, DNA input was 2-fold increased; 4, DNA input was reduced to 50% in the DNA pool used for the virus production. 2, construct with a modified insert size to make the pool more homogeneous. 5, newly added construct.

Table S2. Flag sequences of forward primers used for PCR of the pCDH pool infected samples

No. Sample ID (5’-3’) No. Sample ID (5’-3’) No. Sample ID (5’-3’)

1 CGTCCGTA 13 CTTCGCCT 25 TCCACAGC

2 CGTGCTCC 14 GGTCCGGC 26 TCGCGACT

3 CTAACAAG 15 GTACAAGA 27 TCTTACAC

4 CTACGTAA 16 GTATGGTA 28 TCTTTACA

5 CTCGGCCC 17 GTCCCGAC 29 TGACATAT

6 CTCGGTTG 18 GTCTTTTG 30 TGACGCTT

7 CTGCGAGT 19 GTGACCAT 31 TGAGTTCA

8 CTGGAGGT 20 GTGCAAAC 32 TGATAATC

9 CTGTAACA 21 GTGCGGCG 33 TGCCCATC

10 CTGTCTGC 22 GTGTCATT 34 TGCGATGC

11 CTTACTAT 23 GTTGACCC 35 TGCTAGCA

(22)

4

Supplementary Tables

Table S1. Overview of the miRNA overexpression constructs included in the virus pool for the high-throughput screen

| No. | pCDH construct | Note | Possible relevance to HL: 5p-strand | Possible relevance to HL: 3p-strand |
|---|---|---|---|---|
| 1 | miR-7-18763 | 5 | Decreased | Not in miRBase |
| 2 | miR-9 | 1 | Increased | Increased |
| 3 | miR-21 | 5 | Top 10 | Not diff. expr. |
| 4 | miR-23a | 2 | Increased | Increased |
| 5 | miR-24 | | Increased | Increased |
| 6 | miR-26a | 2 | Not diff. expr. | Not diff. expr. |
| 7 | miR-26b | 2 | Not diff. expr. | Not diff. expr. |
| 8 | miR-27a | 2 | Increased | Increased |
| 9 | miR-27b | 4 | Not diff. expr. | Top 10 |
| 10 | miR-28 | 1 | Decreased | Decreased |
| 11 | miR-29a | 2 | Not diff. expr. | Not diff. expr. |
| 12 | miR-29b | 1 | Not diff. expr. | Not diff. expr. |
| 13 | miR-29c | | Not diff. expr. | Not diff. expr. |
| 14 | miR-30a | 3 | Not diff. expr. | Decreased |
| 15 | miR-34a | 3 | Not diff. expr. | Not diff. expr. |
| 16 | miR-101 | | Increased | Not diff. expr. |
| 17 | miR-106b | 5 | Not diff. expr. | Not diff. expr. |
| 18 | miR-125a | 1 | Not diff. expr. | Not diff. expr. |
| 19 | miR-125b | 1 | Not diff. expr. | Not diff. expr. |
| 20 | miR-141 | | Not expressed | Decreased |
| 21 | miR-142 | 3 | Top 10 | Not diff. expr. |
| 22 | miR-146a | 1 | Not diff. expr. | Not expressed |
| 23 | miR-148a | | Decreased | Decreased |
| 24 | miR-150 | 2 | Decreased | Decreased |
| 25 | miR-151a | 5 | Not diff. expr. | Not diff. expr. |
| 26 | miR-155 | 2 | Not diff. expr. | Increased |
| 27 | miR-181a-2 | 3 | Top 10 | Decreased |
| 28 | miR-181b-2 | | Not diff. expr. | Not diff. expr. |
| 29 | miR-19b-1 | 2 | Not expressed | Not diff. expr. |
| 30 | miR-200a | 5 | Not expressed | Not expressed |
| 31 | miR-200c | 5 | Not expressed | Not diff. expr. |
| 32 | miR-205 | 5 | Not diff. expr. | Not expressed |
| 33 | miR-221 | 4 | Not diff. expr. | Not diff. expr. |
| 34 | miR-222 | 1 | Not expressed | Not diff. expr. |
| 35 | miR-363 | 1 | Not expressed | Decreased |
| 36 | miR-429 | 5 | Not in miRBase | Not expressed |
| 37 | miR-449a | 5 | Not diff. expr. | Not in miRBase |
| 38 | miR-486 | 5 | Not diff. expr. | Not diff. expr. |
| 39 | miR-577 | | Decreased | Not in miRBase |
| 40 | let-7a | 1 | Top 10 | Increased |

Notes: (1) DNA input was 4-fold increased, (3) DNA input was 2-fold increased and (4) DNA input was reduced to 50% in the DNA pool used for the virus production. (2) Constructs with a modified insert size to make the pool more homogenous. (5) Newly added constructs.

Table S2. Flag sequences of forward primers used for PCR of the pCDH pool infected samples

| No. | Sample ID (5’-3’) | No. | Sample ID (5’-3’) | No. | Sample ID (5’-3’) |
|---|---|---|---|---|---|
| 1 | CGTCCGTA | 13 | CTTCGCCT | 25 | TCCACAGC |
| 2 | CGTGCTCC | 14 | GGTCCGGC | 26 | TCGCGACT |
| 3 | CTAACAAG | 15 | GTACAAGA | 27 | TCTTACAC |
| 4 | CTACGTAA | 16 | GTATGGTA | 28 | TCTTTACA |
| 5 | CTCGGCCC | 17 | GTCCCGAC | 29 | TGACATAT |
| 6 | CTCGGTTG | 18 | GTCTTTTG | 30 | TGACGCTT |
| 7 | CTGCGAGT | 19 | GTGACCAT | 31 | TGAGTTCA |
| 8 | CTGGAGGT | 20 | GTGCAAAC | 32 | TGATAATC |
| 9 | CTGTAACA | 21 | GTGCGGCG | 33 | TGCCCATC |
| 10 | CTGTCTGC | 22 | GTGTCATT | 34 | TGCGATGC |
| 11 | CTTACTAT | 23 | GTTGACCC | 35 | TGCTAGCA |
| 12 | CTTATATT | 24 | GTTTAAGC | 36 | TGTACTCT |
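The 8-nucleotide flag sequences in Table S2 identify the sample in the forward PCR primer. The following is a minimal sketch of how reads could be assigned to samples with such flags. It assumes that each read starts with its flag and that only exact matches are accepted; neither is stated in this section, and the function names `assign_sample` and `count_reads_per_sample` are hypothetical. Only the first six flags of Table S2 are included to keep the example short.

```python
from collections import Counter

# First six flag sequences from Table S2 (sample number -> 8-nt flag).
FLAGS = {
    1: "CGTCCGTA",
    2: "CGTGCTCC",
    3: "CTAACAAG",
    4: "CTACGTAA",
    5: "CTCGGCCC",
    6: "CTCGGTTG",
}
FLAG_TO_SAMPLE = {seq: number for number, seq in FLAGS.items()}
FLAG_LENGTH = 8


def assign_sample(read):
    """Return the sample number whose flag starts the read, or None.

    Assumption: the flag occupies the first 8 nucleotides of the read
    and must match exactly (no mismatches tolerated).
    """
    return FLAG_TO_SAMPLE.get(read[:FLAG_LENGTH].upper())


def count_reads_per_sample(reads):
    """Count reads per sample; unassigned reads are counted under None."""
    counts = Counter()
    for read in reads:
        counts[assign_sample(read)] += 1
    return counts


if __name__ == "__main__":
    # Made-up reads for illustration only.
    example_reads = ["CGTCCGTAACGTACGT", "ctaacaagTTTTGGGG", "AAAAAAAAGGGGCCCC"]
    print(count_reads_per_sample(example_reads))
```

Exact matching keeps the sketch simple; a real pipeline might tolerate a mismatch, which would require checking that the flags differ from each other at enough positions.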


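Table S3. Average read counts per construct in high throughput screen

| Order as shown in Fig 3/4 | pCDH construct | Average read counts: all samples | Average read counts: L540 | Average read counts: KM-H2 |
|---|---|---|---|---|
| 1 | miR-363 | 86 | 91 | 80 |
| 2 | miR-146a | 93 | 108 | 78 |
| 3 | miR-23a | 93 | 116 | 71 |
| 4 | miR-9 | 104 | 91 | 117 |
| 5 | miR-106b | 106 | 123 | 88 |
| 6 | miR-150 | 131 | 130 | 132 |
| 7 | miR-125b | 154 | 176 | 132 |
| 8 | miR-29a | 165 | 183 | 147 |
| 9 | miR-101 | 168 | 189 | 146 |
| 10 | miR-29c | 175 | 163 | 186 |
| 11 | miR-577 | 178 | 205 | 153 |
| 12 | let-7a | 181 | 198 | 165 |
| 13 | miR-21 | 189 | 160 | 219 |
| 14 | miR-148a | 189 | 130 | 249 |
| 15 | miR-30a | 209 | 190 | 229 |
| 16 | miR-26a | 209 | 194 | 227 |
| 17 | NC | 215 | 237 | 193 |
| 18 | miR-155 | 216 | 257 | 175 |
| 19 | miR-181a-2 | 221 | 259 | 184 |
| 20 | miR-222 | 226 | 248 | 206 |
| 21 | miR-151a | 228 | 205 | 251 |
| 22 | miR-125a | 240 | 273 | 206 |
| 23 | miR-34a | 256 | 223 | 288 |
| 24 | miR-486 | 271 | 210 | 333 |
| 25 | miR-205 | 272 | 285 | 261 |
| 26 | miR-141 | 275 | 246 | 304 |
| 27 | miR-24 | 276 | 312 | 241 |
| 28 | miR-27b | 286 | 280 | 290 |
| 29 | miR-429 | 286 | 268 | 304 |
| 30 | miR-181b-2 | 322 | 344 | 300 |
| 31 | miR-29b | 323 | 341 | 307 |
| 32 | miR-142 | 340 | 307 | 374 |
| 33 | miR-221 | 346 | 390 | 301 |
| 34 | miR-26b | 355 | 221 | 490 |
| 35 | miR-28 | 429 | 424 | 434 |
| 36 | miR-7-18763 | 474 | 437 | 512 |
| 37 | miR-19b-1 | 1,652 | 1,731 | 1,561 |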

Table S4. Comparison of normalized read counts per pCDH construct in 1st and 2nd screen

| Construct | 1st screen: insert size | 1st screen: percentage of reads | 2nd screen: insert size | 2nd screen: percentage of reads |
|---|---|---|---|---|
| miR-363 | 491 | 0.07% | 491 | 0.86% |
| miR-146a | 491 | 0.07% | 491 | 0.93% |
| miR-9 | 490 | 0.07% | 490 | 1.04% |
| miR-125b | 489 | 0.09% | 489 | 1.54% |
| miR-150 | 570 | 0.09% | 499 | 1.31% |
| miR-125a | 491 | 0.10% | 491 | 2.40% |
| miR-222 | 491 | 0.12% | 491 | 2.26% |
| let-7a | 489 | 0.16% | 489 | 1.81% |
| miR-30a | 492 | 0.23% | 492 | 2.09% |
| miR-29b | 483 | 0.31% | 483 | 3.23% |
| miR-142 | 490 | 0.35% | 490 | 3.40% |
| miR-28 | 492 | 0.33% | 492 | 4.29% |
| miR-181a-2 | 491 | 0.38% | 491 | 2.21% |
| miR-34a | 491 | 0.40% | 491 | 2.56% |
| miR-29c | 492 | 0.62% | 492 | 1.75% |
| miR-101 | 490 | 0.63% | 490 | 1.68% |
| miR-141 | 491 | 0.71% | 491 | 2.75% |
| miR-148a | 492 | 0.78% | 492 | 1.89% |
| miR-577 | 487 | 0.84% | 487 | 1.78% |
| miR-181b-2 | 493 | 0.92% | 493 | 3.22% |
| miR-24 | 491 | 0.95% | 491 | 2.76% |
| miR-26b | 450 | 1.65% | 499 | 3.55% |
| miR-26a | 386 | 2.15% | 495 | 2.09% |
| miR-27b | 492 | 2.16% | 492 | 2.86% |
| miR-29a | 434 | 2.33% | 489 | 1.65% |
| miR-221 | 489 | 2.58% | 489 | 3.46% |
| miR-23a | 247 | 3.24% | 501 | 0.93% |
| miR-155 | 338 | 6.02% | 497 | 2.16% |
| miR-27a | 278 | 10.71% | 503 | 0.05% |
| miR-19b-1 | 185 | 60.94% | 491 | 16.52% |
| miR-200c | | | 525 | 0.09% |
| miR-449a | | | 487 | 0.14% |
| miR-200a | | | 511 | 0.32% |
| miR-106b | | | 519 | 1.06% |
| miR-21 | | | 481 | 1.89% |
| NC | | | 511 | 2.15% |
| miR-151a | | | 493 | 2.28% |
| miR-486 | | | 492 | 2.71% |
| miR-205 | | | 507 | 2.72% |
| miR-429 | | | 512 | 2.86% |
| miR-7-18763 | | | 491 | 4.74% |
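Table S4 reports read counts per construct as a percentage of reads. As a minimal sketch of that kind of normalization, the function below converts raw counts per construct into percentages of the total. It assumes the percentage is simply the construct count divided by the summed count of all constructs in the sample; the exact normalization used for the table is not described in this section, and the function name and example counts are hypothetical.

```python
def percentage_of_reads(counts):
    """Convert raw read counts per construct into percentages of the total.

    Assumption: percentage = 100 * construct count / sum of all counts.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total read count is zero; cannot normalize")
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Made-up counts for illustration only; not taken from the screen data.
    example_counts = {"construct_A": 50, "construct_B": 150, "NC": 300}
    for name, pct in percentage_of_reads(example_counts).items():
        print(f"{name}: {pct:.2f}%")
```

Expressing counts as percentages makes constructs comparable between screens with different sequencing depth, which is what a side-by-side comparison of two screens requires.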
