The aberrant transcriptional program of myeloid malignancies with poor prognosis
Gerritsen, Myléne
DOI:
10.33612/diss.113503008
IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from
it. Please check the document version below.
Document Version
Publisher's PDF, also known as Version of record
Publication date:
2020
Link to publication in University of Groningen/UMCG research database
Citation for published version (APA):
Gerritsen, M. (2020). The aberrant transcriptional program of myeloid malignancies with poor prognosis:
the effects of RUNX1 and TP53 mutations in AML. Rijksuniversiteit Groningen.
https://doi.org/10.33612/diss.113503008
Ring sideroblasts in AML are associated with adverse risk characteristics and
have a distinct gene expression pattern
G. Berger, M. Gerritsen, G. Yi, T.N. Koorenhof-Scheele,
L.I. Kroeze, M. Stevens-Kroef, K. Yoshida, Y. Shiraishi,
E. van den Berg, H. Schepers, G. Huls, A.B. Mulder,
S. Ogawa, J.H.A. Martens, J.H. Jansen, E. Vellenga
This research was originally published in Blood
Advances. © the American Society of Hematology.
ABSTRACT
Ring sideroblasts (RS) emerge as a result of aberrant erythroid differentiation leading to excessive
mitochondrial iron accumulation, a characteristic feature of myelodysplastic syndromes (MDS)
with mutations in the spliceosome gene SF3B1. However, RS can also be observed in patients
diagnosed with acute myeloid leukemia (AML). The objective of this study was to characterize
RS in AML patients. Clinically, RS-AML is enriched for ELN adverse-risk (55%). In line with this
finding, 35% of all cases had complex cytogenetic aberrancies and TP53 was most recurrently
mutated in this cohort (37%), followed by DNMT3A (26%), RUNX1 (25%), TET2 (20%) and ASXL1
(19%). In contrast to RS-MDS, the incidence of SF3B1 mutations was low (8%). Whole exome
sequencing and SNP array analysis on a subset of patients did not uncover one single genetic
defect underlying the RS phenotype. Shared genetic defects between erythroblasts and
total mononuclear cell fraction indicate common ancestry for the erythroid lineage and the
myeloid blast cells in RS-AML patients. RNA sequencing analysis on CD34+ AML cells revealed
differential gene expression between RS-AML and non-RS-AML cases, including genes involved
in megakaryocyte and erythroid differentiation. Furthermore, several heme metabolism-related
genes were found to be up-regulated in RS-AML CD34+ cells, as was observed in SF3B1mut MDS.
These results demonstrate that although the genetic background of RS-AML differs from that of
RS-MDS, certain downstream effector pathways are in common.
INTRODUCTION
Ring sideroblasts (RS) are erythroid precursor cells that accumulate excessive mitochondrial
iron and can be observed in bone marrow smears associated with multiple medical conditions [1].
Presence of RS is a characteristic feature in myelodysplastic syndrome (MDS) subtypes, including
MDS with single lineage dysplasia (MDS-RS-SLD), multilineage dysplasia (MDS-RS-MLD) and in
combination with the presence of thrombocytosis (MDS/MPN-RS-T) [2]. Non-malignant causes
of RS include several drugs, toxins, alcohol, copper deficiency and congenital sideroblastic
anemia [3]. This latter group comprises conditions caused by inborn defects in genes that operate
in several mitochondrial pathways, including ALAS2, ABCB7, SLC25A38 and HSPA9 [4–7]. In MDS, the
RS phenotype is strongly correlated with mutations in splicing factor 3B subunit 1 (SF3B1), with
an incidence higher than 80% [8–12]. SF3B1 mutations are usually observed in low-risk MDS, which
is characterized by a stable clinical course and a low risk of leukemic transformation [8,13]. As a core
component of the U2 small nuclear ribonucleoprotein particle (snRNP), SF3B1 is essential for
pre-mRNA splicing [14]. The molecular mechanism by which SF3B1 mutations result in RS formation is
not yet fully understood. A proposed mechanism is that specific patterns of missplicing result in
altered expression of genes that are essential for correct programming of erythropoiesis [15–17]. The
relationship between genetic defects in SF3B1 and the RS phenotype is not one-to-one; in
10-20% of the MDS-RS patients no mutation in the SF3B1 gene is detected [8–12]. Moreover, RS can also
be present in a subset of AML patients, while SF3B1 mutations are infrequent in this disease [10,18,19].
Besides SF3B1, the only other correlation between a gene defect and the RS phenotype was
described for PRPF8, for which mutations are reported in ~3% of myeloid neoplasms, including
MDS, MDS/MPN and (s)AML [20]. In the present study we determined the prevalence of RS in various
ontogenic AML subtypes and the association of the RS phenotype in AML with adverse risk
characteristics. To identify the landscape of genomic defects that underlies the RS phenotype
in AML, we performed whole exome sequencing, targeted sequencing and SNP-array analysis.
Finally, to identify differential expression of genes associated with the RS phenotype in AML, we
performed RNA sequencing on CD34+-selected AML cells.
MATERIALS AND METHODS
Patients and data collection
For this study, we collected data from 126 AML and high-risk MDS (≥10% bone marrow blasts)
patients who were diagnosed between January 2000 and April 2018 at the University Medical
Center Groningen. The inclusion criterion was the presence of ring sideroblasts in the diagnostic
bone-marrow smear. Patients with previously reported MDS-RS were excluded. Diagnosis and
risk classification were revised based upon the World Health Organization classification (2016) [2] and
European Leukemia Net (ELN) recommendations (2017) [21]. Bone marrow (BM) and/or peripheral
blood (PB) [...] study was conducted in accordance with the Declaration of Helsinki and institutional guidelines
and regulations. Morphological and cytogenetic analyses were based on standard procedures. On
iron-stained aspirate smears, 400 red blood cell precursors were counted. Ring sideroblasts were
defined by ≥ 5 iron granules encircling one third or more of the nucleus. The ring sideroblasts
percentage was determined as percentage of the total red blood cell precursor cells.
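The counting rule above, together with the three RS subgroups used in the Results (1-4%, 5-14% and ≥15% RS), can be sketched as follows. This is a minimal illustration and not the authors' software: the function names are hypothetical, and assigning fractional percentages by lower-bound thresholds is an assumption.

```python
def rs_percentage(ring_sideroblasts: int, erythroid_precursors: int = 400) -> float:
    """Ring sideroblasts as a percentage of all counted red blood cell precursors.

    The default of 400 counted precursors follows the text; everything else
    about this helper is illustrative.
    """
    if erythroid_precursors <= 0:
        raise ValueError("erythroid_precursors must be positive")
    if not 0 <= ring_sideroblasts <= erythroid_precursors:
        raise ValueError("ring_sideroblasts must lie between 0 and the total count")
    return 100.0 * ring_sideroblasts / erythroid_precursors


def rs_subgroup(percentage: float) -> str:
    """Assign an RS percentage to one of the subgroups used in Table 1.

    Assumption: fractional values are binned by lower-bound thresholds
    (for example 4.5% falls into the 1-4% group).
    """
    if percentage < 1:
        return "below inclusion threshold (<1% RS)"
    if percentage < 5:
        return "1-4% RS"
    if percentage < 15:
        return "5-14% RS"
    return ">=15% RS"


# Hypothetical smear: 60 ring sideroblasts among 400 counted precursors.
example = rs_percentage(60)
print(example, rs_subgroup(example))
```

Keeping the percentage calculation separate from the binning makes it easy to swap in a different rounding convention if the laboratory protocol specifies one.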
Sorting of cell fractions
The mononuclear cell (MNC) fraction from BM and/or PB was obtained by density gradient
centrifugation using Lymphoprep (PAA, Cölbe, Germany) according to standard procedures.
Analysis and sorting of various cell fractions was performed on MoFlo XDP or Astrios (Beckman
Coulter). A list of antibodies can be found in the Supplementary Methods.
DNA isolation and amplification
Genomic DNA from various cell fractions was extracted with the NucleoSpin Tissue kit
(Macherey-Nagel, Düren, Germany) according to the manufacturer’s instructions. In case of insufficient
yield, a maximum of 70 ng DNA was amplified using the Qiagen REPLI-g kit (Qiagen, Venlo, the
Netherlands), according to the manufacturer’s protocol.
Targeted deep sequencing using a myeloid gene panel
Targeted sequencing of DNA derived from BM or PB samples obtained at diagnosis was carried
out using the myeloid TruSight sequencing panel (Illumina, San Diego, CA, USA) or an
in-house sequencing panel containing 27 genes (Supplementary Table 1). Library preparation
was performed according to the manufacturer’s protocol (Illumina). Aligning and filtering of
sequencing data was performed using NextGENe version 2.3.4.2 (SoftGenetics, Pennsylvania, US).
Cartagenia Bench Lab NGS (Agilent, Santa Clara, CA, USA) was used for analysis of the resulting vcf
files. Sequencing artifacts were excluded using a threshold of 5%. A minimal variant read depth
of 20 reads was set as criterion. Variants that frequently occur in the general healthy population
(>2% 1000 Genome Phase 1, ESP6500 and dbSNP, and >5% Genome of the Netherlands) were
excluded from further analysis.
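The filtering criteria described above can be expressed as a small predicate. This is a sketch under stated assumptions rather than the NextGENe or Cartagenia configuration: the 5% artifact threshold is interpreted here as a minimum variant allele frequency, a variant is assumed to be excluded if it exceeds the limit in any one population database, and the class, field and database key names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Thresholds taken from the text; their interpretation is an assumption.
MIN_VAF = 0.05             # artifact threshold, read as a minimum allele fraction
MIN_VARIANT_READS = 20     # minimal variant read depth
MAX_POPULATION_AF: Dict[str, float] = {
    "1000G_phase1": 0.02,  # hypothetical keys for the databases named in the text
    "ESP6500": 0.02,
    "dbSNP": 0.02,
    "GoNL": 0.05,
}


@dataclass
class Variant:
    gene: str
    vaf: float               # variant allele frequency as a fraction (0-1)
    variant_reads: int       # reads supporting the variant
    population_af: Dict[str, float] = field(default_factory=dict)


def passes_filters(variant: Variant) -> bool:
    """Return True if the variant survives all three filtering steps."""
    if variant.vaf < MIN_VAF:
        return False
    if variant.variant_reads < MIN_VARIANT_READS:
        return False
    # Missing database entries are treated as frequency 0 (assumption).
    for database, limit in MAX_POPULATION_AF.items():
        if variant.population_af.get(database, 0.0) > limit:
            return False
    return True


# Hypothetical calls, for illustration only.
calls = [
    Variant("TP53", vaf=0.40, variant_reads=120),
    Variant("TET2", vaf=0.03, variant_reads=50),
    Variant("ASXL1", vaf=0.50, variant_reads=100, population_af={"GoNL": 0.06}),
]
print([v.gene for v in calls if passes_filters(v)])
```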
Microarray-based genomic profiling
Microarray-based genomic profiling on MNC and erythroblast fractions was performed on
a CytoScan HD array platform (Affymetrix, Inc., Santa Clara, CA, USA) in agreement with the
manufacturer’s reference. Data analysis was performed using Chromosome Analysis Suite
software package (Affymetrix) using annotations of reference genome build GRCh37 (hg19).
Comprehensive analysis and interpretation of the obtained microarray genomic profiling
data was performed using a previously described filtering pipeline and criteria [22]. Aberrations
were described according to the standardized ISCN 2016 nomenclature system [23]. Visualization of the resultant genomic profiles
was performed using NEXUS software (Nexus Copy Number 8.0, BioDiscovery, El Segundo, CA,
USA).
Whole exome sequencing and confirmation of mutations in erythroblasts
Whole exome sequencing (WES) to an average depth of 143x was performed on DNA isolated
from diagnostic BM-MNC (n=13) or PB-MNC (n=3) samples. The procedure is described in more
detail in the Supplementary Methods. For a subset of patients, the presence of somatic variants in
TP53 and SRSF2 identified by WES was validated and quantified in the erythroblast fraction as
described in the Supplementary Methods.
RNA extraction and Illumina high-throughput sequencing
RNA was isolated by separation of the aqueous phase using TRIzol Reagent (Thermo Fisher) according
to the manufacturer’s protocol. The aqueous phase was subsequently mixed 1:1 with 70% ethanol
and isolation was continued using the RNeasy mini kit (Qiagen), including on-column
DNase I treatment. Library preparation, Illumina high-throughput sequencing and RNA-seq data
analysis are described in detail in the Supplementary Materials. Raw RNA-seq data are available via
GSE127861.
Statistical analysis
Bivariate correlations were made using a Pearson correlation (continuous variables) or Spearman
correlation (categorical variables). A p-value <0.05 was used to define statistical significance.
Statistical calculations were performed using Prism version 6.0.
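For readers who want to reproduce the two correlation coefficients named above without Prism, a dependency-free sketch is given below. It computes the coefficients only, not the p-values, and it is not the software used in the study; the function names are illustrative.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

Defining Spearman's coefficient as the Pearson correlation of average ranks handles ties correctly, which matters for ordinal variables with few distinct levels.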
RESULTS
Clinical characteristics
To study the ring sideroblast (RS) phenotype in more detail in a comprehensive group of myeloid
neoplasms (MNs), we collected clinical data on a cohort of patients (n=126) consisting of AML
and high-risk MDS patients (≥10% BM blasts), hereafter also indicated as ‘AML patients’. These
include patients with RS (≥1%) in the diagnostic bone marrow smear, excluding those with a
documented prior clinical history of MDS-RS. The median blast percentage in this cohort was
32% (range 10-91%), two-thirds of the patients were male and the median age at diagnosis was
67 years (range 32-87). The majority of patients were diagnosed with de novo AML (55.6%) and
AML with myelodysplasia-related changes was the most common WHO (2016) subtype (37.3%,
Table 1). Although the highest RS percentages were observed in cases with lower blast counts, blast
count did not significantly correlate with RS percentage (Pearson r = 0.16, p = 0.07; Figure 1A).
Patients were placed into three subgroups based on their RS percentage: 1-4% RS, 5-14% RS and
≥15% RS (Table 1). These groups were comparable regarding erythroblast percentage and age.
Table 1 – Clinical characteristics of AML with RS phenotype

Characteristic | All RS cases | 1-4% RS | 5-14% RS | ≥15% RS
Total | 126 | 53 (42%) | 28 (22%) | 45 (36%)
Age (years)
  Median | 67 | 67 | 63 | 70
  Range | 32-87 | 40-81 | 32-78 | 43-87
Sex
  Male | 84 (67%) | 32 (60%) | 20 (71%) | 32 (71%)
  Female | 42 (33%) | 21 (40%) | 8 (29%) | 13 (29%)
Type of disease
  de novo AML | 70 (56%) | 35 (66%) | 12 (43%) | 23 (51%)
  sAML | 20 (16%) | 6 (11%) | 8 (29%) | 6 (13%)
  t-MN | 17 (14%) | 7 (13%) | 3 (11%) | 7 (16%)
  Other | 19 (15%) | 5 (9%) | 5 (18%) | 9 (20%)
WHO diagnosis
  AML with MDS-related changes | 47 (37%) | 14 (26%) | 16 (57%) | 17 (38%)
  AML NOS | 24 (19%) | 13 (25%) | 3 (11%) | 8 (18%)
  t-MN | 17 (14%) | 7 (13%) | 3 (11%) | 7 (16%)
  MDS-EB2 | 17 (14%) | 5 (9%) | 4 (14%) | 8 (18%)
  AML with recurrent abnormalities | 16 (13%) | 13 (25%) | 1 (4%) | 2 (4%)
  Other | 5 (4%) | 1 (2%) | 1 (4%) | 3 (7%)
ELN risk score
  Favorable | 12 (10%) | 11 (21%) | 1 (4%) | 0 (0%)
  Intermediate | 8 (6%) | 4 (8%) | 2 (7%) | 2 (1%)
  Intermediate* | 29 (23%) | 17 (32%) | 5 (18%) | 9 (20%)
  Adverse | 74 (59%) | 21 (40%) | 19 (67%) | 34 (76%)
  Unknown | 3 (2%) | 0 (0%) | 1 (4%) | 2 (4%)
BM blasts, %
  Median | 32% | 34% | 24% | 32%
  Range | 10-91% | 10-88% | 11-88% | 10-91%
Erythroblasts, %
  Median | 21% | 23% | 21% | 17%
  Range | 1-64% | 4-64% | 2-50% | 1-59%

Data denoted as number (percentage), unless otherwise stated. * No evaluation for ASXL1, RUNX1 and TP53. Abbreviations: AML – acute myeloid leukemia, t-MN – therapy-related myeloid neoplasm, NOS – not otherwise specified, sAML – secondary AML, MDS-EB2 – myelodysplastic syndrome with excess blasts type 2, ELN – European Leukemia Net, WHO – World Health Organization, BM – bone marrow, RS – ring sideroblasts
The group of AML patients with an RS phenotype was enriched for patients in the ELN adverse-risk
category (55%) (Figure 1B). The proportion of adverse-risk patients increased with increasing RS
percentage (Table 1, Figure 1B). The subgroup with ≥15% RS, which represents the minimum
required percentage for WHO MDS-RS diagnosis in absence of SF3B1 mutations [2], had no patients
in the ELN favorable-risk category (Figure 1B).
Figure 1. Clinical data. A. Correlation between blast percentage and RS percentage, both determined in diagnostic bone
marrow smear. Boxplot represents median and range of blast percentage in the RS cohort. B. Pie charts representing risk classification (ELN 2017) for total RS cohort (‘all cases’) and three subgroups based on RS percentage (Table 1) * Mutational status of ASXL1, RUNX1 and TP53 unknown.
Mutational and chromosomal defects observed in association with RS phenotype
Complex cytogenetic aberrancies (defined as 3 or more chromosomal abnormalities) detected
by conventional karyotyping were observed in 35% of all RS cases (n=126, Figure 2A). Increasing
incidences of chromosomal abnormalities were associated with increasing RS percentages in the
three subgroups (21%, 39% and 49%, respectively; data not shown). Screening for mutations
in CEBPA, FLT3 and NPM1 was conducted by conventional RT-PCR. In addition, a subset of 60
patients (de novo AML n=36, sAML n=11, t-MN n=6, MDS-EB2 (MDS with excess blasts, ≥10% blasts)
n=7) was analyzed for mutations in a panel of genes that are recurrently mutated in MNs using
NGS methods (either panel-based (Supplementary Table 1) or WES). In this subset, 27 patients
had ≥15% RS, 15 patients had 5-14% RS and 18 patients had 1-4% RS. In accordance
with the cytogenetic findings, mutations in the TP53 gene (37%), which frequently coincide with
genetic instability, were the most recurrent in this subset (Figure 2B). Other frequently mutated
genes included RUNX1 (25%), NPM1 (16%) and the epigenetic modifiers DNMT3A (26%), ASXL1
(19%) and TET2 (20%). SF3B1 mutations were detected in 6 cases: four with de novo AML, one
with t-MN and one with sAML. In two patients, no mutations in genes frequently affected in
myeloid neoplasms were identified. The incidence of certain mutations tended to segregate with
RS percentages: NPM1 mutations were mainly observed in cases with low RS percentages (30%
in the 1-4% RS group vs. 3% in the ≥15% RS group; Figure 2B). In contrast, TP53 and GATA2 mutations
were detected especially in patients with higher RS percentages (73% in the ≥15% RS group vs. 12%
in the 1-4% RS group; Figure 2B).
Figure 2 - Chromosomal and molecular defects observed in association with RS phenotype. A. Frequency of commonly
observed cytogenetic defects in myeloid neoplasms (MNs) in the RS cohort, subdivided by RS percentage at diagnosis, n=126.
B. Frequency of mutations in genes commonly mutated in MNs as detected by NGS, subdivided by RS percentage. ASXL1,
CALR, CBL, DNMT3A, EZH2, IDH1, IDH2, JAK2, KIT, NRAS, RUNX1, SF3B1, SRSF2, TET2, TP53 and WT1 n=60; BCOR, BCORL, ETV6, GATA2, GNAS, IKZF1, NOTCH1, PTPN11, SETBP1, STAG2, U2AF1 and ZRSR2 n=45. For CEBPA (n=83), FLT3 (n=103) and NPM1 (n=93), results of PCR followed by Sanger sequencing (CEBPA) or capillary electrophoresis (FLT3 and NPM1) are shown.
Genome-wide screening for genetic defects
To identify genetic defects underlying the RS phenotype that are not covered by panel-based
sequencing, 15 patients of the RS cohort were selected for genome-wide analysis based on high
RS percentage (≥13%) and material availability. This group was supplemented with one patient
(RS022) that had 15% RS at AML relapse following autologous stem cell transplantation, while
no RS were observed at initial AML diagnosis. All 16 patients belonged to the adverse risk group
according to ELN criteria [21]. Using WES, samples were first analyzed for the occurrence of germline
mutations in genes associated with congenital sideroblastic anemia (Supplementary Table 2).
However, no such mutations were detected. Based on WES, a median of 19 (range 9-34) somatically
acquired mutations were found per patient, of which a median of 2.5 (range 0-10) were observed
in genes recurrently mutated in myeloid malignancies [19] (Figure 3A, 3B and Supplementary
Table 3). The total number of mutations did not correlate with age (Pearson r = 0.07, p = 0.78,
Figure 3C). The vast majority of observed mutations concerned nonsynonymous point mutations
(Figure 3D). TP53 was most frequently affected in this cohort (in 12/16 patients), followed by
DNMT3A (4/16) and SRSF2 (3/16) (Figure 3A). In this cohort, one mutation was detected in SF3B1
and no mutations were observed in PRPF8. For patient RS022, who had 15% RS and several
cytogenetic abnormalities in the relapse sample while both were absent at initial presentation,
no mutations were detected in known leukemia-associated genes (Figure 3A). However, the
observed mutation in BUB1, a key player in the mitotic spindle checkpoint, most likely explains the
chromosomal aberrancies observed in this relapse sample (Supplementary Table 3). In addition
to WES, SNP array analysis was used to identify chromosomal defects (Supplementary Figure 1).
Shattering of one or more chromosomes (chromothripsis) was detected in 9/16 cases. Deletions
of chromosomes 5q, (parts of) chromosome 7, chromosome 12p, chromosome 17p and (partial)
gain of chromosome 8 were frequently observed chromosomal aberrancies (Figure 3A and
Figure 3E). In 6/16 cases the 17p deletion involved the TP53 locus, and for 4/16 cases this deletion
involved the locus of PRPF8 (marked by asterisks in Figure 3A). Together, these data indicate that
no single-gene defect underlies the RS phenotype in this cohort: RS in AML is associated with a
variety of adverse-risk genetic defects.
Erythroblasts share genetic defects with leukemic clones
To determine whether RS are part of the malignant leukemic clone, TP53 and SRSF2 mutations
detected by WES in the total MNC fraction (containing both myeloid blast cells and differentiated
cells) were verified in sorted erythroblast populations (see Supplementary methods and
Supplementary Figure 2) of 8 patients. VAFs strongly correlated between both cell fractions
(Pearson r = 0.94, p = 0.0002) (Figure 4A). No correlation was observed between VAFs of
erythroblast mutations and the percentage of RS (Pearson r = -0.43, p = 0.20) (Figure 4B). SNP
array analysis on both fractions revealed that most of the cytogenetic defects were shared by
the MNC fraction and erythroblast populations. However, differences were also detected in 4/6
patients (Figure 4C). Observed differences consisted of either novel aberrancies, exemplified by
the gain of chromosome 11 in the EBs of patient RS006, or discrepancies in observed frequency
of cytogenetic defects, as observed for the 7q deletion in patient RS009 (0.6 in EBs vs. 0.25 in
total MNCs) (Figure 4C). The frequency of cytogenetically aberrant clones within the erythroblast
fractions did not correlate to RS percentages in these four patients (data not shown). No hotspot
mutations in exons 13-16 of SF3B1 were detected in sorted EB fractions.
As erythroblasts of RS-MN patients were observed as part of the malignant clone, we hypothesized
that RS could be eliminated upon treatment. We therefore analyzed follow-up BM smears,
including RS percentage, which were available for 17 patients of the RS cohort. Eight patients
received intensive chemotherapy [24] (median interval between both evaluations 2 months (range
1-5)), six were treated using hypomethylating agents [25] (median interval 4.3 months (range 2.5-12)) and
three patients received a combination of both (median interval 16 months (range 5-20)) (Supplementary
Table 4). Of this group, 65% responded to therapy (defined as complete or partial response)
and 35% did not respond (defined as no response or relapse). Patients who responded showed
a marked decrease in RS percentage at follow-up evaluation, whereas the RS percentage in
non-responders was stable or increased (Figure 4D).
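The follow-up comparison can be made concrete with a small sketch: responders are defined as complete or partial response, non-responders as no response or relapse, and the follow-up RS percentage is expressed relative to the value at diagnosis. The function names, the outcome labels as strings and the example values are hypothetical and are not taken from Supplementary Table 4.

```python
RESPONSE_LABELS = {"complete response", "partial response"}
NO_RESPONSE_LABELS = {"no response", "relapse"}


def is_responder(outcome: str) -> bool:
    """Map a treatment outcome label to responder (True) or non-responder (False)."""
    label = outcome.strip().lower()
    if label in RESPONSE_LABELS:
        return True
    if label in NO_RESPONSE_LABELS:
        return False
    raise ValueError(f"unknown outcome label: {outcome!r}")


def relative_rs(rs_at_diagnosis: float, rs_at_follow_up: float) -> float:
    """Follow-up RS percentage relative to diagnosis (1.0 means unchanged)."""
    if rs_at_diagnosis <= 0:
        raise ValueError("RS percentage at diagnosis must be positive")
    if rs_at_follow_up < 0:
        raise ValueError("RS percentage at follow-up cannot be negative")
    return rs_at_follow_up / rs_at_diagnosis


# Hypothetical patients: (outcome, RS % at diagnosis, RS % at follow-up).
patients = [
    ("complete response", 20.0, 5.0),
    ("relapse", 10.0, 15.0),
]
for outcome, diagnosis, follow_up in patients:
    print(is_responder(outcome), relative_rs(diagnosis, follow_up))
```

A ratio below 1.0 corresponds to a decrease in RS percentage after treatment, and a ratio at or above 1.0 to a stable or increased percentage.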
Figure 3. Genetic defects detected by whole exome sequencing and SNP-array analysis. A. Overview; for each patient, all mutations in genes known to be recurrently mutated in myeloid malignancies detected by WES are depicted as well as cytogenetic abnormalities detected by SNP array analysis. 2 = two mutations were detected. B. Number of acquired mutations per patient as determined by WES. In black, the number of mutations in genes that have been previously implicated in pathogenesis of myeloid malignancies; in grey, the number of mutations in genes that have not been previously implicated in myeloid malignancies. C. Correlation between age and the number of mutations. D. Distribution of the various types of alterations detected in the total set of patients with RS phenotype. E. SNP array results overview, figure created using Nexus software. Red = loss, blue = gain.
Up-regulated genes in RS-AML are associated with megakaryocyte/erythroid
differentiation and mRNA splicing
To investigate transcriptional differences that underlie the RS phenotype in AML, RNA sequencing
was performed on CD34+-selected AML cells of six patients with ≥13% RS (Supplementary
Table 5). The results were compared to normal bone marrow (NBM) CD34+ and SF3B1-mutated
MDS CD34+ samples (GSE63569 [26]; hereafter indicated as SF3B1mut MDS), samples from 36
AML patients with a mixed background regarding cytogenetic and genetic defects (with no
documented presence of RS) that were part of the Blueprint study [27] (hereafter indicated as
non-RS-AML) and three AML samples containing SF3B1 mutations (TCGA, Blueprint dataset and one of
our own samples (RS020); hereafter indicated as SF3B1mut AML). First, PCA and t-SNE analyses were
performed, which consistently separated non-RS-AML from NBM and MDS samples (Figure 5A).
RS-AML samples were located in between these populations, as were the SF3B1mut AMLs. These
findings suggest that RS-AML comprises a distinct entity that partially resembles expression
patterns in SF3B1mut AML, probably reflecting involvement of overlapping pathways. By direct
comparison of RS-AML to NBM CD34+, 1196 genes were identified to be up-regulated in AML with
RS phenotype. Functionally, this gene set was highly enriched for genes involved in epigenetic
regulation as well as protein modifications including ubiquitination (Supplementary Figure 3A).
Processes associated with down-regulated genes (n=1309) included cell cycle-related pathways
and DNA replication (Supplementary Figure 3B).
To investigate gene expression patterns that are correlated to RS-AML but not to AML in general,
we compared gene expression of RS-AML to non-RS-AML patients. Down-regulated genes in
RS-AML vs. non-RS-AML are involved in immune-related processes (Supplementary Figure 3C).
Gene ontology analysis on the up-regulated genes in RS-AML revealed enrichment for genes
involved in megakaryocyte differentiation and platelet function (Figure 5B and Supplementary
Table 6), including GATA1, GATA2, KLF1, AHSP and TRIM58, which are genes that are also involved
in the regulation of erythroid differentiation. Furthermore, this analysis determined that EPOR, the
gene encoding the erythropoietin receptor, was also up-regulated in CD34+ cells of patients
with RS-AML.
Figure 4 - Genetic defects in erythroblasts of patients with RS phenotype. A. VAFs of TP53 and SRSF2 mutations detected
in mononuclear cell (MNC) fractions (determined by WES) and erythroblast fractions (determined by amplicon-based sequencing). B. Correlation between VAFs in erythroblast fractions (VAF indicated on left y-axis) and observed RS percentages (percentage indicated on right y-axis). Different colors represent individual mutations (as indicated in legend of Figure 4A).
C. Differences observed in SNP array results between MNC and EB fractions; figure shows screenshots taken from Chromosome
Analysis Suite software package (Affymetrix). D. Relative changes in RS percentage of patients who received treatment; the RS percentages at follow-up examination (post-treatment) are displayed relative to the RS percentages determined at diagnosis (Diagnosis). Blue lines indicate patients who responded to therapy and red lines indicate patients who did not respond (see
To determine whether the increased expression of these genes is due to an increased amount of
megakaryocyte–erythroid progenitors (MEPs) in the RS-AML, we extracted a MEP signature based
on previously published gene expression patterns [28,29]. As our cohort of non-RS-AMLs comprises
both CD33+ and CD34+ selected samples, we first compared the MEP signature between both
cell populations to exclude that marker-biased analysis determines this difference. However,
this comparison did not reveal differences between both cell fractions with regard to the extent
of the MEP signature (Supplementary Figure 3D). Compared to the non-RS-AML cohort,
RS-AML displayed a slightly increased expression of this MEP signature (Figure 5C). Altogether,
these findings demonstrate that CD34+ RS-AML cells are partially distinct, as reflected by elevated
expression of genes associated with megakaryocyte and erythroid differentiation, including the MEP
signature.
Figure 5 - RS-AML transcription program. A. PCA and t-SNE plots using RNA-seq results of the indicated cell types. Previously
published RNA-seq was included for the following cell types: normal bone marrow (NBM) (GSE63569), non-RS-AML (Blueprint study), SF3B1mut MDS (GSE63569) and SF3B1mut AML (combination of TCGA dataset, Blueprint study and own data). B. Biological
process enrichment for genes upregulated (1464) in RS-AML as compared to non-RS-AML. C. MEP signature comparison between RS-AML, non-RS-AML and NBM.
Similarities and differences with SF3B1 mutated myeloid neoplasms
Next we studied in more detail the similarities and differences between the different MNs with
RS. Therefore we determined the most differentially expressed genes followed by clustering in
RS-AML CD34+ cells (n=6), NBM CD34+ cells (n=5), CD34+ cells of SF3B1mut AML (n=3) and SF3B1mut
[...] RS-AML from MDS and NBM, are enriched for genes associated with cell cycle-related processes
and protein modifications, respectively, highlighting the difference in proliferation and expansion
between RS-AMLs and SF3B1mut MDS (Supplementary Figure 4A-E). Cluster 4, which is enriched
for genes involved in cellular communication and trafficking, is more specific for NBM CD34+ cells
(Supplementary Figure 4D). Cluster 5 represents genes that are specifically up-regulated in
CD34+ cells of SF3B1mut MDS and show heterogeneous expression in RS-AML. Functionally, these
genes are involved in erythroid development including iron metabolism (Figure 6B).
Figure 6 - Comparison between RS-AML and SF3B1mut AML and MDS. A. Heatmap of gene expression by supervised k-means
clustering in RS-AML, SF3B1mut AML and SF3B1mut MDS versus control normal bone marrow (NBM) cells. The expression of the 6
clusters identified for individual groups is shown on the right. B. Biological process enrichment for cluster 5 as identified in
A, for other clusters see Supplementary Figure 3. C. Gene overlap of common upregulated genes between RS-AML, SF3B1mut
To identify gene expression differences that are shared by RS-AML, SF3B1mut AML and SF3B1mut MDS,
we compared gene expression of all gene sets to NBM and determined the overlapping
genes. A total of 47 genes were consistently up-regulated and 125 genes were down-regulated
in all three groups (Figure 6C, Supplementary Figure 5A, Supplementary Table 7). Examples
of up-regulated genes include ALAS2, HBB and NFE2, all players in heme metabolism. Functional
annotation revealed erythroid-related processes as the most enriched for this gene set (Figure 6C). This
result is in line with the previously described increased expression of the erythroid gene program
and increased MEP signature expression in these CD34+ cells. Commonly down-regulated
genes strongly enrich for immune pathways (Supplementary Figure 5A). PRPF8, SF3B1 and ABCB7,
which were previously described as being involved in RS of SF3B1mut MDS due to deregulated
splicing, were not expressed at a lower level in RS-AML compared to
NBM (Supplementary Figure 5B). Finally, we analyzed whether genes that have been described
as being differentially spliced in the previous datasets of Dolatshad et al. [26] were differentially
spliced (Supplementary Figure 5C) or differentially expressed in our dataset (Supplementary
Figure 5D). We found only limited overlap in genes that are differentially spliced or expressed in
SF3B1mut MDS and our dataset of RS-AML. These findings demonstrate that RS-AML shares a gene
expression signature with SF3B1mut MDS, but that these AMLs also have unique characteristics that
distinguish them from MDS.
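The overlap step described above, in which each group is compared to NBM and only genes changed in all three groups are retained, amounts to a set intersection. The sketch below is illustrative only: the function name is hypothetical, and the toy gene sets contain the three genes named in the text plus placeholder entries, not the study's actual gene lists.

```python
from typing import Iterable, Set


def shared_genes(*gene_sets: Iterable[str]) -> Set[str]:
    """Return the genes present in every supplied gene set."""
    if not gene_sets:
        return set()
    shared = set(gene_sets[0])
    for genes in gene_sets[1:]:
        shared &= set(genes)
    return shared


# Toy up-regulated sets versus NBM; PLACEHOLDER_* entries are hypothetical.
rs_aml_up = {"ALAS2", "HBB", "NFE2", "PLACEHOLDER_1"}
sf3b1_aml_up = {"ALAS2", "HBB", "NFE2", "PLACEHOLDER_2"}
sf3b1_mds_up = {"ALAS2", "HBB", "NFE2", "PLACEHOLDER_1", "PLACEHOLDER_3"}

common_up = shared_genes(rs_aml_up, sf3b1_aml_up, sf3b1_mds_up)
print(sorted(common_up))
```

The same function applies unchanged to the down-regulated sets; genes shared by only two of the three groups are intentionally dropped.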
DISCUSSION
In this study, we characterized the presence of RS in a cohort of patients diagnosed with acute
myeloid leukemia or high-risk MDS. We observed that RS-AML is enriched for ELN adverse
risk disease, including the associated genetic and chromosomal defects. Clinically, a higher RS
percentage at diagnosis is accompanied by a higher incidence of poor-risk disease characteristics.
Unlike MDS, no single-gene defect was found that underlies the RS phenotype in AML. Gene
expression analysis indicated up-regulation of genes enriched for megakaryocyte and erythroid
differentiation in RS-AML, as compared to a general AML cohort.
Although RS are generally regarded as a specific feature of certain MDS subtypes, the reported
incidence of 25% indicates that RS are also a rather common finding in AML [18]. In contrast to reports
on MDS-RS subtypes, mutations in the spliceosome gene SF3B1 were rarely observed in our RS-AML
cohort. The paucity of this mutation in RS-AML might be explained by the low tendency towards
AML transformation from MDS subtypes with SF3B1 mutations [8,13]. Also, we did not identify
mutations in PRPF8, another gene that has been associated with an RS phenotype [20]. In
our cohort, 25% (4/16) of the patients had a deletion of the chromosomal locus of PRPF8, but we
did not observe significantly lower gene expression of PRPF8.
We did not identify a single gene defect that underlies the RS phenotype in AML; however,
panel-based and whole-exome sequencing revealed a high incidence of adverse-risk mutations in the
RS-AML cohort, including DNMT3A, RUNX1 and ASXL1. Mutations in TP53 were observed most
recurrently, particularly in patients with ≥15% RS at diagnosis. These results are in agreement with
a previous study using a restricted gene panel to analyze the RS phenotype in AML patients [18]. The
high incidence of poor-risk cytogenetic aberrancies, including chromothripsis, probably reflects
the high frequency of TP53 mutations [30].
In the present study we have primarily included patients with de novo AML, reflecting the lack
of a clinically relevant history of antecedent MDS. However, a previous study has suggested that
in particular patients with t-AML and s-AML with TP53 mutations have transited through an
unrecognized MDS prodrome based on genome ontogeny alterations in time [31]. Whether similar
findings apply to patients with RS-AML, whereby RS development is an early MDS event and AML
development a late event, will require additional studies.
Interestingly, RS have also been reported in relation to MNs in Li-Fraumeni patients with a
congenital TP53 mutation [32]. Defects in TP53 also occur frequently in acute erythroid leukemia
(AEL). Besides having a high frequency of TP53 mutations, AML with an RS phenotype resembles AEL
regarding higher age at diagnosis (median age 68 years), a male predominance and the presence of
complex chromosomal aberrancies [33]. However, unlike in AEL, erythroid differentiation in RS-AML
does not stop at the pro-erythroblastic stage. Instead, accumulation of immature myeloid blast
cells is commonly observed in RS-AML, while this is rare in AEL [33].
The presence of RS in our AML cohort is not restricted to a specific ontogenic subtype [34], indicating
that RS in AML arise from a common mechanism that goes beyond disease ontogeny. Also, it
has been reported that the presence of RS on its own is not predictive of overall survival in AML
patients [18], suggesting that RS-AML is not a distinct disease entity. However, we observed several
differences in megakaryocytic and erythroid differentiation gene expression that are specific
for RS-AML compared to non-RS-AML, suggesting a role for the megakaryocyte-erythrocyte
progenitor (MEP) cell [35]. The transcription factors GATA1 and GATA2 are involved in erythroid
lineage restriction, partly via stimulation of erythropoietin receptor expression [36], which we also
observed as being up-regulated in RS-AML, while both become down-regulated during myeloid
differentiation [37]. These results suggest that the erythroid differentiation program is largely
functional, despite the presence of a block in the myeloid differentiation program. These findings
were present despite the fact that the majority of cytogenetic and molecular defects were shared
by MNC and erythroblast populations, although our SNP array results suggest that cytogenetic
evolution can take place.
In MDS, SF3B1 mutations presumably result in RS formation by interfering with mRNA splicing,
resulting in differential gene expression [8,26]. Although AML is genetically distinct from
RS-MDS, downstream mechanisms that result in aberrant erythroid differentiation may be similar.
However, when comparing differentially spliced genes observed in SF3B1-mutated MDS samples [26] and
In conclusion, we have shown that RS are a frequent finding in AML, particularly in relation to
adverse risk genetic defects. We revealed that erythroblasts share the mutations that are found
in the malignant myeloid blast cells in RS-AML, and thus are part of the malignant population.
Although the genetic background of RS-AML differs from that of RS-MDS, downstream effector
pathways may be comparable, providing a possible explanation for the presence of RS in AML.
Acknowledgments
We would like to thank K. Chiba and S. Miyano from the Laboratory of DNA Information
Analysis at The University of Tokyo for their technical support. S. Ogawa was supported by the
following grants: a Grant-in-Aid for Scientific Research on Innovative Areas from the Ministry of
Health, Labor and Welfare of Japan (15H05909); a grant for the Project for Development of Innovative
Research on Cancer Therapeutics from the Japan Agency for Medical Research and Development,
AMED (JP18ck0106250h0002); and a grant for the Project for Cancer Research and Therapeutics Evolution
(P-CREATE) from AMED (JP18cm0106501h0003).
REFERENCES
1. Ohba R, Furuyama K, Yoshida K, et al. Clinical and genetic characteristics of congenital sideroblastic anemia: comparison with myelodysplastic syndrome with ring sideroblast (MDS-RS). Ann. Hematol. 2013;92(1):1–9.
2. Arber DA, Orazi A, Hasserjian R, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127(20):2391–2406.
3. Sheftel A, Richardson D, Prchal J, Ponka P. Mitochondrial iron metabolism and sideroblastic anemia. Acta Haematol. 2009;122(2–3):120–33.
4. Cotter P, Baumann M, Bishop D. Enzymatic defect in “X-linked” sideroblastic anemia: molecular evidence for erythroid delta-aminolevulinate synthase deficiency. Proc. Natl. Acad. Sci. U. S. A. 1992;89(9):4028–4032.
5. Allikmets R, Raskind W, Hutchinson A, et al. Mutation of a putative mitochondrial iron transporter gene (ABC7) in X-linked sideroblastic anemia and ataxia (XLSA/A). Hum. Mol. Genet. 1999;8(5):743–9.
6. Guernsey D, Jiang H, Campagna D, et al. Mutations in mitochondrial carrier family gene SLC25A38 cause nonsyndromic autosomal recessive congenital sideroblastic anemia. Nat. Genet. 2009;41(6):651–3.
7. Schmitz-Abe K, Ciesielski S, Schmidt P, et al. Congenital sideroblastic anemia due to mutations in the mitochondrial HSP70 homologue HSPA9. Blood. 2015;126(25):2734–8.
8. Papaemmanuil E, Cazzola M, Boultwood J, et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N. Engl. J. Med. 2011;365(15):1384–95.
9. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478(7367):64–9.
10. Malcovati L, Papaemmanuil E, Bowen D, et al. Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood. 2011;118(24):6239–46.
11. Patnaik M, Lasho T, Hodnefield J, et al. SF3B1 mutations are prevalent in myelodysplastic syndromes with ring sideroblasts but do not hold independent prognostic value. Blood. 2012;119(2):569–72.
12. Damm F, Kosmider O, Gelsi-Boyer V, et al. Mutations affecting mRNA splicing define distinct clinical phenotypes and correlate with patient outcome in myelodysplastic syndromes. Blood. 2012;119(14):3211–3218.
13. Greenberg P, Tuechler H, Schanz J, et al. Revised international prognostic scoring system for myelodysplastic syndromes. Blood. 2012;120(12):2454–65.
14. Chen M, Manley J. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat. Rev. Mol. Cell Biol. 2009;10(11):741–54.
15. Alsafadi S, Houy A, Battistella A, et al. Cancer-associated SF3B1 mutations affect alternative splicing by promoting alternative branchpoint usage. Nat. Commun. 2016;7:10615.
16. Dolatshad H, Pellagatti A, Fernandez-Mercado M, et al. Disruption of SF3B1 results in deregulated expression and splicing of key genes and pathways in myelodysplastic syndrome hematopoietic stem and progenitor cells. Leukemia. 2015;29(5):1092–1103.
17. Shiozawa Y, Malcovati L, Gallì A, et al. Aberrant splicing and defective mRNA production induced by somatic spliceosome mutations in myelodysplasia. Nat. Commun. 2018;9(1):.
18. Martin-Cabrera P, Jeromin S, Perglerova K, et al. Acute myeloid leukemias with ring sideroblasts show a unique molecular signature straddling secondary acute myeloid leukemia and de novo acute myeloid leukemia. Haematologica. 2017;102(4):e125-128.
19. Papaemmanuil E, Gerstung M, Bullinger L, et al. Genomic Classification and Prognosis in Acute Myeloid Leukemia. N. Engl. J. Med. 2016;374(23):2209–2221.
20. Kurtovic-Kozaric A, Przychodzen B, Singh J, et al. PRPF8 defects cause missplicing in myeloid malignancies. Leukemia. 2015;29(1):126–36.
21. Dohner H, Estey E, Grimwade D, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood. 2017;129(4):424–448.
22. da Silva-Coelho P, Kroeze L, Yoshida K, et al. Clonal evolution in myelodysplastic syndromes. Nat. Commun. 2017;8:15099.
23. McGowan-Jordan J, Simons A, Schmid M. An international system for human cytogenomic nomenclature. Cytogenet. Genome Res. 2016;148:1–140.
24. Lowenberg B, Pabst T, Maertens J, et al. Therapeutic value of clofarabine in younger and middle-aged (18-65 years) adults with newly diagnosed AML. Blood. 2017;129(12):1636–1645.
25. van der Helm L, Scheepers E, Veeger N, et al. Azacitidine might be beneficial in a subgroup of older AML patients compared to intensive chemotherapy: a single centre retrospective study of 227 consecutive patients. J. Hematol. Oncol. 2013;6(29):.
26. Dolatshad H, Pellagatti A, Liberante F, et al. Cryptic splicing events in the iron transporter ABCB7 and other key target genes in SF3B1-mutant myelodysplastic syndromes. Leukemia. 2016;30(12):2322–2331.
27. Yi G, Wierenga ATJ, Petraglia F, et al. Chromatin-based classification of genetically heterogeneous AMLs into two distinct subtypes with diverse stemness phenotypes. Cell Rep. 2019;26(4):1059–1069.
28. Chen L, Kostadima M, Martens J, et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science. 2014;345(6204):1251033.
29. Aran D, Hu Z, Butte AJ. xCell: Digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18(220):.
30. Fontana M, Marconi G, Feenstra J, et al. Chromothripsis in acute myeloid leukemia: Biological features and impact on survival. Leukemia. 2017;32:1609–1620.
31. Lindsley RC, Mar BG, Mazzola E, et al. Acute myeloid leukemia ontogeny is defined by distinct somatic mutations. Blood. 2015;125(9):1367–1376.
32. Talwalkar S, Yin C, Naeem R, et al. Myelodysplastic syndromes arising in patients with germline TP53 mutation and Li-Fraumeni syndrome. Arch Pathol Lab Med. 2010;134(7):1010–5.
33. Reinig E, Greipp P, Chui A, Howard M, Reichard K. De novo pure erythroid leukemia: refining the clinicopathologic and cytogenetic characteristics of a rare entity. Mod. Pathol. 2018;31(5):705–717.
34. Lindsley R, Mar B, Mazzola E, et al. Acute myeloid leukemia ontogeny is defined by distinct somatic mutations. Blood. 2015;125(9):1367–76.
35. Akashi K, Traver D, Miyamoto T, Weissman I. A clonogenic common myeloid progenitor that gives rise to all myeloid lineages. Nature. 2000;404(6774):193–7.
36. Zon L, Youssoufian H, Mather C, Lodish H, Orkin S. Activation of the erythropoietin receptor promoter by transcription factor GATA-1. Proc. Natl. Acad. Sci. U. S. A. 1991;88(23):10638–41.
SUPPLEMENTARY METHODS
Sorting of cell fractions
To separate cell fractions, viably frozen MNCs were thawed, washed, stained
with a panel of antibodies and sorted to purify the different cell fractions. Antibodies against the
CD3, CD34, CD71 and CD235a surface markers were used: CD3-FITC (cat. 345763); CD34-APC
(cat. 555824), CD34-PE Cy7 (cat. 348811) or CD34-FITC (cat. 345801, BioLegend, Uithoorn, the
Netherlands); CD71-BV786 (cat. 563768), CD71-APC (cat. 551374) or CD71-PE (cat. 555537);
and CD235a-BV421 (cat. 562938) or CD235a-APC (cat. 551336). Unless stated otherwise, these
antibodies were obtained from BD Biosciences (Breda, the Netherlands). Single viable cells were
selected based on forward and side scatter profiles in combination with negativity for DAPI or
PI (both Sigma-Aldrich, Saint Louis, MO, USA). The blast fraction was defined as CD34 positive, the T cell
fraction as CD3 positive, and the erythroblast fraction was sorted as CD71/CD235a positive. Sorting
purity was defined as ≥95% and confirmed by reanalysis.
Whole exome sequencing and confirmation of mutations in erythroblasts
Following exome capture using Human All Exon V5 (Agilent Technologies, Santa Clara, CA, USA),
massively parallel sequencing was performed on enriched exome fragments using the HiSeq 2500
platform (Illumina, San Diego, CA, USA). Alignment of sequences and calling of mutations were
executed using our previously described in-house pipelines [1,2], with minor modifications. The resultant
data file was analyzed for the presence of germline variants in genes that have been previously
implicated in sideroblastic anemia. Subsequently, DNA isolated from sorted autologous T cells
was used as a constitutive reference to exclude germline variants. The filtering strategy and variant
calling were performed as previously described [3]. Somatic variants in TP53 and SRSF2 were validated
using amplicon-based deep sequencing on an Ion Torrent Personal Genome Machine (Thermo
Fisher Scientific, Waltham, MA, USA). The sequencing procedure using an automated robotic
workflow was performed as previously described [4]. The obtained sequencing data were mapped
to the reference genome build GRCh37 (hg19). Variant calling was performed using the SeqNext
module of the Sequence Pilot software, version 4.2.2 (JSI Medical Systems, Ettenheim, Germany).
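The use of autologous T cells as a reference to exclude germline variants can be illustrated with a minimal sketch. It is not the in-house pipeline referred to above: the `Variant` record, the function names, and the `max_reference_vaf` tolerance of 0.02 are all assumptions made for illustration, and the example coordinates are made up.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the pipeline's own format)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    vaf: float  # variant allele frequency in the sequenced sample


def variant_key(v: Variant) -> Tuple[str, int, str, str]:
    """Identify a variant by position and allele change, ignoring its VAF."""
    return (v.chrom, v.pos, v.ref, v.alt)


def somatic_candidates(
    sample: List[Variant],
    t_cell_reference: List[Variant],
    max_reference_vaf: float = 0.02,  # assumed tolerance, not from the text
) -> List[Variant]:
    """Keep sample variants that are absent, or nearly absent, in the T-cell reference."""
    reference_vaf = {variant_key(v): v.vaf for v in t_cell_reference}
    return [
        v for v in sample
        if reference_vaf.get(variant_key(v), 0.0) <= max_reference_vaf
    ]


# Made-up example: the second variant is also present in T cells at a
# heterozygous-like VAF, so it is treated as germline and dropped.
sample_calls = [Variant("17", 100, "T", "C", 0.90), Variant("2", 200, "G", "A", 0.50)]
t_cell_calls = [Variant("2", 200, "G", "A", 0.48)]
somatic = somatic_candidates(sample_calls, t_cell_calls)
```

A small non-zero tolerance is used here because sorted T-cell fractions may carry minor contamination from the malignant clone; the appropriate value would depend on sorting purity and sequencing depth.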
RNA Extraction and Illumina high-throughput sequencing
RNA libraries were prepared using the KAPA RNA HyperPrep Kit with RiboErase (HMR) according
to the manufacturer's protocol (KR1351 – v1.16, Roche Sequencing Solutions). The process
is described briefly here: 25 ng to 1 µg of input RNA was depleted of ribosomal RNA by oligo
hybridization, RNase H treatment and DNase digestion. rRNA-depleted RNA was fragmented to
~200 bp fragments and first-strand synthesis was performed using random primers.
Second-strand synthesis was performed using dUTP for strand specificity. After adapter ligation, library
amplification was performed, with the number of cycles dependent on the amount of starting
material. Fragment size and quality were checked on a Bioanalyzer using a High Sensitivity DNA
chip (Agilent). Samples were sequenced on the Illumina HiSeq 2000. Finally, each eligible library
was subjected to 2 × 43 bp paired-end sequencing (PE43) on an Illumina NextSeq 500 system.
RNA-Seq analysis
The hg19 reference genome was first indexed with the STAR aligner using the UCSC gene annotation. The
RNA-seq reads were mapped to the hg19 genome using STAR in two-pass mode,
and the gene-level read counts were enumerated at the same time. DESeq2 was used to
identify differentially expressed genes by pairwise comparison between the different
groups. Only genes with a Benjamini-Hochberg-adjusted p-value ≤ 0.1 and a fold change ≥ 1.5
were considered significantly deregulated. Principal component analysis (PCA) and t-Distributed
Stochastic Neighbor Embedding (t-SNE) were used to probe the transcriptomic relationships
between these groups. To group differential genes with similar expression patterns, k-means
clustering was performed on z-score-normalized expression values and the result displayed as a
heatmap. To assess the enriched annotation for deregulated genes, we performed functional
enrichment analysis with the Metascape tool, and only terms showing p-values below 0.01 were
considered significantly over-represented.
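The two-part significance filter described above (adjusted p-value and fold change) can be sketched in a few lines. This is an illustration, not DESeq2's implementation: DESeq2 already reports Benjamini-Hochberg-adjusted p-values, and the adjustment is re-implemented here only to make the step explicit. Treating both cutoffs as inclusive and applying the fold-change cutoff symmetrically on the log2 scale are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def deregulated(
    genes: Sequence[str],
    p_values: Sequence[float],
    log2_fold_changes: Sequence[float],
    alpha: float = 0.1,     # adjusted p-value cutoff from the text
    min_fold: float = 1.5,  # fold-change cutoff from the text
) -> List[str]:
    """Select genes passing both cutoffs (up- or down-regulated)."""
    padj = benjamini_hochberg(p_values)
    cutoff = math.log2(min_fold)
    return [
        gene
        for gene, q, lfc in zip(genes, padj, log2_fold_changes)
        if q <= alpha and abs(lfc) >= cutoff
    ]
```

For example, `deregulated(["A", "B"], [0.01, 0.5], [1.0, 2.0])` keeps only gene `A`: gene `B` has a large fold change but fails the adjusted p-value cutoff.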
For alternative splicing analysis we used the MISO suite [5] with default options to detect alternative
splicing events. The MISO annotations contained five types of events: skipped exons
(SE), alternative 3′/5′ splice sites (A3SS, A5SS), mutually exclusive exons (MXE) and retained introns
(RI). We first merged RNA-seq data from the same cell types, then computed the percent spliced
in (PSI) value of each event, and finally performed pairwise comparisons between groups
to identify differentially spliced genes.
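The merge, PSI and pairwise-comparison steps can be sketched as follows. This is a deliberate simplification: MISO estimates PSI by Bayesian inference over isoform-compatible reads, whereas the sketch uses a plain ratio of inclusion-supporting to total informative reads. The function names and the read counts in the example are assumptions for illustration only.

```python
from typing import Iterable, Optional, Tuple

Counts = Tuple[int, int]  # (inclusion-supporting reads, exclusion-supporting reads)


def merge_counts(samples: Iterable[Counts]) -> Counts:
    """Pool read counts for one splicing event across samples of the same cell type."""
    inclusion = 0
    exclusion = 0
    for inc, exc in samples:
        inclusion += inc
        exclusion += exc
    return (inclusion, exclusion)


def psi(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Naive PSI estimate: fraction of informative reads supporting inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # event not covered, PSI undefined
    return inclusion_reads / total


def delta_psi(group_a: Counts, group_b: Counts) -> Optional[float]:
    """Difference in PSI between two groups for the same event."""
    psi_a = psi(*group_a)
    psi_b = psi(*group_b)
    if psi_a is None or psi_b is None:
        return None
    return psi_a - psi_b


# Made-up counts for one skipped-exon event in two groups.
group_1 = merge_counts([(10, 5), (20, 5)])  # pooled to (30, 10)
group_2 = merge_counts([(4, 12), (6, 18)])  # pooled to (10, 30)
difference = delta_psi(group_1, group_2)
```

Pooling before estimation, as the text describes, trades per-sample variance estimates for higher coverage per event, which matters for events with few junction-spanning reads.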
We obtained the MEP signature gene list from two previous papers [6,7] to analyze how these
genes were expressed in the four groups. Expression quantification (FPKM scale) for each RefSeq
gene was performed with the Cuffnorm function in Cufflinks.
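For readers unfamiliar with the FPKM scale, the textbook definition (fragments per kilobase of transcript per million mapped fragments) is sketched below. Cuffnorm applies its own library-size normalization and isoform-aware length model, so this is only the underlying formula, not a reproduction of its output; the function name and example numbers are hypothetical.

```python
def fpkm(fragment_count: float, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2 kb gene in a library of 10 million fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
```

Because the value is divided by both gene length and library size, FPKM allows rough comparison of a signature gene list across groups sequenced to different depths.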
Data retrieval
Data from NBM CD34+ and SF3B1-mutated MDS samples were retrieved from Dolatshad et al. [8], while
the Blueprint and RS-AML data are in-house sequencing data. Similar sequencing technologies
based on the Illumina platform were used, and strand-specific RNA-seq data were analyzed using
comparable sample preparation methods. For processing, sequencing reads were trimmed when
needed to a similar read length and files were normalized for sequencing depth. In addition, the
analysis pipeline integrated in DESeq2 was used to discover and correct for batch effects.
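The two harmonization steps mentioned above, trimming reads to a common length and normalizing for sequencing depth, can be sketched as follows. The text does not specify which tools or which depth normalization were used; counts per million (CPM) is chosen here purely as an assumption, and the function names and example values are hypothetical.

```python
from typing import Dict, List


def trim_reads(reads: List[str], length: int) -> List[str]:
    """Trim each read to at most `length` bases from the 5' end; shorter reads are kept as-is."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [read[:length] for read in reads]


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw gene counts to counts per million to adjust for sequencing depth."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no counted reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Made-up example values.
trimmed = trim_reads(["ACGTACGT", "ACG"], 4)
cpm = counts_per_million({"GATA1": 250, "GATA2": 750})
```

Trimming to a common read length before alignment reduces mappability differences between data sets generated with different read lengths, which would otherwise appear as a batch effect.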
REFERENCES
1. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–9 (2011).
2. da Silva-Coelho P, Kroeze LI, Yoshida K, Koorenhof-Scheele TN, Knops R, van de Locht LT, et al. Clonal evolution in myelodysplastic syndromes. Nat. Commun. 8, 15099 (2017).
3. Berger G, Kroeze LI, Koorenhof-Scheele TN, de Graaf AO, Yoshida K, Ueno H, et al. Early detection and evolution of preleukemic clones in therapy-related myeloid neoplasms following autologous SCT. Blood 131, 1846–57 (2018).
4. Sandmann, S., de Graaf, A. O., van der Reijden, B. A., Jansen, J. H. & Dugas, M. GLM-based optimization of NGS data analysis: A case study of Roche 454, Ion Torrent PGM and Illumina NextSeq sequencing data. PLoS One 12, e0171983 (2017).
5. Katz, Y., Wang, E., Airoldi, E. & Burge, C. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010).
6. Chen, L. et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science 345, 1251033 (2014).
7. Aran, D., Hu, Z. & Butte, A. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, (2017).
8. Dolatshad H, Pellagatti A, Liberante F, et al. Cryptic splicing events in the iron transporter ABCB7 and other key target genes in SF3B1-mutant myelodysplastic syndromes. Leukemia. 2016;30(12):2322–2331.
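Table S1: Detailed Information of 21 Targeted Genes for Mutation Detection.
| Target gene | TruSight sequencing panel | smMIPS panel | Target gene | TruSight sequencing panel | smMIPS panel |
| ABL1 | x | | JAK3 | x | |
| ASXL1 | x | x | KDM6A | x | |
| ATRX | x | | KIT | x | x |
| BCOR | x | | KRAS | x | x |
| BCORL1 | x | | MLL | x | |
| BRAF | x | x | MPL | x | x |
| CALR | x | x | MYD88 | x | x |
| CBL | x | x | NOTCH1 | x | x |
| CBLB | x | | NPM1 | x | x |
| CBLC | x | | NRAS | x | x |
| CDKN2A | x | | PDGFRA | x | |
| CEBPA | x | | PHF6 | x | |
| CSF3R | x | x | PTEN | x | |
| CUX1 | x | | PTPN11 | x | |
| DNMT3A | x | x | RAD21 | x | |
| ETV6/TEL | x | | RUNX1 | x | x |
| EZH2 | x | x | SETBP1 | x | x |
| FBXW7 | x | | SF3B1 | x | x |
| FLT3 | x | x | SMC1A | x | |
| GATA1 | x | | SMC3 | x | |
| GATA2 | x | | SRSF2 | x | x |
| GNAS | x | | STAG2 | x | |
| HRAS | x | | TET2 | x | x |
| IDH1 | x | x | TP53 | x | |
| IDH2 | x | x | U2AF1 | x | x |
| IKZF1 | x | | WT1 | x | x |
| JAK2 | x | x | ZRSR2 | x | |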
Table S2. Genes implicated in congenital sideroblastic anemia
| ABCB7 | ISCA1 |
| ALAS2 | ISCA2 |
| FECH | ISCU |
| GLRX5 | MFRN1 |
| HSCB | NIFS |
| HSP70 | PUS1 |
| HSPA9 | SLC25A38 |
| IRBP1 | YARS2 |
| IRBP2 | |
Table S3: WES results
| Patient | No | Gene | Location of gene | Type of variant | Chr. | Start co-ordinate of variant (GRCh37/hg19) | End co-ordinate of variant (GRCh37/hg19) | Reference sequence | Observed sequence | Transcript ID | Coding sequence change | Amino acid change | VAF |
| RS001 | 1 | CAD | exonic | nonsynonymous SNV | 2p23.3 | 27460682 | 27460682 | G | A | NM_001306079 | c.G4471A | p.G1491R | 0.47 |
| RS001 | 2 | FBXO11 | exonic | nonsynonymous SNV | 2p16.3 | 48036334 | 48036334 | A | G | NM_001190274 | c.T2518C | p.S840P | 0.51 |
| RS001 | 3 | ANAPC1 | exonic | nonsynonymous SNV | 2q13 | 112638237 | 112638237 | C | T | NM_022662 | c.G166A | p.G56S | 0.44 |
| RS001 | 4 | DPP10 | exonic | nonsynonymous SNV | 2q14.1 | 116525948 | 116525948 | G | A | NM_001004360 | c.G1168A | p.E390K | 0.45 |
| RS001 | 5 | HTR1E | exonic | nonsynonymous SNV | 6q14.3 | 87725923 | 87725923 | C | T | NM_000865 | c.C871T | p.R291C | 0.47 |
| RS001 | 6 | HINT3 | exonic | nonsynonymous SNV | 6q22.32 | 126288106 | 126288106 | A | T | NM_138571 | c.A275T | p.H92L | 0.48 |
| RS001 | 7 | AUTS2 | exonic | nonsynonymous SNV | 7q11.22 | 70255558 | 70255558 | G | A | NM_001127231 | c.G3284A | p.R1095Q | 0.48 |
| RS001 | 8 | LAMB4 | exonic | nonsynonymous SNV | 7q31.1 | 107703252 | 107703252 | G | T | NM_007356 | c.C3249A | p.D1083E | 0.52 |
| RS001 | 9 | COL5A1 | exonic | nonsynonymous SNV | 9q34.3 | 137711968 | 137711968 | C | G | NM_000093 | c.C4453G | p.P1485A | 0.45 |
| RS001 | 10 | MYOZ1 | exonic | nonsynonymous SNV | 10q22.2 | 75394385 | 75394385 | C | T | NM_021245 | c.G359A | p.G120D | 0.11 |
| RS001 | 11 | DTX4 | exonic | nonsynonymous SNV | 11q12.1 | 58949859 | 58949859 | G | A | NM_001300727 | c.G541A | p.V181M | 0.57 |
| RS001 | 12 | DDB1 | exonic | nonsynonymous SNV | 11q12.2 | 61091536 | 61091536 | C | T | NM_001923 | c.G836A | p.R279Q | 0.38 |
| RS001 | 13 | PABPC3 | exonic | nonsynonymous SNV | 13q12.13 | 25671409 | 25671409 | T | C | NM_030979 | c.T1073C | p.V358A | 0.46 |
| RS001 | 14 | C14orf39 | exonic | nonsynonymous SNV | 14q23.1 | 60933638 | 60933638 | C | G | NM_174978 | c.G892C | p.A298P | 0.35 |
| RS001 | 15 | IRX6 | exonic | nonsynonymous SNV | 16q12.2 | 55362987 | 55362987 | C | T | NM_024335 | c.C1097T | p.A366V | 0.58 |
| RS001 | 16 | TP53 | exonic | nonsynonymous SNV | 17p13.1 | 7577100 | 7577100 | T | C | NM_001126115 | c.A442G | p.R148G | 0.97 |
| RS001 | 17 | ANKRD13B | exonic | nonsynonymous SNV | 17q11.2 | 27934799 | 27934799 | C | A | NM_152345 | c.C154A | p.L52M | 0.55 |
| RS001 | 18 | ZNF831 | exonic | nonsynonymous SNV | 20q13.32 | 57768856 | 57768856 | G | A | NM_178457 | c.G2782A | p.V928M | 0.45 |
| RS001 | 19 | PQBP1 | exonic | nonsynonymous SNV | Xp11.23 | 48759747 | 48759747 | G | A | NM_001167992 | c.G230A | p.R77H | 0.43 |
| RS001 | 20 | DCAF12L1 | exonic | nonsynonymous SNV | Xq25 | 125685795 | 125685795 | C | A | NM_178470 | c.G797T | p.R266L | 0.43 |
| RS002 | 1 | DNMT3A | exonic | frameshift deletion | 2p23.3 | 25467135 | 25467135 | G | - | NM_153759 | c.1173delC | p.P391fs | 0.28 |
| RS002 | 2 | ZFP62 | exonic | frameshift deletion | 5q35.3 | 180276363 | 180276363 | A | - | NM_001172638 | c.2132delT | p.F711fs | 0.26 |
| RS002 | 3 | ZNF45 | exonic | frameshift deletion | 19q13.31 | 44417824 | 44417828 | CACAT | - | NM_003425 | c.1760_1764del | p.D587fs | 0.25 |
| RS002 | 4 | ANTXR1 | exonic | nonsynonymous SNV | 2p13.3 | 69420473 | 69420473 | C | T | NM_032208 | c.C1360T | p.L454F | 0.33 |
| RS002 | 5 | CPN2 | exonic | nonsynonymous SNV | 3q29 | 194063301 | 194063301 | G | T | NM_001080513 | c.C131A | p.P44Q | 0.36 |
| RS002 | 6 | GLIPR2 | exonic | nonsynonymous SNV | 9p13.3 | 36162450 | 36162450 | A | T | NM_001287014 | c.A150T | p.T1014M | 0.32 |
| RS002 | 7 | TOP2A | exonic | nonsynonymous SNV | 17q21.2 | 38556279 | 38556279 | G | A | NM_001067 | c.C3041T | p.T1014M | 0.21 |
| RS002 | 8 | SRSF2 | exonic | nonsynonymous SNV | 17q25.1 | 74732959 | 74732959 | G | A | NM_001195427 | c.C284T | p.P95L | 0.30 |
| RS002 | 9 | SETBP1 | exonic | nonsynonymous SNV | 18q12.3 | 42531913 | 42531913 | G | A | NM_015559 | c.G2608A | p.G870S | 0.31 |
| RS002 | 10 | LRRC74B | exonic | nonsynonymous SNV | 22q11.21 | 21401734 | 21401734 | C | T | NM_001291006 | c.C229T | p.R77C | 0.36 |
| RS002 | 11 | MAGEB1 | exonic | stopgain | Xp21.2 | 30269562 | 30269562 | C | T | NM_177404 | c.C952T | p.R318X | 0.45 |
| RS003 | 1 | EP300 | exonic | frameshift deletion | 22q13.2 | 41513640 | 41513640 | A | - | NM_001429 | c.544delA | p.N182fs | 0.42 |
| RS003 | 2 | PLEKHG5 | exonic | nonsynonymous SNV | 1p36.31 | 6533483 | 6533483 | T | G | NM_001042664 | c.A623C | p.E208A | 0.22 |
| RS003 | 3 | IGSF21 | exonic | nonsynonymous SNV | 1p36.13 | 18692047 | 18692047 | C | T | NM_032880 | c.C871T | p.R291C | 0.42 |
| RS003 | 4 | TCHH | exonic | nonsynonymous SNV | 1q21.3 | 152081494 | 152081494 | C | T | NM_007113 | c.G4199A | p.R1400H | 0.48 |
| RS003 | 6 | DNMT3A | exonic | nonsynonymous SNV | 2p23.3 | 25463587 | 25463587 | C | A | NM_153759 | c.G1528T | p.G510C | 0.44 |
| RS003 | 7 | SP140 | exonic | nonsynonymous SNV | 2q37.1 | 231134614 | 231134614 | G | T | NM_001278453 | c.G1048T | p.V350L | 0.40 |
| RS003 | 8 | THPO | exonic | nonsynonymous SNV | 3q27.1 | 184093770 | 184093770 | G | A | NM_000460 | c.C47T | p.A16V | 0.17 |
| RS003 | 9 | PAICS | exonic | nonsynonymous SNV | 4q12 | 57314708 | 57314708 | T | C | NM_001079524 | c.T518C | p.F173S | 0.48 |
| RS003 | 10 | SLC34A1 | exonic | nonsynonymous SNV | 5q35.3 | 176813234 | 176813234 | T | G | NM_001167579 | c.T272G | p.V91G | 0.44 |
| RS003 | 11 | GOPC | exonic | nonsynonymous SNV | 6q22.1 | 117923180 | 117923180 | T | C | NM_001017408 | c.A272G | p.N91S | 0.44 |
| RS003 | 12 | ADGRG6 | exonic | nonsynonymous SNV | 6q24.1 | 142630707 | 142630707 | G | C | NM_001032394 | c.G29C | p.S10T | 0.31 |
| RS003 | 13 | FAM20C | exonic | nonsynonymous SNV | 7p22.3 | 299825 | 299825 | C | T | NM_020223 | c.C1634T | p.A545V | 0.80 |
| RS003 | 15 | VDAC3 | exonic | nonsynonymous SNV | 8p11.21 | 42259409 | 42259409 | G | T | NM_001135694 | c.G430T | p.V144L | 0.51 |
| RS003 | 16 | CNTNAP3B | exonic | nonsynonymous SNV | 9p11.2 | 43861084 | 43861084 | C | T | NM_001201380 | c.C1958T | p.S653L | 0.28 |
| RS003 | 17 | KLHL25 | exonic | nonsynonymous SNV | 15q25.3 | 86311930 | 86311930 | G | A | NM_022480 | c.C1112T | p.A371V | 0.08 |
| RS003 | 18 | CACNA1H | exonic | nonsynonymous SNV | 16p13.3 | 1270897 | 1270897 | G | A | NM_001005407 | c.G6947A | p.R2316Q | 0.43 |
| RS003 | 19 | TP53 | exonic | nonsynonymous SNV | 17p13.1 | 7578403 | 7578403 | C | A | NM_001126115 | c.G131T | p.C44F | 0.79 |
| RS003 | 20 | CHERP | exonic | nonsynonymous SNV | 19p13.11 | 16653179 | 16653179 | G | A | NM_006387 | c.C11T | p.P4L | 0.47 |
| RS003 | 21 | TSSK6 | exonic | nonsynonymous SNV | 19p13.11 | 19625876 | 19625876 | C | T | NM_032037 | c.G361A | p.G121S | 0.09 |
| RS003 | 22 | ANOS1 | exonic | nonsynonymous SNV | Xp22.31 | 8504940 | 8504940 | T | C | NM_000216 | c.A1493G | p.Y498C | 0.55 |
| RS003 | 23 | RAB33A | exonic | nonsynonymous SNV | Xq26.1 | 129306271 | 129306271 | G | A | NM_004794 | c.G235A | p.E79K | 0.49 |
| RS003 | 24 | LRP2 | exonic | stopgain | 2q31.1 | 170103945 | 170103945 | G | A | NM_004525 | c.C2851T | p.R951X | 0.44 |
| RS003 | 25 | NF1 | splicing | | 17q11.2 | 29560019 | 29560019 | G | A | NM_001042492 | c.3497-1G>A | | 0.07 |
| RS004 | 1 | ITLN1 | exonic | nonsynonymous SNV | 1q23.3 | 160851072 | 160851072 | C | T | NM_017625 | c.G436A | p.D146N | 0.53 |
| RS004 | 2 | UBE3D | exonic | nonsynonymous SNV | 6q14.1 | 83763921 | 83763921 | T | C | NM_001304437 | c.A215G | p.Q72R | 0.47 |
| RS004 | 3 | CLDN4 | exonic | nonsynonymous SNV | 7q11.23 | 73245557 | 73245557 | T | C | NM_001305 | c.T26C | p.M9T | 0.87 |
| RS004 | 4 | CNBD1 | exonic | nonsynonymous SNV | 8q21.3 | 87899821 | 87899821 | A | C | NM_173538 | c.A140C | p.H47P | 0.55 |
| RS004 | 5 | TAOK2 | exonic | nonsynonymous SNV | 16p11.2 | 30002843 | 30002843 | G | T | NM_004783 | c.G3104T | p.S1035I | 0.49 |
| RS004 | 6 | TP53 | exonic | nonsynonymous SNV | 17p13.1 | 7577568 | 7577568 | C | T | NM_001126115 | c.G317A | p.C106Y | 0.84 |
| RS004 | 7 | FLII | exonic | nonsynonymous SNV | 17p11.2 | 18156687 | 18156687 | C | T | NM_001256265 | c.G779A | p.G260D | 0.44 |
| RS004 | 8 | NF1 | exonic | nonsynonymous SNV | 17q11.2 | 29687632 | 29687632 | C | T | NM_000267 | c.C8225T | p.P2742L | 0.57 |
| RS004 | 9 | SMARCB1 | exonic | nonsynonymous SNV | 22q11.23 | 24176338 | 24176338 | C | T | NM_001007468 | c.C1102T | p.R368C | 0.32 |
| RS004 | 10 | RIMS1 | exonic | stopgain | 6q13 | 72889437 | 72889437 | C | T | NM_014989 | c.C631T | p.R211X | 0.47 |
| RS005 | 1 | TFE3 | exonic | frameshift deletion | Xp11.23 | 48890995 | 48890995 | G | - | NM_001282142 | c.806delC | p.P269fs | 0.13 |
| RS005 | 2 | GATA2 | exonic | frameshift insertion | 3q21.3 | 128202821 | 128202821 | - | C | NM_001145662 | c.898dupG | p.A300fs | 0.12 |
| RS005 | 3 | PRR30 | exonic | nonsynonymous SNV | 2p23.3 | 27360794 | 27360794 | T | G | NM_178553 | c.A404C | p.N135T | 0.10 |
| RS005 | 4 | EPAS1 | exonic | nonsynonymous SNV | 2p21 | 46574041 | 46574041 | C | T | NM_001430 | c.C56T | p.S19F | 0.35 |
| RS005 | 5 | HECW1 | exonic | nonsynonymous SNV | 7p13 | 43548590 | 43548590 | C | T | NM_001287059 | c.C3787T | p.R1263W | 0.34 |
| RS005 | 6 | CFAP43 | exonic | nonsynonymous SNV | 10q25.1 | 105945803 | 105945803 | A | G | NM_025145 | c.T1939C | p.S647P | 0.36 |
| RS005 | 7 | TRIM34,TRIM6-TRIM34 | exonic | nonsynonymous SNV | 11p15.4 | 5664529 | 5664529 | G | A | NM_001003827 | c.G1057A | p.V353M | 0.43 |
| RS005 | 8 | LRFN5 | exonic | nonsynonymous SNV | 14q21.1 | 42356850 | 42356850 | C | T | NM_152447 | c.C1022T | p.T341I | 0.29 |
| RS005 | 9 | EFTUD1 | exonic | nonsynonymous SNV | 15q25.2 | 82444553 | 82444553 | G | A | NM_001040610 | c.C2089T | p.L697F | 0.45 |
| RS005 | 10 | VPS9D1 | exonic | nonsynonymous SNV | 16q24.3 | 89777197 | 89777197 | G | A | NM_004913 | c.C1055T | p.T352M | 0.43 |
| RS005 | 11 | C17orf53 | exonic | nonsynonymous SNV | 17q21.31 | 42222572 | 42222572 | C | T | NM_001171251 | c.C5T | p.A2V | 0.32 |
| RS005 | 12 | FCER2 | exonic | nonsynonymous SNV | 19p13.2 | 7755121 | 7755121 | C | T | NM_001207019 | c.G649A | p.G217S | 0.33 |
| RS005 | 13 | TP53 | exonic | stopgain | 17p13.1 | 7579406 | 7579406 | G | C | NM_001126118 | c.C164G | p.S55X | 0.54 |
| RS006 | 1 | RUNX1 | exonic | frameshift deletion | 21q22.12 | 36164672 | 36164685 | GGGCGAGCTGGCTT | - | NM_001001890 | c.1109_1122del | p.Q370fs | 0.31 |
| RS006 | 2 | MIB2 | exonic | nonsynonymous SNV | 1p36.33 | 1564610 | 1564610 | A | G | NM_001170688 | c.A2450G | p.N817S | 0.44 |
| RS006 | 3 | MAP3K1 | exonic | nonsynonymous SNV | 5q11.2 | 56174925 | 56174925 | A | G | NM_005921 | c.A2084G | p.N695S | 0.52 |
| RS006 | 4 | MRPS27 | exonic | nonsynonymous SNV | 5q13.2 | 71519557 | 71519557 | C | T | NM_001286751 | c.G790A | p.E264K | 0.42 |
| RS006 | 5 | SYNPO | exonic | nonsynonymous SNV | 5q33.1 | 150028960 | 150028960 | G | T | NM_001109974 | c.G1123T | p.G375C | 0.43 |
| RS006 | 6 | CCER1 | exonic | nonsynonymous SNV | 12q21.33 | 91347577 | 91347577 | C | T | NM_152638 | c.G943A | p.E315K | 0.74 |
| RS006 | 7 | IDH2 | exonic | nonsynonymous SNV | 15q26.1 | 90631934 | 90631934 | C | T | NM_001290114 | c.G29A | p.R10Q | 0.51 |
| RS006 | 8 | SRSF2 | exonic | nonsynonymous SNV | 17q25.1 | 74732959 | 74732959 | G | T | NM_001195427 | c.C284A | p.P95H | 0.48 |
| RS006 | 9 | FCHO1 | exonic | nonsynonymous SNV | 19p13.11 | 17881319 | 17881319 | A | G | NM_001161359 | c.A272G | p.Y91C | 0.51 |
| RS006 | 10 | SMC1A | exonic | nonsynonymous SNV | Xp11.22 | 53438958 | 53438958 | A | G | NM_006306 | c.T1100C | p.L367S | 0.95 |
| RS006 | 11 | PPIP5K2 | splicing | | 5q21.1 | 102465408 | 102465408 | G | A | NM_001281471 | c.114+1G>A | | 0.45 |
| RS007 | 1 | DNMT3A | exonic | frameshift insertion | 2p23.3 | 25470924 | 25470924 | - | T | NM_153759 | c.269dupA | p.D90fs | 0.37 |
| RS007 | 2 | NOL9 | exonic | nonsynonymous SNV | 1p36.31 | 6592799 | 6592799 | A | G | NM_024654 | c.T1259C | p.I420T | 0.41 |
| RS007 | 3 | LYST | exonic | nonsynonymous SNV | 1q42.3 | 235884048 | 235884048 | T | C | NM_000081 | c.A9473G | p.N3158S | 0.46 |
| RS007 | 4 | NCKAP5 | exonic | nonsynonymous SNV | 2q21.2 | 133539546 | 133539546 | G | A | NM_207363 | c.C4838T | p.T1613M | 0.39 |
| RS007 | 5 | IDH1 | exonic | nonsynonymous SNV | 2q34 | 209113113 | 209113113 | G | A | NM_001282386 | c.C394T | p.R132C | 0.39 |
| RS007 | 6 | ARPP21 | exonic | nonsynonymous SNV | 3p22.3 | 35779794 | 35779794 | C | A | NM_001267617 | c.C1571A | p.T524K | 0.38 |
| RS007 | 7 | HTR3E | exonic | nonsynonymous SNV | 3q27.1 | 183824347 | 183824347 | C | T | NM_001256614 | c.C1315T | p.H439Y | 0.42 |
| RS007 | 8 | PCDH10 | exonic | nonsynonymous SNV | 4q28.3 | 134071887 | 134071887 | C | T | NM_020815 | c.C592T | p.R198C | 0.47 |
| RS007 | 9 | GLIS3 | exonic | nonsynonymous SNV | 9p24.2 | 4117915 | 4117915 | G | C | NM_152629 | c.C1098G | p.I366M | 0.43 |
| RS007 | 10 | PSMD13 | exonic | nonsynonymous SNV | 11p15.5 | 249005 | 249005 | A | G | NM_175932 | c.A728G | p.N243S | 0.40 |
| RS007 | 11 | DYNC1H1 | exonic | nonsynonymous SNV | 14q32.31 | 102445775 | 102445775 | A | G | NM_001376 | c.A464G | p.N155S | 0.46 |
| RS007 | 12 | ZNF629 | exonic | nonsynonymous SNV | 16p11.2 | 30795086 | 30795086 | G | A | NM_001080417 | c.C563T | p.T188M | 0.46 |
| RS007 | 13 | TP53 | exonic | nonsynonymous SNV | 17p13.1 | 7578271 | 7578271 | T | C | NM_001126115 | c.A182G | p.H61R | 0.76 |
| RS007 | 14 | DCC | exonic | nonsynonymous SNV | 18q21.2 | 50929215 | 50929215 | G | A | NM_005215 | c.G2887A | p.V963I | 0.41 |
| RS007 | 15 | FAM47C | exonic | nonsynonymous SNV | Xp21.1 | 37026698 | 37026698 | G | A | NM_001013736 | c.G215A | p.R72H | 0.44 |
| RS007 | 16 | MAGEC1 | exonic | nonsynonymous SNV | Xq27.2 | 140995474 | 140995474 | A | G | NM_005462 | c.A2284G | p.S762G | 0.40 |
| RS007 | 17 | ELP3 | exonic | stopgain | 8p21.1 | 28017905 | 28017905 | C | T | NM_001284224 | c.C1060T | p.R354X | 0.45 |
| RS007 | 18 | CAPN15 | exonic | stopgain | 16p13.3 | 599083 | 599083 | C | T | NM_005632 | c.C1540T | p.Q514X | 0.45 |
| RS007 | 19 | DCC | exonic | stopgain | 18q21.2 | 51025778 | 51025778 | C | T | NM_005215 | c.C4009T | p.R1337X | 0.40 |