Oncogenic variants guiding treatment in thoracic malignancies
Meng, Pei
DOI: 10.33612/diss.160074057
IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.
Document Version
Publisher's PDF, also known as Version of record
Publication date: 2021
Link to publication in University of Groningen/UMCG research database
Citation for published version (APA):
Meng, P. (2021). Oncogenic variants guiding treatment in thoracic malignancies. University of Groningen. https://doi.org/10.33612/diss.160074057
Chapter 7
Targeted sequencing of circulating cell-free DNA in
stage II-III resectable esophageal squamous cell
carcinoma patients
Pei Meng1,2*, Jiacong Wei1,3*, Yiqun Geng1*, Shaobin Chen4, Miente Martijn Terpstra3, Qiongyi Huang1,
Qian Zhang1, Zuoqing Su1, Wanchun Yu1, Min Su5, Klaas Kok3, Anke van den Berg2, Jiang Gu1,6
1Provincial Key Laboratory of Infectious Diseases and Molecular Pathology, Department of
Pathology, Collaborative and Creative Center, Shantou University Medical College, Shantou,
Guangdong, 515041, China
2Department of Pathology and Medical Biology, University of Groningen, University Medical
Center Groningen, Groningen, 9700RB, Netherlands
3Department of Genetics, University of Groningen, University Medical Center Groningen,
Groningen, 9700RB, Netherlands
4Department of Thoracic surgery, Cancer Hospital of Shantou University, Shantou, Guangdong,
515041, China
5Department of Pathology & Institute of Clinical Pathology, Shantou University Medical College,
Shantou, Guangdong, 515041, China
6Jinxin Research Institute for Reproductive Medicine and Genetics, Chengdu Jinjiang Hospital
for Maternal and Child Health Care, 66 Jingxiu Road, Chengdu, 610066, China.
* Authors contributed equally
Corresponding author
Abstract
Background: The aim of this study was to investigate the potential of cell-free DNA (cfDNA) as a
disease biomarker in esophageal squamous cell carcinoma (ESCC) that can be used for
treatment response evaluation and early detection of tumor recurrence. Methods: Matched
tumor tissue, pre- and post-surgery plasma and WBCs obtained from 17 ESCC patients were
sequenced using a panel of 483 cancer-related genes. Results: Somatic mutations were
detected in 14 of 17 tumor tissues. Putative harmful mutations were observed in genes
involved in well-known cancer-related pathways, including PI3K-Akt/mTOR signalling,
Proteoglycans in cancer, FoxO signalling, Jak-STAT signalling, Chemokine signalling and Focal
adhesion. Forty-six somatic mutations were found in pre-surgery cfDNA in 8 of 12 patients, with
mutant allele frequencies (MAF) ranging from 0.24% to 4.91%. Three of the 8 patients with
detectable circulating tumor DNA (ctDNA) had stage IIA disease, whereas the others had stage
IIB-IIIB disease. Post-surgery cfDNA somatic mutations were detected in only 2 of 14 patients,
with mutant allele frequencies of 0.28% and 0.36%. All other somatic mutations were
undetectable in post-surgery cfDNA, even in samples collected within 3-4 hours after surgery.
Conclusion: Our study shows that somatic mutations can be detected in pre-surgery cfDNA in
stage IIA to IIIB patients, and at a lower frequency in post-surgery cfDNA. This indicates that
cfDNA could potentially be used to monitor disease load, even in low disease-stage patients.
Keywords: Esophageal squamous cell carcinoma, circulating cell-free DNA, next generation
1. Background
Esophageal squamous cell carcinoma (ESCC) is the most common form of esophageal cancer
and is one of the deadliest cancers worldwide [1]. Because early-stage esophageal cancer is
mostly asymptomatic, the majority of ESCC patients are diagnosed with advanced disease.
Despite improvements in imaging, surgical techniques and chemoradiation therapy, effective
treatment of ESCC patients remains challenging, with an overall 5-year survival of less than 30%
[1]. For localized ESCC, surgery is the preferred option. However, even after radical resection,
the 5-year overall survival of ESCC patients with positive lymph nodes is less than 40% [2,3].
Moreover, the recurrence rate is also high in patients without positive lymph nodes [4]. Thus,
accurate and timely detection of minimal residual disease or relapse is crucial for tailoring
adjuvant therapy for a longer survival time.
Liquid biopsies have become a research hotspot for non-invasive follow-up on disease load and
therapy response and for early detection of recurrence [5]. Thus far, promising results have
been obtained with circulating cell-free DNA (cfDNA) [6-8]. PCR-based techniques such as
droplet digital PCR (ddPCR) and PNA-mediated PCR have been used to detect recurrent
mutations in EGFR and KRAS in lung cancer patients [9-12] and in APC in colorectal cancer [13].
Although the results are promising, these approaches require prior knowledge of the mutations
present in the tumor sample. To circumvent this limitation, next generation sequencing
(NGS)-based approaches using cancer hotspot panels or whole exome approaches have been applied
to cfDNA. These studies have reported variable dynamic patterns in mutant allele frequencies
of somatic mutations in lung cancer, breast cancer and colon cancer patients [7,14-18]. Some
studies have been carried out to monitor treatment response and disease progression by
screening cfDNA samples for mutations detected in the primary tumor [5,7]. For example, in
stage II colon cancer patients, the presence of tumor DNA in cfDNA provided evidence of
minimal residual disease and was associated with a higher risk of recurrence [7].
Only a few studies have focused on the analysis of cfDNA in ESCC. The amount of cfDNA was
shown to be higher in ESCC patients compared to healthy controls [19]. Three to six months
after tumor resection, the amount of cfDNA was significantly reduced, indicating that a major
fraction of the cfDNA is derived from tumor cells. Another study showed the feasibility of using
cfDNA before and after surgery to track tumor load [20]. Decreased mutant allele frequencies
were generally observed in post-surgery plasma of 8 ESCC patients using whole exome
sequencing and in 3 patients using targeted deep sequencing.
Together, these studies have indicated that circulating tumor DNA (ctDNA) can be detected in
the cfDNA of ESCC patients, but additional studies are required before ctDNA can be used in
routine clinical practice. In this study, we carried out targeted deep-sequencing using a
cancer-related gene panel to explore the cfDNA mutation profile in stage II and III ESCC patients both
pre- and post-surgery.
2. Methods
2.1 Patient selection
Seventeen ESCC patients who underwent radical tumor resection between November 1, 2013
and May 31, 2014 were included from the Shantou University cancer hospital (Figure 7.1). None
of the patients were treated with chemotherapy or radiotherapy before surgery. Tumor tissue
samples were stored at -80°C and evaluated by proficient pathologists. Blood was collected one
day before surgery and at a time point between 3-4 hours and 9 days after surgery. Clinical annotations
were retrospectively extracted from the institutional clinical database.
Figure 7.1. Schematic representation of the sample collection and a brief summary of the sequencing
results.
2.2 Blood sample separation
Whole blood samples (2-5 ml) were collected in EDTA tubes and processed within 2 hours. After
centrifugation at 900×g for 10 minutes, whole blood samples were separated into plasma and
WBC fraction. Aliquots of plasma were subjected to two subsequent centrifugation steps at
16,000×g for 10 minutes at 4°C to remove residual WBCs. Red blood cells in the WBC fraction
were lysed using standard procedures. WBCs were collected by centrifugation at 600×g for 10
minutes and washed with PBS. All samples were stored at -80°C.
2.3 DNA extraction, library preparation and sequencing
DNA isolations, library preparations and NGS of fresh frozen tumor tissues, WBCs and plasma
samples were performed by Novogene (Beijing, China). In brief, the QIAamp DNA Mini Kit
(Qiagen, Hilden, Germany) was used to extract DNA from fresh frozen tumor tissues. The
QIAamp circulating nucleic acid kit (Qiagen) was used to isolate cfDNA from 1 to 3 ml of plasma.
Genomic DNA of WBCs was extracted using the RelaxGene Blood DNA System (TianGen Biotech
Co., Ltd., Beijing, China). All DNA samples were analysed using Nanodrop for purity (ratio of
OD260/280) and DNA yield was quantified using the Qubit dsDNA HS assay kit on the Qubit 2.0
system (Life Technologies, Carlsbad, CA). DNA samples were stored at -80°C until they were
subjected to NGS.
DNA input for sequencing was 500 ng for tissue samples and WBCs and about 30 ng for cfDNA
samples. Exons of 483 human protein-coding genes related to cancer were included in the NGS
FASTQ files were obtained from the company and processed as previously described [22].
Briefly, reads were aligned to the hg19 reference genome with Burrows-Wheeler Aligner (BWA)
and Genome Analysis Toolkit (GATK) [23]. Format conversion and de-duplication were
performed using Picard Tools. HaplotypeCaller was used for variant calling in all samples in one
workflow. The data analysis pipeline is set to report all variants with a MAF>1%. Subsequently
the pipeline reports the allele depth for these variants in all samples. Personal variants were
filtered out when detected in the WBC samples at a variant allele frequency of >0.5% in
combination with a minimal variant allele count of two. In addition, we removed all variants
with a sequencing depth of less than 25x in WBCs, because we cannot reliably assess whether
these represent personal variants or somatic mutations. Variants with a coverage of <100x or a
mutant allele frequency (MAF) of <5% in the primary tumor samples were excluded, as these
variants are likely to be present in a minor subclone of the tumor. All other variants observed in
the tumor samples were considered to represent somatic mutations. For all somatic mutations
called in the tumor, mutations in cfDNA were determined to be present when we observed
either (1) 4 altered reads with a MAF >0.5%, or (2) >4 altered reads and a MAF above the
sequencing error background frequency, which was 0.43% per position and 0.14% per
alternative nucleotide. Somatic mutations specific for cfDNA were reported when the MAF was >1%
and the coverage was >100x. As the background rate for INDELs is much lower, we considered INDELs
with two or more altered reads as true somatic mutations, irrespective of the MAF.
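The filtering rules above can be summarised in a short sketch. This is an illustration of the stated thresholds only, not the pipeline used in the study: the names `Counts`, `is_somatic_in_tumor` and `is_detected_in_cfdna` are hypothetical, and the choice of the per-alternative-nucleotide rate (0.14%) as the background for rule (2) is an assumption, since the text quotes both a per-position and a per-alternative-nucleotide rate.

```python
from dataclasses import dataclass

# Thresholds quoted from the text; the constant names are hypothetical.
WBC_MAX_VAF = 0.005        # personal variant if WBC VAF > 0.5% ...
WBC_MIN_ALT_READS = 2      # ... combined with at least two variant reads
WBC_MIN_DEPTH = 25         # positions below 25x in WBCs are not assessable
TUMOR_MIN_DEPTH = 100      # tumor coverage below 100x is excluded
TUMOR_MIN_MAF = 0.05       # tumor MAF below 5% is excluded
# ASSUMPTION: rule (2) is compared against the per-alternative-nucleotide rate.
BACKGROUND_RATE = 0.0014


@dataclass
class Counts:
    alt: int    # reads supporting the variant allele
    depth: int  # total reads covering the position

    @property
    def maf(self) -> float:
        return self.alt / self.depth if self.depth else 0.0


def is_somatic_in_tumor(tumor: Counts, wbc: Counts) -> bool:
    """Apply the WBC and tumor filters described in the text."""
    if wbc.depth < WBC_MIN_DEPTH:
        return False  # cannot tell personal variants from somatic mutations
    if wbc.maf > WBC_MAX_VAF and wbc.alt >= WBC_MIN_ALT_READS:
        return False  # treated as a personal variant
    return tumor.depth >= TUMOR_MIN_DEPTH and tumor.maf >= TUMOR_MIN_MAF


def is_detected_in_cfdna(cf: Counts, is_indel: bool = False) -> bool:
    """Decide whether a tumor-confirmed mutation counts as present in cfDNA."""
    if is_indel:
        return cf.alt >= 2            # INDELs: two or more reads, any MAF
    if cf.alt == 4:
        return cf.maf > 0.005         # rule (1): 4 reads and MAF > 0.5%
    if cf.alt > 4:
        return cf.maf > BACKGROUND_RATE  # rule (2): > 4 reads, above background
    return False
```

Under these assumptions, a single-nucleotide variant with 4 altered reads at 600x coverage would pass rule (1), whereas the same 4 reads at 1,000x (MAF 0.4%) would not.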
To further establish the reliability of the mutations reported in cfDNA in our study, we checked
the read counts of the non-REF and non-variant bases by IGV. This indicates the sequencing
error rate at this specific site. Prediction of pathogenicity of somatic variants was based on
Combined Annotation Dependent Depletion (CADD) score [24]. We defined variants with a
CADD score ≥20 as harmful. For the analyses of the mutational profile in tumor tissues,
including the downstream pathway analysis, we focused on putative harmful mutations. DAVID
v6.8 [25] was used for KEGG pathway analysis of the genes with harmful mutations. For the
analyses of cfDNA, we included all somatic mutations, including non-harmful and silent
mutations, as these can be equally informative for disease load.
2.5 Statistics
For non-normally distributed data sets, median and range are given, and significance was
determined by Mann-Whitney-Wilcoxon rank sum test. A P-value<0.05 was considered
significant.
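For readers unfamiliar with the rank sum test, the following is a minimal, standard-library sketch of how the U statistic and an approximate two-sided P-value can be computed. It uses midranks for ties and a normal approximation without continuity correction, which is only rough for small samples. The statistical software used in the study is not stated, so the function name `mann_whitney_u` and this particular implementation are assumptions made for illustration.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using midranks and a normal approximation."""
    # Pool both groups, remembering which group each value came from.
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return u, p
```

A call such as `mann_whitney_u(pre_yields, post_yields)`, with two lists of cfDNA yields in ng per ml, would then give the statistic and an approximate P-value to compare against the 0.05 threshold.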
3. Results
3.1 Patient characteristics
Characteristics of the ESCC patients are shown in Supplementary Table S2. The cohort of 17
patients consisted of 12 males and 5 females (age range from 42 to 77 years), all diagnosed
with stage II to III disease. All patients were followed up for 24 months. Four of the 17 patients
experienced disease progression at 4 to 21 months after surgery. For 2 of the 4 patients with
progression, the amount of cfDNA in the pre-surgery plasma sample was too low for NGS
analysis. Median cfDNA yield was 11.9 ng (range 4.86-38.6 ng) per ml of plasma. The amount of
plasma cfDNA obtained before surgery was lower than after surgery (p=0.015) (Figure 7.2). No
obvious differences in cfDNA yield were seen between samples obtained 3-4 hours after
surgery compared to those obtained 2-9 days after surgery.
A summary of the sequencing data is shown in Supplementary Table S3. A phred quality score
of 30 (Q30) was achieved for 91% of the bases. Mean target coverage of all samples was 667x
and more than 95% of the target bases reached a coverage of more than 100x. The average
mismatch rate per nucleotide position was 0.14% per base. Supplementary Table S4 gives an
overview of the somatic mutations detected per sample and per patient. No somatic mutations
were detected in the tumor samples of three of the patients, nor in their corresponding pre-
and post-surgery cfDNA samples. We therefore excluded these three patients from further
analysis.
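The quality figures above can be put on a common scale using the standard Phred definition, Q = -10 log10(p), where p is the per-base error probability. The short sketch below is illustrative only; the helper names are hypothetical, and the calculation simply relates Q30, the reported 0.14% mismatch rate and the mean coverage of 667x.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


def expected_error_reads(depth: float, error_rate: float) -> float:
    """Average number of mismatching reads expected at one position."""
    return depth * error_rate


q30_error = phred_to_error_prob(30)                # 0.1% error per base
observed_q = error_prob_to_phred(0.0014)           # Phred equivalent of 0.14%
background = expected_error_reads(667, 0.0014)     # reads per position at 667x
```

Under this definition, Q30 corresponds to an error probability of 0.1%, a mismatch rate of 0.14% corresponds to a Phred score of about 28.5, and at 667x coverage roughly one mismatching read per position would be expected on average.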
Figure 7.2. Cell-free DNA yield in pre- and post-surgery blood samples. DNA yields were calculated per
millilitre of blood. Sixteen pairs of cfDNA samples were included, because yield information was not
available for one of the two pre-surgery samples with insufficient cfDNA. The amount of plasma cfDNA
isolated before surgery was lower than the amount obtained after surgery (p=0.015) based on the
Mann-Whitney-Wilcoxon rank sum test.
3.3 Somatic mutations in tumor DNA
We detected a total of 131 somatic mutations with a median coverage of 348x (range 100x to
3,131x) (Supplementary Table S4). Sixty-three of the 131 (48.1%) had a CADD score>20,
indicating a putatively pathogenic effect. The median number of mutations per patient was 9
(range 2 to 17) and the median MAF was 21.0% (range 9.7% to 85.2%). TP53 was the most
commonly mutated gene, with pathogenic mutations in 14 patients, followed by mutations in
NOTCH1 (4 patients), CDKN2A (3 patients), KMT2C (2 patients) and PTEN (2 patients) (Table
7.1). Two recurrent mutations were observed, one in TP53 and one in CDKN2A. Pathway
analysis of the 37 genes with harmful mutations indicated a total of 30 significantly enriched
pathways (p-value<0.01), each encompassing 4 to 9 mutated genes (Supplementary Table S5).
These include the PI3K-Akt/mTOR (9 genes), FoxO (7 genes), Jak-STAT (7 genes), Chemokine (7
genes) and Focal adhesion (7 genes) signalling pathways and the Proteoglycans in cancer (8
genes) pathway.
Table 7.1. Overview of all recurrently mutated genes (mutated in at least 3 patients)

| Gene | Patient ID | CDS mutation | Amino acid change | CADD score | MAF tumor DNA | MAF pre-surgery cfDNA | MAF post-surgery cfDNA | MAF WBC |
|---|---|---|---|---|---|---|---|---|
| TP53 | ESCC01 | c.1036G>T | p.Glu346* | 44 | 24.9% | 2.2% | - | - |
| | | c.673-2A>G | . | 23 | 43.9% | 2.3% | - | - |
| | ESCC02 | c.733G>A | p.Gly245Ser | 35 | 43.0% | - | - | - |
| | | c.839G>C | p.Arg280Thr | 33 | 20.2% | - | - | - |
| | ESCC03 | c.614A>G | p.Tyr205Cys | 24 | 73.7% | - | - | - |
| | ESCC04 | c.223_229delCCTGCAC | p.Pro75fs | 20 | 76.1% | 1.7% | - | - |
| | ESCC05 | c.364_365insT | p.Thr123fs | 35 | 85.2% | - | - | - |
| | ESCC06 | c.551_554delATAG | p.Asp184fs | 33 | 10.7% | - | - | - |
| | ESCC07 | c.920-1G>T | . | 25 | 14.4% | NA | - | - |
| | ESCC08 | c.770_782+10delTGGAAGACTCCAGGTCAGGAGCC | p.Leu257fs | 33 | 18.3% | 0.9% | - | - |
| | ESCC09 | c.643A>G | p.Ser215Gly | 29 | 29.3% | 2.3% | - | - |
| | ESCC10 | c.481G>A | p.Ala161Thr | 27 | 66.2% | 2.8% | - | - |
| | ESCC11 | c.844C>T | p.Arg282Trp | 33 | 27.1% | NA | - | - |
| | ESCC13 | c.844C>T | p.Arg282Trp | 33 | 21.9% | - | - | - |
| | ESCC12 | c.643A>C | p.Ser215Arg | 27 | 27.9% | - | - | - |
| | ESCC14 | c.742C>T | p.Arg248Trp | 34 | 27.3% | 0.9% | - | - |
| | | c.818G>A | p.Arg273His | 27 | 15.0% | - | - | - |
| NOTCH1 | ESCC01 | c.4672dupG | p.Leu1559fs | 35 | 57.6% | 4.9% | - | - |
| | | c.1070T>C | p.Phe357Ser | 29 | 13.6% | 2.1% | - | - |
| | ESCC02 | c.1359_1361delCAA | p.Asn454del | 19 | 62.7% | - | - | - |
| | ESCC05 | c.867_868insC | p.Gln290fs | 29 | 83.2% | - | - | - |
| | ESCC08 | c.928G>A | p.Gly310Arg | 27 | 26.8% | 0.6% | - | - |
| | ESCC09 | c.4646G>T | p.Cys1549Phe | 29 | 27.6% | 1.8% | - | - |
| KMT2D | ESCC02 | c.12823C>T | p.Gln4275* | 41 | 43.8% | - | - | - |
| | | c.14119C>G | p.Pro4707Ala | 18 | 20.3% | - | - | - |
| | ESCC05 | c.*636delA | . | 1 | 30.4% | - | - | - |
| | ESCC08 | c.9730delG | p.Glu3244fs | 35 | 13.5% | - | - | - |
| KMT2C | ESCC04 | n.-1G>A | . | 6 | 19.5% | - | - | - |
| | ESCC11 | c.569G>A | p.Arg190Gln | 24 | 15.2% | NA | - | - |
| | ESCC14 | c.11953G>A | p.Gly3985Arg | 24 | 12.7% | 0.6% | 0.4% | - |
| CDKN2A | ESCC09 | c.316+1G>T | . | 27 | 28.8% | 2.6% | - | - |
| | | c.172C>T | p.Arg58* | 35 | 27.0% | 1.8% | - | - |
| | ESCC10 | c.172C>T | p.Arg58* | 35 | 65.5% | 1.4% | - | - |
| | ESCC14 | c.488G>A | p.Arg163Gln | 34 | 42.6% | - | - | - |
| ETV6 | ESCC01 | c.329-72C>T | . | 4 | 21.1% | - | - | - |
| | ESCC04 | c.464-2686G>C | . | 2 | 36.2% | 0.6% | - | - |
3.4 Somatic mutations in pre- and post-surgery cfDNA
A subset of the mutations identified in tumor samples was also identified in cfDNA. No novel
mutations were found in any of the patients, including the 3 patients without somatic mutations
in the tumor. Five of the patients had no detectable somatic mutations in pre- and/or
post-surgery cfDNA (Table 7.2). Seven patients had somatic mutations in pre-surgery, but not in
post-surgery cfDNA. One patient (ESCC14) had 4 mutations in pre-surgery cfDNA and one mutation
in post-surgery cfDNA. The remaining patient (ESCC07) lacked pre-surgery cfDNA, but did have
one mutation in post-surgery cfDNA. As a control for the reliability of our filtering criteria, we
analysed the sequencing error reads at the mutant base positions of all mutations detected in the
tumor samples. In all cases this was less than 4 reads in all cfDNA samples and below the MAF
observed for the cfDNA sample (Supplementary Figure S6).
The median number of mutations observed in pre-surgery cfDNA was 5 per patient. For two
patients, a high proportion of the somatic mutations detected in the tumor were also detected
in pre-surgery cfDNA (80% for both ESCC09 and ESCC10). The median on-target coverage for
the cfDNA samples was 613x (range 391x to 839x) pre-surgery and 752x (range 546x to 1932x)
post-surgery. There was no difference in the coverage at positions for which mutant reads were
detected (median 673x, range 391x to 839x) as compared to positions for which no mutant
reads were detected (median 543x, range 471x to 725x) in cfDNA (Figure 7.3). The median
mutant allele frequency was 1.3% in pre-surgery cfDNA (range 0.24% to 4.91%). For all somatic
mutations, the MAFs in pre-surgery cfDNA were much lower than those observed in the tumor
tissue.
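To give a feel for how coverage limits detection at these low allele frequencies, the sketch below estimates the chance of observing at least a given number of mutant reads at one position. It assumes that reads sample the mutant allele independently (a simple binomial model), which ignores sequencing errors, duplicates and capture bias. The function name `prob_at_least` is hypothetical and the model is not part of the study's analysis.

```python
from math import comb


def prob_at_least(min_reads: int, depth: int, maf: float) -> float:
    """P(at least `min_reads` mutant reads) under a binomial sampling model."""
    below = sum(
        comb(depth, k) * maf ** k * (1 - maf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - below


# Median pre-surgery values from the text: coverage 613x, MAF 1.3%.
p_median = prob_at_least(4, 613, 0.013)
# A ten-fold lower allele fraction at the same coverage.
p_low = prob_at_least(4, 613, 0.0013)
```

Under this simplified model, a mutation at the median pre-surgery MAF would usually yield four or more mutant reads at the median coverage, whereas a mutation at one tenth of that MAF would rarely do so. This is in line with the expectation that deeper sequencing is needed to follow low-frequency variants.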
Table 7.2. Overview of clinical characteristics and the number of somatic mutations in tumor
DNA and cfDNA.

| Sample ID | pTNM and grade | Stage | PFS time | Post-surgery CRT | Blood drawing time | Somatic mutations: tumor | Somatic mutations: pre-surgery cfDNA | Somatic mutations: post-surgery cfDNA |
|---|---|---|---|---|---|---|---|---|
| ESCC01 | PT2N0M0G2-3 | IIA | 5m | N | 2d | 12 | 6 | 0 |
| ESCC02 | PT3N0M0G2 | IIB | N | N | 3-4h | 13 | 2 | 0 |
| ESCC03 | PT2N0M0G2 | IIA | N | N | 3-4h | 13 | 0 | 0 |
| ESCC04 | PT3N0M0G2 | IIA | N | N | 3-4h | 17 | 6 | 0 |
| ESCC05 | PT3N0M0G1 | IIA | N | N | 9d | 9 | 1 | 0 |
| ESCC06 | PT3N0M0G2 | IIA | N | N | 5d | 3 | 0 | 0 |
| ESCC07 | PT3N0M0G1 | IIA | 12m | Y | 3-4h | 3 | NA | 1 |
| ESCC08 | PT3N1M0G1 | IIIB | N | N | 3-4h | 9 | 3 | 0 |
| ESCC09 | PT4aN0M0G2 | IIIB | N | N | 3-4h | 10 | 8 | 0 |
| ESCC10 | PT4aN0M0G2 | IIIB | N | N | 9d | 20 | 16 | 0 |
| ESCC11 | PT3N1M0G3 | IIIB | 4m | N | 3-4h | 5 | NA | 0 |
| ESCC12 | PT3N1M0G2 | IIIB | N | Y | 3-4h | 2 | 0 | 0 |
| ESCC13 | PT3N1M0G2 | IIIB | N | Y | 3-4h | 2 | 0 | 0 |
| ESCC14 | PT2N1M0G2 | IIIA | N | Y | 6d | 13 | 4 | 1 |

PFS: Progression-free survival, N: no progression during follow-up, m: months, CRT: chemoradiotherapy, NA: Not Available
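The summary figures quoted in the text can be recomputed directly from the counts in Table 7.2. The sketch below transcribes the table by hand and derives the totals, the detection rate and the per-stage breakdown. The variable and function names are hypothetical, and `None` is used here to stand for "NA".

```python
from statistics import median

# (sample ID, stage, tumor, pre-surgery cfDNA, post-surgery cfDNA)
# None stands for "NA": no pre-surgery cfDNA could be sequenced.
TABLE_7_2 = [
    ("ESCC01", "IIA", 12, 6, 0),
    ("ESCC02", "IIB", 13, 2, 0),
    ("ESCC03", "IIA", 13, 0, 0),
    ("ESCC04", "IIA", 17, 6, 0),
    ("ESCC05", "IIA", 9, 1, 0),
    ("ESCC06", "IIA", 3, 0, 0),
    ("ESCC07", "IIA", 3, None, 1),
    ("ESCC08", "IIIB", 9, 3, 0),
    ("ESCC09", "IIIB", 10, 8, 0),
    ("ESCC10", "IIIB", 20, 16, 0),
    ("ESCC11", "IIIB", 5, None, 0),
    ("ESCC12", "IIIB", 2, 0, 0),
    ("ESCC13", "IIIB", 2, 0, 0),
    ("ESCC14", "IIIA", 13, 4, 1),
]


def summarize(rows):
    """Derive the cohort-level numbers reported in the text."""
    with_pre = [r for r in rows if r[3] is not None]
    positive = [r for r in with_pre if r[3] > 0]
    by_stage = {}
    for _, stage, _, pre, _ in with_pre:
        group = stage.rstrip("AB")  # "IIA" -> "II", "IIIB" -> "III"
        hits, total = by_stage.get(group, (0, 0))
        by_stage[group] = (hits + (pre > 0), total + 1)
    return {
        "tumor_total": sum(r[2] for r in rows),
        "pre_total": sum(r[3] for r in with_pre),
        "patients_with_pre": len(with_pre),
        "patients_pre_positive": len(positive),
        "median_pre_in_positive": median(r[3] for r in positive),
        "patients_post_positive": sum(1 for r in rows if r[4] > 0),
        "by_stage": by_stage,
        "shared_fraction": {r[0]: r[3] / r[2] for r in positive},
    }


summary = summarize(TABLE_7_2)
```

With the table as transcribed, this reproduces the 131 tumor mutations, the 46 pre-surgery cfDNA mutations in 8 of 12 patients, the median of 5 mutations per ctDNA-positive patient, and the 80% overlap for ESCC09 and ESCC10.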
In 2 of the 14 post-surgery cfDNA samples, we identified one of the somatic mutations
observed in the corresponding tumor tissue. In the remaining 12 post-surgery cfDNA samples,
no mutations were observed (Figure 7.4). Mutant allele frequencies of the two mutations were
0.28% (ESCC07) and 0.36% (ESCC14). For one patient the time between surgery and blood
collection was 3-4 hours and for the second patient this was 6 days.
3.5 Correlation of cfDNA mutations with clinical characteristics
We detected cfDNA mutations in pre-surgery samples from 4 out of 6 stage II and 4 out of 6
stage III patients. Only one of the three patients with disease recurrence within one year had
pre-surgery cfDNA. In this patient, 6 of the 12 somatic mutations observed in the corresponding
tumor DNA were detectable in cfDNA. For the other two patients, we only had post-surgery
cfDNA, and in one of the two we detected a single mutation out of three somatic mutations
detected in the tumor samples.
Figure 7.3. Coverage at the target regions in cfDNA samples. A. pre-surgery cfDNA samples. B.
post-surgery cfDNA samples. Mean target region coverage (black squares) and the coverage for the
nucleotide positions for which somatic mutations were detected in the corresponding tumor samples is
indicated. Dot colours indicate coverage at the nucleotide position for which the mutant allele was
(orange dots) or was not (green dots) detected in cfDNA.
Figure 7.4. Overview of mutant allele frequencies in tumor DNA (tDNA) and cfDNA per patient. All ctDNA
allele frequencies are shown, including those with altered read number and allele frequencies below our
threshold. Two different mutations were detected in CDKN2A in ESCC09 and in EPHA4 in ESCC10. For all
mutations, a much lower MAF was observed in post-surgery cfDNA. The black boxes indicate the
tumor-specific somatic mutations that were detected in either pre- or post-surgery cfDNA plasma samples.
4. Discussion
Early detection of tumor recurrence and a tool to evaluate treatment response in ESCC would
allow us to optimize treatment strategy for individual patients. In the absence of effective
prognostic biomarkers in ESCC [26], analysis of cfDNA might provide an easily accessible source
of information to monitor disease load after surgery. In this study we performed targeted
sequencing of pre- and post-surgery cfDNA and of matched tumor tissues and WBCs. Our main
finding is that we were able to detect a subset of the tumor-specific somatic mutations in
pre-surgery cfDNA for most of our stage II and stage III ESCC patients using as little as 3 ml blood.
Most of these mutations were not detected in cfDNA of blood samples obtained as early as 3-4
hours after surgery.
Even though our study cohort size is limited, this is the most extensive study thus far to
compare pre- and post-surgery cfDNA in stage II and III ESCC patients. For 3 of the patients no
somatic mutations were identified, despite the use of a broad cancer gene panel consisting of
483 genes. Thus, future studies should focus on a larger or ESCC-specific panel to allow
detection of mutations in all patients. Regretfully, due to lack of material we could not do an
independent validation of the mutations detected in cfDNA. To overcome this shortcoming, we
monitored sequencing error rates for all positions for which we identified mutations in cfDNA
using our predefined criteria. This indicated that the sequencing errors did not pass our criteria.
Moreover, in our previous NGS-based studies we validated close to 100% of the mutations
called in the NGS data by the same pipeline for all variants with 4 or more reads by an
independent technique [27,28]. Finally, our pipeline for variant calling is based on the GATK
workflow, and this pipeline was previously shown to have a sensitivity of 95% and positive
predictive value of 99% [29]. Taken together, we consider the mutations in cfDNA as called in
our study as reliable. Previous studies have demonstrated that ctDNA can be detected in most
advanced-stage cancer patients with high sensitivity. This allows monitoring of therapeutic
response, identification of tumor-specific variants relevant for choice of therapy, and detection
of acquired resistance-induced mutations. Using ddPCR, somatic mutations have been detected
in cfDNA of more than 75% of patients with advanced pancreatic, ovarian, colorectal, bladder,
gastroesophageal, breast, melanoma, hepatocellular and head and neck cancers, but in less
than 50% of primary brain, renal, prostate or thyroid cancers [18]. Thus, it is clear that the
presence of ctDNA varies between cancer types. In our study cohort, somatic mutations were
found in pre-surgery cfDNA in 8 out of 12 patients. Our data support the potential of using
cfDNA as a biomarker of disease load for this patient group for whom no effective biomarker is
currently available. The detection rate could be higher with optimized approaches for detection
of variants with low MAF and using an extended ESCC-specific gene panel.
A high-throughput sequencing approach allows detection of somatic mutations in cfDNA in a
more comprehensive target region, as compared to mutation-specific PCR approaches, albeit
with a somewhat lower sensitivity. With the detection of ctDNA in 4 out of 6 stage II patients,
our results are comparable to the findings in previous studies focusing on stage I or II
colorectal, breast, lung, ovarian and pancreatic cancer, with 43% to 71% of patients harbouring
Postoperative adjuvant chemoradiotherapy was recommended to eliminate micrometastatic
disease and minimal residual disease. Unfortunately, there is no effective tool to assess minimal
residual disease for early tailoring of adjuvant therapy to avoid both under- and overtreatment.
In addition, there is also no effective tool for early relapse surveillance prior to imaging.
Detection of ctDNA after resection can point to minimal residual disease or even predict clinical
relapse and poor outcome in different cancer types [7,30,31]. Due to the limited number of
patients in our study and post-surgery treatment in some of them, we cannot reliably assess
the potential clinical value of the presence of ctDNA in pre- and post-surgery cfDNA.
Nevertheless, we did observe timely changes in MAF in cfDNA as early as 3 to 4 hours after
surgery, which is consistent with the reported half-life of cfDNA ranging from 16 minutes to two
hours [32,33]. The relatively short half-life of cfDNA makes it a good biomarker to monitor
dynamic changes in disease load. Cellular damage due to the surgery may lead to an increased
amount of cfDNA, as we observed in some post-surgery cases. This will lead to a fractional
decrease in the amount of ctDNA. Thus, both cellular damage and the short half-life of cfDNA
may cause the MAF to drop below the detection limit. Therefore, for residual disease monitoring, it is
advisable to draw blood a few days after surgery.
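The effect of the short half-life can be illustrated with a simple first-order decay calculation. The sketch below assumes that no new ctDNA is released after resection and that clearance follows a single exponential. The optional `dilution` factor is a hypothetical way to represent an increase in total cfDNA caused by surgical tissue damage. None of this is a model fitted to the study data.

```python
def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of the pre-surgery ctDNA left after `hours` of decay."""
    return 0.5 ** (hours / half_life_hours)


def expected_maf(pre_maf: float, hours: float, half_life_hours: float,
                 dilution: float = 1.0) -> float:
    """Expected MAF after surgery.

    `dilution` is the assumed fold increase in total cfDNA (1.0 = unchanged).
    """
    return pre_maf * fraction_remaining(hours, half_life_hours) / dilution


# Half-life range cited in the text: 16 minutes to 2 hours.
slow = fraction_remaining(3, 2.0)        # 3 hours after surgery, 2 h half-life
fast = fraction_remaining(3, 16 / 60)    # 3 hours after surgery, 16 min half-life
```

Under these assumptions, about 35% of the ctDNA would remain 3 hours after surgery with a 2-hour half-life, and well under 0.1% with a 16-minute half-life. A pre-surgery MAF of 1.3% could therefore already be close to or below the detection thresholds in a sample drawn 3-4 hours after resection, even before any dilution by cfDNA released from damaged tissue is taken into account.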
Theoretically, ctDNA could be used broadly to guide treatment and to monitor for treatment
resistance or cancer recurrence. CtDNA is also more specific to tumor load compared to
serum-based protein biomarkers such as cancer antigen 125 in ovarian cancer patients [16] and can be
used for tumors for which no serum-based protein biomarkers are available, as is the case for
ESCC [34].
5. Conclusions
We detected tumor-specific mutations in pre-surgery cfDNA of both stage II and stage III ESCC
patients. In samples taken shortly after surgery, mutations were either undetectable or had a
significantly lower MAF, which indicates that the presence of mutations in cfDNA correlates
with tumor load. This implies that cfDNA may be used as a marker for the presence of tumor
cells in ESCC patients. Larger studies are needed to establish the clinical applicability of cfDNA
and its predictive value for treatment outcome, as we only had samples from three patients who
relapsed after surgery.
List of abbreviations
cfDNA: cell free DNA; ESCC: esophageal squamous cell carcinoma; EC: esophageal cancer;
ddPCR: droplet digital PCR; ctDNA: circulating tumor DNA; NGS: next generation sequencing;
MAF: mutant allele frequency
Declarations
Ethics approval and consent to participate
The study was approved by the Ethics Committee of the Cancer Hospital of Shantou University.
All patients gave written informed consent for use of their samples and all patient data were
de-identified for this study.
Consent for publication
Not applicable
Availability of data and materials
All data generated or analyzed during this study are included in its supplementary information
files and are available from the corresponding author on reasonable request.
Competing interests
The authors declare that they have no competing interests.
Funding
This work is supported by the Li Ka Shing Foundation.
Authors’ contributions
JG, YQG, JCW and PM conceived the experiments. PM, JCW, SBC, QYH, QZ, ZQS, MS and WCY
collected the samples and/or conducted the experiments. PM, JCW, MMT and KK analysed the
data. PM, JCW, KK, AB and JG designed the study and wrote the manuscript. All authors have
reviewed and approved the manuscript.
Acknowledgments
We gratefully acknowledge the staff who helped collect samples for this study at the Department of
Thoracic Surgery, Cancer Hospital of Shantou University and at the Department of Pathology &
Institute of Clinical Pathology, Shantou University Medical College. We thank Kate Mc Intyre for
language editing. We would like to thank Novogene (Beijing, China) for the help with targeted
sequencing. We thank the UMCG Genomics Coordination Center, the UG Center for
Information Technology and their sponsors BBMRI-NL and TarGet for storage and compute
infrastructure.
References
1. Pennathur, A.; Gibson, M.K.; Jobe, B.A.; Luketich, J.D. Oesophageal carcinoma. The Lancet 2013, 381, 400-412.
2. Li, L.; Zhao, L.; Lin, B.; Su, H.; Su, M.; Xie, D.; Jin, X.; Xie, C. Adjuvant therapeutic modalities following three-field lymph node dissection for stage ii/iii esophageal squamous cell carcinoma. J Cancer 2017, 8, 2051-2059.
3. Xiao, Z.F.; Yang, Z.Y.; Liang, J.; Miao, Y.J.; Wang, M.; Yin, W.B.; Gu, X.Z.; Zhang, D.C.; Zhang, R.G.; Wang, L.J. Value of radiotherapy after radical surgery for esophageal carcinoma: A report of 495 patients. Ann Thorac Surg 2003, 75, 331-336.
4. Shen, W.B.; Gao, H.M.; Zhu, S.C.; Li, Y.M.; Li, S.G.; Xu, J.R. Analysis of the causes of failure after radical surgery in patients with pt3n0m0 thoracic esophageal squamous cell carcinoma and consideration of postoperative radiotherapy. World J Surg Oncol 2017, 15, 192.
5. Heitzer, E.; Ulz, P.; Geigl, J.B. Circulating tumour DNA as a liquid biopsy for cancer. Clinical chemistry 2015, 61, 112-123.
6. Siravegna, G.; Mussolin, B.; Buscarino, M.; Corti, G.; Cassingena, A.; Crisafulli, G.; Ponzetti, A.; Cremolini, C.; Amatu, A.; Lauricella, C., et al. Clonal evolution and resistance to egfr blockade in the blood of colorectal cancer patients. Nature medicine 2015, 21, 795-801.
7. Tie, J.; Wang, Y.; Cristian Tomasetti; Li, L.; Springer, S.; Kinde, I.; Silliman, N.; Tacey, M.; Wong, H.-L.; Christie, M., et al. Circulating tumour DNA analysis detects minimal residual disease and predicts recurrence in patients with stage ii colon cancer. Science translational medicine 2016, 8, 346-392.
8. Newman, A.M.; Bratman, S.V.; To, J.; Wynne, J.F.; Eclov, N.C.; Modlin, L.A.; Liu, C.L.; Neal, J.W.; Wakelee, H.A.; Merritt, R.E., et al. An ultrasensitive method for quantitating circulating tumour DNA with broad patient coverage. Nature medicine 2014, 20, 548-554.
9. Thierry, A.R.; Mouliere, F.; El Messaoudi, S.; Mollevi, C.; Lopez-Crapez, E.; Rolet, F.; Gillet, B.; Gongora, C.; Dechelotte, P.; Robert, B., et al. Clinical validation of the detection of kras and braf mutations from circulating tumour DNA. Nature medicine 2014, 20, 430-435.
10. Jean-Yves Douillard; Gyula Ostoros; Manuel Cobo; Tudor Ciuleanu; Rebecca Cole; Gael McWalter; Jill Walker, P.; Simon Dearden; Alan Webster; Tsveta Milenkova, et al. Gefitinib treatment in egfr mutated caucasian nsclc: Circulating-free tumour DNA as a surrogate for determination of egfr status. Journal of thoracic oncology : official publication of the International Association for the Study of Lung Cancer 2014, 9, 1345-1353.
11. Kim, H.-R.; Lee, S.Y.; Hyun, D.-S.; Lee, M.K.; Lee, H.-K.; Choi, C.-M.; Yang, S.-H.; Kim, Y.-C.; Lee, Y.C.; Kim, S.Y., et al. Detection of EGFR mutations in circulating free DNA by PNA-mediated PCR clamping. J Exp Clin Cancer Res. 2014, 32.
12. Xu, J.-M.; Liu, X.-J.; Ge, F.-J.; Lin, L.; Wang, Y.; Sharma, M.R.; Liu, Z.-Y.; Tommasi, S.; Paradiso, A. KRAS mutations in tumour tissue and plasma by different assays predict survival of patients with metastatic colorectal cancer. Journal of Experimental & Clinical Cancer Research 2014, 33.
13. Diehl, F.; Li, M.; Dressman, D.; He, Y.; Shen, D.; Szabo, S.; Diaz, L.A., Jr.; Goodman, S.N.; David, K.A.; Juhl, H., et al. Detection and quantification of mutations in the plasma of patients with colorectal tumour. Proceedings of the National Academy of Sciences of the United States of America 2005, 102, 16368-16373.
14. Rothe, F.; Laes, J.F.; Lambrechts, D.; Smeets, D.; Vincent, D.; Maetens, M.; Fumagalli, D.; Michiels, S.; Drisis, S.; Moerman, C., et al. Plasma circulating tumour DNA as an alternative to metastatic biopsies for mutational analysis in breast cancer. Annals of oncology : official journal of the European Society for Medical Oncology / ESMO 2014, 25, 1959-1965.
15. Xu, S.; Lou, F.; Wu, Y.; Sun, D.Q.; Zhang, J.B.; Chen, W.; Ye, H.; Liu, J.H.; Wei, S.; Zhao, M.Y., et al. Circulating tumour DNA identified by targeted sequencing in advanced-stage non-small cell lung cancer patients. Cancer letters 2016, 370, 324-331.
16. Forshew, T.; Murtaza, M.; Parkinson, C.; Gale, D.; Tsui, D.W.; Kaper, F.; Dawson, S.J.; Piskorz, A.M.; Jimenez-Linan, M.; Bentley, D., et al. Noninvasive identification and monitoring of cancer mutations by targeted deep sequencing of plasma DNA. Science translational medicine 2012, 4, 136ra68.
17. Phallen, J.; Sausen, M.; Adleff, V.; Leal, A.; Hruban, C.; White, J.; Anagnostou, V.; Fiksel, J.; Cristiano, S.; Papp, E., et al. Direct detection of early-stage cancers using circulating tumour DNA. Science translational medicine 2017, 9.
18. Bettegowda, C.; Sausen, M.; Leary, R.J.; Kinde, I.; Wang, Y.; Agrawal, N.; Bartlett, B.R.; Wang, H.; Luber, B.; Alani, R.M., et al. Detection of circulating tumour DNA in early- and late-stage human malignancies. Science translational medicine 2014, 6, 224ra24.
19. Banki, F.; Mason, R.J.; Oh, D.; Hagen, J.A.; DeMeester, S.R.; Lipham, J.C.; Tanaka, K.; Danenberg, K.D.; Yacoub, W.N.; Danenberg, P.V., et al. Plasma DNA as a molecular marker for completeness of resection and recurrent disease in patients with esophageal cancer. Arch Surg 2007, 142, 533-538.
20. Luo, H.; Li, H.; Hu, Z.; Wu, H.; Liu, C.; Li, Y.; Zhang, X.; Lin, P.; Hou, Q.; Ding, G., et al. Noninvasive diagnosis and monitoring of mutations by deep sequencing of circulating tumour DNA in esophageal squamous cell carcinoma. Biochemical and biophysical research communications 2016, 471, 596-602.
21. Yu, J.Y.; Yu, S.F.; Wang, S.H.; Bai, H.; Zhao, J.; An, T.T.; Duan, J.C.; Wang, J. Clinical outcomes of EGFR-TKI treatment and genetic heterogeneity in lung adenocarcinoma patients with EGFR mutations on exons 19 and 21. Chinese journal of cancer 2016, 35, 30.
22. Wei, J.; van der Wekken, A.J.; Saber, A.; Terpstra, M.M.; Schuuring, E.; Timens, W.; Hiltermann, T.J.N.; Groen, H.J.M.; van den Berg, A.; Kok, K. Mutations in EMT-related genes in ALK positive crizotinib resistant non-small cell lung cancers. Cancers (Basel) 2018, 10.
23. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M., et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010, 20, 1297-1303.
24. Kircher, M.; Witten, D.M.; Jain, P.; O'Roak, B.J.; Cooper, G.M.; Shendure, J. A general framework for estimating the relative pathogenicity of human genetic variants. Nature genetics 2014, 46, 310-315.
25. Huang da, W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols 2009, 4, 44-57.
26. Qing, T.; Zhu, S.; Suo, C.; Zhang, L.; Zheng, Y.; Shi, L. Somatic mutations in ZFHX4 gene are associated with poor overall survival of Chinese esophageal squamous cell carcinoma patients. Scientific reports 2017, 7, 4951.
27. Saber, A.; Hiltermann, T.J.N.; Kok, K.; Terpstra, M.M.; de Lange, K.; Timens, W.; Groen, H.J.M.; van den Berg, A. Mutation patterns in small cell and non-small cell lung cancer patients suggest a different level of heterogeneity between primary and metastatic tumour. Carcinogenesis 2017, 38, 144-151.
28. Liu, Y.; Abdul Razak, F.R.; Terpstra, M.; Chan, F.C.; Saber, A.; Nijland, M.; van Imhoff, G.; Visser, L.; Gascoyne, R.; Steidl, C., et al. The mutational landscape of Hodgkin lymphoma cell lines determined by whole-exome sequencing. Leukemia 2014, 28, 2248-2251.
29. McCormick, R.F.; Truong, S.K.; Mullet, J.E. RIG: Recalibration and interrelation of genomic sequence data with the GATK. G3 (Bethesda, Md.) 2015, 5, 655-665.
30. Sausen, M.; Phallen, J.; Adleff, V.; Jones, S.; Leary, R.J.; Barrett, M.T.; Anagnostou, V.; Parpart-Li, S.; Murphy, D.; Kay Li, Q., et al. Clinical implications of genomic alterations in the tumour and circulation of pancreatic cancer patients. Nat Commun 2015, 6, 7686.
31. Majure, M.; Logan, A.C. What the blood knows: Interrogating circulating tumour DNA to predict progression of minimal residual disease in early breast cancer. Ann Transl Med 2016, 4, 543.
32. Rostami, A.; Bratman, S.V. Utilizing circulating tumour DNA in radiation oncology. Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology 2017.
33. Diehl, F.; Schmidt, K.; Choti, M.A.; Romans, K.; Goodman, S.; Li, M.; Thornton, K.; Agrawal, N.; Sokoll, L.; Szabo, S.A., et al. Circulating mutant DNA to assess tumour dynamics. Nature medicine 2008, 14, 985-990.
34. Qi, Y.J.; Chao, W.X.; Chiu, J.F. An overview of esophageal squamous cell carcinoma proteomics. J
Supplementary data
Supplementary Table S1. Overview of the 483 genes present in the Illumina cancer gene panel.
ABCB1 ABCC1 ABCC2 ABCC4 ABCC6 ABCG2 ABL1
ACK1/TNK2 ACVR1B AKT1 AKT2 AKT3 ALK AMER1
APC AR ARAF ARFRP1 ARID1A ARID1B ARID2
ASXL1 ATIC ATM ATP7A ATR ATRX AURKA
AURKB AXIN1 AXL B2M BAIAP3 BAP1 BARD1
BCL2 BCL2L2 BCL6 BCOR BCORL1 BCR BIRC5
BLK BLM BRAF BRCA1 BRCA2 BRIP1 BRK/PTK6
BSG/CD147 BTK C11orf30 C18orf56 C8orf34 CAMK2G CAMKK2
CARD11 CASP8 CBFB CBL CBR1 CBR3 CCND1
CCND2 CCND3 CCNE1 CCR4 CD19 CD22 CD274
CD33 CD38 CD3EAP CD52 CD74 CD79A CD79B
CDA CDC73 CDH1 CDK1 CDK12 CDK2 CDK4
CDK5 CDK6 CDK7 CDK8 CDK9 CDKN1B CDKN2A
CDKN2B CDKN2C CEBPA CHEK1 CHEK2 CHST3 CIC
CSNK1A1 COMT CREBBP CRKL CRLF2 CSF1R CSK
CTCF CTLA4 CTNNA1 CTNNB1 CYBA CYLD CYP19A1
CYP1A1 CYP1A2 CYP1B1 CYP2A6 CYP2B6 CYP2C19 CYP2C8
CYP2C9 CYP2D6 CYP2E1 CYP3A4 CYP3A5 CYP4B1 DAXX
DDR1 DDR2 DNMT1 DNMT3A DOT1L DPYD DSCAM
E2F1 EGF EGFL7 EGFR EGR1 EMC8 EML4
ENOSF1 EP300 EPH/EPHA1 EPHA2 EPHA3 EPHA4 EPHA5
EPHA7 EPHA8 EPHB1 EPHB2 EPHB3 EPHX1 ERBB2/HER2
ERBB3 ERBB4 ERCC1 ERCC2 ERG ESR1/ER ETV1
ETV4 ETV5 ETV6 EWSR1 EZH2 FAM46C FANCA
FANCC FANCD2 FANCE FANCF FANCG FANCL FBXW7
FCGR3A FGF10 FGF14 FGF19 FGF23 FGF3 FGF4
FGF6 FGFR1 FGFR2 FGFR3 FGFR4 FGR FKBP1A
FLT1 FLT3 FLT4 FOXL2 FRK FUBP1 FYN
FZD7 GALNT14 GATA1 GATA2 GATA3 GCK GID4
GINS2 GNA11 GNA13 GNAQ GNAS GPC3 GPR124
GRIN2A GSK3B GSTM1 GSTM3 GSTP1 GSTT1 H3F3A
HCK HGF HIF-1/HIF1A HIST1H3B HNF1A HRAS HSP90AA1
IDH1 IDH2 IGF1 IGF1R/IGFR IGF2 IGF2R IKBKB
IKBKE IKZF1 IL7R INHBA INSR/IR IRF4 IRS2
ITK JAK1 JAK2 JAK3 JUN KAT6A KDM5A
KDM5C KDM6A KDR/VEGFR KEAP1 KIT KLC3 KLHL6
KMT2A/MLL KMT2B/MLL4 KMT2C/MLL3 KMT2D/MLL2 KRAS LCK LIMK1
LMO1 LRP1B LRP2 LYN MAP2K1 MAP2K2 MAP2K4
MAP3K1 MAP4K4 MAP4K5 MAPK1 MAPK10 MAPK14 MAPK8
MAPK9 MAPKAPK2 MARK1 MCL1 MDM2 MDM4 MED12
MEF2B MEN1 MERTK MET MITF MKNK2 MLH1
MPL MRE11A MS4A1 MSH2 MSH6 MTDH MTHFR
MTOR MTRR MUTYH MYC MYCL1 MYCN MYD88
NAT1 NAT2 NCAM1 NCF4 NCOA3 NCOR1 NEK11
NOTCH2 NPM1 NQO1 NRAS NTRK1 NTRK2 NTRK3
NUP93 PAK1 PAK3 PALB2 PARP1 PARP2 PAX5
PBRM1 PDCD1 PDGFRA PDGFRB PDK1 PHF6 PHKA2
PIGF PIK3CA PIK3CB PIK3CG PIK3R1 PIK3R2 PKC/PRRT2
PKCγ/PRKCG PKCε/PRKCE PLK1 PPARD PPP1R13L PPP2R1A PRDM1
PRDX4 PRKAA1 PRKAR1A PRKCA PRKCB PRKDC PTCH1
PTEN PTK2 PTPN11 PTPRD RAC2 RAD50 RAD51
RAF1 RARA RB1 RET RICTOR RMDN2 RNF43
ROCK1 RON/MST1R ROS1 RPL13 RPS6KA1 RPS6KB1 RPTOR
RRM1 RUNX1 SCF/KITLG SDHA SDHAF1 SDHAF2 SDHB
SDHC SDHD SETD2 SF3B1 SGK1 SHH SIK1
SKP2 SLC10A2 SLC15A2 SLC22A1 SLC22A16 SLC22A2 SLC22A6
SLCO1B1 SLCO1B3 SMAD2 SMAD4 SMARCA4 SMARCB1 SMO
SOCS1 SOD2 SOX10 SOX2 SOX9 SPEN SPG7
SPOP SRC SRD5A2 SRMS STAG2 STAT1 STAT2
STAT3 STAT4 STAT5A STAT5B STAT6 STEAP1 STK11
STK3 STK4 SUFU SULT1A1 SULT1A2 SULT1C4 SYK
TCF7L1 TCF7L2 TEK TET2 TGFBR1 TGFBR2 TK1
TMPRSS2 TNF TNFAIP3 TNFRSF14 TNFRSF8 TNFSF11 TNFSF13B
TOP1 TP53 TPMT TPX2 TRAIL-R1 TRAIL-R2 TSC1
TSC2 TSHR TYMS/TS TYRO3 U2AF1 UBE2I UGT1A1
UGT1A9 UGT2B15 UGT2B17 UGT2B7 UMPS VEGFA VEGFB
VHL WEE1 WISP3 WNK3 WT1 XPC XPO1
Supplementary Table S2. Clinical and pathological features of ESCC patients.
Clinical or pathological features Number of patients (%)
Age (years)
Median (range) 59 (42-77)
Sex
Male 12 (70.6%)
Female 5 (29.4%)
Smoking history
Yes 12 (70.6%)
No 5 (29.4%)
Alcohol history
Yes 5 (29.4%)
No 12 (70.6%)
Family history
Yes 2 (11.8%)
No 15 (88.2%)
T (Depth of tumor)
2 3 (17.6%)
3 13 (76.5%)
4 1 (5.9%)
N (lymph node status)
0 12 (70.6%)
1-2 5 (29.4%)
TNM stage
II 10 (58.8%)
III 7 (41.2%)
Relapse after surgery
Yes 4 (23.5%)
Supplementary Table S3. Overview of the sequencing results. In this table, paired read number, Q30, aligned read number, percentage of duplicate reads, aligned unique reads and coverage are listed per sample.
ID | Sample | Paired reads (million) | Q30 | Percentage of reads aligned | Reads aligned (million) | Paired mismatch rate | Percentage of duplicate reads | Unique reads aligned (million) | Mean target coverage | Percentage of target bases with coverage ≥100X
... | ... | 17.2 | 91 | 99.9% | 17.2 | 0.42% | 13% | 14.6 | 509 | 99.2%
... | ... | 17.8 | 91 | 99.9% | 17.8 | 0.42% | 12% | 15.4 | 555 | 99.0%
... | ... plasma | 30.1 | 91 | 99.8% | 30.1 | 0.44% | 26% | 21.8 | 687 | 100.0%
... | ... plasma | 33.1 | 92 | 99.9% | 33.0 | 0.38% | 23% | 25 | 840 | 99.9%
... | ... | 29.1 | 91 | 99.9% | 29.1 | 0.41% | 13% | 25 | 801 | 98.6%
... | ... | 20.2 | 92 | 99.8% | 20.2 | 0.39% | 13% | 17.3 | 566 | 97.3%
... | ... plasma | 29.3 | 92 | 99.9% | 29.2 | 0.39% | 20% | 22.9 | 713 | 98.3%
... | ... plasma | 38.8 | 93 | 99.9% | 38.8 | 0.35% | 23% | 29.4 | 911 | 98.6%
... | ... | 18.7 | 91 | 99.9% | 18.7 | 0.51% | 15% | 15.4 | 439 | 93.8%
... | ... | 20.4 | 91 | 99.9% | 20.4 | 0.51% | 11% | 17.5 | 438 | 94.1%
... | ... plasma | 27.9 | 89 | 99.8% | 27.9 | 0.53% | 24% | 20.5 | 567 | 97.0%
... | ... plasma | 37.5 | 90 | 99.8% | 37.4 | 0.50% | 25% | 27.3 | 729 | 98.2%
... | ... | 18.1 | 91 | 99.9% | 18.1 | 0.44% | 11% | 15.7 | 487 | 96.4%
... | ... | 26.7 | 90 | 99.9% | 26.6 | 0.47% | 10% | 23.4 | 601 | 97.8%
... | ... plasma | 33.3 | 91 | 99.8% | 33.2 | 0.43% | 21% | 25.7 | 782 | 98.4%
... | ... plasma | 27.7 | 91 | 99.9% | 27.7 | 0.42% | 17% | 22.4 | 641 | 97.9%
... | ... | 20.6 | 91 | 99.8% | 20.6 | 0.43% | 12% | 17.8 | 544 | 97.7%
... | ... | 26 | 91 | 99.9% | 26.0 | 0.43% | 13% | 22.2 | 706 | 97.8%
... | ... plasma | 24.7 | 90 | 99.8% | 24.6 | 0.47% | 33% | 16 | 391 | 97.2%
... | ... plasma | 32.2 | 91 | 99.9% | 32.1 | 0.43% | 23% | 24.2 | 699 | 97.7%
... | ... | 20.6 | 91 | 99.9% | 20.6 | 0.42% | 13% | 17.6 | 554 | 97.8%
... | ... | 18.2 | 91 | 99.7% | 18.2 | 0.41% | 13% | 15.6 | 491 | 96.3%
 | Pre-surgery plasma | 36.8 | 91 | 99.8% | 36.8 | 0.44% | 42% | 20.8 | 471 | 98.2%
 | Post-surgery plasma | 32.5 | 91 | 99.8% | 32.4 | 0.45% | 23% | 24.5 | 667 | 98.1%
ESCC07 | WBC | 10.1 | 93 | 99.8% | 10.1 | 0.42% | 14% | 8.5 | 312 | 93.7%
 | Tumor tissue | 63.9 | 93 | 99.9% | 63.8 | 0.42% | 22% | 48.5 | 1,760 | 99.4%
 | Post-surgery plasma | 92.8 | 94 | 99.8% | 92.6 | 0.38% | 46% | 48.5 | 1,471 | 98.8%
ESCC08 | WBC | 22.3 | 91 | 99.9% | 22.3 | 0.42% | 14% | 18.9 | 679 | 99.8%
 | Tumor tissue | 23.4 | 89 | 99.9% | 23.4 | 0.51% | 12% | 20.1 | 740 | 99.6%
 | Pre-surgery plasma | 29.9 | 89 | 99.9% | 29.9 | 0.51% | 27% | 21.4 | 660 | 99.8%
 | Post-surgery plasma | 33.3 | 92 | 99.9% | 33.3 | 0.42% | 26% | 23.9 | 717 | 96.7%
ESCC09 | WBC | 24.3 | 91 | 99.9% | 24.3 | 0.42% | 15% | 20.2 | 640 | 98.2%
 | Tumor tissue | 20.2 | 91 | 99.9% | 20.1 | 0.41% | 12% | 17.5 | 560 | 97.6%
 | Pre-surgery plasma | 24.1 | 92 | 99.9% | 24.1 | 0.39% | 21% | 18.7 | 559 | 97.6%
 | Post-surgery plasma | 37.7 | 92 | 99.9% | 37.7 | 0.39% | 24% | 28 | 808 | 96.9%
ESCC10 | WBC | 17.3 | 91 | 99.9% | 17.3 | 0.43% | 14% | 14.6 | 448 | 94.1%
 | Tumor tissue | 15.5 | 90 | 99.9% | 15.5 | 0.45% | 10% | 13.7 | 427 | 95.6%
 | Pre-surgery plasma | 42 | 92 | 99.9% | 42.0 | 0.42% | 29% | 28.9 | 839 | 98.0%
 | Post-surgery plasma | 34.8 | 89 | 99.7% | 34.7 | 0.50% | 23% | 26 | 752 | 98.1%
ESCC11 | WBC | 13 | 95 | 99.7% | 13.0 | 0.26% | 5% | 12 | 211 | 87.7%
 | Tumor tissue | 67.7 | 93 | 99.9% | 67.6 | 0.41% | 21% | 52.7 | 1,939 | 99.5%
 | Post-surgery plasma | 92.7 | 94 | 99.9% | 92.6 | 0.39% | 36% | 57.7 | 1,932 | 99.4%
ESCC12 | WBC | 32.9 | 94 | 99.9% | 32.9 | 0.44% | 19% | 25.5 | 618 | 88.5%
 | Tumor tissue | 16 | 93 | 100.0% | 16.0 | 0.38% | 12% | 13.7 | 454 | 84.8%
 | Pre-surgery plasma | 52.3 | 92 | 99.9% | 52.2 | 0.46% | 31% | 34.4 | 725 | 92.9%
 | Post-surgery plasma | 30.4 | 91 | 99.5% | 30.2 | 0.42% | 28% | 21.1 | 546 | 97.6%
ESCC13 | WBC | 17.6 | 91 | 99.9% | 17.6 | 0.49% | 11% | 15.4 | 469 | 95.8%
... | ... | 16.1 | 93 | 99.9% | 16.1 | 0.48% | 12% | 13.6 | 382 | 82.5%
... | ... plasma | 34.6 | 91 | 99.9% | 34.6 | 0.45% | 38% | 20.8 | 520 | 98.1%
... | ... plasma | 25 | 91 | 99.8% | 24.9 | 0.42% | 17% | 20.1 | 586 | 97.5%
... | ... | 22 | 91 | 99.9% | 22.0 | 0.41% | 14% | 18.6 | 593 | 97.9%
... | ... | 19.2 | 91 | 99.9% | 19.2 | 0.41% | 12% | 16.5 | 536 | 96.9%
... | ... plasma | 32.7 | 91 | 99.9% | 32.7 | 0.42% | 37% | 20.1 | 478 | 97.9%
... | ... plasma | 38 | 92 | 99.9% | 38.0 | 0.41% | 22% | 28.9 | 827 | 98.2%
... | ... | 20.4 | 91 | 99.9% | 20.4 | 0.43% | 13% | 17.3 | 545 | 97.8%
... | ... | 21 | 90 | 99.9% | 20.9 | 0.46% | 11% | 18.2 | 568 | 97.7%
... | ... plasma | 41.3 | 91 | 99.9% | 41.3 | 0.43% | 36% | 25.8 | 663 | 98.6%
... | ... plasma | 36.5 | 92 | 99.9% | 36.4 | 0.41% | 21% | 28 | 849 | 98.3%
... | ... | 20.2 | 91 | 99.9% | 20.2 | 0.40% | 13% | 17.2 | 570 | 97.7%
... | ... | 20.2 | 90 | 99.9% | 20.2 | 0.46% | 13% | 17.3 | 542 | 97.3%
... | ... plasma | 32.4 | 91 | 99.8% | 32.3 | 0.44% | 30% | 22 | 554 | 97.9%
... | ... plasma | 33.6 | 92 | 99.9% | 33.6 | 0.40% | 20% | 26.2 | 751 | 97.0%
... | ... | 16.8 | 91 | 99.9% | 16.8 | 0.45% | 16% | 13.6 | 420 | 91.8%
... | ... | 18 | 92 | 99.9% | 18.0 | 0.41% | 14% | 15.1 | 474 | 86.6%
... | ... plasma | 23.5 | 93 | 99.8% | 23.4 | 0.45% | 18% | 18.5 | 312 | 83.8%
... | ... plasma | 33.6 | 92 | 99.8% | 33.5 | 0.40% | 20% | 26.2 | 779 | 98.0%
Supplementary Table S4. Details of all somatic mutations. Coding sequence mutations, amino acid changes, CADD score, ALT read counts and MAF for tumor, pre-surgery plasma, post-surgery plasma, and WBC are listed.
Patient ID | Stage | PFS (m) | Gene | CDS mutation | Amino acid mutation | CADD score | Tumor tissue ALT | Tumor tissue MAF | Pre-cfDNA ALT | Pre-cfDNA MAF | Post-cfDNA ALT | Post-cfDNA MAF | WBC ALT | WBC MAF
ESCC01 | PT2N0M0 G2-3 IIA | 5 | TP53 | c.673-2A>G | . | 23 | 76 | 44% | 6 | 2.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.1036G>T | p.Glu346* | 44 | 61 | 25% | 11 | 2.2% | 1 | 0.2% | 0 | 0.0%
 |  |  | EPHA3 | c.175G>C | p.Asp59His | 26 | 47 | 19% | 1 | 0.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | RB1 | c.2206C>T | p.Gln736* | 48 | 163 | 58% | 10 | 2.3% | 0 | 0.0% | 1 | 0.3%
 |  |  | NOTCH1 | c.4672dupG | p.Leu1559fs | 35 | 182 | 58% | 23 | 4.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | NOTCH1 | c.1070T>C | p.Phe357Ser | 29 | 51 | 14% | 9 | 2.1% | 0 | 0.0% | 0 | 0.0%
 |  |  | DSCAM | c.1076dupA | p.Asn359fs | 30 | 116 | 21% | 8 | 0.9% | 1 | 0.1% | 0 | 0.0%
 |  |  | LRP1B | c.7387+88A>G | . | 2 | 54 | 40% | 0 | 0.0% | 1 | 0.7% | 0 | 0.0%
 |  |  | ETV6 | c.329-72C>T | . | 4 | 41 | 21% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | CIC | c.7059G>A | p.Glu2353Glu | 0 | 43 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ROS1 | c.5248+2704G>T | . | 2 | 42 | 15% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | SMARCA4 | c.1761+30C>T | . | 2 | 76 | 19% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
ESCC02 | PT3N0M0 G2 IIB | >24 | KMT2D | c.12823C>T | p.Gln4275* | 41 | 70 | 44% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.733G>A | p.Gly245Ser | 35 | 126 | 43% | 2 | 0.4% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.839G>C | p.Arg280Thr | 33 | 115 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | RRM1 | c.1601C>T | p.Thr534Ile | 30 | 169 | 26% | 1 | 0.1% | 0 | 0.0% | 0 | 0.0%
 |  |  | LRP2 | c.1601dupG | p.Glu535fs | 35 | 305 | 28% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | NOTCH1 | c.1359_1361delCAA | p.Asn454del | 19 | 254 | 63% | 2 | 0.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | KMT2D | c.14119C>G | p.Pro4707Ala | 18 | 83 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ARID2 | c.4922+102G>C | . | 0 | 36 | 25% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | KITLG | c.521-107G>C | . | 16 | 23 | 13% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | RICTOR | n.-1T>G | . | 2 | 126 | 50% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ARFRP1 | c.346+8_346+43delCGGCCTGGCTGCTCTGGGAGGGATGGATGGTGAGTG | . | 11 | 39 | 11% | 2 | 0.4% | 0 | 0.0% | 1 | 0.2%
 |  |  | CSK | c.129+13C>T | . | 9 | 77 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | TNFSF11 | c.434-13C>A | . | 7 | 119 | 16% | 0 | 0.0% | 1 | 0.1% | 1 | 0.1%
ESCC03 | PT2N0M0 ... | >24 | KMT2B | c.6859C>T | p.Arg2287Trp | 31 | 30 | 17% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | TSC1 | c.2117G>A | p.Arg706His | 35 | 35 | 16% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | MS4A1 | c.191T>G | p.Ile64Ser | 24 | 66 | 25% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.614A>G | p.Tyr205Cys | 24 | 650 | 74% | 1 | 0.1% | 0 | 0.0% | 0 | 0.0%
 |  |  | SRC | c.1262A>G | p.Glu421Gly | 33 | 24 | 13% | 0 | 0.0% | 1 | 0.2% | 0 | 0.0%
 |  |  | MSH2 | c.1892G>A | p.Arg631Lys | 14 | 162 | 27% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | LRP2 | c.7252G>C | p.Glu2418Gln | 0 | 107 | 13% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | NPM1 | c.507A>C | p.Glu169Asp | 0 | 58 | 11% | 0 | 0.0% | 1 | 0.1% | 0 | 0.0%
 |  |  | SETD2 | c.4586+132G>T | . | 1 | 57 | 56% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | MST1R | c.2183+82G>C | . | 10 | 98 | 73% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | PPP2R1A | c.1129-36T>C | . | 3 | 117 | 43% | 1 | 0.2% | 0 | 0.0% | 0 | 0.0%
 |  |  | BCR | n.-1C>G | . | 3 | 98 | 34% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ARID1A | c.4101+74G>T | . | 0 | 27 | 13% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
... | ... | 24 | TP53 | c.223_229delCCTGCAC | p.Pro75fs | 20 | 197 | 76% | 7 | 1.7% | 0 | 0.0% | 0 | 0.0%
 |  |  | LIMK1 | c.691G>A | p.Val231Ile | 25 | 155 | 26% | 9 | 0.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | PIK3CA | c.1635G>T | p.Glu545Asp | 24 | 201 | 27% | 19 | 2.9% | 1 | 0.2% | 0 | 0.0%
 |  |  | FBXW7 | c.1591G>A | p.Glu531Lys | 27 | 410 | 52% | 2 | 0.2% | 0 | 0.0% | 0 | 0.0%
 |  |  | NFE2L2 | c.235G>A | p.Glu79Lys | 34 | 386 | 42% | 1 | 0.1% | 0 | 0.0% | 0 | 0.0%
 |  |  | PTPRD | c.583G>C | p.Glu195Gln | 23 | 124 | 12% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | CYP3A4 | c.395T>A | p.Leu132* | 35 | 150 | 14% | 0 | 0.0% | 0 | 0.0% | 1 | 0.1%
 |  |  | AMER1 | c.2887C>T | p.Pro963Ser | 1 | 190 | 17% | 13 | 1.8% | 0 | 0.0% | 0 | 0.0%
 |  |  | ATR | c.7503+178G>C | . | 7 | 40 | 33% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | IGF1R | c.640+92G>A | . | 3 | 26 | 15% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | SMO | c.537+14G>T | . | 5 | 58 | 19% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | HGF | c.2011-50C>A | . | 1 | 70 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | KMT2C | n.-1G>A | . | 6 | 71 | 20% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | BCR | c.1280-8054C>T | . | 3 | 83 | 21% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | PTPRD | c.2128+18C>T | . | 13 | 118 | 13% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ABCC4 | c.2807-9C>G | . | 9 | 265 | 30% | 12 | 1.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | ETV6 | c.464-2686G>C | . | 2 | 505 | 36% | 9 | 0.6% | 0 | 0.0% | 0 | 0.0%
... | ... | 24 | TP53 | c.364_365insT | p.Thr123fs | 35 | 127 | 85% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | NOTCH1 | c.867_868insC | p.Gln290fs | 29 | 395 | 83% | 3 | 0.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | SMAD2 | n.*584C>T | . | 2 | 33 | 24% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | AXL | c.1926+62G>A | . | 2 | 58 | 27% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | KMT2D | c.*636delA | . | 1 | 78 | 30% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | TOP1 | c.336-86T>G | . | 5 | 66 | 22% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | IGF2R | c.4948-101T>A | . | 3 | 81 | 18% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | RARA | c.178+1107A>G | . | 15 | 151 | 19% | 0 | 0.0% | 1 | 0.2% | 0 | 0.0%
 |  |  | ROS1 | c.5249-22T>C | . | 3 | 622 | 54% | 2 | 0.5% | 0 | 0.0% | 1 | 0.2%
ESCC06 | PT3N0M0 G2 IIA | >24 | TP53 | c.551_554delATAG | p.Asp184fs | 33 | 34 | 11% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | E2F1 | c.605G>A | p.Arg202Gln | 19 | 29 | 13% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | DSCAM | c.934+69C>A | . | 2 | 21 | 12% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
ESCC07 | PT3N0M0 G1 IIA | 12 | PTEN | c.800delA | p.Lys267fs | 28 | 189 | 14% | na | na | 2 | 0.3% | 0 | 0.0%
 |  |  | TP53 | c.920-1G>T | . | 25 | 234 | 14% | na | na | 1 | 0.5% | 1 | 0.3%
 |  |  | TNFAIP3 | c.441C>G | p.Leu147Leu | 5 | 254 | 11% | na | na | 0 | 0.0% | 1 | 0.3%
ESCC08 | PT3N1M0 G1 IIIB | >24 | TP53 | c.770_782+10delTGGAAGACTCCAGGTCAGGAGCC | p.Leu257fs | 33 | 55 | 18% | 4 | 0.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | CREBBP | c.4336C>T | p.Arg1446Cys | 27 | 98 | 19% | 3 | 0.6% | 0 | 0.0% | 0 | 0.0%
 |  |  | NOTCH1 | c.928G>A | p.Gly310Arg | 27 | 148 | 27% | 4 | 0.6% | 0 | 0.0% | 0 | 0.0%
 |  |  | KMT2D | c.9730delG | p.Glu3244fs | 35 | 93 | 13% | 2 | 0.2% | 0 | 0.0% | 0 | 0.0%
 |  |  | STAT3 | c.1279G>A | p.Asp427Asn | 29 | 51 | 10% | 3 | 0.5% | 0 | 0.0% | 0 | 0.0%
 |  |  | HIF1A | c.1860A>C | p.Gln620His | 0 | 155 | 10% | 4 | 0.4% | 1 | 0.1% | 1 | 0.1%
 |  |  | STK3 | n.-1G>A | . | 10 | 45 | 15% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | MERTK | c.2190-759C>A | . | 6 | 51 | 13% | 0 | 0.0% | 1 | 0.2% | 0 | 0.0%
 |  |  | FLT3 | c.2653+133C>A | . | 5 | 13 | 11% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
ESCC09 | PT4aN0M0 G2 IIIB | >24 | CDKN2A | c.316+1G>T | . | 27 | 42 | 29% | 4 | 2.6% | 0 | 0.0% | 0 | 0.0%
 |  |  | CDKN2A | c.172C>T | p.Arg58* | 35 | 69 | 27% | 6 | 1.8% | 0 | 0.0% | 0 | 0.0%
 |  |  | NOTCH1 | c.4646G>T | p.Cys1549Phe | 29 | 94 | 28% | 7 | 1.8% | 2 | 0.3% | 1 | 0.2%
 |  |  | EPHB2 | c.548G>T | p.Gly183Val | 23 | 130 | 21% | 7 | 0.9% | 0 | 0.0% | 1 | 0.1%
 |  |  | PTEN | c.692_695dupCCAC | p.Arg233fs | 35 | 145 | 23% | 10 | 1.4% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.643A>G | p.Ser215Gly | 29 | 269 | 29% | 22 | 2.3% | 0 | 0.0% | 1 | 0.1%
 |  |  | PTCH1 | c.182A>C | p.Glu61Ala | 19 | 293 | 30% | 12 | 0.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | PDGFRB | c.996G>T | p.Arg332Arg | 12 | 72 | 27% | 3 | 0.8% | 0 | 0.0% | 0 | 0.0%
 |  |  | ETV5-AS1 | n.-1delC | . | 2 | 171 | 22% | 6 | 0.9% | 0 | 0.0% | 2 | 0.3%
 |  |  | MTOR | c.4329+67C>G | . | 2 | 28 | 12% | 2 | 0.8% | 0 | 0.0% | 0 | 0.0%
... | ... | 24 | CDKN2A | c.172C>T | p.Arg58* | 35 | 95 | 66% | 6 | 1.4% | 0 | 0.0% | 0 | 0.0%
 |  |  | ROCK1 | c.3441_3442insA | p.Lys1148fs | 36 | 107 | 65% | 9 | 2.6% | 0 | 0.0% | 0 | 0.0%
 |  |  | TYRO3 | c.409+1G>A | . | 25 | 74 | 32% | 13 | 2.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | SETD2 | c.6318G>T | p.Lys2106Asn | 29 | 205 | 64% | 21 | 3.5% | 0 | 0.0% | 0 | 0.0%
 |  |  | TP53 | c.481G>A | p.Ala161Thr | 27 | 213 | 66% | 28 | 2.8% | 1 | 0.1% | 1 | 0.2%
 |  |  | CCND1 | c.415-1G>A | . | 24 | 143 | 28% | 11 | 2.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | PIK3CB | c.2027G>T | p.Trp676Leu | 34 | 113 | 21% | 15 | 2.2% | 1 | 0.1% | 1 | 0.3%
 |  |  | TSC1 | n.-1G>C | . | 4 | 59 | 47% | 3 | 1.9% | 0 | 0.0% | 0 | 0.0%
 |  |  | BLK | c.953-92G>A | . | 3 | 21 | 21% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | MRE11A | c.1572+139G>A | . | 13 | 25 | 25% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | EPHA5 | c.1691C>T | p.Ala564Val | 18 | 101 | 22% | 15 | 1.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | ERBB4 | c.3487C>G | p.Leu1163Val | 5 | 44 | 12% | 5 | 1.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | ERBB2 | c.3033C>T | p.Asp1011Asp | 18 | 34 | 11% | 6 | 0.8% | 2 | 0.3% | 0 | 0.0%
 |  |  | EPHA4 | c.1319-15G>T | . | 1 | 71 | 41% | 4 | 0.6% | 0 | 0.0% | 0 | 0.0%
 |  |  | ETV6 | c.164-14646G>A | . | 2 | 38 | 20% | 4 | 0.7% | 0 | 0.0% | 0 | 0.0%
 |  |  | BCOR | c.1887C>T | p.Asn629Asn | 2 | 73 | 36% | 12 | 2.9% | 1 | 0.3% | 0 | 0.0%
 |  |  | EPHA4 | c.1319-7C>G | . | 0 | 118 | 55% | 8 | 1.2% | 0 | 0.0% | 0 | 0.0%
 |  |  | RB1 | c.718+52T>G | . | 15 | 56 | 17% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | C8orf34 | c.736+8083G>C | . | 1 | 77 | 22% | 8 | 1.7% | 0 | 0.0% | 0 | 0.0%
 |  |  | TMPRSS2 | c.557-2350C>T | . | 3 | 445 | 68% | 30 | 2.0% | 0 | 0.0% | 0 | 0.0%
... | ... | 4 | KMT2C | c.569G>A | p.Arg190Gln | 24 | 179 | 15% | na | na | 2 | 0.2% | 0 | 0.0%
 |  |  | JAK3 | c.2134G>A | p.Gly712Ser | 34 | 243 | 12% | na | na | 1 | 0.0% | 0 | 0.0%
 |  |  | PIK3CG | c.2857A>G | p.Met953Val | 27 | 344 | 16% | na | na | 3 | 0.2% | 0 | 0.0%
 |  |  | TP53 | c.844C>T | p.Arg282Trp | 33 | 559 | 27% | na | na | 0 | 0.0% | 0 | 0.0%
 |  |  | ETV1 | c.224-3767G>T | . | 15 | 427 | 14% | na | na | 2 | 0.1% | 0 | 0.0%
... | ... | 24 | TP53 | c.643A>C | p.Ser215Arg | 27 | 143 | 28% | 3 | 0.3% | 0 | 0.0% | 0 | 0.0%
 |  |  | FANCA | c.1226-1G>C | . | 23 | 53 | 13% | 2 | 0.2% | 0 | 0.0% | 0 | 0.0%
... | ... | 24 | TP53 | c.844C>T | p.Arg282Trp | 33 | 84 | 22% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%
 |  |  | KDM5C | c.1927T>C | p.Ser643Pro | 27 | 19 | 16% | 0 | 0.0% | 0 | 0.0% | 0 | 0.0%