University of Groningen Next generation sequencing guided molecular diagnostic tests in non-small-cell lung cancer Wei, Jiacong


Next generation sequencing guided molecular diagnostic tests in non-small-cell lung cancer

Wei, Jiacong

DOI:

10.33612/diss.101317239

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2019

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Wei, J. (2019). Next generation sequencing guided molecular diagnostic tests in non-small-cell lung cancer. Rijksuniversiteit Groningen. https://doi.org/10.33612/diss.101317239

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

in revision to Journal of Molecular Diagnostics

Jiacong Wei¹,⁴; Anna A. Rybczynska¹; Pei Meng²,⁴; Martijn Terpstra¹; Ali Saber²; Jantine Sietsema²; Wim Timens²; Ed Schuuring²; T. Jeroen N. Hiltermann³; Harry J.M. Groen³; Anthonie J. van der Wekken³; Anke van den Berg²; Klaas Kok¹*

1 University of Groningen, University Medical Centre Groningen, Department of Genetics, Groningen, The Netherlands

2 University of Groningen, University Medical Centre Groningen, Department of Pathology and Medical Biology, Groningen, The Netherlands

3 University of Groningen, University Medical Centre Groningen, Department of Pulmonary Diseases, Groningen, The Netherlands

4 Department of Pathology, Collaborative and Creative Centre, Shantou University Medical College, Shantou, Guangdong, China

* Corresponding author

An all-in-one transcriptome-based assay to identify therapy-guiding genomic aberrations in non-small cell lung cancer patients

Abstract

The number of genomic aberrations relevant for therapeutic decisions for non‐small cell lung cancer patients has increased in the past decade. Reliably testing for these aberrations often requires multiple molecular tests, which is a challenge because tissue specimens are generally small. To optimize diagnostic testing, we developed a transcriptome‐based next generation sequencing (NGS) assay based on single primed enrichment technology. We interrogated cell lines and patient‐derived frozen biopsies, pleural effusion and FFPE samples. All clinical samples were selected based on previously identified mutations at the DNA level in EGFR, KRAS, ALK, PIK3CA, BRAF, AKT1, MET, NRAS or ROS1, or FISH breaks and/or IHC positivity for ALK, ROS1, RET and NTRK1. The number of unique reads depended on both material type and RNA quality, as indicated by the DV200 value. In 26 samples with >50K unique reads and a DV200 >50, all 17 SNVs/INDELs, 2 MET exon 14 skipping events and 14 fusion gene transcripts were detected at the RNA level, giving a test accuracy of 100%. In summary, this lung cancer specific all‐in‐one transcriptome‐based assay for simultaneous detection of mutations and fusion genes is highly sensitive and effective on both FFPE and frozen tissue.

Background

Lung cancer accounts for approximately 27% of all cancer‐related deaths worldwide¹. The discovery of targetable driver genes in a subset of non‐small cell lung cancer (NSCLC) patients shaped personalized targeted therapies and prolonged patient survival²⁻⁵. Therapeutically relevant aberrations include, amongst others, activating mutations in EGFR, KRAS, BRAF and MET, and fusion genes leading to activation of ALK, ROS1, RET and NTRK. Multiple different diagnostic tests are required to reliably identify these aberrations in clinical settings. The most commonly used techniques to detect these aberrations are targeted DNA sequencing to detect single nucleotide variants (SNVs) and small insertions and deletions (INDELs), fluorescence in situ hybridization (FISH) to detect chromosomal breaks, and immunohistochemistry (IHC) to detect aberrant expression of ALK and ROS1⁶⁻¹⁰.

A recurrent problem, especially for advanced‐stage NSCLC patients, is the low tumour content of the biopsies. This hampers comprehensive molecular testing, in which a combination of different tests is used to reliably screen for genomic aberrations in all known driver genes. To overcome this limitation, a comprehensive test that identifies all types of therapeutically relevant aberrations in a single assay is needed. In several studies, a combination of DNA‐ and RNA‐based next generation sequencing (NGS) tests was applied, limiting the screening to two parallel tests¹¹⁻¹⁵. In a study using a targeted RNA‐based NGS test on frozen cytological samples from lung cancer and thyroid cancer, both fusion genes and mutations were simultaneously identified¹⁶. However, currently available all‐in‐one approaches covering all different types of aberrations are not yet commonly applied in routine clinical settings.

In this study, we designed a lung cancer specific targeted all‐in‐one transcriptome‐based assay based on the single primed enrichment technology (SPET) to simultaneously identify mutations, gene fusions and exon skipping events. The assay covers all gene loci that are currently relevant for selecting the optimal targeted therapy in advanced stage NSCLC patients and does not require prior knowledge of the fusion gene partners. We tested the effectiveness of our comprehensive assay in samples with known aberrations, based either on the literature (i.e. cell lines) or on our routine molecular diagnostic test results. We specifically tested feasibility on formalin fixed paraffin embedded (FFPE) tissue samples.

Materials and methods

Sample information

We included 6 lung cancer‐derived cell lines, 5 cell lines derived from other cancer types with specific genomic aberrations relevant for lung cancer¹⁷⁻²⁹, and 42 tissue samples of patients with known genomic aberrations identified between 2011 and 2017 (Figure 1, Supplementary Table 1) and 5 FFPE tissue samples without known mutations. All cell lines were obtained from the American Type Culture Collection (ATCC). The origin of the patient samples was pleural effusions (PEs) (11 samples from 9 patients), frozen tissues (2 samples from 2 patients), and FFPE tissues (29 samples from 27 patients). PE samples were harvested directly (n=8) or after culturing for two weeks to two months (n=3). Cell lines were cultured in RPMI‐1640; PE samples in DMEM‐F12, both supplemented with 10% fetal bovine serum (FBS) and 5% penicillin/streptomycin following standard culturing protocols.

In total, these 53 samples covered 63 known variants, i.e. 41 SNVs / INDELs, 3 MET exon skipping events and 20 fusion genes (Figure 1). Five lung tumour tissue samples without molecular aberrations according to the routine molecular tests were included as non‐mutated NSCLC cases. All patient samples were obtained from the UMCG pathology biobank and were anonymized for the investigators. The study protocol is consistent with the Research Code of the University Medical Centre Groningen (https://www.rug.nl/umcg/research/documents/research-code-info-umcg-nl.pdf) and national ethical and professional guidelines (“Code of conduct; Dutch federation of biomedical scientific societies”, http://www.federa.org).

RNA isolation

RNA from cell lines and PE samples was isolated using the GeneJET RNA Purification Kit (Thermo Fisher Scientific, Waltham, USA); RNA from frozen tissue samples was isolated using TRIzol (Invitrogen, Waltham, USA). The RNeasy FFPE kit (QIAGEN GmbH, Hilden, Germany) was used for RNA isolation from total tissue sections of FFPE tissue samples, without enrichment of tumour cell rich areas. For all kits, isolation procedures were done according to the manufacturer’s protocol. The quality of the RNA samples was analysed using a Fragment Analyzer (Advanced Analytical, Ames, USA). The obtained DV200 value indicates the percentage of RNA fragments longer than 200 nucleotides (Supplementary Table 1).

Design of all-in-one lung cancer assay

The target region for the assay was designed to cover all clinically relevant genomic aberrations (Supplementary Table 2). For mutation hotspots, landing probes were designed within 50 nucleotides upstream and downstream of each target region. Target regions included in the assay were based on the routinely used custom‐designed diagnostic amplicon‐based panel, i.e. BRAF (codons 466, 499 and 600), EGFR (codons 790, 858, exon 19 deletion (E19 DEL) regions, and all mutated codons in exons 18‐21), PIK3CA (codons 442, 545 and 1047), KRAS (codons 12, 13 and 61), NRAS (codons 12, 13 and 61), DDR2 (codon 768), AKT1 (codon 17), ERBB2 (exon 20), MAP2K1 (codons 56, 57 and 67), as well as the tyrosine kinase domains of ALK and ROS1. For fusion genes routinely tested by IHC or FISH in molecular diagnostics, i.e. ALK, ROS1, RET and NTRK, and the most frequently observed fusion partner genes, we included the relevant landing probes from the Ovation Fusion Panel Target Enrichment System kit (NuGEN Technologies, San Carlos, USA)¹⁴. These landing probes were close to the boundaries of the exons, facing towards the flanking up‐ and downstream exons. In addition, landing probes were designed at the boundary of exon 13 facing towards exon 14 and at the boundary of exon 15 facing towards exon 14 of the MET gene to detect exon 14 skipping events. Finally, we added landing probes for a selection of housekeeping genes to serve as internal quality controls (Supplementary Table 2).

In the course of this study, we made three versions of our design. Minor changes to the landing probe regions were made in successive versions to obtain better coverage of the hotspot regions, whereas the target regions remained the same. Furthermore, the highly expressed housekeeping genes added in design 1 were replaced by less abundantly expressed housekeeping genes in later designs (Supplementary Table 2). Landing probes in genes relevant for immunotherapy were added in the last design but were not analysed in detail in this study. Probes added to the design for another, unrelated study will not be discussed in this paper.

Library preparation

Library preparation for the SPET procedure was done according to the protocol provided by the manufacturer (NuGEN Technologies, San Carlos, USA). We aimed for an RNA input of 200 ng for non‐FFPE and 500 ng for FFPE samples for library preparation. Briefly, after ds‐cDNA synthesis, adaptors containing an 8‐nt unique barcode and a universal forward primer were ligated to the fragments. The resulting fragments were denatured and landing probes, containing the universal reverse primer, were hybridized overnight, followed by an extension step (Supplementary Figure 1). Subsequently, a test qPCR was done to determine the optimal number of cycles for library amplification. The number of cycles used for the library amplification was 0 to 4 above the cycle threshold determined by the test qPCR, as recommended. After amplification, TapeStation measurement and/or Kapa qPCR were done to determine the molarity of the library. Eight or sixteen libraries were mixed in equimolar amounts and subjected to NGS on a MiSeq platform (Illumina, San Diego, USA) with a 150 bp paired‐end sequencing protocol provided by the manufacturer.

For eight samples, the library was sequenced a second time with 1/8 of the standard input. For three high quality RNA samples, libraries were prepared using three different amounts of RNA input, i.e. 200 ng, 100 ng and 50 ng (Supplementary Table 2).
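The equimolar pooling step described above can be illustrated with a small calculation. This is only a sketch: the function name, the target amount per library and the example molarities are hypothetical and are not taken from the manufacturer's protocol or from this study.

```python
def equimolar_volumes(molarity_nM, fmol_per_library=20.0):
    """Return the volume (in microlitres) of each library that contributes
    the same molar amount to the pool.

    1 nM equals 1 fmol per microlitre, so volume = amount / concentration.
    The default of 20 fmol per library is an arbitrary, assumed value.
    """
    volumes = {}
    for name, conc in molarity_nM.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has no measurable molarity")
        volumes[name] = fmol_per_library / conc
    return volumes


# Hypothetical library molarities in nM (e.g. from TapeStation or qPCR)
libraries = {"lib_A": 4.0, "lib_B": 10.0, "lib_C": 2.5}
pool = equimolar_volumes(libraries)
for name, vol in pool.items():
    print(f"{name}: {vol:.1f} ul")
```

Libraries with a lower measured molarity contribute a proportionally larger volume, so that each library is represented by the same number of molecules in the pool.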

Figure 1. Schematic representation of the 53 samples included in our all‐in‐one‐transcriptome assay and the expected 63 mutations. Shown are the number of samples for each source of tumour material. In the lower boxes the expected number of single nucleotide variants (SNVs) / insertions or deletions (INDEL), MET exon skipping mutations and fusion genes are indicated. (*) Two of the expected variants turned out to be absent. WT: wild type, samples without known mutations.

NGS data analysis

The FASTQ files were processed with an in‐house pipeline. Reads were aligned with Hisat2 to the human genome reference build GRCh37 with decoys from the Genome Analysis Toolkit (GATK) bundle³⁰,³¹. Picard Tools was used for format conversion and marking duplicates, including the random‐barcode information of the reads. We initially performed a manual check using the IGV browser for all known SNVs and INDELs, starting from the aligned reads. In addition, we designed a pipeline for variant detection. HaplotypeCaller was used for integrated calling of the variants for all samples. Variants were annotated using SnpEff/SnpSift with the Ensembl release 75 gene annotations and the dbNSFP2.7 database, dbSNP 138, COSMIC v72, 1000 Genomes phase 3 and the ExAC 0.3

and INDELs is an adaptation of the GATK workflow and uses molgenis compute as workflow management software. The data were filtered for quality metrics similar to GATK recommendations and with custom filters for population frequency and variant effect. Synonymous mutations, variants present in the 1000 Genomes Project at a frequency of more than 2%, variants with fewer than 3 altered read counts or a variant allele frequency (VAF) of less than 5%, and variants with a CADD score of less than 20 were filtered out³⁶. Fusion gene detection was done with FusionCatcher and Strand NGS software (Strand Genomics, San Francisco, USA)³⁷. We focused fusion gene analyses on ALK, ROS1, RET and NTRK. We only reported in‐frame fusion transcripts with the tyrosine kinase domain of the indicated genes and in which the sum of spanning and split read counts was at least five. Recurrent mutations that exclusively occur at the end of the reads were excluded, as they most likely represent technical artefacts. Reads that could only be aligned to part of the fusion gene region, i.e. indicative of a fusion gene breakpoint split read, were subjected to BLAT analysis to identify the fusion gene partner³⁸.
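The custom variant filters listed above can be expressed as a single predicate. The following is a minimal sketch, not the in‐house pipeline itself: the `Variant` record, its field names and the effect label `"synonymous_variant"` are assumptions made for illustration, and the GATK‐style quality filters are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    effect: str        # predicted effect, e.g. "missense_variant"
    af_1000g: float    # allele frequency in the 1000 Genomes Project (0 to 1)
    alt_reads: int     # reads supporting the variant allele
    total_reads: int   # total reads at the position
    cadd: float        # CADD score

    @property
    def vaf(self) -> float:
        """Variant allele frequency; 0.0 when the position is not covered."""
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_filters(v: Variant) -> bool:
    """Apply the thresholds stated in the text; True means the variant is kept."""
    if v.effect == "synonymous_variant":
        return False                      # synonymous mutations are removed
    if v.af_1000g > 0.02:
        return False                      # population frequency above 2%
    if v.alt_reads < 3 or v.vaf < 0.05:
        return False                      # fewer than 3 altered reads or VAF below 5%
    if v.cadd < 20:
        return False                      # CADD score below 20
    return True


# Hypothetical examples
kept = Variant("missense_variant", 0.0, 40, 200, 28.0)
low_support = Variant("missense_variant", 0.0, 2, 10, 30.0)
print(passes_filters(kept), passes_filters(low_support))
```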

Detection of fusion gene transcripts by NanoString

Using the Lung Panel Gene Fusion (NanoString technologies, Seattle, USA), we detected fusion transcripts in ALK, RET and ROS1. A total amount of 100‐200 ng RNA was hybridized overnight following the protocol of the manufacturer. The next day, samples were loaded on streptavidin‐coated cartridges and analysed on an nCounter® SPRINT Profiler (NanoString technologies, Seattle, USA). The raw barcode counts were background adjusted with a Truncated Poisson correction using negative control spikes and normalized relative to the positive control spikes. Samples with good hybridization quality, as determined by good signals for housekeeping genes and values below 30 for negative controls, were included for calling fusion transcripts. A t‐test between the 3’ and 5’ probe counts was applied to identify imbalanced probes. Presence of a fusion transcript is defined as positive based on the following criteria: the p‐value for the 3’ and 5’ count difference is <0.01; the 3’/5’ ratio and the 3’/negative control count ratio are both >1.5; the absolute counts of fusion‐specific probes are >20 (ALK and ROS1) or >30 (RET); counts of the fusion‐specific probe are >2x SD of the mean probe count across the gene, except for outlier counts, which are above the Upper Tukey fence (Q3 + 1.5*IQR).
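As an illustration, the first three calling criteria can be combined into one decision function. This sketch makes several assumptions that are not stated in the text: that all criteria must hold simultaneously, that the t‐test p‐value is computed elsewhere and passed in, and that ratios are evaluated on background‐corrected counts. The SD and Tukey‐fence criterion is deliberately left out because its exact definition cannot be derived from the description. All names are hypothetical.

```python
# Absolute count thresholds for fusion-specific probes, as given in the text.
ABS_COUNT_THRESHOLD = {"ALK": 20, "ROS1": 20, "RET": 30}


def call_fusion(gene, p_value, count_3p, count_5p,
                negative_control, fusion_probe_count):
    """Return True when the first three criteria are all met (assumed AND logic).

    p_value            -- t-test p-value for the 3'/5' count difference
    count_3p, count_5p -- mean counts of the 3' and 5' probes
    negative_control   -- negative control count
    fusion_probe_count -- count of the fusion-specific probe
    """
    if gene not in ABS_COUNT_THRESHOLD:
        raise ValueError(f"no threshold defined for gene {gene!r}")
    imbalance_significant = p_value < 0.01
    # Ratios are written as products to avoid division by zero counts.
    ratio_3p_5p_ok = count_3p > 1.5 * count_5p
    ratio_3p_neg_ok = count_3p > 1.5 * negative_control
    abs_count_ok = fusion_probe_count > ABS_COUNT_THRESHOLD[gene]
    return (imbalance_significant and ratio_3p_5p_ok
            and ratio_3p_neg_ok and abs_count_ok)


# Hypothetical counts
print(call_fusion("ALK", 0.001, 300, 50, 10, 45))
print(call_fusion("RET", 0.001, 300, 50, 10, 25))
```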

Variant detection by droplet digital (dd) PCR

We applied ddPCR (Bio‐Rad, Hercules, USA) on cDNA to quantify expression of the mutant allele of the most commonly observed mutations, e.g. T790M, L858R, E19 DEL in EGFR as well as G12A, G12D, G12F in KRAS. For RNA samples from cell lines, cDNA was synthesized with the RevertAid H Minus First Strand cDNA Synthesis Kit (Thermo Scientific, #K1631). For RNA from clinical FFPE tissues and pleural effusions, cDNA was synthesized with the iScript cDNA Synthesis Kit (Bio‐Rad, Hercules, USA). Negative and positive control samples were included for each variant in all experiments. For good quality samples, RNA input was 1‐2 ng; for other samples the input varied between 5 and 479 ng (Supplementary Table 3 and Supplementary Table 4). Reaction mixes included 11 μl ddPCR Supermix for probes and 1 μl mutation assay in a final volume of 22 μl. Droplets were generated using the QX100 droplet generator after addition of 70 μl droplet generation oil (Bio‐Rad, Hercules, USA). PCR was performed on a T100 Thermal Cycler (Bio‐Rad, Hercules, USA), using the following PCR conditions: 95°C for 10 minutes, 39 cycles of 95°C for 30 seconds, 59°C (KRAS) or 55°C (EGFR and ALK) for 60 seconds, 72°C for 15 seconds, 98°C for 10 minutes, followed by cooling down to 4°C. The temperature ramp rate was 2°C per second for all steps. Droplets were counted on a QX‐100 droplet reader (Bio‐Rad, Hercules, USA) and data were analysed by Quantasoft software version 1.6.6 (Quantasoft, Prague, Czech Republic) for detection of FAM and HEX signals.

For EGFR T790M, forward and reverse primers were 3’‐CAAGGAAATCCTCGATGAAGCC‐5’ and 3’‐GTCTTTGTGTTCCCGGACATAGT‐5’, with a HEX‐labelled wild type probe 3’‐ATGAGCTGCGTGATGAG‐5’ and a FAM‐labelled mutant probe 3’‐ATGAGCTGCATGATGAG‐5’. For EGFR L858R, forward and reverse primers were 3’‐GCAGCATGTCAAGATCACAGATT‐5’ and 3’‐CATCCACTTGATAGGCACTTTGC‐5’, with a HEX‐labelled wild type probe 3’‐AGTTTGGCCAGCCCAA‐5’ and a FAM‐labelled mutant probe 3’‐AGTTTGGCCCGCCCAA‐5’. For EGFR exon 19 deletions, primers used were 3’‐GTGAGAAAGTTAAAATTCCCGTC‐5’ and 3’‐TGGCCATCACGTAGGCTTC‐5’, with a FAM‐labelled probe covering the deleted part of exon 19, 3’‐AAGGAATTAAGAGAAGCAACATCTCC‐5’, and a wild type HEX control probe upstream of the commonly deleted region, 3’‐ATCGAGGATTTCCTTGTTGGCT‐5’. Primer and probe details of KRAS were according to the literature³⁹.

Data were analysed using Bio‐Rad QuantaSoft™ Analysis Pro (Bio‐Rad, Hercules, USA). The threshold for the mutant droplet signal was set manually. The number of mutant and wild‐type copies was estimated from the Poisson distribution as indicated by the manufacturer. Fractional abundance was calculated as the number of mutant copies divided by the sum of mutant and wild‐type copies.
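The copy number estimate and the fractional abundance can be written out as follows. This is a generic sketch of the standard Poisson correction used in digital PCR, not the vendor software's implementation; the function names and droplet counts are hypothetical. The droplet volume is omitted because it cancels out in the ratio.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of target copies per droplet.

    If a fraction p of droplets is positive, the mean copies per droplet is
    lambda = -ln(1 - p), which accounts for droplets holding several copies.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1 - positive / total)


def fractional_abundance(mutant_positive, wildtype_positive, total):
    """Mutant copies divided by the sum of mutant and wild-type copies."""
    mutant = copies_per_droplet(mutant_positive, total) * total
    wildtype = copies_per_droplet(wildtype_positive, total) * total
    if mutant + wildtype == 0:
        raise ValueError("no positive droplets in either channel")
    return mutant / (mutant + wildtype)


# Hypothetical droplet counts: 300 FAM-positive (mutant) and
# 2700 HEX-positive (wild type) droplets out of 15000 accepted droplets.
fa = fractional_abundance(300, 2700, 15000)
print(f"fractional abundance: {fa:.3f}")
```

Because the correction is non‐linear, the fractional abundance differs slightly from the naive ratio of positive droplet counts, and the difference grows as the wild‐type channel approaches saturation.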

Statistics

To estimate which sample and test conditions were important for an optimal result of the all‐in‐one transcriptome test, linear regression analysis was performed using SPSS (version 23.0, IBM). The number of uniquely aligned sequencing reads was the dependent variable; independent variables were design version, tissue origin (FFPE or non‐FFPE), RNA input, DV200 and the number of PCR cycles used for library preparation. Parameters with p<0.1 in the univariable analyses were subsequently included in the multivariable analysis.

Progression‐free survival (PFS) was calculated from the date of start of treatment to the date of progressive disease on imaging according to RECIST v1.1⁴⁰.
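The PFS definition amounts to a simple date difference. A minimal sketch, assuming PFS is reported in days and using hypothetical dates; censoring of patients without progression is not handled here.

```python
from datetime import date


def pfs_days(start_treatment: date, progression: date) -> int:
    """Days from start of treatment to radiological progression."""
    if progression < start_treatment:
        raise ValueError("progression date precedes start of treatment")
    return (progression - start_treatment).days


# Hypothetical dates, not patient data
example_pfs = pfs_days(date(2016, 3, 1), date(2016, 12, 1))
print(example_pfs)
```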

Results

Sequencing results

All available RNA samples (N=53) were subjected to our assay, irrespective of RNA quality and quantity, to gain insight into the overall performance of our assay. DV200 values were available for 43 samples. The median total reads obtained was 2.4M (range: 1.6M to 5.1M) and the median unique reads was 455K (range: 42K to 1.5M). For additional QC data see Supplementary Table 1.

Mutation detection

Visual inspection of the aligned reads of our transcriptome assay in IGV revealed the presence of 35 of the 44 expected variants with at least 3 mutant reads (Table 1). These included 26 SNVs, 6 EGFR exon 19 INDELs and 3 MET exon 14 skipping mutations. For the latter 3, we observed skipping of MET exon 14 at the transcript level, consistent with the MET exon 14 skipping mutations detected at the DNA level (Figure 3A). Of the nine variants that we did not detect in IGV, we observed no mutant reads for seven cases and one mutant read for two cases (Table 1). One of the two samples with one mutant read, P13_T2 of patient P13, had sufficient total unique reads. This sample was obtained after culture for 2 months, whereas the sample tested in the diagnostic setting, in which two ALK resistance mutations were identified, was analysed without culturing. Both mutations were also present in the PE sample harvested after 1 month of culture (P13_T1). The p.I1171N mutation was present in 81% of the reads and the p.G1269A mutation was present in 18% of the reads. In P13_T2, the ALK p.I1171N resistance mutation was called in almost all reads, whereas the p.G1269A mutation was absent (Figure 3B). This indicates heterogeneity in the initial PE sample, with a strong growth disadvantage of the lineage with the ALK p.G1269A mutation. Thus, P13_T2 is truly negative for the second mutation (p.G1269A). The second sample with one mutant read, i.e. P07, had a total coverage of 3 reads. The seven undetected variants (six SNVs and one EGFR E19 DEL) all had a total read depth below 15 at the position of the expected mutation.

To quantify the expression level of the wild‐type and mutant alleles in the cases where we did not detect the mutation at the transcriptome level, we set up a highly sensitive RNA‐based ddPCR assay. We first tested the efficiency of the ddPCR assay in 13 samples harbouring 14 confirmed SNVs/INDELs using 1 to 8ng of RNA input and detected all SNVs and INDELs. A good correlation was observed between variant allele frequencies (VAFs) as determined by our all‐in‐one assay and the ddPCR assay using the same batch of RNA (R‐squared 0.83) (Figure 2A and Supplementary Table 3).
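The read‐count bookkeeping behind these numbers (a variant counts as observed with at least 3 mutant reads, and VAF is mutant reads divided by total reads) can be sketched as follows. The function names are illustrative and not part of the pipeline described in this chapter; the three example rows are read counts taken from Table 1.

```python
from typing import Optional

MIN_MUTANT_READS = 3  # threshold stated in the text for visual inspection in IGV


def vaf(mutant_reads: int, total_reads: int) -> Optional[float]:
    """Variant allele frequency; None when the position has no coverage."""
    if total_reads == 0:
        return None
    return mutant_reads / total_reads


def is_observed(mutant_reads: int) -> bool:
    """A variant is counted as observed when at least 3 mutant reads support it."""
    return mutant_reads >= MIN_MUTANT_READS


# (sample, variant, mutant reads, total reads) as listed in Table 1
rows = [
    ("P35", "AKT1 p.E17K", 81, 292),
    ("P13_T2", "ALK p.G1269A", 1, 167),
    ("P07", "ALK p.L1196M", 1, 3),
]
for sample, variant, mutant, total in rows:
    frequency = vaf(mutant, total)
    shown = "n/a" if frequency is None else f"{frequency:.0%}"
    print(sample, variant, shown, "observed" if is_observed(mutant) else "not observed")
```

Note that a high VAF alone is not sufficient: P07 shows 1 mutant read out of 3, which is why the minimum mutant read count is checked separately.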

For six samples in which we did not observe the expected variant in our assay, RNA was available for ddPCR. Despite using a high RNA input (497ng), only eight wild‐type and no mutant KRAS p.G12A droplets were detected for patient P32, indicating a very low KRAS expression level. For the other five samples, we did observe mutant droplets with a frequency ranging from 10% to 71%, but again using very high RNA input for the ddPCR (range 25 to 370ng). The number of mutant droplets was still much lower than the numbers observed in the above‐mentioned test samples, for which we used a much lower RNA input. This implies that for these cases the abundance of both the wild‐type and mutant transcripts was below the detection limit of our assay, due to poor RNA quality, low expression of the gene or insufficient unique reads.

Next, we tested the performance of the in‐house‐developed pipeline in calling the variants observed by IGV. The three MET exon 14 skipping events were excluded, as our transcriptome‐based assay does not allow the detection of intronic variants. We analysed all 53 samples with our pipeline and were able to call 29 of the 32 SNVs and INDELs. In addition, our pipeline called four mutations not mentioned in the molecular diagnostics report (Table 1). The three variants that were observed in IGV but not called by the pipeline were (1) AKT1 p.E17K in P35 with 81 mutant reads out of 292 total reads, (2) ALK p.G1269A in P13_T1 with 44 mutant reads out of 240 total reads, and (3) EGFR E19 DEL in P22 with 13 mutant reads out of 38 total reads. Using a second variant caller, Freebayes, we were able to call the AKT1 p.E17K mutation, but not the other two. The failure of our pipeline to call the EGFR E19 DEL was most likely caused by improper alignment of part of the mutant reads (Supplementary Figure 2). In the other samples with EGFR E19 DELs confirmed by our pipeline, the number of reads with the deletion ranged from 20 to 311 in IGV, while the read counts reported by the pipeline ranged from 8 to 128. Apparently, our pipeline missed a subset of the reads containing the EGFR E19 DEL due to improper alignment or short read length. It is unclear why our pipeline did not call the other two variants.

By screening all our samples using this pipeline, including those for which no SNVs were reported, we detected four novel mutations. Three of them were frameshift mutations (NRAS p.G15fs, BRAF p.Y472fs, and KRAS p.L56fs), which are filtered out in the molecular diagnostics tests as not being relevant for therapy decision making. The fourth was an EGFR p.V834L observed in an FFPE sample (P03), in which we also detected a KRAS p.G12A mutation. The KRAS mutation was reported in the molecular diagnostics, whereas the EGFR mutation was not. This could indicate that this mutation is the result of a post‐transcriptional modification event.

Fusion gene detection

We identified fusion transcripts for 15 of the 20 fusions reported by diagnostic tests (clinical samples) or literature (cell lines) using two fusion detection pipelines. No additional chimeric transcripts were identified. By visual inspection of the aligned reads in IGV we observed partly unaligned reads for three additional samples. BLAST of the unaligned sequences indicated the presence of fusion transcripts that were not called by the two pipelines. Thus, of the 20 expected fusion transcripts we confirmed 18 with our all‐in‐one transcriptome assay (13 ALK, 3 ROS1, 1 RET and 1 NTRK1) (Table 2). Besides detecting the target fusion genes, our assay also pinpointed the intron in which the break occurred and the fusion partner in all cases. Eleven of the 13 ALK transcripts correspond to previously published fusion transcripts, i.e. eight EML4_E6‐ALK_E20, two DCTN1_E26‐ALK_E20 and one KIF5B_E24‐ALK_E20. For P36, we observed an uncommon breakpoint region, in intron 18 of the ALK gene, resulting in an EML4_E6‐ALK_E18 fusion transcript. In P42 a novel ALK fusion transcript was identified, i.e. MPRIP_E21‐ALK_E20. The five non‐ALK fusions were defined as EZR_E10‐ROS1_E34 in two cases, CD74_E6‐ROS1_E34 in one case, KIF5B_E15‐RET_E12 in one case, and TPM3_E7‐NTRK1_E9 in the last case. For two FISH‐break and/or IHC positive cases no fusion gene transcripts were identified by our assay. P07 was scored as positive based on both ALK IHC and FISH, but no fusion transcript was observed with our transcriptome assay. Of note, for this patient we also missed the SNV in ALK, due to low total read count. For P08, RET was scored positive based on a FISH break pattern in 27% of the cells (2% true split and 25% extra red signals), while an ALK break was seen in 63% of the cells. With our assay we only observed ALK fusion transcripts and no RET fusion transcripts.
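The fusion labels used here follow a fixed pattern (partner gene and exon, then target gene and exon), so the tally of confirmed fusions per target gene can be reproduced with a short script. This is an illustrative sketch rather than part of the fusion detection pipelines: the regular expression and function name are assumptions, and the list simply restates the 18 confirmed transcripts named in the text.

```python
import re
from collections import Counter

# Pattern assumed from labels such as "EML4_E6-ALK_E20"
FUSION_RE = re.compile(r"([A-Z0-9]+)_E(\d+)-([A-Z0-9]+)_E(\d+)")


def parse_fusion(label: str) -> dict:
    """Split a fusion label into partner gene/exon and target gene/exon."""
    # Normalise the typographic hyphen (U+2010) and stray spaces from extraction
    normalised = label.replace("\u2010", "-").replace(" ", "")
    match = FUSION_RE.fullmatch(normalised)
    if match is None:
        raise ValueError(f"unrecognised fusion label: {label!r}")
    partner, partner_exon, target, target_exon = match.groups()
    return {
        "partner": partner,
        "partner_exon": int(partner_exon),
        "target": target,
        "target_exon": int(target_exon),
    }


# The 18 confirmed fusion transcripts as described in the text
CONFIRMED = (
    ["EML4_E6-ALK_E20"] * 8
    + ["DCTN1_E26-ALK_E20"] * 2
    + ["KIF5B_E24-ALK_E20", "EML4_E6-ALK_E18", "MPRIP_E21-ALK_E20"]
    + ["EZR_E10-ROS1_E34"] * 2
    + ["CD74_E6-ROS1_E34", "KIF5B_E15-RET_E12", "TPM3_E7-NTRK1_E9"]
)

per_target = Counter(parse_fusion(label)["target"] for label in CONFIRMED)
print(dict(per_target))
```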

As an independent validation, we applied NanoString to detect the fusion gene transcripts. We validated the assays on five confirmed cases and were able to identify the expected fusion transcripts by NanoString. We next examined the two cases in which we did not find the expected fusion transcripts with our all‐in‐one assay. For P07, NanoString detected an EML4_E6‐ALK_E20 fusion transcript. In P08, NanoString detected the ALK, but not the RET fusion transcript, consistent with our all‐in‐one assay. Thus, for one of the two cases the negative result of our all‐in‐one assay was consistent with NanoString.

As a previous study showed a potential association between survival on TKI treatment and the fusion gene partners, we also analysed PFS in relation to the fusion partner of the nine ALK‐positive patients (Supplementary Table 5). Five patients with the canonical EML4_E6‐ALK_E20 fusion transcripts were treated with crizotinib and had PFS of 6, 8, 9, 14 and 15 months. P36 with an EML4_E6‐ALK_E18 fusion gene had a PFS of 24 months. P42 with MPRIP_E21‐ALK_E20 had a PFS of 8 months. P08 with a KIF5B_E24‐ALK_E20 fusion gene had a PFS of 19 months. P14 with a DCTN1_E26‐ALK_E20 fusion transcript did not respond to crizotinib and treatment was changed to alectinib, again with no response.


Figure 2. Validation by ddPCR and selection criteria for FFPE samples. A. Comparison of the variant allele frequencies (VAF) as detected by the all‐in‐one transcriptome‐based assay and droplet digital (dd) PCR. The Y axis represents the VAF of the mutations as assessed by our all‐in‐one NGS assay, and the X axis represents the fractional abundance calculated from ddPCR. B. Overview of total read counts in samples for which we did and did not observe the genomic aberrations with our all‐in‐one transcriptome‐based assay. Blue dots indicate samples with DV200 above 50, red dots indicate samples with DV200 below 50, and black dots indicate samples for which the DV200 value was not measured. The dashed line indicates the cutoff level of 50,000 unique reads.

RNA input limit

It is challenging to set a clear RNA input limit for transcriptome‐based assays. Tumour content of the sample and the expression level of the genes in non‐tumour cells are just two of the variables that determine the minimal amount of RNA needed for the assay. Despite these issues, we sought some insight into the detection limit of our assay. Eight samples were re‐sequenced with an 8‐fold lower library input. As a result, the number of unique reads for these samples decreased (Supplementary Table 6). In all eight cases, the expected aberrations were again successfully called by the pipeline, with VAFs similar to those observed under the standard conditions (Supplementary Table 6). In addition, we repeated library preparation for three RNA samples using a 4‐fold and 20‐fold lower RNA input. In two cases, this resulted in a 3‐ and 4‐fold decrease in the number of unique reads with a 4‐fold lower RNA input, and a 5‐ and 12‐fold decrease with a 20‐fold lower RNA input. For the third case, the pattern was less consistent for the 4‐fold lower RNA input library (Supplementary Table 7). Again, we were able to detect the expected variants for all samples irrespective of the RNA input. This indicates that a total RNA input as low as 10ng for library preparation is feasible for samples with sufficient RNA quality and high tumour cell content.

Table 1. Overview of SNV / INDEL samples analyzed by the all‐in‐one transcriptome‐based assay and summary of the results

Columns Gene, Amino Acid Change and MD test or reference: known variants detected at DNA level. Columns Tool, Mutant Reads, Total Reads, VAF and Status: results of the all‐in‐one transcriptome‐based assay.

Variants known at DNA level

| Sample ID | Origin | DV200 | Gene | Amino Acid Change | MD test or reference | Tool | Mutant Reads | Total Reads | VAF | Status |
|---|---|---|---|---|---|---|---|---|---|---|
| P35 | PE | 81 | AKT1 | p.E17K | NGS | IGV | 81 | 292 | 28% | confirmed |
| P13_T1 | PE | 89 | ALK | p.G1269A | NGS | IGV | 44 | 240 | 18% | confirmed |
| P13_T1 | PE | 89 | ALK | p.I1171N | NGS | Pipeline | 331 | 336 | 99% | confirmed |
| P13_T2 | PE | 86 | ALK | p.G1269A | NGS | IGV | 1 | 167 | 1% | true negative |
| P13_T2 | PE | 86 | ALK | p.I1171N | NGS | Pipeline | 353 | 434 | 81% | confirmed |
| P07 | FFPE | 65 | ALK | p.L1196M | NGS | IGV | 1 | 3 | 33% | not confirmed |
| P35 | PE | 81 | BRAF | p.V600E | NGS | Pipeline | 31 | 60 | 52% | confirmed |
| P25 | FFPE | 71 | BRAF | p.V600E | NGS | Pipeline | 33 | 46 | 72% | confirmed |
| H1650 | cell line | na | EGFR | p.E746_A750del | NGS | Pipeline | 46 | 76 | 61% | confirmed |
| H1975 | cell line | na | EGFR | p.T790M | NGS | Pipeline | 347 | 425 | 82% | confirmed |
| H1975 | cell line | na | EGFR | p.L858R | NGS | Pipeline | 564 | 684 | 82% | confirmed |
| H820 | cell line | 99 | EGFR | p.E746_A750del | NGS | Pipeline | 80 | 606 | 13% | confirmed |
| H820 | cell line | 99 | EGFR | p.T790M | NGS | Pipeline | 127 | 660 | 19% | confirmed |
| P04_S2 | PE | 88 | EGFR | p.L858R | NGS | Pipeline | 4661 | 4931 | 95% | confirmed |
| P05 | PE | 17 | EGFR | p.E746_A750del | 22 | IGV | 0 | 0 | 0% | not confirmed |
| P04_S1 | FFPE | 26 | EGFR | p.L858R | 19,22 | Pipeline | 69 | 72 | 96% | confirmed |
| P06 | FFPE | 37 | EGFR | p.L747_P753delinsS | 22 | Pipeline | 8 | 17 | 47% | confirmed |
| P06 | FFPE | 37 | EGFR | p.T790M | 19,22 | IGV | 0 | 15 | 0% | not confirmed |
| P15 | FFPE | 40 | EGFR | p.E746_A750del | NGS | Pipeline | 51 | 76 | 67% | confirmed |
| P15 | FFPE | 40 | EGFR | p.T790M | NGS | Pipeline | 22 | 88 | 25% | confirmed |
| P17 | FFPE | 57 | EGFR | p.E746_A750del | NGS | Pipeline | 128 | 182 | 70% | confirmed |
| P17 | FFPE | 57 | EGFR | p.T790M | NGS | Pipeline | 62 | 127 | 49% | confirmed |
| P22 | FFPE | 69 | EGFR | p.E746_A750del | 19,22 | IGV | 13 | 38 | 34% | confirmed |
| P26 | FFPE | na | EGFR | p.L858R | NGS | IGV | 0 | 6 | 0% | not confirmed |
| A549 | cell line | na | KRAS | p.G12S | 18 | Pipeline | 512 | 513 | 100% | confirmed |
| HCT116 | cell line | na | KRAS | p.G13D | 24 | Pipeline | 223 | 456 | 49% | confirmed |
| KOPN‐8 | cell line | 99 | KRAS | p.G12D | 29 | Pipeline | 99 | 177 | 56% | confirmed |
| P01 | PE | 90 | KRAS | p.G12D | NGS | Pipeline | 14 | 111 | 13% | confirmed |
| P03 | FFPE | na | KRAS | p.G12A | NGS | Pipeline | 8 | 8 | 100% | confirmed |
| P23 | FFPE | 44 | KRAS | p.G12C | NGS | IGV | 0 | 1 | 0% | not confirmed |
| P28 | FFPE | 38 | KRAS | p.G12A | NGS | Pipeline | 8 | 22 | 36% | confirmed |
| P31 | FFPE | 66 | KRAS | p.Q61H | NGS | Pipeline | 60 | 123 | 49% | confirmed |
| P39 | FFPE | 65 | KRAS | p.G12D | NGS | Pipeline | 8 | 12 | 67% | confirmed |
| P40 | FFPE | 68 | KRAS | p.G12F | NGS | IGV | 0 | 1 | 0% | not confirmed |
| P32 | FFPE | 32 | KRAS | p.G12D | NGS | IGV | 0 | 2 | 0% | not confirmed |
| H596 | cell line | na | MET | splicing mutation | 27 | IGV | 1116 | 1196* | 93% | confirmed |
| Hs746T | cell line | 97 | MET | splicing mutation | 30 | IGV | 8744 | 8774* | 100% | confirmed |
| P21 | FFPE | 34 | MET | splicing mutation | NGS | IGV | 50 | 56* | 89% | confirmed |
| H1299 | cell line | na | NRAS | p.Q61K | 20 | Pipeline | 1107 | 2549 | 43% | confirmed |
| H596 | cell line | na | PIK3CA | p.E545K | 28 | Pipeline | 156 | 330 | 47% | confirmed |
| HCT116 | cell line | na | PIK3CA | p.H1047R | 25 | Pipeline | 69 | 115 | 60% | confirmed |
| P02 | FFPE | 51 | PIK3CA | p.H1047L | NGS | Pipeline | 12 | 31 | 39% | confirmed |
| P26 | FFPE | na | PIK3CA | p.E542K | NGS | IGV | 0 | 0 | 0% | not confirmed |
| P37 | PE | 21 | ROS1 | p.D2033N | NGS | Pipeline | 3 | 3 | 100% | confirmed |

Overview of additional variants that were not reported by MD

| Sample ID | Origin | DV200 | Gene | Amino Acid Change | MD test or reference | Tool | Mutant Reads | Total Reads | VAF | Status |
|---|---|---|---|---|---|---|---|---|---|---|
| P03 | FFPE | na | EGFR | p.V834L | NGS; FISH | Pipeline | 9 | 64 | 14% | na |
| P34_S1 | FFPE | 76 | KRAS | p.L56fs | NGS; FISH | Pipeline | 7 | 33 | 21% | na |
| P39 | FFPE | 65 | NRAS | p.G15fs | NGS; FISH | Pipeline | 5 | 22 | 23% | na |
| P40 | FFPE | 68 | BRAF | p.Y472fs | NGS; FISH | Pipeline | 6 | 13 | 46% | na |

MD = Medical Diagnostics. (*) is calculated using average of coverage in MET exon 13 and 15 minus the coverage in MET exon 14. (**) not calculated when total reads is <3.

Table 2. Overview of fusion gene / FISH break positive samples analyzed by the all‐in‐one transcriptome‐based assay and summary of the results

Columns Gene, IHC and FISH: MD variant. Columns Fusion Transcript, IGV, FusionCatcher, Strand NGS and Status: results of the all‐in‐one transcriptome‐based assay.

| Sample ID | Origin | DV200 | Gene | IHC | FISH | Fusion Transcript | IGV (splitting reads in Gene 1, splitting reads in Gene 2) | FusionCatcher (spanning, splitting reads) | Strand NGS (splitting reads) | Status |
|---|---|---|---|---|---|---|---|---|---|---|
| H2228 | cell line | nd | ALK | nd | nd | EML4_E6‐ALK_E20 | 104, 59 | 41, 107 | 84 | confirmed |
| P07 | FFPE | 65 | ALK | + | + | | | | | not confirmed |
| P08 | FFPE | 67 | ALK | + | + | KIF5B_E24‐ALK_E20 | 5, 9 | | | confirmed |
| P13_T1 | PE | 89 | ALK | + | nd | EML4_E6‐ALK_E20 | 83, 238 | 58, 185 | 170 | confirmed |
| P13_T2 | PE | 86 | ALK | + | nd | EML4_E6‐ALK_E20 | 190, 74 | 64, 190 | 141 | confirmed |
| P14_T0 | PE | 86 | ALK | + | nd | DCTN1_E26‐ALK_E20 | 76, 21 | 20, 41 | 51 | confirmed |
| P14_t1 | PE | 96 | ALK | + | nd | DCTN1_E26‐ALK_E20 | 50, 6 | 13, 23 | 21 | confirmed |
| P18 | Frozen | 86 | ALK | + | + | EML4_E6‐ALK_E20 | 230, 143 | 74, 290 | 789 | confirmed |
| P33 | FFPE | 58 | ALK | + | + | EML4_E6‐ALK_E20 | 6, 4 | | | confirmed |
| P34_S1 | Frozen | 82 | ALK | + | + | EML4_E6‐ALK_E20 | 62, 41 | 44, 156 | 77 | confirmed |
| P34_S1 | FFPE | 76 | ALK | + | + | EML4_E6‐ALK_E20 | 7, 3 | 2, 3 | 2 | confirmed |
| P34_S2 | FFPE | 70 | ALK | + | + | EML4_E6‐ALK_E20 | 38, 17 | 10, 3 | 2 | confirmed |
| P36_S2 | PE | 93 | ALK | + | + | EML4_E6‐ALK_E18 | 49, 105 | 35, 156 | 86 | confirmed |
| P42 | PE | 53 | ALK | + | + | MPRIP_E21‐ALK_E20 | 8, 20 | 5, 26 | 27 | confirmed |
| KM12* | cell line | 94 | NTRK1 | nd | nd | TPM3_E7‐NTRK1_E9 | 188, 87 | 41, 153 | 340 | confirmed |
| P08 | FFPE | 67 | RET | nd | + | | | | | true negative |
| P11 | FFPE | 81 | RET | nd | + | KIF5B_E15‐RET_E12 | 2, 3 | | | confirmed |
| P37 | PE | 21 | ROS1 | nd | + | CD74_E6‐ROS1_E34 | 0, 3 | 2, 3 | 1 | confirmed |
| P38 | FFPE | 55 | ROS1 | nd | + | EZR_E10‐ROS1_E34 | 11, 2 | 4, 3 | | confirmed |
| P41 | FFPE | nd | ROS1 | nd | + | EZR_E10‐ROS1_E34 | 19, 0 | 10, 9 | | confirmed |

MD: Molecular Diagnostics; PE: pleural effusion; FFPE: formalin fixed paraffin embedded; nd: not done. * The NTRK1 fusion gene was reported in reference 23.

Figure 3. IGV screenshots for MET exon skipping and ALK resistance mutations. A. IGV screenshot of reads mapping to MET exons 13 to 15 for a randomly selected control sample (P34_S2) without MET exon 14 skipping and for two cell lines (H596 and Hs746T) and one patient (P21) with known MET exon 14 skipping mutations. Numbers indicate the average coverage per exon. B. IGV screenshots of reads mapping to ALK exon 24 and exon 23 for two PE samples of a single patient (P13). After a culture period of 1 month (P13_T1) two different resistance‐associated mutations were identified, whereas after a culture period of 2 months (P13_T2) only one of the two mutations was present in the culture. Indicated are the number of mutant reads and the total read depth.

Quality criteria for successful mutation detection

We next established quality criteria for successful detection of mutations. A threshold of 50K unique reads appears to be required to reliably call variants with our assay, giving an accuracy of 94% (50 out of 53 expected variants) (Figure 2B and Supplementary Table S9). To identify factors associated with the percentage of unique reads, we performed a univariable linear regression analysis. A significant correlation was observed with panel design version (1, 2 or 3), material type (FFPE, non‐FFPE), RNA input, DV200 and the number of cycles used to amplify the library. In a multivariable analysis, material type and DV200 remained significant (Supplementary Table 9). Based on this analysis, we decided to introduce the DV200 value as an additional quality criterion. DV200 values were measured in 39 samples with known variants. Setting the DV200 threshold at 50, we reached an accuracy of 92% (35 out of 38 expected variants) (Figure 2B and Supplementary Table S9). When both criteria were applied, i.e. DV200>50 and unique reads>50K, all 31 expected variants were detected, including those present in 10 FFPE samples, leading to an accuracy of 100%.
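The accuracy figures quoted above are simple ratios of detected to expected variants. As a quick arithmetic check, a minimal sketch using only the counts stated in the text (the function and dictionary names are illustrative):

```python
def accuracy(detected: int, expected: int) -> float:
    """Fraction of expected variants that were detected."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return detected / expected


# (detected, expected) counts as reported in the text
REPORTED = {
    "unique reads > 50K": (50, 53),
    "DV200 > 50": (35, 38),
    "both criteria": (31, 31),
}

for criterion, (detected, expected) in REPORTED.items():
    print(f"{criterion}: {detected}/{expected} = {accuracy(detected, expected):.0%}")
```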

Discussion

Current tests to select therapy for advanced stage NSCLC patients include sequencing‐based methods to detect mutations in hotspot regions, FISH techniques for the detection of chromosomal breaks and IHC for detection of protein overexpression. A strong point of the targeted transcriptome‐based sequencing assay we report here is that it can pinpoint all DNA and RNA aberrations in one test. As we interrogate the transcriptome, our assay provides information on the expression of the mutant allele. In addition, our assay provides information on fusion partners and shows the consequence of MET exon skipping mutations at the transcript level. Setting quality criteria for both RNA (DV200>50) and the total number of unique reads (>50K reads), our assay identified all expected mutations at the transcriptome level, and thus reached an accuracy of 100%. Application of these two criteria to our FFPE samples would have resulted in the exclusion of 11 out of 21 FFPE samples for which we did have DV200 values. For seven of these cases the low DV200 value would have been a reason to stop library preparation and subsequent NGS data analysis. The other four samples would have been used for library preparation and NGS but would have been regarded as failures based on low unique read counts. In a diagnostic setting, a new tissue sample would have been requested for all eleven cases. The use of freshly prepared FFPE blocks in combination with macrodissection of tumour cell‐rich regions, which is routinely done in the diagnostic setting, could further increase the number of successfully analysed samples. The main reason for potential dropout in a routine setting will be related to tumour cell content and RNA quality. We expect that on fresh FFPE blocks the dropout frequency will be similar to the current dropout using DNA‐based NGS assays, which are also dependent on tumour content and quality of the isolated DNA. Thus, our all‐in‐one transcriptome‐based assay is expected to have an overall good performance on clinical FFPE samples and is excellent if quality criteria are followed.
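The two‐stage use of the quality criteria described above (DV200 checked before library preparation, unique reads checked after sequencing) can be sketched as a small decision function. This is an illustration under assumptions, not the workflow actually implemented: the function name and return strings are hypothetical, and the thresholds are the ones stated in the text.

```python
from typing import Optional

DV200_MIN = 50             # RNA quality criterion from the text (DV200 > 50)
UNIQUE_READS_MIN = 50_000  # sequencing criterion from the text (> 50K unique reads)


def triage(dv200: float, unique_reads: Optional[int] = None) -> str:
    """Decide how to proceed with a sample at each stage of the workflow."""
    if dv200 <= DV200_MIN:
        # Low RNA quality: stop before spending effort on library preparation
        return "stop: request new tissue before library preparation"
    if unique_reads is None:
        # Not sequenced yet, RNA quality is sufficient
        return "proceed: prepare library and sequence"
    if unique_reads <= UNIQUE_READS_MIN:
        return "fail: request new tissue after sequencing"
    return "pass: analyse variants and fusions"


# Hypothetical samples, for illustration only
print(triage(dv200=37))
print(triage(dv200=65, unique_reads=42_000))
print(triage(dv200=81, unique_reads=455_000))
```

Checking DV200 first mirrors the point made in the text: a failing sample is identified before library preparation and sequencing costs are incurred.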

For fusion genes, both the juxtaposed exons of the target gene and the specific fusion partners were identified. In addition to the common fusion products, we found several rare fusion partners. The MPRIP_E21‐ALK_E20 fusion product has only been reported in a meeting abstract [41]. A second uncommon ALK fusion transcript with a break in intron 18 of ALK, EML4_E6‐ALK_E18, has only been reported once [42]. Detailed knowledge of the fusion partner and/or break points might have clinical implications. In a recent study on ALK FISH‐positive lung cancer, patients with canonical fusion partners involving EML4 were found to have a better prognosis (20.6 months vs 5.4 months, P<0.01) than those with non‐canonical ALK fusions [43]. In our study, with a limited number of patients, survival of patients with canonical and non‐canonical breaks was similar. Still, techniques that identify the fusion gene partner may become important in routine diagnostics if knowledge of the fusion gene indeed proves to predict drug response. This further underlines the importance of the all‐in‐one transcriptome‐based test.
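As an illustration of how fusion labels in the notation used above (for example EML4_E6‐ALK_E18) could be handled downstream, the following sketch parses a label into partner gene, partner exon, target gene and target exon, and flags ALK fusions by partner and by ALK exon. The function names are hypothetical, and the rule used here for "canonical" (EML4 as partner, ALK exon 20 as the joined exon) is an assumption made for the example, not a definition taken from this study or from reference 43.

```python
import re

# Accepts labels such as "EML4_E6-ALK_E18", joined by an ASCII hyphen
# or by the Unicode hyphen U+2010.
FUSION_RE = re.compile(
    r"^([A-Za-z0-9]+)_E(\d+)[-\u2010]([A-Za-z0-9]+)_E(\d+)$"
)


def parse_fusion(label: str) -> dict:
    """Split a fusion label into its 5' partner and 3' target parts."""
    match = FUSION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised fusion label: {label!r}")
    partner, partner_exon, target, target_exon = match.groups()
    return {
        "partner": partner,
        "partner_exon": int(partner_exon),
        "target": target,
        "target_exon": int(target_exon),
    }


def describe_alk_fusion(label: str) -> dict:
    """Flag an ALK fusion by partner and break exon (assumed rule, see lead-in)."""
    fusion = parse_fusion(label)
    if fusion["target"] != "ALK":
        raise ValueError(f"not an ALK fusion: {label!r}")
    return {
        "partner_is_eml4": fusion["partner"] == "EML4",
        # Assumption for this sketch: ALK exon 20 is the usual joined exon.
        "alk_exon_is_20": fusion["target_exon"] == 20,
    }
```

Under these assumptions, MPRIP_E21‐ALK_E20 is flagged for its uncommon partner, and EML4_E6‐ALK_E18 for its uncommon ALK exon.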

At the DNA level, several specific mutations in introns 13 and 14 of MET have been linked to exon 14 skipping at the transcript level, but for other, novel mutations it remains unclear whether they indeed lead to MET exon 14 skipping [44]. Detection of MET exon skipping with a transcriptome‐based NGS method, as described in this study, directly measures the consequence of the mutation, even though the actual mutation causing the exon skipping will not be identified.
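To make concrete what "directly measuring the consequence" can look like at the read level, the sketch below estimates the fraction of MET transcripts that skip exon 14 from splice-junction read counts. This is a generic junction-ratio calculation and an assumption of this example, not the analysis used in the study; the function name and arguments are hypothetical, and no calling threshold is implied.

```python
from typing import Optional


def exon14_skipping_fraction(
    reads_e13_e14: int,
    reads_e14_e15: int,
    reads_e13_e15: int,
) -> Optional[float]:
    """Estimate the fraction of transcripts lacking exon 14.

    reads_e13_e14 and reads_e14_e15 support exon 14 inclusion;
    reads_e13_e15 span the exon 13 to exon 15 junction and support skipping.
    Returns None when no junction reads are available.
    """
    # An included transcript contributes two junctions, so average them.
    inclusion = (reads_e13_e14 + reads_e14_e15) / 2
    total = inclusion + reads_e13_e15
    if total == 0:
        return None
    return reads_e13_e15 / total
```

A sample with 90 reads on each inclusion junction and 10 reads on the exon 13 to exon 15 junction would give a skipping fraction of 0.1 in this sketch.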

Application of an all‐in‐one transcriptome‐based assay maximises the success rate of detecting genomic aberrations in the limited tissue available from lung cancer biopsies, which are generally small. In current molecular diagnostic settings, a few methods are available to simultaneously identify SNVs, INDELs, exon skipping and gene fusions. A bait‐based library enrichment method was used to detect SNVs, INDELs, translocations, inversions and copy number variations (CNVs) using DNA from FFPE tissue [45]. All 34 known SNVs/INDELs were identified in that study. In addition, the authors identified ALK fusions, including the fusion partner, in six out of seven ALK IHC‐positive cases. In another study, using RNA from frozen biopsy samples, 6 fusions and 17 mutations were detected. For 10 paired cases, the concordance
