Somatic mobile element detection in pediatric cancer
Student: Ramon van Amerongen (6132375) First examiner: Jayne Hehir-Kwa Second examiner: Josephine Daub
University: Utrecht University Faculty: Faculty of Science
Master: Bioinformatics and Biocomplexity
Mobile elements are pieces of DNA which move from one place to another region in the genome. As a result, the insertions that occur can influence genes in unpredictable ways.
These insertions are normally prevented in healthy tissues but they have been observed to occur occasionally in adult cancer. On the other hand, insertions have not been extensively studied in childhood cancer yet. The goal of this study was to therefore to attempt to detect insertions in children with cancer. In addition, as finding these elements is difficult, the accuracy of different detection programs was first tested. Then, the most optimal combination of programs was used to detect insertions in patients with different cancer types. Between 0 and 6 insertions were generally found per patient, but we also found a single patient with more than 100 insertions. However, we might have underestimated the actual number of insertions and therefore their role, as the observed accuracy during testing was low. Future studies should therefore use more patients and improve ways of detecting these insertions.
Mobile elements (MEs) form a large part of our repetitive genome, but they distinguish themselves by their ability to move from one place to another. In humans the class I elements, LINE1, Alu and SVA, are still capable of moving via reverse transcriptase activity, also called ‘retrotransposition’. However, only LINE1 elements are autonomous, as only a small subset of LINE1 express the required proteins which facilitate retrotransposition of all three types of ME. Normally, retrotransposition is repressed, but can be lifted during embryonal development or in cancer. The resulting ME insertions (MEIs) can affect gene function or cause structural variants (SVs). Research into somatically occurring MEIs and tools to detect them are limited: Whilst MEIs has been observed to occur somatically in adult cancer, no extensive investigation has been performed in pediatric cancer yet.
Therefore, the goal of our study was to assess the frequency of somatic MEIs in our pediatric solid tumor cohort and the reliability of tools to detect them. To this end, we first
benchmarked a set of tools using simulated somatic MEIs before running the best combination of tools on a cohort of paired WGS pediatric cancer patients. Somatic MEIs were seen to occur rarely in a low number of patients in a randomized pattern over the tumor genome. Nevertheless, we observed some insertions that caused translocations or affected cancer genes. Patients with osteosarcoma were particularly affected by MEIs compared to patients with one of the other tumor types. Therefore, while most insertions are passenger mutations, some might have the potential to drive tumor development. We thus show that somatic MEI are difficult to detect but are active in pediatric cancers at a low rate.
LAYMAN’S SUMMARY... 2
LINKING SNVS, SVSAND MEIS...14
Reported MEIs per tool...18
Overlap between tools...19
ME and TSD types of overlapping insertions...20
Discordant ME and TSD types of overlapping insertions...21
Variant allele fraction of overlapping insertions...21
Biological significance of somatic insertions...22
LINKING SNVS, SVSAND MEIS...23
copy number alteration, 26 LINE
long interspersed nuclear element, 5 LTR
long terminal repeats, 5 ME
mobile element, 5 MEI
mobile element insertion, 6 SINE
short interspersed nuclear element, 5
single nucleotide polymorphism, 7 SV
structural variant, 5 SVA
SINE, variable number tandem repeats (VNTR) and Alu, 5
target site duplication, 5 VAF
variant allele fraction, 8
Mobile elements in the human genome
Over two third of the human genome is composed of repetitive elements of which a large portion are mobile elements (MEs) which can move and insert themselves somewhere in the genome. Therefore, they can be seen as a type of structural variant (SV). Among these elements, only a few families remain active in humans as most are of ancestral evolutionary origin and are long degraded (de Koning et al., 2011). Two main classes of MEs exist: class I and class II. The former copies itself through an RNA intermediate while the latter moves via a cut-and-paste mechanism. The class I elements can be further subdivided based on their presence of long terminal repeats (LTR). The ones who lack these LTRs belong to either the LINE (long interspersed nuclear element) or SINE order (short interspersed nuclear element) (Wicker et al., 2007).
Some ME types can autonomously move across the genome through the proteins they express while others cannot. In humans, members of the LINE1 family (figure 1a) are the only autonomous elements (Lander et al., 2001). The typical functional LINE1 element is around 6000 nucleotides long, includes a poly A tail and can have a duplication at the insertion site, termed a target site duplication (TSD), although small deletions can also occur (Lander et al., 2001; Szak et al., 2002; Thung et al., 2014a). However, as most LINE1 elements are truncated, mutated or rearranged, only around 100 LINE1 remain functional in each person of which only a handful are active (Lander et al., 2001; Brouha et al., 2003;
Rodriguez-Martin et al., 2020). Active LINE1s express the proteins ORF1p and ORF2p which facilitate retrotransposition through RNA-binding and reverse transcriptase/endonuclease activity, respectively (Feng et al., 1996; Kulpa & Moran, 2005; Mathias et al., 1991).
However, these proteins also target two other non-autonomous elements: Alu and SVA (Ahl et al., 2015; Raiz et al., 2012). Alus (figure 1b) belong to the SINE family and tend to be around 300 bases in length but are the most abundant out of the three (Lander et al.,
2001). SVAs (figure 1c), on the other hand, are composite elements (SINE-VNTR-Alu) of around 2 kb long and are by far the least abundant (H. Wang et al., 2005).
Effects of mobile element insertions
Mobile element insertions (MEIs) can affect genes through a variety of mechanisms. On a local scale they can insert themselves into coding (figure 2a) and regulatory regions (figure 2a and b) disrupting their function. MEIs can also provide new sequences that can end up in the gene transcript from the target site. In a process known as ‘3’ transduction’
a neighboring sequence at the 3’ end of the source element is incorporated into the RNA and
Figure 1: The active human mobile elements and their composition:
(a) The Long interspersed element 1 (LINE1) with two open reading frames. ORF2 has an enconuclease (EN) and reverse transcriptase (RT) domain (b) The most abundant mobile element in the human genome is the 7SL RNA derived Alu.
(c) SVA are composite elements of SINE, variable number tandem repeats (VNTR, and Alu. Adapted from (Burns, 2017).
Figure 2: Local scale effects of MEIs: (a) Insertion into an exon or splice site can disrupt genes. (b) MEs can disrupt more distant regulatory sequences but also provide alternative promotors which can lead to chimeric transcripts. (c) Alternative splicing can
subsequently at the target site. If the transduced sequence was part of an exon it is also present in the new RNA and this is thus termed ‘exon shuffling’ (Cordaux & Batzer, 2009). In addition, MEIs can provide alternative spice sites and promotors which can lead to different RNA isoforms. For example, usage of a LINE1 promotor can lead to a chimeric transcript (figure 2b: right) and alternative splicing can lead to partial Alu sequences to be incorporated into the gene transcript (figure 2c). The latter is known as ‘exonization’ (Cordaux & Batzer, 2009; Burns, 2017).
On a larger scale, MEIs can also induce larger structural variants. One such example is insertion-mediated deletion during which a neighboring sequence is deleted upon insertion.
These range from 1 bp up to megabases in size. Ectopic recombination between homologous elements can also lead to deletions, duplications and insertions. Moreover, double strand DNA breaks can be generated and repaired by the ME machinery. These breaks have been shown in adult cancer to cause translocations and large-scale duplications or deletions through complex rearrangements (Cordaux & Batzer, 2009; Rodriguez-Martin et al., 2020).
Repression of mobile element activity
The deleterious activity of MEIs is usually repressed through DNA methylation, histone modifications or interactions with small RNAs. Mice studies have pointed out that this repression is especially important during early embryonic development and the
development of germ cells. Otherwise, the absence of repression can lead to the dead of the embryo or infertility, respectively (Protasova et al., 2021). However, there is some evidence in mice that their activity spikes during the preimplantation period of the embryo and in spermatocytes (Protasova et al., 2021). In one study this eventually led to LINE1 insertions during the development of the embryo but not in the germ cells (Kano et al., 2009). ME activity also seems to occur during the preimplantation period of human embryos
(Protasova et al., 2021) and in one study was accompanied by insertions (Muñoz-Lopez et al., 2019). Because of these MEIs during development, individuals end up with a mosaic distribution of somatic MEIs, in somatic as well as germline tissues (Kano et al., 2009;
Muñoz-Lopez et al., 2019), especially in the brain (Zhao et al., 2019; Protasova et al., 2021).
In adult somatic tissues LINE1 is normally repressed. Repression is only lifted in the brain or during certain pathologies (e.g. cancer) (Burns, 2017; Protasova et al., 2021).
The absence of MEI repression can thus lead to insertions in the germline of individuals, in addition to the MEIs already found in the human reference genome. Every individual has over 1000 non-reference MEIs most of which are individual-specific (Niu et al., 2022). In addition, they account for around 9.3 percent of all protein-truncating events among single nucleotide polymorphisms (SNPs) and SVs (Niu et al., 2022). This and other previously functional effects of MEIs have been described to result in more than 124 hereditary diseases such as hemophilia and neurofibromatosis (Hancks & Kazazian, 2016).
The effect of somatic variants in cancer Besides germline variants, insertions
also happen in somatic tissues during cancer. Hypomethylation of the LINE1 promotor and ORF1p are often seen as hallmarks of cancer. LINE1
hypomethylation has also been correlated to retrotransposition rates (Burns, 2017), which vary from 1 up to
>100 somatic MEIs in about half of the adult cancers (Rodriguez-Martin et al., 2020). This especially the case in epithelial cancers, such as esophageal, head-and-neck and colorectal cancer (figure 3: top) (Rodriguez-Martin et al., 2020). Most of these are LINE1 insertions as opposed to the larger number of Alu insertions in the germline (Burns, 2017; Lander et al., 2001; Rodriguez-Martin et al., 2020).
It is still unclear how exactly these insertions drive cancer development. Most are considered passenger mutations (Burns, 2017;
Rodriguez-Martin et al., 2020), but one clear driver insertion was found in the APC gene of a colorectal cancer patient (Scott et al., 2016). Moreover, a recent pan cancer study showed
that MEIs are related to large rearrangements which delete tumor-suppressor genes or amplify oncogenes (Rodriguez-Martin et al., 2020).
Hidden role of MEIs in pediatric cancer
MEIs could have a hidden role in pediatric cancer as well, as activity spikes during
embryogenesis and some pediatric cancers are of embryonic nature. However, research is often limited and contradictory. For example, in nephroblastoma (Wilms tumor), which is a pediatric kidney tumor, LINE1 hypomethylation has been linked to telomere shortening and relapse (Chang et al., 2015; de Sá Pereira et al., 2017). ORF1p also promotes the growth of nephroblastoma cells in vitro (Tang et al., 2018). In addition, TRIM28 is involved in the suppression of retrotransposition and is sometimes affected in Wilms tumor (Hol et al., 2021). On the contrary, the activity of LINE1 is suppressed by NELL2 signaling in Ewing sarcoma, a tumor affecting the bones and soft tissues of children (Jayabal et al., 2021).
Moreover, in osteosarcoma patients, which occurs mainly early but also later in life, only a limited number of somatic insertions were found per patient in the previous pan cancer (figure 3: Bone-Osteosarcoma) and another recent study by Wang et al (Rodriguez-Martin et al., 2020; C. Wang & Liang, 2022). However, reference MEIs seem to be enriched around the frequent chromosomal breaks in osteosarcoma (Smida et al., 2017).
Figure 3: The number of MEI events per tumor type of the PCAWC study: The number in parentheses indicate the number of samples for that tumor type. Adapted from (Rodriguez-Martin et al., 2020).
The low number of somatic MEIs in osteosarcoma is understandable, given several reasons. Somatic mutations in general are rarer in pediatric than in adult cancer and in non- epithelial cancers compared to other cancer types (Rodriguez-Martin et al., 2020). Also, germline (or somatic mosaic) insertions could also contribute to cancer development, especially in these embryonal tumors. Wang et al, for example, found a large number of germline MEIs, some of which were inside cancer genes (C. Wang & Liang, 2022).
Biological and technical difficulties of detection
Compared to smaller variant detection tools, detection of structural variants such as MEIs is tricky and tools for MEI detection are still in their infancy (Ewing, 2015; Rishishwar et al., 2017). Like other structural variants, the short reads that are commonly generated by Illumina sequencing do not span the entire variant. Read mapping is further compromised by the inherently repetitive and sometimes degraded nature of MEs and the numerous other MEs scattered across the genome. MEIs detection tools have the added difficulty of discriminating normal insertions from ME insertions which often display TSDs and
transductions as well. (Ewing, 2015; Rishishwar et al., 2017).
Identifying tumor-specific (somatic) variants and telling them apart from germline variants comes with its own set of problems. Heterogeneity of the tumor sample leads to lower variant allele fractions (VAFs) of somatic variants, making it more difficult to detect these insertions. They are commonly distinguished from insertions in the germline by subtraction of the insertions found in the normal sample from those found in the tumor sample (figure 4a). Consequently, if a germline insertion is not detected in the normal sample due to low coverage but is detected in the tumor, it will be falsely labeled as somatic.
In addition, contamination of the normal sample with the tumor sample or visa versa also makes it more difficult to reliably classify an insertion as either somatic or germline (van Belzen et al., 2021).
To overcome these problems, multiple tools can be combined by taking the union or intersection of the variants (figure 4b/c) (Rishishwar et al., 2017; van Belzen et al., 2021).
However, this will especially be difficult in the context of somatic MEI detection. Somatic insertions often overlap only partially between tools (Ewing, 2015). In addition, union could lead to pooling of the false positives, especially in genomically unstable tumor samples.
Higher sequencing depths produce more false positives (Rishishwar et al., 2017), Therefore, union can lead to especially high number of false positives in tumor samples, which are sequenced at a deeper depth than normal samples.
Goal of research
The exact role of MEIs in cancer is thus still unclear, especially in pediatric cancer. This is also complicated by the fact that their detection is difficult. This study therefore focused on two
Figure 4: Workflow for identification of tumor-specific MEIs: (a) Germline MEIs are filtered by subtracting them from the ones found in the tumor. (b/c) Somatic MEIs from different tools are then combined through either a union or intersection. During an intersection different MEIs are seen as a single event when they fall within the same overlapping window. C side of figure was adapted and edited from (van Belzen et al., 2021).
things: (1) Testing the reliability of using multiple methods to detect (somatic) MEIs and (2) Assessing somatic MEI frequency in embryonal solid tumors.
Before identifying ME insertions in our cohort, it is first necessary to choose an appropriate set of tools that can reliably detect insertions. Therefore, we developed a synthetic tumor- normal paired sample to benchmark different tools individually and in combination. When the term ‘TSD (type)’ is used it henceforth refers to the occurrence of a duplication, deletion or no alteration at the target site.
Generation of the benchmarking datasets
The tumor-normal paired datasets for benchmarking were generated from the male sample HG002 of the Genome in a bottle consortium (Zook et al., 2016). We had chosen for this as others have extensively studied and used this sample for previous studies, among whom Chu et al who used it to create a benchmark annotated with germline MEIs (Chu et al., 2021). As the BAM file we used was aligned with Novoalign we first converted it to FASTQ with Picard (2.20.1) and then remapped it to hg38 with BWA (0.7.13) (supplementary method 1). Then, we only kept reads that either aligned on chromosome 1 or had a mate mapping to this chromosome. The file was subsequently subsampled from 300x to a ~30x coverage ‘normal’
and ~90x ‘tumor’ dataset to match the coverages of the tumor-normal paired samples at the Princes Máxima centre and elsewhere.
Figure 5: Distribution of simulated somatic ME insertions: Alu, LINE1 and SVA elements were generated and inserted into the 90x coverage sample using BAMsurgeon. Here, the numbers of the successfully inserted elements are displayed at different variant allele fractions.
We retrieved existing annotations for 111 Alu, 9 LINE1 and 9 SVA germline insertions of the HG002 sample from the published data by the previous study by Chu et al (supplementary figure 1). Somatic ME insertions were simulated in the 90x coverage sample using
BAMsurgeon (1.3) (figure 5) (Ewing et al., 2015). MEI sequences had to be generated and then supplied to the BAMsurgeon which then tries to insert these sequences into the BAM file. The program first constructs a contig by local assembly. Afterwards, it inserts the sequences into this contig to subsequentially simulate reads from the contig and add them to the BAM file. However, the insertion is sometimes unsuccessful if the program fails somewhere along these steps.
To generate the insertions we used a custom python script
(generate_te_insertions.py) to generate 161 ME sequences per ME type and variant allele fraction between 0.1 and 0.9. Insertions were not generated for positions occurring in high signal or low mappability regions. The probability for a target site duplication, deletion and no TSD were set to 90%, 5% and 5%. The length of a target site duplications and deletions were taken from normal distributions with their tops at 13 and 5 bp, respectively (see supplementary method 2 for more details). Secondly, Picard (2.20.1)
CollectInsertSizeMetrics was used to determine the mean and standard deviation of the insert size of the 90x BAM file. We then supplied the generated ME sequences, insert size metrics and the 90x BAM to the addsv.py script of BAMsurgeon. Reads were subsequently simulated internally using wgsim (package of samtools (1.2) with an error of 0.02) and remapped tot the same reference it with BWA (0.7.13). We made small alterations to BAMsurgeon to support target site deletions and enable exclusion of the preferred cut-site in the input file with insertions. As some of the simulated reads were mapped to other chromosomes than chromosome 1, we again subsetted the reads to only reads or read pairs with reads on chromosome 1. Eventually, BAMsurgeon was able to incorporate 61%
(2634/4348) of the input insertions successfully into the 90x coverage BAM. This resulted in a mean of 98 insertions per ME type and VAF (with a standard deviation of 6). Slightly more insertions were successfully incorporated at higher VAFs (figure 5). The number and length of target site duplications and deletions, on the other hand, were largely unaffected
(supplementary figures 2 and 3). There were no significant differences detected by one-side binomial testing in the set and observed probability of a target site duplication (90% vs 89.8%, P=0.72), deletion (5% vs 5.13%, P=0.75) or nothing happening (5% vs 5.09%, P=0.82).
The median lengths of a target site duplication (15 bp) and deletion (8 bp) were also close to top values of the distributions the lengths were taken from (13 and 7 bp).
Description of the cohort
The cohort used in this study consists of primary tumor-normal paired WGS samples from 79 pediatric cancer patients. They had been admitted to the Princes Máxima Centre in the Netherlands between 2018 and 2022 and had given informed consent for their samples to be used in research (figure 6).
Figure 6: The cohort: The number of patients of each cancer type with primary tumor samples that were included in this study.
The cohort is represented by four different types of solid tumors: Wilms tumor
(nephroblastoma), osteosarcoma, Ewing sarcoma and embryonal rhabdomyosarcoma. A large portion of the Wilms tumor patients were also included in a recent study researching somatic SVs (van Belzen et al., 2022).
MEI detection tools
We selected the MEI detection programs Mobster, xTea and Jitterbug because they were able to detect somatic MEI events in tumor-normal paired samples (this feature was added to Mobster as part of the project). Most programs do not have this feature and the end user is therefore reliant on subtracting predicted insertions found in the normal sample from those in the tumor (see Biological and technical difficulties of detection for potential problems with this method). We benchmarked all three programs and subsequently ran them on the cohort. All files were either generated in or converted to the VCF 4.2 format.
Mobster was released in 2014 and has been used in several studies since (Thung et al., 2014a). For this project, we improved its breakpoint detection and added somatic MEI detection, variant allele fraction (VAF) prediction and VCF output with additional tags (see supplementary method 3 for more details). We subsequently ran the new version (v0.2.5.0) of Mobster in multisample mode on all paired tumor-normal samples to determine the somatic insertions. Somatic insertions were defined as any detected insertions without supporting reads in the germline sample. During benchmarking we also compared the old version (v0.2.4.1) of Mobster to the new one to see if the accuracy of the detection was affected. First Mobster was run on both the normal and tumor benchmark samples
separately. Then germline insertions were removed from the tumor insertions to obtain the somatic insertions using a custom python germline filtering script. This script first removes any duplicate insertions in the tumor and normal samples (see the Benchmarking section).
Next, to obtain the somatic insertions it removes any insertion from the tumor sample with insertions from the normal sample within 100 bp of its breakpoints. During all runs settings were set to default and the hg38 reference genome and its corresponding RepeatMasker file of the repository were used. The reference genome was not provided for the old version of Mobster. Going forward, when the term ‘Mobster’ is mentioned, the new version of Mobster is referenced.
Jitterbug was also run with hg38 on default settings on the datasets but we had to perform small alterations to enable it to work properly (see supplementary method 3 for more details ) (Hénaff et al., 2015). In order to do somatic MEI calling, Jitterbug’s authors advised to provide the filtered tumor insertions and unfiltered germline insertions to the
‘process_ND_TD.py’ script. However, due to the low observed recall (figure 8) we provided the unfiltered tumor insertions instead. Any resulting gff3 file was then converted to VCF format by a custom script (see supplementary method 3).
We ran xTea (v0.1.7) by Chu et al on the germline samples in default mode and on the tumor samples in tumor mode (Chu et al., 2021). During the latter mode, tumor purity was set to 0.3 and the normal sample was not provided to prevent the germline insertions from being removed. xTea was not run in default mode on the tumor sample as it removes most non- heterozygous/homozygous insertions. We obtained the somatic insertions using the same germline filtering script as for Mobster. During benchmarking xTea was also run in tumor mode on both datasets to enable its internal germline filtering. Henceforth this will be referred to as 'somatic mode’, while running tumor mode without the normal sample and using the custom germline filtering script as ‘tumor mode’. We could then compare the effect of the internal filtering of xTea to the use of the custom script.
We benchmarked all three tools (Mobster, xTea and Jitterbug) and their different methods in terms of shared insertions, detection accuracy and prediction of MEI features (breakpoints, TSD, ME type and VAF). When Mobster and xTea are mentioned, they refer to the newest version of Mobster and the tumor mode of xTea. When they are compared to the old version of Mobster and the somatic mode of xTea, it will be explicitly stated so. We determined the number of true positive insertions per tool and between tools by going through several steps. We first ran the tools on the benchmarking datasets. If somatic insertions were determined by using the python script, duplicate insertions were first removed. If somatic insertions were determined internally duplicates were removed afterwards. As predicted insertions are occasionally generated by the same event, only the insertions with the highest number of supporting reads are kept if the insertions share breakpoints with any other insertion of the same tool. A predicted call is subsequently considered a true positive if it occurs within 100 bp of an existing MEI. Therefore, for true positive predictions to be shared between tools, the predictions all need to overlap the same insertion in the benchmark dataset. A false negative is any benchmark MEI that is not captured. The number of false
positives is the number of unique insertions not within 100 bp of an existing MEI. False positives are again shared by tools if they occur within 100 bp of each other. We used a special overlap algorithm to determine the shared false positives (supplementary method 4 and supplementary figure 5). Based on these numbers the recall/sensitivity, precision and F1 scores were calculated for different combinations of tools.
In addition, we tested which tools perform better based on the differences in breakpoints, TSD and ME type and VAF prediction compared to the benchmark. Only the true positive insertions are considered for this part of the analysis. When comparing, any numerical differences between the predicted and actual values (e.g. breakpoints, TSD length and VAF) are first calculated. Then, it is tested whether their calculated differences differ significantly. The Mann-Whitney U test is used in all cases, as the distributions are skewed and data is not paired. In addition, only insertions with matching TSD type (duplication or deletion) are used when considering their length. When neither a duplication or deletion was detected this will be referred to as ‘no_tsd’. In reality, neither a duplication or deletion happened, or it was unknown, as xTea does not make this distinction.
Determining the number and features of MEIs in the cohort Next we determined whether insertions were either unique or shared between tools.
Predictions for insertions were first generated for all samples using the newest version of Mobster and the tumor mode of xTea. Insertions were subsequently removed if they were duplicates or were in high signal or low mappability regions. Next, we determined the somatic insertions as indicated for each tool. We only considered insertions to be
overlapping between tools if any of the breakpoints of one tool occurred within 100 bp of the breakpoints of another tool. The same overlap algorithm for determining the shared false positives was applied here (see supplementary method 5). Overlapping insertions were subsequently manually verified using igv-report. Only verified insertions without any reads in the germline sample were considered somatic and were used in subsequent analysis steps. Differences between tools for the same cancer type were investigated using the Wilcoxon signed-rank test, as data could be paired. Any differences between the number of insertions between cancer types were tested using the Kruskal-Wallis and (post hoc) Mann- Whitney U test. For the verified insertions, we tested whether the occurrence of at least one insertion different between the cancer types through the use of a chi-squared test and post hoc Fischer exact tests. The p-values of the Mann-Whitney U, Wilcoxon signed-rank and Fischer exact tests were adjusted for the number of tests (Bonferroni correction). Finally, we investigated the ME type, TSD type and length and the VAF for the verified insertions.
Investigating the biological significance of somatic MEIs
Next, we investigated whether the verified insertions are potential tumor drivers by looking at the enrichment in genic regions and overlap with cancer genes and somatic SVs. To test for genic enrichment, we first the insertions with any gene components from ensemble. It was then determined whether each insertion occurred in genic, exonic or intronic regions and whether the region was coding or non-coding. Priority was given to exonic regions over intronic ones when an insertion overlapped multiple components. Next, the probabilities of an insertion occurring in each region was calculated from the genomic size of its
components. We subsequently determined the enrichment in each region by testing the observed against the calculated probability using one sided binomial testing. Multiple testing correction was also applied. Cancer genes were retrieved from the Cosmic website (Sondka
et al., 2018) and within 1 kb of these genes any insertions were subsequently identified.
Somatic structural variants were determined by Van Belzen et al according to the same methods described in their earlier study (van Belzen et al., 2022): In short, SVs needed to have at least 50% reciprocal overlap when called by at least two tools out of Manta (v. 1.6), DELLY (v. 0.8.1.) and GRIDSS (v. 2.8.2) (Chen et al., 2016; Rausch et al., 2012; Cameron et al., 2017). We subsequently identified any insertions and genes within 100 bp of the
breakpoints of structural variants.
Linking SNVs, SVs and MEIs
Next, to explain the activity of MEIs, we tested whether the occurrence of at least one MEI is related to SNVs or SVs overlapping cancer related genes. For each cancer type separately, Fischer exact tests were performed and corrected (Bonferroni) for the number of genes.
As the role of ME insertions in pediatric cancer is unclear and their detection is difficult the goal of the results presented here is two-fold: To show the reliability of the three tools by using a benchmark and to see if ME insertions can be detected in solid tumors of pediatric patients. After the somatic insertions are predicted from the benchmark, the shared
numbers between the tools are first visualized. Then the accuracy of the tools to detect ME insertions and their features is determined, alone and in combination with the other two tools. Based on these results, an optimal combination of tools is selected to predict the insertions for the pediatric patients. The numbers and features of the insertions (ME type, TSD type and VAF) are then visualized and compared between the cancer types and tools. At last, their overlap with other features is studied. These insertions might have biological significance if they occur within (cancer) genes or if they are near structural variants.
Benchmarking Shared insertions
After running xTea (in tumor mode), Mobster (newest version) and Jitterbug on the tumor- normal paired benchmark samples, we determined the overlap in their predicted insertions (figure 7): 55% (1455/2634) of the insertions were detected by at least one tool, 42%
(1117/2634) by at least two and 14% by all three (368/2634). Mobster had the most
detected insertions (1342), followed by xTea (1071) and Jitterbug (527). The largest overlap was found between xTea and Mobster which forms 91% (979/1071) and 73% (979/1342) of their individual calls respectively. Despite Jitterbug sharing 72% (380/527) of its detected insertions with xTea and 93% (494/527) with Mobster, 70% (368/527) were shared by all three. Mobster also had the most unique calls (237). These numbers remain largely unaffected when using the old version of Mobster or the somatic mode of xTea
(supplementary figure 6). Overall, the most important finding is that xTea and Mobster shared more insertions with each other than with Jitterbug alone.
Figure 7: UpSet plot of the predicted ME insertions: The number of shared insertions between different combinations of xTea, Mobster and Jitterbug is shown. A black (linked) dot means that those insertions were by that tool or multiple tools.
Next, we used the number false positives and false negatives per tool to calculate the detection accuracy for the insertions in terms of recall and precision. Individual tools show low to medium recall and medium to high precision while their combinations are often trade-offs between the two scores (figure 8 and table 1). When looking at the tools individually, xTea showed the highest precision (0.94), closely followed by Mobster (0.92) and then Jitterbug (0.54). Mobster, on the other hand, had the highest recall (0.51), then xTea (0.41) and again at last Jitterbug (0.20). When looking at the recall for the ME types, LINE1 was best detected by Mobster (0.60) and Jitterbug (0.27) compared to the other element types (see supplementary table 9). xTea, on the other hand, was better able to detect Alu (0.43) and SVA (0.44) elements than LINE1 (0.35). Below 0.2 almost no insertion was detected (supplementary figure 7). Meanwhile, the recall increased for insertions with higher VAFs (supplementary figure 8), up to a maximum recall of 0.75 for Mobster at 0.9 VAF. Jitterbug was the exception as its recall flattened above 0.4. Thus, xTea and Mobster have a notable higher detection accuracy than Jitterbug when detecting insertions above the VAF threshold of 0.2.
Figure 8: Recall and precision scores: The recall and precision were calculated from the number of true positives, false positives and true negatives. The scores for the individual scores are shown in blue. The other colors indicate either the intersection (orange), union (green) or union of pairwise intersections (red) between the prediction sets of the tools.
Union or intersections of the insertions of each tool did not strongly improve the recall or precision. The scores were either similar or at the cost of one and other. The recall increased up to 0.54 or 0.78 for the union of all three tools over all insertions or only the insertions at 0.9 VAF, respectively. As the precision was already high, the maximum was therefore 1.0. The union and intersection of xTea and Mobster, however, seemed to perform better than combinations with Jitterbug. Especially the union between xTea and Mobster seemed to produce the best trade-off between recall and precision (highest F1 score). However, to counteract unexpected false positives in the real data intersecting the predicted insertions of each tool is necessary. Therefore, to also limit the reduction in recall a two-out-three voting system was considered for further analysis. This approach still had slightly higher recall (0.42) than the intersection of xTea and Mobster alone (0.37).
Table 1: Benchmarking accuracy scores: The recall is calculated as the fraction of the true positives (TP) of the total number of true positives and false positives (FP). The precision is based on the ratio of the TP of the total number of false negatives (FN) and FPs. The F1 is the harmonic mean between the two scores: 2*((precision*recall)/(precision+recall)).
Tools Overlap method
TP FP FN Recall Precision F1
Jitterbug - 527 441 2107 0.20 0.54 0.29
Mobster - 1342 69 1292 0.51 0.95 0.66
xTea - 1071 34 1563 0.41 0.97 0.57
Mobster+Jitterbug intersection 494 3 2140 0.19 0.99 0.32 Mobster+Jitterbug union 1375 507 1259 0.52 0.73 0.61 Mobster+xTea intersection 979 0 1655 0.37 1.00 0.54 Mobster+xTea union 1434 103 1200 0.54 0.93 0.69 xTea+Jitterbug intersection 380 1 2254 0.14 0.997 0.25 xTea+Jitterbug union 1218 474 1416 0.46 0.72 0.56 Mobster+xTea
intersection 368 0 2266 0.14 1.00 0.25 Mobster+xTea
union 1455 540 1179 0.55 0.73 0.63
two out of three
1117 4 1517 0.42 0.996 0.59
Using the old version of Mobster or the somatic mode of xTea did not change any of these outcomes noteworthily. Most notably is the reduction in the number of false positives from 130 to 69 for Mobster (see supplementary table 10). The tumor mode of xTea also had slightly higher recall (0.41 vs 0.396) but lower precision (0.97 vs 1.0) compared to the somatic mode.
Next, the prediction accuracy of the tools for the breakpoints, TSD length and VAF were expressed in terms of differences and mismatches between their predicted and actual values. These differences (table 2) are small between xTea and Mobster but large compared to Jitterbug. Jitterbug has, for example, significantly higher differences than xTea and
Mobster when predicting both breakpoints (P<0.001, Mann-Whitney U test), but the tool does not predict the TSD length or the VAF. The new version of Mobster also displayed smaller differences in the start breakpoint compared to the old version (P<0.001, Mann- Whitney U test). Compared to xTea, Mobster had higher VAF differences (P=6.37E-12, unpaired t-test), similar differences for start and end breakpoints (Mann-Whitney U test, P=0.28 and P=0.90) and lower differences in TSD length (P<0.001, Mann-Whitney U).
However, the predicted and actual VAFs for both tools vary widely and are often underestimated, especially for higher VAFs (supplementary figure 11).
Table 2: Differences in MEI features compared to the benchmark: The values displayed here are based on all true positive calls and are used for determining statistical differences between the features. Predictions for the end breakpoint, TSD length and the VAF are missing for the old version of Mobster, as these features were not implemented yet. For Jitterbug, TSD prediction is not implemented and VAF prediction was not functional. “(TSD)” refers to the number of insertions with matching TSD type which were used to calculate the TSD length differences.
Tool n Start breakpoint
End breakpoint difference
TSD length difference
median | mean (SD) median | mean (SD) median | mean (SD)
median | mean (SD)
Jitterbug 529 4 | 10.58 (17.29) 6 | 12.03 (16.19) - -
1350 1140 (TSD)
1 | 5.08 (13.06) - - -
1342 1135 (TSD)
0 | 3.87 (12.51) 0 | 3.93 (13.92) 0 | 1.01 (2.85) 0.17 | 0.18 (0.09)
0 | 0.83 (2.05) 0 | 0.89 (2.31) 1 | 1.21 (1.58) 0.145 | 0.16 (0.1)
Next, we studied the mismatch rate for the ME and TSD type. The ME type could be identified reliably in most cases, while the TSD type was still misidentified on some
occasions. Mobster misclassified the ME type for 5 (0.004%) and Jitterbug for 20 insertions (4%, mainly SVA as LINE1) (supplementary table 12). xTea identified all ME types correctly.
When predicting the TSD type, Mobster misidentified the TSD type in 207 of the cases (14%) and xTea in 77 of the cases (7%) (supplementary table 13). Ultimately, xTea and Mobster seemed similar in terms of accuracy, but they both outperformed Jitterbug in all fields. In addition, Jitterbug also displayed technical difficulties and did not report either the VAF or
TSD. Therefore, we eventually excluded Jitterbug during the cohort analysis. Moreover, we chose the newest version of Mobster over its older version due to less false positives, better ME feature prediction and having additional functionalities (somatic MEI detection, VCF output and TSD and VAF prediction). We preferred the tumor mode of xTea over its somatic mode, due to the slightly higher recall of the former. The increase in false positives would then be accounted for by taking the intersection between the prediction sets of the tumor mode of xTea and the newest version of Mobster.
Reported MEIs per tool
After benchmarking the tools on simulated data, we applied the tools to our cohort: xTea was run in tumor mode and Mobster in its newest version to get the MEI prediction sets for each sample. The reported number of predicted somatic insertions varied widely between and within cancer types and between xTea and Mobster (see figure 9). xTea had consistently more reported insertions than Mobster for all four cancer types (adjusted P=0.0039
(osteosarcoma), P<0.001 (other cancer types), Wilcoxon signed-rank test). Significant differences were also observed between the four cancer types for each tool (P=0.0036 for Mobster and P=0.0020 for xTea, Kruskal-Wallis test). Post-hoc testing with the Mann
Whitney U test revealed, however, that only osteosarcoma had significantly more insertions compared to Ewing sarcoma (adjusted P=0.0026 for Mobster and P=0.022 for xTea) and nephroblastoma (adjusted P=0.0056 and P=0.014) (see supplementary table 14). The variable and high number of MEIs thus warrants the necessity of taking the intersect between the two tools.
Figure 9: Somatic MEI counts for each cancer type and tool: Here the number of predicted somatic mobile elements insertions are shown for all patients per cancer type ad reported for each tool. Due to the left skewness of the data, the counts are displayed on a log scale.
Overlap between tools
In contrast to the sometimes high number of insertions reported by a single tool, we found that number of actual overlapping insertions between xTea and Mobster over all patients was low (235). 125 out of 235 insertions belonged to a single osteosarcoma patient (M520AAD). Many of the removed insertions were therefore likely false positives. Before manual verification, the number of insertions for most patients varied between 0-7 for each cancer type (see supplementary figure 15) and 61% (48/79) of the patients had at least one event. During verification, however, we observed that 73% (80/110) of the insertions among the patients (excluding the outlier) were either due to noise or were also observable in the germline sample. Contrarily, 99% (124/125) of the insertions of the outlier patient M520AAD were true somatic insertions. Eventually, only 25% (20/79) of all patients had more than one insertion for a total of 154 verified insertions. Excluding patient M520AAD, there were only 30 insertions ranging from 1 to 6 per patient (figure 10) and these numbers were
significantly higher for osteosarcoma than the other cancer types (see supplementary table 16). However, after Bonferroni correction only the Kruskal-Wallis test (P<0.001) and the Mann-Whitney U test between osteosarcoma and Ewing sarcoma remained significant (P=0.0079). Instead, it was therefore tested whether the occurrence of at least 1 insertion differed significantly between nephroblastoma (6/33=18%), embryonal rhabdomyosarcoma (6/15=40%), Ewing Sarcoma (0/20=0%) and osteosarcoma (8/11=73%). A chi-squared test did point to a possible significant difference (P<0.001). Ad hoc testing with Bonferroni corrected Fischer exact tests revealed that osteosarcoma was associated more than Ewing sarcoma (P<0.001) and nephroblastoma (P=0.024) with the occurrence of at least one MEI event. The same applies to embryonal rhabdomyosarcoma compared to Ewing sarcoma (P=0.019). It thus seems that osteosarcoma patients (8/11) are more affected by MEIs than the other cancer types.
Figure 10: Number of verified somatic MEIs across the tools for each cancer type: The number of overlapping somatic insertions within 100 bp between xTea and Mobster are shown here for each cancer type. The outlier patient M520AAD is not shown here.
ME and TSD types of overlapping insertions
Next, we studied the ME and TSD types for the verified insertions. As xTea had a lower number of type mismatches during benchmarking, its predictions were chosen over those of Mobster. Among the different cancer types, LINE1 insertions were seen most frequently in embryonal rhabdomyosarcoma and osteosarcoma, while Alu insertions occurred in similar numbers in nephroblastoma (figure 11). Contrarily, 112 insertions in patient M520AAD were Alu elements, while the other 12 were LINE1.
Figure 11: Predicted ME types of verified insertions: The ME types predicted by xTea are shown here.
Target site duplications, on the other hand, occurred most frequently among all insertions (134/154) (figure 12) and had a median length of 14 (supplementary figure 17). In 6 cases deletions occurred with a median length of 26. In the last 14 cases either nothing happened, or it was unknown, as xTea does not make this distinction.
Figure 12: Predicted TSD types of verified insertions: The TSD types predicted by xTea are shown here for the overlapping insertions. ‘no_tsd’ means neither a duplication or deletion was detected, either because it was not there or it was not detectable.
Discordant ME and TSD types of overlapping insertions
The ME and TSD types did not always match between xTea and Mobster for the verified insertions: All insertions had matching ME types for patient M520AAD, but 66% (20/30) of the insertions of the other patients had discordant ME types (supplementary table 18 and figure 19). The TSD type, on the other hand, matched for 131 out of 154 insertions (85%) (supplementary table 20 and figure 21).
These mismatches had unforeseen consequences. If Mobster’s predictions for the ME type were chosen over xTea’s, the number of Alu insertions would drop from 10 to 2, as 9 insertions were classified as Alu by xTea but as SVA (5) or LINE1 (4) by Mobster. If instead,
only matching insertions were considered, the number of Alu insertions would also be lower (1) (supplementary figure 22).
In addition, these mismatches seem to affect differences in the breakpoints between the two tools. The median differences in start and end breakpoints of all insertions dropped to 0 bp when a ME mismatch occurred (P<0.001, Mann Whitney U) (table 3). The same drop was seen when the TSD type was discordant (P<0.001, Mann Whitney U).
Table 3: Breakpoint differences depend on mismatches of the ME and TSD type: Here the verified insertions were grouped between matching and mismatching groups in terms of the ME and TSD type. Then, the differences between the predicted and actual breakpoints were determined for each group and compared.
Breakpoint Median (match)
p-value Adjusted p-value
ME Start 0.0 144 20 10 4.82E-4 9.64E-4
ME End 0.0 144 30.5 10 3.58E-8 7.17E-8
TSD Start 0.0 131 14 23 9.14E-17 1.83E-16
TSD End 0.0 131 15 23 3.05E-17 6.09E-17
Variant allele fraction of overlapping insertions
Besides the ME and TSD types, we also studied the variant allele fractions of the verified insertions (figure 13). The variant allele fraction of most insertions was below 0.6 for all cancer types, in particular for patient M520AAD (figure 14). However, it is likely that many insertions had higher allele fractions in reality, as the tools were seen to underestimate the VAF during benchmarking. Moreover, a low tumor purity of a particular sample would naturally also lead to a lower VAF than is actually the case. Many of these insertions could therefore be of biological significance to the cancer.
Figure 13: Variant allele fractions for the verified insertions: The distribution of the variant allele fraction is shown for the insertions of three cancer types here in a histogram. The distribution for the outlier patient is visualized separately with a KDE line.
Biological significance of somatic insertions
Next, we investigated whether the verified insertions had any biological significance for the developing tumor. Using binomial testing, the calculated versus the observed probabilities (supplemental figures 23 and 24) of an insertion landing in genic regions were not
significantly different for either patient M520AAD or the other patients (see supplementary tables 25 and 26 for the regions and p-values, binomial testing). Hence there is no
observable enrichment in genic regions. Nevertheless, out of 68 insertions occurring in
genes of all patients, we observed two insertions in intronic regions of cancer genes of patient M520AAD: TP53 and LRP1B at a VAF of 0.55 and 0.09, respectively.
Table 4: Features of the overlapping verified insertions and SVs. Genes in italics are cancer genes.
Patient label ME SV
chr21:24803252- 24803281 0.340319
chr10:97652440- 97652476 X
chr7:25598253- 25598282 0.306666 LINC03007
chr14:67064335- 67064364 X
chr8:25569145- 25569173 0.234787
chrX:11935059- 11935080 X
chr17:28655914- 28655943 0.309456 RPS12P28, SDF2
chr22:21572140- 21572169 X
chr22:44583337- 45944618 0.404622 Multiple M021AAB
chr10:97652477- 99124031 0.167242 Multiple M994AAB
chr21:17038582- 17457075 0.617712
RNU6-113P, NEK4P1, LINC01549
Next, the verified insertions were overlapped with structural variants. In total eight
insertions were within 100 bp of the breakpoints of a SV with 1-2 SVs occurring per patient:
4 translocations, 2 inversions, 1 duplication and 1 deletion (table 4). The frequency of the translocations was significantly higher than the overall frequency of translocations (0.082) in our cohort (P=0.0024, binomial testing). This could indicate that MEIs are more prone to cause translocations over other SVs, at least in osteosarcoma, as most of these SVs (5/8) and translocations (3/4) were observed in this cancer type. The three other SVs occurred in embryonal rhabdomyosarcoma patients: 2 in M021AAB and 1 in M117AAB. The SVs
observed here also affect cancer genes: The inversion in TP53 overlaps the earlier discussed insertion in patient M520AAD. In addition, the cancer related gene GPHN is overlapped by the non-MEI overlapping breakpoint of a translocation.
Linking SNVs, SVs and MEIs
Next, to find possible explanations for MEI activity, we tested whether SNVs or SVs in cancer related genes could explain the presence of at least one MEI. Some mutated cancer genes were significantly related to the presence of at least one MEI when doing it over all patients (supplementary table 27). However, after multiple testing correction this significance was lost. In addition, since some genes might be mutated in only a certain cancer type, this was likely a confounding factor. If the data was therefore stratified by the cancer type instead, all significant genes before correction were also lost, likely due to too low sample sizes.
Interestingly, RB1 was one of the more significant genes in both cases, because all 5 osteosarcoma patients with the RB1 mutation (including patient M520AAD) have at least one MEI.
The presence of mobile elements insertions has not been investigated before in a pan- cancer cohort of pediatric patients. In addition, detection of these insertions is difficult (Biological and technical difficulties of detection). The goal of this project was therefore two- fold: To investigate the presence of mobile element insertions in pediatric cancer (Biological findings) and to test the reliability of methods to detect them (Technical findings and
Based on our findings, somatic MEIs seem to mostly occur in low number as passenger mutations in a subset of patients with different solid pediatric tumors. For adult cancer, this has also been described in previous studies (Burns, 2017; Rodriguez-Martin et al., 2020). The only exception are epithelial cancers where MEIs seem to occur more frequently (Rodriguez- Martin et al., 2020). In our cohort, 25% of the 79 patients had 1-6 insertions after manual verification of the intersection between xTea and Mobster, except for the osteosarcoma patient M520AAD who had 124 insertions. Differences between cancer types were therefore small, but not neglectable: No MEIs were observed in Ewing sarcoma while osteosarcoma patients were the most affected (8/11 tumor samples had at least 1 MEI). This is
understandable given that osteosarcoma in general has a high somatic mutation rate tumor type compared to other childhood cancers and that NELL2-signaling could potentially suppress MEI activity in Ewing sarcoma (Smida et al., 2017; Jayabal et al., 2021). In addition, two previous studies also found similar numbers of MEIs in osteosarcoma but it was not specified whether they (exclusively) included pediatric patients (Rodriguez-Martin et al., 2020; C. Wang & Liang, 2022). However, one outlier osteosarcoma patient (M520AAD) in our cohort had 124 insertions which has not been described before.
The TSD types of the insertions were as expected, but the ME types were not: In accordance with earlier studies, target site duplications occurred more often than deletions and had a median length of 14 (Thung et al., 2014b; C. Wang & Liang, 2022). However, when inspecting the ME type, Alu insertions occurred more frequently than expected, even though LINE1 insertions were observed the most. In the adult pan-cancer study LINE1 elements formed 99% of all insertions with an identifiable ME type (Rodriguez-Martin et al., 2020). Such big differences were not observed in this study: Slightly more LINE1 insertions were seen in embryonal rhabdomyosarcoma and osteosarcoma, while LINE1 and Alu elements occurred around similar rates in nephroblastoma. Interestingly, in patient M520AAD the majority (112/124=90%) of the insertions were Alu elements. These differences between adult and pediatric cancer types could hypothetically be explained by the embryonic nature of the cancer types in our cohort: As Alu insertions happen more frequently in the germline (Burns, 2017; Lander et al., 2001), they might be favorited over LINE1 insertions earlier in life. As Alu and LINE RNA compete to bind the ORF2p protein, there might therefore be a hidden mechanism affecting this balance throughout life.
When looking into reasons for MEI activity, no somatic SNVs and SVs were
significantly associated to the presence of MEIs. Nevertheless, RB1 might be related to MEI activity. It was mutated in 5 osteosarcoma patients who all had at least one MEI and the protein has been linked to suppression of LINE1 activity before (Montoya-Durango & Ramos, 2011; Ramos et al., 2021). However, no further studies linked disruption of RB1 to
retrotransposition rates. In addition, outlier patient was among these 5, but had many more
insertions than the other 4 patients. It is therefore likely that something else that regulates MEI activity is (also) impacted in the outlier patient.
The insertions are unlikely to be relevant to the tumor. Some insertions did show high VAF up to 0.6. This is likely even higher due to the underestimation of the VAF by the tools during benchmarking and due to varying tumor purities. This could point to potential tumor drivers. However, no enrichment was observed in genic or non-genic regions, suggesting that mutations happened randomly. Still, MEIs were seen to cause structural variants which might affect more genes: 8 MEIs overlapped with structural variants, 4 of which were translocations of which 3 occurred in osteosarcoma. Therefore, MEIs might thus be more prone to be cause translocations in osteosarcoma than other SV types. In addition, the cancer genes TP53 and LRP1B were affected by a MEI in the osteosarcoma patient M520AAD. Both of these genes are often seen to be impacted by mutations in osteosarcoma (Sayles et al., 2019). GPHN was also affected by a SV overlapping a MEI in another
osteosarcoma patient (M713AAB) and by a MEI in a patient of the previous osteosarcoma study (C. Wang & Liang, 2022). Thus, despite most MEIs occurring as passenger mutations, osteosarcoma patients seem more readily affected by MEIs that are potential drivers. In addition, the effect of RB1 mutations on MEI activity and our outlier patient M520AAD warrants further exploration.
Technical findings and decisions
After benchmarking the tools, it became apparent that MEI detection is indeed challenging.
Individual tools had low recall over all insertions (up to 0.51) and moderate recall at the highest VAF (up to 0.75). However, Mobster and xTea did show high precision. Combinations of tools did not produce noteworthy improvements: The recall could only be increased up to 0.54 from 0.51. Nevertheless, when comparing the tools and their methods directly, an eventual decision was made for the most optimal combination of tools: The intersect between the newest version of Mobster and the tumor mode of xTea.
We excluded Jitterbug, as xTea and Mobster shared more insertions and had higher detection accuracy than Jitterbug in terms of recall, precision and ME type prediction. In addition, Jitterbug was not able to predict the TSD or VAF. The authors of Jitterbug admitted in a later study that they initially overestimated its recall: They measured a maximum recall of 0.50 on a 20x coverage human sample while they initially reported a recall of 0.9 using simulated data (Hénaff et al., 2015; Vendrell-Mir et al., 2019). The inferior performance of Jitterbug was likely partially due to its different method for determining the ME type: Instead of using a library of ME sequences it classified an insertion according to the ME type of the reference ME the discordant reads mapped to. However, a previous study that also
compared multiple MEI detection tools concluded that the ME detection algorithm had no influence on the relative performance between tools (Rishishwar et al., 2017). It is thus not entirely clear why Jitterbug was outperformed by the other tools by such a large margin.
We chose the newest version of Mobster over its old version, as it had less false positives (see table 1 and supplementary table 10) and better breakpoint detection during benchmarking and additional VAF prediction. xTea was still slightly better in predicting the TSD type, VAF and ME type. Therefore, we chose xTea’s predicted ME over Mobster’s when insertions overlapped in the real cohort.
For xTea, on the other hand, we preferred its tumor mode over its somatic mode due to the higher recall of the former: We wanted to maximize xTea’s recall as this would be limited by taking the intersect between the two tools. However, the tumor mode had a
lower precision and thus lead to a slight increase of tool-specific false positives for the benchmark (see table 1 and supplementary table 10) and potentially a large increase for the cohort: xTea (as well as Mobster) sometimes had unrealistically large numbers of insertions for some patients. By taking the intersect, however, all false positives were removed for the benchmark and a great reduction was seen for the cohort. Any germline insertions that were missed by both tools or other false positives were removed by manually verifying them afterwards. Therefore, taking the intersect and performing manual verification was necessary in our case to reduce the number of false positives.
Even though we opted for this approach, the most optimal combination of tools might depend on the setting and should be carefully considered: The best combination should not only be based on the results of a benchmark as we were not able to fully replicate these results in the real data. Therefore, observations in the real data, goal of the study and results of previous studies should additionally be considered. Previous
benchmarking studies concluded that due to low overall recall and low overlap between tools, union of the tools might be the optimal approach in some cases (Ewing, 2015;
Vendrell-Mir et al., 2019). In our case, however, the intersection is needed to reduce the false positives that arise due to the potential instability of (some) tumor samples and the need to distinguish germline from somatic insertions. In addition, this would also limit potential false positives that were reported for Mobster in higher coverage samples by previous studies (Chu et al., 2021; Rishishwar et al., 2017). Therefore, in cases similar to ours, especially when detecting somatic MEIs, our intersect approach might be best: Each tool’s settings could be configured to increase sensitivity, while their predictions could be intersected to prevent tool-specific false positives. Any remaining false positives (mainly germline insertions) can then be identified by manual verification of the predictions.
Limitations and future prospects
Nevertheless, our approach had several limitations. We might have underestimated the number of MEI events as the recall was penalized by taking the intersect between the tools.
On top of that, the actual recall might even be lower than what was expected based on the benchmarking: The benchmark dataset contained several biases which could have made insertions easier to detect, especially by Mobster. The simulated ME sequences were based on Mobster’s library sequences and did not consider the incomplete nature or transduced sequences of some MEIs. In addition, MEs were inserted into easier-to-detect regions: High signal or low mappability regions were excluded. BAMsurgeon might also have been more likely to fail to simulate an insertion in low coverage regions. The benchmark also does not simulate other co-occurring genetic alterations in tumor samples such as CNAs (copy number alteration), SVs and SNVs, which could generate false positives or make insertions harder to detect. Meanwhile xTea has more strict filtering and more sophisticated detection (e.g. transduced sequences), which could help it to better detect insertions in these cases.
Besides these aforementioned reasons, it is unclear how the fact that MEIs are simulated in high numbers and close proximity on a single chromosome might have impacted the recall or other benchmarking results. On the whole, there could thus in reality a bigger
discrepancy in performance between xTea and Mobster, resulting in an even greater
reduction in recall when taking the intersect. On the other hand, our benchmark might have produced unrealistic recall estimates as this heavily relies on the distribution of the VAFs of the insertions. In reality, the recall will therefore be lower if more insertions have lower allele fractions but higher if insertions have higher allele fractions.