VU Research Portal
Bioinformatic solutions for chromosomal copy number analysis in cancer
Scheinin, I.
2017
document version
Publisher's PDF, also known as Version of record
citation for published version (APA)
Scheinin, I. (2017). Bioinformatic solutions for chromosomal copy number analysis in cancer.
Contents
Abbreviations . . . 9
Abstract . . . 10
English . . . 10
Finnish . . . 11
Dutch . . . 12
1 Introduction 15
Chromosomal aberrations in cancer . . . 16
Challenges for data analysis of CNAs . . . 17
Aberration length and magnitude . . . 17
Ploidy, cellularity, and heterogeneity . . . 18
Review of literature for data analysis of CNAs . . . 19
Microarrays for genome-wide CNA detection . . . 21
Array laboratory process . . . 21
Array data and meta-data . . . 21
Copy number analysis of microarray data . . . 22
Preprocessing of microarray data . . . 22
Segmentation and calling of microarray data . . . 25
Next-generation sequencing for CNA detection . . . 28
Sequencing laboratory process . . . 28
NGS data and meta-data . . . 29
Approaches for copy number analysis by NGS . . . 29
Paired-end mapping methods . . . 30
Split-read methods . . . 30
Depth of coverage methods . . . 31
Assembly-based methods . . . 31
Combinatorial methods . . . 32
Copy number analysis of DOC data . . . 32
Preprocessing of DOC data . . . 32
Segmentation and calling of DOC data . . . 33
Downstream analyses of CNAs . . . 36
Regioning to reduce dimensionality . . . 36
Identification of recurrent aberrations . . . 37
Statistical tests for association with clinical data . . . 37
Clustering for subtype discovery . . . 40
Aims of this dissertation . . . 42
2 CanGEM: mining gene copy number changes in cancer
Scheinin et al. (2008) Nucleic Acids Research 36: D830–D835 59
3 CGHpower: exploring sample size calculations for chromosomal copy number experiments
Scheinin et al. (2010) BMC Bioinformatics 11: 331–340 67
4 DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly
Scheinin and Sie et al. (2014) Genome Research 24: 2022–2032 79
5 Spatial and temporal evolution of distal 10q deletion, a prognostically unfavorable event in diffuse low-grade gliomas
van Thuijl and Scheinin et al. (2014) Genome Biology 15: 471–483 91
6 Summary and discussion 105
Summary of the original publications . . . 106
CanGEM database for CNAs in cancer . . . 106
Clinical data . . . 106
Copy number analysis of microarray data . . . 106
Sample size calculations with CGHpower . . . 107
Copy number analysis and power calculations . . . 107
Diagnostic plots . . . 108
Copy number preprocessing with QDNAseq . . . 108
Correction to read counts and identification of problematic regions in the genome . . . 108
Performance evaluation . . . 109
CNAs in low-grade gliomas . . . 109
Associations between CNAs and survival . . . 109
Evolving picture of glioma classification . . . 110
Discussion . . . 111
Academic software development . . . 111
Bioinformatics software developed for this dissertation . . . 112
Conclusions . . . 119
References . . . 120
Full list of publications . . . 127
Acknowledgments . . . 129