University of Groningen Core gene identification using gene expression Claringbould, Annique


Academic year: 2021


University of Groningen

Core gene identification using gene expression

Claringbould, Annique

DOI:

10.33612/diss.145227875

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date:

2020

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Claringbould, A. (2020). Core gene identification using gene expression. University of Groningen.

https://doi.org/10.33612/diss.145227875

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.


Role of genetics in complex traits

Let me start with a cliché but nonetheless true statement: everyone is unique. People differ with respect to their lifestyle, personality, beliefs, looks, and health. The observable characteristics of an individual, known as their phenotype, arise from a combination of environmental and genetic factors (or genotype). Genetic information is stored in DNA, and, while humans share 99.9% of their DNA, the variation in the remaining parts influences everything from height to disease risk.

Indeed, many thousands of diseases and disorders are (partially) determined by genetic factors (Buniello et al., 2019). On one end of the spectrum are monogenic or Mendelian conditions, which are caused by mutations in a single gene. In the most extreme case, that of highly penetrant autosomal dominant mutations, a child born with such a mutation has a high chance of developing the disease. On the other end are complex diseases, where a combination of environmental factors and up to thousands of genetic risk alleles together increase the risk of developing the disease, without being deterministic. For example, someone may have a high genetic risk for coronary artery disease, yet never develop any heart problems. While there are many types of genetic variation, including copy-number variation, insertions, deletions and translocations, single nucleotide polymorphisms (SNPs) are the most widely studied class and make up the majority of known genetic variation.

In the last two decades, the costs of SNP genotyping experiments have decreased dramatically. As a result, it is now possible to compare groups of individuals with and without a complex disease and find out if any commonly occurring SNPs are more prevalent in one of the groups. Testing for association between disease status and the genotype of many genetic variants simultaneously is called a genome-wide association study (GWAS). The earliest successful GWAS included genotypes of 96 individuals with age-related macular degeneration and 50 healthy control subjects. The researchers investigated whether any of the ~116,000 tested SNPs were more often present in the case group as compared to controls (Klein et al., 2005). Since that early study, the number of tested SNPs and sample sizes have gone up substantially: recent GWAS include over a million individuals, while studying millions of SNPs simultaneously. Such large-scale studies can identify genome-wide hits for hundreds of independent SNPs (Evangelou et al., 2018; Lee et al., 2018; Nielsen et al., 2018; Timmers et al., 2019). Aside from diseases, other complex traits, such as height, dietary preferences and educational attainment, are now regularly being investigated in large biobanks like UK Biobank and Lifelines (Stolk et al., 2008; Bycroft et al., 2018).
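The case-control comparison described above can be illustrated for a single SNP. The following is a minimal sketch, assuming a simple allelic chi-square test on a 2x2 table of allele counts; the function name and the counts are hypothetical, and real GWAS typically use regression models with covariates instead. The Bonferroni threshold in the example is likewise only one possible way to account for testing many SNPs.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Rows are cases and controls, columns are alternative and reference
    alleles. Returns the chi-square statistic and its p-value.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts for one SNP (two alleles per individual).
    chi2, p = allelic_chi_square(case_alt=120, case_ref=72,
                                 ctrl_alt=40, ctrl_ref=60)
    n_tests = 116_000  # assumed number of SNPs tested, for illustration
    print(f"chi2 = {chi2:.2f}, p = {p:.3g}, "
          f"Bonferroni threshold = {0.05 / n_tests:.3g}")
```

Repeating such a test for every genotyped SNP, with a correction for the number of tests, gives the basic shape of a genome-wide scan.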

A number of factors influence the success of a GWAS: the study sample size, the genetic architecture of the trait (i.e. the allele frequency and effect size distribution of causal alleles), which SNPs are being tested, the phenotypic heterogeneity of the trait, and how many of the trait-associated SNPs are present in the study population (Visscher et al., 2017). A successful GWAS identifies multiple significantly associated SNPs and provides an unbiased estimate of their effect sizes. GWAS are hypothesis-free: you test all available SNPs for association with a disease, without selecting loci of interest based on prior information.

The combined effect of all genetic factors on the variability in the phenotype is called heritability. Heritability can be estimated in a number of ways, for example by looking at phenotypic similarities between monozygotic twins (i.e. with the exact same DNA) who were raised apart. As a result, we now have a pretty good estimate of the heritability for many complex traits. SNP heritability is the proportion of genetic heritability of a trait that can be explained by the information contained in the genotyped and imputed SNPs. SNP heritability estimates are typically lower than full heritability because genotyping platforms interrogate only a million SNPs directly and therefore do not capture all existing SNPs. On top of that, other classes of genetic variation are not taken into account when calculating SNP heritability. SNP heritability estimates also depend heavily on the method of calculation (Hou et al., 2019). For a number of years, GWAS would only identify a handful of SNPs per trait, and their combined effect did not add up to the estimated heritability (Maher, 2008; Manolio et al., 2009). This problem of ‘missing heritability’ was partly resolved first by simulation (Yang et al., 2010) and later by using real data (Wainschtein et al., 2019) on height, a widely used model complex trait. Height is highly polygenic and traditional heritability estimates are very accurate because height can be quantified precisely and cheaply. These studies show that by combining information from all SNPs, and not just the statistically significant ones, the majority of estimated heritability can be explained.

Because complex traits are influenced by many genetic factors, individual SNPs or genetic loci generally play a small role in the development of these traits. However, if you sum up the number of risk alleles from all associated SNPs, you can calculate a polygenic score (PGS) for each individual who has been genotyped. A PGS reflects the genetic ‘risk’ of developing a disease or trait: it is the best predictor of a phenotype based on only genetic information. As more trait-associated SNPs are identified, and their effect sizes estimated more accurately, a PGS becomes better at predicting a phenotype. If you can successfully predict a phenotype based on genotype, you can use PGSs to identify individuals with an increased risk as compared to the general population. Patient stratification into high and low risk groups based on their PGS is gaining traction, and these approaches will likely be used to improve clinical care in the near future (Khera et al., 2018).
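The summation behind a PGS can be sketched in a few lines. This is an illustration only: the function name and the example values are hypothetical, and the optional weighting of each risk allele by its estimated GWAS effect size is a common refinement of the plain risk-allele count described above, added here as an assumption.

```python
def polygenic_score(dosages, weights=None):
    """Sum risk-allele dosages (0, 1 or 2 per SNP) into a single score.

    With weights=None every risk allele counts equally (a plain count).
    With weights, each SNP contributes dosage * effect size.
    """
    if weights is None:
        weights = [1.0] * len(dosages)
    if len(weights) != len(dosages):
        raise ValueError("need exactly one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


if __name__ == "__main__":
    # Hypothetical individual genotyped at four trait-associated SNPs.
    dosages = [0, 1, 2, 1]
    effect_sizes = [0.05, 0.12, 0.03, 0.08]  # assumed per-allele effects
    print("risk-allele count:", polygenic_score(dosages))
    print("weighted score:", polygenic_score(dosages, effect_sizes))
```

Scores computed this way are only meaningful relative to other individuals scored with the same SNPs and weights, which is why stratification compares a person's PGS with the population distribution.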

In summary, GWAS have been tremendously useful for obtaining unbiased SNP estimates, gaining information on the genetic architecture and heritability of a trait, and predicting phenotypes from genotype through PGS. However, there is one big caveat in these estimates: most GWAS are performed on individuals of European descent (Martin et al., 2019). Genetic loci segregate differently across populations, which means that SNPs may have different allele frequencies and different effect sizes in non-European populations. As a result, many trait-associated SNPs that are more prevalent in non-European individuals remain unidentified (Wojcik et al., 2019). Moreover, the efficiency of phenotype prediction and the subsequent successful clinical implementation of PGS depend heavily on the population used to identify the trait-associated SNPs, emphasizing the need for more diversity in GWAS populations (Duncan et al., 2019; Martin et al., 2019).

Nevertheless, GWAS have identified thousands of disease-linked genetic variants (Buniello et al., 2019). These variants have provided some insight into the development of diseases, but two problems hamper better understanding of the molecular mechanisms leading to the phenotype.

First, genetic variants do not operate in a vacuum. Recombination (where chromosomes of a pair break in the same place, cross over and recombine with the other chromosome) causes parts of the DNA to segregate independently. Conversely, genetic variants within the same block of DNA will be linked and therefore contain overlapping information. If two variants always segregate together in a given population, they are said to be in complete linkage disequilibrium (LD). In GWAS, association signals are often attributed to a genetic locus that contains multiple genetic variants. One of those SNPs will be most significantly associated with the trait under investigation, but prioritising a single causal variant per locus is not straightforward because of LD patterns. On top of that, there might be multiple independent causal variants per locus. Knowing the causal variant(s) would allow investigation into the precise mutation that contributes to disease. Statistical fine-mapping strategies have had some success in identifying causal variants for several common diseases (Maller et al., 2012; Farh et al., 2015; Westra et al., 2018). Recently, it has also become feasible to introduce single-base edits into the genome in vitro using a combination of components of the bacterial CRISPR system (Rees and Liu, 2018). Using this technique, it is possible to confirm a putative causal variant by experimentally validating its effect on expression. Prioritisation using a statistical framework followed by an experimental read-out of the effect of the SNP in cells will likely result in a reliable characterization of the causal variant(s) in a locus.
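To make the notion of LD concrete, the sketch below computes the squared correlation r² between two biallelic variants from phased haplotypes. It is an illustration under stated assumptions: alleles are coded 0/1, the haplotypes are hypothetical, and r² is used as one common LD measure (the text above does not prescribe a specific one). An r² of 1 corresponds to two variants that always segregate together.

```python
def ld_r_squared(haplotypes):
    """Squared correlation (r^2) between two biallelic variants.

    haplotypes: list of (allele_a, allele_b) tuples, one per chromosome,
    with each allele coded as 0 (reference) or 1 (alternative).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D measures how much the joint frequency deviates from independence.
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both variants must be polymorphic")
    return d * d / denominator


if __name__ == "__main__":
    always_together = [(0, 0), (1, 1), (0, 0), (1, 1)]
    independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print("complete LD:", ld_r_squared(always_together))
    print("no LD:", ld_r_squared(independent))
```

When r² between the lead SNP and its neighbours is close to 1, association statistics alone cannot distinguish them, which is the fine-mapping difficulty described above.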

While these methods may prioritise genetic variants and loci, they do not necessarily point to the causal gene. The large majority of GWAS SNPs are located in non-coding regions of the genome, meaning outside gene bodies or in the intronic regions of genes (Edwards et al., 2013). [...] link GWAS SNPs to causal genes directly is the second reason that it is challenging to obtain a molecular understanding of a phenotype from association statistics. In this dissertation, I use gene expression information to prioritise likely causal genes in order to take a step in the direction of molecular insights into disease.

Linking genetic information to gene expression

Gene expression, or gene activity, is a measure of how often a gene is being transcribed from DNA to RNA within a cell or tissue. RNA levels can be measured using a microarray chip or using RNA-sequencing (RNA-seq). While the microarray chip is less expensive, allowing for gene expression profiles from more samples, it requires a priori knowledge of the transcripts you expect in the sample. Nowadays, the RNA-sequencing approach is more commonly used as it provides information on the number of each RNA transcript as well as on the sequence of these transcripts.

RNA transcripts are translated into proteins, the building blocks of a cell. The unidirectional informational flow from DNA to RNA to protein is often referred to as the central dogma (Strachan and Read, 2011). Because a liver cell needs to function differently from a lung cell, they will have activated different transcriptional programs that control the level of RNA expression and the subsequent translation to proteins (Ardlie et al., 2015). These transcriptional programs are broadly similar between the liver cells of all individuals. As such, it is possible to find differentially expressed genes by comparing the average gene expression levels of healthy individuals with those of patients (Costa et al., 2013). Because these genes are specifically up- or downregulated in patients, they may give clues to the aetiology of the disease.

Common genetic variants, such as those identified in GWAS, can also affect gene expression levels (Cheung and Spielman, 2002). If a genetic variant affects the level of expression, the combination of the SNP and the gene it influences is called an expression quantitative trait locus (eQTL). Following the ‘central dogma’ sequence, such up- or downregulation of gene expression may lead to altered protein levels (Liu and Aebersold, 2016) and ultimately influence the onset of a disease. Many individuals who carry risk alleles for common disease do not go on to develop that disease, but we can still observe effects on gene expression in these people. As such, investigating the influence of genetic variation on gene expression levels in healthy individuals can help to understand the biological mechanisms at play in disease.

QTL mapping was initially introduced in plant and animal research to identify genetic loci that influence quantitative traits of economic importance for breeding programs (Young, 1996; Jansen and Nap, 2001; Haley, 2002). The advent of both genotyping and expression profiling technologies launched investigations into the link between genetic variation and expression levels in humans in the context of disease (Cheung et al., 2003; Dixon et al., 2007; Emilsson et al., 2008). eQTL mapping is very similar to GWAS in procedure. You test for association between genotypes and phenotypes, treating the expression level of a given gene as the phenotype of interest. As a consequence, a genome-wide association study can be conducted for each gene. A significant eQTL can be visualized as a box plot showing the expression level of a gene across individuals with homozygous reference, heterozygous and homozygous alternative genotypes, respectively (Figure 1.1, lower panels).
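A single eQTL test can be sketched as a regression of expression on genotype. The example below is a minimal illustration, assuming genotypes are coded as alternative-allele dosages (0, 1, 2) and that an ordinary least-squares fit is used; the function name and the data are hypothetical, and actual eQTL pipelines add normalisation, covariates and multiple-testing correction.

```python
import math


def eqtl_regression(dosages, expression):
    """Least-squares fit of expression on genotype dosage (0, 1, 2).

    Returns (slope, intercept, r): the per-allele effect on expression,
    the expected expression for dosage 0, and the Pearson correlation.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matching lists with at least 3 samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    s_gg = sum((g - mean_g) ** 2 for g in dosages)
    s_ee = sum((e - mean_e) ** 2 for e in expression)
    s_ge = sum((g - mean_g) * (e - mean_e)
               for g, e in zip(dosages, expression))
    if s_gg == 0:
        raise ValueError("genotype does not vary across samples")

    slope = s_ge / s_gg
    intercept = mean_e - slope * mean_g
    r = s_ge / math.sqrt(s_gg * s_ee) if s_ee > 0 else 0.0
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical samples: AA = 0, AC = 1, CC = 2 copies of the C allele.
    dosages = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
    expression = [5.1, 4.8, 6.0, 6.3, 5.9, 7.2, 6.8, 5.0, 6.1, 7.1]
    slope, intercept, r = eqtl_regression(dosages, expression)
    print(f"effect per C allele = {slope:.2f}, r = {r:.2f}")
```

The slope corresponds to the trend across the three genotype groups that a box plot of expression per genotype makes visible.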

A SNP that influences the gene expression level of a nearby gene is called a cis-eQTL (Figure 1.1, left panel). Cis-eQTL SNPs are thought to often directly influence gene expression. For example, the genotype at the cis-eQTL SNP location could influence the binding affinity of the RNA polymerase protein complex and thereby impact the expression of the gene. Indeed, cis-eQTL SNPs are often located within the gene body or promoter region of the gene (Chapter 6). It has recently also become possible to observe indirect, or trans, effects, where the SNP is located far away from the gene, and the SNP and gene can even be located on different chromosomes (Westra et al., 2013; Figure 1.1, right panel). A typical scenario for how a trans-eQTL could arise is that the SNP locally affects the expression of a transcription factor (TF) and as a result all the genes regulated by the TF protein are up- or downregulated (i.e. they have higher or lower gene expression).
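The distinction between cis and trans can be expressed as a simple rule on genomic positions. The sketch below is illustrative only: the text above speaks of "nearby" and "far away" without giving a cut-off, so the 1 Mb window used as a default here is an assumption, as are the function name and the coordinates.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_pos,
                  cis_window_bp=1_000_000):
    """Label a SNP-gene pair as 'cis' or 'trans' by genomic distance.

    cis_window_bp is an assumed cut-off (1 Mb by default), not a value
    taken from the text. Pairs on different chromosomes are always trans.
    """
    same_chromosome = snp_chrom == gene_chrom
    if same_chromosome and abs(snp_pos - gene_pos) <= cis_window_bp:
        return "cis"
    return "trans"


if __name__ == "__main__":
    # Hypothetical coordinates in base pairs.
    print(classify_eqtl("1", 1_200_000, "1", 1_250_000))  # nearby gene
    print(classify_eqtl("1", 1_200_000, "7", 5_000_000))  # other chromosome
```

Studies differ in the exact windows they use, and some leave a gap between the cis and trans definitions, so the threshold should be treated as a parameter of the analysis.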

[Figure 1.1. Left panels: cis-eQTL; right panels: trans-eQTL. Top: location of the SNP relative to the gene. Bottom: gene expression per SNP genotype (AA, AC, CC).]

Figure 1.1 Cis- and trans-expression quantitative trait loci (eQTLs). An eQTL is the combination of a single nucleotide polymorphism (SNP) and the gene whose expression it influences. Cis-eQTLs are characterised by the small distance between the SNP and the gene, indicating they are part of one regulatory unit with a direct effect (top left). The effect of the SNP genotype on the gene expression is usually large in cis-eQTLs (bottom left). In trans-eQTLs, the SNP has an indirect effect on the gene and may therefore be located on a different part of the genome (top right). As a result, the effect sizes of trans-eQTLs are usually much smaller (bottom right).


Understanding gene regulation is useful on its own, but it becomes more relevant when it also has implications for the development of (complex) diseases. Cis-eQTLs describe gene expression regulation locally, while trans-eQTLs represent the indirect and more downstream effects of variations in SNP genotype. Such downstream regulation is likely to be more informative for disease, because trans-eQTLs often converge onto a smaller set of genes that are at the core of a disease (Chapter 6, Vuckovic et al., 2020). However, trans-eQTL effects have such small effect sizes that large sample sizes are required to detect them (Westra et al., 2013). In order to use knowledge of gene expression regulation to find the trait-relevant genes, it is thus important to assess a suitably large number of individuals.

However, it is also important to look at the right tissue. Gene expression differs substantially between tissues and cell types, in line with their various functions (Hsiao et al., 2002). This difference in expression levels does not necessarily imply that the genetic regulation driven by common genetic variation must also vary across tissues: a gene can be expressed at different levels in different tissues yet always be downregulated in the presence of a particular SNP. The Genotype-Tissue Expression (GTEx) consortium compares gene expression across 49 tissues from a few hundred individuals to investigate to what extent gene regulation is shared across tissues (Aguet et al., 2017). We now know that cis-eQTLs are often shared between tissues and their effect sizes are highly correlated (Aguet et al., 2017; Qi et al., 2018), but there are examples of cis-eQTLs having opposing effects between blood cell types (van der Wijst et al., 2018). Trans-eQTLs, on the other hand, are estimated to be less shared (Chapter 6, Aguet et al., 2017), which could be explained by tissue-specific enhancer or TF activity. However, the small amount of overlap in trans-eQTLs could also be a result of the currently limited power to detect these effects in non-blood tissues.

The aim of this thesis is to identify and prioritise complex trait genes using gene expression data. A number of methodological considerations regarding context-specificity arise when using gene expression data for this purpose. Therefore, the first chapters are dedicated to methodologies in gene-expression-based gene discovery. The subsequent large-scale gene discovery studies use those methodological insights to account for tissue and cell-type specificity and zoom in on core genes for complex traits.

In Chapter 2, we compare the effect of genetic variants on molecular traits, like gene expression and DNA methylation, with the impact on complex diseases. While some SNPs have large effects on local gene expression and methylation, this often does not translate to large effects on disease. In this chapter, we show that distal effects are smaller, and we hypothesise that they are closer to disease on a scale of trait-complexity.


Chapter 3 is a comparison of the effect of methodological choices when associating gene expression or DNA methylation to phenotypes. By testing one change from the basic analysis settings at a time, this chapter gives recommendations for the normalisation, covariate correction, and statistical tests that give the most robust results in epigenome- and transcriptome-wide association studies (EWAS and TWAS, respectively).

Chapter 4 describes the prioritisation of age-related genes in blood. We observe that an uncorrected association analysis between age and gene expression gives rise to associations that are driven by cell composition in blood rather than intracellular aging mechanisms, because cell proportions in blood are known to vary with age as well. Correcting for cell proportions allows for detection of genes that more likely play a role in aging.

In Chapter 5, we develop a new method, MR-link, to assign putative local causal genes for complex traits. Mendelian Randomisation (MR) is a statistical technique to move from correlation to causation. The technique can prioritise genes when applied to gene expression, but it suffers from confounding by LD patterns and pleiotropy. MR-link addresses these issues and is able to retrieve known causal genes for low-density lipoprotein levels.
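The basic idea of MR applied to gene expression can be illustrated with the textbook single-instrument Wald ratio. To be clear, this is not MR-link, whose details are not given here; it is a generic sketch in which one eQTL SNP serves as the instrument, and the function name, the effect sizes and the first-order standard error are all assumptions for illustration.

```python
def wald_ratio(beta_outcome, beta_exposure, se_outcome=None):
    """Single-instrument Wald ratio estimate of a causal effect.

    beta_exposure: effect of the SNP on the exposure (e.g. an eQTL effect).
    beta_outcome:  effect of the same SNP on the outcome trait (GWAS).
    se_outcome:    optional standard error of beta_outcome; if given, a
                   first-order standard error is returned that ignores
                   the uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must affect the exposure")
    estimate = beta_outcome / beta_exposure
    se = None if se_outcome is None else se_outcome / abs(beta_exposure)
    return estimate, se


if __name__ == "__main__":
    # Hypothetical summary statistics for one SNP, one gene and one trait.
    estimate, se = wald_ratio(beta_outcome=0.10, beta_exposure=0.50,
                              se_outcome=0.02)
    print(f"estimated effect of expression on trait = {estimate:.2f} "
          f"(SE {se:.2f})")
```

A ratio of this kind is only interpretable as causal if the SNP influences the trait exclusively through the gene's expression, which is exactly where LD with other causal variants and pleiotropy, the issues named above, cause problems.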

Chapter 6 is the largest study thus far linking genetic variation to gene expression levels in blood. We collected data from 31,684 individuals to identify cis-eQTLs, trans-eQTLs, and associations between PGSs and expression. We show that nearly all genes (88%) are locally regulated and that distal regulation by trait-associated SNPs is very prevalent. Although these trans-eQTLs are smaller and indirect, SNPs related to one trait often converge on a set of genes that is relevant for that trait. In line with that, we find that combined genetic risk (PGS) significantly associates with the expression levels of driver genes. Most results pertain to blood-related traits like autoimmune diseases or cell counts. However, replication in other tissues, purified cell types, and single-cell RNA-seq indicates that a large fraction of these trans-eQTLs is not driven by blood cell composition.

In Chapter 7, we take a different approach to finding the most relevant genes to a trait: we integrate summary statistics from GWAS with gene co-expression patterns across many different tissues. We hypothesise that genes that are co-regulated with many GWAS genes should play a central role in the development of the disease or trait. We use height and inflammatory bowel disease as two model traits to show that these co-regulated genes are enriched for trait-related Mendelian disease genes and are thus likely to be core genes.

Finally, Chapter 8 describes the recently proposed omnigenic model. In it, I put each chapter of this thesis in the context of this model and discuss the implications for future research into core gene identification for complex traits.


Glossary & abbreviations

Central dogma: The process where genetic information flows unidirectionally from DNA to RNA to protein

Complex disease: Condition that is caused by the interplay between many genetic variants and environmental factors

DNA: Deoxyribonucleic acid

eQTL: Expression quantitative trait locus

EWAS: Epigenome-wide association study

Genetic architecture: Frequency and effect sizes of trait-associated variants

Genotype: The genetic makeup of an individual

GWAS: Genome-wide association study

LD: Linkage disequilibrium

Mb: Megabases

Monogenic condition: Condition that is caused by a mutation in one gene

PGS: Polygenic score

Phenotype: Observable characteristics of an individual

pQTL: Protein quantitative trait locus

Recombination: Exchange of genetic material across chromosomes

RNA: Ribonucleic acid

RNA-seq: RNA sequencing

SNP: Single nucleotide polymorphism

TF: Transcription factor


References

Aguet, F. et al. (2017) ‘Genetic effects on gene expression across human tissues’, Nature. Nature Research, 550(7675), pp. 204–213. doi: 10.1038/nature24277.

Ardlie, K. G. et al. (2015) ‘The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans’, Science. American Association for the Advancement of Science, 348(6235), pp. 648–660. doi: 10.1126/science.1262110.

Buniello, A. et al. (2019) ‘The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019’, Nucleic Acids Research. Available at: https://www.ebi.ac.uk/gwas/ (Accessed: 17 February 2020).

Bycroft, C. et al. (2018) ‘The UK Biobank resource with deep phenotyping and genomic data’, Nature. Nature Publishing Group, 562(7726), pp. 203–209. doi: 10.1038/s41586-018-0579-z.

Cheung, V. G. et al. (2003) ‘Natural variation in human gene expression assessed in lymphoblastoid cells’, Nature Genetics. Nature Publishing Group, 33(3), pp. 422–425. doi: 10.1038/ng1094.

Cheung, V. G. and Spielman, R. S. (2002) ‘The genetics of variation in gene expression’, Nature Genetics, pp. 522–525. doi: 10.1038/ng1036.

Costa, V. et al. (2013) ‘RNA-Seq and human complex diseases: Recent accomplishments and future perspectives’, European Journal of Human Genetics. Nature Publishing Group, pp. 134–142. doi: 10.1038/ejhg.2012.129.

Dixon, A. L. et al. (2007) ‘A genome-wide association study of global gene expression’, Nature Genetics. Nature Publishing Group, 39(10), pp. 1202–1207. doi: 10.1038/ng2109.

Duncan, L. et al. (2019) ‘Analysis of polygenic risk score usage and performance in diverse human populations’, Nature Communications. Nature Publishing Group, 10(1). doi: 10.1038/s41467-019-11112-0.

Edwards, S. L. et al. (2013) ‘Beyond GWASs: Illuminating the dark road from association to function’, American Journal of Human Genetics. Cell Press, pp. 779–797. doi: 10.1016/j.ajhg.2013.10.012.

Emilsson, V. et al. (2008) ‘Genetics of gene expression and its effect on disease’, Nature. Nature Publishing Group, 452(7186), pp. 423–428. doi: 10.1038/nature06758.

Evangelou, E. et al. (2018) ‘Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits’, Nature Genetics. Nature Publishing Group, 50(10), pp. 1412–1425. doi: 10.1038/s41588-018-0205-x.

Farh, K. K. H. et al. (2015) ‘Genetic and epigenetic fine mapping of causal autoimmune disease variants’, Nature. Nature Publishing Group, 518(7539), pp. 337–343. doi: 10.1038/nature13835.

Haley, C. (2002) ‘Quantitative Trait Loci Analysis in Animals’, Heredity. Springer Nature, 88(6), pp. 486–486. doi: 10.1038/sj.hdy.6800068.

Hou, K. et al. (2019) ‘Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture’, Nature Genetics. Nature Publishing Group, p. 1. doi: 10.1038/s41588-019-0465-0.

Hsiao, L. L. et al. (2002) ‘A compendium of gene expression in normal human tissues’, Physiological Genomics. American Physiological Society, 2002(7), pp. 97–104. doi: 10.1152/physiolgenomics.00040.2001.

Jansen, R. C. and Nap, J. P. (2001) ‘Genetical genomics: The added value from segregation’, Trends in Genetics. Elsevier Ltd, pp. 388–391. doi: 10.1016/S0168-9525(01)02310-1.

Khera, A. V. et al. (2018) ‘Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations’, Nature Genetics, 50, pp. 1219–1224. doi: 10.1038/s41588-018-0183-z.

Klein, R. J. et al. (2005) ‘Complement factor H polymorphism in age-related macular degeneration’, Science. NIH Public Access, 308(5720), pp. 385–389. doi: 10.1126/science.1109557.

Lee, J. J. et al. (2018) ‘Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals’, Nature Genetics. Nature Publishing Group, 50(8), pp. 1112–1121. doi: 10.1038/s41588-018-0147-3.

Liu, Y. and Aebersold, R. (2016) ‘The interdependence of transcript and protein abundance: new data–new complexities’, Molecular Systems Biology, 12(1), p. 856. doi: 10.15252/msb.20156720.

Maher, B. (2008) ‘Personal genomes: The case of the missing heritability’, Nature. Nature Publishing Group, pp. 18–21. doi: 10.1038/456018a.

Maller, J. B. et al. (2012) ‘Bayesian refinement of association signals for 14 loci in 3 common diseases’, Nature


Manolio, T. A. et al. (2009) ‘Finding the missing heritability of complex diseases’, Nature. NIH Public Access, pp. 747–753. doi: 10.1038/nature08494.

Martin, A. R. et al. (2019) ‘Clinical use of current polygenic risk scores may exacerbate health disparities’, Nature Genetics. Nature Publishing Group, 51(4), pp. 584–591. doi: 10.1038/s41588-019-0379-x.

Nielsen, J. B. et al. (2018) ‘Biobank-driven genomic discovery yields new insight into atrial fibrillation biology’, Nature Genetics. Nature Publishing Group, pp. 1234–1239. doi: 10.1038/s41588-018-0171-3.

Qi, T. et al. (2018) ‘Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood’, Nature Communications. Nature Publishing Group, 9(1), p. 2282. doi: 10.1038/s41467-018-04558-1.

Rees, H. A. and Liu, D. R. (2018) ‘Base editing: precision chemistry on the genome and transcriptome of living cells’, Nature Reviews Genetics. Nature Publishing Group, pp. 770–788. doi: 10.1038/s41576-018-0059-1.

Stolk, R. P. et al. (2008) ‘Universal risk factors for multifactorial diseases LifeLines: a three-generation population-based study’, European Journal of Epidemiology, 23, pp. 67–74. doi: 10.1007/s10654-007-9204-4.

Strachan, T. and Read, A. P. (2011) Human molecular genetics. 4th edn. Garland Science.

Timmers, P. R. et al. (2019) ‘Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances’, eLife, 8. doi: 10.7554/eLife.39856.

Visscher, P. M. et al. (2017) ‘10 Years of GWAS Discovery: Biology, Function, and Translation’, The American Journal of Human Genetics, 101, pp. 5–22. doi: 10.1016/j.ajhg.2017.06.005.

Vuckovic, D. et al. (2020) ‘The Polygenic and Monogenic Basis of Blood Traits and Diseases’, medRxiv. Cold Spring Harbor Laboratory Press, p. 2020.02.02.20020065. doi: 10.1101/2020.02.02.20020065.

Wainschtein, P. et al. (2019) ‘Recovery of trait heritability from whole genome sequence data’, bioRxiv. Cold Spring Harbor Laboratory, p. 588020. doi: 10.1101/588020.

Westra, H.-J. et al. (2013) ‘Systematic identification of trans eQTLs as putative drivers of known disease associations’, Nature Genetics. Nature Publishing Group, 45(10), pp. 1238–1243. doi: 10.1038/ng.2756.

Westra, H. J. et al. (2018) ‘Fine-mapping and functional studies highlight potential causal variants for rheumatoid arthritis and type 1 diabetes’, Nature Genetics. Nature Publishing Group, pp. 1366–1374. doi: 10.1038/s41588-018-0216-7.

van der Wijst, M. G. P. et al. (2018) ‘Single-cell RNA sequencing identifies celltype-specific cis-eQTLs and co-expression QTLs’, Nature Genetics. Nature Publishing Group, 50(4), pp. 493–497. doi: 10.1038/s41588-018-0089-9.

Wojcik, G. L. et al. (2019) ‘Genetic analyses of diverse populations improves discovery for complex traits’, Nature. Nature Publishing Group, pp. 514–518. doi: 10.1038/s41586-019-1310-4.

Yang, J. et al. (2010) ‘Common SNPs explain a large proportion of the heritability for human height’, Nature Genetics. Nature Publishing Group, 42(7), pp. 565–569. doi: 10.1038/ng.608.

Young, N. D. (1996) ‘QTL MAPPING AND QUANTITATIVE DISEASE RESISTANCE IN PLANTS’, Annual Review of
