Polygenic Risk Score of depression predicts increased risk in Rotterdam Study


Academic year: 2021


Human Genomics Facility of the Erasmus Medical Center

With special thanks to:

Bahar Sedaghatikhayat, Jeroen van Rooij, André Uitterlinden, Roel Ophoff, Annemarie Luik, Annabel Vreeker

Abstract

Genome-wide association studies (GWAS) allow for the identification of genetic variants with risk-increasing effects. Polygenic risk scores (PRS) can be generated from GWAS data and are used to predict an individual's risk of developing specific diseases. While widely used in medical research, PRS have only recently been gaining traction in the fields of psychology and psychiatry due to their practical and clinical applications. In the present study, PRS’s were created for depression, schizophrenia and various other psychological traits. The PRS’s were then applied to the first cohort of the Rotterdam Study (RSI) to predict the risk of developing depression. The PRS of depression predicted increased risk in RSI, whereas the PRS of schizophrenia appeared to predict decreased risk. Accurately predicting depression based on genetic profile could improve the diagnostic process by adding a genetic component and might give insight into the biological pathways of psychological traits.

Keywords: Polygenic Risk Score, Genome Wide Association Study, Rotterdam Study, psychology, psychiatry, depression, schizophrenia, DNA, genetic variants

10/1/2020 - 19/6/2020


BACHELOR INTERNSHIP PSYCHOBIOLOGY


Table of Contents

Introduction

Methods

Rotterdam Study I Cohort

Data selection

Exclusion criteria:

Data extraction and Quality Control

Assessing linkage disequilibrium

Calculating the PRS

Unweighted PRS

Weighted PRS

Statistical method and predicting depression by using PRS

Results

Depression in RSI

Data selection

PRS distribution and correlation

Prediction depression PRS

Predicting depression phenotype in RSI using all PRS

Discussion

Glossary


Introduction

Every human individual has a unique genome. The genome functions as a genetic blueprint and is stored in the form of deoxyribonucleic acid, or DNA. It consists of four basic building blocks, the nucleobases, Adenine (A), Thymine (T), Guanine (G) and Cytosine (C). Nucleotides contain a nucleobase, a sugar group (deoxyribose) and a phosphate group. Deoxyribose binds to the phosphate, resulting in a nucleotide chain. Nucleotides can form hydrogen bonds with other nucleotides. Due to their biochemical composition, A can only bind with T and G only with C. A double stranded DNA helix is the result.

Genetic variation is caused mainly by mutation and recombination of the DNA. Mutations permanently alter the sequence of the DNA and arise from erroneously replicated or damaged DNA. One form of variation is the single nucleotide variant (SNV), where a single nucleotide is substituted at a certain position in the DNA sequence. SNVs can be either pathogenic or non-disease related. For example, at a given position the majority of individuals may carry the cytosine (C) nucleotide, whereas the minority carries the adenine (A) nucleotide. In this example the C nucleotide is the major allele, and the A nucleotide is the minor allele. Other forms of genetic variation are deletion, where a part of the DNA sequence is not replicated, and insertion, where nucleotides are added to the sequence. Genetic variants can be neutral, beneficial or harmful. Harmful variants may cause a disorder or increase the risk of developing one. It is possible for healthy individuals to carry a risk variant and not develop disease. However, the increased genetic risk combined with certain environmental risk factors could eventually result in disease.

Ever since the Human Genome Project (Lander et al., 2001) sparked a global interest in genetic research, it has steadily become one of the most prolific fields in science. By studying the human genome it became possible to identify genetic variations and determine their impact on diseases. At first, studies were done in families using heritability and genetic linkage. While this method proved useful for single-gene disorders, it left much to be desired when it came to determining the aetiology of more complex diseases and disorders. Single-gene diseases are often referred to as simple diseases. They have a clear heritability pattern and identified risk variants in specific genes, e.g. sickle-cell disease, which is caused by a harmful variant in the β-globin gene (Negre, Eggimann et al., 2016), and are prime targets for gene therapy. Complex diseases, however, are the result of genetic and environmental factors. As most complex diseases are of a polygenic nature, genome-wide association studies (GWAS) gained acclaim.

Studies of this type measure the allelic frequencies of millions of variants in a large population of patients with a certain phenotype, e.g. breast cancer, and statistically compare them to those in a control group. If alleles are found within the patient population and are absent (or present to a lesser extent) in the control group, this implies that they contribute to the disorder. The alleles that are more often found in the cases are called ‘risk alleles’. The difference in frequency between both groups determines how much increased risk the risk allele conveys.
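As a rough illustration of the case-control comparison described above, the sketch below computes an allelic odds ratio from allele counts. The counts and the function name `allelic_odds_ratio` are hypothetical and are not taken from any study cited here; real GWAS additionally correct for covariates and multiple testing.

```python
def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of the risk allele in cases divided by the odds in controls."""
    if min(case_risk, case_other, control_risk, control_other) <= 0:
        raise ValueError("all allele counts must be positive")
    return (case_risk / case_other) / (control_risk / control_other)


# Hypothetical allele counts (two alleles per person), for illustration only.
odds_ratio = allelic_odds_ratio(case_risk=600, case_other=1400,
                                control_risk=500, control_other=1500)
print(round(odds_ratio, 3))  # 1.286
```

An odds ratio above 1 indicates that the allele is more common among cases than among controls.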


Early GWAS were limited in statistical power, but rapid technological developments facilitated an increase in study size and thus in statistical power. As genetic data is not inherited completely randomly, genetic linkage can occur. Alleles in a population are said to be in linkage disequilibrium (LD) when they are non-randomly associated, meaning that they occur together more or less often than expected by chance.
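The idea of alleles occurring together more or less often than expected by chance is commonly quantified as follows. The symbols are not used elsewhere in this text and are introduced here only for illustration: p_A and p_B are the population frequencies of two alleles at different positions, and p_AB is the frequency with which they occur together on the same haplotype.

```latex
D_{AB} = p_{AB} - p_A \, p_B, \qquad
r^2 = \frac{D_{AB}^2}{p_A (1 - p_A) \, p_B (1 - p_B)}
```

If the alleles are inherited independently, p_AB equals p_A p_B and D_AB is zero; r² ranges from 0 (no LD) to 1 (complete LD).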

Risk alleles with small effects that would have gone unnoticed in family studies could now be detected with GWAS. While GWAS were very useful for identifying risk alleles, their clinical applications were few, as the risk per allele was small. Carrying a single risk variant therefore did not increase disease risk enough to warrant clinical action. However, calculating the sum of all known risk alleles for one disease in an individual makes it possible to identify individuals with an increased chance of developing the disease. This sum of risk alleles is called a polygenic risk score (PRS).

The PRS has a direct application in a clinical setting. For example, when it is clear that an individual is genetically prone to a myocardial infarction, prescribing cholesterol-lowering medication or adjusting the lifestyle accordingly could prevent disease. PRS have achieved great results in recent years. For example, a study regarding breast cancer (Mavaddat et al., 2018) found that the lifetime risk of breast cancer was 32.6% for individuals in the highest ten percent of the PRS distribution. Individuals in the top 1 percent had a 2.78 to 4.37-fold risk of developing the disease compared to the middle quintile, and those in the bottom 1 percent a 0.16 to 0.27-fold risk. In a clinical setting this is incredibly valuable information, as the examination frequency of high-risk individuals could be increased to find any signs of the disease in time. The study also shows that a low PRS predicts a lowered risk of developing the disease. Provided that an individual does not carry a rare high-risk variant not included in the PRS (e.g. a BRCA2 mutation), it could be possible to reduce the frequency of examination for this group, thus reducing the cost.

PRS are being developed for many phenotypes across different fields of research, including the neurological and psychiatric fields. Considering that psychiatric disorders are difficult to diagnose accurately and are more susceptible to comorbidities, gaining insight into the genetic aetiology might help differentiate between the disorders. In the last three decades psychiatric disorders have been responsible for 14 percent of the years lived with a disability and have a global prevalence of 10 percent, with depressive disorders (including bipolar disorder) being the leading cause, affecting 264 million people worldwide (World Health Organization, 2017). While cognitive therapy and pharmaceutical interventions in the form of antidepressants can be effective, much could be improved. The methods of assessment are inaccurate, often diagnosing those with depressive disorders incorrectly and leading to antidepressants being prescribed to people without the disorder. It is of the utmost importance that diagnoses of psychiatric diseases are consistently correct and standardized, to lower comorbidity and increase effectiveness of treatment.


PRS might provide a more consistent way of determining a psychiatric disorder. Accurately distinguishing between disorders could result in more specific treatments for each phenotype, as different genetic profiles might favour different treatments, e.g. pharmacological over cognitive. Recent studies found that with a PRS for schizophrenia, schizoaffective bipolar cases could be separated from other bipolar cases (Hamshere et al., 2011) and psychotic symptoms could be accurately predicted (Ruderfer et al., 2018).

Recent studies found promising results for a variety of phenotypes, identifying 102 independent variants for major depressive disorder and successfully applying the PRS to predict depression in the UK Biobank, the Psychiatric Genomics Consortium (PGC) and the 23andMe cohorts (Howard et al., 2019). Furthermore, 108 variants were discovered for schizophrenia (Ripke et al., 2014).

Schizophrenia (SCZ) and major depressive disorder (MDD), being the leading psychiatric disorders, are included in this study. Bipolar disorder (BD), however, is included separately from the depression phenotype to better distinguish the genetic aetiology. As anxiety disorder has a global prevalence of 3.6% and is ranked 6th as contributor to the global years lived with a disability (World Health Organization, 2015), it was included. Due to the high comorbidity between attention deficit/hyperactivity disorder (ADHD), autism spectrum disorder (ASD) and anxiety disorder (ANX), these phenotypes were incorporated in this study. Lastly, to determine the genetic overlap between psychiatric disorders and neurological diseases, Alzheimer’s disease (AD) was included as a phenotype.

The present study aims to recreate polygenic risk scores for depression and schizophrenia, as well as various other highly prevalent and debilitating psychiatric/neurological phenotypes, and to establish their predictive value when applied to phenotypic data. To do this, genetic and phenotype data from the Rotterdam Study (RS) will be used. The RS was started in 1990 and is still ongoing. It is a prospective study regarding a variety of diseases and disorders, including cardiovascular, neurological, respiratory and psychiatric (Hofman et al., 1990). As of 2008, the cohort comprised 14,926 participants (Hofman et al., 2015).


Methods

Rotterdam Study I Cohort

The polygenic risk scores were applied to the depression phenotype of the first cohort of the Rotterdam Study (RSI), consisting of 5722 participants. The subsets of major depression, depressive syndrome and depressive symptoms were used. Finally, an overall depression subset was created by combining the aforementioned groups. The subset of major depression contained the most severe depressive cases, followed by the depressive syndrome group, and finally the depressive symptoms group. Participants were interviewed every 4 to 5 years and the worst depressive episode defined their phenotype. If a participant scored highest on major depression during the first interview, but showed predominantly depressive symptoms in the second and third interviews, this individual would still be categorized in the major depression phenotype.

Data selection

Polygenic risk scores were calculated using data from GWAS. A GWAS identifies the risk variants reaching the statistical threshold of p < 5 × 10⁻⁸. The relevant data were extracted from the GWAS: the reference allele/risk allele, odds ratio (OR) or beta, chromosome position and the name of the variant.

The GWAS were selected using the databases GWAS Atlas (Watanabe et al., 2019) and GWAS Catalog (Buniello et al., 2019). As both contained a large number of studies, several selection criteria were applied.

Only articles regarding the relevant phenotypes were included: Attention deficit/hyperactivity disorder (ADHD), Alzheimer’s disease (AD), autism spectrum disorder (ASD), anxiety disorder (ANX), bipolar disorder (BD), major depressive disorder (MDD) and schizophrenia (SCZ).

Exclusion criteria:

Studies with a sample size of fewer than 1000 were excluded, as well as studies with ancestry other than European/Caucasian. This was done because of the possible loss of power and because the participants of the Rotterdam Study are predominantly of European ancestry.

Data extraction and Quality Control

The data needed for calculating the PRS was extracted from the selected studies. The reference/risk alleles, odds ratio or beta, chromosome positions and variants were categorized per phenotype and stored into one file. If the beta was missing, it was calculated from the odds ratio, since the beta was essential to calculate the PRS.
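The text does not state how the beta was derived from the odds ratio. A minimal sketch, assuming the usual convention that beta is the natural logarithm of the odds ratio, could look as follows. The function name `beta_from_odds_ratio` and the example value are hypothetical.

```python
import math


def beta_from_odds_ratio(odds_ratio):
    """Return the natural log of the odds ratio (assumed definition of beta)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio)


# Hypothetical odds ratio, for illustration only.
example_beta = beta_from_odds_ratio(1.10)
print(round(example_beta, 5))
```

Under this convention an odds ratio above 1 gives a positive beta and an odds ratio below 1 gives a negative beta, which matches the use of the sign of beta as the direction of the effect.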

To ensure correct calculation of the PRS, the extracted data had to be recoded to match the genetic data from the Rotterdam Study (RS) to account for any inconsistencies. Correct coding of alleles is crucial in the PRS calculation process, as the outcome of the PRS will be significantly different otherwise. The correction process consists of the following steps:

1. The figure below shows how the data extracted from the study is coded. The reference allele is displayed in the A1 column and the effect allele is displayed in the A2 column. In the example three different variants are visible.

Figure 1: Extracted data from study. A1 = Reference allele, A2 = Effect allele

2. Then the corresponding original beta is extracted from the study and added to the corresponding variant. Beta indicates the effect size and thus direction of the effect.

Figure 2: Beta added to extracted data. A1 = Reference allele, A2 = Effect allele, Beta = Effect size

3. Once the original beta is added, the data is compared to the data present in the RSI. Alleles that do not match can now be identified.

Figure 3: Non-matching alleles for variants 1 and 2. A1 = Reference allele, A2 = Effect allele, Beta = Effect size

4. Figure 3 shows that variants 1 and 2 do not match the RSI data. Variant 3 however is coded correctly: C being the reference allele, and T being the effect allele. It is crucial to correct the direction of variants 1 and 2 for correct risk prediction of the PRS.


Figure 4: Alleles and beta adjusted for variants 1 and 2. A1 = Reference allele, A2 = Effect allele, Beta = Effect size

5. The reference and risk allele of the ‘Data from study’ column are switched, and the sign of the corresponding original beta is inverted, giving the corrected beta. As can be seen in figure 4, all variants now correctly match the A1 and A2 of the data present in the RSI.

Table 1: Example extracted data. Including chromosome number/position, name of variant, nearest gene, reference and effect allele, odds ratio, beta and P-value

Table 1 illustrates the same process: to ensure the right direction of the PRS, the reference allele (A1) extracted from the study should be the same as the reference allele in the Rotterdam Study. The A1 for the first ADHD variant does not match the A1 in the info file. Thus the sign of beta needs to be changed from −0.09321 to 0.09321 to obtain the corrected beta. Since the other variants in this example do match the info file, their original beta can be used unchanged, meaning the original and corrected beta are the same.
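The correction steps above can be sketched in a few lines. This is an illustrative sketch, not the procedure actually used in the study: the function name `harmonise` and the allele letters are hypothetical, only the beta value is taken from the ADHD example, and strand flips are deliberately not handled.

```python
def harmonise(study_a1, study_a2, beta, cohort_a1, cohort_a2):
    """Align a study variant with the cohort coding.

    Returns (a1, a2, corrected_beta). If the alleles are swapped relative to
    the cohort, they are switched and the sign of beta is inverted.
    """
    if (study_a1, study_a2) == (cohort_a1, cohort_a2):
        return study_a1, study_a2, beta
    if (study_a1, study_a2) == (cohort_a2, cohort_a1):
        return cohort_a1, cohort_a2, -beta
    raise ValueError("alleles do not match the cohort in either orientation")


# Hypothetical alleles; the study codes A1/A2 the other way round.
print(harmonise("A", "G", -0.09321, "G", "A"))  # ('G', 'A', 0.09321)
```

A variant whose alleles match neither orientation raises an error here, so that it can be inspected by hand rather than silently included.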

Assessing linkage disequilibrium

It is important to know if variants are in LD, as it might affect the PRS. Each variant included in the PRS is assumed to have an independent effect. If variants are in LD this cannot be guaranteed, and such variants have to be excluded in order to sum the effects. To assess the LD between the variants, LDlink was used (Machiela et al., 2015). With the files acquired from LDlink, genetic overlap and LD were visualised using the open-source network tool Cytoscape (Shannon et al., 2003).

Calculating the PRS

Unweighted PRS

The following figures provide an example of the calculation process of the unweighted PRS. Note that the beta’s used in this example are already corrected for differences between this study and the original studies in the aforementioned quality control. For clarity they will be referred to as ‘corrected beta’s’.

1. To obtain a PRS, the risk-increasing alleles are counted. For each participant, every variant is coded as 0, 1 or 2, indicating how many effect alleles the participant carries.

Figure 5: Example unweighted PRS calculation. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

2. Figure 5 shows the basis of the PRS calculation. However, it is assumed that all the variants in this figure indicate increased risk.

3. To assess if the effects of the variants are indeed risk increasing and not risk decreasing, the corresponding corrected beta’s are used.

Figure 6: Example incorrect unweighted PRS calculation. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

4. The beta corresponding to variant 1 is negative, indicating a risk decreasing effect. As P2 and P3 carry 0 of this variant, they have an increased risk. However, this is not conveyed in the final unweighted PRS score.

5. To change the direction of variant 1, the 0’s have to be changed to 2’s and vice versa.

Figure 7: Example adjusted unweighted PRS calculation. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

6. In figure 7 variant 1 is adjusted to correctly display risk, instead of a protective effect. The allele coding is corrected and all variants can now be summed to obtain the correct unweighted PRS.
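The steps above can be sketched as follows. The betas and allele counts are hypothetical and do not reproduce the values in figures 5 to 7; the function name `unweighted_prs` is likewise an assumption made for this illustration.

```python
def unweighted_prs(dosages, corrected_betas):
    """Count risk-increasing alleles for one participant.

    dosages: effect-allele counts (0, 1 or 2) per variant.
    corrected_betas: corrected beta per variant, in the same order.
    """
    if len(dosages) != len(corrected_betas):
        raise ValueError("dosages and betas must have the same length")
    score = 0
    for dosage, beta in zip(dosages, corrected_betas):
        if dosage not in (0, 1, 2):
            raise ValueError("dosage must be 0, 1 or 2")
        # A negative beta marks a protective effect allele, so the other
        # allele is the risk allele: 0 becomes 2 and 2 becomes 0.
        score += (2 - dosage) if beta < 0 else dosage
    return score


betas = [-0.05, 0.03, 0.08]  # hypothetical; variant 1 is protective
participants = {"P1": [2, 1, 0], "P2": [0, 1, 1], "P3": [0, 2, 2]}
for name, dosages in participants.items():
    print(name, unweighted_prs(dosages, betas))  # P1 1, P2 4, P3 6
```

Without the recoding of variant 1, P1 would score 3 and P2 would score 2, which would rank the two participants in the wrong order.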

Weighted PRS

Calculating the weighted PRS is done in similar fashion. In contrast to the unweighted PRS however, allele coding does not have to be corrected for risk decreasing effects, as the beta is directly applied.


1. The figure below displays data needed for the calculation. Note that the corrected beta for variant 1 is negative and thus risk decreasing.

Figure 8: Example data coding weighted PRS. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

2. To acquire the effect sizes per variant for each individual, the corrected beta is multiplied by the number of effect alleles carried by the individual.

Figure 9: Example calculation of variant effect per individual. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

3. Once the effects of each variant are established, they are summed per participant to obtain the weighted PRS. This final step can be seen in the figure below.

Figure 10: Example calculation weighted PRS. Beta = effect of variant, P = Participant, RSI = First cohort of the RS, Variant = Effect allele

Both a weighted and an unweighted PRS were included in this study. Figure 11 demonstrates the difference between the two.

Figure 11: Comparison of weighted and unweighted PRS

Due to the added weight of the corrected beta, the weighted PRS is able to distinguish between P1 and P2. As the effects of variants are not always the same size, the weighted PRS is more informative and more reliable than the unweighted PRS. Hence, the main focus of this study is on the weighted PRS. After calculating the PRS, histograms were made to check whether the scores were normally distributed.
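The difference between the two scores can be illustrated with a small sketch. The betas and allele counts below are hypothetical and are not the values shown in figure 11; `weighted_prs` is a name chosen for this example.

```python
def weighted_prs(dosages, corrected_betas):
    """Sum of corrected beta times effect-allele count over all variants."""
    if len(dosages) != len(corrected_betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(d * b for d, b in zip(dosages, corrected_betas))


betas = [0.10, 0.02]   # hypothetical, both risk increasing
p1 = [2, 0]            # two copies of the strong variant
p2 = [0, 2]            # two copies of the weak variant

# Both betas are positive, so the unweighted score is the plain allele count.
print(sum(p1), sum(p2))                      # 2 2
print(round(weighted_prs(p1, betas), 2),
      round(weighted_prs(p2, betas), 2))     # 0.2 0.04
```

Both participants carry two risk alleles, so the unweighted score cannot separate them, while the weighted score reflects that the alleles carried by the first participant have the larger effect.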


Psychiatric disorders are often highly comorbid (Smoller et al., 2013). Assessing the correlation between PRS could therefore provide valuable information. To do this, scatterplots were made to compare the PRS of every phenotype. Correlations between the PRS are expected to occur when the variants used to calculate the PRS are either shared with other phenotypes or are in LD.

Statistical method and prediction of depression using PRS

Using the PRS for each trait, we will try to predict the main outcomes: major depression, depressive syndrome, depressive symptoms and any of the above. Using R scripts previously generated in this group, the average weighted PRS is determined for all cases and all controls of each phenotype. These averages are then statistically compared using a regression model, resulting in an effect difference, expressed as an odds ratio, and a corresponding p-value. Furthermore, individuals in the top 10% of the PRS distribution are separated and the percentage of cases in this group is compared with the percentage in the lowest 10% of the PRS distribution, again providing an odds ratio and corresponding p-value.

The PRS of depression will be applied to four different groups: Major depression, depressive symptoms, depressive disorder and the overall depression group, which consists of all aforementioned groups. Then, two analyses will be performed per group: Comparison of cases versus controls, and comparison of the highest 10% with the lowest 10% of the PRS distribution. The remaining PRS’s will be applied in the same manner. The PRS of depression is expected to successfully distinguish cases from controls and predict increased risk.
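The second analysis, comparing the highest and lowest 10% of the PRS distribution, can be sketched as below. This is not the R script used in the study: it is a simplified Python illustration that computes only the odds ratio from a 2x2 table, without the p-value, and all data and names are hypothetical.

```python
def decile_odds_ratio(scores, is_case, groups=10):
    """Odds ratio of being a case in the top group versus the bottom group.

    scores: PRS per participant; is_case: True for cases, False for controls.
    """
    if len(scores) != len(is_case):
        raise ValueError("scores and is_case must have the same length")
    ranked = sorted(zip(scores, is_case), key=lambda pair: pair[0])
    size = len(ranked) // groups  # participants per group
    if size == 0:
        raise ValueError("too few participants for this number of groups")
    bottom, top = ranked[:size], ranked[-size:]
    top_cases = sum(1 for _, case in top if case)
    bottom_cases = sum(1 for _, case in bottom if case)
    top_controls = size - top_cases
    bottom_controls = size - bottom_cases
    if top_controls == 0 or bottom_cases == 0:
        raise ValueError("odds ratio undefined: a cell of the 2x2 table is zero")
    return (top_cases * bottom_controls) / (top_controls * bottom_cases)


# Hypothetical data: 100 participants whose score equals their index.
scores = list(range(100))
case_ids = {3, 7, 20, 45, 50, 70, 91, 94, 96, 99}
is_case = [i in case_ids for i in scores]
print(round(decile_odds_ratio(scores, is_case), 2))  # 2.67
```

Because only 20% of the participants enter this comparison, the estimate rests on far fewer cases than the analysis of all cases versus all controls.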


Results

Depression in RSI

The phenotype information for depression acquired from RSI can be seen in the table below. Mean age in RSI was 67.3 (Min = 55, Max = 99, SD = 7.9).

Table 2: RSI sample size and percentage of total

Phenotype              N      %
No Depression          4122   72.0
Major Depression       253    4.4
Depressive Disorder    189    3.3
Depressive Symptoms    1150   20.1
Bipolar                8      0.1

Data selection

Once the databases were filtered, a total of 326 articles remained: 295 in the GWAS Atlas and 31 in the GWAS Catalog. Only 301 of those articles were unique, as there was an overlap of 25 articles between the databases (see table 3).

Table 3: Number of articles available per phenotype, before and after applying selection criteria.

From these studies, a single study per phenotype was selected to calculate the PRS, as shown in table 4 below.


Table 4: Study titles and authors per phenotype, of studies selected for the calculation of the PRS.

Table 5 gives an overview of the essential information per phenotype.

Table 5: Study cohort, markers (of which excluded in the present study), heritability (proportion of observed variance attributed to genetic factors), prevalence, explained variance (proportion of heritability explained by genetic variants) and sample size

As can be seen in the table, most of the disorders are highly heritable. Autism Spectrum Disorder, Alzheimer’s Disease, Attention Deficit/Hyperactivity Disorder and Schizophrenia are the most heritable, with a twin heritability of around 80%. The global lifetime prevalence of Major Depressive Disorder is highest at 30-40%, followed by Anxiety Disorder (14-16%). The explained variance of Schizophrenia is highest, with 18% of the genetic heritability explained by SNVs. Of the 293 markers in total across all 7 traits, 280 were present in the RS population and included in this study. 13 variants were excluded as they were not present in the RSI cohort, and one was excluded due to a missing odds ratio.


Genetic overlap and linkage disequilibrium

After extracting the variants for each PRS, they were compared to determine if any variants were present in multiple phenotypes. This was not the case. To further determine genetic overlap, each variant was annotated to the nearest gene, assuming that the genetic impact of the SNP on the trait was through this gene. Afterwards, the annotated genes were compared between PRS’s to assess whether the same genes were affected by variants of different phenotypes, causing overlap. This overlap is shown in figure 12.

Figure 12: Genetic overlap between phenotypes. Dots represent genes. Overlapping genes coloured green, purple coloured genes were reported more than once (Depression: RBFOX1, count: 2, Schizophrenia: SDCCAG8, count: 3)


In figure 12 the genetic overlap of the phenotypes is displayed. Size and colour of phenotypes change according to overlap: darker red and larger size indicate more overlap. Neither AD nor ASD shows overlap with any of the other phenotypes. MDD shows the most genetic overlap, sharing genes with SCZ (n = 3: ATP2A2; ZNF536; TCF4), ANX (n = 2: TMEM106B; CTC-340A15.2), BD (n = 1: SHANK2) and ADHD (n = 1: SORCS3). In addition to the overlap with MDD, BD shows overlap with schizophrenia (n = 2: GRIN2A; CACNA1C).

As variants do not always affect the nearest gene, annotating variants to them is not optimal when comparing PRS’s. Therefore, it was opted to also make a comparison based on the genetic blocks in which the variants are located. Variants in the same block usually express similar effects and are correlated in the human population.

By comparing how often variant A is present in the same individuals that also carry variant B, it can be determined which variants are in the same block. This is called linkage disequilibrium (LD). The next figure shows whether SNPs from different PRS are in the same LD block, even if the annotated gene might not be the same one.


Figure 13: Independent variants in linkage disequilibrium (LD). Dots represent LD-blocks. Green coloured LD-blocks contain variants shared by multiple phenotypes

In figure 13 the linkage disequilibrium is displayed. Similar to figure 12, ASD and AD show no overlap. MDD shows the most overlap, sharing risk variants in LD with SCZ (n = 10), ANX (n = 3), BD (n = 2) and ADHD (n = 1). A small overlap can be seen between MDD, SCZ and BD (n = 1; LD block 26). In contrast to figure 12, ADHD overlaps with both MDD (n = 1) and SCZ (n = 1) in figure 13. ANX only shows overlap with MDD (n = 3). Comparing both figures shows that the LD overlap is slightly higher than the gene-annotated overlap, especially for MDD and SCZ.


PRS distribution and correlation

Subsequently, the list of variants was used for each phenotype to collect the variants from the RS dataset. The weighted and unweighted PRS score for each of the 6000 participants of the Rotterdam Study cohort 1 was calculated per phenotype.

As an example, in figure 14 and figure 15 the histogram and scatterplot are displayed of the weighted PRS for MDD. Figure 16 shows the histogram of the weighted ASD PRS.

Figure 14: Histogram of the weighted PRS of major depressive disorder (MDD)

Figure 14 shows that the weighted PRS of MDD is normally distributed. Individuals in the far right of the distribution carry many risk alleles and thus have a high predicted risk of depression. The left side of the distribution contains individuals carrying fewer risk alleles than the rest of the population.

Then the extent of the correlation between the weighted and unweighted PRS was determined:


Figure 15: Scatterplot of the weighted PRS for major depressive disorder (MDD) versus the unweighted PRS of MDD, showing a correlation of R² = 0.9416

As expected, the correlation in figure 15 between the weighted and unweighted PRS of MDD approaches 1. Individuals in the top right of the scatterplot have an increased risk of developing MDD, whereas individuals in the bottom left have a decreased risk.
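For reference, a correlation such as the one reported in figure 15 can be computed as sketched below. The scores are hypothetical and much fewer than in the RSI data, so the result will not match the reported R²; `pearson_r` is a name chosen for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five participants, for illustration only.
unweighted = [98, 101, 104, 107, 110]
weighted = [2.91, 3.05, 3.02, 3.20, 3.31]
r_squared = pearson_r(unweighted, weighted) ** 2
print(round(r_squared, 4))
```

A value of R² close to 1 means that the weighted and unweighted scores rank participants in almost the same order.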

Figure 16: Histogram of the weighted PRS of Autism Spectrum Disorder (ASD)

In contrast to the weighted PRS of MDD, the weighted PRS of ASD is left-skewed. This result is not completely unexpected due to the low number of variants (n = 5). Therefore, the PRS for ASD will not be able to distinguish individuals as accurately as the PRS for MDD, and may not be able to provide a viable risk prediction in the RSI dataset.

In addition to checking the overlap of the PRS’s, the correlation in the RSI population of the scores themselves was measured. This was done to determine if the PRS’s are correlated, despite not measuring the exact same genetic signals. Correlation plots were run for each combination of PRS.

Figure 17: Correlations between PRS of all phenotypes

Figure 17 shows that the comparison of ‘PRS x’ with ‘PRS y’ does not reveal any correlation between the PRS’s. Comparing ‘PRS x’ with ‘PRS x’ shows a correlation close to one. The results show that each PRS predicts the corresponding phenotype. The scatterplots of anxiety and autism show an almost categorical distribution, as can be clearly seen in the ‘Anxiety x Autism’ graph. The lack of available variants for both anxiety and autism might have affected the reliability of the ‘Anxiety x Autism’ plot; thus the possibility of correlation cannot be excluded and prediction of the PRS may not be accurate.

Prediction depression PRS


Table 6: Results of the PRS for depression vs overall depression

Table 6 shows that the OR of the PRS of cases to the PRS of controls is 1.07 with a significant p-value of 0.024. The OR of the highest 10% versus the lowest 10% is 1.29 with a p-value of 0.071. This means that the PRS of depression, consisting of 102 variants, distinguishes cases from controls, with the PRS being 1.07 times higher in cases. A trend is visible in the top 10% compared to the bottom 10%, but it does not reach significance. Comparing the highest and lowest 10% provides a larger effect size; however, it also decreases power due to the smaller sample size.

Additional analyses were done to determine if the depression PRS could accurately predict major depression, depressive syndrome or any depressive symptoms.

Table 7: Results of the PRS for depression vs major depression

The results of the PRS for depression applied to the major depression subgroup show an OR of 1.21 for the PRS cases versus the PRS controls, with a significant p-value of 0.005. Analysis of the top and bottom 10% gives an OR of 2.4 with a significant p-value of 0.009. The PRS accurately distinguishes cases from controls in the major depression subgroup. The top 10% in this group has more than twice as much risk compared to the bottom 10%.

Table 8: Results of the PRS for depression vs depressive syndrome

Results of the PRS of depression vs depressive syndrome do not show any significance or effects. The OR for PRS cases versus controls is 1.02 with a p-value of 0.826. Comparing top with bottom 10% gives an OR of 1.00 and a p-value of 0.990. The PRS for depression does not seem to be able to predict depressive syndrome.


Table 9: Results of the PRS for depression vs depressive symptoms

The results of the PRS for depression versus depressive symptoms do show a trend but do not reach significance. The OR of the cases compared to controls is 1.05 with a p-value of 0.141. Top 10% versus bottom 10% results in an OR of 1.18 with a p-value of 0.299. Similar to the depressive syndrome subgroup, no significant results are found. However, there is a trend visible, especially in the second analysis. The OR of 1.18 implies an increased risk in the top 10% vs the bottom 10%, although this could be an artefact, as the p-value is not significant. Afterwards, it was tested if the other 6 PRS’s predicted overall depression.

Predicting depression phenotype in RSI using all PRS

Table 10: Results all PRS’s vs overall depression

Interestingly, table 10 shows that the PRS for schizophrenia predicts a decreased risk of developing depression. Cases versus controls gives an OR of 0.91 with a significant p-value of 0.004, corresponding to approximately 1.10 times lower odds (1/0.91). The analysis of the highest and lowest 10% gives an even lower OR of 0.73 with a significant p-value of 0.029, corresponding to 1.37 times lower odds (1/0.73). Finally, analyses were conducted of all PRS’s on the remaining subgroups of depression: major depression, depressive syndrome and depressive symptoms.

Table 11: Results all PRS’s vs major depression

Again, the results reflect an apparent decrease in risk for the schizophrenia PRS. The weighted PRS gives an OR of 0.81 with a significant p-value of 0.002, while the top versus bottom 10% analysis results in an OR of 0.37 with a significant p-value of 0.003. This corresponds to 1.23 and 2.7 times lower odds of major depression, respectively.
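The "times decreased risk" figures quoted for the schizophrenia PRS are the reciprocals of the reported ORs. The short sketch below makes that conversion explicit; the helper name `fold_decrease` is hypothetical, and the ORs are copied from the text accompanying tables 10 and 11.

```python
def fold_decrease(odds_ratio):
    """Express an odds ratio below 1 as an 'x times lower odds' factor (1 / OR)."""
    if not 0 < odds_ratio < 1:
        raise ValueError("expected an odds ratio strictly between 0 and 1")
    return 1 / odds_ratio

# Schizophrenia PRS odds ratios as reported in the text for tables 10 and 11.
reported = {
    "overall depression, cases vs controls": 0.91,
    "overall depression, top vs bottom 10%": 0.73,
    "major depression, cases vs controls": 0.81,
    "major depression, top vs bottom 10%": 0.37,
}

for label, odds_ratio in reported.items():
    print(f"{label}: OR {odds_ratio:.2f} -> "
          f"{fold_decrease(odds_ratio):.2f} times lower odds")
```

Strictly speaking these factors describe odds rather than risk; the two are close only when the outcome is uncommon.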

Table 12: Results all PRS’s vs depressive syndrome

No significant results are found for the depressive syndrome subgroup; nevertheless, the trend of the schizophrenia PRS predicting decreased risk is visible.

Table 13: Results all PRS’s vs depressive symptoms

Table 13 shows similar results to table 12. The only significant result is the PRS of schizophrenia with a p-value of 0.040 and an OR of 0.93. Continuing the trend, the schizophrenia PRS seems to predict decreased risk of developing depressive symptoms.

Discussion

Due to the high comorbidity between psychiatric phenotypes (Smoller et al., 2013), a large genetic overlap was expected. Figures 12 and 13 show, however, that the genetic overlap between phenotypes was minimal and that few variants reported in the articles were in LD. Mutations in the gene SHANK2 cause hyperconnectivity in human neurons and are strongly associated with ASD (Zaslavsky et al., 2019). The results do not reflect this association, probably due to the low number of available variants for ASD. Alternative functions attributed to the SHANK2 gene include circadian entrainment, which provides an explanation for its link to both the BD and MDD phenotypes, as disturbance of the circadian rhythm is a symptom of both disorders. Figure 12 shows that SORCS3 is present in the ADHD and MDD phenotypes. SORCS3 is, however, mostly associated with AD: knock-out cell lines provide evidence of increased amyloid precursor. Its absence from the AD phenotype might be explained by the SORL1 gene functioning similarly, and by the fact that the AD study, like the ANX and ASD studies, had limited available variants. GRIN2A encodes an NMDA receptor subunit. NMDA receptors are involved in long-term potentiation (LTP), which is essential for learning and memory. As LTP impairment in patients with schizophrenia may lead to cognitive deficits (Salavati et al., 2015), this might have similar implications for BD patients. Although there is less overlap between ANX and MDD than previous evidence suggests, this could be attributed to the lack of available variants for the ANX phenotype. Nonetheless, the fact that there is any overlap at all implies some degree of comorbidity.

As expected, figure 13 does not differ much from figure 12. The largest LD overlap was between SCZ-MDD and SCZ-BD. A plausible explanation could be schizoaffective disorder (SAD), characterized by psychosis-like symptoms, e.g. hallucinations and delusions, together with unstable moods. Two types of this disorder exist: the depressive type and the bipolar type. The depressive type is distinguished by depressive symptoms, whereas the bipolar type is defined by manic symptoms. The fact that AD does not show any overlap could be explained by it being a neurological disorder, meaning the biological pathways involved may differ from those of the other traits.

As expected, the depression PRS successfully predicted increased risk of depression. The strongest effect was found for major depression. The study used for creating the PRS defines depression as ‘clinically diagnosed’; this could mean that the variants included in the PRS predict severe depression better than the less severe depressive symptoms. To test this possibility, a new PRS could be created with variants taken from a study with a less severe phenotype of depression. Applying this PRS to RSI might result in a better prediction of the depressive syndrome or depressive symptoms groups. However, depressive symptoms are shared with many other diseases and disorders and vary greatly in genetic aetiology. Thus it is possible that this genetic difference contributed to the lack of effect found. Further research might establish the comorbidity between anxiety and depression by using the depression PRS on the anxiety data available in RSI. Studying bipolar disorder as a separate phenotype might provide valuable results; however, as there are only eight cases in the RS, this has to be done in a different cohort. Given that RSII and RSIII are readily available, applying the PRS of depression to the depression phenotypes of these cohorts would be the logical continuation of this study.

Interestingly, the schizophrenia PRS seemed to predict decreased risk of developing depression. Given the lack of correlation between the PRS of depression and the PRS of schizophrenia, this result was not expected. Previous research did find evidence of an association between schizophrenia and depression (Witt et al., 2017; Musliner et al., 2019), although the effect was in the opposite direction. It is possible that this is an artefact caused by selection bias. The age of onset of schizophrenia is in the early twenties, whereas the mean age of participants in the depression phenotype of RSI is 67.3. Individuals in the top 10% of the schizophrenia PRS might therefore be less susceptible to developing psychiatric disorders, as they passed the age of onset without developing schizophrenia. Patients with schizophrenia are also less likely to participate in studies due to the paranoid nature of the disease; hence there is no schizophrenia phenotype available in the RS. Another explanation could be a technical issue that caused the effect to be in the opposite direction. To assess whether this is the case, the results must be examined in more detail. Follow-up research could still be done to check this result by using the phenotype for psychotic symptoms, which is available for RSI. The PRS for schizophrenia could be used to predict psychotic symptoms; if high-risk individuals show increased psychotic symptoms, this would support the validity of the PRS.

The effects of the predictions are stronger for the comparison of the top versus bottom 10% of the PRS distribution, but are less significant. The decrease in significance can be plausibly attributed to the decrease in power, as the analysis only takes 20% of the PRS distribution into account.
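The loss of power can be made concrete with a rough sketch. It assumes a simple 2x2 comparison with hypothetical counts, which is only an analogy for the analyses in this study (the full-sample analysis uses the PRS of all participants): when every cell count shrinks to one fifth, the standard error of the log odds ratio grows by a factor of the square root of 5, so the same OR yields a larger p-value. The function name `log_or_standard_error` is illustrative.

```python
import math

def log_or_standard_error(a, b, c, d):
    """Wald standard error of log(OR) for a 2x2 table with cell counts a, b, c, d."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical counts, not taken from the Rotterdam Study.
full_sample = (500, 2000, 450, 2050)
# Keeping 20% of the participants divides every cell count by five.
subset = tuple(n // 5 for n in full_sample)

ratio = log_or_standard_error(*subset) / log_or_standard_error(*full_sample)
print(f"standard error grows by a factor of {ratio:.2f}")
```

A larger effect size in the extremes of the distribution can offset part of this loss, which is consistent with the top versus bottom 10% analyses reaching significance in some subgroups but not in others.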

Accurately predicting the risk of developing depression could have substantial clinical relevance. While clinicians initially might be sceptical, PRS could provide the means to generalize the process of diagnosis by adding a genetic component to the current methods. The accuracy of the PRS could be tested by applying it to a clinically diagnosed group and a control group. If the PRS is able to distinguish between these groups, it may be valuable to run a pilot with predictive genetic screening. Individuals would be screened and then divided into a group that is informed about their genetic risk and a group that is not. Following both groups might provide insight into the ability to prevent depression and to assist high-risk individuals.

It might prove valuable to further assess the biological pathways of depression, as understanding the genetic aetiology is crucial for developing pharmaceutical treatments. Finding biological markers would greatly impact both the process of diagnosis and treatment development. In the future, genetic screening might help to determine whether target-specific treatments can be used. It is also important to realise that, due to the large genomic differences between ancestries, it cannot be predicted whether the PRS’s can be applied trans-ancestrally. Future studies might use the PRS in a cohort with a different ancestry to assess whether this is the case. Categorizing by specific symptoms, e.g. a concentration deficit, instead of by disorder, such as depression or schizophrenia, might make it easier to determine the genetic basis. As there is a large comorbidity between psychiatric symptoms, this might provide insight into the specific genes involved.

Glossary

Aetiology: Cause of a disease

AD: Alzheimer’s Disease

ADHD: Attention Deficit-Hyperactivity Disorder

ANX: Anxiety Disorder

ASD: Autism Spectrum Disorder

BD: Bipolar Disorder

Circadian rhythm: Internal biological sleep-wake cycle recurring every 24 hours

Comorbidity: Co-occurrence of one or more additional conditions

GWAS: Genome Wide Association Study

LD: Linkage Disequilibrium, associated heritability of specific alleles

LTP: Long Term Potentiation. Increase in synaptic strength after high-frequency stimulation.

MDD: Major Depressive Disorder

OR: Odds ratio

Pharmacogenetics: Study of genetic variation affecting drug response

Polygenic: Relating to multiple genes

PRS: Polygenic risk score

RS: Rotterdam Study

RSI: First cohort of the Rotterdam Study

RSII: Second cohort of the Rotterdam Study

RSIII: Third cohort of the Rotterdam Study

SCZ: Schizophrenia

References

Belmont, J. W., Hardenbol, P., Willis, T. D., Yu, F., Yang, H., Ch’Ang, L. Y., … Tanaka, T. (2003). The international HapMap project. Nature, 426(6968), 789–796. https://doi.org/10.1038/nature02168

Choi, S. W., Mak, T. S. H., & O’Reilly, P. F. (2018). A guide to performing Polygenic Risk Score analyses. BioRxiv, 2, 416545. https://doi.org/10.1101/416545

Demontis, D., Walters, R. K., Martin, J., Mattheisen, M., Als, T. D., Agerbo, E., … Robinson, E. B. (n.d.). Discovery of the first genome-wide significant risk loci for attention deficit / hyperactivity disorder.

Fullerton, J. M., & Nurnberger, J. I. (2019). Polygenic risk scores in psychiatry: Will they be useful for clinicians? [version 1; peer review: 4 approved]. F1000Research. F1000 Research Ltd. https://doi.org/10.12688/f1000research.18491.1

Grove, J., Ripke, S., Als, T. D., Mattheisen, M., Walters, R. K., Won, H., … Børglum, A. D. (2019). Identification of common genetic risk variants for autism spectrum disorder. Nature Genetics, 51(3), 431–444. https://doi.org/10.1038/s41588-019-0344-8

Hamshere, M. L., O’Donovan, M. C., Jones, I. R., Jones, L., Kirov, G., Green, E. K., … Craddock, N. (2011). Polygenic dissection of the bipolar phenotype. British Journal of Psychiatry, 198(4), 284–288. https://doi.org/10.1192/bjp.bp.110.087866

Howard, D. M., Adams, M. J., Clarke, T. K., Hafferty, J. D., Gibson, J., Shirali, M., … McIntosh, A. M. (2019). Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nature Neuroscience, 22(3), 343–352. https://doi.org/10.1038/s41593-018-0326-7

Kunkle, B. W., Grenier-Boley, B., Sims, R., Bis, J. C., Damotte, V., Naj, A. C., … Pericak-Vance, M. A. (2019). Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nature Genetics, 51(3), 414–430. https://doi.org/10.1038/s41588-019-0358-2

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., … Morgan, M. J. (2001). Initial sequencing and analysis of the human genome. Nature, 409(6822), 860–921. https://doi.org/10.1038/35057062

Mavaddat, N., Michailidou, K., Dennis, J., Lush, M., Fachal, L., Lee, A., … Easton, D. F. (2019). Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. American Journal of Human Genetics, 104(1), 21–34. https://doi.org/10.1016/j.ajhg.2018.11.002

Musliner, K. L., Mortensen, P. B., McGrath, J. J., Suppli, N. P., Hougaard, D. M., Bybjerg-Grauholm, J., … Agerbo, E. (2019). Association of Polygenic Liabilities for Major Depression, Bipolar Disorder, and Schizophrenia with Risk for Depression in the Danish Population. JAMA Psychiatry, 76(5), 516–525. https://doi.org/10.1001/jamapsychiatry.2018.4166

Negre, O., Eggimann, A. V., Beuzard, Y., Ribeil, J. A., Bourget, P., Borwornpinyo, S., … Payen, E. (2016, February 1). Gene Therapy of the β-Hemoglobinopathies by Lentiviral Transfer of the βa(T87Q)-Globin Gene. Human Gene Therapy. Mary Ann Liebert Inc. https://doi.org/10.1089/hum.2016.007

Pineda-Cirera, L., Shivalikanjli, A., Cabana-Domínguez, J., Demontis, D., Rajagopal, V. M., Børglum, A. D., … Fernàndez-Castillo, N. (2019). Exploring genetic variation that influences brain methylation in attention-deficit/hyperactivity disorder. Translational Psychiatry, 9(1), 242. https://doi.org/10.1038/s41398-019-0574-7

Purves, K. L., Coleman, J. R. I., Meier, S. M., Rayner, C., Davis, K. A. S., Cheesman, R., … Eley, T. C. (2019). A major role for common genetic variation in anxiety disorders. Molecular Psychiatry, 1–12. https://doi.org/10.1038/s41380-019-0559-1

Ripke, S., Neale, B. M., Corvin, A., Walters, J. T. R., Farh, K. H., Holmans, P. A., … O’Donovan, M. C. (2014). Biological insights from 108 schizophrenia-associated genetic loci. Nature, 511(7510), 421–427. https://doi.org/10.1038/nature13595

Ruderfer, D. M., Ripke, S., McQuillin, A., Boocock, J., Stahl, E. A., Pavlides, J. M. W., … Kendler, K. S. (2018). Genomic Dissection of Bipolar Disorder and Schizophrenia, Including 28 Subphenotypes. Cell, 173(7), 1705-1715.e16. https://doi.org/10.1016/j.cell.2018.05.046

Salavati, B., Rajji, T. K., Price, R., Sun, Y., Graff-Guerrero, A., & Daskalakis, Z. J. (2015). Imaging-Based Neurochemistry in Schizophrenia: A Systematic Review and Implications for Dysfunctional Long-Term Potentiation. Retrieved May 24, 2020, from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4266301/

Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., … Ideker, T. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Research, 13(11), 2498–2504. https://doi.org/10.1101/gr.1239303

Stahl, E. A., Breen, G., Forstner, A. J., McQuillin, A., Ripke, S., Trubetskoy, V., … Sklar, P. (2019). Genome-wide association study identifies 30 loci associated with bipolar disorder. Nature Genetics, 51(5), 793–803. https://doi.org/10.1038/s41588-019-0397-8

Watanabe, K., Stringer, S., Frei, O., Umićević Mirkov, M., de Leeuw, C., Polderman, T. J. C., … Posthuma, D. (2019). A global overview of pleiotropy and genetic architecture in complex traits. Nature Genetics, 51(9), 1339–1348. https://doi.org/10.1038/s41588-019-0481-0

Witt, S. H., Streit, F., Jungkunz, M., Frank, J., Awasthi, S., Reinbold, C. S., … Rietschel, M. (2017). Genome-wide association study of borderline personality disorder reveals genetic overlap with bipolar disorder, major depression and schizophrenia. Translational Psychiatry, 7(6), e1155. https://doi.org/10.1038/tp.2017.115

World Health Organization. (2017). Depression and Other Common Mental Disorders: Global Health Estimates. Licence: CC BY-NC-SA 3.0 IGO.

Zaslavsky, K., Zhang, W. B., McCready, F. P., Rodrigues, D. C., Deneault, E., Loo, C., … Ellis, J. (2019). SHANK2 mutations associated with autism spectrum disorder cause hyperconnectivity of human neurons. Nature Neuroscience, 22(4), 556–564. https://doi.org/10.1038/s41593-019-0365-8
