(1)

SymBioSys

K.U.Leuven Center for Systems Biology

(2)

Topics to be addressed

• International trend

• Project concept

• Project structure

• 3 problems and 3 cases

• Computational methodology leads to user-friendly tools and real biological impact

• Strategic importance internationally

• Strategic importance K.U.Leuven

• Coherence of the consortium

(3)

Systems biology

Biostatistics

Genetics

Sequence analysis

Expression analysis

Personalized medicine

Nutraceuticals

Post-genomic drug development (new targets, toxicogenomics)

GMOs

(4)

Systems biology

Biological question & model

High-throughput technology

Computers & databases

Mathematical models

(5)

The Human Genome Project has catalyzed striking paradigm changes in biology - biology is an information science. [...] Systems biology will play a central role in the 21st century; there is a need for global (high throughput) tools of genomics, proteomics, and cell biology to decipher biological information; and computer science and applied math will play a commanding role in converting biological information into knowledge.

Leroy Hood, Institute for Systems Biology, Seattle, WA, 2002

(6)

Center of Excellence

• Become a world-leading bioinformatics center for systems biology

Bioinformatics & microarrays

Three topics of excellence

Gene prioritization by integrative genomics

Graphical models of regulatory motifs and modules

Inference of regulatory networks

• We will achieve this goal through

Further build-up of existing expertise

Symbiosis between computational and biological partners

Concrete cases for real biological relevance

Diverse cases for generic applicability in biology

(7)

Systems biology

Genes

Modules

Networks

(8)

Project concept

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Case (×4)

(9)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Research concept & consortium

(10)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Experiment design

Research concept & consortium

(11)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella systems biology

Biological problem

Experiment design

Biological data

Research concept & consortium

(12)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Experiment design

Biological data

Data analysis

Research concept & consortium

(13)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Experiment design

Biological data

Data analysis

Biological validation

Research concept & consortium

(14)

Probabilistic models

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Experiment design

Biological data

Data analysis

Biological validation

Improved method

Research concept & consortium

(15)

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Biological problem

Experiment design

Biological data

Data analysis

Biological validation

Improved method

New biology

Probabilistic models

Research concept & consortium

(17)

Integrative genomics

Regulatory modules

Cellular networks

Genetical genomics

Endocrinology

Salmonella genomics

Probabilistic models

Peripheral groups & visibility

DME-VIB

Prometa

KUL & DME-VIB

Yeast (CMPG & Bio)

World

(18)

Project structure

WP1. Candidate genes

WP2. Regulatory modules

WP3. Cellular networks

Human genetics

Glucose regulation

VitD modes of action

Salmonella systems biology

(19)

Project structure (SysBio -> 3 partners)

Cases: Genetical genomics, Endocrinology, Salmonella genomics

Work packages: Candidate genes, Regulatory modules, Cellular networks

Data analysis: Primary analysis, Gene prioritization, Motif analysis, Network inference

Data generation: cDNA/Affy, CGH, ChIP-chip, Proteomics, Metabolomics

(20)

WP1. Candidate gene prioritization

High-throughput genomics

Statistics & data mining

Candidate genes

?

(21)

Human genetics identifies key genes in monogenic and multifactorial diseases

Algorithms: Module analysis, Statistical analysis, Gene prioritization

Technologies: CGH, cDNA/Affy

(22)

WP2. Module discovery

ACT MYLA C

MYL1 MYOG

MYF6 CHRM2

MEF2

MYOD SRF

(23)

VitD affects bone and calcium homeostasis and has potent anti-proliferative effects

Algorithms: Bayesian networks, Motif analysis, Statistical analysis, Gene prioritization

Technologies: CGH, cDNA/Affy, ChIP, Proteomics, Metabolomics

Cells/tissues treated with 1,25-(OH)₂D₃

Identification of signalling cascades and transcription factors important for the effects of 1,25-(OH)₂D₃

Validation of transcription factor (TF) binding to detected motifs

(24)

mRNA expression analysis in pancreatic beta cells: finding mechanisms of diabetes

Algorithms: Motif analysis, Statistical analysis, Gene prioritization

Technologies: Affymetrix Gene System, Generation of antibodies, Functional analysis of beta cells

Mouse models for a common human disease

mRNA expression profiles of normal & diabetic beta cells

[Figure: Signal Log Ratio of mRNA in beta cells versus other tissues, scale from <-2.5 to >2.5; tissues shown: beta cells, non-beta cells, brain, pituitary, lung, kidney, fat, liver, muscle]

Discovery of new modules for post-transcriptional gene regulation

(25)

WP3. Network inference

Input data: microarray data, ChIP-chip data, sequence data

ChIP-chip procedure (figure labels: Regulator, Tag):

Library of strains, each with a tagged regulator

Chromatin IP to enrich promoters bound by regulator in vivo

Microarray to identify promoters bound by regulator in vivo

Network inference: REMODISCOVERY (combinatorial algorithm)

Result columns: R, M, Functional Class: p-value, Seed Profile

Module 1
R: Mbp1, Swi6, Swi4, Stb1
M: M_18 (Mbp1), M_12 (Mbp1), M_11 (Swi4), M_67 (Swi4)
Functional class: 10 CELL CYCLE AND DNA PROCESSING: 0; 10.03 cell cycle: 2.7e-5; 10.01 DNA processing: 1.3e-4; 42.04 cytoskeleton: 4.2e-3

Module 2
R: Swi4, Mbp1, Swi6, FKH2
M: M_18 (Mbp1), M_12 (Mbp1), M_11 (Swi4), M_8 (Mcm)
Functional class: 40 CELL FATE: 5.2e-4; 40.01 cell growth / morphogenesis: 2.6e-3; 43 CELL TYPE DIFFERENTIATION: 5.2e-3; 43.01 fungal/microorganismic cell type differentiation: 5.2e-3; 34.11 cellular sensing and response: 5.3e-3; 01.05.01 C-compound and carbohydrate utilization: 6.8e-3; 10.03.04.03 chromosome condensation: 9.4e-3

Module 3
R: NDD1, FKH2, Mcm1
M: M_8 (Mcm), M_30 (Mcm)
Functional class: 43 CELL TYPE DIFFERENTIATION: 3.6e-3; 43.01 fungal/microorganismic cell type differentiation: 3.6e-3; 10.03.03 cytokinesis (cell division) / septum formation: 4.8e-3

Module 4
R: Swi5 (Ace2)
M: M_8 (Mcm)
Functional class: 32.01 stress response: 3.2e-3; 10.03 cell cycle: 8.7e-3
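As a rough illustration of how binding data and motif data can be combined into candidate modules, the sketch below groups genes that share exactly the same set of bound regulators and the same set of motifs. It is a deliberately simplified stand-in and not the REMODISCOVERY algorithm named on the slide; the function name `candidate_modules`, the `min_genes` parameter and all gene, regulator and motif labels are hypothetical.

```python
def candidate_modules(binding, motifs, min_genes=2):
    """Group genes sharing the same regulators (ChIP-chip) and the same motifs.

    binding: dict gene -> set of regulators bound to its promoter (assumed input)
    motifs:  dict gene -> set of motifs found in its promoter (assumed input)
    """
    groups = {}
    # Only genes present in both data sources can be assigned to a module.
    for gene in binding.keys() & motifs.keys():
        key = (frozenset(binding[gene]), frozenset(motifs[gene]))
        groups.setdefault(key, set()).add(gene)
    # Keep groups that have at least one regulator and enough member genes.
    return {key: genes for key, genes in groups.items()
            if key[0] and len(genes) >= min_genes}


# Hypothetical toy data, not taken from the slides.
binding = {"gene1": {"TF1", "TF2"}, "gene2": {"TF1", "TF2"},
           "gene3": {"TF3"}, "gene4": set()}
motifs = {"gene1": {"M1"}, "gene2": {"M1"}, "gene3": {"M2"}, "gene4": {"M1"}}
modules = candidate_modules(binding, motifs)
```

Requiring identical regulator and motif sets is the strictest possible grouping; a realistic method would also use expression data and tolerate noise in each source.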

(26)

Salmonella is a powerful model for systems biology (illustration size)

Algorithms: Network inference, Module analysis, Statistical analysis, Gene prioritization

Technologies: CGH, cDNA/Affy, ChIP, Proteomics, Metabolomics

ChIP-chip procedure (figure labels: Regulator, Tag):

Library of strains, each with a tagged regulator

Chromatin IP to enrich promoters bound by regulator in vivo

Microarray to identify promoters bound by regulator in vivo

[Figure: heterogeneous data (a binary matrix of Gene 1 ... Gene n against TF1 ... TFm, a binary matrix of Gene 1 ... Gene n against motifs M1 ... Mp, and an expression matrix of Gene 1 ... Gene n against experiments E1 ... Ex) pass through preprocessing and a motif compendium to give the inferred network]

(27)

Toucan 2

CGHGate

Endeavour

(28)

Real biological impact

• Screenshots of titles of papers demonstrating a real biological impact of bioinformatics methods?

(29)

Bioi@SCD growth

• Turnover since 1998

[Chart: turnover per funding channel, 1998-2009, on a scale from 0 to 1,400,000; channels: IWT, FWO, EU, DWTC, BOF]

(30)

CMPG

• J. Vanderleyden

• J. Michiels

• B. Cammue

Dept. of Mol. Microbiology

• J. Thevelein

CME-MG

• B. Hassan

• P. Marynen

• B. De Strooper

• W. Van de Ven

Lab of Clin. & Evolut.

Virology

• A. Vandamme

Dept. of Transgene Tech. &

Gene Therapy

• P. Carmeliet

CME-UZ

• JJ. Cassiman (CME-KUL)

• J. Vermeesch

Intensive Care

• G. Van Den Berghe

Obstetrics & Gynaecology

• I. Vergote

• T. D'Hooghe

• D. Timmerman

[Diagram: links between the groups, each labelled "Paper"]

Lab of Functional Biology

• J. Winderickx

LEGENDO

• C. Mathieu

(31)

CMPG

• J. Vanderleyden

• J. Michiels

• B. Cammue

Lab of Clin. & Evolut.

Virology

• A. Vandamme

QuantPsy

• I. Van Mechelen

Lab of Functional Biology

• J. Winderickx

LEGENDO

• C. Mathieu

Mol.Cell Biology BioChemistry

• F. Schuit

BioStat

• G. Verbeke

Dept. of Mol. Microbiology

• J. Thevelein

Dept. of Transgene Tech. &

Gene Therapy

• P. Carmeliet

CME-MG

• B. Hassan

• P. Marynen

• B. De Strooper

• W. Van de Ven

CME-UZ

• JJ. Cassiman

• J. Vermeesch

Intensive Care

• G. Van Den Berghe

Obstetrics & Gynaecology

• I. Vergote

• T. D'Hooghe

• D. Timmerman

[Diagram: links between the groups, each labelled "CoE"]

(32)

European bioinformatics landscape

(33)
(34)

• Integration bioinformatics & stats

• Algorithmic methodologies

(35)

Three topics of excellence

• Bioinformatics & microarrays

1. Gene prioritization by integrative genomics

2. Graphical models of regulatory motifs and modules

3. Bayesian networks for prokaryotic systems biology

(36)

(1) Genomic data fusion

After an experiment, many sources of information are available to select the best candidates for modeling and validation

Probabilistic methods can optimize the prioritization

Known genes related to a disease or pathway

Candidate genes

Locus

Screening

Multiple data sources

• Sequence

• Expression

• Function
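To make the idea of prioritizing candidates across several data sources concrete, here is a minimal sketch that ranks candidates within each source and then orders them by mean rank. Mean-rank fusion is a simple stand-in chosen for illustration and is not claimed to be the method used by Endeavour; the function name `fuse_rankings`, the source names and all scores are hypothetical.

```python
from statistics import mean


def fuse_rankings(scores_by_source):
    """Order candidate genes by their mean rank over the available sources.

    scores_by_source: dict source -> dict gene -> similarity score to the
    known genes (higher is better). A gene missing from a source is skipped
    for that source.
    """
    genes = sorted(set().union(*(s.keys() for s in scores_by_source.values())))
    mean_rank = {}
    for gene in genes:
        ranks = []
        for scores in scores_by_source.values():
            if gene not in scores:
                continue
            # Rank 1 is the highest score within this source.
            ranks.append(1 + sum(1 for other in scores.values()
                                 if other > scores[gene]))
        mean_rank[gene] = mean(ranks)
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(genes, key=lambda g: (mean_rank[g], g))


# Hypothetical scores for three candidates in three data sources.
example = {
    "sequence": {"geneA": 0.9, "geneB": 0.4, "geneC": 0.1},
    "expression": {"geneA": 0.7, "geneB": 0.8, "geneC": 0.2},
    "function": {"geneA": 0.6, "geneC": 0.5},
}
ranking = fuse_rankings(example)
```

A probabilistic fusion would additionally weigh how surprising a set of ranks is under a null model; the mean rank only conveys the basic principle of combining sources.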

(37)

Endeavour [Methodological impact]

http://www.esat.kuleuven.ac.be/endeavour

(38)

(2) Regulatory modules [what is a module? What is transcript. regulation?]

© Davidson EH et al. Science. 2002 Mar 1;295(5560):1669-78.

(39)

Gibbs motif finding

• Initialization

Sequences

Random motif matrix

• Iteration

Sequence scoring

Alignment update

Motif instances

Motif matrix

• Termination

Convergence of the alignment and of the motif matrix
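The three phases listed above can be sketched as a small Gibbs sampler. This is a minimal illustration under stated assumptions, not the MotifSampler implementation: it assumes one motif instance per sequence, a uniform background, random starting positions in place of a random motif matrix, and a fixed number of sweeps in place of a convergence test. All function names and the toy sequences are hypothetical.

```python
import random

BASES = "ACGT"


def build_matrix(sequences, starts, width, skip=None, pseudocount=1.0):
    """Motif matrix (per-position base frequencies) from the current instances."""
    counts = [{b: pseudocount for b in BASES} for _ in range(width)]
    for i, (seq, start) in enumerate(zip(sequences, starts)):
        if i == skip:
            continue  # leave out the sequence that is being resampled
        for j in range(width):
            counts[j][seq[start + j]] += 1
    return [{b: col[b] / sum(col.values()) for b in BASES} for col in counts]


def score(seq, start, matrix):
    """Likelihood of the window beginning at `start` under the motif matrix."""
    p = 1.0
    for j, col in enumerate(matrix):
        p *= col[seq[start + j]]
    return p


def gibbs_motif_search(sequences, width, iterations=200, seed=0):
    rng = random.Random(seed)
    # Initialization: one random motif position per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    # Iteration: score each sequence, then update its motif instance.
    for _ in range(iterations):
        for i, seq in enumerate(sequences):
            matrix = build_matrix(sequences, starts, width, skip=i)
            weights = [score(seq, k, matrix)
                       for k in range(len(seq) - width + 1)]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    # Termination: a fixed sweep count stands in for a convergence check.
    return starts, build_matrix(sequences, starts, width)


# Hypothetical toy sequences.
seqs = ["ACGTTGACCATG", "TTGACCAGGTAC", "GGATTGACCTTA"]
starts, matrix = gibbs_motif_search(seqs, width=6)
```

Because positions are sampled rather than maximized, different seeds can give different alignments; in practice several runs are compared.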

(40)

MotifSampler & TOUCAN

(41)

(3) Network inference

Reconstruction of the regulatory network underlying the phenotypic behavior

High throughput data

(42)

Benchmarking network inference methods

Network simulation: realistic network structures, realistic network dynamics -> simulated networks

Network inference: graphical models, system identification -> inferred networks

[Equation for the network dynamics (symbols A, K, v, vmax); not recoverable from the source]
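One common way to benchmark an inference method against a simulated network is to compare edge sets, as in the sketch below. The choice of precision and recall as the scores, the function name `edge_precision_recall` and the toy edges are assumptions made for illustration; the slides do not state which measures were used.

```python
def edge_precision_recall(true_edges, inferred_edges):
    """Precision and recall of inferred directed edges against the true ones."""
    true_set, inferred_set = set(true_edges), set(inferred_edges)
    true_positives = len(true_set & inferred_set)
    precision = true_positives / len(inferred_set) if inferred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    return precision, recall


# Hypothetical regulator -> target edges.
simulated = [("TF1", "gene1"), ("TF1", "gene2"),
             ("TF2", "gene3"), ("TF3", "gene1")]
inferred = [("TF1", "gene1"), ("TF2", "gene3"), ("TF2", "gene4")]
precision, recall = edge_precision_recall(simulated, inferred)
```

Here two of the three inferred edges are correct and two of the four true edges are recovered.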

(43)
(44)

Workpackages

WP1: Candidate genes

Preliminary data analysis

Microarrays (xM1.1)

Generic

CGH microarrays (gWP1)

Genetical genomics

Dealing with noise (xM2.1)

Knowledge mining (gWP2)

& Combined modeling of different data sets (xM2.3)

Genetical genomics

Generic -> WP3: Salmonella

Software & databases (xM1.4)

(45)

Workpackages

WP2: Regulatory modules

Motif and module discovery (xM1.2)

Expression profiling in vitD and analogs pathways (xM3.1, xM3.2)

Beta cell regulation

Transcriptional regulation

Post-transcriptional regulation

Genetic modules

Multiple genome scans and gene modifiers?

Software & databases (xM1.4)

WP3: Cellular networks

Network inference (xM1.3)

Salmonella high-throughput technologies (xM4.1)

Salmonella high-throughput data and analysis (xM4.2)

VitD pathway modeling? Glucose sensing?

Detection of dependence relations (xM2.2)

Software & databases (xM1.4)

(46)

Bioi@SCD growth

• Personnel since 1998

[Chart: personnel evolution 1998-2005, quarterly from Jul-98 to Jan-05, on a scale from 0 to 25; categories: PhD, Postdoc, ZAP]

(47)

Bioi@SCD growth

• Publications since 1998

[Chart: number of publications per year, 1999-2005, on a scale from 0 to 20; categories: Books, Conference, Journal]

(48)

Bioi@SCD growth

• 5 successful PhDs

Gert Thijs (June 2003): Probabilistic methods to search for regulatory elements in sets of coregulated genes

Frank De Smet (May 2004): Microarrays: algorithms for knowledge discovery in oncology and molecular biology

Stein Aerts (May 2004): Computational discovery of cis-regulatory modules in animal genomes

Geert Fannes (June 2004): Bayesian learning with expert knowledge: Transforming informative priors between Bayesian networks and multilayer perceptrons

Patrick Glenisson (June 2004): Integrating scientific literature with large scale gene expression analysis

(49)

Bioi@SCD growth

• Software portal

http://www.esat.kuleuven.ac.be/~dna/Bioi/

[Chart: number of users per month, Nov-00 to Feb-04, on a scale from 0 to 1400]

Toucan 2

Endeavour

(50)

CMPG

• J. Vanderleyden

• J. Michiels

• B. Cammue

Dept. of Mol. Microbiology

• J. Thevelein

CME-MG

• B. Hassan

• P. Marynen

• B. De Strooper

• W. Van de Ven

Intensive Care

• G. Van Den Berghe

Obstetrics & Gynaecology

• I. Vergote

• T. D'Hooghe

• D. Timmerman

IDO, BOF PostDoc GBOU, PhD

Project, PhD, PostDoc

(51)

CAGE

(52)

Flemish biotech companies (2005)

Bruges: Genencor International

Ghent: Ablynx, AlgoNomics, Applied Maths, Bayer BioScience, Bioin4matrix, BioMARIC, CropDesign, deVGen, Innogenetics, Maize Technologies Int’l, Methexis Genomics, Xcellentis, Yakult, Peakadilly

Antwerp: DCI-labs, Flen Pharma, Histogenex, Memo Bead Technologies

Turnhout: DiaMed EuroGen, Janssen Pharmaceutica

Geel: Barrier Therapeutics, Genzyme Flanders, Maia Scientific

Mechelen: Bio-Art, CryoSave, Galapagos Genomics, Tibotec, Virco

Brussels: Beta-cell, Dentech, EggCentris, R.E.D. Laboratories

Leuven: 4AZA Bioscience, Diatos, Neurogenetics, PharmaDM, reMynd, RNA-TEC, Thromb-X, Tigenix, Vivactis

Other cities shown on the map: Kortrijk, Hasselt

(53)

Project structure – budget (750 KEuro?)

Cases: Genetical genomics, Endocrinology, Salmonella genomics

Work packages: Candidate genes (PI: ), Regulatory modules (PI: ), Cellular networks (PI: )

Algorithmic research: Statistical analysis, Gene prioritization, Motif analysis, Bayesian networks

Data generation: cDNA/Affy, CGH, ChIP-chip, Proteomics, Metabolomics

Personnel: Postdoc 1, Postdoc 2, Postdoc 3, PhD 1, PhD 2, PhD 3, PhD 4, Techn 1, Techn 2, Techn 3

(54)
(55)

Miscellaneous

First citations with "bioinformatics"

Trends Biotechnol 1993

Ann N Y Acad Sci 1993

(56)

Network reconstruction based on heterogeneous data

Input data: microarray data, ChIP-chip data, sequence data

ChIP-chip procedure (figure labels: Regulator, Tag):

Library of strains, each with a tagged regulator

Chromatin IP to enrich promoters bound by regulator in vivo

Microarray to identify promoters bound by regulator in vivo

Preprocessing -> Network inference

(57)

[Equation for the network dynamics (symbols A, K, v, vmax); not recoverable from the source]

Network structures based on real biological networks

Realistic network dynamics Simulated networks

Benchmarking network inference methodologies

(59)

[Equation for the network dynamics (symbols A, K, v, vmax); not recoverable from the source]

Realistic network structures

Realistic network dynamics

Simulated networks

Benchmarking network inference methodologies

Inferred networks

Graphical models

System identification

(60)

Now: the molecular pipeline

Powerful high-throughput technologies enable genomewide screening

Sequencing, microarrays, etc.

Some genes selected (arbitrarily) for validation

After a long validation the best-known genes are integrated into a biological model (building predictive models on a limited set of genes is not the subject of the project)

Screen

Validate

Model

(61)

Future: the systems genomics pipeline

Validate Select

By integrating computation tightly with biological experiments, promising genes are selected and integrated into computational models, so that only the best candidates are retained for validation

There is a continuous interchange between the different levels of analysis

Screen

Model
