Probabilistic Machine Learning Approaches to Medical Classification Problems

Chuan LU

Jury:

Prof. L. Froyen, chairman
Prof. J. Vandewalle
Prof. S. Van Huffel, promotor
Prof. J. Beirlant
Prof. J.A.K. Suykens, promotor
Prof. P.J.G. Lisboa

Clinical decision support systems

• Advances in technologies facilitate data collection
• Computer-based decision support systems
• Human beings: subjective, experience dependent
• Artificial intelligence (AI) in medicine
  • Expert systems
  • Machine learning
  • Diagnostic modelling
  • Knowledge discovery

[Figure: example expert-system screen for coronary disease]

Medical classification problems

• Essential for clinical decision making
• Constrained diagnosis problem
  • e.g. benign (-) vs. malignant (+), for tumors
• Classification
  • Find a rule to assign an observation into one of the existing classes
  • Supervised learning, pattern recognition
• Our applications:
  • Ovarian tumor classification with patient data
  • Brain tumor classification based on MRS spectra
  • Benchmarking cancer diagnosis based on microarray data

Machine learning

• Apply learning algorithms: autonomous acquisition and integration of knowledge
• Approaches
  • Conventional statistical learning algorithms
  • Artificial neural networks, kernel-based models
  • Decision trees
  • Learning sets of rules
  • Bayesian networks
• Good performance: artificial neural networks, kernel-based models, decision trees, rule sets, Bayesian networks

Building classifiers: a flowchart

• Training: training patterns + class labels → machine learning algorithm → classifier
• Test / prediction: new pattern → classifier → predicted class (probability of disease)
• Feature selection and model selection are part of the training stage
• Central issue: good generalization performance!
  • Trade-off: model fitness ⇔ complexity
  • Regularization, Bayesian learning
  • Probabilistic framework
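Illustrative sketch of the flowchart above (not part of the original slides; assumes scikit-learn and synthetic data): training patterns and class labels are fed to a learning algorithm, feature and model selection happen inside cross-validation, and the fitted classifier returns a probability of disease for a new pattern.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))               # training patterns
    y = (X[:, 0] - X[:, 2] > 0).astype(int)      # class labels

    pipe = Pipeline([
        ("select", SelectKBest(f_classif)),      # feature selection
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    grid = {"select__k": [2, 5, 10], "clf__C": [0.1, 1.0, 10.0]}
    search = GridSearchCV(pipe, grid, cv=5)      # model selection
    search.fit(X, y)                             # training

    x_new = rng.normal(size=(1, 10))             # new pattern
    print("probability of disease:", search.predict_proba(x_new)[0, 1])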

Outline

• Supervised learning
• Bayesian frameworks for blackbox models
• Preoperative classification of ovarian tumors
• Bagging for variable selection and prediction in cancer diagnosis problems
• Conclusions

Conventional linear classifiers

• Linear discriminant analysis (LDA)
  • Discriminate using the projection z = w^T x ∈ R
  • Maximize the between-class variance (S_b) while minimizing the within-class variance (S_w)
• Logistic regression (LR)
  • Logit: log(odds) = log( p / (1 - p) ) = w^T x + b
  • Output p: probability of malignancy
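Minimal sketch linking the logistic-regression formulation above to code (assumes scikit-learn; data is synthetic): the fitted model provides w and b, and the predicted probability of malignancy equals the logistic function applied to w^T x + b.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = (X @ np.array([1.5, -2.0, 0.0]) + 0.3 * rng.normal(size=100) > 0).astype(int)

    lr = LogisticRegression().fit(X, y)
    w, b = lr.coef_[0], lr.intercept_[0]

    x = X[0]
    logit = w @ x + b                            # log( p / (1 - p) ) = w^T x + b
    p = 1.0 / (1.0 + np.exp(-logit))
    print(p, lr.predict_proba(x.reshape(1, -1))[0, 1])   # the two values agree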

Feedforward neural networks

• Multilayer perceptrons (MLP)
  • Inputs x_1, ..., x_D, a hidden layer of units, and an output unit with a bias term
  • f(x, w) = Σ_{j=0}^{M} w_j φ_j(x), with φ_0 ≡ 1 for the bias
  • φ_j: activation function of hidden unit j
• Radial basis function (RBF) neural networks
  • φ_j: basis function
• Training (back-propagation, Levenberg-Marquardt, conjugate gradients, ...), validation, test
• Regularization, Bayesian methods
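Toy forward pass for the MLP form f(x, w) = Σ_j w_j φ_j(x) above, with tanh hidden units (numpy only; the weights are random, so this illustrates the functional form, not a trained network).

    import numpy as np

    rng = np.random.default_rng(0)
    D, M = 4, 3                                  # number of inputs, hidden units
    V = rng.normal(size=(M, D))                  # input-to-hidden weights
    c = rng.normal(size=M)                       # hidden-unit biases
    w = rng.normal(size=M)                       # hidden-to-output weights w_j
    w0 = rng.normal()                            # output bias (the phi_0 = 1 term)

    def f(x):
        phi = np.tanh(V @ x + c)                 # hidden-unit activations phi_j(x)
        return w @ phi + w0                      # f(x, w) = sum_j w_j phi_j(x) + w0

    print(f(rng.normal(size=D)))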

Support vector machines (SVM)

• Statistical learning theory [Vapnik95]
• For classification, functional form:
  y(x) = sign( Σ_{i=1}^{N} α_i y_i k(x, x_i) + b )
• Margin maximization
  • Hyperplane: w^T x + b = 0
  • w^T x + b > 0: class +1;  w^T x + b < 0: class -1
  • Margin between the classes: 2 / ||w||
• Kernel trick
  • Map x ⇒ φ(x); feature space: f(x) = w^T φ(x) + b
  • Mercer's theorem: k(x, z) = <φ(x), φ(z)> for a positive definite kernel k(.,.)
  • Dual space: f(x) = Σ_{i=1}^{N} α_i y_i k(x, x_i) + b
  • Linear kernel: k(x, z) = x^T z
  • RBF kernel: k(x, z) = exp( -||x - z||^2 / r^2 )
• Training by quadratic programming
• Sparseness, unique solution
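Short sketch of an SVM classifier with an RBF kernel (assumes scikit-learn; data is synthetic). The parameter gamma plays the role of 1/r^2 in the RBF kernel written above, and n_support_ shows the sparseness of the solution.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 2))
    y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0, 1, -1)    # not linearly separable

    svm = SVC(kernel="rbf", C=10.0, gamma=1.0).fit(X, y)
    print("support vectors per class:", svm.n_support_)       # sparse solution
    print("predictions:", svm.predict([[0.2, 0.1], [2.0, 1.5]]))  # sign of f(x)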

Least squares SVMs

• LS-SVM classifier [Suykens99]
  • SVM variant
  • Inequality constraints ⇒ equality constraints
  • Quadratic programming ⇒ solving a set of linear equations
• Primal problem:
  min_{w,b,e} J(w, b) = (1/2) w^T w + C Σ_{i=1}^{N} e_i^2
  s.t. y_i [ w^T φ(x_i) + b ] = 1 - e_i,  i = 1, ..., N
  with regularization constant C and f(x) = w^T φ(x) + b
• Dual problem (solved in the dual space):
  [ 0        y^T       ] [ b ]   [ 0   ]
  [ y    Ω + (1/C) I   ] [ α ] = [ 1_v ]
  with y = [y_1, ..., y_N]^T, 1_v = [1, ..., 1]^T, e = [e_1, ..., e_N]^T, α = [α_1, ..., α_N]^T,
  and Ω_ij = y_i y_j φ(x_i)^T φ(x_j) = y_i y_j k(x_i, x_j)
• Resulting classifier: y(x) = sign[ Σ_{i=1}^{N} α_i y_i k(x, x_i) + b ]
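Minimal sketch of training an LS-SVM classifier by solving the dual linear system above (numpy only; RBF kernel; synthetic data). C and r correspond to the regularization constant and kernel width used in the slides.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)      # labels in {-1, +1}

    def rbf_kernel(A, B, r=1.0):
        # k(x, z) = exp( -||x - z||^2 / r^2 )
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / r ** 2)

    C, r = 10.0, 1.0
    N = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, r)

    # Dual system: [[0, y^T], [y, Omega + I/C]] [b; alpha] = [0; 1_v]
    system = np.zeros((N + 1, N + 1))
    system[0, 1:] = y
    system[1:, 0] = y
    system[1:, 1:] = Omega + np.eye(N) / C
    rhs = np.concatenate(([0.0], np.ones(N)))
    sol = np.linalg.solve(system, rhs)
    b, alpha = sol[0], sol[1:]

    def predict(X_new):
        # y(x) = sign( sum_i alpha_i y_i k(x, x_i) + b )
        return np.sign(rbf_kernel(X_new, X, r) @ (alpha * y) + b)

    print("training accuracy:", (predict(X) == y).mean())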

Model evaluation

• Performance measures
  • Accuracy: correct classification rate
  • Receiver operating characteristic (ROC) analysis
• Confusion table:

                     True +   True -
     Test result +     TP       FP
     Test result -     FN       TN

  sensitivity = TP / (TP + FN)
  specificity = TN / (TN + FP)
• ROC curve
  • Area under the ROC curve: AUC = P[ y(x_-) < y(x_+) ], the probability that a negative case receives a lower output than a positive case
• Assumption: equal misclassification costs and a constant class distribution in the target environment
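Small sketch (assumes scikit-learn) computing the confusion table, sensitivity, specificity and AUC from predicted malignancy probabilities; the numbers and the 0.5 threshold are illustrative only.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])                   # 1 = malignant, 0 = benign
    p_malig = np.array([0.9, 0.7, 0.4, 0.6, 0.2, 0.3, 0.8, 0.55])
    y_pred = (p_malig >= 0.5).astype(int)

    TP = np.sum((y_pred == 1) & (y_true == 1))
    FP = np.sum((y_pred == 1) & (y_true == 0))
    FN = np.sum((y_pred == 0) & (y_true == 1))
    TN = np.sum((y_pred == 0) & (y_true == 0))

    sensitivity = TP / (TP + FN)
    specificity = TN / (TN + FP)
    auc = roc_auc_score(y_true, p_malig)     # estimates P[ y(x-) < y(x+) ]
    print(sensitivity, specificity, auc)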


Bayesian frameworks for blackbox models

• Advantages
  • Automatic control of model complexity, without cross-validation
  • Possibility to use prior information and hierarchical models for the hyperparameters
  • Predictive distribution for the output
• Principle of Bayesian learning [MacKay95]
  • Define the probability distribution over all quantities within the model
  • Update the distribution given the data using Bayes' rule

Bayesian inference

Bayes' rule:  Posterior = (Likelihood × Prior) / Evidence

• Level 1: infer w, b for given θ and H
  p(w, b | D, θ, H) = p(D | w, b, θ, H) p(w, b | θ, H) / p(D | θ, H)
• Level 2: infer the hyperparameters θ
  p(θ | D, H) = p(D | θ, H) p(θ | H) / p(D | H)
  θ: e.g. regularization constant and kernel parameter (such as the RBF kernel width) of model H
• Level 3: compare models H_j
  p(H_j | D) = p(D | H_j) p(H_j) / p(D)
  p(D | H_j): model evidence
• Marginalization (Gaussian approximation)   [MacKay95, Suykens02, Tipping01]

Sparse Bayesian learning (SBL)

• Automatic relevance determination (ARD) applied to f(x) = w^T φ(x)
• The prior for each weight w_m varies; hierarchical priors ⇒ sparseness
• Basis functions φ(x)
  • Original variables ⇒ linear SBL model ⇒ variable selection!
  • Kernels ⇒ relevance vector machines (RVM)
    • Relevance vectors: prototypical patterns
• Sequential SBL algorithm [Tipping03]
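Sketch of ARD-style variable selection on the original variables (assumes scikit-learn). ARDRegression is used here as a convenient stand-in for a linear sparse Bayesian model; the thesis uses sparse Bayesian classifiers, so treating the ±1 labels as regression targets is purely for illustration.

    import numpy as np
    from sklearn.linear_model import ARDRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))
    y = np.sign(2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.3 * rng.normal(size=200))

    ard = ARDRegression().fit(X, y)

    # ARD drives the weights of irrelevant variables towards zero
    # (their estimated precisions in ard.lambda_ become very large).
    relevant = np.where(np.abs(ard.coef_) > 1e-3)[0]
    print("selected variables:", relevant)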

Sparse Bayesian LS-SVMs

• Iterative pruning of the easy cases (support value α < 0) [Lu02]
• Mimics margin maximization as in the SVM
• The remaining support vectors lie close to the decision boundary

Variable (feature) selection

• Importance in medical classification problems
  • Economics of data acquisition
  • Accuracy and complexity of the classifiers
  • Gain insight into the underlying medical problem
• Approaches: filter, wrapper, embedded
• We focus on model-evidence-based methods within the Bayesian framework [Lu02, Lu04]
  • Forward / stepwise selection
  • Bayesian LS-SVM
  • Sparse Bayesian learning models
  • Accounting for uncertainty in variable selection via sampling methods


Ovarian cancer diagnosis

• Problem
  • Ovarian masses
  • Ovarian cancer: high mortality rate, difficult early detection
  • Treatment differs between the types of ovarian tumors
  • Goal: develop a reliable diagnostic tool to preoperatively discriminate between malignant and benign tumors, and assist clinicians in choosing the treatment
• Medical techniques for preoperative evaluation
  • Serum tumor marker: CA125 blood test
  • Ultrasonography
  • Color Doppler imaging and blood flow indexing
• Two-stage study
  • Preliminary investigation: KU Leuven pilot project, single-center
  • Extensive study: the multi-center IOTA project (see below)

Ovarian cancer diagnosis: attempts to automate the diagnosis

• Risk of Malignancy Index (RMI) [Jacobs90]
  RMI = score_morph × score_meno × CA125
• Mathematical models
  • Logistic regression
  • Multilayer perceptrons
  • Kernel-based models
  • Bayesian belief networks
  • Hybrid methods
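Tiny sketch of the RMI formula above. The 0/1/3 morphology score and the 1/3 menopausal score follow the usual RMI convention and are an assumption here, not taken from the slides.

    def rmi(n_ultrasound_findings: int, postmenopausal: bool, ca125: float) -> float:
        # score_morph: 0 findings -> 0, 1 finding -> 1, >= 2 findings -> 3 (assumed convention)
        score_morph = 0 if n_ultrasound_findings == 0 else (1 if n_ultrasound_findings == 1 else 3)
        score_meno = 3 if postmenopausal else 1        # assumed convention
        return score_morph * score_meno * ca125

    print(rmi(n_ultrasound_findings=2, postmenopausal=True, ca125=120.0))   # 3 * 3 * 120 = 1080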

Preliminary investigation: pilot project

• Patient data collected at the University Hospitals Leuven, Belgium, 1994~1999
• 425 records (data with missing values were excluded), 25 features
• 291 benign tumors, 134 (32%) malignant tumors
• Preprocessing, e.g.
  • CA_125 -> log transform
  • Color_score {1,2,3,4} -> 3 design variables {0,1}
• Descriptive statistics (demographic, serum marker, color Doppler imaging and morphologic variables):

  Variable (symbol)                       Benign        Malignant
  Demographic
    Age (age)                             45.6 ± 15.2   56.9 ± 14.6
    Postmenopausal (meno)                 31.0 %        66.0 %
  Serum marker
    CA 125 (log) (l_ca125)                3.0 ± 1.2     5.2 ± 1.5
  CDI
    High blood flow (colsc3,4)            19.0 %        77.3 %
  Morphologic
    Abdominal fluid (asc)                 32.7 %        67.3 %
    Bilateral mass (bilat)                13.3 %        39.0 %
    Unilocular cyst (un)                  45.8 %        5.0 %
    Multiloc/solid cyst (mulsol)          10.7 %        36.2 %
    Solid (sol)                           8.3 %         37.6 %
    Smooth wall (smooth)                  56.8 %        5.7 %
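Sketch of the two preprocessing steps listed above (assumes pandas; the column names and values are illustrative).

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "ca_125": [35.0, 410.0, 12.0, 980.0],
        "color_score": [1, 3, 2, 4],
    })

    df["l_ca125"] = np.log(df["ca_125"])                    # CA_125 -> log
    dummies = pd.get_dummies(df["color_score"], prefix="colsc", drop_first=True).astype(int)
    df = pd.concat([df.drop(columns="color_score"), dummies], axis=1)
    print(df)   # colsc_2, colsc_3, colsc_4 are the three {0,1} design variables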

Experiment: pilot project

• Desired properties of the models
  • Output a probability of malignancy
  • High sensitivity for malignancy ↔ low false positive rate
• Compared models
  • Bayesian LS-SVM classifiers
  • RVM classifiers
  • Bayesian MLPs
  • Logistic regression
  • RMI (reference)
• 'Temporal' cross-validation
  • Training set: 265 data (1994~1997)
  • Test set: 160 data (1997~1999)
• Multiple runs of stratified randomized cross-validation
  • Improved test performance
  • Conclusions for model comparison similar to the temporal CV

Variable selection: pilot project

• Forward variable selection based on the Bayesian LS-SVM model evidence
• 10 variables were selected based on the training set (first 265 treated patients) using RBF kernels
• [Figure: evolution of the model evidence during the forward selection]
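Minimal sketch of greedy forward variable selection. The slides rank candidate variables by the Bayesian LS-SVM model evidence; since that computation is model specific, cross-validated AUC of a simple classifier is used here as a stand-in scoring criterion (an assumption for illustration only).

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 8))
    y = (X[:, 1] - 0.8 * X[:, 4] + 0.5 * rng.normal(size=300) > 0).astype(int)

    selected, remaining, best_score = [], list(range(X.shape[1])), -np.inf
    while remaining:
        scores = {j: cross_val_score(LogisticRegression(max_iter=1000),
                                     X[:, selected + [j]], y, cv=5,
                                     scoring="roc_auc").mean()
                  for j in remaining}
        j_best, s_best = max(scores.items(), key=lambda kv: kv[1])
        if s_best <= best_score:          # stop when no candidate improves the score
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_score = s_best

    print("selected variables:", selected)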

Model evaluation: pilot project

• Compare the predictive power of the models given the selected variables
• [Figure: ROC curves on the test set (data from the 160 most recently treated patients)]
• Comparison of model performance on the test set with rejection based on
  | P(y = +1 | x) - 0.5 |  (a small value means a highly uncertain case; the most uncertain cases are rejected)
• The rejected patients need further examination by human experts
• The posterior probability is essential for medical decision making
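Sketch of the rejection rule described above: predictions whose posterior probability of malignancy lies closest to 0.5 are withheld and referred to a human expert. The probabilities and the 0.10 rejection band are illustrative only.

    import numpy as np

    p_malig = np.array([0.97, 0.52, 0.08, 0.46, 0.81, 0.33])
    certainty = np.abs(p_malig - 0.5)       # small value = uncertain case

    reject = certainty < 0.10               # rejection threshold (assumption)
    y_pred = np.where(p_malig >= 0.5, "malignant", "benign")

    print("rejected for expert review:", np.where(reject)[0])
    print("automatic predictions:", list(zip(np.where(~reject)[0], y_pred[~reject])))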

Extensive study: IOTA project

• International Ovarian Tumor Analysis (IOTA)
• Protocol for data collection
• A multi-center study
  • 9 centers
  • 5 countries: Sweden, Belgium, Italy, France, UK
• 1066 data of the dominant tumors
  • 800 (75%) benign
  • 266 (25%) malignant

Data: IOTA project

• [Figure: number of benign, primary invasive, borderline and metastatic tumors per center (MSW, LBE, RIT, MIT, BFR, MFR, KUK, OIT, NIT)]
• Metastatic tumors per center (same order): 11, 17, 10, 1, 0, 0, 2, 1, 0

Model development: IOTA project

• Randomly divide the data into
  • Training set: N_train = 754; test set: N_test = 312
  • Stratified for tumor types and centers
• Model building based on the training data
  • Variable selection with / without CA125, using the Bayesian LS-SVM
  • Compared models: LRs; Bayesian LS-SVMs; RVMs
  • Kernels: linear, RBF, additive RBF
• Model evaluation
  • ROC analysis
  • Performance of all centers as a whole / of the individual centers
  • Model interpretation?
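Sketch of the stratified random split used above (assumes scikit-learn). Stratifying on a combined "malignancy x center" label keeps both factors balanced across the training and test sets; the data below is synthetic and only illustrates the mechanics.

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(3)
    n = 1066
    df = pd.DataFrame({
        "center": rng.choice(["MSW", "LBE", "RIT", "MIT", "BFR", "MFR", "KUK", "OIT", "NIT"], size=n),
        "malignant": rng.choice([0, 1], size=n, p=[0.75, 0.25]),
    })

    strata = df["malignant"].astype(str) + "_" + df["center"]
    train, test = train_test_split(df, test_size=312, stratify=strata, random_state=0)
    print(len(train), len(test))   # 754 312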

Model evaluation: IOTA project

• Comparison of model performance using different variable subsets (e.g. MODELa: 12 variables, MODELb: 12 variables, MODELaa: 18 variables)
• The choice of variable subset matters more than the model type
• Linear models suffice
