Distance-based analysis of dynamical systems and time series by optimal transport
Muskulus, M.
Citation
Muskulus, M. (2010, February 11). Distance-based analysis of
dynamical systems and time series by optimal transport. Retrieved from
https://hdl.handle.net/1887/14735
Version: Corrected Publisher’s Version
License: Licence agreement concerning inclusion of doctoral thesis in the Institutional Repository of the University of Leiden
Downloaded from: https://hdl.handle.net/1887/14735
Note: To cite this publication please use the final published version (if
applicable).
Distance-based analysis of dynamical systems and time series by optimal transport
DOCTORAL THESIS
to obtain the degree of Doctor at Leiden University, on the authority of the Rector Magnificus prof. mr. P.F. van der Heijden,
by decision of the Doctorate Board, to be defended on Thursday 11 February 2010
at 11:15
by
Michael Muskulus
born in Sorengo, Switzerland, in 1974
Doctoral committee
Supervisor:
prof. dr. S.M. Verduyn Lunel
Other members:
dr. S.C. Hille
prof. dr. J.J. Meulman
prof. dr. P.J. Sterk (Academisch Medisch Centrum, Universiteit van Amsterdam)
prof. dr. P. Stevenhagen
prof. dr. S.J. van Strien (University of Warwick)
THOMAS STIELTJES INSTITUTE FOR MATHEMATICS
Muskulus, Michael, 1974–
Distance-based analysis of dynamical systems and time series by optimal transport
AMS 2000 Subj. class. code: 37M25, 37M10, 92C50, 92C55, 62H30
NUR: 919
ISBN: 978-90-5335-254-0
Printed by Ridderprint Offsetdrukkerij B.V., Ridderkerk, The Netherlands
Cover: Michael Muskulus
This work was partially supported by the Netherlands Organization for Scientific Research (NWO) under grant nr. 635.100.006.
Copyright © 2010 by Michael Muskulus, except the following chapters:
Chapter 8 J. Neurosci. Meth. 183 (2009), 31–41: Copyright © 2009 by Elsevier B.V.
DOI: 10.1016/j.jneumeth.2009.06.035
Adapted and reprinted with permission of Elsevier B.V.
No part of this thesis may be reproduced in any form without the express written consent of the copyright holders.
After I got my PhD, my mother took great relish in introducing me as, “This is my son. He’s a doctor, but not the kind that helps people”.
Randy Pausch
For Frank & Ingrid
And to the most beautiful neuroscientist in the world
Sanne, thank you for our adventures in the past, in the present, and in the future
Contents
Prologue
1 General Introduction
1.1 Distance-based analysis
1.2 Reader’s guide
1.3 Major results & discoveries
2 Dynamical systems and time series
2.1 Introduction
2.2 Wasserstein distances
2.3 Implementation
2.3.1 Calculation of Wasserstein distances
2.3.2 Bootstrapping and binning
2.3.3 Incomplete distance information
2.3.4 Violations of distance properties
2.4 Analysis
2.4.1 Distance matrices
2.4.2 Reconstruction by multidimensional scaling
2.4.3 Classification and discriminant analysis
2.4.4 Cross-validation
2.4.5 Statistical significance by permutation tests
2.5 Example: The Hénon system
2.5.1 Sample size and self-distances
2.5.2 Influence of noise
2.5.3 Visualizing parameter changes
2.5.4 Coupling and synchronization
2.5.5 Summary
2.6 Example: Lung diseases
2.6.1 Background
2.6.2 Discrimination by Wasserstein distances
2.7 Generalized Wasserstein distances
2.7.1 Translation invariance
2.7.2 Rigid motions
2.7.3 Dilations and similarity transformations
2.7.4 Weighted coordinates
2.7.5 Residuals of Wasserstein distances
2.7.6 Optimization of generalized cost
2.7.7 Example: The Hénon system
2.8 Nonmetric multidimensional scaling
2.9 Conclusions
Applications
3 Lung diseases
3.1 Respiration
3.2 The forced oscillation technique
3.3 Asthma and COPD
3.3.1 Materials: FOT time series
3.3.2 Artifact removal
3.4 Fluctuation analysis
3.4.1 Power-law analysis
3.4.2 Detrended fluctuation analysis
3.5 Nonlinear analysis
3.5.1 Optimal embedding parameters
3.5.2 Entropy
3.6 Results
3.6.1 Statistical analysis
3.6.2 Variability and fluctuation analysis
3.6.3 Distance-based analysis
3.6.4 Nonlinear analysis
3.6.5 Entropy analysis
3.7 Discussion
3.7.1 Main findings
3.7.2 Clinical implications
3.7.3 Further directions
3.7.4 Conclusion
4 Structural brain diseases
4.1 Quantitative MRI
4.2 Distributional analysis
4.3 Systemic lupus erythematosus
4.3.1 Materials
4.3.2 Histogram analysis
4.3.3 Multivariate discriminant analysis
4.3.4 Fitting stable distributions
4.3.5 Distance-based analysis
4.3.6 Discussion
4.3.7 Tables: Classification accuracies
4.4 Alzheimer’s disease
4.4.1 Materials
4.4.2 Results
5 Deformation morphometry
5.1 Overview
5.2 Introduction
5.3 The Moore-Rayleigh test
5.3.1 The one-dimensional case
5.3.2 The three-dimensional case
5.3.3 Power estimates
5.4 The two-sample test
5.4.1 Testing for symmetry
5.4.2 Further issues
5.5 Simulation results
5.6 Application: deformation-based morphometry
5.6.1 Synthetic data
5.6.2 Experimental data
5.7 Discussion
6 Electrophysiology of the brain
6.1 Introduction
6.2 Distance properties
6.2.1 Metric properties
6.2.2 Embeddability and MDS
6.2.3 Graph-theoretic analysis
6.3 Connectivity measures
6.3.1 Statistical measures
6.3.2 Spectral measures
6.3.3 Non-linear measures
6.3.4 Wasserstein distances
6.4 Example: MEG data during motor performance
6.5 Example: Auditory stimulus processing
6.6 Conclusion
Epilogue
Appendices
A Distances
A.1 Distance geometry
A.1.1 Distance spaces
A.1.2 Congruence and embeddability
A.2 Multidimensional scaling
A.2.1 Diagnostic measures and distortions
A.2.2 Violations of metric properties and bootstrapping
A.3 Statistical inference
A.3.1 Multiple response permutation testing
A.3.2 Discriminant analysis
A.3.3 Cross-validation and diagnostic measures in classification
A.3.4 Combining classifiers
B Optimal transportation distances
B.1 The setting
B.2 Discrete optimal transportation
B.3 Optimal transportation distances
C The dts software package
C.1 Implementation and installation
C.2 Reference
cmdscale.add
ldadist.cv
mfdfa
mle.pl
powerlaw
samp.en
td
stress
td.interp
ts.delay
D The MooreRayleigh software package
D.1 Implementation and installation
D.2 Reference
bisect
diks.test
F3
lrw
mr
mr3
mr3.test
pairing
rsphere
Notes
Bibliography
Samenvatting
Curriculum vitae
List of boxes
1 Additional typographic elements
2 Wasserstein distances of dynamical systems
3 Main questions about lung diseases
4 Real-time tracking of single-frequency forced oscillation signals
5 Power-law analysis of forced oscillation signals
6 Detrended fluctuation analysis of forced oscillation signals
7 Embedding parameters in reconstructing impedance dynamics
8 Sample entropy of forced oscillations
9 Analysis of systemic lupus erythematosus
10 Analysis of Alzheimer’s Disease
11 Why reconstruct distances in Euclidean space?
12 How to publish distance matrices
13 Why use the homoscedastic normal-based allocation rule?