UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)
UvA-DARE (Digital Academic Repository)
Measuring and predicting anonymity
Koot, M.R.
Publication date
2012
Citation for published version (APA):
Koot, M. R. (2012). Measuring and predicting anonymity.
List of Figures
1.1 Privacy in ‘functions’ and ‘states’, according to Westin [87].
1.2 Taxonomy of privacy violations according to Solove [74].
2.1 Degrees of anonymity according to Reiter and Rubin [67]
2.2 Linking to re-identify data [76]
3.1 Box-and-whisker plot showing anonymity set sizes k_A, per municipality. Whiskers denote the minimum and maximum values; the boxes are defined by lower and upper quartiles and the median value is shown.
3.2 Box-and-whisker plot showing anonymity set sizes k_B, per municipality. Whiskers denote min-max values.
4.1 For all Dutch municipalities: the Kullback-Leibler distance and the estimated uniqueness probability, when revealing age.
4.2 For all Dutch municipalities: the Kullback-Leibler distance and the estimated uniqueness probability, when revealing age and gender.
4.3 For two Dutch municipalities: the uniqueness probability as a function of the group size k; the curve under uniformity is also shown.
4.4 For all Dutch municipalities: the effect of aggregated (age) statistics on the KL-distance.
5.1 Mean number of singletons, as a function of the Kullback-Leibler distance. Left panels: full population; right panels: ages 0–79 only. Top to bottom: k = 60, 90, 120.
5.2 Variance of the number of singletons, as a function of the Kullback-Leibler distance. Left panel: full population; right panel: ages 0–79 only.
6.1 1, 2, and 3 for two municipalities, as a function of the population size of the postal code area.
6.2 1 for all municipalities, as a function of the Kullback-Leibler distance, for k = 20, 40, 60, 80. Notice that the observations are accurately predicted by the Kullback-Leibler distance for various population sizes (k).
6.3 Graphical illustration of the accuracy of the O( )-approximation; ES as a function of k for height, weight and birthday. The lines correspond to the simulation-based estimates, and the ‘+’ markers to the O( )-approximation. Tables show the mean number of singletons for various values of k.
6.4 Expected number of singletons, for k = 5, 10, 20, 40, respectively (k = 30 is skipped due to page layout). The solid lines are the simulation-based estimates; the dots are the approximations based on the formulas derived in this section. In each panel, successive groups of 6 data points correspond to H = 0.5, 1.0, 2.0, 5.0, 10.0, and 20.0 cm, in that order. Within each group, the 6 data points correspond to W = 0.5, 1.0, 2.0, 5.0, 10, 20 kg.
6.5 Left panel: effect of W for H fixed; right panel: effect of H for W fixed.
7.1 Preliminary model for applying distribution-informed privacy predictions as part of privacy policy making.
B.1 Revealing demographics: questionnaire screen 1.
B.2 Revealing demographics: questionnaire screen 2.
B.3 Revealing demographics: questionnaire screen 3.
B.4 Revealing demographics: questionnaire screen 4.