In the era of ubiquitous computing, the collection of users' geographical locations is increasingly widespread. This is an enabling technology, capable of creating new types of services, but it also represents a new digital asset that needs to be protected in order to safeguard users' privacy. In fact, by exploiting everyday movements, a threat actor can gather sensitive information about victims that can be leveraged afterwards. In this preliminary paper, we reproduced some major results in the field of re-identification of users' trajectories, validating them under scenarios where different countermeasures for geographical data are in place. Specifically, we tested generalization of spatial data using geohashing and K-means clustering. The results were obtained using a dataset that collects users from all over the world, allowing the clustering methods to range over very different scales. Results show that, even when strong data generalization is applied, users' trajectories keep their uniqueness, showing high re-identification ratios. Nevertheless, the usability issues typical of these techniques remain: having only a few tens of points to cover the entire globe cannot be considered a general solution for every possible use case of such data.
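To make the idea of spatial generalization concrete, the following is a minimal Python sketch of geohash-based generalization followed by a simple uniqueness count. It is an illustration under stated assumptions, not the authors' pipeline: the function names `geohash_encode` and `uniqueness_ratio`, the toy trajectories, and the notion of uniqueness used here (a user is unique if no other user visits exactly the same set of cells) are all hypothetical choices, and the paper may use a different metric.

```python
from collections import Counter

# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=5):
    """Encode a (lat, lon) pair into a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits are interleaved, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def uniqueness_ratio(trajectories, precision):
    """Fraction of users whose set of visited geohash cells is shared by no one else.

    `trajectories` maps a user id to a list of (lat, lon) points.
    This is a simplified, illustrative definition of uniqueness.
    """
    signatures = {
        user: frozenset(geohash_encode(lat, lon, precision) for lat, lon in points)
        for user, points in trajectories.items()
    }
    counts = Counter(signatures.values())
    unique_users = sum(1 for sig in signatures.values() if counts[sig] == 1)
    return unique_users / len(signatures)


# Hypothetical toy data: two users with nearby points, one user far away.
toy = {
    "alice": [(44.6471, 10.9252), (44.4949, 11.3426)],
    "bob": [(44.6480, 10.9260), (44.4950, 11.3430)],
    "carol": [(48.8566, 2.3522)],
}

for precision in (1, 3, 5, 8):
    print(precision, uniqueness_ratio(toy, precision))
```

Shorter geohashes correspond to coarser cells, so lowering `precision` is the generalization knob in this sketch. A K-means variant would replace `geohash_encode` with the index of the nearest cluster centroid.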
Effects of Geohashing and K-Means Clustering on Uniqueness in a Mobility Dataset / Artioli, A.; Bedogni, L.; Andreolini, M. - (2024), pp. 389-394. (Paper presented at the 9th Annual IEEE/ACM Symposium on Edge Computing, SEC 2024, held in Italy in 2024) [10.1109/SEC62691.2024.00042].
Effects of Geohashing and K-Means Clustering on Uniqueness in a Mobility Dataset
Artioli A.;Bedogni L.;Andreolini M.
2024
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.