Size and complexity of modern data centers pose scalability issues for the resource monitoring system supporting management operations, such as server consolidation. When we pass from cloud to multi-cloud systems, scalability issues are exacerbated by the need to manage geographically distributed data centers and exchange monitored data across them. While existing solutions typically consider every Virtual Machine (VM) as a black box with independent characteristics, we claim that scalability issues in multi-cloud systems could be addressed by clustering together VMs that show similar behaviors in terms of resource usage. In this paper, we propose an automated methodology to cluster VMs starting from the usage of multiple resources, assuming no knowledge of the services executed on them. This innovative methodology exploits the Bhattacharyya distance to measure the similarity of the probability distributions of VM resources usage, and automatically selects the most relevant resources to consider for the clustering process. The methodology is evaluated through a set of experiments with data from a cloud provider. We show that our proposal achieves high and stable performance in terms of automatic VM clustering. Moreover, we estimate the reduction in the amount of data collected to support system management in the considered scenario, thus showing how the proposed methodology may reduce the monitoring requirements in multi-cloud systems.
Automatic virtual machine clustering based on bhattacharyya distance for multi-cloud systems / Canali, Claudia; Lancellotti, Riccardo. - ELETTRONICO. - (2013), pp. 45-52. (Intervento presentato al convegno 2013 International Workshop on Multi-Cloud Applications and Federated Clouds, MultiCloud 2013 tenutosi a Prague, cze nel April 2013) [10.1145/2462326.2462337].