Cloud computing has recently emerged as a leading paradigm that allows customers to run their applications in virtualized large-scale data centers. Existing solutions for monitoring and managing these infrastructures treat virtual machines (VMs) as independent entities with their own characteristics. However, these approaches suffer from scalability issues due to the increasing number of VMs in modern cloud data centers. We claim that scalability issues can be addressed by leveraging the similarity in VM behavior in terms of resource usage patterns. In this paper, we propose an automated methodology to cluster VMs starting from the usage of multiple resources, assuming no knowledge of the services executed on them. The innovative contribution of the proposed methodology is the use of the statistical technique known as principal component analysis (PCA) to automatically select the most relevant information for clustering similar VMs. We apply the methodology to two case studies: a virtualized testbed and a real enterprise data center. In both case studies, the automatic data selection based on PCA yields high clustering accuracy, with between 80% and 100% of VMs correctly clustered even for short time series (1 day) of monitored data. Furthermore, we estimate the potential reduction in the amount of collected data to demonstrate how our proposal may address the scalability issues related to monitoring and management in cloud computing data centers.
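The general idea of reducing per-VM resource-usage data with PCA before clustering can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic usage profiles, the choice of per-metric mean and standard deviation as features, the 90% explained-variance threshold, the 5-minute sampling interval (288 samples per day), and the simple k-means step are all assumptions made for illustration, and the names `synthetic_vm_features`, `pca_project`, and `kmeans` are hypothetical.

```python
import numpy as np


def synthetic_vm_features(n_per_class=10, n_samples=288, seed=0):
    """Build one feature row per VM from synthetic usage series.

    Assumption: two hypothetical VM classes, four metrics (CPU, memory,
    disk, network), one day sampled every 5 minutes (288 samples).
    """
    rng = np.random.default_rng(seed)
    profiles = [
        (np.array([70.0, 30.0, 10.0, 50.0]), np.array([10.0, 2.0, 1.0, 8.0])),
        (np.array([20.0, 60.0, 80.0, 10.0]), np.array([3.0, 5.0, 12.0, 2.0])),
    ]
    rows = []
    for mean, std in profiles:
        for _ in range(n_per_class):
            series = rng.normal(mean, std, size=(n_samples, 4))
            # Assumed feature vector: per-metric mean and standard deviation.
            rows.append(np.concatenate([series.mean(axis=0), series.std(axis=0)]))
    return np.array(rows)


def pca_project(X, var_threshold=0.9):
    """Standardize X and keep the leading components reaching var_threshold."""
    Xc = X - X.mean(axis=0)
    scale = Xc.std(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant features
    Xc = Xc / scale
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    k = min(int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1, len(S))
    return Xc @ Vt[:k].T, k


def kmeans(points, n_clusters, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


features = synthetic_vm_features()
projected, n_components = pca_project(features, var_threshold=0.9)
labels = kmeans(projected, n_clusters=2)
print("components kept:", n_components)
print("cluster labels:", labels.tolist())
```

In this sketch the clustering works on the reduced representation rather than on the raw series, which is where any saving in collected or processed data would come from; how the paper itself builds the feature vectors and selects components is not stated in the abstract.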

Improving scalability of cloud monitoring through PCA-based Clustering of Virtual Machines / Canali, Claudia; Lancellotti, Riccardo. - In: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY. - ISSN 1000-9000. - PRINT. - 29:1(2014), pp. 38-52. [10.1007/s11390-013-1410-9]

Improving scalability of cloud monitoring through PCA-based Clustering of Virtual Machines

CANALI, Claudia;LANCELLOTTI, Riccardo
2014


Use this identifier to cite or link to this document: https://hdl.handle.net/11380/993114