An Integrated Smart City Platform

Smart Cities aim to create a higher quality of life for their citizens, improve business services and promote the tourism experience. Fostering smart city innovation at local and regional level requires a set of mature technologies to discover, integrate and harmonize multiple data sources and to expose effective applications to end-users (citizens, administrators, tourists...). In this context, Semantic Web technologies and Linked Open Data principles provide a means for sharing knowledge about cities as physical, economical, social, and technical systems, enabling the development of smart city services. Despite the tremendous effort these communities have made so far, there is a lack of comprehensive and effective platforms that handle the entire process of identification, ingestion, consumption and publication of data for Smart Cities.


Introduction
"A Smart City is an Information and Communication Technologies (ICT) enabled development which extensively uses information as a way to improve quality of life for its citizens and population at large" [1]. In this context, sensors, social media, web activities, tracking devices, etc. generate various and large amount of real-time data that Smart Cities need to deal with. This paper proposes a platform for the European national administrations, citizens and companies for implementing data integration processes that short the time and costs in enabling the data gathering for smart city applications and services. Starting from local services available to citizens such as public transport information, commercial data, etc., it will build a unified dataset by using consolidated technologies and tools. This dataset that can be used and queried by several applications for exploring the data, searching for information, extracting statistical indicators or publishing open data. The proposed platform is based on four open-source mature tools that covers the entire process of the data value chain (see Figure 1). These tools have been used for several years in specific areas such as data integration (MOMIS [2,3]), data aggregation and reconciliation towards a Smart City Ontology (ETL tools and KM4City [4]), integration of environmental sensor data (SOS-SM [5][6][7]) with respect to the Semantic Sensor Network W3C ontology (SSN ontology) and population and update of ontologies (Infoboxer [8]). This paper is devoted to describe how the proposed platform works, and how it can be used. The rest of the paper is structured as follows. In the next Section, related work is analyzed. After that, in Section 3, the proposed data value chain is depicted. Moreover, some of the web applications that interact with the unified RDF dataset are briefly described. Finally, in Section 4 conclusions and future work are depicted.

Related Work
The amount of data generated and collected by public administrations (city councils, regional administrations, etc.) has been exponentially increasing since the beginning of the digital era. Besides, since all nations of the Organization for Economic Co-operation and Development (OECD) signed a declaration establishing that all publicly-funded data should be made publicly available 5 , Open Data Initiatives have emerged to ease the development of methodologies, technologies and standards in order to publish these data. One of the most successful Open Data Initiatives has been the Linked Open Data Initiative 6 , which is focused on improving the access and integration of public data coming from different sources by using machine readable datasets.
In this context, numerous projects and initiatives have arisen to exploit these data in order to improve the management of cities and the quality of life of their citizens: CitySDK 7 , [9], [10], etc. These projects focus on specific domains (e.g. energy, pollution, transport, tourism) or challenges (e.g. how to make predictions on a specific topic, integrating specific datasets, publishing data, ...). However, to the best of our knowledge, there is no general platform that takes the whole data value chain into account to offer smart city services while considering all the affected agents (administrations, citizens, companies, etc.). In this paper, we therefore propose such a platform by integrating previous consolidated solutions focused on specific problems (data integration, ontology population, etc.).

The Data Value Chain
The steps that guide users from the discovery of raw data and its ingestion to the analysis and exploitation of results constitute the data value chain, since "in a Data Value Chain, information flow is described as a series of steps needed to generate value and useful insights from data" [11]. The proposed platform covers the entire process of the data value chain, which is composed of four phases (see Figure 1).
Phase I (Data Sources Search) starts with the definition of a specific goal. Then, a deep and wide search of the relevant data sources (local, regional, national and international sources) is performed. Both public and private data sources are analyzed and selected w.r.t. the domain of the smart city project [12,13]. The platform allows an iterative process, so new data sources can be incrementally considered to enrich the project.
Phase II (Data Integration) is devoted to the integration and mapping of the selected data sources w.r.t. the Knowledge Model for City (KM4City) and the Semantic Sensor Network Ontology (SSN) 8 . Both KM4City and SSN are domain-specific ontologies: KM4City focuses on providing a vocabulary to improve the effectiveness of smart city applications, whereas SSN is an ontology developed by the W3C Semantic Sensor Networks Incubator Group (SSN-XG) to describe sensors and their observations. Integration and mapping are performed with different tools depending on the type of the selected data sources. Thus, sensor data are processed by Sensor Observation Service Semantic Mediation (SOS-SM); real-time data are filtered and transformed by means of Extraction Transformation and Load (ETL) tools (such as Pentaho 9 ); and static heterogeneous data sources are integrated by means of MOMIS 10 , which can aggregate data coming from both structured and semi-structured data sources in a semiautomatic way to bring out new information from apparently unrelated existing data. The resulting integrated view is transformed into RDF triples. Moreover, the obtained RDF Knowledge Base can be extended or updated by using Infoboxer 11 , a tool oriented to non-technical users that helps to link the values they introduce to existing entities in the data source and enforces semantic constraints on them.
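As an illustration of the last step of this phase, the following minimal sketch shows how one record of an integrated view could be serialized as RDF triples in N-Triples syntax. It is not the MOMIS or KM4City implementation: the namespaces `BASE` and `VOCAB`, the function names and the sample record are hypothetical, and every field is emitted as a plain string literal for simplicity.

```python
from typing import Dict, List, Tuple

# Hypothetical namespaces, used for illustration only (not the KM4City ones).
BASE = "http://example.org/resource/"
VOCAB = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def escape_literal(value: str) -> str:
    """Escape the characters that cannot appear raw in an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_triples(record_id: str, class_name: str,
                      fields: Dict[str, str]) -> List[Triple]:
    """Map one integrated record to a type triple plus one triple per field."""
    subject = f"<{BASE}{record_id}>"
    triples = [(subject, f"<{RDF_TYPE}>", f"<{VOCAB}{class_name}>")]
    # Sort the fields so that the output order is deterministic.
    for name, value in sorted(fields.items()):
        literal = f'"{escape_literal(str(value))}"'
        triples.append((subject, f"<{VOCAB}{name}>", literal))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples: one 'subject predicate object .' per line."""
    return "".join(f"{s} {p} {o} .\n" for s, p, o in triples)
```

For example, `to_ntriples(record_to_triples("restaurant/42", "Restaurant", {"city": "Modena"}))` yields two lines, one typing the resource and one carrying the city name. A real deployment would map fields to ontology properties and typed values rather than to ad hoc predicates and string literals.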
Ingesting data from different kinds of public and private sources necessarily requires dealing with aspects such as the variability, complexity, variety, geo-spatial nature, integration and size of these data sources. Data ingestion and aggregation processes must therefore address the "Big Data" issues described in [4,14,15,2]. This problem can be partially solved by using specific reconciliation processes that make these data interoperable with other ingested and harvested data. The velocity of data is related to the frequency of data updates, and it allows static data to be distinguished from dynamic data.
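The routing of sources to tools described for Phase II, combined with the static/dynamic distinction above, can be sketched as follows. This is only an illustration under stated assumptions: the `DataSource` type, the `select_tool` function and, in particular, the one-day `STATIC_THRESHOLD` are hypothetical; the paper does not prescribe a specific cut-off between static and dynamic data.

```python
from dataclasses import dataclass
from datetime import timedelta

# Assumed cut-off: a source updated less often than this is treated as static.
STATIC_THRESHOLD = timedelta(days=1)


@dataclass
class DataSource:
    name: str
    update_interval: timedelta  # typical time between two updates
    is_sensor: bool = False     # True for sensor observation services


def select_tool(source: DataSource) -> str:
    """Pick the integration tool for a source, following the Phase II routing."""
    if source.is_sensor:
        return "SOS-SM"   # sensor data
    if source.update_interval < STATIC_THRESHOLD:
        return "ETL"      # dynamic (real-time) data
    return "MOMIS"        # static heterogeneous data
```

With these assumptions, a bus-position feed refreshed every 30 seconds would be routed to the ETL tools, while a yearly census table would be routed to MOMIS.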
Phase III (Open Data Publication) makes the resulting value-added information public and searchable on the Web as Linked Open Data. The owner of the data can select or filter a portion of the RDF Knowledge Base and publish it as Open Data or Linked Open Data. In particular, the goal of this phase is to enable users to publish one or more datasets according to the 5-star deployment scheme for Linked Open Data proposed by Berners-Lee [16]. Linking the dataset to external sources is therefore required in order to obtain a proper 5-star dataset.
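The cumulative nature of the 5-star scheme can be made explicit with a small sketch. The `DatasetDescriptor` type and the `star_rating` function are hypothetical helpers, not part of the platform; the five criteria follow the usual reading of the scheme, in which each star presupposes all the previous ones.

```python
from dataclasses import dataclass


@dataclass
class DatasetDescriptor:
    open_license: bool      # 1 star: on the Web under an open licence
    machine_readable: bool  # 2 stars: structured, machine-readable data
    open_format: bool       # 3 stars: non-proprietary format (e.g. CSV)
    uses_uris_rdf: bool     # 4 stars: URIs and W3C standards such as RDF
    links_external: bool    # 5 stars: linked to other datasets


def star_rating(dataset: DatasetDescriptor) -> int:
    """Count consecutive satisfied criteria, stopping at the first failure."""
    criteria = [
        dataset.open_license,
        dataset.machine_readable,
        dataset.open_format,
        dataset.uses_uris_rdf,
        dataset.links_external,
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break
        stars += 1
    return stars
```

Under this reading, an RDF dataset that meets the first four criteria but has no links to external sources rates four stars, which is why the linking step cannot be skipped.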
Phase IV (Applications built on the Integrated Data) takes the integrated information as input and provides specialized applications such as tools for geographical querying, for exploring and navigating LOD sources, for analyzing statistical information, etc. In more detail, these applications exploit the RDF Knowledge Base by making queries [17][18][19][20] and offer different services to different types of users (citizens, companies, tourists, staff of public administrations, mobile operators, etc.). Examples include: searching for services around a given GPS point (e.g., finding an area where restaurants are available); services that detect and predict critical conditions or discover cause-effect relationships; recommendation services tuned on the basis of statistical data; decision support services; and dashboards for analyzing Key Performance Indicators, such as the MOMIS dashboard. All the services can communicate with the system, requesting data from or providing data to the servers, through the Smart City API [4].
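The "services around a GPS point" use case can be illustrated with a minimal, self-contained sketch. It does not use the Smart City API or the RDF Knowledge Base: services are assumed to be already available as hypothetical `(name, latitude, longitude)` tuples, and distances are computed with the haversine formula on a spherical Earth.

```python
import math
from typing import Iterable, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

Service = Tuple[str, float, float]  # (name, latitude, longitude) in degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def services_within(services: Iterable[Service], lat: float, lon: float,
                    radius_km: float) -> List[Tuple[str, float]]:
    """Return (name, distance_km) for services inside the radius, nearest first."""
    hits = []
    for name, s_lat, s_lon in services:
        distance = haversine_km(lat, lon, s_lat, s_lon)
        if distance <= radius_km:
            hits.append((name, distance))
    return sorted(hits, key=lambda hit: hit[1])
```

In a deployed system the filtering would more likely be pushed down to the triple store or the API rather than performed client-side; the sketch only shows the distance logic such a query relies on.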

Conclusions and Future Work
In this paper, we presented a platform that turns the potential of data for the economy and society into reality, allowing public and private data sources to be exploited in order to improve the management of cities and the quality of life of their citizens. The proposed platform is a suite of four mature open-source solutions focused on: the integration of data sources (MOMIS), the publication of smart data related to cities under a unified view (KM4City), and the population and update of data sources with data provided by users (Infoboxer) and by sensor observation services (SOS-SM). Moreover, other mature tools, such as the MOMIS dashboard, are also considered to enrich the platform.
The different components of this platform have already been successfully tested and deployed in several contexts, but the integrated solution has not been tested yet. As future work, we would therefore like to study how to improve the performance of the integrated solution and how to audit its deployments against standard metrics, such as the ones defined in the norm UNE 178301 "Smart Cities and Open Data".