Using Real Sensors Data to Calibrate a Traffic Model for the City of Modena

In Italy, road vehicles are the preferred means of transport. In recent years, the passenger car fleet has grown in almost all EU Member States. The high number of vehicles complicates urban planning and often results in traffic congestion and in areas of increased air pollution. Efficient traffic control therefore pays off in individual, societal, financial, and environmental terms. Traffic management solutions typically require simulators able to capture in detail the characteristics and dependencies of real-life traffic, so a traffic model can help to discover and control traffic bottlenecks in the urban context. In this paper, we analyze how to better simulate the vehicle flows measured by traffic sensors in the streets. A dynamic traffic model was set up starting from traffic sensor data collected every minute at about 300 locations in the city of Modena. The reliability of the model is discussed and verified by comparing simulated values with the real values from the traffic sensors. This analysis pointed out some critical issues. To better understand the origin of fake jams and of inconsistencies with the real data, we explored different configurations of the model as possible solutions.


Introduction
Italy has the second-highest number of passenger cars per thousand inhabitants in Europe. According to the Eurostat/ITF/UNECE Common Questionnaire on Inland Transport, in 2016 there were 625 cars per thousand inhabitants in Italy.
The aim of the TRAFAIR project (www.trafair.eu) [1] is to implement a flexible solution to monitor and forecast urban air quality in six European cities (Modena, Florence, Pisa, Livorno, Zaragoza and Santiago de Compostela). Real traffic data are needed to evaluate traffic-related emissions and then to estimate how these pollutants move in the air according to wind, weather and building shapes. Therefore, a simulation model has been employed to obtain the vehicle flow where no sensors are located.
Public administrations usually employ static traffic models that provide only an average traffic condition during peak hours in the main streets of the city. This kind of model considers neither the dynamic evolution of traffic over the day nor the seasonal variation over the year.
In general, simulation is a dynamic representation of the real world achieved by building a computer model and moving it through time [2]. Traffic modelling [3] aims to accurately recreate real traffic flow by using data coming from a network of sensors distributed over the area of interest. The cost of building such a distributed system can be burdensome for public administrations. However, many cities already have distributed sensors installed for other purposes.
In the city of Modena, more than 300 induction loop sensors are located near traffic-light controlled junctions. These devices are used locally to control the traffic light logic, but their traffic-related data (for instance, the vehicle counts and the average speed) have never been analyzed before. In the TRAFAIR project, we customize a traffic model for the city of Modena employing SUMO (Simulation of Urban MObility) [4] and OSM (OpenStreetMap), both open source, to ensure a cost-effective solution.
The paper is organized as follows: Sect. 2 briefly describes the traffic model; Sect. 3 evaluates the model performance by comparing real traffic data with the simulated ones; finally, Sect. 4 explores different configurations to address the issues that emerged.

The Model
Our traffic model [5] is a micro-simulation model, obtained by using SUMO, configured to generate the routes of the vehicles starting from traffic sensor data. In a micro-simulation model, vehicles are simulated individually: each vehicle has its own trip to follow and moves inside the road network subject to traffic restrictions. Our model aims to produce data about vehicle counts and average speed in every road portion of Modena starting from the measurements of the traffic sensors.
The sensors placed in Modena count the vehicles passing over them every minute and evaluate their average speed. We collect sensor data in real time in a local PostgreSQL database, and the model interacts with it directly [6]. In our SUMO simulated map, we placed a "calibrator" near each traffic sensor. A calibrator is an object capable of producing the desired traffic flow, i.e. the number of vehicles counted by the sensor associated with that calibrator. Calibrators are part of the SUMO suite and act like virtual traffic sensors calibrated on the real measurements of the on-road sensors. Unlike sensors, which measure the number of vehicles at a single point, calibrators control the flow on a lane of a road portion. For this reason, we have also placed some virtual detectors, SUMO objects that mimic exactly the behaviour of the sensors and return the vehicle count at a precise point of the map. We use a Python script to automatically produce the file containing the positions of the virtual detectors. For each detector, we consider the GPS coordinates of the corresponding traffic sensor and the name of the street in which it is placed. The road name is necessary to avoid errors: sensors are placed near junctions, where roads are close to one another, so considering only the geolocation of a sensor and placing it in the nearest road section does not always yield the right position. Therefore, we also consider the similarity between the names of the roads in the junction and the correct road name to estimate the right position. The values retrieved by these virtual detectors can be compared with the measurements of the real sensors to evaluate the performance of the model.
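The placement rule described above (geographic proximity combined with road-name similarity) could be sketched roughly as follows. This is only an illustration, not the script used in the project: the function names, the candidate tuple layout, the 50 m search radius and the 0.6 similarity threshold are all assumptions, and coordinates are assumed to be already projected to metres.

```python
from difflib import SequenceMatcher
from math import hypot


def name_similarity(a, b):
    """Case-insensitive similarity in [0, 1] between two road names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_edge(sensor_xy, sensor_street, candidates, max_dist=50.0, min_sim=0.6):
    """Choose the road edge on which to place a virtual detector.

    candidates: list of (edge_id, edge_name, (x, y)) tuples (hypothetical layout).
    max_dist and min_sim are assumed values, not taken from the paper.
    Returns the chosen edge_id, or None if no edge is within max_dist.
    """
    sx, sy = sensor_xy
    nearby = []
    for edge_id, edge_name, (x, y) in candidates:
        dist = hypot(x - sx, y - sy)
        if dist <= max_dist:
            nearby.append((dist, edge_id, edge_name))
    if not nearby:
        return None

    # Rank by name similarity first, then by proximity (smaller distance wins).
    best = max(nearby, key=lambda c: (name_similarity(sensor_street, c[2]), -c[0]))
    if name_similarity(sensor_street, best[2]) >= min_sim:
        return best[1]

    # No road name is similar enough: fall back to the nearest edge.
    return min(nearby, key=lambda c: c[0])[1]
```

With two edges at a junction, a sensor labelled "VIALE MENOTTI" is assigned to the edge named "Viale Ciro Menotti" even when another edge is geometrically closer, which is the kind of error the road name is meant to avoid.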

Evaluating the Model
We evaluated the performance of the model using different techniques. At every point where there is a real sensor, the time series of the real flow measurements and the one retrieved by the virtual detector in the simulation are compared using the DTW (Dynamic Time Warping) distance. The DTW distance [7] measures the distance between two time series while allowing the sequences to be stretched along the time axis. Sensors with a DTW distance higher than 1200 have been considered distant and not reliable. The DTW distance is not the only metric used to determine whether a calibrator is following the real measurements. We also calculated the difference between the virtual and real sensor vehicle counts at the same instant and averaged this difference. Finally, we counted the number of instants in which the difference is higher than 2 vehicles per minute. This metric accounts for the case in which the calibrator is unable to follow the real measurements at a few instants, where the distance is high, while at all other instants the calibrator flow is similar to the real one, so that the error is limited to a short period of time. Using these methods, we classify calibrators into two groups: those that manage to produce the desired real flow are referred to as 'aligned', the others as 'not aligned'.
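The checks described above can be illustrated with a minimal sketch. The DTW routine is the textbook dynamic-programming formulation with the absolute difference as local cost; the paper does not state which DTW variant was used. The 1200 and 2 vehicles-per-minute thresholds come from the text, whereas the use of the absolute difference, the `bad_share_limit` parameter and the rule that combines the criteria are assumptions made only for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW distance with |x - y| as local cost (O(len(a) * len(b)))."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def pointwise_metrics(real, simulated, max_diff=2):
    """Mean absolute difference and number of instants with difference > max_diff."""
    diffs = [abs(r - s) for r, s in zip(real, simulated)]
    mean_diff = sum(diffs) / len(diffs)
    n_bad = sum(1 for d in diffs if d > max_diff)
    return mean_diff, n_bad


def is_aligned(real, simulated, dtw_limit=1200, max_diff=2, bad_share_limit=0.1):
    """Assumed rule: aligned if DTW is within the limit and few instants are off.

    bad_share_limit (share of instants allowed to exceed max_diff) is hypothetical.
    """
    _, n_bad = pointwise_metrics(real, simulated, max_diff)
    dtw_ok = dtw_distance(real, simulated) <= dtw_limit
    few_bad = n_bad <= bad_share_limit * len(real)
    return dtw_ok and few_bad
```

Because DTW allows stretching along the time axis, a simulated series that reproduces the real one with a small delay gets a low distance, while the pointwise count still reveals how many minutes were actually off.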
Tests have been performed on seven days in November 2018. Table 1 shows the number of not aligned calibrators for every tested day. The ratio is obtained by dividing the number of not aligned calibrators by the total number of calibrators in the simulation, which is equal to the number of sensors with at least one measurement on that simulated day. More than 20% of calibrators appear to be not aligned.
Fig. 1 shows a graphical comparison between time series to highlight the difference between aligned and not aligned calibrators. We have identified the calibrators that are classified as 'not aligned' on all the tested days: there are 39 of them, belonging to 23 different junctions. We observed that the calibrators of a junction often belong to the same group. Inspecting some not aligned junctions, we discovered that in 6 junctions (where 13 of the 39 calibrators are located) the problem is that the SUMO road network does not match the real one. The geographic data are provided by OSM and include only the total number of lanes in a road, without information about their directions. Thus, the direction that SUMO assigns to the lanes is sometimes wrong. An example is shown in Fig. 2: on the left of the figure, the two sensors at the bottom are located in the SUMO road network on the same lane and in the same direction; in reality, however, the two sensors are on two different lanes, as shown in the aerial view on the right. To overcome this problem, the counts provided by the two sensors could be summed up.
Another reason why some calibrators are not aligned is the creation of fake jams in the simulation: when a jam appears, calibrators are not able to insert new vehicles even if the required flow is higher. We observe that 10 of the 23 junctions with not aligned calibrators are affected by fake jams. A fake jam can be observed in Fig. 1, in the top-left graph. The calibrator initially manages to follow the real vehicle count; then the number of vehicles increases and a jam appears, reducing the flow through it. The duration of the simulation can contribute to producing fake jams. When a calibrator generates a vehicle, the vehicle remains in the simulation until the end of its route, unless it drives over another calibrator that decides to remove it. Reducing the duration of the simulation (splitting it into several shorter sub-simulations) avoids this problem, because restarting the simulation removes unnecessary vehicles. We also performed several simulations excluding the calibrators of a specific junction to observe whether their absence affects the performance of the others. We observed that this influence is related to the geographical distance and also to the existence of a path connecting the two junctions. In some cases, not aligned calibrators are located in the right place, at a crossroad with the right morphology, and their measurements are acceptable. Their misalignment is then attributable to the sensors located near them: 3 of the not aligned junctions have sensors in the neighbourhood that count zero vehicles for the whole simulated day. The measurements of these sensors must be excluded from the input data; otherwise, the lane in which they are located is forced to have no flow for the whole simulation.
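The exclusion of sensors that report zero vehicles all day amounts to a simple filter on the input data. The sketch below assumes, purely for illustration, that the per-minute counts of one day are held in a dictionary keyed by a sensor identifier; the function name and data layout are hypothetical.

```python
def drop_silent_sensors(counts_by_sensor):
    """Keep only sensors that counted at least one vehicle during the day.

    counts_by_sensor: dict mapping a sensor id to its list of per-minute counts
    (hypothetical layout). Sensors with no measurements or only zeros are dropped.
    """
    return {
        sensor_id: counts
        for sensor_id, counts in counts_by_sensor.items()
        if any(c > 0 for c in counts)
    }
```

Applying such a filter before generating the calibrator input prevents a silent sensor from forcing zero flow on its lane.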

Trying Different Configurations to Find a Solution
We tried different configurations of the traffic model to overcome the issues that emerged in Sect. 3. First, from the simulation input of Thursday 8th November we removed 29 calibrators that were unable to follow the real measurements or that measured zero vehicles all day. Comparing the lists of not aligned calibrators in the regular simulation and in the simulation without the 29 excluded calibrators, we observed that the performance of 11 calibrators improved and that of 12 calibrators got worse.
Interestingly, comparing the real measurements with the simulated counts, in 20 of the 29 positions where calibrators were removed the time series of the virtual detectors followed the real measurements better. This means that the model can infer the vehicle counts at those positions even without the calibrators. However, this solution was not good enough, since 58 calibrators still appear not aligned.
For this reason, we tried another solution. We split the simulation of Monday 19th into sub-simulations with a duration of 3 h each. This interval was chosen because routes longer than 3 h are unlikely in an urban context; at the same time, the interval is long enough for calibrators to influence each other, but not long enough for this influence to produce fake jams in the network. A simulation of 24 h composed of eight sub-simulations of 3 h was performed.
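As a rough sketch of this setup, the 24 h day can be cut into consecutive 3 h windows, each run as an independent simulation that starts from an empty network. The `--begin` and `--end` options (in seconds) and `-c` are standard SUMO command-line options, but the paper does not say how the sub-simulations were actually launched, so the command layout and the configuration file name below are assumptions.

```python
def split_windows(total_s=24 * 3600, window_s=3 * 3600):
    """Return consecutive (begin, end) windows in seconds covering [0, total_s)."""
    return [(b, min(b + window_s, total_s)) for b in range(0, total_s, window_s)]


def sumo_commands(config_path, windows):
    """Build one SUMO command line per window (config_path is hypothetical)."""
    return [
        ["sumo", "-c", config_path, "--begin", str(b), "--end", str(e)]
        for b, e in windows
    ]


if __name__ == "__main__":
    # Print the eight commands for a 24 h day split into 3 h sub-simulations.
    for cmd in sumo_commands("modena.sumocfg", split_windows()):
        print(" ".join(cmd))
```

Each command could then be executed in turn, for example with `subprocess.run(cmd, check=True)`, and the detector outputs of the eight runs concatenated before the comparison with the real measurements.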
The time series obtained from the sub-simulations were compared with the real measurements as described in Sect. 3. The share of not aligned calibrators drops sharply to about 2% (5/241), and all five belong to the 39 calibrators that do not follow the real measurements in any of the previous simulations. Fig. 1 gives an example of how the sub-simulation approach removes fake jams and improves the performance of two calibrators.

Conclusion and Future Work
Splitting the simulation into sub-simulations proved to be a good way to ensure that the simulation follows the real measurements. Excluding sensors that always count zero vehicles is necessary to avoid errors caused by unreliable input data. As a result, we produce a simulation capable of following the number of vehicles circulating at almost every point where there is a sensor, and of inferring vehicle counts where there are many sensors in the neighbourhood. To enhance the realism of the simulation, a useful improvement would be to include information about average traffic flows, such as Origin-Destination matrices, to produce routes on roadways where no sensors are placed.