
Video Frame Synthesis combining Conventional and Event Cameras / Pini, Stefano; Borghi, Guido; Vezzani, Roberto. - In: INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE. - ISSN 0218-0014. - 35:16(2021), pp. 1-25. [10.1142/S0218001421600132]

Video Frame Synthesis combining Conventional and Event Cameras

Stefano Pini; Guido Borghi; Roberto Vezzani
2021

Abstract

Event cameras are biologically-inspired sensors that capture the temporal evolution of a scene. They sense pixel-wise brightness variations and output a corresponding stream of asynchronous events. Despite having several advantages over conventional cameras, their use is limited by the poor compatibility of asynchronous event streams with traditional data processing and vision algorithms. To address this issue, we present a framework that synthesizes RGB frames from the output stream of an event camera and an initial or periodic set of color key-frames. The deep learning-based frame synthesis framework consists of an adversarial image-to-image architecture and a recurrent module. Two public event-based datasets, DDD17 and MVSEC, are used to obtain qualitative and quantitative per-pixel and perceptual results. In addition, we converted two additional well-known datasets, namely KITTI and Cityscapes, into event frames in order to present semantic results in terms of object detection and semantic segmentation accuracy. An extensive experimental evaluation confirms the quality of the proposed approach and its capability to synthesize frame sequences from color key-frames and sequences of intermediate events.
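The abstract mentions converting conventional datasets (KITTI, Cityscapes) into event frames, i.e. aggregating the asynchronous event stream into image-like tensors that standard vision models can consume. The sketch below illustrates one common accumulation scheme; the exact representation, resolution, and normalization used in the paper are not specified here, so the function name, the (x, y, polarity) event layout, and the normalization step are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate a chunk of asynchronous events into a 2D event frame.

    `events` is assumed to be an iterable of (x, y, polarity) tuples with
    polarity in {-1, +1}; this is a generic encoding, not necessarily the
    one adopted in the paper.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, polarity in events:
        # Positive and negative brightness changes add to or subtract from the pixel.
        frame[y, x] += polarity
    # Normalize to [0, 1] so the frame can be used as a network input channel.
    if frame.max() > frame.min():
        frame = (frame - frame.min()) / (frame.max() - frame.min())
    return frame
```

In such a scheme, the event frame built over a short time window is typically concatenated with (or conditioned on) the most recent color key-frame before being fed to an image-to-image network.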
Year: 2021
Volume: 35
Issue: 16
Pages: 1-25
Authors: Pini, Stefano; Borghi, Guido; Vezzani, Roberto

Use this identifier to cite or link to this item: https://hdl.handle.net/11380/1244774
Citations: Scopus: 1 · Web of Science (ISI): 0 · PubMed Central: N/A