Agenti Autonomi Incorporati: Quando la Robotica incontra il Ragionamento con Deep Learning / Roberto Bigazzi, 8 March 2023. 35th cycle, academic year 2021/2022.
Agenti Autonomi Incorporati: Quando la Robotica incontra il Ragionamento con Deep Learning (Autonomous Embodied Agents: When Robotics Meets Deep Learning Reasoning)
BIGAZZI, ROBERTO
2023
Abstract
The increase in available computing power and the Deep Learning revolution have opened up new topics and frontiers in Artificial Intelligence research. Embodied Artificial Intelligence, a new field at the intersection of Computer Vision, Robotics, and Decision Making, has gained importance in recent years, as it aims to foster the development of smart autonomous robots and their deployment in society. The recent availability of large collections of 3D models for photorealistic robotic simulation has enabled faster and safer training of learning-based agents over millions of frames, together with a careful evaluation of their behavior before the models are deployed on real robotic platforms. These intelligent agents are intended to perform a given task in a possibly unknown environment. To this end, during training in simulation, the agents learn to interact continuously with their surroundings: gathering information from the environment, encoding and extracting useful cues for the task, and performing actions towards the final goal, where every action of the agent influences these interactions. This dissertation follows the complete creation process of embodied agents for indoor environments, from their conception to their implementation and deployment. In the first part of this work, we study the importance of building efficient representations of the agent's knowledge that support its understanding of the world and its ability to learn the task at hand. We devise and examine two alternative approaches to implicitly encode and maximize the information collected without the need for annotated data, which are usually costly and difficult to produce. The first method, called Impact, rewards actions that produce a significant change in the agent's knowledge or internal representation of the environment. The second approach, called Curiosity, encourages the agent, as human curiosity does, to explore states of the environment where it can see or learn new things. The investigation of implicit representations for embodied agents is followed by a study of agents' behavior on various robotic tasks, in both simulated and real settings. Next, we investigate the last step for a successful implementation of an autonomous agent: the deployment of the trained models on a real robot. We study how to transfer the knowledge acquired in simulation to the real world, accounting for the architectural discrepancies between the two worlds in order to minimize the performance degradation caused by simulation-to-reality transfer. The final part of this work presents the acquisition and public release of a photorealistic 3D model of an art gallery, accompanied by a dataset for navigation. This contribution adds to the datasets available in the literature and enables simulated robot navigation inside museums. With this thesis, we aim to contribute to research in Embodied AI and autonomous agents and to foster future work in this field. We present a detailed analysis of the procedure behind implementing an intelligent embodied agent, comprising a thorough description of the current state of the art in the literature, technical explanations of the proposed methods, and accurate experimental studies on relevant robotic tasks.
File | Size | Format | |
---|---|---|---|
PhD_Tesi_Finale.pdf (open access; description: final thesis, Roberto Bigazzi; type: doctoral thesis) | 11.66 MB | Adobe PDF | |
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.