Memory-Latency-Accuracy Trade-Offs for Continual Learning on a RISC-V Extreme-Edge Node

Ravaglia, L.; Rusci, M.; Capotondi, A.; Conti, F.; Pellegrini, L.; Lomonaco, V.; Maltoni, D.; Benini, L.

doi:10.1109/SiPS50750.2020.9195220

AI-powered edge devices currently lack the ability to adapt their embedded inference models to the ever-changing environment. To tackle this issue, Continual Learning (CL) strategies aim at incrementally improving the decision capabilities based on newly acquired data. In this work, after quantifying the memory and computational requirements of CL algorithms, we define a novel HW/SW extreme-edge platform featuring a low-power RISC-V octa-core cluster tailored for on-demand incremental learning over locally sensed data. The presented multi-core HW/SW architecture achieves a peak performance of 2.21 and 1.70 MAC/cycle when running the forward and backward steps of gradient descent, respectively. We report the trade-off between memory footprint, latency, and accuracy for learning a new class with Latent Replay CL when targeting an image classification task on the CORe50 dataset. Compared with a CL setting that retrains all the layers, which takes 5 h to learn a new class and achieves a precision of up to 77.3%, a more efficient solution retrains only part of the network, reaching an accuracy of 72.5% with a memory requirement of 300 MB and a computation latency of 1.5 hours. At the other extreme, retraining only the last layer is the fastest (867 ms) and least memory-hungry (20 MB) solution, but it scores only 58% on the CORe50 dataset. Thanks to the parallelism of the low-power cluster engine, our HW/SW platform is 25× faster than a typical MCU device, on which CL is still impractical, and demonstrates an 11× gain in energy consumption with respect to mobile-class solutions.
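The three operating points quoted in the abstract can be compared with a short calculation. The following is a minimal sketch, not taken from the paper: the dictionary layout and the `compare` helper are hypothetical, the 77.3% figure (which the abstract calls precision) is treated as directly comparable to the accuracy figures, and the memory footprint of the all-layers setting is left as `None` because the abstract does not state it.

```python
# Figures quoted in the abstract. None marks a value the abstract does not state.
# Assumption: the 77.3% "precision" is comparable to the reported accuracies.
configs = [
    {"name": "all layers", "latency_s": 5 * 3600, "memory_mb": None, "accuracy_pct": 77.3},
    {"name": "partial network", "latency_s": 1.5 * 3600, "memory_mb": 300, "accuracy_pct": 72.5},
    {"name": "last layer only", "latency_s": 0.867, "memory_mb": 20, "accuracy_pct": 58.0},
]


def compare(baseline, other):
    """Return (latency speedup, accuracy drop in percentage points) of other vs. baseline."""
    speedup = baseline["latency_s"] / other["latency_s"]
    drop = baseline["accuracy_pct"] - other["accuracy_pct"]
    return speedup, drop


full = configs[0]
for cfg in configs[1:]:
    speedup, drop = compare(full, cfg)
    print(f"{cfg['name']}: {speedup:.1f}x faster than retraining all layers, "
          f"{drop:.1f} points lower accuracy")
```

Under these assumptions, the calculation makes the shape of the trade-off explicit: each step toward retraining fewer layers buys a large latency reduction at the cost of some accuracy.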

Memory-Latency-Accuracy Trade-Offs for Continual Learning on a RISC-V Extreme-Edge Node / Ravaglia, L.; Rusci, M.; Capotondi, A.; Conti, F.; Pellegrini, L.; Lomonaco, V.; Maltoni, D.; Benini, L.. - 2020-:(2020), pp. 53-58. ( 34th IEEE Workshop on Signal Processing Systems, SiPS 2020 prt 2020) [10.1109/SiPS50750.2020.9195220].

Publication year: 2020
Conference title: 34th IEEE Workshop on Signal Processing Systems, SiPS 2020
Conference location: prt
Conference date: 2020
DOI: https://dx.doi.org/10.1109/SiPS50750.2020.9195220
WoS ID: WOS:000783760500010
Scopus ID: 2-s2.0-85096764955
Volume: 2020-
First page: 53
Last page: 58
Type: Conference proceedings paper
