Work-in-Progress: Quantized NNs as the Definitive solution for inference on low-power ARM MCUs?

High energy efficiency and low memory footprint are the key requirements for the deployment of deep learning based analytics on low-power microcontrollers. Here we present work-in-progress results with Q-bit Quantized Neural Networks (QNNs) deployed on a commercial Cortex-M7 class microcontroller by means of an extension to the ARM CMSIS-NN library. We show that i) for Q=4 and Q=2 low memory footprint QNNs can be deployed with an energy overhead of 30% and 36% respectively against the 8-bit CMSIS-NN due to the lack of quantization support in the ISA; ii) for Q=1 native instructions can be used, yielding an energy and latency reduction of ∼3.8× with respect to CMSIS-NN. Our initial results suggest that a small set of QNN-related specialized instructions could improve performance by as much as 7.5× for Q=4, 13.6× for Q=2 and 6.5× for binary NNs.

Work-in-Progress: Quantized NNs as the Definitive solution for inference on low-power ARM MCUs? / Rusci, M.; Capotondi, A.; Conti, F.; Benini, L. - (2018), pp. 1-2. (Paper presented at the 2018 ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2018, held at Torino Incontra Congress Center, Italy, in 2018) [10.1109/CODESISSS.2018.8525915].

Rusci M.; Capotondi A.; Conti F.; Benini L.

2018

Publication year	2018
Conference title	2018 ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2018
Conference venue	Torino Incontra Congress Center, Italy
Conference date	2018
DOI	https://dx.doi.org/10.1109/CODESISSS.2018.8525915
WoS ID	WOS:000698598900010
Scopus ID	2-s2.0-85058211709
First page	1
Last page	2
All authors	Rusci, M.; Capotondi, A.; Conti, F.; Benini, L.
Type	Conference proceedings paper

Files in this item:

File	Size	Format
Capotondi_CODES18.pdf (open access; type: AAM, author's version revised and accepted for publication)	418.99 kB	Adobe PDF

Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1182360
