Exploring large language models (LLMs) through interactive Python activities

Tufino, E.

doi:10.1088/1361-6552/adea28

This paper presents an approach for introducing physics students to the basic concepts of large language models (LLMs) using Python-based activities in Google Colab. The teaching approach builds on active learning and combines theoretical ideas with practical, physics-related examples. Students engage with key technical concepts, such as word embeddings, through hands-on exploration of the Word2Vec neural network and of GPT-2, an LLM that attracted considerable attention in 2019 for its ability to generate coherent and plausible text from simple prompts. Using simplified physics-related scenarios, the activities highlight how words acquire meaning and how LLMs predict subsequent tokens. By focusing on Word2Vec and GPT-2, the exercises illustrate fundamental principles underlying modern LLMs, such as semantic representation and contextual prediction. Through interactive experimentation in Google Colab, students observe how model parameters in GPT-2 (such as temperature) affect output behaviour, explore scaling laws relating data quantity to model performance, and gain practical insights into the predictive capabilities of LLMs. By linking these systems to physics concepts, the approach helps students begin to understand how they work; these are systems that will shape their academic studies, professional careers and roles in society.
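The role of temperature in next-token prediction can be illustrated with a minimal sketch. It is not taken from the paper's notebooks: the prompt, the candidate tokens and their scores (logits) are invented for illustration, and the function name `softmax_with_temperature` is hypothetical. The sketch only assumes the standard temperature-scaled softmax, in which each logit is divided by the temperature before normalisation.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores (logits) into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates for the token following a prompt such as
# "The force is equal to mass times"; the logits are made up, not GPT-2 output.
tokens = ["acceleration", "velocity", "energy", "banana"]
logits = [4.0, 2.5, 1.5, -1.0]

for T in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, T)
    summary = ", ".join(f"{tok}={p:.3f}" for tok, p in zip(tokens, probs))
    print(f"T={T}: {summary}")
```

At low temperature the probability mass concentrates on the highest-scoring token, while at high temperature the distribution flattens and unlikely tokens are sampled more often.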

Exploring large language models (LLMs) through interactive Python activities / Tufino, E. - In: PHYSICS EDUCATION. - ISSN 0031-9120. - 60:5(2025), pp. 1-14. [10.1088/1361-6552/adea28]

Year of publication	2025
Journal	PHYSICS EDUCATION
Volume	60
Issue	5
First page	1
Last page	14
DOI	https://dx.doi.org/10.1088/1361-6552/adea28
Scopus ID	2-s2.0-105011304537
Citation	Exploring large language models (LLMs) through interactive Python activities / Tufino, E. - In: PHYSICS EDUCATION. - ISSN 0031-9120. - 60:5(2025), pp. 1-14. [10.1088/1361-6552/adea28]
All authors	Tufino, E.
Type	Journal article

Files in this item:

File	Size	Format
Tufino_2025_Phys._Educ._60_055003 (5).pdf (Open access; Type: VOR - version published by the publisher; License: [IR] creative-commons)	1.05 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact Iris Support.