YTTREX: crowdsourced analysis of YouTube’s recommender system during COVID-19 pandemic

Sanna, Leonardo; Romano, Salvatore; Corona, Giulia; Agosti, Claudio

Algorithmic personalization is difficult to study because it entails examining many different user experiences, with many variables outside of our control. Two biases are common in experiments: relying on the corporate service's API, and using synthetic profiles with little regard for regional and individualized profiling and personalization. In this work, we present the results of the first crowdsourced data collection of YouTube's recommended videos via YouTube Tracking Exposed (YTTREX). Our tool collects evidence of algorithmic personalization via an HTML parser while anonymizing users. In our experiment we used a BBC video about COVID-19, taking into account 5 regional BBC channels in 5 different languages, and we saved the recommended videos that were shown during each session. Each user watched the first five seconds of the videos, while the extension captured the recommended videos. We took into account the top-20 recommended videos for each completed session, looking for evidence of algorithmic personalization. Our results showed that the vast majority of videos were recommended only once in our experiment. Moreover, we collected evidence that there is a significant difference between the videos we could retrieve using the official API and those we collected with our extension. These findings show that filter bubbles exist and that they need to be investigated with a crowdsourced approach.
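
The two measurements mentioned in the abstract (how many videos were recommended only once, and how far the API results differ from what the extension observed) can be illustrated with a minimal Python sketch. This is not the YTTREX code: the function names, the toy video identifiers, and the use of Jaccard similarity as the comparison measure are all assumptions made for illustration; only the top-20 cutoff comes from the abstract.

```python
from collections import Counter

TOP_N = 20  # the abstract considers the top-20 recommendations per session


def recommendation_counts(sessions, top_n=TOP_N):
    """Count in how many sessions each video appears among the top-N recommendations."""
    counts = Counter()
    for recommended in sessions:
        # set() so that a video is counted at most once per session
        counts.update(set(recommended[:top_n]))
    return counts


def share_recommended_once(counts):
    """Fraction of distinct videos that appear in exactly one session."""
    if not counts:
        return 0.0
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)


def jaccard(a, b):
    """Jaccard similarity of two collections of video IDs (assumed comparison measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy data: each inner list is one session's ranked recommendations.
sessions = [
    ["v1", "v2", "v3"],
    ["v1", "v4", "v5"],
    ["v1", "v2", "v6"],
]
# Hypothetical list of related videos, standing in for an API response.
api_related = ["v1", "v2", "v7"]

counts = recommendation_counts(sessions)
once_share = share_recommended_once(counts)
overlap = jaccard(api_related, counts)

print(f"share of videos recommended once: {once_share:.2f}")
print(f"overlap between API and observed videos: {overlap:.2f}")
```

A high share of videos seen in only one session, together with a low overlap with the API list, is the kind of pattern the abstract reports; the toy numbers here carry no empirical meaning.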

YTTREX: crowdsourced analysis of YouTube’s recommender system during COVID-19 pandemic / Sanna, L., Romano, S., Corona, G., Agosti, C. (2020). SIMBig 2020, 7th International Conference on Information Management and Big Data, online, 1-3 October 2020.