Beyond the Surface: Comprehensive Analysis of Implicit Bias in Vision-Language Models

Giacomo Capitani; Alice Lucarini; Lorenzo Bonicelli; Federico Bolelli; Simone Calderara; Loris Vezzali; Elisa Ficarra
2024

Abstract

Implicit biases, subtle and unconscious attitudes, permeate various facets of human decision-making and are similarly pervasive in Artificial Intelligence (AI) systems. These biases can stem from shortcut learning, where models rely on superficial patterns that do not capture the underlying phenomena. Inspired by social psychology studies, we introduce two novel metrics to analyze implicit biases in vision-language models. Our comprehensive analysis of 90 OpenCLIP models reveals widespread anomalies related to ethnicity and gender. The first metric considers the cosine similarity between images and text prompts related to social stereotypes. The second metric adapts the Implicit Association Test (IAT), which evaluates prejudice and hidden discrimination in human behavior. Our findings illustrate that conventional text-based debiasing efforts can inadvertently amplify second-order biases instead of mitigating them. Furthermore, expanding our evaluation to multimodal Large Language Models (LLMs), we demonstrate disparities in the tendency to generate semantically positive or negative outputs depending on the ethnicity or gender of the individuals depicted in the input images.
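To make the first metric concrete, the sketch below shows how an image-to-prompt cosine similarity can be probed with the open_clip library. The model tag, the image path, and the valenced prompts are illustrative placeholders, not the stereotype prompts or the 90 model checkpoints evaluated in the paper; this is a minimal sketch of the general technique, assuming open_clip_torch is installed.

    # Minimal sketch: cosine similarity between a portrait image and
    # valenced text prompts in an OpenCLIP model (illustrative, not the
    # paper's actual prompts or checkpoints).
    import torch
    import open_clip
    from PIL import Image

    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k")  # example checkpoint
    tokenizer = open_clip.get_tokenizer("ViT-B-32")
    model.eval()

    # Hypothetical positive/negative prompt pair for illustration only.
    prompts = ["a photo of a trustworthy person",
               "a photo of a dangerous person"]
    image = preprocess(Image.open("face.jpg")).unsqueeze(0)  # any portrait
    text = tokenizer(prompts)

    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
        # L2-normalize so the dot product equals cosine similarity.
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        sims = (img_feat @ txt_feat.T).squeeze(0)  # one score per prompt

    # Association gap: positive values mean the image embedding sits
    # closer to the positive prompt. Comparing the gap's distribution
    # across demographic groups is one way to surface implicit bias.
    gap = (sims[0] - sims[1]).item()
    print(dict(zip(prompts, sims.tolist())), "gap:", gap)

Averaging such gaps over images grouped by perceived ethnicity or gender, and checking whether the averages diverge, is the general idea behind similarity-based bias probes of this kind.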
Fairness and Ethics towards transparent AI: facing the chalLEnge through model Debiasing (FAILED)
Milan, Italy
29 Sep-4 Oct
Beyond the Surface: Comprehensive Analysis of Implicit Bias in Vision-Language Models / Capitani, Giacomo; Lucarini, Alice; Bonicelli, Lorenzo; Bolelli, Federico; Calderara, Simone; Vezzali, Loris; Ficarra, Elisa. - (2024). (Paper presented at the conference Fairness and Ethics towards transparent AI: facing the chalLEnge through model Debiasing (FAILED), held in Milan, Italy, 29 Sep-4 Oct).
Files in this record:
ECCV_2024___FAILED_Workshop___Beyond_the_Surface__Comprehensive_Analysis_of_Implicit_Bias_in_Vision_Language_Models.pdf
Open access
Type: Author's version, revised and accepted for publication
Size: 13.11 MB
Format: Adobe PDF

Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1350126