Implicit biases, subtle and unconscious attitudes, permeate various facets of human decision-making and are similarly pervasive in Artificial Intelligence (AI) systems. These biases can stem from shortcut learning, where models rely on superficial patterns that do not capture the underlying phenomena. Inspired by social psychology studies, we introduce two novel metrics to analyze implicit biases in visual-language models. Our comprehensive analysis of 90 open-clip models reveals widespread anomalies related to ethnicity and gender. The first metric considers the cosine similarity between images and text prompts related to social stereotypes. The second metric adapts the Implicit Association Test (IAT), which evaluates prejudice and hidden discrimination within human behavior. Our findings illustrate that conventional text-based debiasing efforts can inadvertently amplify second-order biases instead of mitigating them. Furthermore, in expanding our evaluation to multimodal Large Language Models (LLMs), we demonstrate disparities in the tendency to generate semantically positive or negative outputs, depending on the ethnicity or gender of the individuals depicted in the input images.
Beyond the Surface: Comprehensive Analysis of Implicit Bias in Vision-Language Models / Capitani, Giacomo; Lucarini, Alice; Bonicelli, Lorenzo; Bolelli, Federico; Calderara, Simone; Vezzali, Loris; Ficarra, Elisa. - (2024). (Intervento presentato al convegno Fairness and Ethics towards transparent AI: facing the chalLEnge through model Debiasing (FAILED) tenutosi a Milan, Italy nel 29 Sep-4 Oct).
Beyond the Surface: Comprehensive Analysis of Implicit Bias in Vision-Language Models
Giacomo Capitani
;Alice Lucarini;Lorenzo Bonicelli;Federico Bolelli;Simone Calderara;Loris Vezzali;Elisa Ficarra
2024
Abstract
Implicit biases, subtle and unconscious attitudes, permeate various facets of human decision-making and are similarly pervasive in Artificial Intelligence (AI) systems. These biases can stem from shortcut learning, where models rely on superficial patterns that do not capture the underlying phenomena. Inspired by social psychology studies, we introduce two novel metrics to analyze implicit biases in visual-language models. Our comprehensive analysis of 90 open-clip models reveals widespread anomalies related to ethnicity and gender. The first metric considers the cosine similarity between images and text prompts related to social stereotypes. The second metric adapts the Implicit Association Test (IAT), which evaluates prejudice and hidden discrimination within human behavior. Our findings illustrate that conventional text-based debiasing efforts can inadvertently amplify second-order biases instead of mitigating them. Furthermore, in expanding our evaluation to multimodal Large Language Models (LLMs), we demonstrate disparities in the tendency to generate semantically positive or negative outputs, depending on the ethnicity or gender of the individuals depicted in the input images.File | Dimensione | Formato | |
---|---|---|---|
ECCV_2024___FAILED_Workshop___Beyond_the_Surface__Comprehensive_Analysis_of_Implicit_Bias_in_Vision_Language_Models.pdf
Open access
Tipologia:
Versione dell'autore revisionata e accettata per la pubblicazione
Dimensione
13.11 MB
Formato
Adobe PDF
|
13.11 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris