Assessing the Impact of Contextual Information in Hate Speech Detection

Gravano, Agustín; Pérez, Juan Manuel; Luque, Franco M; Zeyat, Demián; Kondratzky, Martín; Moro, Agustín; Serrati, Pablo Santiago; Zajac, Joaquín; Miguel, Paula; Debandi, Natalia; Cotik, Viviana

dc.rights.license	https://creativecommons.org/licenses/by/4.0/	es_AR
dc.contributor.author	Gravano, Agustín	es_AR
dc.contributor.author	Pérez, Juan Manuel	es_AR
dc.contributor.author	Luque, Franco M	es_AR
dc.contributor.author	Zeyat, Demián	es_AR
dc.contributor.author	Kondratzky, Martín	es_AR
dc.contributor.author	Moro, Agustín	es_AR
dc.contributor.author	Serrati, Pablo Santiago	es_AR
dc.contributor.author	Zajac, Joaquín	es_AR
dc.contributor.author	Miguel, Paula	es_AR
dc.contributor.author	Debandi, Natalia	es_AR
dc.contributor.author	Cotik, Viviana	es_AR
dc.date.accessioned	2023-05-31T18:56:49Z
dc.date.available	2023-05-31T18:56:49Z
dc.date.issued	2023
dc.identifier.uri	https://repositorio.utdt.edu/handle/20.500.13098/11849
dc.identifier.uri	https://doi.org/10.1109/ACCESS.2023.3258973
dc.description.abstract	Social networks and other digital media deal with huge amounts of user-generated content, where hate speech has become an increasingly relevant problem. A great effort has been made to develop automatic tools for its analysis and moderation, at least in its most threatening forms, such as in violent acts against people and groups protected by law. One limitation of current approaches to automatic hate speech detection is the lack of context. The focus on isolated messages, without considering any type of conversational context or even the topic being discussed, severely restricts the information available to determine whether a post on a social network should be tagged as hateful or not. In this work, we assess the impact of adding contextual information to the hate speech detection task. We specifically study a subdomain of Twitter data consisting of replies to posts by digital newspapers, which provides a natural environment for contextualized hate speech detection. We built a new corpus in Spanish (Rioplatense variant) focused on hate speech associated with the COVID-19 pandemic, annotated using guidelines carefully designed by our interdisciplinary team. Our classification experiments using state-of-the-art transformer-based machine learning techniques show evidence that adding contextual information improves the performance of hate speech detection for two proposed tasks, binary and multi-label prediction, increasing their Macro F1 by 4.2 and 5.5 points, respectively. These results highlight the importance of using contextual information in hate speech detection. Our code, models, and corpus have been made available for further research.	es_AR
dc.description.sponsorship	This article is published in IEEE Access, 11, 30575-30590.	es_AR
dc.format.extent	pp. 30575-30590	es_AR
dc.format.medium	application/pdf	es_AR
dc.language	eng	es_AR
dc.relation.ispartof	IEEE Access, vol. 11, pp. 30575-30590, 2023, doi: 10.1109/ACCESS.2023.3258973.
dc.rights	info:eu-repo/semantics/openAccess	es_AR
dc.subject	NLP	es_AR
dc.subject	Text classification	es_AR
dc.subject	Hate speech detection	es_AR
dc.subject	Contextual information	es_AR
dc.subject	Spanish corpus	es_AR
dc.subject	COVID-19 hate speech	es_AR
dc.title	Assessing the Impact of Contextual Information in Hate Speech Detection	es_AR
dc.type	info:eu-repo/semantics/article	es_AR
dc.type.version	info:eu-repo/semantics/publishedVersion	es_AR

Files in this item

Name: IEEE_Garavano_2023.pdf
Size: 2.978Mb
Format: PDF

This item appears in the following collection(s)

Artículos presentados, aceptados y publicados
Publicaciones de investigadores ditellianos en revistas científicas