The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

Authors: Nailia Mirzakhmedova, Johannes Kiesel, Milad Alshomary, Maximilian Heinrich, Nicolas Handke, Xiaoni Cai, Valentin Barriere, Doratossadat Dastgheib, Omid Ghahroodi, MohammadAli SadraeiJavaheri, Ehsaneddin Asgari, Lea Kawaletz, Henning Wachsmuth, Benno Stein
Abstract: While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in the SemEval 2023 Task 4. ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.
Organisational unit(s): Institut für Künstliche Intelligenz
Fachgebiet Maschinelle Sprachverarbeitung
Type: Article in conference proceedings
Pages: 16121-16134
Number of pages: 14
Publication date: 1 May 2024
Publication status: Published
Peer-reviewed: Yes
ASJC Scopus subject areas: Theoretical Computer Science, Computational Theory and Mathematics, Computer Science Applications
Electronic version(s): https://aclanthology.org/2024.lrec-main.1402/ (Access: Open)

BibTeX

@inproceedings{3626ffb7fb31427a9aea4c967d7fb8d6,
title = "The Touch{\'e}23-ValueEval Dataset for Identifying Human Values behind Arguments",
abstract = "While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touch{\'e}23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touch{\'e}23-ValueEval dataset was utilized in the SemEval 2023 Task 4. ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.",
keywords = "Corpus (Creation, Annotation, etc.), Document Classification, Text categorisation",
author = "Nailia Mirzakhmedova and Johannes Kiesel and Milad Alshomary and Maximilian Heinrich and Nicolas Handke and Xiaoni Cai and Valentin Barriere and Doratossadat Dastgheib and Omid Ghahroodi and MohammadAli SadraeiJavaheri and Ehsaneddin Asgari and Lea Kawaletz and Henning Wachsmuth and Benno Stein",
note = "Publisher Copyright: {\textcopyright} 2024 ELRA Language Resource Association: CC BY-NC 4.0.",
year = "2024",
month = may,
day = "1",
language = "English",
pages = "16121--16134",
editor = "Nicoletta Calzolari and Min-Yen Kan and Veronique Hoste and Alessandro Lenci and Sakriani Sakti and Nianwen Xue",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
publisher = "ELRA and ICCL",
}
