
Evaluation of Question Answering Systems: Complexity of Judging a Natural Language

Farea, Amer; Yang, Zhen; Duong, Kien; Perera, Nadeesha; Emmert-Streib, Frank (2025)

 




ACM Computing Surveys
doi:10.1145/3744663
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-2025121811909

Description

Peer reviewed
Abstract
Question answering (QA) systems are a leading and rapidly advancing field of natural language processing (NLP) research. One of their key advantages is that they enable more natural interactions between humans and machines, such as in virtual assistants or search engines. Over the past few decades, many QA systems have been developed to handle diverse QA tasks. However, the evaluation of these systems is intricate, as many of the available evaluation scores are not task-agnostic. Furthermore, translating human judgment into measurable metrics continues to be an open issue. These complexities add challenges to their assessment. This survey provides a systematic overview of evaluation scores and introduces a taxonomy with two main branches: Human-Centric Evaluation Scores (HCES) and Automatic Evaluation Scores (AES). Since many of these scores were originally designed for specific tasks but have been applied more generally, we also cover the basics of QA frameworks and core paradigms to provide a deeper understanding of their capabilities and limitations. Lastly, we discuss benchmark datasets that are critical for conducting systematic evaluations across various QA tasks.
Collections
  • TUNICRIS publications [23862]
Kalevantie 5
P.O. Box 617
33014 Tampere University
oa[@]tuni.fi | Privacy | Accessibility statement
 

 
