Authors: F Seitl, T Kovářík, S Mirshahi, J Kryštůfek, R Dujava, M Ondreička, et al.
Venue: arXiv preprint arXiv:2404.04068
Year: 2024
Citations: 11
Links
- arXiv: 2404.04068
Abstract
This work studies how to evaluate information extraction outputs beyond simple exact matching. We discuss practical quality dimensions (correctness, completeness, consistency, and usefulness), compare metric choices, and show where automatic scores diverge from human judgment for real extraction tasks.
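The abstract's point that automatic scores can diverge from exact matching is easy to see concretely. The sketch below (my own illustration, not code from the paper) compares strict exact match against token-level F1 for a single extracted field; the field values are invented examples.

```python
# Minimal sketch (not the paper's code): exact match vs token-level F1
# for one extracted field, showing how the two scores can diverge.

def exact_match(pred: str, gold: str) -> float:
    """1.0 only if prediction and gold agree after trivial normalization."""
    return float(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token precision and recall (bag-of-tokens overlap)."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    # Count overlapping tokens with multiplicity.
    gold_counts: dict[str, int] = {}
    for t in gold_toks:
        gold_counts[t] = gold_counts.get(t, 0) + 1
    common = 0
    for t in pred_toks:
        if gold_counts.get(t, 0) > 0:
            common += 1
            gold_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(pred_toks)
    recall = common / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

# A prediction that a human would call essentially correct:
pred, gold = "Dr. Jane Smith", "Jane Smith"
print(exact_match(pred, gold))          # 0.0 — exact match calls it wrong
print(round(token_f1(pred, gold), 2))   # 0.8 — partial credit for overlap
```

This is the kind of divergence the paper's quality dimensions (correctness vs usefulness) are meant to make explicit: exact match scores the prediction as a total failure, while a softer overlap metric, or a human, would not.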
Resources
- Video: TODO
- Slides: TODO
- Code: TODO
- Dataset: TODO
Notes
I like this paper as a “methodology anchor” for later projects: before optimizing extraction models, make the evaluation itself explicit and reliable. That framing helped us design cleaner experiments in downstream fact-checking and claim verification pipelines.