Constructing Efficient Information Extraction Pipelines
- Authored by
- Henning Wachsmuth, Benno Stein, Gregor Engels
- Abstract
Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to achieve acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
- External organisation(s)
-
Universität Paderborn
Bauhaus-Universität Weimar
- Type
- Article in conference proceedings
- Pages
- 2237-2240
- Number of pages
- 4
- Publication date
- October 2011
- Publication status
- Published
- ASJC Scopus subject areas
- Decision Sciences (all), Business, Management and Accounting (all)
- Electronic version(s)
-
https://doi.org/10.1145/2063576.2063935 (Access: Closed)