Exploiting Structural Consistencies with Stacked Conditional Random Fields.
Mathematical Methodologies in Pattern Recognition and Machine Learning, Springer Proceedings in Mathematics & Statistics, 30:111-125, 2013.
Peter Kluegl, Martin Toepfer, Florian Lemmerich, Andreas Hotho and Frank Puppe.
[Abstract]
[BibTeX]
Conditional Random Fields (CRF) are popular methods for labeling unstructured or textual data. Like many machine learning approaches, these undirected graphical models assume the instances to be independently distributed. However, in real-world applications data is grouped in a natural way, e.g., by its creation context. The instances in each group often share additional structural consistencies. This paper proposes a domain-independent method for exploiting these consistencies by combining two CRFs in a stacked learning framework. We apply rule learning collectively on the predictions of an initial CRF for one context to acquire descriptions of its specific properties. Then, we utilize these descriptions as dynamic and high-quality features in an additional (stacked) CRF. The presented approach is evaluated with a real-world dataset for the segmentation of references and achieves a significant reduction of the labeling error.
Data Mining, Validation and Collaborative Knowledge Capture.
In:
S. Brüggemann and C. d’Amato (editors):
Collaboration and the Semantic Web: Social Networks, Knowledge Networks, and Knowledge Resources, pages 149-167.
IGI Global, 2012.
Martin Atzmueller, Stephanie Beer and Frank Puppe.
[BibTeX]
Collective Information Extraction with Context-Specific Consistencies.
In: P. A. Flach, T. D. Bie and N. Cristianini
(editors):
ECML/PKDD (1), volume 7523 of Lecture Notes in Computer Science, pages 728-743.
Springer, 2012.
Peter Klügl, Martin Toepfer, Florian Lemmerich, Andreas Hotho and Frank Puppe.
[doi]
[BibTeX]
Stacked Conditional Random Fields Exploiting Structural Consistencies.
In: P. L. Carmona, J. S. Sánchez and A. Fred
(editors):
Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods (ICPRAM), pages 240-248.
SciTePress, Vilamoura, Algarve, Portugal, 2012.
Peter Klügl, Martin Toepfer, Florian Lemmerich, Andreas Hotho and Frank Puppe.
[doi]
[Abstract]
[BibTeX]
Conditional Random Fields (CRF) are popular methods for labeling unstructured or textual data. Like many machine learning approaches, these undirected graphical models assume the instances to be independently distributed. However, in real-world applications data is grouped in a natural way, e.g., by its creation context. The instances in each group often share additional structural consistencies. This paper proposes a domain-independent method for exploiting these consistencies by combining two CRFs in a stacked learning framework. The approach incorporates three successive steps of inference: First, an initial CRF processes single instances as usual. Next, we apply rule learning collectively on all labeled outputs of one context to acquire descriptions of its specific properties. Finally, we utilize these descriptions as dynamic and high-quality features in an additional stacked CRF. The presented approach is evaluated with a real-world dataset for the segmentation of references and achieves a significant reduction of the labeling error.
Data Mining, Validation and Collaborative Knowledge Capture.
In:
S. Brüggemann and C. d’Amato (editors):
Collaboration and the Semantic Web: Social Networks, Knowledge Networks and Knowledge Resources.
IGI Global, 2011.
Martin Atzmueller, Stephanie Beer and Frank Puppe.
[BibTeX]
Segmentation of References with Skip-Chain Conditional Random Fields for Consistent Label Transitions.
In:
Workshop Notes of the LWA 2011 - Learning, Knowledge, Adaptation.
2011.
Martin Toepfer, Peter Kluegl, Andreas Hotho and Frank Puppe.
[doi]
[BibTeX]
Local Adaptive Extraction of References.
In: R. Dillmann, J. Beyerer, U. D. Hanebeck and T. Schultz
(editors):
KI 2010: Advances in Artificial Intelligence, 33rd Annual German Conference on AI, LNAI 6359, pages 40-47.
Springer, 2010.
Peter Kluegl, Andreas Hotho and Frank Puppe.
[doi]
[Abstract]
[BibTeX]
The accurate extraction of scholarly reference information from scientific publications is essential for many useful applications like BibTeX management systems or citation analysis. Automatic extraction methods suffer from the heterogeneity of reference notation, no matter whether the extraction model was handcrafted or learnt from labeled data. However, references of the same paper or journal are usually homogeneous. We exploit this local consistency with a novel approach. Given some initial information from such a reference section, we try to derive generalized patterns. These patterns are used to create a local model of the current document. The local model helps to identify errors and to improve the extracted information incrementally during the extraction process. Our approach is implemented with handcrafted transformation rules that work on a meta-level and can correct the information independently of the applied layout style. The experimental results compete very well with state-of-the-art methods and show an extremely high performance on consistent reference sections.
Conditional Random Fields For Local Adaptive Reference Extraction.
In: M. Atzmüller, D. Benz, A. Hotho and G. Stumme
(editors):
Proceedings of LWA2010 - Workshop-Woche: Lernen, Wissen & Adaptivitaet.
Kassel, Germany, 2010.
Martin Toepfer, Peter Kluegl, Andreas Hotho and Frank Puppe.
[doi]
[Abstract]
[BibTeX]
The accurate extraction of bibliographic information from scientific publications is an active field of research. Machine learning and sequence labeling approaches like Conditional Random Fields (CRF) are often applied for this reference extraction task, but still suffer from the ambiguity of reference notation. Reference sections apply a predefined style guide and contain only homogeneous references. Therefore, other references of the same paper or journal often provide evidence of how the fields of a reference are correctly labeled. We propose a novel approach that exploits the similarities within a document. Our process model uses information from unlabeled documents directly during the extraction task in order to automatically adapt to the perceived style guide. This is implemented by changing the manifestation of the features for the applied CRF. The experimental results show considerable improvements compared to the common approach. We achieve an average F1 score of 96.7% and an instance accuracy of 85.4% on the test data set.
A Data Warehouse-Based Approach for Quality Management, Evaluation and Analysis of Intelligent Systems using Subgroup Mining.
In:
Proc. 22nd International Florida Artificial Intelligence Research Society Conference (FLAIRS), accepted, pages 372-377.
AAAI Press, 2009.
Martin Atzmueller, Stephanie Beer and Frank Puppe.
[BibTeX]
A Semi-Automatic Approach for Confounding-Aware Subgroup Discovery.
International Journal on Artificial Intelligence Tools (IJAIT), 18(1):1-18, 2009.
Martin Atzmueller, Frank Puppe and Hans-Peter Buscher.
[BibTeX]
Design and Implementation of a Data Warehouse for Quality Management, System Evaluation and Knowledge Discovery in the Medical Domain.
In:
Proc. 1st European Workshop on Design, Evaluation and Refinement of Intelligent Systems.
Erfurt, 2008.
Martin Atzmueller, Stephanie Beer, Alexander Hörnlein, Ralf Melcher, Hardi Lührs and Frank Puppe.
[BibTeX]
Application and Evaluation of a Medical Knowledge-System in Sonography (SonoConsult).
In:
Proc. 18th European Conference on Artificial Intelligence (ECAI 2008), accepted.
2008.
Frank Puppe, Martin Atzmueller, Georg Buscher, Matthias Huettig, Hardi Lührs and Hans-Peter Buscher.
[BibTeX]
Causal Subgroup Analysis for Detecting Confounding.
In:
Proc. 18th International Conference on Applications of Declarative Programming and Knowledge Management (INAP 2007).
Wuerzburg, Germany, 2007.
Martin Atzmueller and Frank Puppe.
[BibTeX]
Rapid Knowledge Capture Using Subgroup Discovery with Incremental Refinement.
In:
Proc. 4th International Conference on Knowledge Capture (K-CAP 2007), pages 31-38.
ACM Press, 2007.
Martin Atzmueller, Joachim Baumeister, Peter Klügl and Frank Puppe.
[BibTeX]
Case-Based Characterization and Analysis of Subgroup Patterns.
In:
Proc. LWA 2006 (KDML Special Track), Hildesheimer Informatik Berichte.
University of Hildesheim, 2006.
Martin Atzmueller and Frank Puppe.
[BibTeX]
SD-Map - A Fast Algorithm for Exhaustive Subgroup Discovery.
In:
Proc. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2006), LNAI, pages 6-17.
2006.
Martin Atzmueller and Frank Puppe.
[BibTeX]
Conservative and Creative Strategies for the Refinement of Scoring Rules.
In: G. Sutcliffe and R. Goebel
(editors):
Proc. 19th Intl. Florida Artificial Intelligence Research Society Conference 2006 (FLAIRS-2006), pages 408-413.
AAAI Press, 2006.
Joachim Baumeister, Martin Atzmueller, Peter Kluegl and Frank Puppe.
[BibTeX]
Exemplifying Subgroup Mining Results for Interactive Knowledge Refinement.
In:
Proc. 13th Leipziger Informatik-Tage 2005 (LIT 2005), LNI, pages 101-106.
2005.
Martin Atzmueller, Joachim Baumeister and Frank Puppe.
[BibTeX]
Subgroup Mining for Interactive Knowledge Refinement.
In:
Proc. 10th Conference on Artificial Intelligence in Medicine (AIME 05), LNAI 3581, pages 453-462.
2005.
Martin Atzmueller, Joachim Baumeister, Achim Hemsing, Ernst-Jürgen Richter and Frank Puppe.
[BibTeX]
Inductive Learning for Case-Based Diagnosis with Multiple Faults.
In:
Advances in Case-Based Reasoning, volume 2416 of LNAI, pages 28-42.
2002.
Proc. 6th European Conference on Case-Based Reasoning (ECCBR 2002)
Joachim Baumeister, Martin Atzmueller and Frank Puppe.
[BibTeX]