Data Mining, Validation and Collaborative Knowledge Capture
Atzmueller, M.; Beer, S. & Puppe, F.
Brüggemann, S. & d’Amato, C., ed., 'Collaboration and the Semantic Web: Social Networks, Knowledge Networks, and Knowledge Resources', IGI Global, 149-167 (2012)
Temporal data mining using shape space representations of time series
Fuchs, E.; Gruber, T.; Pree, H. & Sick, B.
Subspace representations that preserve the essential information of high-dimensional data may be advantageous for many reasons, such as improved interpretability, overfitting avoidance, and acceleration of machine learning techniques. In this article, we describe a new subspace representation of time series, which we call the polynomial shape space representation. This representation consists of optimal (in a least-squares sense) estimators of trend aspects of a time series such as average, slope, curve, change of curve, etc. The shape space representation of time series allows for the definition of a novel similarity measure for time series, which we call the shape space distance measure. Depending on the application, time series segmentation techniques can be applied to obtain a piecewise shape space representation of the time series in subsequent segments. In this article, we investigate the properties of the polynomial shape space representation and the shape space distance measure by means of some benchmark time series and discuss possible application scenarios in the field of temporal data mining.
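The idea of least-squares trend estimators and a distance defined on them can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes equally spaced samples, builds an orthonormal polynomial basis on the sample grid by Gram-Schmidt, and takes the Euclidean distance between coefficient vectors as a stand-in for the shape space distance. The function names (orthonormal_basis, shape_coeffs, shape_distance) and the default degree of 3 are hypothetical choices made for this illustration.

```python
import math


def orthonormal_basis(n, degree):
    """Orthonormal polynomials of degree 0..degree on n equally spaced points.

    Built by Gram-Schmidt on the monomials 1, x, x^2, ... (an assumption of
    this sketch; the abstract does not prescribe a construction).
    """
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # grid on [-1, 1]
    basis = []
    for k in range(degree + 1):
        v = [x ** k for x in xs]
        for b in basis:
            # Remove the component of v along an earlier basis vector.
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis


def shape_coeffs(series, degree=3):
    """Least-squares coefficients of the series in the orthonormal basis.

    With an orthonormal basis, each least-squares coefficient is simply the
    inner product of the series with the corresponding basis vector.
    """
    n = len(series)
    if n < 2 or n <= degree:
        raise ValueError("need at least 2 samples and more samples than the degree")
    return [sum(y * b for y, b in zip(series, bk))
            for bk in orthonormal_basis(n, degree)]


def shape_distance(a, b, degree=3):
    """Euclidean distance between coefficient vectors (assumed distance)."""
    if len(a) != len(b):
        raise ValueError("this sketch compares series of equal length only")
    ca, cb = shape_coeffs(a, degree), shape_coeffs(b, degree)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(ca, cb)))
```

In this sketch the coefficients play the role of the trend aspects: the first reflects the average level, the second the slope, and so on. Because the basis is orthonormal, raising the degree adds coefficients without changing the lower-order ones.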
Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research
Poelmans, J.; Ignatov, D.; Viaene, S.; Dedene, G. & Kuznetsov, S.
Formal Concept Analysis (FCA) is an unsupervised clustering technique, and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 which mention FCA and information retrieval in the abstract, title or keywords. Using a prototype of our FCA-based toolset CORDIET, we converted the PDF files containing the papers to plain text, indexed them with Lucene using a thesaurus containing terms related to FCA research, and then created the concept lattice shown in this paper. We visualized, analyzed and explored the literature with concept lattices and discovered multiple interesting research streams in IR, of which we give an extensive overview. The core contributions of this paper are the innovative application of FCA to the text mining of scientific papers and the survey of FCA-based IR research.
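How a concept lattice groups papers by shared thesaurus terms can be shown with a small sketch. It does not reproduce CORDIET or the Lucene indexing step: it assumes the papers have already been reduced to sets of matched terms, and the paper identifiers and terms below are invented for illustration. The function formal_concepts is a hypothetical helper that enumerates all formal concepts by closing the term sets of the papers under intersection.

```python
def formal_concepts(context):
    """Enumerate the formal concepts of a paper-to-terms context.

    context maps each object (paper) to its set of attributes (terms).
    Returns (extent, intent) pairs as frozensets, most general first.
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of object rows; the full attribute
    # set is the intent of the bottom concept.
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & intent for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every paper that carries all terms of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    return sorted(concepts, key=lambda c: (len(c[1]), sorted(c[1])))


# Hypothetical toy context: three papers and the terms found in each.
papers = {
    "p1": {"lattice", "retrieval"},
    "p2": {"lattice", "ontology"},
    "p3": {"lattice", "retrieval", "ranking"},
}

for extent, intent in formal_concepts(papers):
    print(sorted(extent), sorted(intent))
```

Each printed pair is one node of the concept lattice: a maximal group of papers together with exactly the terms they all share. This brute-force closure is adequate for a toy context; larger collections would call for a dedicated lattice-construction algorithm.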