Computing iceberg concept lattices with TITANIC

We introduce the notion of iceberg concept lattices and show
their use in knowledge discovery in databases. Iceberg lattices
are a conceptual clustering method that is well suited for
analyzing very large databases. They also serve as a condensed
representation of frequent itemsets, as a starting point for
computing bases of association rules, and as a visualization
method for association rules. Iceberg concept lattices are based
on the theory of Formal Concept Analysis, a mathematical theory
with applications in data analysis, information retrieval, and
knowledge discovery. We present a new algorithm called TITANIC
for computing (iceberg) concept lattices. It is based on data
mining techniques with a level-wise approach. In fact, TITANIC
can be used for a more general problem: computing arbitrary
closure systems when the closure operator comes with a so-called
weight function. The use of weight functions for computing
closure systems has not been discussed in the literature up to
now. Applications providing such a weight function include
association rule mining, functional dependencies in databases,
conceptual clustering, and ontology engineering. The algorithm
is experimentally evaluated and compared with Ganter's
Next-Closure algorithm. The evaluation shows a substantial gain
in efficiency, especially for weakly correlated data.
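
To illustrate how a weight function can stand in for a closure operator, the following minimal Python sketch may help. It is not the TITANIC algorithm itself, only a naive level-wise enumeration under stated assumptions: the toy context `CONTEXT`, the helper names `support`, `closure`, and `iceberg_intents`, and the use of relative support as the weight are all hypothetical choices made for this example. The closure of an itemset is derived purely from weights: an item belongs to the closure exactly when adding it leaves the support unchanged.

```python
# Hypothetical toy context: each object maps to its set of attributes.
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"a", "c", "d"},
    "o4": {"b", "c", "d"},
    "o5": {"a", "b", "c"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def support(itemset):
    """Weight function: fraction of objects containing every item."""
    hits = sum(1 for attrs in CONTEXT.values() if itemset <= attrs)
    return hits / len(CONTEXT)


def closure(itemset):
    """Closure from weights only: m is added iff the support is unchanged."""
    base = support(itemset)
    return frozenset(m for m in ATTRIBUTES if support(itemset | {m}) == base)


def iceberg_intents(min_support):
    """Naive level-wise scan: map each frequent closed itemset to its support."""
    intents = {}
    level = [frozenset()]
    while level:
        next_level = set()
        for itemset in level:
            s = support(itemset)
            if s < min_support:
                continue  # infrequent sets are pruned and never extended
            intents[closure(itemset)] = s
            for m in ATTRIBUTES - itemset:
                next_level.add(itemset | {m})
        level = list(next_level)
    return intents


if __name__ == "__main__":
    for intent, s in sorted(iceberg_intents(0.5).items(), key=lambda kv: -kv[1]):
        print(sorted(intent), s)
```

In this toy context the item `d` only ever occurs together with `c`, so `closure(frozenset("d"))` contains both. The sketch recomputes supports from scratch at every step, which a practical algorithm would avoid by counting supports once per level.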
