Publication detail

Interactive Mining on Hierarchical Data

Original title

Interactive Mining on Hierarchical Data

Language

en

Original abstract

In this paper, we propose a framework for interactive, iterative, and intuitive mining of multilevel association, characterization, and classification rules on data organized in multilevel conceptual hierarchies. The framework is called OLAM SE (Self-Explaining On-Line Analytical Mining) and is proposed as an extension of OLAP or as an alternative to Han's OLAM. OLAM processes data stored in data cubes whose structure is based on a given conceptual hierarchy. OLAM SE determines the minimum support value from a user-defined cover value of the data, using the entropy coding principle. It also automatically determines a maximum threshold to avoid explaining knowledge that is obvious and therefore potentially uninteresting. The major part of the data is thus described by frequent patterns. The presentation of results is inspired by UML diagram notation: it consists of a graph whose nodes are frequent data sets, represented as packages containing subpackages (data classes or items), and whose edges represent relations or patterns between packages. This representation could also be applicable to characterization and to a non-naïve Bayesian classification process. Patterns can be explored interactively by the user, who gets a detailed view of the attractive ones and can thus intuitively steer the process of obtaining more detailed knowledge.

BibTeX


@inproceedings{BUT26104,
  author="Petr {Chmelař} and Lukáš {Stryka}",
  title="Interactive Mining on Hierarchical Data",
  annote="In this paper, we propose a framework for interactive, iterative, and
intuitive mining of multilevel association, characterization, and classification
rules on data organized in multilevel conceptual hierarchies. The framework is
called OLAM SE (Self-Explaining On-Line Analytical Mining) and is proposed as an
extension of OLAP or as an alternative to Han's OLAM. OLAM processes data stored
in data cubes whose structure is based on a given conceptual hierarchy. OLAM SE
determines the minimum support value from a user-defined cover value of the
data, using the entropy coding principle. It also automatically determines a
maximum threshold to avoid explaining knowledge that is obvious and therefore
potentially uninteresting. The major part of the data is thus described by
frequent patterns. The presentation of results is inspired by UML diagram
notation: it consists of a graph whose nodes are frequent data sets, represented
as packages containing subpackages (data classes or items), and whose edges
represent relations or patterns between packages. This representation could also
be applicable to characterization and to a non-naïve Bayesian classification
process. Patterns can be explored interactively by the user, who gets a detailed
view of the attractive ones and can thus intuitively steer the process of
obtaining more detailed knowledge.",
  address="Brno University of Technology",
  booktitle="Proceedings of the 13th Conference STUDENT EEICT 2007 Volume 4",
  chapter="26104",
  howpublished="print",
  institution="Brno University of Technology",
  year="2007",
  month="april",
  pages="410--414",
  publisher="Brno University of Technology",
  type="conference paper"
}