Knowledge Discovery in Databases
ÚSI-RTZZD, Acad. year: 2020/2021
The course covers the basic concepts of knowledge discovery in databases, the relation between knowledge discovery and data mining, data sources for knowledge discovery, the principles and techniques of data pre-processing for mining, systems for knowledge discovery in databases, and data mining query languages. It also focuses on data mining techniques: characterization and discrimination, association rules, classification and prediction, clustering, complex data type mining, and trends in data mining. Students produce a data mining project using an available data mining tool.
Learning outcomes of the course unit
Students will gain a wide yet sufficiently deep overview of the field of knowledge acquisition from databases.
They will be able to both use and develop tools for knowledge acquisition.
Students will learn specialised terminology in both Czech and English.
Students will gain experience with working on projects in a small team.
Students will improve their skills in the area of the presentation and defence of project results.
Recommended optional programme components
Recommended or required reading
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.
Berka, P.: Dobývání znalostí z databází. Academia, 2003, 366 p., ISBN 80-200-1062-9.
Dunham, M.H.: Data Mining. Introductory and Advanced Topics. Pearson Education, Inc., 2003, 315 p.
Zendulka, J. et al.: Získávání znalostí z databází. FIT VUT v Brně, 2009, 160 p.
Zendulka, J., Kunc, M., Stryka, L.: Získávání znalostí z databází. FIT. Extrakce informací a získávání znalostí na Webu. FIT VUT v Brně, 2010, 65 p.
Planned learning activities and teaching methods
Tuition takes place via lectures and seminars. The lectures focus on the explanation of basic principles, the methods of the given discipline, problems and example solutions.
Assessment methods and criteria linked to learning outcomes
A mid-semester written exam, formulation of a data-mining task, defence of a project. Credit is awarded on the basis of completing a project and attaining at least 24 points for graded activities during the semester.
Language of instruction
1. Introduction – motivation, fundamental concepts, types of data source and acquired knowledge, methodology.
2. Data Warehouse and OLAP Technology for knowledge discovery.
3. Data Preparation – methods.
4. Data Preparation – data characteristics.
5. Mining frequent patterns and associations – basic concepts, efficient and scalable frequent itemset mining methods.
6. Multi-level association rules, association mining and correlation analysis, constraint-based association rules.
7. Classification and prediction – basic concepts, decision tree, Bayesian classification, rule-based classification.
8. Classification by means of neural networks, SVM classifier, other classification methods, prediction.
9. Cluster analysis – basic concepts, types of data in cluster analysis, partitioning and hierarchical methods. Other clustering methods.
10. Introduction to mining data streams, time-series and sequence data.
11. Introduction to mining in graphs, spatio-temporal and multimedia data.
12. Mining in biological data.
13. Text mining, Web mining.
To acquaint students with the issues involved in acquiring knowledge from various types of data sources, to explain the types of useful knowledge and the individual steps of knowledge acquisition from data, and to introduce the techniques, algorithms and tools used in this process.
Specification of controlled education, way of implementation and compensation for absences