Course detail
Knowledge Discovery in Databases
FIT-ZZN
Acad. year: 2020/2021
Data warehouses. Data mining techniques: association rules, classification and prediction, clustering. Mining unconventional data: data streams, time series and sequences, graphs, spatial and spatio-temporal data, multimedia. Text and web mining. Working out a data mining project using an available data mining tool.
Supervisor
Department
Learning outcomes of the course unit
- Students get a broad, yet in-depth overview of the field of data mining and knowledge discovery.
- They are able both to use and to develop knowledge discovery tools.
- Students learn terminology in Czech and English.
- Students gain experience in solving projects in a small team.
- Students improve their ability to present and defend the results of projects.
Prerequisites
- Knowledge of the basic steps of the data mining process and of data preparation methods for the data modelling step (covered in the course UPA - Data Storage and Preparation).
- Basic knowledge of probability and statistics.
- Knowledge of database technology at the level of a bachelor's course.
Co-requisites
Not applicable.
Recommended optional programme components
Not applicable.
Recommended or required reading
Skiena, S.S.: The Data Science Design Manual. Springer, 2017, 445 p. ISBN 978-3-319-55443-3.
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, 2006, 738 p. ISBN 0387310738.
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.
Planned learning activities and teaching methods
Not applicable.
Assessment methods and criteria linked to learning outcomes
A mid-term test, formulation of a data mining task, presentation of the project.
Exam prerequisites:
Course credit requires working out the project, defending the project results, and obtaining at least 24 points for activities during the semester.
Language of instruction
Czech
Work placements
Not applicable.
Aims
To familiarize students with the methods and algorithms of data modelling for knowledge discovery from data.
Specification of controlled education, way of implementation and compensation for absences
- Mid-term written exam; there is no resit. Excused absences are handled by the guarantor.
- Formulation of the data mining task by the prescribed deadline. Excused absences are handled by the assistant.
- Presentation of the project results by the prescribed deadline. Excused absences are handled by the assistant.
- Final exam. The minimum number of points that can be obtained from the final exam is 20; otherwise, no points are assigned to the student. Excused absences are handled by the guarantor.
Classification of course in study plans
- Programme IT-MGR-2 Master's
branch MPV, any year of study, winter semester, 5 credits, compulsory-optional
branch MBS, any year of study, winter semester, 5 credits, compulsory-optional
branch MMI, any year of study, winter semester, 5 credits, elective
branch MMM, any year of study, winter semester, 5 credits, elective
- Programme MITAI Master's
specialization NADE, any year of study, winter semester, 5 credits, elective
specialization NBIO, any year of study, winter semester, 5 credits, compulsory
specialization NGRI, any year of study, winter semester, 5 credits, elective
specialization NNET, any year of study, winter semester, 5 credits, elective
specialization NVIZ, any year of study, winter semester, 5 credits, elective
specialization NCPS, any year of study, winter semester, 5 credits, elective
specialization NSEC, any year of study, winter semester, 5 credits, elective
specialization NEMB, any year of study, winter semester, 5 credits, elective
specialization NHPC, any year of study, winter semester, 5 credits, elective
specialization NIDE, any year of study, winter semester, 5 credits, elective
specialization NISY, any year of study, winter semester, 5 credits, compulsory
specialization NMAL, any year of study, winter semester, 5 credits, elective
specialization NMAT, any year of study, winter semester, 5 credits, elective
specialization NSEN, any year of study, winter semester, 5 credits, elective
specialization NVER, any year of study, winter semester, 5 credits, elective
specialization NSPE, any year of study, winter semester, 5 credits, elective
- Programme IT-MGR-2 Master's
branch MBI, 2nd year of study, winter semester, 5 credits, compulsory
branch MGM, 2nd year of study, winter semester, 5 credits, elective
branch MSK, 2nd year of study, winter semester, 5 credits, compulsory-optional
branch MIS, 2nd year of study, winter semester, 5 credits, compulsory-optional
branch MIN, 2nd year of study, winter semester, 5 credits, compulsory
- Programme MITAI Master's
specialization NISD, 2nd year of study, winter semester, 5 credits, compulsory
Type of course unit
Lecture
39 hours, optional
Teacher / Lecturer
Syllabus
- Data Warehouse and OLAP Technology for knowledge discovery.
- Mining frequent patterns and associations - basic concepts, efficient and scalable frequent itemset mining methods.
- Multi-level association rules, association mining and correlation analysis, constraint-based association rules.
- Predictive modelling - basic concepts, classification methods - decision tree, Bayesian classification, rule-based classification.
- Classification by means of neural networks, SVM classifier, Random forests.
- Other classification and regression methods. Evaluation of quality of classification and regression.
- Cluster analysis - basic concepts, types of data in cluster analysis.
- Partitioning-based and hierarchical clustering. Other clustering methods. Evaluation of quality of clustering.
- Outlier analysis. Mining in biological data.
- Introduction to mining data streams and time series.
- Introduction to mining in sequences, graphs, spatio-temporal data, moving object data and multimedia data.
- Text mining.
- Mining the Web. Process mining.
- Introduction to big data analytics.
Project
13 hours, compulsory
Teacher / Lecturer
Syllabus
- Working out a data mining project using an available data mining tool.