DM864: Advanced Data Mining (5 ECTS)

STADS: 15020101

Level
Master's level course

Teaching period
The course is offered in the spring semester.

Teacher responsible
Email: zimek@imada.sdu.dk

Timetable
Group   Type  Day        Time   Classroom   Weeks           Comment
Common  I     Monday     10-12  IMADA semi  17
Common  I     Monday     09-11  IMADA semi  22
Common  I     Tuesday    16-18  IMADA semi  17
Common  I     Thursday   16-18  U144        14
Common  I     Thursday   16-18  IMADA semi  15-16,21-22
Common  I     Friday     13-15  U142        14
Common  I     Friday     13-15  IMADA semi  15-16,21
H1      TE    Wednesday  12-14  IMADA semi  14-17,21-22

Comment:
Unlimited number of participants

Prerequisites:
None

Academic preconditions:
Students taking the course are expected to:
  • Have basic knowledge of probability and mathematics;
  • Be able to program;
  • Be familiar with the basics of unsupervised data mining (e.g., from DM555 or from DM843; the latter can be taken in the same term as this course).


Course introduction
The aim of the course is to enable the student to understand and work with advanced unsupervised data mining methods, such as ensemble methods for clustering and outlier detection or methods dedicated to high-dimensional data (e.g., subspace clustering). Such methods are important for handling complex, difficult, and high-dimensional data in various applications.

The course builds on the knowledge acquired in the courses DM555 or DM843, and gives an academic basis for working on applied projects or writing a Master’s thesis on topics involving the unsupervised analysis of complex, difficult, and high-dimensional data.

In relation to the competence profile of the degree it is the explicit focus of the course to:

  • Give the competence to independently describe, analyse, and solve advanced problems in unsupervised data mining using the acquired models and methods.
  • Give the competence to analyse advantages and drawbacks of different methods for advanced unsupervised data mining.
  • Give skills to apply the acquired models and methods adequately.
  • Give knowledge and understanding of a selection of specialized models and methods for unsupervised data mining using ensemble techniques or adaptations to high-dimensional data, including some from the research frontier of the field.


Expected learning outcome
The learning objectives of the course are that the student demonstrates the ability to:
  • describe the data mining tasks presented during the course;
  • describe the algorithms and methods presented in the course;
  • describe the topics presented in the course in precise mathematical language;
  • explain the individual steps of mathematical derivations presented in class;
  • apply the methods to situations different from the ones presented in class;
  • reflect on and assess design choices for data mining methods for high-dimensional data and ensemble methods.

Subject overview
The following main topics are contained in the course:
  • general principles and methods for ensemble learning;
  • special challenges and approaches for ensemble clustering and ensemble outlier detection;
  • selected methods for ensemble clustering and ensemble outlier detection;
  • special challenges for data mining in high-dimensional data;
  • general approaches for unsupervised learning in high-dimensional data;
  • selected methods for subspace clustering;
  • selected methods for high-dimensional outlier detection.

Literature
To be announced at the start of the course.


Website
This course uses e-learn (Blackboard).

Prerequisites for participating in the exam
  1. Presentation of one or more scientific articles in class. Pass/fail, internal marking. The prerequisite examination must be passed in order to participate in exam element a). (15020112).

Assessment and marking:
  1. Oral exam. External marking, 7-mark scale. A more detailed description of the exam rules will be posted under 'Course Information' on Blackboard. (5 ECTS). (15020102).

Expected working hours
The teaching method is based on the three-phase model.
Intro phase: 24 hours
Skills training phase: 12 hours, of which:
 - Tutorials: 12 hours

Educational activities
  • Reading from textbooks and papers.
  • Solving homework.
  • Applying acquired knowledge in practical projects.

Educational form
In the intro phase, concepts, theories, and models are introduced and put into perspective. In the training phase, students train their skills through exercises and dig deeper into the subject matter. In the study phase, students gain academic, personal, and social experiences that consolidate and further develop their scientific proficiency. Focus is on immersion, understanding, and development of collaborative skills.

Language
This course is taught in Danish or English, depending on the lecturer. However, if international students participate, the teaching language will always be English.

Course enrolment
See the enrolment deadlines.

Tuition fees for single courses
See fees for single courses.