DM862: Big Data Systems (10 ECTS)

STADS: 15019901

Level
Bachelor course

Teaching period
The course is offered when needed.

Teacher responsible
Email: zhou@imada.sdu.dk

Timetable
There is no timetable available for the chosen semester.

Comment:
CANCELLED AUTUMN 2017

Prerequisites:
A Bachelor's degree in computer science, mathematics, applied mathematics, mathematics-economics, or a comparable field.

Academic preconditions:
Students taking the course are expected to:
  • be able to design and implement programs, using standard algorithmic approaches and data structures.
  • be able to judge the complexity of algorithms, with regard to runtime as well as with regard to space usage.


Course introduction
The goal of this course is to give the participants an understanding of the technologies used in systems for Big Data analysis and management. It covers traditional methods used in data warehouses and parallel database systems, real-time data-stream processing systems, and modern technologies for cloud computing and massively parallel data analysis platforms.

The course builds on the knowledge acquired in the courses DM507 Algorithms and Data Structures, DM505 Database Design and Programming, and DM532 Principles of Database Systems.
The course gives an academic basis for writing a Master's thesis in Big Data.

In relation to the competence profile of the degree it is the explicit focus of the course to:
  • Give expert knowledge in Big Data, which is based on the highest level of international research within Computer Science.
  • Describe, analyse, and solve advanced computer science problems using the models learned in the course.
  • Shed light on stated hypotheses with a qualified theoretical basis and be critical of both one's own and others' research results and scientific models.
  • Develop new variants of the learned methods, where the concrete problem requires it.
  • Disseminate research-based knowledge and discuss professional and scientific problems with both colleagues and non-specialists.
  • Plan and execute scientific projects of high standard, including managing work situations that are complex, unpredictable, and require novel solutions.
  • Take responsibility for one's own professional development and specialisation.


Expected learning outcome
The learning objectives of the course are that the student demonstrates the ability to:
  • Explain the techniques of data warehouse and parallel database systems
  • Explain the techniques of data stream processing
  • Account for theories behind massively parallel data analysis systems
  • Explain the design and trade-offs of the modern systems introduced in the course
  • Develop programs and apply tools for big data management and analysis and deploy them on a cloud computing platform
  • Report the work done in the assignments in clear and precise language and in a structured fashion.
Subject overview
The following main topics are contained in the course:
  • Data warehouse, parallel database systems, massively parallel data analysis, parallel fast data stream processing, parallel big graph data processing, scalable machine learning, fault-tolerance, load balancing, load shedding, dynamic scaling, data partitioning.
Literature
    Will be announced at the start of the course.


Website
This course uses e-learn (Blackboard).

Prerequisites for participating in the exam
  1. Presentation of selected scientific articles. Pass/fail, internal marking by the teacher. Prerequisite test a) is a prerequisite for participation in exam element a). (15019932).
  2. Short review of selected scientific articles. Pass/fail, internal marking by the teacher. Prerequisite test b) is a prerequisite for participation in exam element b). (15019922).
Assessment and marking:
  1. Oral exam. External marking, 7-mark scale. No exam aids allowed. (5 ECTS). (15019902).
  2. Project assignment. External marking, 7-mark scale. No exam aids allowed. (5 ECTS). (15019912).
A closer description of the exam rules will be posted under 'Course Information' on Blackboard.
 


Expected working hours
The teaching method is based on a three-phase model.
Intro phase: 22 hours
Skills training phase: 22 hours, of which:
 - Tutorials: 22 hours

Educational activities
  • Using the acquired knowledge in projects
  • Discussing the scientific articles/book chapters
Educational form
In the intro phase, concepts, theories and models are introduced and put into perspective. In the training phase, students train their skills through exercises and dig deeper into the subject matter. In the study phase, students gain academic, personal and social experiences that consolidate and further develop their scientific proficiency. Focus is on immersion, understanding, and development of collaborative skills.

Language
This course is taught in English.

Course enrolment
See the enrolment deadlines.

Tuition fees for single courses
See fees for single courses.