Arbeitsgruppe Informationsmanagement

Data Science

Summer semester 2017


VAK: 03-BE-802.98a

Time: weekly, Wed 10:00 - 12:00

Room: MZH 1110

CP: 4



From medical decision support systems to automatic language translation, from sorting and prioritizing news on social networks to autonomous cars: machine learning is woven into the fabric of daily life. Data science applies machine learning to extract knowledge and insights from data.


This class provides an introduction to data science and applied machine learning, using (and teaching) the programming language Python. You will learn about the difference between supervised and unsupervised machine learning, and about three machine learning tasks:

1. classification (e.g. k-NN, Decision Trees, Support Vector Machines)

2. regression (e.g. Linear Regression, Logistic Regression)

3. clustering (e.g. k-means, dimensionality reduction with PCA and t-SNE)
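
To give a first impression of the classification task named in the list above, here is a minimal sketch of k-NN in plain Python. The function name `knn_predict` and the toy data are made up for illustration and are not course material; the class itself may use different libraries and datasets.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Order the training indices by distance to the query and keep the k closest.
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: dist(train_points[i], query),
    )[:k]
    # Count the labels of those neighbours and return the most frequent one.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of 2D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))  # prints "a"
print(knn_predict(points, labels, (5.1, 5.0)))  # prints "b"
```

This is supervised learning because the labels are given in advance; a clustering method such as k-means would instead have to discover the two groups without them.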

We will explore natural language processing for text mining, as well as computer vision. Evaluation, an integral part of data science, will be taught alongside data processing and data mining. To communicate our findings, we will also look at different visualization techniques.


During this course, you will work in small groups on independent projects. Each group will have to:

* formulate a research question

* pick and potentially collect a dataset

* pick a suitable operationalization and method

* find and justify the best machine learning model

* describe its approach and findings in a report


Basic programming experience is required to succeed in this course. However, the course is mostly about concepts and is aimed at anyone who wants to learn more about data science.


Taking this course is recommended as preparation for participating in the bachelor project by Stauke and Heuer in the winter semester 2017/18.


Hendrik Heuer