
Data Analysis and Visualization with Python (R-Module)

Exam number: 6765

Semester: from 1st semester

Duration of the module: One semester

Form of the module (i.e. obligatory, elective etc.): Elective

Frequency of module offer: Summer semester 2016

Prerequisites: Interest in learning programming and data analysis (statistical / marketing); preliminary knowledge of statistics. Programming experience is not required.

Applicability of module for other study programmes:
Obligatory or elective in other study programmes. For further information, check the regulations of the respective study programme.

Person responsible for module: Prof. Dr. Achim Koberstein

Name of the professor: Prof. Dr. Achim Koberstein

Language of teaching: English

ECTS-Credits (based on the workload): 6

Workload and its composition (self-study, contact time):
Contact time (Lecture, tutorial etc.): 60 h; self-study: 120 h

Contact hours (per week in semester): 4

Methods and duration of examination:
Submission of the home programming assignments; implementation and presentation of a working data analysis solution.

Emphasis of the grade for the final grade: Please check the regulations of the study programme

Aim of the module (expected learning outcomes and competencies to be acquired):
Participants learn basic programming concepts, using Python as the example language, in a data analysis context. Python is one of the most in-demand programming languages in scientific research and for highly qualified jobs in industry. The course consists of two blocks: (1) an introduction to programming using Python; (2) hands-on experience applying Python's data analysis capabilities to publicly available (big) data.

Contents of the module:
The first block follows a standard class format: lecture, tutorial, homework. In this part, students are introduced to programming, the Python language, and its state-of-the-art data analysis capabilities, including an overview of data analysis theory.
Outline of the 1st block:
1. Baby Basics, Data Types, Data Collections
2. Decision and Control Structures
3. Modular Programming
4. Data Storage and Processing
5. Statistics, Plotting and Visualization
6. Regression, Clustering
7. Getting Data from the Internet
In the second block, students receive analysis cases with clearly defined research aims. Working in small groups, they develop a solution using the knowledge acquired in the first block and perform the data analysis. Students will acquire the ability to develop software in teams, apply specialized analysis techniques, and select appropriate programming methods to solve business tasks.

Teaching and learning methods:
Lectures on programming are accompanied by tutorials and homework assignments. Students are expected to solve the exercises assigned as homework. Students will work in small groups to develop, implement, and present working data analysis solutions.

Literature (compulsory reading, recommended literature):
Swaroop, C. H.: A Byte of Python. Available at: https://python.swaroopch.com/
Zelle, John M.: Python Programming: An Introduction to Computer Science, 2nd Ed., Franklin, Beedle & Associates Inc. 2010.
Grus, Joel: Data Science from Scratch: First Principles with Python, O'Reilly Media 2015.
McKinney, Wes: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, O'Reilly Media 2012.

Further information:
Registration in Moodle Viadrina required.