Università degli studi dell'Insubria


Academic year of delivery: 2019/2020
Elective course

Master's Degree in Computer Science
(Academic year 2019/2020)


Year of course: 
Course type: 
Disciplinary sector: 
Second semester
Hours of in-class teaching: 
Breakdown of hours: 
Lectures (40 hours), Laboratory (16 hours)

This course completes the presentation of data science topics, with a focus on business applications. Building on the topics covered in the Intelligent Systems and Data Mining courses, it provides methods and techniques for implementing projects suitable for production environments.
The course connects the theoretical aspects of data science with the most relevant technologies for manipulating, managing and visualizing data.
Students will learn data analysis by working with datasets available online. Throughout the course, predictive experiments (machine learning and deep learning) will be demonstrated to meet the requirements of the use cases.
The learning objectives and expected results of this course are:
Define an architecture that supports the development of a data science project, taking into account data volume, velocity and availability, as well as computing power, implementation and maintainability requirements.
Select suitable machine learning and deep learning methods for solving the proposed problems.
Analyze, visualize and meaningfully interpret the results obtained with the chosen methods.
Implement simple projects to gain hands-on experience with data analysis methods and techniques.

Students are expected to be familiar with the topics covered in the Intelligent Systems and Data Mining courses.
Knowledge of at least one programming language is helpful. Students are also advised to bring a laptop.

1. How a data science project works
1.a. Tools and cloud
1.b. How a project works
1.c. Project examples and use cases
2. Data transformation and load
2.a. Data manipulation
2.b. ETL concepts
2.c. Data quality
3. Data analysis
3.a. Feature selection and class imbalance
3.b. Data analysis workflow
3.c. How to contextualize and perform effective analyses
4. Machine learning and deep learning
4.a. Classification
4.b. Clustering
4.c. Feedforward networks
4.d. Autoencoders and Word2Vec
5. Presentation
5.a. How to effectively present data
5.b. Data representation with Business Intelligence tools

The 40 hours of lectures alternate theory and practice sessions. The 16 hours of laboratory consist of individual presentations and projects on selected datasets.