Courses

The broad diversity of the labs and institutions involved in the Data Science Ph.D. program is reflected in the list of courses available.
The list below contains only the suggested courses; students can also select their courses from the even broader range available across all the institutions involved.

Advanced Methods for Complex Systems I

Provided by: IMT
Location: IMT Lucca
Lecturers: Diego GARLASCHELLI
Hours: 20
Semester: 2
Timetable:
Educational Goals:
This interdisciplinary course aims at introducing rigorous tools from statistical physics, information theory and probability theory to investigate real-world complex systems arising in different fields of research. First, some key aspects of complexity encountered in physical, biological, social, economic and technological systems will be reviewed. Then, emphasis will be put on the construction of theoretical models based on the concept of constrained randomness, i.e. the maximisation of the entropy subject to suitable constraints. This will lead to the introduction of maximum-entropy models that serve as mathematical benchmarks for the properties of highly heterogeneous complex systems. Special cases of interest include statistical ensembles of complex networks and of multivariate time-series with given properties. Comparisons between model outcomes and empirical properties will be presented systematically. Full mathematical derivations of the models, as well as methods of statistical inference, model selection and computer codes for parameter estimation on empirical data will be provided. The course will include a combination of recent and ongoing research in the NETWORKS unit at IMT Lucca, thereby offering directions for possible PhD projects in this area.
Prerequisites:
Solid mathematical background, scientific curiosity, interest in multidisciplinarity, passion for theory.
Programme:

Advanced Methods for Complex Systems II

Provided by: IMT
Location:
Lecturers: Diego GARLASCHELLI
Hours: 20
Semester: 2
Timetable:
Educational Goals:
The second part of the course "Advanced Methods for Complex Systems" focuses on advanced practical applications of the concepts introduced in the first part. In particular, emphasis will be put on the successful areas of pattern detection and network reconstruction from partial information. Network pattern detection is the identification of robust empirical patterns (like scale invariance, clustering, assortativity, reciprocity, motifs, etc.) that are widespread across real-world networks and that deviate systematically from some null hypothesis formalised in terms of a suitable random graph model. The models introduced in part I will then be used here for pattern detection purposes. The problem of community detection will also be covered, with an emphasis on the differences between finding communities in network data and in correlation matrices constructed from (e.g. financial or neural) time series databases. The problem of network reconstruction from partial topological information will be addressed concentrating on the reconstruction of financial and interbank networks from node-specific properties, with the purpose of improving stress tests and systemic risk estimates in real markets and offering better tools to policy makers. The statistical physics methods recently found by central banks to be the best-performing reconstruction techniques will be reviewed in detail. The course will include a combination of recent and ongoing research in the NETWORKS unit at IMT Lucca, thereby offering directions for possible PhD projects in this area.
Prerequisites:
Solid mathematical background, scientific curiosity, interest in multidisciplinarity, successful completion of the course "Advanced Methods for Complex Systems I". Note: completion of this second part of the course is not required in order to move on to the third part (parts II and III can be understood in parallel independently of each other, after part I is completed), although it would surely provide a useful overview of practical motivations for part III.
Programme:

Advanced Methods for Complex Systems III

Provided by: IMT
Location: IMT Lucca
Lecturers: Tiziano SQUARTINI
Hours: 20
Semester: 2
Timetable:
Educational Goals:
The third part of the course "Advanced Methods for Complex Systems" focuses heavily on deeper theoretical aspects and their consequences. Particular emphasis will be put on the distinction between maximum-entropy models of complex systems with "soft" and "hard" properties. In statistical physics, the resulting models are known as the "canonical" and "microcanonical" ensembles respectively. Many of the results in statistical physics (e.g. the calculation of certain entropies), discrete mathematics (e.g. the combinatorial enumeration of possible configurations of a system with given properties), and information theory (e.g. the calculation of the maximum compressibility of information sequences) rely on the concept of "ensemble equivalence", i.e. the asymptotic equivalence of soft and hard ensembles in the large size limit. Surprisingly, various complex systems have been found to violate the property of ensemble equivalence. For these systems, the standard approach is not appropriate and new developments are needed. Several intriguing challenges open up, including the uniform sampling of realisations of large complex systems, the combinatorial enumeration of systems with heterogeneous constraints and the recalculation of traditional information-theoretic bounds on communication. Examples of these open challenges will be provided, along with tentative solutions that are underway. The course will include a combination of recent and ongoing research in the NETWORKS unit at IMT Lucca, thereby offering directions for possible PhD projects in this area.
Prerequisites:
Unlimited passion for theory and multidisciplinarity, successful completion of the course "Advanced Methods for Complex Systems I". Note: knowledge of the content of the course "Advanced Methods for Complex Systems II" is not required (parts II and III can be understood in parallel independently of each other, after part I is completed), although it would surely provide a useful overview of practical motivations for this part.
Programme:

Advanced Neurogenomics

Provided by: SNS
Location: SNS
Lecturers: Alessandro CELLERINO
Hours: 55
Semester: 1
Timetable:
Educational Goals:
The goals of the course are: 1) to train students to read critically articles that use "omics" techniques, understanding the details of both the "wet" part and the data analysis; 2) to show how these studies have led to important conceptual advances in the understanding of the functional organisation of the nervous system; 3) to enable students to plan an "omics" experiment independently, to perform a first analysis of the data and to interact with bioinformaticians for more complex analyses.
Prerequisites:
Programme: A primer on regulation of gene expression and statistical considerations. RNA-seq: how it works. Sequence quality control and mapping algorithms. Detection of differential gene expression. Clustering and other methods to reduce dimensionality. Gene Ontology and Gene set enrichment. Genome-wide studies of age-dependent gene expression in the brain. Network analysis. Applications of network analysis to the brain. Cell-specific transcriptome analysis in the brain. Single-cell RNA-seq. Analysis of synaptic RNAs. Analysis of microRNAs and RNA immunoprecipitation. Epigenetic analysis: techniques for DNA methylation. Epigenetic analysis: ChIP-seq. Proteome analysis by mass-spectroscopy. Analysis of synaptic proteome. Analysis of nascent proteins and RNAs. Expression QTLs. Microbiome.

Advanced Topics in Network Theory

Provided by: IMT
Location: IMT Lucca
Lecturers: Guido CALDARELLI, Angelo FACCHINI, Tommaso GILI, Rossana MASTRANDREA, Fabio SARACCO, Tiziano SQUARTINI
Hours: 60
Semester: 1
Timetable:
Educational Goals:
The course aims at providing an overview of methods to analyse complex networks.
Prerequisites:
Solid mathematical background, scientific curiosity, interest in multidisciplinarity, passion for theory.
Programme: The course is divided into 6 modules: I - Introduction to Complex Networks (Introduction to graph theory. Properties of complex networks), II - Algebraic Concepts in Network Theory (Algebraic graph theory. Bipartite and multilayer network representations. Spectral properties of graphs), III - Topological Concepts in Network Theory (Centrality measures. Mesoscale structures detection. Ranking and reputation algorithms), IV - Dynamical Models in Network Theory (Master equations. Models of growing networks. Continuous description. Epidemics. Scaling and Percolation on networks. Contagion in financial networks. Game theory), V - Brain Networks (The fMRI technique. Correlation matrices. Random Matrix Theory), VI - Research Topics in Network Theory (Discussion of pivotal articles in network theory).

Agent Based Macroeconomics

Provided by: S.ANNA
Location:
Lecturers: Andrea ROVENTINI
Hours: 18
Semester: 2
Timetable:
Educational Goals:
The course provides an overview of agent-based macroeconomics. After a discussion of the limits of standard Dynamic Stochastic General Equilibrium (DSGE) models, the agent-based computational economic approach is presented, stressing how it can be employed for economic policy. Different agent-based models are then introduced (i) to stress the role of agent heterogeneity and interactions for the emergence of endogenous growth; (ii) to study how the economy self-organizes after business-cycle shocks; (iii) to analyse the joint impact of monetary and macro-prudential policies; (iv) to assess how endogenous growth, catching-up and divergence emerge in an open-economy, multi-country framework. Finally, the family of "Schumpeter meeting Keynes" (K+S) models is presented, stressing its capability to jointly account for endogenous growth and fluctuations and for macro and micro stylized facts. The K+S model is employed as a laboratory to study the short- and long-run impact of different ensembles of innovation, industrial, fiscal, monetary, labour-market and climate policies. Syllabus: (1) From DSGE models to macroeconomic ABMs. (2) Exploring the role of heterogeneity and interactions with "simple" macro ABMs. (3) Topics in agent-based macroeconomics. (4) The family of Schumpeter meeting Keynes models.
Prerequisites:
Programme:

Agent Based Modeling

Provided by: S.ANNA
Location:
Lecturers: TBD, contact the Institute of Economics at Scuola Sant’Anna
Hours: 18
Semester: 2
Timetable:
Educational Goals:
This course is intended to serve as a broad introduction to the huge literature using agent-based computational approaches to the study of economic dynamics. It is organized in three parts. The first one ("Why?") will discuss the roots of the critiques of the mainstream paradigm from a methodological, empirical and experimental perspective. We shall briefly review the building blocks of mainstream models (rationality, equilibrium, interactions, etc.) and briefly present some of the evidence coming from cognitive psychology and experimental economics, network theory and empirical studies, supporting the idea that bounded rationality, non-trivial interactions, non-equilibrium dynamics, heterogeneity, etc. are irreducible features of modern economies. In the second part ("What?") we shall discuss what ACE is and what its main tools of analysis are. We will define an ABM and present many examples of classes of ABMs, from the simplest (cellular automata, evolutionary games) to the most complicated ones (micro-founded macro models). The third part ("How?") aims at understanding how ABMs can be designed, implemented and statistically analyzed. We shall briefly present the basics of programming, by both discussing the pros and cons of using simulation platforms (Matlab, NetLogo, Swarm, LSD, etc.) vs. computer languages (Java, C++, etc.) and providing some simple "hands-on" applications to cellular automata. Finally, we will see how the outputs of ABM simulations should be treated from a statistical point of view (e.g., Monte Carlo techniques) and we will discuss two hot topics in ABM research: empirical validation and policy analysis. Syllabus: Part I. Why? Why Agent-Based Computational Economics (ACE) and Agent-Based Models (ABMs)? Empirical and theoretical underpinnings. Part II. What? The structure of ABMs; Flexibility of ABMs; Examples. Part III. How? Designing and implementing ABMs; Statistical analysis of ABMs; Applications. Part IV. Selected Topics in ACE; Empirical validation; Macroeconomic policy; Object-oriented programming.
Prerequisites:
Programme:

Algorithm Accountability

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giovanni COMANDE, G. MALGIERI
Hours: 10
Semester: 2
Timetable:
Educational Goals:
Prerequisites:
Programme: Defining and Justifying Accountability. Accountability in the Machine Learning Context. Algorithm Transparency. Transparency and the Marketplace / Competition Law. Methods of Transparency. Technical and Legal Options to Enhance Transparency & Accountability. People Analytics. Behavioural "Nudging". New Emerging Human Rights in the age of Behavioral Data Science and Neurotechnologies: Towards "Mental Privacy" and "Decision Integrity". Legal and ethical implications of computational capacity. Machine Learning. GDPR Solutions: The Right to an Explanation, etc.

Analytics in Economics and Management

Provided by: IMT
Location: IMT
Lecturers: Massimo RICCABONI
Hours: 20
Semester: 2
Timetable:
Educational Goals:
The aim of this course is to teach students how to produce a research paper in economics and management using hands-on supervised machine learning tools for different data structures. We will bridge the gap between applications of methods in published papers and practical lessons for producing your own research. After introductions to up-to-date illustrative contributions to the literature, students will be asked to perform their own analyses on microdata provided during the course and to comment on the results. The objective is to develop a critical understanding of the iterative research process leading from real economic data to the choice of the best tools available in the analyst's toolkit. Final scores will be based 50% on individual presentations of a selected supplemental reading and 50% on an individual homework.
Prerequisites:
Programme:

Basic Principles and Applications of Brain Imaging Methodologies to Neuroscience

Provided by: IMT
Location:
Lecturers: Emiliano RICCIARDI, Monica BETTA, Simone ROSSI, Luca CECCHETTI
Hours: 64
Semester:
Timetable:
Educational Goals:
The course aims at introducing the fundamentals of brain metabolism and brain imaging methodologies. Neuroimaging techniques have provided cognitive and social neuroscience with an unprecedented tool to investigate the neural correlates of behavior and mental functions. Here we will review the basic principles and the research and clinical applications of positron emission tomography (PET), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and non-invasive brain stimulation tools. The concepts common to many types of neuroimaging, ranging from study design to data processing and interpretation, will be discussed to give students a solid background for addressing neuroscientific questions. In particular, we will first review the basics of neurophysiology to understand the principles of brain imaging. Then, methodologies of data processing for the main brain imaging tools will be provided to the students with hands-on sessions: students will become familiar with the main pipelines for PET, fMRI and EEG data reconstruction, realignment, spatio-temporal normalization, and first- and second-level analyses. At the end of the course, students are expected to have general background knowledge of the basic principles, methodologies and applications of the most important functional brain imaging techniques and to be prepared to evaluate the applicability of, and the results provided by, these methodologies for different problems in cognitive and clinical neuroscience.
Prerequisites:
Programme:

Big data analytics

Provided by: UNIPI, CNR
Location: Polo Fibonacci, University of Pisa
Lecturers: Fosca GIANNOTTI
Hours: 48
Semester: 1
Timetable:
Educational Goals:
In our digital society, every human activity is mediated by information technologies. Therefore, every activity leaves digital traces behind that can be stored in some repository: phone call records, transaction records, web search logs, movement trajectories, social media texts and tweets. Every minute, an avalanche of "big data" is produced by humans, consciously or not, that represents a novel, accurate digital proxy of social activities at global scale. Big data provide an unprecedented "social microscope", a novel opportunity to understand the complexity of our societies, and a paradigm shift for the social sciences. The objective of the course is twofold: an introduction to the emergent field of big data analytics and social mining, aimed at acquiring and analyzing big data from multiple sources in order to discover the patterns and models of human behavior that explain social phenomena; and an introduction to the technological scenario of scalable analytics.
Prerequisites:
The students are expected to be familiar with key management (financial & managerial accounting, cash flow analysis, org design, business processes) and strategy (Porter's models, innovation management basics) concepts before starting the course. For management engineering students, the course is highly recommended at 2nd year of the MSc degree (useful complement for PSSP). Recommended for students wishing to apply for Junior Consulting projects. For data science & computer science students, EGI course is recommended.
Programme: Module 1: Foundations of competitive intelligence - Systems thinking for management - CI process and Key Intelligence Topics - Sources and collection techniques - Organizing CI in the companies Module 2: Competitor and Market intelligence tools - Competitive benchmarking (to assess competitive cost of operations, to analyze the true capabilities of a rival, as well as its immediate future actions) - Blindspots - Business ecosystems, platforms and business model innovation Module 3: Corporate foresight tools - Technology intelligence tools (i.e. patent analysis) - Scenario analysis tools and techniques - Weak signals and early warning system

Big Data and healthcare

Provided by: S.ANNA
Location:
Lecturers: Giuseppe TURCHETTI
Hours: 10
Semester: 2
Timetable:
Educational Goals:
The main goal of the course is to analyse the potential of Big Data in the healthcare sector, with reference both to the research and development activities of biomedical companies (pharma, medical devices, biotech) and to the health and economic sustainability objectives of national healthcare systems. Particular attention will also be devoted to exploring and defining service and business models that, thanks to Big Data, foster greater engagement of the patient, and of the citizen in general, with respect to therapy adherence, active control (empowerment) and management (self-management) of their own disease or wellness, also from a prevention perspective. Finally, practical exercises will be carried out, aimed at designing new organisational and interaction models among the different stakeholders in the healthcare sector (citizens, healthcare system actors, biomedical industry, payers, etc.) and models for estimating the health and economic impact of using Big Data in healthcare.
Prerequisites:
Programme:

Big Data Ethics

Provided by: UNIPI, S.ANNA
Location: Computer Science Department, University of Pisa
Lecturers: Anna MONREALE, Salvatore RUGGIERI, Giovanni COMANDE'
Hours: 22
Semester: 2
Timetable:
TBA
Educational Goals:
The module aims to introduce ethical and legal notions of privacy, anonymity, transparency and non-discrimination, also referring to the Directives and Regulations of the European Union and their ongoing evolution. The module will show technologies for Privacy-by-Design, for predictive model auditing and for protecting users' rights, which allow the analysis of Big Data without harming the right to the protection of personal data, to transparency and to fair treatment.
Prerequisites:
Programme:

Big data in/for/from the Public Sector

Provided by: S.ANNA
Location:
Lecturers: Francesca BIONDI, Fabio PACINI
Hours: 20
Semester: 2
Timetable:
Educational Goals:
The aim of the course is to provide a general understanding of Public and Constitutional law to the students, who will get a conceptual framework of the processes through which each public policy is constructed, implemented and eventually evaluated by political and administrative institutions.
Prerequisites:
Programme: The first module (10 hrs.) will focus on the current and future implications of the use of Big Data within the public decision-making across the different policy areas. In order to obtain a necessary alignment of competences for students coming from different backgrounds, the first part of the module will be devoted to a concise description of political decision-making from the point of view of Public and Constitutional law, both in Italy and abroad. The course will then deal with the analysis of the current and possible uses of Big Data within these public decision-making processes, as well as the ethical and political issues that it raises. The second module (10 hrs.) will focus, with an interactive approach, on the opportunities that can be given by the exploitation of Big Data produced and made available by political and administrative institutions.

Big data sources, crowdsourcing, crowdsensing

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Maurizio TESCONI
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
This module presents techniques and methods for the acquisition of Big Data from the large variety of data sources available, including mobile phone data, GPS data, customer purchase data, social network data, open and administrative data, environmental and personal sensor data. We also discuss several participatory methods for collecting data through crowdsourcing or crowdsensing, using ad hoc campaigns such as serious games and viral diffusion.
Prerequisites:
Programme:

Bioinformatics

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Nadia PISANTI
Hours: 48
Semester: 2
Timetable:
Educational Goals:
The goal of this course is to give the student an overview of algorithmic methods that have been conceived for the analysis of genomic sequences. We will focus both on theoretical and combinatorial aspects and on practical issues such as whole-genome sequencing, sequence alignment, the search for patterns in biological sequences, the inference of repeated patterns and of long approximate repetitions, the computation of genomic distances, and several biologically relevant problems for the management and investigation of genomic data.
Prerequisites:
A basic course on algorithms.
Programme: A brief introduction to molecular biology: DNA, proteins, the cell, the synthesis of a protein. Sequence Alignments: Dynamic Programming methods for local, global, and semi-local alignments. Computing the Longest Common Subsequences. Multiple Alignments. Pattern Matching: Exact Pattern Matching: algorithms (Knuth-)Morris-Pratt, Boyer-Moore, Karp-Rabin with preprocessing of the pattern. Algorithm with preprocessing of the text: use of indexes. Motifs Extraction: KMR Algorithm for the extraction of exact motifs and its modifications for the inference of approximate motifs. Finding Repetitions: Algorithms for the inference of long approximate repetitions. Filters for preprocessing. Fragment Assembly: Genome sequencing: some history, scientific opportunities, and practical problems. Some possible approaches for the problem of assembling sequenced fragments. Link with the “Shortest common superstring” problem, the Greedy solution. Data structures for representing and searching sequencing data. New Generation Sequencing: Applications of High Throughput Sequencing and its algorithmic problems and challenges. Investigating data types resulting from the existing biotechnologies, and the possible data structures and algorithms for their storage and analysis.

Bioinformatics

Provided by: SNS
Location: SNS
Lecturers: Francesco RAIMONDI
Hours: 40
Semester: 2
Timetable:
Educational Goals:
The aim of the course is to provide students with the basic knowledge of bioinformatics techniques as an easy and friendly support for their study and research careers. This will entail: 1) theory of the most common bioinformatics algorithms and resources: what they are, what they do and why they are so important and increasingly used in modern biology research; 2) basic practical experience through hands-on sessions on typical problems that can be answered by using popular online tools.
Prerequisites:
Programme: Introduction to bioinformatics. Biological databases. Pairwise sequence alignments. Basic Local Alignment Search Tool (BLAST). Multiple sequence alignment. Protein analysis and proteomics. Introduction to protein structure. Introduction to nucleic acids (RNA) structure.

Cloud Computing & Big-Data

Provided by: S.ANNA
Location:
Lecturers: Tommaso CUCINOTTA
Hours: 30
Semester: 1
Timetable:
Educational Goals:
This course provides an overview of the challenges to face, and the technical solutions to embrace, when building large-scale, fault-tolerant, distributed and replicated real-time cloud services. These systems need to be capable of serving millions/billions of requests per second with industrial-grade reliability, availability and performance, and are composed of thousands of components spanning across millions of machines, worldwide. The course focuses on design, development and operations of scalable software systems, including big-data processing and analytics, where the huge volumes of data to handle mandate the use of heavily distributed algorithms. The course also covers basic concepts on architectures of data-centre/cloud infrastructures.
Prerequisites:
Computer architectures and networks
Programme:

Cloud Computing & Big-Data Lab

Provided by: S.ANNA
Location:
Lecturers: Tommaso CUCINOTTA
Hours: 30
Semester: 2
Timetable:
Educational Goals:
Hands-on follow-up to the Cloud Computing & Big-Data course. This is an applied course where students will put into practice the theoretical/abstract concepts acquired in the general course on Cloud Computing & Big-Data. During the practical sessions, we'll take a deep dive into such concepts as: machine virtualization (KVM) and OS-level virtualization (LXC) on Linux; virtual networking on Linux; network programming and distributed RPC frameworks; programming abstractions for cloud and distributed computing; elasticity in practice; big-data programming frameworks; Hadoop Map-Reduce; Apache Spark. Requisites: (1) Cloud Computing & Big-Data course (both of 2016 courses on Cloud Computing and Big-Data & Analytics work) (2) Computer programming & scripting.
Prerequisites:
Computer architectures and networks
Programme:

Complements of Bioinformatics

Provided by: S.ANNA
Location:
Lecturers: Andrea ZUCCOLO
Hours: 20
Semester: 1
Timetable:
Educational Goals:
Prerequisites:
Programme: What is bioinformatics: a historical perspective from 1953 up to next-gen sequencing. Similarity searches: intro, basics, global vs local, dynamic programming, dot-plot, FASTA, BLAST scoring matrix; pattern discovery. Following the journey of a genome, from reads to assembly to annotation to comparative genomics: overall sequencing strategy (de novo vs reference based); technical differences (sequencers, strategies: Hierarchical-BAC based, Whole Genome Shotgun, mixed strategies etc.); genome assemblers (greedy graph, OLC, de Bruijn graph based); the "validation" and "quality assessment" of a genome assembly; annotation (genes, pseudogenes, promoters, transcription factors, repeats and Transposable Elements); comparisons (SNPs, rearrangements, epigenomic, RNA-seq).

Complex Networks for Data Science

Provided by: IMT
Location: IMT Lucca
Lecturers: Guido CALDARELLI, T. SQUARTINI, G. CIMINI
Hours: 40
Semester: 2
Timetable:
Educational Goals:
The course aims at providing an overview of methods to analyse complex networks.
Prerequisites:
Solid mathematical background, scientific curiosity, interest in multidisciplinarity, passion for theory.
Programme: Part I - Introduction to Complex Networks (Graph Theory Introduction. Properties of Complex Networks. Community Detection. Ranking Algorithms. Static Models of Graphs. Dynamical Models of Graphs. Fitness Models. Financial Networks). Part II - Algorithms and Applications (Centrality Measures. Spectral Properties of Graphs. Community Detection. Bipartite Networks. Ranking and Reputation Algorithms. Trade Network Datasets. Multilayer Networks. Infrastructural Networks). Part III - Statistical Mechanics of Networks (Complex Networks Randomization: A Primer. Basics of Information Theory. The Exponential Random Graphs Framework: From Zero to Shannon. The Maximum-Likelihood Recipe for Parameters Estimation. Hypothesis Testing on Networks: Pattern Detection, Network Filtering, Network Projection. The Dutch Interbank Network Case-Study. Network Reconstruction: A Survey of Existing Methods. Network Reconstruction: Moving Towards Entropy-Based Recipes. The World Trade Web Case-Study. International Economic Networks: The Interplay between Trade, Finance, Production and Migrations). Part IV - Dynamical Processes on Networks (Master Equations, Models of Growing Networks - Continuous Description. Epidemics. Scaling and Percolation on Networks. Contagion in Financial Networks. Game Theory).

Computational Life and Material Sciences

Provided by: SNS
Location: SNS
Lecturers: Giuseppe BRANCATO
Hours: 50
Semester:
Timetable:
Educational Goals:
- Providing a basic theoretical ground for the comprehension of molecular modeling techniques currently used in the field of life and material sciences
- Developing competences on some of the most common computational methodologies used in molecular sciences
- Developing computational skills through tutorials and exercises
- Stimulating the students to study a scientific problem of their interest suitable to be treated with molecular modeling methodologies
Prerequisites:
Programme: The aim of the course is to provide an overview of the theories and methodologies currently used in various fields of computational molecular sciences, ranging from biomedical sciences to material sciences. A special focus will be devoted to those models and algorithms related to molecular simulation techniques, including enhanced sampling and free energy methods. Such models will be illustrated along with relevant examples taken from recent literature and concerning different molecular modeling applications.

Cultural Heritage and Law

Provided by: IMT
Location:
Lecturers: Lorenzo CASINI
Hours: 60
Semester:
Timetable:
Educational Goals:
International Law, EU Law, and Domestic Law on Cultural Heritage. Basic elements of comparative law. Definition of Cultural Heritage. The institution of protection of cultural heritage in Italy. Fundamental principles and main public interests: protection, circulation, access. Problems and cases (Case law). European Landscape Convention and Domestic Law on Landscape. Basic elements of comparative law. Principles and main issues: definition of landscape; levels of governance; public law instruments. Problems and cases (Case law).
Prerequisites:
Programme:

Data Driven Innovation

Provided by: UNIPI, S.ANNA
Location: Computer Science Department, University of Pisa
Lecturers: Alberto DI MININ, Andrea PICCALUGA
Hours: 12
Semester: 2
Timetable:
TBA
Educational Goals:
The module aims to show the main characteristics of innovation processes in companies and institutions. After some basics of innovation economics, the management of innovation processes will be presented (role of R&D, Open Innovation, etc.). The module also shows the new innovation opportunities opened up by recent progress in large-scale data acquisition and processing, as well as the basics of business models and start-ups. An exercise in business model innovation will explore the potential of big data in opening new business opportunities.
Prerequisites:
Programme:

Data Management for Business Intelligence

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Salvatore RUGGIERI
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
The module shows technologies and systems for accessing, managing and analysing Big Data for decision support. Technologies and analysis of problems are shown using examples and case studies in lab. The student will acquire skills on the main technologies for business intelligence and big data management, including data warehouse and online analytical processing technologies.
Prerequisites:
Programme:

Data Mining

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Dino PEDRESCHI
Hours: 96
Semester:
Timetable:
Educational Goals:
The formidable advances in computing power, data acquisition, data storage and connectivity have created unprecedented amounts of data. Data mining, i.e., the science of extracting knowledge from these masses of data, has therefore established itself as an interdisciplinary branch of computer science. Data mining techniques have been applied to many industrial, scientific, and social problems, and are expected to have an ever deeper impact on society. The course objective is to provide an introduction to the basic concepts of data mining and the process of extracting knowledge, with insights into analytical models and the most common algorithms.
Prerequisites:
Programme:
- Mining of time series and spatio-temporal data
- Mining of sequential data and graphs
- Advanced techniques for classification, clustering and outlier detection
- Languages, standards and architectures of data mining systems
- Social impact of data mining
- Data mining and privacy protection
- Case studies in socio-economic domains (marketing and CRM, mobility and transport, public health, etc.)

Data Mining & Machine Learning Fundamentals

Provided by: CNR
Location: Officine Garibaldi
Lecturers: Mirco NANNI
Hours: 20
Semester: 1
Timetable:
https://datasciencephd.eu/courses/DataMiningMachineLearningFundamentals
Educational Goals:
Provide basics in Data Mining and Machine Learning, including usage of standard Python libraries.
Prerequisites:
Students are expected to have a basic knowledge of Python programming or to be also attending the course on "Programming for data science". They are also required to bring their laptop with a working installation of the Anaconda Python distribution (https://www.anaconda.com/distribution/).
Programme: This course provides a primer in data mining and machine learning, mainly focusing on the following topics:
- Data understanding and preprocessing
- Clustering (K-means, hierarchical methods, DBSCAN)
- Classification (decision trees, KNN, elements of SVM and neural networks)
- Frequent patterns (frequent itemsets, elements of sequential patterns, and time series motifs)
The classes will include hands-on exercises using Python and standard data mining/machine learning libraries.
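To give a flavour of the clustering topic listed above, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not taken from the course material: the function name `kmeans`, the toy points and the choice of the first k points as initial centroids are assumptions made only for this illustration, and in class a standard library implementation would more likely be used.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption for this sketch: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0.0, 0.0), (5.0, 5.0), (0.0, 1.0), (1.0, 0.0), (5.0, 6.0), (6.0, 5.0)]
centroids, labels = kmeans(toy_points, k=2)
```

The two steps inside the loop (assignment and update) are the whole algorithm; the result depends on the initial centroids, which is why real implementations usually try several initialisations.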

Data Mining and Machine Learning

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Dino PEDRESCHI, Fosca GIANNOTTI
Hours: 40
Semester: 2
Timetable:
TBA
Educational Goals:
The module provides an introduction to the basic concepts of data mining and the knowledge extraction process, introducing analytical models and algorithms for clustering, classification and pattern discovery, also with reference to Big Data sources.
Prerequisites:
Programme:

Data Protection, Privacy, Ethics and Discrimination

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giovanni COMANDE’
Hours: 10
Semester:
Timetable:
Educational Goals:
Prerequisites:
Programme: GDPR-compliant data mining and AI development. Algorithms' regulation (developer's liability, ethico-legal borders). Discrimination and other legal protection rules (e.g. consumer protection, unfair business practices, competition). From causality to probability: ethical and legal implications of the paradigm shift. The Scoring Society and Discrimination: Business models protection and criticalities for the law. Price discrimination, credit scoring and scoring in employment. Private vs. Public Use of Health Data: Scope, Limits, Interferences. Surveillance: European and North-American.

Data Science Colloquium

Provided by: ALL
Location:
Lecturers: Dino PEDRESCHI (coord.)
Hours: 40
Semester:
Timetable:
Educational Goals:
Prerequisites:
Programme:

Data Visualization and Data Journalism

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Salvatore RINZIVILLO, Luca DE BIASE, Andrea MARCHETTI
Hours: 34
Semester: 2
Timetable:
TBA
Educational Goals:
The module aims at preparing students for the appropriate presentation of data, and of the knowledge extracted from them, through visualization tools and narratives that exploit multimedia. The module first presents the basic visualization techniques for the effective presentation of information from several different sources: structured data (relational, hierarchies, trees), relational data (social networks), temporal data, spatial data and spatio-temporal data. Then, it also presents the most significant recent experiences in journalism and storytelling based on quantitative information extracted from various data sources.
Prerequisites:
Programme:

Ethics and legal dimensions of data science

Provided by: S.ANNA
Location: Sant'Anna
Lecturers: Giovanni COMANDE' (coord.)
Hours: 10
Semester: 2
Timetable:
Educational Goals:
The course introduces the candidates to the main ethical and legal issues related to data science, algorithm regulation, the application of Machine Learning techniques and the production of AI. It focuses, among other things, on principles such as accountability, transparency, data protection by design/default, bias and discrimination prevention, and on their impact on data science practice. It also covers the legal and ethical implications of computational capacity.
Prerequisites:
Programme:

European Statistical System and Data Production Model

Provided by: UNIPI
Location:
Lecturers: Monica PRATESI (coord.)
Hours: 48
Semester: 1
Timetable:
Educational Goals:
At the end of the course students will be able to deal with small area estimation at both the theoretical and the empirical level.
Prerequisites:
Programme: The course will be structured in two parts: i) European Statistical System; ii) Data Production Model. The first module is on the European Statistical System (3 ECTS) and focuses on: 1) code of practice; 2) peer review; 3) statistical burden; 4) privacy and confidentiality issues. At the end of this module, students should be familiar with official statistics and the ESS, its organization and activities, and should know the main aspects linked to official data collection processes and release procedures. The second module is on the Data Production Model (3 ECTS) and is structured in five parts: i) Official national surveys; ii) Data process and quality: attributes and measurement; iii) Quality matrix; iv) Metadata; v) Monitoring statistical processes. At the end of this module, students should know how statistical processes are structured and how to manage data quality in official statistics.

Advanced Genomics (Genomica avanzata)

Provided by: S.ANNA, UNIPI
Location:
Lecturers: Mario Enrico PE', Andrea ZUCCOLO
Hours: 64
Semester: 2
Timetable:
Educational Goals:
The course aims to provide basic knowledge of the structure, function and evolution of prokaryotic and eukaryotic genomes. The different methods used to study genomes will be considered, and we will discuss how the adoption of genomic approaches has changed the way biological problems are tackled. The course also introduces students to the use and understanding of the bioinformatics tools needed to manage and analyse data from sequencing experiments. Alongside the presentation of the techniques and the underlying algorithms, practical activities will be offered on real data from bacterial, animal and plant genomes.
Prerequisites:
Programme:

High Performance & Scalable Analytics, NO-SQL Big Data Platforms

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Roberto TRASARTI
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
The aim of this course is to introduce the student to high-performance Big Data management tools. The student will gain expertise in the use of NO-SQL platforms for the analysis and mining of large data volumes, thus performing tasks that would not be feasible with traditional databases.
Prerequisites:
Programme:

How to do research

Provided by: S.ANNA
Location: TeCIP Institute
Lecturers: Giorgio BUTTAZZO
Hours: 30
Semester: 2
Timetable:
One lecture every Wednesday at 15:00 starting on Jan 13, 2021.
Educational Goals:
The course will cover the entire process needed during a research study, from the initial phase in which a new problem is addressed, formalized, and solved, up to the final phase in which the achieved results have to be communicated to the scientific community. The course is divided into six lectures of 3 hours each. The first lecture explains the meaning of research and how to approach the various steps involved in the process. The second lecture explains how to write papers, set a good structure, outline the contents, make figures, cite references, and avoid common mistakes. The third lecture explains how the scientific publication process works and how to participate in it, simulating a typical paper selection process of a conference. The fourth lecture explains how to write successful research projects. The fifth lecture will simulate a program committee meeting, where papers are discussed and evaluated, and comments are sent back to the authors for revision. The sixth lecture explains how to make presentations, prepare good slides, and have the appropriate attitude when presenting the work. The seventh part is devoted to paper presentations, simulating a small conference run by the students.
Prerequisites:
None
Programme:

Identification, Analysis and Control of Dynamical Systems

Provided by: IMT
Location: IMT Lucca
Lecturers: Alberto BEMPORAD
Hours: 20
Semester: 1
Timetable:
Educational Goals:
The course provides an introduction to dynamical systems, with emphasis on linear systems in state-space form. After introducing the basic concepts of stability, controllability and observability, the course covers the main techniques for the synthesis of stabilizing controllers (state-feedback controllers and linear quadratic regulators) and of state estimators (Luenberger observer and Kalman filter). The course also briefly covers data-driven approaches of parametric identification to obtain models of dynamical systems from a set of data, with emphasis on the analysis of the robustness of the estimated models w.r.t. noise on data and on the numerical implementation of the algorithms.
Prerequisites:
Linear algebra and matrix computation, calculus and mathematical analysis
Programme: Equilibrium points and stability. Linearization of nonlinear systems. Discretization of continuous-time systems. Transfer functions. Observability and controllability of LTI systems. Luenberger's observer and state-feedback controllers. Linear quadratic regulator and Kalman filter. Basic concepts on linear parameter-varying systems. System identification: Least-squares methods and recursive identification, instrumental-variables methods, consistent and unbiased estimators, prediction error methods.

Inferential Statistics

Provided by: UNIPI
Location: Officine Garibaldi
Lecturers: Stefano MARCHETTI, Francesco SCHIRRIPA SPAGNOLO
Hours: 20
Semester: 1
Timetable:
https://datasciencephd.eu/courses/InferentialStatistics
Educational Goals:
Provide a background in fundamental statistical theory, the basic ideas of probability, modelling, and the tools of general statistical thinking.
Prerequisites:
Programme:
• Probability theory (random variables, distribution functions, density and mass functions)
• Properties of a random sample
• Point estimation (statistical properties of estimators)
• Interval estimation
• Hypothesis testing
• Tests of association between variables (t-test, ANOVA, test of independence, correlation)
• Linear regression models
• General linear models

Information Retrieval

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Paolo FERRAGINA
Hours: 42
Semester: 2
Timetable:
TBA
Educational Goals:
The module provides the description of a search engine structure and of Text Mining tools, by analyzing their characteristics and limits with respect to the computational cost, the precision/recall/F1 parameters, and the expressivity of the supported queries. The module is also based on hands-on activities that will present well-known open-source Python tools for the crawling and analysis of web pages, the semantic annotation of texts (TagMe), and the indexing of text data collections (ElasticSearch).
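As an illustration of the precision/recall/F1 measures mentioned above, here is a minimal sketch in plain Python for a single query. It is not part of the course material: the function name `precision_recall_f1` and the document identifiers are hypothetical and chosen only for the example.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query, given collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall (defined as 0 when there are no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical query: the engine returns 4 documents, 3 of which are relevant,
# while the collection contains 6 relevant documents in total.
p, r, f1 = precision_recall_f1(
    {"d1", "d2", "d3", "d7"},
    {"d1", "d2", "d3", "d4", "d5", "d6"},
)
```

In this toy case precision is 3/4 and recall is 3/6, which shows the usual trade-off: returning more documents can only help recall, but may lower precision.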
Prerequisites:
Programme:

Intellectual Property for and in data science

Provided by: S.ANNA
Location: S. Anna
Lecturers: Caterina SGANGA
Hours: 10
Semester: 2
Timetable:
Educational Goals:
Prerequisites:
Programme: Database protection applied to Big Data. Ownership of user generated contents. IP contracts. Trade secrets vs privacy rights. Trade secrets vs freedom of research and freedom to conduct a business. Patent in Data Science: obstacles, potentialities. Intellectual Property Owner in Data Science. The intellectual property rights of artificial intelligence generated "creations" and "inventions". Machine rationality versus machine artistic creativity.

Introduction to Cognitive and Social Psychology

Provided by: IMT
Location:
Lecturers: Pietro PIETRINI, Emiliano RICCIARDI
Hours: 24
Semester:
Timetable:
Educational Goals:
This course will provide an introduction to general themes in Cognitive and Social Psychology. In the first part of the course, we will review seminal findings that had a major impact on our knowledge of cognitive processes and social interactions, as well as more recent studies that took advantage of neuroimaging, electrophysiology and brain stimulation methods to shed new light on decision-making and social behaviors. During the second part of the course, students will be asked to perform a brief presentation of a research article and to critically discuss positive aspects and limitations of the study. The course will include seminars and lectures by renowned researchers in the field and will educate PhD candidates about the influence of social aspects of the human nature on cognitive and brain functioning (and vice-versa) in an intellectually motivating manner.
Prerequisites:
Programme:

Introduction to Evolutionary Biology

Provided by: S.ANNA
Location:
Lecturers: Andrea ZUCCOLO
Hours: 15
Semester: 2
Timetable:
Educational Goals:
The main goal of the course is to introduce the basic principles underlying evolutionary biology and to present and discuss rationale and methods of phylogenetic inference. We'll discuss the process of selecting and gathering appropriate datasets for subsequent analyses and we'll explain three widely used methods of phylogenetic inference: parsimony, distance, and likelihood methods. Particular attention will be paid to the methods used to evaluate objectively the reliability and accuracy of the resulting inferences. Finally, we'll critically consider the results of phylogenetic reconstruction: how they can shed light on past evolutionary events, such as gene duplications and lateral gene transfers, as well as how they can be used for other purposes, such as predicting gene function and resolving RNA secondary structures.
Prerequisites:
Programme:

Machine Learning

Provided by: IMT
Location:
Lecturers: Giorgio Stefano GNECCO
Hours: 20
Semester:
Timetable:
Educational Goals:
The course provides an introduction to basic concepts in machine learning. Topics include: learning theory (bias/variance tradeoff; Vapnik-Chervonenkis dimension and Rademacher complexity, cross-validation, feature selection); supervised learning (linear regression, logistic regression, support vector machines); unsupervised learning (clustering, principal and independent component analysis); semisupervised learning (Laplacian support vector machines); online learning (perceptron algorithm); hidden Markov models.
Prerequisites:
Programme:

Machine learning fundamentals, algorithms and applications through Python

Provided by: S.ANNA
Location: S. Anna
Lecturers: Marco VANNUCCI, Valentina COLLA
Hours: 35
Semester: 2
Timetable:
Educational Goals:
The course provides fundamentals in several widely used Machine Learning approaches that are nowadays gaining interest in practical applications. The course covers both theoretical and practical aspects, providing practical examples for solving real-world problems, using Python as a programming language with its main packages (NumPy, pandas, scikit-learn, SciPy, Keras, ...).
Prerequisites:
Programme: The main families of techniques that will be discussed include:
• Decision Trees. The fundamental principles will be introduced, together with the basic concepts needed to understand the benefits and limitations of this widespread approach to decision support;
• Ensemble methods applied to clustering, classification and regression (e.g. Random Forest);
• Clustering algorithms (K-Means, Self-Organizing Maps, Growing Neural Gas) and their application;
• Bio-inspired optimization algorithms. The main motivation and ideas will be introduced and the most common algorithms will be described in detail, such as genetic algorithms, ant colony optimization, tabu search, and particle swarm optimization.
• Fuzzy logic and fuzzy inference systems. A theoretical background will be provided together with the basic concepts for the design and exploitation of fuzzy inference systems. Neuro-Fuzzy systems will also be presented.
• Hybrid systems. The main paradigms that combine standard methods and different AI-based approaches will be discussed through practical use cases.
Finally, the course will show how the methods described above can be used for the development of advanced Machine Learning systems that address tasks referring to real-world problems.

Model Predictive Control

Provided by: IMT
Location: IMT Lucca
Lecturers: Alberto BEMPORAD
Hours: 20
Semester: 2
Timetable:
Educational Goals:
Model Predictive Control (MPC) is a well-established technique for controlling multivariable systems subject to constraints on manipulated variables and outputs in an optimized way. Following a long history of success in the process industries, in recent years MPC has been rapidly expanding into several other domains, such as the automotive and aerospace industries, smart energy grids, and financial engineering. The course is intended for students and engineers who want to learn the theory and practice of Model Predictive Control (MPC) of constrained linear, linear time-varying, nonlinear, stochastic, and hybrid dynamical systems, and numerical optimization methods for the implementation of MPC. The course will make use of the MPC Toolbox for MATLAB developed by the teacher and co-workers (distributed by The MathWorks, Inc.) for basic linear MPC, and of the Hybrid Toolbox for explicit and hybrid MPC.
Prerequisites:
Linear algebra and matrix computation, linear control systems, numerical optimization.
Programme: General concepts of Model Predictive Control (MPC). MPC based on quadratic programming. General stability properties. MPC based on linear programming. Models of hybrid systems: discrete hybrid automata, mixed logical dynamical systems, piecewise affine systems. MPC for hybrid systems based on on-line mixed-integer optimization. Multiparametric programming and explicit linear MPC, explicit solutions of hybrid MPC. Stochastic MPC: basic concepts, approaches based on scenario enumeration. Linear parameter- and time-varying MPC and applications to nonlinear dynamical systems. Selected applications of MPC in various domains, with practical demonstration of the MATLAB toolboxes.

Neural Network and Deep Learning: Practical and Implementation Issues

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giorgio BUTTAZZO, Alessandro BIONDI
Hours: 30
Semester: 2
Timetable:
Educational Goals:
The objective of the course is to cover the practical and implementation issues involved in deploying neural networks on a variety of embedded platforms, using different languages and development environments.
Prerequisites:
Programme:
1. Implementing Neural Networks from scratch in C. General implementation principles. Main and auxiliary functions.
2. Sample implementations of common neural network models in C language.
3. Frameworks for training and inference of deep neural networks. Overview of the existing frameworks. Common data sets.
4. Modeling neural networks in TensorFlow and Caffe. Examples of neural network implementations.
5. Simulation environments for neural control. Summary of neural models for control. Overview of the OpenAI Gym framework. Implementation of different RL algorithms in Gym for different application scenarios (gridworld, inverted pendulum, autonomous vehicles, robots, etc.). Overview of the MuJoCo environment and related applications.
6. Genetic algorithms for reinforcement learning.
7. Accelerating deep networks on GPGPUs. Overview of the Nvidia TensorRT framework. Executing a DNN modelled in Caffe in TensorRT.
8. Real-time neural vision. How to accelerate a neural network on TensorRT to detect objects from a video camera.
9. Accelerating deep networks on FPGA. Common frameworks for deploying deep networks on FPGA.
10. A sample implementation of a deep neural network on the Zynq platform.

Neural Network and Deep Learning: Theoretical Foundations

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giorgio BUTTAZZO
Hours: 30
Semester: 2
Timetable:
Educational Goals:
The objective of the course is to provide basic concepts and methodologies on the main existing neural models, explaining how to use them for pattern recognition, image classification, signal prediction, data analysis, system identification, and adaptive control.
Prerequisites:
Programme:
1. Introduction to neural computing. Motivations. Main network models and learning paradigms.
2. Fully connected networks. Hopfield networks. Associative memories. Application to optimization problems.
3. Competitive learning. Self-organizing maps. Kohonen networks: network model, learning algorithm and main network properties. Examples and applications.
4. Reinforcement Learning. The state-box learning paradigm. Temporal credit assignment. The ASE/ACE neural model. Q-learning and SARSA algorithms.
5. Supervised learning. The Perceptron: model, properties and limitations. Multi-layer networks. The Back Propagation algorithm. Convergence and generalization. Applications of multi-layer networks to signal prediction, control, and system identification. Examples and applications.
6. Towards deep networks. Advantages of increasing the number of neural layers. Problems in training deep networks: overfitting and vanishing gradient. Solutions for deep learning: better loss functions, better activation functions, regularization, and dropout methods.
7. Deep network models: Boltzmann Machines, Restricted Boltzmann Machines, Autoencoders, Convolutional Networks. Implementation issues. Examples and applications.
8. Specific deep neural networks: LeNet-5, AlexNet, VGG-Net, GoogLeNet, ResNet, SqueezeNet.
9. Neural Networks for object detection. Sliding windows, OverFeat, R-CNN, YOLO.
10. Recurrent neural networks. Gated recurrent units, LSTM, Bidirectional networks, Networks for Natural language processing.
11. Deep Reinforcement Learning. Deep Q-learning models. Policy gradient and actor-critic methods.
12. Generative adversarial networks. Generative autoencoders, GANs, Style Transfer, Semi-Supervised learning.
13. Sample applications and open issues.

Numerical methods for optimal control

Provided by: IMT
Location: IMT
Lecturers: Mario ZANON
Hours: 20
Semester: 1
Timetable:
Educational Goals:
The students will learn how to effectively solve optimisation and optimal control problems in practice.
Prerequisites:
Basic knowledge in calculus, linear algebra and dynamical systems
Programme: Many control and estimation tasks aim to minimize a given cost while respecting a set of constraints; such tasks belong to the class of problems denoted as Optimal Control (OC). The most practical approach to solving OC problems is via direct methods, which consist in discretizing the problem to obtain a Nonlinear Program (NLP), which is then solved using one of the many available approaches. The course will open with an overview of the available classes of algorithms for OC and place direct methods in this context. The core of the course is structured around the following two main parts.
1. NLP solvers: This part of the course covers Nonlinear Programming. First, it establishes a sound theoretical background on the characterization of local minima (maxima) by introducing geometric optimality concepts and relating them to the first- and second-order conditions for optimality, i.e. the Karush-Kuhn-Tucker conditions, constraint qualifications and curvature conditions. Second, the theoretical concepts will be used to analyse the most successful algorithms for derivative-based nonconvex optimization, i.e. Sequential Quadratic Programming and Interior Point Methods, both based on Newton's method. Since there does not exist a plug-and-play NLP solver, attention will be devoted to giving the students a solid understanding of the mechanisms underlying the algorithms so as to endow them with the ability to formulate the problem appropriately and choose the adequate algorithm for each situation.
2. Discretisation techniques: This second part of the course covers the most successful discretization approaches, i.e. single-shooting, multiple-shooting and collocation. All mentioned approaches rely on the simulation of dynamical systems, for which a plethora of algorithms have been developed. The features of the different classes of algorithms will be explained, with particular attention to numerical efficiency, simulation accuracy and sensitivity computation. Finally, the structure underlying the NLP obtained via direct methods for OC will be analysed in order to understand the immense benefits derived from developing dedicated structure-exploiting OCP solvers.
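The idea of a direct method can be sketched on a toy problem in plain Python. The example below is an assumption-laden illustration, not course material: the dynamics dx/dt = u, the horizon of 5 steps, the penalty weight and all function names are hypothetical. It uses single shooting with forward-Euler simulation, replaces the terminal constraint x(T) = 1 by a quadratic penalty, and solves the resulting finite-dimensional problem with plain gradient descent on finite-difference gradients, as a simple stand-in for the SQP and interior-point solvers named in the programme.

```python
def simulate(controls, x0=0.0, h=0.2):
    """Forward-Euler rollout of the toy dynamics dx/dt = u, starting from x0."""
    x = x0
    for u in controls:
        x = x + h * u
    return x


def cost(controls, h=0.2, weight=10.0, target=1.0):
    """Discretized objective: control effort plus a penalty on missing the target."""
    effort = sum(h * u * u for u in controls)
    return effort + weight * (simulate(controls, h=h) - target) ** 2


def gradient(f, z, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at z."""
    grad = []
    for i in range(len(z)):
        up = z[:i] + [z[i] + eps] + z[i + 1:]
        down = z[:i] + [z[i] - eps] + z[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


# Single shooting: the only decision variables are the 5 piecewise-constant controls.
controls = [0.0] * 5
for _ in range(200):
    g = gradient(cost, controls)
    # Fixed step size, chosen small enough for this particular quadratic problem.
    controls = [u - 0.2 * gi for u, gi in zip(controls, g)]
```

Because the state trajectory is eliminated by simulation, the optimizer only sees the controls; multiple shooting and collocation instead keep (some of) the states as additional decision variables linked by constraints.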

Numerical Optimization

Provided by: IMT
Location: IMT Lucca
Lecturers: Alberto BEMPORAD
Hours: 20
Semester: 1
Timetable:
Educational Goals:
Optimization plays a key role in solving a large variety of decision problems that arise in engineering (design, process operations, embedded systems), data science, machine learning, business analytics, finance, economics, and many others. This course focuses on formulating optimization models and on the most popular numerical methods to solve them.
Prerequisites:
Linear algebra and matrix computation, calculus and mathematical analysis.
Programme: Modeling: linear programming models, convex optimization models. Basic optimization theory: optimality conditions, sensitivity, duality. Algorithms for constrained convex optimization: active-set methods for linear and quadratic programming, proximal methods and ADMM, stochastic gradient, interior-point methods. Line-search methods for unconstrained nonlinear programming, sequential quadratic programming.

Peer to peer systems and blockchains

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Laura RICCI
Hours: 48
Semester: 2
Timetable:
Educational Goals:
The student will acquire knowledge relative to the development of distributed systems, in particular of blockchain-based systems.
Prerequisites:
Computer Networks, Algorithms, Programming skills
Programme:
P2P Topologies
- Peer to Peer (P2P) systems: general concepts
- Unstructured Overlays: Flooding, Random Walks, Epidemic Diffusion
- Structured Overlays: Distributed Hash Tables (DHT), Routing on a DHT
- Case Studies: BitTorrent as a Content Distribution Network: KAD implementation of the Kademlia DHT, game-based cooperation
Complex Network for the analysis of P2P systems
- Network models
- Random Graphs and Small Worlds
- Small World navigability: Watts Strogatz and Kleinberg
- Complex networks navigability
Cryptocurrencies and Blockchains
- basic concepts: a review of basic cryptographic tools (digital signatures, cryptographic hash, Merkle trees, ...)
- blockchains: definitions
- distributed consensus: definitions
- the Bitcoin blockchain
- Nakamoto consensus
- Bitcoin mining mechanism
- pseudoanonymity: traceability and mixing
- the Bitcoin P2P Network
- Bitcoin ecosystem
- scalability issues
- Bitcoin Extensions/alternatives: altcoins, sidechains, the Stellar Consensus Protocol, Ripple
- Ethereum: programming smart contracts
Applications of blockchains
- Ethereum: programming smart contracts
- Blockchain 1.0: cryptocurrencies
- Blockchain 2.0: financial instruments built on cryptocurrencies
- Blockchain 3.0: applications beyond cryptocurrencies: voting, IoT

PhD+: Research valorization, innovation, entrepreneurial mindset

Provided by: UNIPI
Location:
Lecturers: (various)
Hours:
Semester: 2
Timetable:
https://www.unipi.it/index.php/phd/item/11226-programme-2017
Educational Goals:
PhD+ is a unique programme aimed at fostering innovation and an entrepreneurial mindset in students and graduates of the University of Pisa, PhD students and PhDs of all Superior Graduate Schools in Tuscany, and academics. PhD+ consists of a series of interactive and engaging lectures combined with coaching and mentoring activities, given by top-level experts in innovation and technology transfer. PhD+ is one of the best practices of training in research valorisation, innovation and entrepreneurship, also recognized by the Network of Design for Resilient Entrepreneurship, within the ENDuRE European project. To date, PhD+ has received many national and international awards, also thanks to the successes of its multi-awarded spin-offs. From this year onwards the PhD+ will represent the qualified training offer within the Contamination Lab project. The 2018 edition will take place from 6th February to 8th March and will present new features:
• more topics related to research valorisation and European funding opportunities will be discussed;
• this edition of the course will be more internationally oriented, thanks to the collaboration with the Brazilian Federal University of Pernambuco, allowing the students of these Universities to attend the lectures in video streaming by means of the Unipi Mediateca platform;
• it will be cutting edge on new technological trends.
The past PhD+ edition lectures and seminars are available in video streaming on the Mediateca platform of the University of Pisa.
Prerequisites:
Programme:

Principles of Brain Anatomy and Physiology

Provided by: IMT
Location:
Lecturers: Luca CECCHETTI, Michele EMDIN
Hours: 36
Semester:
Timetable:
Educational Goals:
The course aims at introducing the fundamentals of brain anatomy and physiology. In the first part of the course we will revise the basics of neuron structure and function, as well as synaptic mechanisms and cytoarchitectonic properties of the cortical mantle, with particular regard to visual, auditory, somatosensory and motor systems. Moving from this fine-grained description of the human brain, we will focus on gross neuroanatomy: through the use of in-vivo state-of-the-art techniques, such as structural MRI and diffusion weighted imaging, we will review gyri and sulci of the cortex, subcortical structures, brainstem nuclei and major white matter fasciculi. The second part of the course will be devoted to the study of functional neuroanatomy, with insights on the relationship between specific brain structures and human cognition, collected using functional, metabolic and receptors mapping, as well as lesion studies. In particular, the following topics will be covered: central and peripheral nervous systems; occipital, parietal, frontal, temporal and limbic areas; subcortical nuclei and white matter fasciculi; cerebellum; methodologies of structural brain imaging: VBM, cortical thickness and folding, VLSM, Diffusion Weighted Imaging and Tractography (theory and methodologies of data processing, hands-on sessions). The last part of the course will instead cover topics related to the peripheral and autonomous nervous system.
Prerequisites:
Programme:

Principles of Concurrent and Distributed Programming

Provided by: IMT
Location: IMT Lucca
Lecturers: Rocco DE NICOLA, Letterio GALLETTA
Hours: 30
Semester: 1
Timetable:
Educational Goals:
The objective of the course is to introduce the basics of concurrent and distributed programming by illustrating the concepts and techniques for modelling systems in which multiple components are simultaneously active and need to coordinate and compete for the use of shared resources. At the end of the course, students will have a good understanding of the problems connected to concurrent programming and a good knowledge of the different approaches to modelling communication among distributed components and to safe resource sharing. Thanks to a hands-on approach, by the end of the course students will be able to write and evaluate concurrent programs using different programming languages.
Prerequisites:
Basics of Computer Programming
Programme:

Privacy in and for Data Science

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giovanni COMANDE’
Hours: 12
Semester:
Timetable:
Educational Goals:
Prerequisites:
Programme: Data Science under the light of EU General Data Protection Regulation. Data protection legal models and practices (further processing, legitimate interests, data subject rights, access rights with particular attention to the EU and the US models). Machine Learning. GDPR solutions for data science: the Right to an Explanation, etc.

Programming for Data Science

Provided by: CNR
Location: Officine Garibaldi
Lecturers: Giulio ROSSETTI
Hours: 20
Semester: 1
Timetable:
https://datasciencephd.eu/courses/ProgrammingDataScience
Educational Goals:
This is an introductory course to computer programming for students without a Bachelor in Computer Science or in Computer Engineering. The objective is to smoothly introduce the student to the programming concepts and tools needed for typical data processing and data analysis tasks.
Prerequisites:
Students must bring their own laptop with a working installation of the Anaconda Python distribution (https://www.anaconda.com/distribution/).
Programme: The course will focus on the Python programming language (version 3.7), covering the following topics: Introduction to algorithms and programming; Introduction to Python and to the pythonista's tools; Data types, expressions using numbers, variables; Control flow: conditional statements, loops; Functions and recursion; Data structures: strings, lists, sets, tuples, dictionaries; (Notion of) object-oriented programming & exception handling; Python libraries for data science. Suggested textbook: P. Spronck. The Coder’s Apprentice: Learning Programming with Python 3, 2017. http://www.spronck.net/pythonbook

Public Healthcare Management & Big data

Provided by: S.ANNA
Location:
Lecturers: Sabina NUTI, Milena VAINIERI, Chiara SEGHIERI
Hours: 20
Semester: 1
Timetable:
Educational Goals:
Prerequisites:
Programme: The primary objective of the course is to show how data and information collected from sources internal to the Health System, and from external sources through the involvement of users as co-producers of health, can be used within public contexts both to support strategic planning at all levels of governance and to measure the ability of the system to create value for the population with the available resources.

Scientific Programming I: Data Processing and Software Prototyping

Provided by: SNS
Location: SNS
Lecturers: Julien Roland Michel BLOINO
Hours: 40
Semester: 1
Timetable:
Educational Goals:
The course aims at providing a working knowledge of Python to automate tasks, process data, and build prototype implementations. Practical laboratory sessions are an important part of the course, in which the fundamental concepts studied during the course will be put into practice.
Prerequisites:
Programme: The course will address the following aspects: - Introduction to computer architectures and programming languages - Basic concepts of the language - Object-oriented programming in Python - The library system of Python: internal and external modules - Python in science: introduction to some libraries and their use. Key concepts of the language will be illustrated through the progressive development of a fully functional program during the course. The course is composed of a didactic part of 28 hours (2 lessons of 2 hours each per week) and a practical part of 12 hours (3 sessions of 4 hours).

Scientific Programming II: High Performance Computing

Provided by: SNS
Location: SNS
Lecturers: Julien Roland Michel BLOINO
Hours: 40
Semester: 2
Timetable:
11 January 2021 - 31 May 2021
Educational Goals:
Prerequisites:
The course is complementary to "Scientific Programming I" and will introduce more advanced concepts regarding the problem of performance, but can be followed independently. No previous knowledge of programming is necessary.
Programme: The course will address the following aspects: - Introduction to computer architectures, hardware limitations, and programming languages - Basic concepts of the language - Advanced concepts of the language in terms of performance - Parallel programming - Code optimization and interfacing. Key concepts of the language will be illustrated through the progressive development of a fully functional program during the course. The course is composed of a didactic part of 32 hours (2 lessons of 2 hours each per week) and a practical part of 8 hours (2 sessions of 4 hours, one on basic concepts, one on more advanced aspects). Scientific-Disciplinary Sector: CHIM/02 - Physical Chemistry. Course language: Italian.

Social Network Analysis

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Andrea PASSARELLA
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
This course introduces students to the theories, concepts and measures of Social Network Analysis (SNA), which aims at characterizing the structure of large-scale Online Social Networks (OSNs). The course combines classroom teaching, to introduce theoretical concepts, with hands-on computer work, to apply the theory to real large-scale datasets obtained from OSNs like Facebook and Twitter. In particular, the course discusses how the structural properties of social networks can be analyzed through SNA techniques, and how these properties can be used to characterize social phenomena arising in society.
Prerequisites:
Programme:

Social network analysis

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Dino PEDRESCHI
Hours: 48
Semester: 2
Timetable:
Educational Goals:
Over the past decade there has been a growing public fascination with the complex “connectedness” of modern society. This connectedness is found in many contexts: in the rapid growth of the Internet and the Web, in the ease with which global communication now takes place, and in the ability of news and information as well as epidemics and financial crises to spread around the world with surprising speed and intensity. These are phenomena that involve networks and the aggregate behavior of groups of people; they are based on the links that connect us and the ways in which each of our decisions can have subtle consequences for the outcomes of everyone else. This course is an introduction to the analysis of complex networks, with a special focus on social networks and the Web - their structure and function, and how these can be exploited to search for information. Drawing on ideas from computing and information science, applied mathematics, economics and sociology, the course describes the emerging field of study that is growing at the interface of all these areas, addressing fundamental questions about how the social, economic, and technological worlds are connected. The course includes data-driven analysis of complex networks using a variety of models and software tools.
Prerequisites:
Programme: Big graph data and social, information, biological and technological networks. The architecture of complexity and how real networks differ from random networks: node degree and long tails, social distance and small worlds, clustering and triadic closure. Comparing real networks and random graphs. The main models of network science: small world and preferential attachment. Strong and weak ties, community structure and long-range bridges. Robustness of networks to failures and attacks. Cascades and spreading. Network models for diffusion and epidemics. The strength of weak ties for the diffusion of information. The strength of strong ties for the diffusion of innovation. Practical network analytics with Cytoscape and Gephi. Simulation of network processes with NetLogo.

Socio-Economic Networks

Provided by: IMT
Location:
Lecturers: Massimo RICCABONI
Hours: 20
Semester:
Timetable:
Educational Goals:
The topic of the course will be the analysis of socio-economic networks. The course will consist of two parts: (1) micro-level networks of individuals and firms, (2) macro-level networks of sectors and countries. The first part will focus on social networks and the division of (innovative) labor within and across firm boundaries. The second part, on the empirics of macro networks in economics, will have a strong focus on international trade, investments and human mobility. Both parts will give a brief overview of the literature, which has predominantly adopted an econometric approach to the analysis of networks.
Prerequisites:
Programme:

Statistical and Neural Machine Learning for Text Analysis

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Andrea ESULI
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
This module introduces the main methods for the analysis and mining of users' opinions and personal evaluations, based on Big Data generated on the web or from other sources. Emphasis will be put on text mining methods applied to text originating on social media. Lessons will be supported by case studies developed in the SoBigData.eu lab.
Prerequisites:
Programme:

Statistical methods for data science

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Salvatore RUGGIERI
Hours: 48
Semester: 2
Timetable:
Educational Goals:
The student who successfully completes the course will have a solid knowledge of the main concepts and tools of statistical analysis, including the definition of a statistical model, the inference of its parameters with confidence intervals, and the use of hypothesis testing, with specific applications to problems and models useful in data science. Finally, the student will be able to use the language R for performing statistical analyses.
Prerequisites:
Basic knowledge of calculus. Basic knowledge of probability might be useful even if not indispensable.
Programme: The program covers the basic methodologies, techniques and tools of statistical analysis. This includes basic knowledge of probability theory, random variables, convergence theorems, statistical models, estimation theory, and hypothesis testing. Other topics covered include bootstrap, expectation-maximization, and applications to data science problems. Finally the program covers the use of the language R for statistical analysis.

Statistical Methods for Large, Complex Data

Provided by: S.ANNA
Location:
Lecturers: Francesca CHIAROMONTE
Hours: 10
Semester: 2
Timetable:
Educational Goals:
This course examines: (i) Computational assessment of statistical procedures, with resampling, cross-validation, permutations and perturbations. (ii) High dimensional supervised problems, with shrinkage, sparsification (e.g., Ridge, LASSO) and other feature selection techniques. (iii) Ultra-high dimensional supervised problems, with model-based and model-free feature screening algorithms. (iv) Ultra-high sample sizes, with subsampling and partitioning strategies typically used for big data, and various considerations about significance and effect sizes. While not a prerequisite, the course Topics in Statistical Learning provides important background for this course.
Prerequisites:
Programme:

Stochastic Processes and Stochastic Calculus

Provided by: IMT
Location:
Lecturers: Irene CRIMALDI
Hours: 30
Semester:
Timetable:
Educational Goals:
This course aims at introducing some important stochastic processes and Ito stochastic calculus. Some proofs are sketched or omitted in order to have more time for examples, applications and exercises. In particular, the course deals with the following topics: - Markov chains (definitions and basic properties, classification of states, invariant measure, stationary distribution, some convergence results and applications, passage problems, random walks, urn models, introduction to the Markov chain Monte Carlo method), - conditional expectation and conditional variance, - martingales (definitions and basic properties, Burkholder transform, stopping theorem and some applications, predictable compensator and Doob decomposition, some convergence results, game theory, random walks, urn models), - Poisson process, Birth-Death processes, - Wiener process (definitions, some properties, Donsker theorem, Kolmogorov-Smirnov test) and Ito calculus (Ito stochastic integral, Ito processes and stochastic differential, Ito formula, stochastic differential equations, Ornstein-Uhlenbeck process, Geometric Brownian motion, Feynman-Kac representation formula).
Prerequisites:
Matrix Algebra + Foundations of Probability and Statistical Inference
Programme:

Strategic and competitive intelligence

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Antonella MARTINI
Hours: 48
Semester: 1
Timetable:
Educational Goals:
CI programs have goals such as proactively detecting business opportunities or threats; eliminating or reducing blind spots, risks and/or surprises; and reducing reaction time to competitor and marketplace changes. The end product of any worthwhile CI activity is what practitioners term ‘actionable intelligence’, i.e. intelligence that management can act upon; perspective. It is more than analysing competitors: it is a process for gathering information, converting it into intelligence (about products, customers, competitors, and any aspect of the environment) and then using it in decision making. In this sense, big data brings big change to CI. The course is very interactive and also includes in-class seminars with experts on emerging topics, defined each year (e.g. patent analysis, due diligence, social network analysis for business). It provides many tools and techniques; HBS cases are used. Students will apply these tools in groups when analysing a preselected case company. They are expected to present early-stage versions of their CI reports and, in the final workshop, they will present the results of their CI analysis, which is then discussed in plenary. By the end of the course students will have acquired knowledge about the tools and methodologies to design and develop competitive intelligence projects.
Prerequisites:
Fundamentals of financial & cost accounting, strategy, organization design.
Programme: [1] SCI FUNDAMENTALS: VUCA, Ansoff Model, surprise in business, risk & uncertainty, applications of CI, CI cycle [2] IP INTELLIGENCE BASICS: Patents, trademarks, copyrights, patent search engines, ecosystems & platforms [3] DATA SCIENCE FOR SCI PROJECTS: (1) Basics: text analysis; (2) Advanced: NIR, topic modelling, network analysis and visualization [4] DATA SCIENCE PROJECT DESIGN: Scoping, KITs and KIQs, metrics, management, result, visualization [5] SCI APPLICATION LAB: How to extract intelligence from scientific papers; How to extract intelligence from IP; How to extract intelligence from HR and other sources (e.g. Wikipedia)

Summer school on Data Science

Provided by: CNR
Location:
Lecturers: (various) TBD
Hours:
Semester: 2
Timetable:
Educational Goals:
Prerequisites:
Programme:

Survey Methods

Provided by: UNIPI
Location:
Lecturers: Monica PRATESI (coord.)
Hours: 40
Semester:
Timetable:
Educational Goals:
Prerequisites:
Programme:

Technologies for web marketing

Provided by: UNIPI
Location: Polo Fibonacci, University of Pisa
Lecturers: Salvatore RUGGIERI
Hours: 48
Semester: 2
Timetable:
Educational Goals:
The student who successfully completes the course will have a solid knowledge of information technologies for marketing decisions on the web: how to advertise effectively, how to track users and explore web metric summaries, how to improve and personalize the customer experience on a web site, how to invest available resources, and how to measure success in using web marketing technologies.
Prerequisites:
Some knowledge of how the Internet works as a network, and of some Internet programming (HTML, JavaScript). Students must be fluent in English (the course is part of a Master's degree held in English).
Programme: Web analytics is the collection, measurement, analysis and reporting of Internet data (web, mobile, social media, email) for purposes of deep customer and market understanding and for digital service optimization. The course presents web analytics methods, algorithms, strategies and tools with applications to web personalization for improving user experience, to web marketing and advertising for improving visibility, to search engine optimization for improving ranking, and social media analysis for improving reachability and understanding opinions.

Time Series and Mobility Data Analysis

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Mirco NANNI, Franco Maria NARDINI
Hours: 34
Semester: 2
Timetable:
TBA
Educational Goals:
The purpose of the course is to introduce the main techniques in data mining and machine learning (including deep learning approaches) for the analysis of temporal data, in particular for time series and spatio-temporal data related to human mobility. The presentation will be supported by several case studies developed with the SoBigData.eu Laboratory.
Prerequisites:
Programme:

Topics in Statistical Learning

Provided by: S.ANNA
Location:
Lecturers: Francesca CHIAROMONTE
Hours: 30
Semester: 2
Timetable:
Educational Goals:
This course introduces the students to a number of topics in contemporary Statistical Learning, including: (i) Unsupervised classification and clustering methods. (ii) Unsupervised dimension reduction; Principal Components Analysis and related techniques. (iii) Supervised classification methods. (iv) Linear and generalized linear models. (v) Non-parametric regression methods. (vi) Resampling methods, Cross Validation, the Bootstrap and permutation-based techniques. (vii) Feature selection and penalized fitting techniques for (generalized) linear models. (viii) Supervised dimension reduction; Sufficient Dimension Reduction and related techniques. Compared to traditional courses on regression, linear models and multivariate statistics, we focus on analyzing actual datasets of interest to the students through group projects, and we adopt a so-called active learning approach, leveraging practicum sessions and publicly available MOOC materials.
Prerequisites:
A working knowledge of basic statistical inference procedures (point estimation, confidence intervals, testing) and regression modeling.
Programme:

Transparency, accountability and traceability of algorithm based decision-making

Provided by: S.ANNA
Location: S. Anna
Lecturers: Giovanni COMANDE’
Hours: 10
Semester:
Timetable:
Educational Goals:
Prerequisites:
Programme: Algorithm Accountability. Accountability in the Machine Learning Context. Algorithm Transparency. Technical and Legal Options to Enhance Transparency & Accountability. People Analytics. Behavioural “Nudging”. New Emerging Human Rights in the age of Behavioral Data Science and Neurotechnologies: Towards "Mental Privacy" and "Decision Integrity". Legal framework. Intrusiveness and legitimacy of surveillance mechanisms and of their data integration. Surveillance on publicly accessible data (e.g. social media data). Metadata elaboration and predictive implications.

Visual analytics

Provided by: UNIPI, CNR
Location: Polo Fibonacci, University of Pisa
Lecturers: Salvatore RINZIVILLO
Hours: 48
Semester: 2
Timetable:
Educational Goals:
The trained student will acquire the knowledge and skills to design and implement effective visual representations of data and models.
Prerequisites:
Basic knowledge of programming languages for the web: Javascript, HTML, CSS
Programme: - Theory of Visualization - Taxonomy of different types of data visualization: hierarchies, relational data, temporal data, spatial data, unstructured data (text) - Visual Analytics Process - Strategies and best practices for effective data visualization - Discussion of Case Studies - Technologies for visualization - Overview of development environments and visual libraries - Design of a visual analytics project

Web Mining

Provided by: UNIPI, CNR
Location: Computer Science Department, University of Pisa
Lecturers: Raffaele PEREGO, Franco Maria NARDINI
Hours: 20
Semester: 2
Timetable:
TBA
Educational Goals:
This module presents how to analyse the traces that users leave when querying Web search engines (query logs). It presents the main applications of Web mining, including: i) how to profile the interests and activities of users, ii) how to use information from query logs for forecasting social indicators and optimizing Web search engines. Teaching activities will be supported by several case studies developed in the SoBigData.eu laboratory.
Prerequisites:
Programme: