Bioinformatics

Provided by: 

UNIPI

From: 

M.Sc. in Computer Science

Location: 

Polo Fibonacci, University of Pisa

Lecturers: 

Nadia PISANTI

Semester: 

2

Hours: 

48

Timetable: 

https://www.di.unipi.it/en/education/mcs/timetable-wif

Educational Goals: 

This course aims to give students an overview of the algorithmic methods conceived for the analysis of genomic sequences. We will focus on theoretical and combinatorial aspects as well as on practical issues such as whole-genome sequencing, sequence alignment, the inference of repeated patterns and of long approximate repetitions, the computation of genomic distances, and several biologically relevant problems in the management and investigation of genomic data. The exam evaluates the students' understanding of the problems and methods described in the course. It is also meant as a chance to learn what a scientific paper looks like and how to give an oral presentation on a scientific or technical topic that is designed for a specific audience.

Prerequisites: 

A basic course on algorithms.

Programme: 

A brief introduction to molecular biology: DNA, proteins, the cell, the synthesis of a protein.

Sequence Alignments: Dynamic programming methods for local, global, and semi-local alignments. Computing the longest common subsequence. Multiple alignments.

Pattern Matching: Exact pattern matching with preprocessing of the pattern: the (Knuth-)Morris-Pratt, Boyer-Moore, and Karp-Rabin algorithms. Algorithms with preprocessing of the text: use of indexes.

Motifs Extraction: The KMR algorithm for the extraction of exact motifs and its modifications for the inference of approximate motifs.

Finding Repetitions: Algorithms for the inference of long approximate repetitions. Filters for preprocessing.

Fragment Assembly: Genome sequencing: some history, scientific opportunities, and practical problems. Some possible approaches to the problem of assembling sequenced fragments. Link with the "Shortest common superstring" problem and the greedy solution. Data structures for representing and searching sequencing data.

New Generation Sequencing: Applications of High Throughput Sequencing and its algorithmic problems and challenges. Investigating the data types resulting from the existing biotechnologies, and the possible data structures and algorithms for their storage and analysis.
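As an illustration of the dynamic programming methods named in the programme, the following is a minimal sketch of computing the length of the longest common subsequence of two sequences. It is not taken from the course material; the function name `lcs_length` and the example strings are assumptions chosen for illustration only.

```python
def lcs_length(a: str, b: str) -> int:
    """Return the length of the longest common subsequence of a and b.

    Illustrative sketch only (hypothetical helper, not course code).
    Uses the classic dynamic programming recurrence, keeping just one
    previous row so memory is O(len(b)) and time is O(len(a) * len(b)).
    """
    # prev[j] = LCS length of the prefix of a processed so far and b[:j]
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]  # LCS with the empty prefix of b is always 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both shorter prefixes
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last character of one of the two prefixes
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Example sequences are arbitrary illustrations
    print(lcs_length("AGGTAB", "GXTXAYB"))
```

The same table-filling scheme, with a scoring function for matches, mismatches, and gaps in place of the simple match count, underlies the global and local alignment methods listed above.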