An Introduction To Sequence Comparison and Database Search |
The first part is dedicated to molecular evolution. We will focus on the implications of molecular evolution on sequence variation. We will use these concepts to define homology. We will then see how specific mathematical models (the substitution matrices) have been derived in order to quantify the evolutionary relationship between sequences. In the second part we will cover the pairwise and single sequence analysis. We will introduce the Needleman and Wunsch algorithm (Dynamic programming), a very basic algorithm that makes it possible to derive pairwise alignments from the sequences while using the substitution matrices. Implementation aspects will be extensively covered, including Linear Space implementations and pair-HMM alternatives for pairwise sequence alignments. These dynamic programming algorithms will be compared for both scope and application to the Burrow Wheeler Transform algorithm an essential component of genomic data aligners. The course will finish with an introduction of the Zuker RNA fold algorithm including an overview of the CYK algorithm . The third part of the course will deal with Multiple Sequence Alignment algorithms of both RNA and Protein sequences. We will introduce the basic notion of this important class of algorithm and will then move to the most recent algorithms, able to take into account protein and RNA 3D structure while building the models. We will discuss some recent results on the relationship between MSA computation and Tree reconstruction. The course will finish with a detailed description of the most recent large scale algorithm including the recently published Regressive Algorithm. |
Practicals will be in python 3. They will involve adapting existing scripts rather than programming from scratch. For this reason, students with no practical knowledge of python but with a good grasp of programming languages like Perl, C or JAVA should manage reasonnably well. Students are expected to bring their own lap-top and advised to have a LINUX boot. |
Date | Location | Session | Title | Links | |
Day 1 | NORBIS | LECTURE | Pairwise comparisons in an evolutionary context | Slides Voiceover | |
Day 1 | NORBIS | PRACTICALS | Parsing Biological Files | P | |
Day 1 | NORBIS | PRACTICALS | Computing Substitution Matrices | P | |
. | |||||
Day 2 | NORBIS | LECTURE | Introduction to Dynamic Programming | Slides Voiceover | |
Day 2 | NORBIS | PRACTICALS | Introduction to Dynamic Programming 1 | P Voiceover | |
Day 2 | NORBIS | PRACTICALS | Introduction to Dynamic Programming 2 | P Voiceover | |
. | |||||
Day 3 | NORBIS | LECTURE | Introduction to Multiple Sequence Alignment | Slides Voiceover | |
Day 3 | NORBIS | LECTURE | Multiple Sequence Alignment Algorithms | Slides | |
Day 3 | NORBIS | PRACTICALS | Parsing and manipulating binary trees | P | |
Day 3 | NORBIS | PRACTICALS | Implementing your own multiple sequence aligner | P | |
Day 3 | NORBIS | BIBLIO | MSA papers | Biblio Biblio Biblio Biblio | |
. | |||||
Day 4 | NORBIS | LECTURE | RNA Folding Algorithms | Slides Voiceover | |
Day 4 | NORBIS | PRACTICALS | Adpating the Nussinv Algorithm | P | |
Day 4 | NORBIS | BIBLIO | RNA Secondary Structure Prediction | Biblio Biblio | |
. | |||||
Day 5 | NORBIS | LECTURE | RNA Genomic Analysis using RNAseq | Slides Voiceover | |
Day 5 | NORBIS | BIBLIO | Short Reads Mapping | Biblio Biblio Biblio | |
. | |||||
Day 5 | NORBIS | PRACTICALS | Project Report - Delivery date: 15/06/2020 | P | |
. |
This Entire Course Was Automatically Generated Using BED, the Bioinformatics Exercise Database. BED is a freeware available on request Cedric Notredame |