CSHL Advanced Sequencing Course
More information is available on the course Webpage
1. De novo assembly theory and practice Theory and practice of assembly projects, examining requirements for coverage, read lengths, and quality, with a focus on Illumina and PacBio sequencing
2. De novo assembly exercise A hands-on exercise assembling a genome and searching for a secret message encoded along a novel insertion
3. Genome Annotations Strategies for annotating a genome, including gene prediction, alignment techniques, and high-throughput functional sequencing assays
4. Long Read Assembly Latest advances in long read assembly. Presented by James Gurtowski
5. Single Cell Analysis Latest advances in single cell analysis. Presented by Tyler Garvin
CSHL WSBS Genomics and Quantitative Biology
Schatz Lab Research Projects Overview of my background and research
QB Bootcamp
QB Bootcamp 1: Introduction The challenges of biological data science
QB Bootcamp 2: Unix scripting A gentle introduction to working at the command line
QB Bootcamp 3: ORC Exercises Use sequencing to discover origins of replication
qb1.zip Python programming exercises
qb2.zip Shell programming exercises
Genomics
Genomics 1: Omics Bootcamp Introduction to whole-genome and exome sequencing, RNA-seq, ChIP-seq, Methyl-seq, and single cell analysis
Genomics 2: ENCODE History and major findings of ENCODE
Genomics 3: Ancient and Modern Humans Major results from the 1000 Genomes Project, Neanderthal sequencing, surname inference
Quantitative Biology
QB Lecture 1: Exact Matching Introduction to brute force, binary search, suffix arrays and the BWT
QB Lecture 1: BWT Notes Lecture Notes on the BWT
QB Lecture 2: Dynamic Programming Fibonacci Numbers, Longest Increasing Subsequence, and Sequence Alignment
QB Lecture 2: Dynamic Programming Notes Lecture notes on designing a dynamic programming algorithm
QB Lecture 3: Graphs and Genomes Basic graph algorithms, methods for genome assembly
QB Lecture 4: Gene Finding and HMMs Microbial and Eukaryotic Gene Finding, Markov Models, HMMs and GHMMs, Forward Algorithm, Viterbi
CSHL Frontiers and Techniques in Plant Science
Genome Sequencing and Assembly
Introduction to de Bruijn graphs for genome assembly
Assembly Tutorial
Assembly tutorial to detect a secret message embedded into a microbial genome
CSHL Undergraduate Research Program in Bioinformatics
Searching for GATTACA
In this class we explored the problem of finding exact occurrences of a query
sequence in a large genome or database of sequences. Under this theme, we
started by analyzing the brute force approach introducing the concepts of
algorithm, complexity analysis, and E-values. Next we discussed suffix arrays
as an index for accelerating the search, including analyzing the performance of
binary search. We also considered two traditional algorithms for sorting
(Selection Sort versus QuickSort) and their relative performance. In the second
half of the class we discussed finding approximate occurrences of a short query
sequence in a large genome or database of sequences. We first defined the
problem by considering various metrics of an approximate occurrence such as
Hamming distance or edit distance. We then considered different methods for
computing inexact alignments including brute force global & local
alignments, and seed-and-extend algorithms. Finally we discussed Bowtie as a
Burrows-Wheeler transform based short read mapping algorithm for discovering
alignments to a reference genome.
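The ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in class: the function names (build_suffix_array, find_exact, hamming_distance, find_approximate) and the example sequence are made up for this sketch. The suffix array is built by sorting suffixes directly, which is only reasonable for small inputs. The approximate search is the brute force Hamming distance scan, not Bowtie's method.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of text, in sorted order."""
    # Naive construction: sort the suffixes directly. Fine for a toy example,
    # far too slow and memory hungry for a real genome.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text: str, sa: List[int], query: str) -> List[int]:
    """Binary search the suffix array for every exact occurrence of query."""
    m = len(query)

    # Lower bound: first suffix whose first m characters are >= query.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < query:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose first m characters are > query.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= query:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[first:lo] starts with query.
    return sorted(sa[first:lo])


def hamming_distance(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def find_approximate(text: str, query: str, max_mismatches: int) -> List[int]:
    """Brute force scan: report windows within max_mismatches of query."""
    m = len(query)
    return [
        i
        for i in range(len(text) - m + 1)
        if hamming_distance(text[i:i + m], query) <= max_mismatches
    ]


if __name__ == "__main__":
    genome = "TGATTACAGATTACC"  # made-up example sequence
    sa = build_suffix_array(genome)
    print(find_exact(genome, sa, "GATTACA"))        # [1]
    print(find_approximate(genome, "GATTACA", 1))   # [1, 8]
```

The brute force scan compares the query against every window, so it takes time proportional to the genome length times the query length. Once the suffix array is built, each exact query needs only a logarithmic number of string comparisons.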
Python & Bioinformatics
Python Class 1
Introduction to Python, variables, lists, conditions, loops
Python Class 2
Brute force search, dictionaries, motif finding
IPython Notebooks for Probability & Statistics
 Rolling a die (Uniform Random Probability)
 Flipping a coin (Binomial & Normal Distributions)
 Throwing Marbles into Jars (Poisson Distribution)
 Throwing Darts (Exponential Distribution)
We also used the exercises at Rosalind throughout the course.
Special topics
Talk by Anne Churchland on balancing work and life.
