Our main research interest is in understanding the structure and function of genomes, especially those of medical or agricultural importance. The core strength of our research is in developing novel algorithms and computational systems for large-scale biological sequence analysis, including leading algorithms for de novo genome assembly, variant detection, and related -omics assays. Using these advances, we have contributed to the de novo genome assemblies of dozens of species; probed the sequence variations related to autism, cancer, and other human diseases; mapped the transcriptional and epigenetic profiles of tomatoes, corn, and other important plant species; and explored the role of microbes in different environments. In response to the deluge of biological sequence data, we have also been at the forefront of distributed and parallel computing in genomics, and have pioneered the use of cloud computing and Hadoop/MapReduce as an enabling platform for addressing the field's big data challenges.

Looking forward, we see ourselves at the intersection of biotechnology and algorithmics, developing systems for probing the structure and function of genomes using the best technologies possible. Our expertise spans from low-level computer architecture, through sequencing, de novo assembly, variant identification, and transcriptome and other -omics data, up to machine learning approaches for building predictive models of disease and treatment response. In addition to ongoing projects in autism and other human diseases and in developmental plant biology, I was granted an NSF CAREER award to research new approaches for analyzing single-molecule sequencing data, especially for genome and transcriptome analysis of crop species. Another recent thrust has been developing algorithms for single-cell analysis, especially using copy number variations within individual tumor cells to examine how cancer progresses. Altogether, we intend to develop powerful new methods for analyzing large collections of genomes to address questions of disease, development, and evolution.

Recent News
» GenomeScope: Fast reference-free genome profiling from short reads
April 4, 2017
» Nanopore sequencing meets epigenetics
March 31, 2017
» Proceedings of the IEEE: Bioinformatics of DNA
March 1, 2017
» Preprint on 16GT: a fast and sensitive variant caller using a 16-genotype probabilistic model
Feb 24, 2017
» Personalized Phased Diploid Genomes of the EN-TEx Samples @ AGBT
Feb 15, 2017
(past news)

Upcoming Events

~~ 2017 ~~

» Biology of Genomes
CSHL. May 10-14, 2017
» JIMB World Metrology Day
Stanford. May 22, 2017
» PacBio Users Meeting
Baltimore, MD. June 28, 2017
» UCLA Computational Genomics Summer Institute
UCLA. July 10, 2017
» Genome Informatics
CSHL. Nov 1-4, 2017
~~ 2018 ~~

» International Plant and Animal Genomes (PAG)
San Diego CA. Jan 13 - 17, 2018
» Advances in Genome Biology and Technology (AGBT)
Marco Island, FL. Feb 14 - 20, 2018
» Biological Data Science
CSHL. Nov 7-10, 2018
(presentation archive)

Michael Schatz

Bloomberg Distinguished
Associate Professor of
Computer Science
and Biology

Johns Hopkins University
Department of Computer Science
3400 N Charles St
Malone Hall 323
Baltimore, MD 21211
Cell: (703) 966-1987
E-mail: mschatz <at> cs.jhu.edu

Adjunct Associate Professor of
Quantitative Biology

Cold Spring Harbor Laboratory
One Bungtown Road
Koch Building 1121
Cold Spring Harbor, NY 11724
Tel: (516) 367-5218
Fax: (516) 367-8380
E-mail: mschatz <at> cshl.edu
Twitter: @mike_schatz