Sammons Lab / Links

Sequencing Technology Workshop Information - 2019

22 Mar 2019 by sammons

Information for Attendees of the Sequencing Technology Workshop, 2019

Part of the 6th Annual RNA Symposium of the RNA Institute

Instructor: Morgan Sammons, Assistant Professor of Biology, State University of New York at Albany

Instructor: Ryan Meng, Bioinformatics Support Specialist, State University of New York at Albany

There are a number of resources online for learning about and using high-throughput sequencing data in your own work. The Harvard Chan Bioinformatics Core provides extremely detailed, well-organized introductions to a number of sequencing approaches, software, and workflows. I highly recommend these introductions as a next step.

Main Presentation: Presentation given by Morgan as part of the workshop.
Bulk RNA-seq Replicate Guidelines: Excellent manuscript published in RNA discussing why biological replicates matter and how to select the number of replicates in a bulk RNA-seq experiment.

Tools used during the symposium

STAR: a splice-aware alignment tool.

Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., … Gingeras, T. R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics (Oxford, England), 29(1), 15-21. doi: 10.1093/bioinformatics/bts635

DESeq2: An R-based package for differential gene expression analysis using raw read counts derived from STAR. A short tutorial is also available.

Love, M.I., Huber, W., Anders, S. (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15:550 doi: 10.1186/s13059-014-0550-8

Samtools: software for manipulating SAM and BAM alignment files.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi: 10.1093/bioinformatics/btp352

Commands used for Genome Indexing and Alignment

Genome Indexing with STAR

We used the UCSC sacCer3 assembly of the Saccharomyces cerevisiae genome, downloaded from the Illumina iGenomes website.

iGenomes is a nice source for the genomes of many model organisms used in research.

You can use STAR (or another aligner) to build an index for your own model organism from a FASTA file of its genome.
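The indexing step can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact command used at the workshop: the file paths, index directory, and thread count are hypothetical placeholders, and the values for --sjdbOverhang and --genomeSAindexNbases are assumptions suited to a small genome like yeast. The command only runs if STAR and the input files are actually present.

```shell
# Hypothetical locations; adjust to wherever you unpacked the iGenomes download.
GENOME_FASTA="sacCer3/genome.fa"
ANNOTATION_GTF="sacCer3/genes.gtf"
INDEX_DIR="star_index/sacCer3"
THREADS=4

if command -v STAR >/dev/null 2>&1 && [ -f "$GENOME_FASTA" ] && [ -f "$ANNOTATION_GTF" ]; then
    mkdir -p "$INDEX_DIR"
    # --sjdbOverhang is ideally (read length - 1); 100 is the STAR default.
    # --genomeSAindexNbases is lowered from the default of 14 because the
    # yeast genome (~12 Mb) is small; 10 is an assumed value for this size.
    STAR --runMode genomeGenerate \
         --runThreadN "$THREADS" \
         --genomeDir "$INDEX_DIR" \
         --genomeFastaFiles "$GENOME_FASTA" \
         --sjdbGTFfile "$ANNOTATION_GTF" \
         --sjdbOverhang 100 \
         --genomeSAindexNbases 10
else
    echo "Skipping indexing: STAR or the input files were not found." >&2
fi
```

The index only needs to be built once per genome and annotation; every later alignment can point at the same directory.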

Genome Alignment with STAR

We used STAR in a cluster setting to align our raw FASTQ files to the sacCer3 genome we indexed in the prior steps.

The script for this step was written to run on a cluster under the Slurm job scheduler. If you are performing these tasks locally, you do not need the Slurm directives. If you are working on your home institution's cluster, it may use a different scheduler.

Another point to remember is that we requested specific computational resources for the alignment. We based these numbers on the resources available and on how long we expected the alignment to take for yeast, whose genome is much smaller than the human or mouse genome.
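A Slurm batch script for the alignment might look something like the sketch below. The resource requests (CPUs, memory, time), the job name, and all file paths are illustrative assumptions, not the values used at the workshop, and a single-end gzipped FASTQ file is assumed. When run outside Slurm, the #SBATCH lines are just comments and the thread count falls back to 4.

```shell
#!/bin/bash
#SBATCH --job-name=star_align
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00
#SBATCH --output=star_align_%j.log

# Hypothetical paths; replace with your own index, reads, and output prefix.
INDEX_DIR="star_index/sacCer3"
FASTQ="fastq/sample1.fastq.gz"
OUT_PREFIX="aligned/sample1_"
# Use the CPU count Slurm granted, or 4 when running locally.
THREADS="${SLURM_CPUS_PER_TASK:-4}"

if command -v STAR >/dev/null 2>&1 && [ -d "$INDEX_DIR" ] && [ -f "$FASTQ" ]; then
    mkdir -p "$(dirname "$OUT_PREFIX")"
    # --readFilesCommand zcat decompresses the gzipped reads on the fly.
    # --quantMode GeneCounts writes per-gene read counts; it requires that
    # an annotation (GTF) was supplied when the index was built.
    STAR --runThreadN "$THREADS" \
         --genomeDir "$INDEX_DIR" \
         --readFilesIn "$FASTQ" \
         --readFilesCommand zcat \
         --outSAMtype BAM SortedByCoordinate \
         --quantMode GeneCounts \
         --outFileNamePrefix "$OUT_PREFIX"
else
    echo "Skipping alignment: STAR, the index, or the FASTQ file was not found." >&2
fi
```

Saved under a name such as star_align.sh (a hypothetical filename), this would be submitted with sbatch star_align.sh. With these options STAR writes a coordinate-sorted BAM file and a ReadsPerGene.out.tab table of raw counts, both prefixed with the output prefix.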

Non-alignment Based Strategies for Differential Gene Expression

We used salmon to perform transcript quantification of our RNA-seq data without prior alignment to a reference genome.

The major advantage over STAR (or other aligners) is speed. salmon and a similar tool called kallisto do not perform a full base-by-base alignment to the genome. Instead, they match reads directly to the transcripts they are compatible with, which drastically speeds up the process of quantifying your RNA-seq data.

The other huge advantage is how little processing power transcript quantification with salmon or kallisto requires. Both programs run comfortably on a laptop or desktop and do not need anything more than that!
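As a rough illustration, quantification with salmon takes two steps: index the transcriptome once, then quantify each sample against it. The paths, thread count, and the choice of single-end reads below are assumptions made for the sketch, not the workshop's actual settings. The commands only run if salmon and the input files are present.

```shell
# Hypothetical paths; salmon indexes a transcriptome FASTA, not the genome.
TRANSCRIPTS_FASTA="sacCer3/transcripts.fa"
SALMON_INDEX="salmon_index/sacCer3"
FASTQ="fastq/sample1.fastq.gz"
QUANT_DIR="salmon_quant/sample1"
THREADS=4

if command -v salmon >/dev/null 2>&1 && [ -f "$TRANSCRIPTS_FASTA" ] && [ -f "$FASTQ" ]; then
    # Build the transcriptome index (only needed once).
    salmon index -t "$TRANSCRIPTS_FASTA" -i "$SALMON_INDEX"
    # Quantify one single-end sample; -l A asks salmon to infer the library type.
    salmon quant -i "$SALMON_INDEX" -l A -r "$FASTQ" -p "$THREADS" -o "$QUANT_DIR"
else
    echo "Skipping quantification: salmon or the input files were not found." >&2
fi
```

The per-transcript estimates end up in a quant.sf file inside the output directory. For paired-end data, the -r option would be replaced with -1 and -2 for the two read files.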

BIO681 - Seminar in MCDN - Reading List

25 Jan 2019 by sammons

Information for ABIO681 - Spring 2019

Syllabus

Reading due 02/05: Casadevall and Fang 2009

Presentation Date	Paper	Presenters
12-Feb	Fields and Song 1989	McCauley/Altreith
19-Feb	Wang 1993 and Wilson 1991	Lin/Catizone
26-Feb	DeRisi et al 1996	Koslow/McCarthy
5-Mar	Burns et al 1994 and Giaever et al 2002	O’Keefe/Martin
12-Mar	Krogan et al 2006	Soyer/Sammons
26-Mar	Bentley et al 2008	Durham/Naik
9-Apr	Ren et al 2000 and Johnson et al 2007	Waldern/Moskwa

Reading due 04/30 #1: Casadevall and Fang 2014
Reading due 04/30 #2: Berg 2016

Presentation Date	Paper	Authors	Reviewers
7-May	Lee et al 2019	McCauley/Catizone	Koslow/Lin

Web Resources

23 Dec 2017 by sammons

Links I really like

UCSC Genome Browser: Ubiquitous and fantastically data rich, UCSC Genome Browser is my go-to.
WashU Epigenome Browser: Excellent 3D visualization tools as well as host to a number of consortium data hubs
Cancer Cell Line Encyclopedia: Search for your favorite genes and compare expression across multiple cancer cell lines. Very well done.
HOMER: One of the best annotated software suites, this is still my favorite semi-integrated next-gen sequencing analysis software. Bonus for being great at its original function: DNA motif analysis
Bedtools: Indispensable tool for analysis of genomic intervals/peak locations.
Homebrew: Easiest way to install and update command line software tools on Mac.
Homebrew Formulas: Nice list of software that can be installed/managed by homebrew.
Gene Expression Omnibus: NIH-funded repository for genomics data of all types
Sequence Read Archive: NIH-funded repository for next-generation sequencing reads.
deeptools: Very nice suite of tools for some ChIP-seq/RNA-seq applications; many uses!

DESeq2: R-based package for differential gene expression analysis using raw read counts.

STAR Aligner: Super fast and accurate aligner. Requires a lot of RAM for mammalian genomes, but the speed can’t be beat.
Bowtie2: My favorite short read aligner (mainly because I know how to use it).
HOCOMOCO: Nice database of DNA-binding protein motifs
Firebrowse: Integration of (primarily) the Cancer Genome Atlas (TCGA) data
cBioPortal: Integration of TCGA and many other datasets
EMBOSS: Software suite that performs numerous DNA/RNA sequence manipulations
JASPAR: Transcription factor binding motif database
Software Carpentry: Learn how to code (I need to do this)

Albany NGS Bootcamp

21 Apr 2017 by sammons

Intro to Next-Gen Sequencing: A completely non-comprehensive introduction to next-gen sequencing. Primarily designed as part of a larger bootcamp held @ UAlbany
Getting Started with Next-Gen Sequencing: Tips and tricks for getting started in sequencing at UAlbany
Applications: ATAC-seq: Data I presented that turned into this paper with the Wherry lab @ the University of Pennsylvania