Summary and Schedule
RNA sequencing (RNA-seq) has transformed genomics by enabling researchers to study gene expression, transcriptome dynamics, and molecular pathways. Bioconductor is an open-source software project that provides a rich set of tools for analyzing high-throughput genomic data, including RNA-seq data. This Carpentries-style workshop equips participants with the essential skills and knowledge needed to analyze RNA-seq data using the Bioconductor ecosystem. It covers data preprocessing, quality control, differential gene expression analysis, visualization of results, and gene set analysis.
Callout
This workshop has been adapted from the original RNA-seq analysis with Bioconductor lesson and customised for delivery on NeSI as part of the Otago Bioinformatics Spring School.
Prerequisites
- Familiarity with R/Bioconductor, such as that provided by the Introduction to data analysis with R and Bioconductor lesson.
- Familiarity with statistical hypothesis testing, such as that covered in Chapter 6 of the book Modern Statistics for Modern Biology by Holmes and Huber.
- Familiarity with the biology of gene expression and RNA-seq, such as that covered in the review RNA sequencing: the teenage years by Stark et al.
Setup Instructions | Download files required for the lesson |
Duration: 00h 00m | 1. Introduction to RNA-seq | What are the different choices to consider when planning an RNA-seq experiment? How does one process the raw FASTQ files to generate a table with read counts per gene and sample? Where does one find information about annotated genes for a given organism? What are the typical steps in an RNA-seq analysis? |
Duration: 01h 40m | 2. RStudio Project and Experimental Data | How do you use an RStudio project to manage your analysis project? What is the most effective way to organize directories for an analysis project? How do you download a dataset from the internet and save it as a file? |
Duration: 02h 10m | 3. Importing and annotating quantified data into R | How can one import quantified gene expression data into an object suitable for downstream statistical analysis in R? What types of gene identifiers are typically used, and how are mappings between them done? |
Duration: 04h 10m | 4. Exploratory analysis and quality control | Why is exploratory analysis an essential part of an RNA-seq analysis? How should one preprocess the raw count matrix for exploratory analysis? Are two dimensions sufficient to represent your data? |
Duration: 07h 10m | 5. Differential expression analysis | What are the steps performed in a typical differential expression analysis? How does one interpret the output of DESeq2? |
Duration: 08h 55m | 6. Extra exploration of design matrices | How can one translate biological questions and comparisons to statistical terms suitable for use with RNA-seq analysis packages? |
Duration: 09h 55m | 7. Gene set enrichment analysis | What is the aim of performing gene set enrichment analysis? What is the method of over-representation analysis? What are the commonly used gene set databases? |
Duration: 11h 40m | 8. Next steps | How can one go further from here? What other types of analyses can be done with RNA-seq data? |
Duration: 12h 00m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Ensure that you have the most recent versions of R and RStudio installed on your computer. For detailed instructions on how to do this, you can refer to the section “If you already have R and RStudio installed” in the Introduction to R episode of the Introduction to data analysis with R and Bioconductor lesson.
You will also need to install the following packages, which are used throughout the lesson.
R
# BiocManager installs Bioconductor and CRAN packages; remotes lets it
# install the GitHub-hosted package (csoneson/ConfoundingExplorer).
install.packages(c("BiocManager", "remotes"))

# Install every package used in the lesson in a single call.
BiocManager::install(c("tidyverse", "SummarizedExperiment",
                       "ExploreModelMatrix", "AnnotationDbi", "org.Hs.eg.db",
                       "org.Mm.eg.db", "csoneson/ConfoundingExplorer",
                       "DESeq2", "vsn", "ComplexHeatmap", "hgu95av2.db",
                       "RColorBrewer", "hexbin", "cowplot", "iSEE",
                       "clusterProfiler", "enrichplot", "kableExtra",
                       "msigdbr", "gplots", "ggplot2", "simplifyEnrichment",
                       "apeglm", "microbenchmark", "Biostrings",
                       "SingleCellExperiment"))
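After running the install command above, it can be useful to confirm that nothing failed silently. The following is a minimal sketch of one way to do that in base R; it is not part of the original lesson, and the variable names `required_pkgs` and `missing_pkgs` are illustrative. It assumes the GitHub package installs under the name `ConfoundingExplorer`.

```r
# Packages named in the install command above. ConfoundingExplorer is
# installed from GitHub as "csoneson/ConfoundingExplorer", but is assumed
# here to appear in the library simply as "ConfoundingExplorer".
required_pkgs <- c(
  "tidyverse", "SummarizedExperiment", "ExploreModelMatrix",
  "AnnotationDbi", "org.Hs.eg.db", "org.Mm.eg.db", "ConfoundingExplorer",
  "DESeq2", "vsn", "ComplexHeatmap", "hgu95av2.db", "RColorBrewer",
  "hexbin", "cowplot", "iSEE", "clusterProfiler", "enrichplot",
  "kableExtra", "msigdbr", "gplots", "ggplot2", "simplifyEnrichment",
  "apeglm", "microbenchmark", "Biostrings", "SingleCellExperiment"
)

# Compare the required names with the packages found in the current
# R library paths.
installed_pkgs <- rownames(installed.packages())
missing_pkgs <- setdiff(required_pkgs, installed_pkgs)

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If any names are reported as missing, re-running `BiocManager::install()` on just those packages (using the `csoneson/ConfoundingExplorer` form for the GitHub package) is usually the quickest way to see the underlying error message.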
If you are attending a workshop, please complete all of the above before the workshop. Should you need help, an instructor will be available 30 minutes before the workshop commences to assist.