Setup – Nanopore Sequencing

Compute platform

This workshop is designed to be conducted on the NeSI compute infrastructure. All software and data are already set up for you to use during the workshop.

Software used

If you are attempting to work through this material on a non-NeSI compute system, the following software will need to be installed:

Software	Version	Manual	Description
BCFtools	1.16	link	Variant calling
bioawk	20110810	link	awk for biological data
FastQC	0.12.1	link	Read QA/QC
dorado	6.2.1	link	GPU basecaller for ONT POD5/FAST5 data
Nanoplot	1.41.0	link	QA/QC for ONT reads
SAMtools	1.16.1	link	Read mapping

Data availability

The data utilised in this workshop are publicly available, but are reasonably old (2016), so are not a particularly good demonstration of the quality or volume of data that the current ONT sequencing devices are able to produce.

See the following link for a description of the data (and a rather nostalgic look at the excitement associated with ONT’s “new” version 9 chemistry):

http://lab.loman.net/2016/07/30/nanopore-r9-data-release/

The specific data used for this workshop can be downloaded here (note that this URL is also provided within the post at link above), although to reduce file sizes, only a subset of reads were used:

https://s3.climb.ac.uk/nanopore/E_coli_K12_1D_R9.2_SpotON_2.tgz

Also worth noting - the original data format was one fast5 file per read, whereas the current fast5 format used by ONT is multiple reads per file. The command single_to_multi_fast5 from the ont_fast5_api package:

https://github.com/nanoporetech/ont_fast5_api

was used here to convert the single-read fast5 data to multi-read fast5.

The fast5 files were then coverted to pod5 format via ONT’s pod5 tool:

https://github.com/nanoporetech/pod5-file-format

Nanopore Sequencing: Setup

Compute platform

Software used

Data availability