| Lesson | Overview |
|---|---|
| 1. Download and verify data | Download data with wget/curl and check the integrity of the transferred data with checksums |
| 2. Streams, Redirection and Pipe | Combining pipes and redirection. Using exit statuses |
| 3. Inspecting and Manipulating Text Data with UNIX Tools - Part 1 | Inspect files with utilities such as head and less. Extract and format tabular data. Magical grep. |
| 4. Inspecting and Manipulating Text Data with UNIX Tools - Part 2 | Substitute matching patterns with sed. Text processing with awk and bioawk |
| 5. Automating File-Processing with find and xargs | Search files by pattern with find and use xargs to execute a command for those objects matching the pattern |
| 6. Puzzles 🧩 | Can you use shell scripts to solve these "real-life" challenges in molecular biology? |
| 7. Supplementary - 1 | Recap - Unix, Linux and the Unix shell |
| 8. Supplementary - 2 | Recap - Shell basics and commands |
| 9. Supplementary - 3 | Escaping, Special Characters |
Attribution Notice
- This workshop material is heavily inspired by:
  - Buffalo, V. (2015). Bioinformatics Data Skills. O'Reilly Media, Inc.
  - The Carpentries. The Unix Shell. https://swcarpentry.github.io/shell-novice/
  - The Carpentries. Introduction to Command Line for Genomics. https://datacarpentry.org/shell-genomics/
  - Rosalind Project. https://rosalind.info/about/
License
Genomics Aotearoa / New Zealand eScience Infrastructure "Intermediate Shell for Bioinformatics" is licensed under the GNU General Public License v3.0, 29 June 2007. (Follow this link for more information)
Setup
- If possible, we recommend using the Remote option over Local (especially for Windows hosts). This eliminates the need to install any additional applications.
- The Remote option requires an existing NeSI account.
Remote
Log into NeSI Mahuika Jupyter Service
- Follow https://jupyter.nesi.org.nz/hub/login
Enter your NeSI username, HPC password and 6-digit second-factor token.

Choose server options as below. Make sure to choose the correct project code (nesi02659), number of CPUs (CPUs=2) and memory (4 GB) before pressing the button.
Local
Local host setup - Windows, MacOS & Linux
- Windows: install either
  - Git for Windows from https://git-scm.com/download/win OR
  - MobaXterm Home (Portable or Installer edition) from https://mobaxterm.mobatek.net/download-home-edition.html
    - The Portable edition does not require administrative privileges
- MacOS: the native terminal client is sufficient.
  - It might not come with wget, which downloads data from the command line (it can be installed with `$ brew install wget`). However, wget is not required, as we provide a direct link to download the data in .zip format.
- Linux: the native terminal client is sufficient.
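Before starting, it can help to confirm which command-line downloader a local terminal already provides. The following is a minimal sketch, not part of the workshop material: `pick_downloader` is a hypothetical helper name, and the script only reports what it finds without installing or downloading anything.

```shell
#!/usr/bin/env bash
# Sketch: report which command-line downloader is available on this host.
# pick_downloader is a hypothetical helper name used for illustration only.

pick_downloader() {
    # `command -v` prints the path of a command if it is on PATH
    if command -v wget >/dev/null 2>&1; then
        echo "wget"
    elif command -v curl >/dev/null 2>&1; then
        echo "curl"
    else
        echo "none"
    fi
}

downloader=$(pick_downloader)

case "$downloader" in
    wget|curl)
        echo "Found downloader: $downloader"
        ;;
    none)
        echo "Neither wget nor curl found; use the direct .zip download link instead"
        ;;
esac
```

If the script reports `none`, nothing further needs to be installed, since the data is also available through the direct .zip link.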
Install bioawk on all hosts
One of the tools used in this workshop is bioawk, which is not a native Linux/UNIX utility. It can be installed on MacOS with `$ brew install bioawk` and on Linux with `$ sudo apt install bioawk`. Windows hosts might have to install it via conda according to these instructions. However, this requires a prior install of Anaconda or Miniconda.
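To check whether bioawk is already on the `PATH`, something along these lines may be useful. It is a sketch under stated assumptions: `have_tool` and `install_hint` are hypothetical helper names, the Linux hint assumes an apt-based distribution as in the text above, and any other platform is simply pointed back to the conda route.

```shell
#!/usr/bin/env bash
# Sketch: check for bioawk and print the install command suggested above.
# have_tool and install_hint are hypothetical helper names.

have_tool() {
    # Succeeds (exit status 0) if the named command is on PATH
    command -v "$1" >/dev/null 2>&1
}

install_hint() {
    # $1 is the kernel name as reported by `uname -s`
    case "$1" in
        Darwin) echo "brew install bioawk" ;;
        Linux)  echo "sudo apt install bioawk" ;;  # assumes an apt-based distribution
        *)      echo "install via conda (requires Anaconda or Miniconda)" ;;
    esac
}

if have_tool bioawk; then
    echo "bioawk found at: $(command -v bioawk)"
else
    echo "bioawk not found; try: $(install_hint "$(uname -s)")"
fi
```

The script only prints a suggestion and never runs an installer itself, so it is safe to run on any host.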

