02 - Setup
Running on NeSI vs on your computer
During this workshop we will run the material on the NeSI platform, using the Jupyter interface. However, it is also possible to run the material locally on your own machine.
One of the differences between running on NeSI and running on your own machine is that on NeSI we preinstall popular software and make it available to our users, whereas on your own machine you need to install the software yourself (e.g. using a package manager such as conda).
We have provided a guide for setting up your own machine using conda here (note that we will not be able to provide assistance during the workshop if you decide to take this approach).
Connect to Jupyter on NeSI
- Connect to https://jupyter.nesi.org.nz
- Enter your NeSI username, HPC password and 6-digit second factor token (as set on MyNeSI)
- Choose the server options as below: make sure to choose the correct project code (nesi02659), 4 CPUs and 4GB of memory before pressing the button.
Create a working directory
When you connect to NeSI JupyterLab you always start in a new hidden directory. To make sure you can find your work next time, you should change to another location. Here we will switch to our project directory, since home directories can run out of space quickly. If you are using your own project, use its code instead of "nesi02659".
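A minimal sketch of this step, assuming project directories live under /nesi/project/<project code> and that you keep your work in a subdirectory named after your username (both the path layout and the per-user subdirectory are assumptions; adjust them to your own project):

```shell
# Assumed layout: /nesi/project/<project code>/<username>
PROJECT_CODE="nesi02659"   # replace with your own project code if you have one
WORKDIR="/nesi/project/${PROJECT_CODE}/${USER}"

# Create the directory if it does not exist yet, then switch to it.
mkdir -p "$WORKDIR" && cd "$WORKDIR"

# Print the current location to confirm where you ended up.
pwd
```

Keeping the path in a variable makes it easy to reuse later in the same terminal session, for example with cd "$WORKDIR".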
You can also navigate to the above directory in the JupyterLab file browser, which can be useful for editing files and viewing images and HTML documents.
Load the Snakemake module
We use "environment modules" on NeSI to manage installed software. This allows you to pick and choose which software is available in your environment. More details about environment modules can be found on the NeSI support page.
The JupyterLab terminal comes with some modules preloaded, and it is often nicer to start with a clean environment, which you can get by running module purge.
We can search for available Snakemake modules using the module spider command, which shows that we have many versions of snakemake installed. Now load a specific version of snakemake into your environment:
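A minimal sketch of the clean-up, search and load steps follows. The full module name is an assumption (only the version, 7.6.2, is given here), so copy the exact name from the module spider output. The guard simply skips the commands on machines that have no module command, such as your own computer.

```shell
# Assumed full module name; replace it with the exact one listed by `module spider`.
SNAKEMAKE_MODULE="snakemake/7.6.2-gimkl-2020a-Python-3.9.9"

if command -v module >/dev/null 2>&1; then
    module purge                      # start from a clean environment
    module spider snakemake           # list the installed Snakemake versions
    module load "$SNAKEMAKE_MODULE"   # load one specific version
else
    echo "No 'module' command found; this step only applies on NeSI." >&2
fi
```

Loading a fully specified version, rather than whatever the default happens to be, keeps the environment the same from one session to the next.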
Test that the snakemake command is now available by running snakemake --version.
It should print out the version of snakemake, i.e. "7.6.2".
You can also run module list to see the list of modules that are currently loaded.
Get the data
We'll use the data from yesterday's DNA variant calling workshop.
Initialise Git (optional but recommended)
As we're going to be incrementally developing our scripts, we can also take the opportunity to place them under version control from the start. Remember to tell Git to ignore the data directory.
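A minimal sketch of that setup, assuming the data sits in a directory named data/ inside your working directory (the directory name is an assumption; use whatever name your copy of the data has):

```shell
# Start tracking the current directory with Git.
git init

# Hypothetical data directory name: adjust "data/" to match your own layout.
echo "data/" >> .gitignore

# Show what Git sees; files under the ignored directory should not be listed.
git status --short
```

Adding the ignore rule before the first commit means the data files never enter the repository history.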
Once you create your scripts and get them to work, remember to add and commit them so that you have snapshots to fall back to if needed.
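The add-and-commit cycle can be tried safely with the self-contained sketch below, which works in a throwaway directory. The file name Snakefile, the commit message and the user details are placeholders chosen for illustration; in your own working directory you would only need the git add and git commit lines.

```shell
# Work in a temporary directory so nothing in your real project is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q

# Placeholder identity, set for this demo repository only.
git config user.name "Workshop User"
git config user.email "user@example.com"

# Hypothetical script file standing in for your own workflow files.
echo "# workflow rules go here" > Snakefile

# Stage the file, then record a snapshot you can return to later.
git add Snakefile
git commit -q -m "Add first working version of the workflow"

# List the snapshots recorded so far.
git log --oneline
```

Committing each time a script reaches a working state gives you a series of known-good versions to compare against or restore.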