Windows | Setting up the command-line for bioinformatics

Updated: Apr 27



Contents

1. What is WSL/ Ubuntu and how to install them

2. Installing bioinformatics tools with APT, Bioconda or source

3. Installing Docker (and changing the place where images are stored)


1. What is WSL/ Ubuntu and how to install them

Most bioinformatics tools run on the command-line terminal of a Linux distribution.


If your OS is Microsoft Windows, no worries: you can run Linux through Windows Subsystem for Linux (WSL).


There are many distributions of Linux, and the most popular one now is Ubuntu.

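Ubuntu 20.04 Long-Term Support (LTS), code-named Focal Fossa, is free to use. LTS versions are released every two years, and 20.04 receives security updates until 2030 (the later years through Canonical's Extended Security Maintenance).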


You can download and install both WSL and Ubuntu 20.04 by running Windows PowerShell as Administrator and typing in this:

wsl --install -d Ubuntu-20.04

Restart your computer once you're prompted.


Now go back to your Start menu and type in "Ubuntu"; you should see it appear.

Open it and you should see a terminal running on Linux. To talk to the terminal, you type shell commands.


I highly recommend Codecademy for learning the first steps of the command line. You only need to invest a couple of dollars and a few hours, but it'll save you tons of time in return.
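To give a feel for those first steps, here is a minimal sketch of a few everyday shell commands. The names `demo_project` and `notes.txt` are made up for illustration, and everything happens in a temporary directory so nothing in your home folder is touched.

```shell
# Work in a scratch directory so nothing in your home folder is touched
workdir="$(mktemp -d)"
cd "$workdir"

# Make a new directory (hypothetical name) and move into it
mkdir -p demo_project
cd demo_project

# Write a line of text into a file, then read it back
echo "my first file" > notes.txt
first_line="$(cat notes.txt)"
echo "$first_line"

# List the contents of the current directory
ls -l
```

Almost everything later in this post is built from the same handful of commands: making folders, moving into them, and running a program by name.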


2. Installing bioinformatics tools with APT, Bioconda or source

Traditionally, with the Ubuntu terminal, you can already install and run bioinformatics packages.


There are many ways to install a package, but here are the common ones, using the installation of bwa as an example.


1. From Advanced Package Tool (APT)

# Refreshes repository index
sudo apt update

# Install bwa
sudo apt install -y bwa

2. From Bioconda

# Install conda
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# Set up channels (ORDER IS IMPORTANT)
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# Install bwa
conda install bwa

3. From the source in GitHub

# Make a directory (i.e. folder) you want bwa to be installed in
# Here I make a new folder in C drive called my_biosoftware
mkdir -p /mnt/c/my_biosoftware

# Enter the directory
cd /mnt/c/my_biosoftware

# Download the source code from GitHub
git clone https://github.com/lh3/bwa.git

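# Install the build tools and zlib headers that bwa needs, then compile it
sudo apt install -y build-essential zlib1g-dev
cd bwa; make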

## Now, I make the bwa command executable from any directory
# Open up the text editor for my profile
nano ~/.profile
# Add to the last line
export PATH="$PATH:/mnt/c/my_biosoftware/bwa"
# Press Ctrl + S to save, then Ctrl + X to exit

# Execute the contents of profile
source ~/.profile

Now you can run bwa from the terminal. Here I do a little trial to show it works.

bwa index
Usage:   bwa index [options] <in.fasta>

Options: -a STR    BWT construction algorithm: bwtsw, is or rb2 [auto]
         -p STR    prefix of the index [same as fasta name]
         -b INT    block size for the bwtsw algorithm (effective with -a bwtsw) [10000000]
         -6        index files named as <in.fasta>.64.* instead of <in.fasta>.*

Warning: `-a bwtsw' does not work for short genomes, while `-a is' and
         `-a div' do not work not for long genomes.
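If you get "command not found" instead, the PATH entry is the usual suspect. The sketch below shows how the shell finds a program through PATH. Note that `mytool` is a hypothetical stand-in script created in a temporary directory, not part of bwa.

```shell
# Create a throwaway directory holding a tiny executable script
# (hypothetical stand-in for a compiled tool such as bwa)
tooldir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from mytool"\n' > "$tooldir/mytool"
chmod +x "$tooldir/mytool"

# Append the directory to PATH for this shell session only
export PATH="$PATH:$tooldir"

# command -v prints the full path the shell resolves the name to
resolved="$(command -v mytool)"
echo "$resolved"

# The tool can now be run by name from any directory
cd /
output="$(mytool)"
echo "$output"
```

For the real thing, `command -v bwa` should print the path of the bwa executable inside the folder you cloned it into. If it prints nothing, check the line you added to `~/.profile`.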

Unfortunately, there are some big drawbacks to installing and running apps with the above three methods.

  1. The APT or Bioconda repository might give you an outdated version of the app.

  2. If you install from source, you might need to install the software required to run the app correctly (i.e. the dependencies).

  3. If you have multiple apps, each of them requiring different versions of the same dependencies, you'll need to create separate virtual environments.
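For the first drawback, you can check which version each repository offers before installing anything. This is a rough sketch: the conda check needs internet access, and each check is skipped if `apt-cache` or `conda` is not installed.

```shell
tool="bwa"
apt_report=""
conda_report=""

# APT: the "Candidate" line is the version `sudo apt install` would fetch
if command -v apt-cache >/dev/null 2>&1; then
    apt_report="$(apt-cache policy "$tool" 2>/dev/null | grep 'Candidate:' || true)"
fi
apt_line="APT: ${apt_report:-nothing found (is apt-cache installed?)}"
echo "$apt_line"

# Bioconda: the last line of the listing is normally the newest build
if command -v conda >/dev/null 2>&1; then
    conda_report="$(conda search -c bioconda "$tool" 2>/dev/null | tail -n 1 || true)"
fi
conda_line="Bioconda: ${conda_report:-nothing found (is conda installed?)}"
echo "$conda_line"
```

You can then compare both against the latest release listed on the tool's GitHub page.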


Let's create a virtual environment using Conda to install and run bwa:

# First you need to install conda and set up the channels 
# (see "2. From Bioconda" above)

# Create a virtual environment called bwa_env
conda create --name bwa_env

# Go into the virtual environment
conda activate bwa_env

# Install bwa
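# (installing with conda keeps bwa inside this environment;
# APT installs system-wide regardless of the active environment)
conda install bwa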

# Deactivate the virtual environment
conda deactivate

# Now, whenever you want to run bwa, do
conda activate bwa_env
bwa index 
conda deactivate
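Once the environment exists, a few more conda commands are handy for inspecting and sharing it. This is a sketch that assumes the environment is called `bwa_env` as above; the output file name `bwa_env.yml` is just an example name.

```shell
env_name="bwa_env"

if ! command -v conda >/dev/null 2>&1; then
    status="conda is not installed yet"
elif ! conda env list | grep -q "^${env_name} "; then
    status="environment ${env_name} has not been created yet"
else
    # List the packages and exact versions installed in the environment
    conda list --name "$env_name"

    # Save the environment to a file; someone else can rebuild it with
    #   conda env create --file bwa_env.yml
    conda env export --name "$env_name" > "${env_name}.yml"
    status="exported ${env_name} to ${env_name}.yml"
fi
echo "$status"
```

Keeping the exported file next to your analysis makes it easier to recreate the same tool versions later or on another machine.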

3. Installing Docker (and changing the place where images are stored)

As an alternative to installing apps directly on your command line, you can use Docker. I explained Docker in a separate post.


Docker is incredibly useful if you cannot get apps to run. Docker Desktop's Hyper-V backend is not available on Windows Home edition; I managed to buy a cheap key online from Lazada to upgrade my Windows, and you can try that too. The WSL 2 backend used in the steps below does not need Hyper-V. To install Docker, first install Windows Subsystem for Linux (WSL), then install Docker. A short summary of the steps:


1. Install WSL 2

# Run PowerShell as administrator
wsl --install

# Restart your computer and open PowerShell again
wsl --set-version Ubuntu-20.04 2

# Install Docker on Windows: https://docs.docker.com/desktop/windows/install/

2. Open Docker Desktop. Go to Settings > Resources > WSL Integration and turn on the switch for Ubuntu-20.04.


3. Open Ubuntu and test if Docker is working

docker --version
docker run hello-world
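Once `hello-world` works, running a bioinformatics tool follows the same pattern. Here is a hedged sketch for bwa: the image name `biocontainers/bwa:v0.7.17_cv1` is an assumption (check Docker Hub for the tag you want), and `ref.fasta` is a hypothetical FASTA file in your current directory.

```shell
# Assumed image name and tag; verify it exists before relying on it
image="biocontainers/bwa:v0.7.17_cv1"

if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    # --rm  removes the container once bwa exits
    # -v    mounts the current directory at /data inside the container
    # -w    makes /data the working directory inside the container
    if docker run --rm -v "$PWD":/data -w /data "$image" bwa index ref.fasta; then
        docker_status="bwa index finished"
    else
        docker_status="bwa index failed (is ref.fasta in this directory?)"
    fi
else
    docker_status="Docker is not reachable from this terminal"
fi
echo "$docker_status"
```

Because the current directory is mounted into the container, the index files that bwa writes end up next to `ref.fasta` on your own disk.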

4. Docker images can take up a lot of disk space! If you want to change the hard disk where Docker images are stored, quit Docker Desktop, then run PowerShell as administrator:

# First check if docker is still running
wsl --list -v

# You should see:
#  NAME                   STATE           VERSION
#* docker-desktop         Stopped         2
#  docker-desktop-data    Stopped         2

# Assuming you want Docker images to be installed in E:\Docker\wsl\data
# Run:
New-Item -Path E:\Docker\wsl\data -ItemType Directory
wsl --export docker-desktop-data "E:\Docker\wsl\data\docker-desktop-data.tar"
wsl --unregister docker-desktop-data
wsl --import docker-desktop-data E:\Docker\wsl\data\ "E:\Docker\wsl\data\docker-desktop-data.tar" --version 2

# Restart docker desktop
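After Docker Desktop restarts, you can confirm from the Ubuntu terminal that Docker is reachable again and see how much disk space it is using. A small sketch, which only prints a message if Docker is not reachable:

```shell
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    # List the images Docker currently stores
    docker image ls
    # Summarize the disk space Docker is using
    docker system df
    docker_check="Docker is reachable"
else
    docker_check="Docker is not reachable from this terminal"
fi
echo "$docker_check"
```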

Let me know if you have any questions or run into any issues.
