Tuesday, January 26, 2010

Bioinformatics for biologists: Using the terminal



One of the first things to learn in bioinformatics is how to use operating systems (OS) other than Windows, such as Linux and Mac OS. I currently dual-boot 64-bit Ubuntu with Windows Vista. What's the difference between 64-bit and ordinary 32-bit? A 64-bit OS lets programs use more than 4 GB of RAM. But don't bother buying 64-bit Windows, because most next-generation sequencing (NGS) tools are built and tested on 64-bit Linux. My advice is to get a free 64-bit Linux distro that suits you.

Also, most tools run on the command line instead of a graphical user interface (GUI). In fact, many bioinformaticians think it's a stupid idea to build a program with a GUI. Who needs a GUI? Biologists do.

The command line is run by a shell. In Ubuntu, you get to the shell (bash by default) through a program called Terminal. To open Terminal, click on the Applications menu -> Accessories -> Terminal.


Let's look at some of the basic operations I use daily:

cd
Change directory (for example, cd dirname). A directory is what Windows calls a folder.
cd ..
Go up one level, to the parent directory.
cd /
Go to the root directory.
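
Here is a small sketch of moving around with cd that you can paste into Terminal. It works inside a throwaway directory made with mktemp, and the names "demo_project" and "reads" are made up for illustration. It also uses pwd (print working directory) and "cd -" (jump back to the directory you were just in), which are not covered above.

```shell
# Work in a throwaway directory so nothing real is touched.
# "demo_project" and "reads" are made-up names for illustration.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo_project/reads"

cd "$workdir/demo_project/reads"   # go into a directory
pwd                                # print where you are now

cd ..                              # go up one level, to demo_project
pwd
parent_dir=$(basename "$PWD")      # keep the directory name: demo_project

cd -                               # jump back to the directory you were just in
cd /                               # go to the root directory
pwd                                # prints "/"
```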

ls
List the files and directories in the current directory.
ls *.pl
List all files and directories whose names end with ".pl". For example, fastq2fasta.pl will be listed.
ls fastq*
List all files and directories whose names start with "fastq".
Wildcards like these make it easy to find a file when you can remember only part of its name.
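
To see the wildcards in action without touching your own files, something like the following should work. The file names are invented for the example (only fastq2fasta.pl comes from the text above), and everything happens in a temporary directory.

```shell
# Set up a scratch directory with a few empty, made-up files.
workdir=$(mktemp -d)
cd "$workdir"
touch fastq2fasta.pl trim_reads.pl fastq_stats.txt notes.txt

ls          # lists all four files
ls *.pl     # lists fastq2fasta.pl and trim_reads.pl
ls fastq*   # lists fastq2fasta.pl and fastq_stats.txt
```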


rm
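Remove a file. To remove a directory and everything in it, use rm -r. Files removed this way do not go to the trash, so check the name before pressing ENTER.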
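
A minimal sketch of both forms, run in a temporary directory so it is safe to try. The names "scratch.txt" and "old_results" are made up.

```shell
# Create a scratch directory with one file and one sub-directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir old_results
touch old_results/a.txt scratch.txt

rm scratch.txt       # remove a single file
rm -r old_results    # remove a directory and everything inside it
ls                   # prints nothing: the scratch directory is empty again
```

If you are nervous, "rm -i" asks for confirmation before each removal.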

wc -l filename
Count the number of lines in a file. Useful for finding the number of sequences in a FASTA file when every sequence sits on a single line: # sequences = # lines divided by 2 (one header line plus one sequence line per record).
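
The arithmetic can be checked on a tiny made-up FASTA file. The sketch below also shows an alternative that is not in the list above: counting header lines with grep, which I would expect to be safer when sequences wrap over several lines.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A tiny made-up FASTA file: 3 sequences, each on exactly one line.
printf '>seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n' > example.fasta

lines=$(wc -l < example.fasta)   # 6 lines in total
echo $((lines / 2))              # 3, valid only for one-line sequences

# Count the header lines (those starting with ">") instead.
grep -c '^>' example.fasta       # 3
```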

split -l 20000 filename
Create smaller files of 20000 lines each (the last one holds whatever is left over). By default the pieces are named xaa, xab, xac and so on. Type "split --help" for more split options.
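
Here is a quick way to try split on a generated test file. The file name "big.txt" and the output prefix "part_" are my own choices for the example; seq is only used to produce 50000 numbered lines.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Make a 50000-line test file (seq prints the numbers 1 to 50000).
seq 1 50000 > big.txt

# Split into pieces of 20000 lines; "part_" is an optional output prefix.
split -l 20000 big.txt part_

ls part_*      # part_aa, part_ab and part_ac
wc -l part_*   # 20000, 20000 and 10000 lines, plus a total
```

Because FASTA records with one-line sequences take two lines each, an even line count like 20000 keeps every header together with its sequence.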

more filename
Read the content of a file from the beginning, one screen at a time. This is very useful when the file is very big (>100 MB), since opening it in a text editor can freeze the system. Press ENTER to read one more line, SPACE for the next screen, and 'q' to quit.

top
Display information about the system, running processes and RAM usage. To exit, press 'q'.

sudo
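Run a command as the root user (administrator). Any command starting with "sudo" asks for a password. On Ubuntu this is your own password, and your account must have administrator rights.
sudo rm -rf ~/.local/share/Trash/files/*
Empty the trash with administrative privileges. Be careful: rm -rf deletes without asking, so check the path before running it.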

When you need to run a long command in Terminal, you would rather paste it than type it. Note that the usual shortcuts are different in Terminal. To copy something from Terminal, highlight it and press Shift+Ctrl+C. To paste into Terminal, press Shift+Ctrl+V. Alternatively, right-click and choose the copy/paste option.

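When a command or application finishes running, you will see the prompt again: the current directory followed by a "$" sign (for example "~$" in your home directory). To stop a running application, press Ctrl+C. Ctrl+Z only suspends it, and it stays paused until you resume it with fg.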

Find out more at https://help.ubuntu.com/community/UsingTheTerminal

3 comments:

b@wee January 28, 2010 at 7:23 PM  

Speaking of Unix, who's using Dr. Adura's old Mac Pros now? The ones that used to be in the bioinformatics training room in MGI?

Melissa Wong January 29, 2010 at 5:22 PM  

Hehe.. no idea. Ask Alicia. Haven't seen her since I came back. I wonder how different Mac and Linux are. I hope MGI has Linux installed.

b@wee January 29, 2010 at 8:58 PM  

Usually they would have installation packages for Mac as well as Linux. Mac OS is based on Unix and thus works pretty much the same way Linux does on the command line. I haven't seen any Linux machines where I was working before, but they should have a few somewhere else.
