04. A quick look at BLAST

BLAST (Basic Local Alignment Search Tool) serves two purposes:

  1. Align two sequences and look for homology.
  2. Search a database with a query sequence to find similar and related sequences.

Without diving into too much detail about BLAST (which we will cover in a later series), let's perform a simple query to get a feel for how to use it.

There are several types of BLAST, depending on what your query sequence is (DNA or protein) and what you want to match it against. For this run, let's stick with blastp, in which you enter a protein sequence and it is matched against similar protein sequences in a database.
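To make the "query type versus database type" idea concrete, here is a minimal Python sketch of how the standard BLAST programs map onto those two choices. The helper name choose_program and the labels "nucleotide" and "protein" are made up for this illustration; they are not part of any NCBI tool.

```python
# Standard BLAST programs keyed by (query type, database type).
# tblastx (translated nucleotide query vs. translated nucleotide database)
# is left out because it shares the same key as blastn.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # the query is translated
    ("protein", "nucleotide"): "tblastn",  # the database is translated
}


def choose_program(query_type, database_type):
    """Return the BLAST program name for a query/database combination.

    Hypothetical helper for illustration only.
    """
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"no program for {query_type!r} vs {database_type!r}"
        ) from None


print(choose_program("protein", "protein"))
```

For the run in this lesson, both the query and the database are protein, which is why blastp is the one to pick.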


The first thing to do is to go to the NCBI page for BLAST. From here, click protein blast (blastp), which is located under "basic BLAST."

You should get a window that looks like this:

BLAST website for protein BLAST (blastp).

1) Running a query against a database

We can search an entire database with a query. The query can be entered as an accession number, a GI number (think of these as IDs for a specific protein sequence), or a sequence in FASTA format.

What is FASTA?

In FASTA format, the first line begins with a > and describes the sequence. Any following lines are the protein sequence itself. For example:

Try inputting the above FASTA sequence.

  • Do not check the "Align two or more sequences" option.
  • Select "non-redundant protein sequences (nr)" for the database.
  • For the organism name, use "human (taxid:9606)".
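If you'd like to see the FASTA layout described earlier from a program's point of view, here is a minimal Python sketch that splits FASTA text into a description and a sequence. The function name parse_fasta and the example record are placeholders invented for this illustration; the record is not the query sequence used in this lesson.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) tuples from FASTA text."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            # Sequence lines are joined into one string.
            chunks.append(line)
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Hypothetical record, for illustration only.
example = """>example_protein hypothetical placeholder record
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(example)
for description, sequence in records:
    print(description, len(sequence))
```

The line breaks inside the sequence carry no meaning, which is why the sketch simply joins them back together.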

You'll notice that there are different types of BLAST you can perform: PSI-BLAST, PHI-BLAST and DELTA-BLAST. We'll cover these advanced BLAST variations in a later lesson.

There is also another window down at the bottom for Algorithm parameters, where you can fiddle with the scoring matrix, different gap penalties and more. But for now, click the big BLAST button to run your sequence!

A quick BLAST run.

Wait for your query to be processed, and... great! You just ran a BLAST search! It looks like you just found the human ortholog of a mouse protein.

Scroll down to the Descriptions panel, where you can see all the matches that are similar to your query.

Results of the best-scoring matches will be on top.

You can scroll further down to see the actual alignments with the Identities and Similarities (called Positives) scores next to them.

Scroll down to see the top-scoring alignment with its identity and similarity (positives) scores.
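To get a feel for what the Identities number means, here is a minimal Python sketch that counts identical columns and gap columns in a pair of aligned sequences. The two aligned strings and the function name alignment_stats are invented for this illustration. The Positives count is left out on purpose, since it depends on a substitution matrix (BLOSUM62 by default for blastp) that this sketch does not include.

```python
def alignment_stats(aligned_query, aligned_subject, gap="-"):
    """Count identical columns and gap columns in a pairwise alignment.

    Both strings must already be aligned, i.e. have the same length.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    identities = 0
    gaps = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == gap or s == gap:
            gaps += 1          # a column where one sequence has a gap
        elif q == s:
            identities += 1    # a column where both residues are the same
    return identities, gaps, len(aligned_query)


# Hypothetical aligned sequences, for illustration only.
identities, gaps, length = alignment_stats("MKT-AYIAKQ", "MKTLAYLAKQ")
percent_identity = 100 * identities / length
print(f"Identities = {identities}/{length} ({percent_identity:.0f}%)")
print(f"Gaps = {gaps}/{length}")
```

Note that the percentage is taken over the full alignment length, gap columns included, which is how the fraction next to Identities is laid out in the results.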

2) Running a pairwise comparison

The other use of BLAST is for pairwise comparisons. This means you aren't querying a database, but just inputting two sequences and seeing how well they match up. To switch to pairwise comparison mode, click the "Align two or more sequences" option.

For the two sequences here, let's use gi|293651548 and gi|158256336.

A simple pairwise alignment of two proteins, given by their GIs.

Click the big BLAST button once again and wait for your query to be processed. Then scroll down and check your results.

In the Descriptions section there is just one alignment... but why are there multiple in the Alignments section? This is simply because BLAST looks for local alignments, so there can be several ways to align regions of your two sequences. The top-scoring alignments are at the top, while lower-scoring ones are at the bottom. For the most part, you'll want to look at the topmost result.

Results for a pairwise alignment run.
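The "best alignment on top" ordering is easy to picture in code. The minimal Python sketch below ranks a few alignments between the same pair of sequences by score and picks the first one. The Alignment record, its field names, and all the numbers are made up for this illustration; they are not taken from the run above.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one alignment between the two sequences."""
    bit_score: float
    query_start: int
    query_end: int


# Made-up alignments, for illustration only.
alignments = [
    Alignment(bit_score=35.4, query_start=120, query_end=160),
    Alignment(bit_score=512.0, query_start=1, query_end=300),
    Alignment(bit_score=48.1, query_start=310, query_end=350),
]

# Highest score first, mirroring the order of the Alignments section.
ranked = sorted(alignments, key=lambda a: a.bit_score, reverse=True)
best = ranked[0]
print(best)
```

In this made-up case the top result covers most of the query, while the lower-scoring ones are short matches elsewhere in the sequence.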

Wondering how the scoring system works? We'll see that in the next few pages!
