06. Handling Gap penalties

We have already discussed amino acid substitutions, but what about the other two types of mutations, insertions and deletions? Together, insertions and deletions are known as gaps, and the way they are scored varies.

Gap penalties for local vs. global alignment

How you handle gap penalties depends on what you're looking for. If you are searching for similar domains, you shouldn't penalize much for gaps that occur at the ends of your sequences (local alignment). However, if you need a more exact match over the whole length (global alignment), then you would probably want to penalize any sequences that have long end gaps.
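As a rough illustration of the end-gap idea, the sketch below counts the gap residues in one row of an alignment, with or without the gaps at the two ends. The function name `count_gap_residues`, the flag `penalize_end_gaps`, and the example sequence are hypothetical, and this is only a simplified picture of the difference, not an alignment algorithm.

```python
def count_gap_residues(aligned_seq: str, penalize_end_gaps: bool) -> int:
    """Count '-' characters in one aligned row.

    When penalize_end_gaps is False, leading and trailing gaps are
    ignored, which mimics being lenient about the sequence ends.
    """
    if not penalize_end_gaps:
        aligned_seq = aligned_seq.strip("-")
    return aligned_seq.count("-")


row = "--ABC-DEF---"  # made-up aligned row with end gaps and one internal gap
print(count_gap_residues(row, penalize_end_gaps=True))   # 6
print(count_gap_residues(row, penalize_end_gaps=False))  # 1
```

With end gaps ignored, only the single internal gap residue would be charged.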

Opening and Extension gap penalties

Gap penalties may be broken down into two parts:

  1. An opening of a gap.
  2. The extension of a gap.

An opening gap penalty is applied whenever a gap appears. For gaps that are longer than one residue, we can also apply an extension gap penalty, which is charged for each residue that lengthens the gap.

With these two components, there are several ways to score a gap.

Types of gap penalties

Constant

This is when there are no separate opening gap penalties, and a fixed negative score is given to every gap, regardless of its length.

AB--C----DEF
ABGHCFGHIDEF

Here, we'd have a score of -2n, where n is the penalty per gap: there are two gaps, and each costs n regardless of its length.
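A minimal sketch of the constant scheme, assuming a gap is any unbroken run of `-` characters in the aligned sequence. The function name `constant_gap_penalty` and the value n = 5 are made up for illustration.

```python
import re


def constant_gap_penalty(aligned_seq: str, n: float) -> float:
    """Charge a flat penalty n for every gap, whatever its length."""
    gap_runs = re.findall(r"-+", aligned_seq)  # each run of dashes is one gap
    return -n * len(gap_runs)


# Two gaps ("--" and "----"), so the score is -2n.
print(constant_gap_penalty("AB--C----DEF", n=5))  # -10
```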

Linear

This depends on the gap length, so a penalty score is assigned for every gap residue.

AB--C----DEF
ABGHCFGHIDEF

Here, we'd have a score of -6l, where l is the penalty per gap residue: the two gaps span 2 + 4 = 6 residues in total.
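The linear scheme only needs the total number of gap residues, so a sketch can simply count the `-` characters. The function name `linear_gap_penalty`, the parameter name `per_residue` (standing in for l), and the value 2 are illustrative assumptions.

```python
def linear_gap_penalty(aligned_seq: str, per_residue: float) -> float:
    """Charge per_residue for every gap character, with no opening cost."""
    return -per_residue * aligned_seq.count("-")


# Six gap residues in total (2 + 4), so the score is -6l.
print(linear_gap_penalty("AB--C----DEF", per_residue=2))  # -12
```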

Affine

The affine type takes into account both the gap opening penalty and the gap length. This means that on top of the linear, per-residue penalty, an extra penalty is added each time a gap is opened.

AB--C----DEF
ABGHCFGHIDEF

Here, we would have a total score of -2n - 6l: two gap openings at n each, plus six gap residues at l each, where n is the penalty per gap opening and l is the penalty per residue in each gap.
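The affine calculation can be sketched as follows. It follows the convention used in the text, where every gap residue is charged the per-residue penalty in addition to the opening penalty; some tools instead charge the extension only from the second residue onward, so treat this as one possible convention. The function and parameter names are hypothetical, and the values 11 and 1 are only example inputs.

```python
import re


def affine_gap_penalty(aligned_seq: str, open_penalty: float,
                       extend_penalty: float) -> float:
    """Charge open_penalty once per gap plus extend_penalty per gap residue."""
    gap_runs = re.findall(r"-+", aligned_seq)  # each run of dashes is one gap
    return -sum(open_penalty + extend_penalty * len(run) for run in gap_runs)


# Two openings and six gap residues: -(2 * 11) - (6 * 1) = -28.
print(affine_gap_penalty("AB--C----DEF", open_penalty=11, extend_penalty=1))  # -28
```

Under this scheme one long gap is cheaper than several short gaps with the same total length, because the opening cost is paid only once.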

BLAST uses the affine type by default. The opening gap penalty is -11, while each additional residue gap is -1. You may change these settings in the "Algorithm parameters" section, just below the BLAST button.

Figure: Changing gap settings in BLAST.

One thing to notice here: what is the default scoring matrix set to? It is not a PAM matrix, but BLOSUM62. In the next lesson, let's see how BLOSUMs are constructed and how they differ from PAMs.
