Pairwise alignment is the process of aligning two DNA, RNA or protein sequences such that the regions of similarity are maximized. This is often performed to find functional, structural or evolutionary commonalities.
In most cases, scientists align two protein sequences to quantify how related they are (aka homology). With this, they are able to identify common domains and motifs, and trace sequence ancestry.
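To make "maximizing similarity" a bit more concrete, here is a minimal sketch of a global alignment score, using the classic Needleman-Wunsch dynamic programming recurrence. The function name and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration; real tools use substitution matrices and more refined gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not a standard.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Diagonal move: align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Vertical/horizontal moves: put a gap in one of the sequences.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GCATGCG"))
```

This only returns the score. Recovering the alignment itself would require keeping the full table and tracing back from the last cell.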
Domains are parts of a DNA or amino acid strand that code for a physicochemically similar feature as found in other sequences and proteins. Domains refer to specific functionalities. For example, you could have an ATP-binding domain or a polar domain.
Motifs are similar, but reference the structural characteristics rather than functional regions. Motifs are often found in domains, although that's not always the case.
Protein amino acid sequences are preferred over DNA sequences for a list of reasons.
However, there are some obvious instances when DNA alignments are needed.
Before we move on, let's quickly review some elementary biochemistry and notation.
We're all familiar with the four nucleotide bases (A, C, G and T). However, there are additional symbols used for ambiguous nucleotides.
|Symbol||Bases||Mnemonic|
|R||A or G||puRine|
|Y||C or T||pYrimidine|
|M||A or C||aMino|
|K||G or T||Keto|
|S||C or G||Strong interaction (3 bonds)|
|W||A or T||Weak interaction (2 bonds)|
|H||A, C or T (not G)||H is after G|
|B||C, G, or T (not A)||B is after A|
|V||A, C or G (not T)||V is after T and U|
|D||A, G or T (not C)||D is after C|
|N||A, C, G or T||aNything|
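The ambiguity codes above map naturally onto a lookup table. The sketch below is one way to use them in Python. The names `IUPAC_NUCLEOTIDES`, `symbols_compatible` and `expand` are made up for this example and do not come from any particular library.

```python
from itertools import product

# Each symbol maps to the set of concrete bases it can stand for (table above).
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def symbols_compatible(x, y):
    """True if the two symbols could represent the same base."""
    bases_x = set(IUPAC_NUCLEOTIDES[x.upper()])
    bases_y = set(IUPAC_NUCLEOTIDES[y.upper()])
    return bool(bases_x & bases_y)


def expand(seq):
    """List every concrete DNA sequence an ambiguous sequence could represent."""
    choices = (IUPAC_NUCLEOTIDES[s] for s in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(symbols_compatible("R", "A"))
    print(expand("RY"))
```

Keep in mind that `expand` grows exponentially with the number of ambiguous positions, so it is only practical for short sequences.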
Amino acids can be represented with one or three letters. Take some time to review these.
|Z||Glx||Glutamic acid or glutamine|
A good tip for memorizing these is to play the amino acid license plate game! Keep a printout of the amino acid table. When you and your cool friends are out for a drive, try to translate each license plate letter into an amino acid. Sounds nerdy, but it's very effective for learning. Bonus points for knowing the properties and/or structures!
There are several ways to group amino acids, depending on their functionalities and biochemical properties.
With nonpolar (hydrophobic) side chains: alanine, valine, leucine, isoleucine, proline, methionine, phenylalanine, tryptophan
With uncharged polar side chains: tyrosine, asparagine, glutamine, glycine, serine, threonine, cysteine
With positively charged side chains: histidine, lysine, arginine
With negatively charged side chains: aspartic acid, glutamic acid
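The grouping above can be written down as a small lookup, which comes in handy later when judging whether a substitution swaps one residue for a chemically similar one. This is only a sketch: the names `SIDE_CHAIN_GROUPS`, `side_chain_group` and `same_group` are invented here, and the grouping follows the list above (other textbooks place residues such as glycine differently).

```python
# One-letter codes grouped by side chain, following the list above.
SIDE_CHAIN_GROUPS = {
    "nonpolar": set("AVLIPMFW"),
    "uncharged polar": set("YNQGSTC"),
    "positively charged": set("HKR"),
    "negatively charged": set("DE"),
}


def side_chain_group(residue):
    """Return the group name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in SIDE_CHAIN_GROUPS.items():
        if code in members:
            return name
    raise ValueError(f"not one of the 20 standard amino acids: {residue!r}")


def same_group(x, y):
    """True if both residues fall into the same side chain group."""
    return side_chain_group(x) == side_chain_group(y)


if __name__ == "__main__":
    print(side_chain_group("K"))
    print(same_group("D", "E"))
```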