01. Introduction to DNA sequencing

DNA sequencing is the process of determining the precise order of nucleotides in a DNA molecule. There are only four DNA bases - adenine, guanine, cytosine and thymine - but the number of possible sequences grows exponentially with length, making sequencing no easy task.
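To get a feel for how quickly the possibilities grow, here is a minimal Python sketch. It assumes only that each position can hold any of the four bases, so a sequence of length n can take 4^n forms. The name `possible_sequences` is an illustrative choice, not part of any library.

```python
from itertools import product

BASES = "ACGT"  # adenine, cytosine, guanine, thymine


def possible_sequences(length: int) -> int:
    """Number of distinct DNA sequences of the given length (4 ** length)."""
    return len(BASES) ** length


# Enumerate every sequence of length 2 to check the formula on a small case.
dimers = ["".join(p) for p in product(BASES, repeat=2)]

if __name__ == "__main__":
    print(len(dimers), possible_sequences(2))
    for n in (10, 20, 100):
        print(n, possible_sequences(n))
```

Even a short stretch of 10 bases already has over a million possible sequences, which is why the order cannot simply be guessed and has to be read off the molecule.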

What can be sequenced?

All living organisms - including animals, plants, bacteria, and archaea - as well as many viruses contain DNA and can have it sequenced. From these organisms, we are able to sequence individual genes, chromosomes, entire genomes, and mitochondrial DNA.

What can you do with a DNA sequence?

Elucidating DNA sequences has provided scientists with a wealth of information.

  • Geneticists are now able to understand the function of genes by finding regions that encode distinctive features such as DNA-binding sites, receptor recognition sites and transmembrane domains.
  • Scientists are better able to infer homology among species, which helps evolutionary biologists describe how organisms are related.
  • Doctors can take a more personalized approach to medicine, tailoring therapeutics to a person's genetic makeup. Genetic testing such as paternity or prenatal testing is becoming more and more commonplace.
  • Criminal investigators can use DNA profiling to identify suspects, or exonerate the accused.
  • Metagenomics, the study of genetic material recovered directly from environmental samples, allows us to identify organisms present in bodies of water, sewage, dirt, etc.

Furthermore, entire fields have grown out of the ability to read DNA sequences. Medical diagnostics, biotechnology, forensic biology, virology and biological systematics are just a few of the fields that have either emerged or developed further thanks to DNA sequencing.

Two types of DNA sequencing

There are two types of DNA sequencing performed: de novo and resequencing.

  1. In de novo sequencing, the DNA is sequenced for the first time. This means there are no reference genomes available to align reads to.
  2. In resequencing, on the other hand, a reference genome is already available, so reads can be aligned to it.
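The difference between the two can be sketched with a toy example in Python. This is a simplified illustration, not how production tools work: it assumes error-free reads, uses exact string matching for alignment, and uses a naive greedy overlap merge for assembly. The function names `align_to_reference`, `overlap` and `greedy_assemble` are hypothetical and chosen for this sketch.

```python
def align_to_reference(reference: str, reads: list[str]) -> dict[str, int]:
    """Resequencing, toy version: locate each read by exact match.

    Returns the 0-based position of each read in the reference, or -1 if absent.
    """
    return {read: reference.find(read) for read in reads}


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str]) -> str:
    """De novo, toy version: repeatedly merge the pair with the largest overlap.

    If no pair overlaps, the first two contigs are simply concatenated.
    """
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, 0, 1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs[0] if contigs else ""


if __name__ == "__main__":
    reference = "ATGCGTACGTTAG"
    reads = ["ATGCGTA", "GTACGTT", "CGTTAG"]
    print(align_to_reference(reference, reads))  # needs the reference
    print(greedy_assemble(reads))                # uses only the reads
```

With a reference, each read only has to be placed. Without one, the sequence has to be pieced together from overlaps between the reads themselves, which is the harder problem.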

DNA sequencing approaches

We've come a long way since the first generation of DNA sequencing in the 1970s. The first human genome, completed in 2003, took more than a decade and cost roughly $3bn. As of 2015, sequencing an entire human genome costs around $1,000 and takes a matter of days.

Early days

Sanger sequencing was one of the early DNA sequencing techniques used. The method relies on chain termination, with the resulting fragments separated by gel electrophoresis and, in later automated instruments, capillary electrophoresis. However, even with automation and optimization, it was found to be too slow and costly. Thus, new techniques emerged that combined cyclic methods, in which dNTPs are added consecutively, with massive parallelization. These techniques are collectively known as Next Generation Sequencing (NGS).

Next Generation Sequencing

Massive parallelization made it possible to process thousands to millions of sequences concurrently. As a result, sequencing data output grew faster than Moore's law, more than doubling each year since NGS was introduced.

Obligatory graph showing DNA sequencing costs outpacing Moore's Law.
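
As a rough back-of-the-envelope check, the cost figures quoted earlier ($3bn in 2003, about $1,000 in 2015) can be compared with a Moore's-law pace. The sketch below assumes, purely for illustration, that a Moore's-law pace means cost halving every two years; that halving period is an assumption and not a figure from this article.

```python
# Figures quoted in this article.
COST_2003_USD = 3_000_000_000  # first human genome
COST_2015_USD = 1_000          # approximate cost in 2015
YEARS = 2015 - 2003

# Assumption for illustration: a "Moore's law" pace halves cost every two years.
MOORE_HALVING_PERIOD_YEARS = 2


def moore_projected_cost(start_cost: float, years: float) -> float:
    """Cost after `years` if it halved every MOORE_HALVING_PERIOD_YEARS."""
    return start_cost / 2 ** (years / MOORE_HALVING_PERIOD_YEARS)


actual_fold_drop = COST_2003_USD / COST_2015_USD
projected_2015_cost = moore_projected_cost(COST_2003_USD, YEARS)
moore_fold_drop = COST_2003_USD / projected_2015_cost

print(f"Actual drop: {actual_fold_drop:,.0f}-fold")
print(f"Moore's-law-pace drop: {moore_fold_drop:,.0f}-fold")
print(f"Projected 2015 cost at that pace: ${projected_2015_cost:,.0f}")
```

Under these assumptions, twelve years at a Moore's-law pace gives a 64-fold drop, while the quoted figures imply a drop of about three million-fold.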

Not only did NGS bring about a wealth of information, but it also opened up new scientific ideas and revolutionized the way we work in the life sciences.

In this series, we'll go through each DNA sequencing technique. Let's begin with one of the first sequencing techniques, which came about in the 1970s.
