05. BED format

BED is a tab-delimited file format that lets users define how the data lines of an annotation track are displayed.

If you're unfamiliar with annotation tracks, they're simply the lines displayed on a genome browser.

UCSC Annotation Track
itemA and itemB are sample annotation tracks.

BED files can have up to 12 columns, but only the first three are required for the UCSC browser, Galaxy browser and bedtools. The number of columns must be consistent across every row of the file.
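The column rule above can be checked with a short script. This is only a minimal sketch: the function name check_bed_columns is made up for illustration, and it assumes that blank lines, '#' comments and "track"/"browser" header lines should be skipped rather than counted.

```python
def check_bed_columns(lines):
    """Return the column count shared by all data lines, or raise ValueError."""
    expected = None
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Assumption: blank lines, comments and custom-track header lines
        # are not data lines, so they are skipped.
        if not line or line.startswith(("#", "track ", "browser ")):
            continue
        count = len(line.split("\t"))
        if not 3 <= count <= 12:
            raise ValueError(f"line {number}: {count} columns, expected 3 to 12")
        if expected is None:
            expected = count  # the first data line sets the column count
        elif count != expected:
            raise ValueError(f"line {number}: {count} columns, expected {expected}")
    return expected


# Hypothetical sample data, not taken from a real track.
sample = ["track name=example\n", "chr7\t100\t200\n", "chr7\t300\t400\n"]
print(check_bed_columns(sample))  # 3
```

To check a file on disk, an open file handle can be passed in place of the list, since the function only iterates over lines.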

Let's look at all 12 BED fields, as explained by the UCSC Genome Browser Information section.

3 Required BED fields

The following 3 fields are required for all BED files.

chrom: Name of the chromosome (chr5, chrX, chr2_random) or scaffold (scaffold10671).
chromStart: Starting position of the feature on the chromosome or scaffold.
The first base is numbered 0.
chromEnd: Ending position of the feature.
The base at this position is not included in the display. For example, the first 20 bases would have a chromStart value of 0 and a chromEnd value of 20.
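Because the start is counted from 0 and the end base is excluded, the length of a feature is simply the end minus the start. The sketch below illustrates that arithmetic; the helper names (parse_bed3, feature_length, to_one_based) are hypothetical and not part of any BED library.

```python
def parse_bed3(line):
    """Split a BED line and return the three required fields."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


def feature_length(start, end):
    """Number of bases covered: the end position itself is excluded."""
    return end - start


def to_one_based(start, end):
    """Convert a 0-based, end-exclusive interval to 1-based, end-inclusive."""
    return start + 1, end


# The first 20 bases of chr5, as in the example above.
chrom, start, end = parse_bed3("chr5\t0\t20\n")
print(chrom, feature_length(start, end))  # chr5 20
print(to_one_based(start, end))           # (1, 20)
```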

9 Optional BED fields

These 9 BED fields are optional.

name: Name of the BED line.
score: Score between 0 and 1000. If useScore is set to 1 in the track line, the score determines the level of gray in which the feature is displayed. A higher number gives a darker shade.
strand: Which strand the feature is on, either '+' or '-'.
thickStart: Starting position at which the feature is drawn thickly (for example, the start codon in gene displays).
thickEnd: Ending position at which the feature is drawn thickly.
itemRgb: Determines the color of the data contained in the BED line, given as an RGB value such as 255,0,0 for red.
Use the Color Picker to translate a color.
blockCount: Number of blocks (exons) in the BED line.
blockSizes: Comma-separated list of block sizes.
The number of items in the list should correspond to blockCount.
blockStarts: Comma-separated list of block starts.
Each start should be calculated relative to chromStart.
The number of items in the list should correspond to blockCount.
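Since block starts are relative to chromStart, each block's position on the chromosome is chromStart plus its block start, and its end is that position plus the block size. A minimal sketch of the calculation follows; block_intervals is a hypothetical helper, the values are illustrative, and it assumes a trailing comma in either list should be tolerated.

```python
def block_intervals(chrom_start, block_count, block_sizes, block_starts):
    """Return (start, end) chromosome coordinates for every block."""
    # Assumption: a trailing comma such as "567,488," is tolerated.
    sizes = [int(s) for s in block_sizes.rstrip(",").split(",")]
    starts = [int(s) for s in block_starts.rstrip(",").split(",")]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockSizes and blockStarts must each have blockCount items")
    # Block starts are offsets from chromStart; ends are exclusive.
    return [(chrom_start + offset, chrom_start + offset + size)
            for offset, size in zip(starts, sizes)]


# Illustrative feature: chromStart 1000, two blocks.
print(block_intervals(1000, 2, "567,488,", "0,3512"))
# [(1000, 1567), (4512, 5000)]
```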

Example custom track

UCSC Genome BED file display.
Here we can see the track annotations from a BED file uploaded to the UCSC Genome Browser.


BEDtools - Read the Docs

UCSC Genome Browser - BED format
