The Wiggle format (.wig) is an efficient way to store dense, continuous blocks of data. It is primarily used to store values such as GC percentage, probability scores and transcriptome data. Instead of specifying a value for each nucleotide position, wig allows you to bind values to entire regions that follow a certain pattern.
Like SAM and BAM, wig has an indexed binary equivalent called bigWig. This allows for efficient data handling, as only the relevant parts of the file are extracted and processed when viewing a particular region in a genome browser. To convert a wig file to bigWig, use the wigToBigWig program.
A .wig file contains one or more blocks. At the top of each block is the declaration line, which describes the data elements that follow using a number of options.
Several options can be placed on this first line to characterize that particular block of information. Each option should be formatted as a key=value pair.
The two main formatting options per block are variableStep and fixedStep.
The variableStep option is the more common option. It includes the chromosome position in one column, and data values in another.
variableStep chrom=chr4
400001 13
400002 13
400003 13
400004 13
400005 13
The declaration line names the chromosome and may include an optional parameter, span, which gives the number of bases each value covers (1 by default).
Using the span parameter can save space. The following block is equivalent to the data block above but is far more compact.
variableStep chrom=chr4 span=5
400001 13
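To make the effect of span concrete, here is a minimal Python sketch that expands a variableStep block into one record per base. The function name `expand_variable_step` and the list-of-lines input are assumptions made for illustration, not part of any wig library, and the sketch ignores track lines and comments.

```python
def expand_variable_step(lines):
    """Expand a variableStep block into (chrom, position, value) tuples, one per base."""
    header = lines[0].split()
    if header[0] != "variableStep":
        raise ValueError("expected a variableStep declaration line")
    # Options on the declaration line are key=value pairs.
    options = dict(field.split("=", 1) for field in header[1:])
    chrom = options["chrom"]
    span = int(options.get("span", 1))  # span is optional and defaults to 1

    records = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        position_text, value_text = line.split()
        start = int(position_text)
        value = float(value_text)
        # Each value covers `span` consecutive bases beginning at `start`.
        for offset in range(span):
            records.append((chrom, start + offset, value))
    return records


long_form = [
    "variableStep chrom=chr4",
    "400001 13",
    "400002 13",
    "400003 13",
    "400004 13",
    "400005 13",
]
short_form = [
    "variableStep chrom=chr4 span=5",
    "400001 13",
]

# Both blocks describe the same five bases with the same value.
same = expand_variable_step(long_form) == expand_variable_step(short_form)
print(same)
```

Both forms expand to the same five records, which is why the span version can stand in for the longer listing.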
If your data blocks have regular intervals between positions, you can use the fixedStep option. This lets you place the start position on the declaration line, along with the interval length. Thus, only one column is necessary for the data values.
fixedStep chrom=chr4 start=400001 step=100
13
14
15
The above block would feature chromosome 4, position 400001 as having a value of 13, position 400101 having the value 14, and position 400201 having value 15.
You may also specify a span, indicating the number of bases each value covers.
fixedStep chrom=chr4 start=400001 step=100 span=5
13
14
15
This is similar, but each value covers five nucleotides instead of just one. Thus we have 13 for 400001-400005, 14 for 400101-400105, and 15 for 400201-400205.
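The arithmetic for fixedStep can be sketched in a few lines of Python. The first value starts at `start`, each later value starts `step` bases after the previous one, and each covers `span` bases. The function name `fixed_step_intervals` and the tuple layout are assumptions chosen for this example, not an existing API.

```python
def fixed_step_intervals(lines):
    """Turn a fixedStep block into (chrom, first_base, last_base, value) tuples."""
    header = lines[0].split()
    if header[0] != "fixedStep":
        raise ValueError("expected a fixedStep declaration line")
    # Options on the declaration line are key=value pairs.
    options = dict(field.split("=", 1) for field in header[1:])
    chrom = options["chrom"]
    start = int(options["start"])
    step = int(options["step"])
    span = int(options.get("span", 1))  # span is optional and defaults to 1

    values = [line for line in lines[1:] if line.strip()]
    intervals = []
    for index, value_text in enumerate(values):
        first_base = start + index * step   # the i-th value begins i steps after start
        last_base = first_base + span - 1   # inclusive end of the covered region
        intervals.append((chrom, first_base, last_base, float(value_text)))
    return intervals


block = [
    "fixedStep chrom=chr4 start=400001 step=100 span=5",
    "13",
    "14",
    "15",
]

for chrom, first_base, last_base, value in fixed_step_intervals(block):
    print(chrom, first_base, last_base, value)
# chr4 400001 400005 13.0
# chr4 400101 400105 14.0
# chr4 400201 400205 15.0
```

Returning inclusive start and end coordinates keeps the output close to how the ranges are written in the text; a real parser would also need to handle several blocks per file.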