One of the most common questions after a Next-Generation Sequencing (NGS) run is: “What coverage did I achieve?” Unfortunately, there are different definitions of the term “coverage”.
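To illustrate how two common readings of “coverage” can differ, here is a minimal Python sketch. It assumes the two definitions are mean depth (total sequenced bases divided by genome size) and breadth (fraction of positions covered by at least one read); the excerpt does not say which definitions the post discusses, and the function names and numbers below are hypothetical.

```python
def mean_depth(read_count, read_length, genome_size):
    """Mean depth: total sequenced bases divided by genome size."""
    return read_count * read_length / genome_size


def breadth_of_coverage(per_base_depth, min_depth=1):
    """Breadth: fraction of positions covered by at least min_depth reads."""
    covered = sum(1 for depth in per_base_depth if depth >= min_depth)
    return covered / len(per_base_depth)


# Hypothetical run: 1,000,000 reads of 100 bp on a 5 Mb genome.
print(mean_depth(1_000_000, 100, 5_000_000))

# Hypothetical per-base depths for a 5 bp toy region.
print(breadth_of_coverage([0, 2, 3, 0, 1]))
```

A run can have a high mean depth and still leave parts of the genome uncovered, which is why the two numbers are usually reported separately.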
Notes about working with fasta / fastq files on the Unix command line.
For any large software project (i.e. one that requires more than a few scripts performing a one-off task) and for every project that was initiated by a customer request, it is useful to precisely define the requirements before starting to write any code. This might be painful at times and slow down the coding fun, but it should avoid a lot of frustration on either side in the end.
Notes about bcl files, written during the primary analysis on Illumina sequencing machines.
Sequence uniqueness within the genome plays an important part when attempting to map short sequence parts – e.g. next-generation short sequencing reads. It is one of the factors that can introduce a bias in sequencing or its analysis – the other important factor being GC content (GC-rich sequences, e.g. genic/exonic regions, as well as very […]