
Adding to the confusion about the different notations for phases/frames, the start coordinates of genomic features are also written differently depending on the genome browser and file format.
- One-based
Counts the bases themselves, starting with “1” at the first base.
Regions are specified by a “closed interval”, so both the start and the end position belong to the region. Used e.g. by the Ensembl genome browser and annotation system and by the GFF/GTF, SAM and wiggle file formats.
- Zero-based
The interbase system counts the spaces between bases, starting with “0” in front of the first base.
Regions are specified by a “half-closed-half-open interval”, so the start is included and the end is not. Used by the UCSC genome browser, Chado (the GMOD database schema used by FlyBase, the fruit fly database) and by the BED, BAM and PSL file formats.
An example:
    One-based     1   2   3   4   5   6
                  |   |   |   |   |   |
                  C   G   A   T   G   C
                |   |   |   |   |   |   |
    Zero-based  0   1   2   3   4   5   6
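The ATG interval would be described as 3-5 in the first (one-based) system and as 2-5 in the second (zero-based) system. Only the start coordinate differs between the two; the end coordinate is the same. The length of the region is end - start + 1 in the one-based system (5 - 3 + 1 = 3) and simply end - start in the zero-based system (5 - 2 = 3).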
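The conversion between the two systems can be sketched in a few lines of Python. This is only an illustration of the example above, not code from any particular tool: the function names `one_to_zero` and `zero_to_one` are made up for this post. Python string slicing happens to use the same zero-based, half-open convention, so the converted coordinates can be used directly to pull out the ATG.

```python
# Hypothetical helper names, chosen for this illustration only.

def one_to_zero(start, end):
    """Convert a one-based closed interval to a zero-based half-open interval."""
    # Only the start shifts; the end keeps the same number.
    return start - 1, end


def zero_to_one(start, end):
    """Convert a zero-based half-open interval to a one-based closed interval."""
    return start + 1, end


seq = "CGATGC"

# The ATG from the example: 3-5 in the one-based system.
zb_start, zb_end = one_to_zero(3, 5)

# Python slices are zero-based and half-open, like BED coordinates.
codon = seq[zb_start:zb_end]

# Region length under each convention.
length_zero_based = zb_end - zb_start
length_one_based = 5 - 3 + 1
```

Here `codon` holds the ATG and both length calculations give 3, matching the example.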
Image: Vecteezy, modified