FASTQ files from next-generation sequencing machines can encode their quality scores in several different ways. It is important to find out which encoding a file uses before working with the data, and to convert between formats if necessary.
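As a rough illustration, the encoding can often be guessed from the range of ASCII characters in the quality lines. The sketch below is a heuristic, not a definitive test: the function names are made up for this example, and it assumes that Phred+33 Illumina data rarely goes above Q41 (character `J`) and that no +64 encoding uses characters below `;` (ASCII 59).

```python
from typing import Iterable, Optional


def guess_phred_offset(quality_lines: Iterable[str]) -> Optional[int]:
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Returns None when the observed characters are compatible with both.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' cannot occur in a +64 encoding.
        return 33
    if highest > 74:
        # Above 'J' (Q41 in Phred+33) is unusual for +33 Illumina data,
        # so a +64 encoding is the more likely explanation (assumption).
        return 64
    return None  # ambiguous range, inspect more reads


def phred64_to_phred33(quality: str) -> str:
    """Shift a Phred+64 quality string to Phred+33.

    Only valid for true Phred+64 scores; old Solexa scores use a
    different scale and need a non-linear conversion instead.
    """
    return "".join(chr(ord(ch) - 31) for ch in quality)
```

In practice you would feed in the quality lines of a few thousand reads; a handful of high-quality reads alone can easily fall into the ambiguous range.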
The CCDS (Consensus CDS) project aims to identify a stable core set of human and mouse protein-coding regions.
Genome browsers and file formats use different notations (0-based or 1-based) for the start coordinates of genomic features, which is a constant source of confusion.
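To make the difference concrete, here is a small sketch of the conversion between the two most common conventions: 1-based, fully closed intervals (as in GFF) and 0-based, half-open intervals (as in BED). The function names are invented for this example.

```python
from typing import Tuple


def one_based_to_zero_based(start: int, end: int) -> Tuple[int, int]:
    """1-based, fully closed (GFF-style) to 0-based, half-open (BED-style)."""
    return start - 1, end


def zero_based_to_one_based(start: int, end: int) -> Tuple[int, int]:
    """0-based, half-open (BED-style) to 1-based, fully closed (GFF-style)."""
    return start + 1, end


def length_zero_based(start: int, end: int) -> int:
    """Feature length for a 0-based, half-open interval."""
    return end - start


def length_one_based(start: int, end: int) -> int:
    """Feature length for a 1-based, fully closed interval."""
    return end - start + 1
```

Only the start coordinate shifts by one; the end stays the same, which is why the first 100 bases of a chromosome are `1..100` in GFF but `0..100` in BED.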
The phase (sometimes called frame) tells you how to translate the individual coding exons of a gene. Phases 1 and 2 are defined differently in the GFF and EnsEMBL formats!
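A minimal sketch of the conversion, assuming the usual definitions: the GFF phase is the number of bases to skip at the start of the feature to reach the first base of the next codon, while the EnsEMBL start phase is the position within the codon at which the exon begins. The function name is made up for this example, and the handling of `-1` (used by EnsEMBL for exons that do not start in coding sequence) as `.` is an assumption about how you want to write the GFF column.

```python
def ensembl_to_gff_phase(ensembl_phase: int) -> str:
    """Convert an EnsEMBL exon start phase to the GFF phase column.

    Phase 0 is identical in both schemes; phases 1 and 2 are swapped.
    """
    if ensembl_phase == -1:
        # Assumption: non-coding start is written as "." in GFF.
        return "."
    if ensembl_phase not in (0, 1, 2):
        raise ValueError(f"unexpected phase: {ensembl_phase}")
    # An exon starting at codon position p still has (3 - p) % 3 bases
    # of the current codon left to skip before the next codon begins.
    return str((3 - ensembl_phase) % 3)
```

The same formula works in the other direction, since swapping 1 and 2 twice gives back the original value.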
Originally developed for large-scale web applications, NoSQL databases like to see themselves as next-generation databases, and the name is usually read as “not only SQL”.