
Telomeric Regions of the Human Genome

Telomeres form caps on the ends of chromosomes that prevent fusion of chromosomal ends and provide genomic stability.

During gametogenesis, reprogramming of the germ cells leads to elongation of telomeres up to their species-specific maximum.

In normal somatic cells, telomeres are progressively shortened with every cell division. This shortening in normal human cells limits the number of cell divisions. For human cells to proliferate beyond the senescence checkpoint, they need to stabilize telomere length. This is accomplished mainly by reactivation of the telomerase enzyme. Telomerase expression is under the control of many factors. Expression of telomerase can lead to cell immortalization and is activated during tumorigenesis, i.e. cancer.

Male Xq-telomeres are 1100 bp shorter than female Xq-telomeres.

The telomeric repeat found on all human chromosomes is “TTAGGG”.
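As a small illustration of the repeat motif, the following sketch finds the longest uninterrupted run of “TTAGGG” units in a DNA string. The function names are hypothetical and chosen only for this example. Sequence read from the opposite strand shows the reverse complement “CCCTAA” instead, so a helper for that is included.

```python
import re

TELOMERE_REPEAT = "TTAGGG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_repeat_run(seq: str, unit: str = TELOMERE_REPEAT) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    runs = re.findall(f"(?:{unit})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Made-up toy sequence: three repeat units flanked by unrelated bases.
toy = "ACGT" + TELOMERE_REPEAT * 3 + "A"
print(longest_repeat_run(toy))
print(longest_repeat_run(toy, reverse_complement(TELOMERE_REPEAT)))
```

Real telomeric sequence contains variant repeats and sequencing errors, so an exact-match scan like this is only a first approximation.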

The centromeres and telomeres of the human chromosomes are not explicitly defined as region attributes in the Ensembl Perl API, so one option for checking these regions is to pull them out of the UCSC Table Browser. To do this, select the assembly, then use the “Mapping and Sequencing tracks” group and the “Gap” table. The extracted locations of the human telomere regions are provided below for the genome assemblies GRCh37 (hg19) and GRCh38 (hg38). The coordinates are given in the 0-based UCSC coordinate system.
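Once the “Gap” table has been exported, the telomere rows can be filtered out with a few lines of Python. The sketch below assumes a tab-separated export with a header line listing the fields `bin`, `chrom`, `chromStart`, `chromEnd`, `ix`, `n`, `size`, `type` and `bridge`; check the header of your own download, since the exact output format is an assumption here. The function name `telomere_regions` is hypothetical.

```python
import csv


def telomere_regions(gap_table_text: str):
    """Yield (chrom, start, end) for every gap row of type 'telomere'.

    Coordinates are passed through unchanged (0-based, half-open).
    """
    # The header line of the export may start with '#'; drop it if present.
    lines = gap_table_text.lstrip("#").splitlines()
    for row in csv.DictReader(lines, delimiter="\t"):
        if row["type"] == "telomere":
            yield row["chrom"], int(row["chromStart"]), int(row["chromEnd"])


# Illustrative input: the telomere coordinates are the hg38 chr1 values,
# while the bin/ix numbers and the non-telomere row are made up.
example = (
    "#bin\tchrom\tchromStart\tchromEnd\tix\tn\tsize\ttype\tbridge\n"
    "1\tchr1\t0\t10000\t1\tN\t10000\ttelomere\tno\n"
    "2\tchr1\t5000000\t5050000\t2\tN\t50000\tcontig\tno\n"
    "3\tchr1\t248946422\t248956422\t3\tN\t10000\ttelomere\tno\n"
)

for chrom, start, end in telomere_regions(example):
    print(chrom, start, end)
```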

Telomere locations

Telomeres of chromosome 17 have not been defined for assembly GRCh37. They are short, but they do exist nonetheless. An assembly patch will address this.

Chromosome  Start (hg19)  End (hg19)  Start (hg38)  End (hg38)
chr1        0             10000       0             10000
chr1        249240621     249250621   248946422     248956422
chr2        0             10000       0             10000
chr2        243189373     243199373   242183529     242193529
chr3        0             10000       0             10000
chr3        198012430     198022430   198285559     198295559
chr4        0             10000       0             10000
chr4        191144276     191154276   190204555     190214555
chr5        0             10000       0             10000
chr5        180905260     180915260   181528259     181538259
chr6        0             10000       0             10000
chr6        171105067     171115067   170795979     170805979
chr7        0             10000       0             10000
chr7        159128663     159138663   159335973     159345973
chr8        0             10000       0             10000
chr8        146354022     146364022   145128636     145138636
chr9        0             10000       0             10000
chr9        141203431     141213431   138384717     138394717
chr10       0             10000       0             10000
chr10       135524747     135534747   133787422     133797422
chr11       0             10000       0             10000
chr11       134996516     135006516   135076622     135086622
chr12       0             10000       0             10000
chr12       133841895     133851895   133265309     133275309
chr13       0             10000       0             10000
chr13       115159878     115169878   114354328     114364328
chr14       0             10000       0             10000
chr14       107339540     107349540   107033718     107043718
chr15       0             10000       0             10000
chr15       102521392     102531392   101981189     101991189
chr16       0             10000       0             10000
chr16       90344753      90354753    90328345      90338345
chr17       NA            NA          0             10000
chr17       NA            NA          83247441      83257441
chr18       0             10000       0             10000
chr18       78067248      78077248    80363285      80373285
chr19       0             10000       0             10000
chr19       59118983      59128983    58607616      58617616
chr20       0             10000       0             10000
chr20       63015520      63025520    64434167      64444167
chr21       0             10000       0             10000
chr21       48119895      48129895    46699983      46709983
chr22       0             10000       0             10000
chr22       51294566      51304566    50808468      50818468
chrX        0             10000       0             10000
chrX        155260560     155270560   156030895     156040895
chrY        0             10000       0             10000
chrY        59363566      59373566    57217415      57227415
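Because the telomere coordinates are 0-based and half-open (UCSC style), while Ensembl and most file formats such as VCF and GFF use 1-based positions, a lookup needs an explicit conversion. The following is a minimal sketch using a small excerpt of the hg38 telomere values; the dictionary and function names are hypothetical.

```python
# Excerpt of the hg38 telomere regions, stored as 0-based half-open
# intervals (start included, end excluded).
HG38_TELOMERES = {
    "chr1": [(0, 10000), (248946422, 248956422)],
    "chr17": [(0, 10000), (83247441, 83257441)],
}


def to_one_based(start0: int, end0: int) -> tuple[int, int]:
    """Convert a 0-based half-open interval to a 1-based closed interval."""
    return start0 + 1, end0


def in_telomere(chrom: str, pos1: int, regions=HG38_TELOMERES) -> bool:
    """Return True if the 1-based position `pos1` lies in a listed telomere."""
    pos0 = pos1 - 1  # shift to 0-based before comparing
    return any(start <= pos0 < end for start, end in regions.get(chrom, []))


print(to_one_based(0, 10000))        # first telomere as a 1-based closed range
print(in_telomere("chr1", 10000))    # last base of the first telomere
print(in_telomere("chr1", 10001))    # first base after it
```

The length of a region is simply `end - start` in the 0-based half-open convention, which is 10000 for every telomere entry listed.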

Centromere locations

The centromeres for assembly hg38 are defined in different ways in the UCSC system. According to an explanation by Brian Lee and Christopher Lee of UCSC, there is now a specific browser track that “shows the location of Karen Miga’s centromere model sequences. However, these annotations can be smaller than the centromeres shown in the chromosome ideogram and Chromosome Bands track. (…) Depending on your purpose you could use the centromere model regions (red), or the broader Chromosome Bands Ideogram definition of centromere which overlap some annotations (cytoBandIdeo table), or the regions of the assembly that are just NNNNN’s (Gap track).”
You can see this location in the screenshot below (from the browser):

As the gap extent is slightly larger than the defined regions, we will have to use the “acen” ideogram regions from the cytoBandIdeo table in most cases where we actually need to use the underlying sequence. The combined centromere regions for hg38 are shown below:

chrom  start      end
chr1   121700000  125100000
chr2   91800000   96000000
chr3   87800000   94000000
chr4   48200000   51800000
chr5   46100000   51400000
chr6   58500000   62600000
chr7   58100000   62100000
chr8   43200000   47200000
chr9   42200000   45500000
chr10  38000000   41600000
chr11  51000000   55800000
chr12  33200000   37800000
chr13  16500000   18900000
chr14  16100000   18200000
chr15  17500000   20500000
chr16  35300000   38400000
chr17  22700000   27400000
chr18  15400000   21500000
chr19  24200000   28100000
chr20  25700000   30400000
chr21  10900000   13000000
chr22  13700000   17400000
chrX   58100000   63800000
chrY   10300000   10600000
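A combined region per chromosome can be derived by merging all ideogram bands whose stain is “acen”, typically one band on each arm. The sketch below assumes band rows in the field order `chrom`, `chromStart`, `chromEnd`, `name`, `gieStain`; the function name is hypothetical. In the example input, the outer chr1 boundaries match the values listed above, while the split point between the two arms and the non-centromeric band are illustrative only.

```python
def combine_acen(bands):
    """Merge all 'acen' bands of each chromosome into one (start, end) span.

    `bands` is an iterable of (chrom, start, end, name, gie_stain) tuples.
    """
    combined = {}
    for chrom, start, end, _name, stain in bands:
        if stain != "acen":
            continue  # skip ordinary cytogenetic bands
        lo, hi = combined.get(chrom, (start, end))
        combined[chrom] = (min(lo, start), max(hi, end))
    return combined


# Illustrative rows: the arm split point and the gneg band are assumptions.
example_bands = [
    ("chr1", 120400000, 121700000, "p11.2", "gneg"),
    ("chr1", 121700000, 123400000, "p11.1", "acen"),
    ("chr1", 123400000, 125100000, "q11", "acen"),
]

print(combine_acen(example_bands))
```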
