Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes.
Wieder N., D'Souza EN., Martin-Geary AC., Lassen FH., Talbot-Martin J., Fernandes M., Chothani SP., Rackham OJL., Schafer S., Aspden JL., MacArthur DG., Davies RW., Whiffin N.
Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.