RNADOMAIN | Computational genomics of long noncoding RNA domains across metazoans

Summary
From junk DNA to genomic dark matter, the road to understanding RNAs that do not encode proteins has been full of surprises. Compared with 19,000 protein-coding genes, recent estimates suggest that our genome contains between 25,000 and 100,000 long noncoding RNA (lncRNA) genes. Far from being inert, some lncRNAs are involved in development and disease, particularly cancer. It has also been shown that the function of a lncRNA can be associated with its localisation in subcellular compartments. Nevertheless, experimentally characterizing and validating interesting lncRNAs is an arduous task. Computational approaches based on machine learning could be designed to complement and scale up such efforts. Based on recent experimental discoveries, it has been proposed that lncRNAs are separable into functional domains, and that these domains are intimately related to transposable elements and repeats. However, how functions are encoded in primary RNA sequence remains a fundamental unsolved problem. I propose to develop the first high-throughput computational approach to map lncRNA domains across metazoan genomes. Domains will first be identified according to statistical evidence supported by current biological insights. Putative domains will be queried against state-of-the-art databases on lncRNA function, localisation, and disease. Machine learning algorithms will then be employed to predict new functional domains, and new mechanistic insights will be offered for promising candidates. Lastly, the obtained maps will be stored and disseminated in a database that will be regularly updated and readily accessible to the research community. This will be a foundational resource to finally shed light on the role of lncRNAs, their regulation, and their involvement in disease.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/884178
Start date: 01-07-2021
End date: 30-06-2023
Total budget: 196 590,72 Euro
Public funding: 196 590,00 Euro
Cordis data

Status

TERMINATED

Call topic

MSCA-IF-2019

Update Date

28-04-2024
Structured mapping
EU-Programme-Call
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.3. EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (MSCA)
H2020-EU.1.3.2. Nurturing excellence by means of cross-border and cross-sector mobility
H2020-MSCA-IF-2019
MSCA-IF-2019