Summary
                        
        
Genes are fundamental units of life, and their origin has fascinated researchers since the beginning of the molecular era. Many studies of how new genes form in genomes have focused on gene duplication and the subsequent divergence of the two gene copies. In recent years, however, we have learnt that genes can also arise de novo from previously non-genic sequences. The discovery of de novo genes has been made possible by the sequencing of complete genomes and the comparison of gene sets between closely related species. Here we wish to test a novel hypothesis: that the dynamics of de novo gene formation in populations result in substantial differences in gene content between individuals. If they exist, these differences would not be visible to the current methods for studying gene variation, which are based on comparing the sequences of each individual to a common set of reference genes. To test our hypothesis, we will need to develop novel computational approaches that first obtain an accurate representation of all transcripts and translated open reading frames in each individual and then integrate this information at the population level. We propose to apply these methods to two very distinct biological systems: a large collection of Saccharomyces cerevisiae world isolates and a human lymphoblastoid cell line (LCL) panel. For this, we will collect and generate RNA sequencing (RNA-Seq) and ribosome profiling (Ribo-Seq) data. To identify de novo gene birth events that occurred within populations, as opposed to phylogenetically conserved genes that have been lost in some individuals, we will also generate similar data from a set of closely related species in each of the two systems. By combining these data with genomics data, we will identify the spectrum of mutations associated with de novo gene birth at an unprecedented level of detail and uncover footprints of adaptation linked to the birth of new genes.
                    
    
        
        
    
                                 
                    More information & hyperlinks
                        
| Web resources: | https://cordis.europa.eu/project/id/101052538 | 
| Start date: | 01-06-2022 | 
| End date: | 31-05-2027 | 
| Total budget - Public funding: | 2 453 751,00 Euro - 2 453 751,00 Euro | 
                                Cordis data
                        
| Status: | SIGNED |
| Call topic: | ERC-2021-ADG |
| Update Date: | 09-02-2023 |
                        