Summary
                        
        
Nanopore sequencing is a breakthrough genomic tool that reads DNA with long read lengths, high accuracy, and high throughput. However, it is the proteome that ultimately determines a cell’s phenotype, and tremendous efforts have been made over the past decade to develop techniques for sequencing proteins. The first breakthroughs in sequencing short peptides at the single-molecule level are now appearing. Current strategies are, however, still far from sequencing full-length native proteins, owing to low synthesis efficiencies in protein handling and a limited scanning length (only short peptides of ~25 amino acids can be read). Here, I will first develop a novel method to synthesize a large variety of proteins, each connected to DNA carrying its own codon sequence, using just a few reactions: attaching a puromycin linker to an mRNA, followed by in vitro translation, reverse transcription, and RNase cleavage. Furthermore, I propose a single-molecule protein nanopore engineering strategy that significantly extends the lumen of the MspA nanopore to push the limit on protein read length. Using the proven capabilities of the Hel308 DNA helicase, the engineered nanopore can first read the entire DNA codon sequence and then read >100 amino acids of the attached protein. After recording large data sets, I will build a protein signal library annotated with the corresponding codon information and use it to train a machine learning model that serves as a tool for de novo protein sequencing. This work has great potential to push the limits of sequencing technology toward reading the whole proteome.
                    
    
        
        
    
                                 
                    More information & hyperlinks
                        
| Web resources: | https://cordis.europa.eu/project/id/101151821 |
| Start date: | 01-12-2024 |
| End date: | 30-11-2026 |
| Total budget - Public funding: | - 203 464,00 Euro |
                                Cordis data
                        
| Status: | SIGNED |
| Call topic: | HORIZON-MSCA-2023-PF-01-01 |
| Update date: | 01-10-2025 |
                        