Although extant proteins consist of 20 different amino acids, it has been proposed that primordial proteins consisted of a smaller set of “early” amino acids and that additional “modern” amino acids have gradually been recruited into the genetic code [1-3]. This naturally leads to the questions: can structured and functional proteins be constructed using the “early” amino-acid alphabet? Can extant proteins be reverse-evolved while preserving their structure/function?
To address these questions, we searched protein databases for model extant proteins representing different structural folds. The preliminary selection includes proteins with both catalytic and binding/interaction functions.
The selected protein targets were “reverse-evolved” in vitro into variants in which the “modern” amino acids were replaced at random by “early” ones. The libraries of randomized genes were placed in a genotype-phenotype linkage compatible with an appropriate library display (mRNA display [4]) and selection method. Successful candidates were selected for conservation of structure and/or function, and the most “successful” variants will be characterized.
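The randomization step can be illustrated in silico. The following is a minimal sketch, not the experimental protocol: it assumes a commonly cited ten-letter “early” alphabet (A, D, E, G, I, L, P, S, T, V), which the text above does not specify, and the names `EARLY`, `modern_positions`, `reverse_evolve`, and `make_library`, as well as the example sequence, are hypothetical.

```python
import random

# ASSUMPTION: a commonly cited 10-letter "early" alphabet; the text above
# does not state which amino acids are treated as early.
EARLY = "ADEGILPSTV"
ALL20 = "ACDEFGHIKLMNPQRSTVWY"


def modern_positions(seq):
    """Return indices of residues that are not in the assumed early set."""
    return [i for i, aa in enumerate(seq) if aa not in EARLY]


def reverse_evolve(seq, rng):
    """Replace every "modern" residue with a randomly drawn "early" one.

    Residues already in the early set are left unchanged.
    """
    for aa in seq:
        if aa not in ALL20:
            raise ValueError(f"non-standard residue: {aa!r}")
    return "".join(aa if aa in EARLY else rng.choice(EARLY) for aa in seq)


def make_library(seq, size, seed=0):
    """Sample `size` randomized variants (duplicates are possible)."""
    rng = random.Random(seed)
    return [reverse_evolve(seq, rng) for _ in range(size)]


if __name__ == "__main__":
    wild_type = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    n_modern = len(modern_positions(wild_type))
    print("modern positions:", n_modern)
    print("theoretical diversity:", len(EARLY) ** n_modern)
    for variant in make_library(wild_type, size=3):
        print(variant)
```

Under these assumptions the theoretical diversity grows as 10 to the power of the number of “modern” positions, so full-length proteins quickly exceed what any physical library can cover; this is one reason a high-capacity display method is a natural fit.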
This research will show how essential the “modern” amino acids are for building protein structure and function, and will thus provide a direct test of hypotheses about early proteins. In addition, proteins constructed from a limited amino-acid alphabet are of interest in protein engineering and synthetic biology. Finally, this work addresses the fundamental relationship between protein sequence, structure, and function, which lies at the core of many biotechnological and biomedical problems and has direct implications for the construction of artificial biochemistries.