HIGH-THROUGHPUT CHARACTERIZATION OF ENZYMES FROM GENOMIC AND PROTEOMIC PROJECTS: A MULTIVARIATE STATISTICAL APPROACH
M. Monincová1, A. Jesenská1, R. Chaloupková1, Y. Sato2, Z. Prokop1 and J. Damborský1
1 Loschmidt Laboratories, Faculty of Science, Masaryk University, Kamenice 5/A4, 625 00 Brno, Czech Republic
2 Department of Environmental Life Sciences, Graduate School of Life Sciences, Tohoku University, 2-1-1 Katahira, Sendai 980-8577, Japan
High-throughput genomic and proteomic methods identify large numbers of novel enzymes that require systematic characterization. Here, we introduce a novel approach, based on multivariate statistics, for characterizing enzymes with broad substrate specificity. In the first step, Principal Component Analysis is used to select a sufficiently large set of substrates from an initial pool of chemicals while preserving maximum variability in their physico-chemical properties. In the second step, a quick and reliable enzymatic assay provides activity data for the individual proteins with the selected substrates. In the third step, Principal Component Analysis is applied to the enzyme activity data matrix to describe the differences in the substrate specificities of the enzymes. The resulting dataset, complemented with the physico-chemical properties of the halogenated hydrocarbons tested, can then be submitted to Quantitative Structure-Activity Relationships analysis.
The concept has been verified with the bacterial enzymes haloalkane dehalogenases. Members of this enzyme family cleave the carbon-halogen bond in halogenated hydrocarbons and can be used in bioremediation, in industrial biocatalysis, or as biosensors. A large number of putative enzymes have been identified in genome sequencing projects. Principal Component Analysis was used to identify novel enzymes with promising biotechnological properties. Quantitative Structure-Activity Relationships analysis explained in detail the activity or inactivity observed with the 30 substrates selected from a set of 194 different halogenated compounds. Such information is helpful in the rational design and improvement of haloalkane dehalogenases for biotechnological applications.
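The two Principal Component Analysis steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The descriptor and activity values are random placeholders, the number of enzymes (9) and descriptors (6) are arbitrary, and the greedy max-min rule used to pick substrates is an assumption, since the abstract does not state how the substrates were chosen from the score space. The function names `pca_scores` and `select_diverse` are hypothetical.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Autoscale the columns of X, then project rows onto the leading PCs."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # avoid division by zero
    Z = (X - X.mean(axis=0)) / std           # mean-centre and unit-variance scale
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)          # fraction of variance per component
    return scores, explained[:n_components]


def select_diverse(scores, k):
    """Greedy max-min selection of k rows (an assumed rule, see lead-in)."""
    scores = np.asarray(scores, dtype=float)
    centroid_dist = np.linalg.norm(scores - scores.mean(axis=0), axis=1)
    chosen = [int(np.argmax(centroid_dist))]  # start from the most extreme point
    min_dist = np.linalg.norm(scores - scores[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))        # farthest from everything chosen
        chosen.append(nxt)
        new_dist = np.linalg.norm(scores - scores[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return chosen


rng = np.random.default_rng(0)

# Step 1: placeholder descriptor matrix, 194 compounds x 6 properties.
descriptors = rng.normal(size=(194, 6))
scores, explained = pca_scores(descriptors, n_components=2)
subset = select_diverse(scores, k=30)

# Step 3: placeholder activity matrix, 9 enzymes x 30 selected substrates.
activity = rng.random((9, len(subset)))
enzyme_scores, enzyme_explained = pca_scores(activity, n_components=2)

print("selected substrate indices:", sorted(subset))
print("variance explained (substrates):", explained.round(3))
print("enzyme score matrix shape:", enzyme_scores.shape)
```

In the second call each row is an enzyme, so enzymes that lie close together in the score plot have similar substrate specificity profiles. With real data, the placeholder matrices would be replaced by measured physico-chemical descriptors and measured activities.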