A Genetic Algorithm for the Optimization of Protein Crystallization Screening

Samyam Acharya, Marc L Pusey, Ramazan S Aygun, Imren Dinc

1 DataMedia Research Lab, Computer Science Department,
University of Alabama in Huntsville,
Huntsville, Alabama 35899, United States

2 iXpressGenes, Inc., 601 Genome Way, Huntsville, Alabama 35806, United States

sa0066@uah.edu

Protein crystallization screening focuses on determining the factors crucial for successful protein crystallization. A large number of parameters may need to be considered when setting up cocktails that yield crystals large enough for X-ray data collection [1, 2]. These parameters include the types of reagents, ionic strengths, types of salts, pH values of buffers, temperature, etc. [3]. Our goal is to implement a genetic algorithm that isolates combinations of reagents and concentrations which have a higher degree of synergy and potentially offer a better crystalline outcome.

Combinations of reagents and their concentrations are mapped into binary-coded strings called chromosomes (not to be confused with biological chromosomes). Each chromosome represents a cocktail, that is, a particular set of buffer, pH, salts, etc. The length of a chromosome depends on the number of reagents taken into consideration for a particular experiment. Using expert scores from previously conducted experiments, the algorithm generates new conditions in successive iterations. Undesired conditions, such as those known to cause phase separation or precipitates, are removed, and favourable conditions are paired to produce the next generation of conditions. The top-ranked conditions produced by the algorithm will be evaluated against experiments conducted using our associative experimental design [4].
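The encoding and pairing steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the reagent table `REAGENT_LEVELS`, the two-bits-per-reagent encoding, the score threshold `min_score`, and the single-point crossover at a reagent boundary are all hypothetical choices made for the example.

```python
import random

# Hypothetical reagent table: each reagent takes one of four levels,
# so each gene occupies 2 bits. Real screens would use other reagents/levels.
REAGENT_LEVELS = {
    "buffer_pH": [5.5, 6.5, 7.5, 8.5],
    "salt_M": [0.0, 0.1, 0.2, 0.4],
    "PEG_pct": [5, 10, 20, 30],
}
BITS_PER_GENE = 2


def encode(cocktail):
    """Map a cocktail (reagent name -> level) to a binary chromosome string."""
    return "".join(
        format(REAGENT_LEVELS[name].index(cocktail[name]), f"0{BITS_PER_GENE}b")
        for name in REAGENT_LEVELS
    )


def decode(chromosome):
    """Inverse of encode: recover the cocktail from a chromosome string."""
    cocktail = {}
    for i, name in enumerate(REAGENT_LEVELS):
        bits = chromosome[i * BITS_PER_GENE:(i + 1) * BITS_PER_GENE]
        cocktail[name] = REAGENT_LEVELS[name][int(bits, 2)]
    return cocktail


def crossover(parent_a, parent_b, rng):
    """Single-point crossover, cutting only at a gene (reagent) boundary."""
    n_genes = len(parent_a) // BITS_PER_GENE
    cut = rng.randrange(1, n_genes) * BITS_PER_GENE
    return parent_a[:cut] + parent_b[cut:]


def next_generation(scored, min_score, n_children, rng):
    """Remove conditions scoring below min_score, then pair the survivors.

    `scored` maps chromosome -> expert score from earlier experiments.
    """
    parents = [c for c, score in scored.items() if score >= min_score]
    if len(parents) < 2:
        raise ValueError("need at least two favourable conditions to pair")
    children = []
    for _ in range(n_children):
        a, b = rng.sample(parents, 2)  # pick two distinct parents
        children.append(crossover(a, b, rng))
    return children
```

For instance, `encode({"buffer_pH": 6.5, "salt_M": 0.2, "PEG_pct": 30})` gives the 6-bit chromosome `"011011"`, and `decode` maps it back to the same cocktail. Adding a reagent to the table lengthens the chromosome by one gene, which mirrors the dependence of chromosome length on the number of reagents.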

The advantage of using a genetic algorithm for protein crystallization screening lies in its ability to handle a large number of parameters in an uneven search space. With this approach, we can employ selective pairing of conditions (chromosomes), which could be useful in identifying precipitant synergy (pairs that favour crystals) and antergy (pairs that produce no crystals), and thus narrow down the screening process. The output conditions generated by the algorithm will be evaluated using the Bin-Recall metric [4].

1. J. Jancarik and S.-H. Kim, Sparse matrix sampling: a screening method for crystallization of proteins, Journal of Applied Crystallography, vol. 24, no. 4, pp. 409–411, 1991.

2. A. McPherson and B. Cudney, Optimization of crystallization conditions for biological macromolecules, Acta Crystallographica Section F: Structural Biology Communications, vol. 70, no. 11, pp. 1445–1467, 2014.

3. A. McPherson, Crystallization of Biological Macromolecules. Cold Spring Harbor Laboratory Press, 1999.

4. I. Dinç, M. L. Pusey, and R. S. Aygün, Protein Crystallization Screening Using Associative Experimental Design, Bioinformatics Research and Applications, 2015.