Structural biology is persistently hampered by failures at multiple steps of three-dimensional structure determination by X-ray crystallography, including sequence cloning, protein production, purification, crystallization and, ultimately, structure determination. Bioinformatics methods that address this problem can help scientists identify the bottleneck steps in protein crystallization and prioritize the proteins most amenable to crystallization for structural studies.
The Laboratory of Structural Bioinformatics and Integrative System Biology, led by Professor SONG Jiangning at the Tianjin Institute of Industrial Biotechnology (TIB), Chinese Academy of Sciences (CAS), has carried out a systematic analysis of experimental data and of the important physicochemical properties of proteins that influence crystallization. The work was done in collaboration with researchers from the Bioinformatics Center, China Agricultural University, and the Faculty of Medicine at Monash University, Melbourne, Australia.
They have recently developed a bioinformatics tool named PredPPCrys (see the accompanying figure for a schematic illustration of the approach) that predicts the propensity of a protein to pass each of the five major experimental steps involved in successful protein crystallization. A large number of multifaceted sequence-derived physicochemical features were considered and then reduced to optimal feature sets using a multi-step feature selection algorithm. The team further built two-level support vector machine (SVM) models for PredPPCrys, which significantly increased prediction performance. In benchmarking experiments on two independent test datasets, the new method outperformed all existing tools.
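To make "sequence-derived physicochemical features" concrete, the following minimal Python sketch turns a protein sequence into a small numeric feature vector. It is an illustration only and not the PredPPCrys feature set: the choice of amino acid composition plus mean Kyte-Doolittle hydropathy, and the function names `amino_acid_composition`, `mean_hydropathy` and `feature_vector`, are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (one common physicochemical scale;
# chosen here for illustration, not taken from the PredPPCrys paper).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    residues = [ch for ch in sequence.upper() if ch in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def mean_hydropathy(sequence):
    """Return the average Kyte-Doolittle hydropathy over standard residues."""
    values = [KYTE_DOOLITTLE[ch] for ch in sequence.upper() if ch in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


def feature_vector(sequence):
    """Concatenate the 20 composition values with the mean hydropathy."""
    composition = amino_acid_composition(sequence)
    return [composition[aa] for aa in AMINO_ACIDS] + [mean_hydropathy(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call pattern.
    vector = feature_vector("MKTAYIAKQR")
    print(len(vector), vector[-1])
```

Vectors of this kind, computed for many proteins with known experimental outcomes, are the sort of input that a feature selection step and an SVM classifier would then operate on.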
PredPPCrys is freely available for academic use at http://www.structbioinfor.org/PredPPCrys and has been applied to target selection among currently non-crystallizable proteins.
This work was financially supported by the National Natural Science Foundation of China (61202167, 61303169, 31350110507, 11250110508). A research article entitled “PredPPCrys: Accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection” has been published in PLoS ONE (2014, 9:e105912). WANG Huilin, a Research Assistant in bioinformatics and bioengineering at TIB, is the first author of this work.
Schematic illustration of the PredPPCrys approach (Image by SONG Jiangning’s group)