Ab initio prediction methods, despite spectacular progress in recent years, remain less reliable than homology modeling and are still reserved for proteins that cannot be related to any homologous structure. A typical homology modeling run for a query protein involves the following processing steps:
1. Identification of query homologs with known structures in the Protein Data Bank.
2. Multiple sequence alignment of the query and templates.
3. Construction of structural models satisfying most spatial restraints derived from the query-template alignment.
4. Model refinement.
5. Evaluation and selection of the best model as the structural prediction.
The quality of the final 3D models depends on each modeling step, and the observed accuracy decreases as the query-template similarity drops.
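Because accuracy is tied to query-template similarity, it can help to make that quantity concrete. The following is a minimal sketch, not the procedure used in this work: it assumes the pairwise alignments have already been computed and are given as gapped strings of equal length, and it defines identity over columns where neither sequence has a gap (other definitions of the denominator exist). The function names `percent_identity` and `rank_templates` are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over the columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and use
    '-' as the gap character.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip gapped columns
        aligned += 1
        if q == t:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def rank_templates(aligned_pairs: dict[str, tuple[str, str]]) -> list[tuple[str, float]]:
    """Sort templates by decreasing identity to the query.

    aligned_pairs maps a template name to (aligned_query, aligned_template).
    """
    scores = [
        (name, percent_identity(query, template))
        for name, (query, template) in aligned_pairs.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Sequence identity alone is a crude ranking criterion; it is used here only to illustrate the similarity measure that the accuracy trend refers to.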
Homology modeling is effective because two proteins can have distant sequences yet still share very similar folds. However, this same observation creates problems at each step of the modeling when the query and template sequences are only weakly similar. A wrong choice of structural template can then strongly affect the accuracy of the query model. At low sequence identity, the query-template alignment is also more ambiguous, and any amino acid mismatch induces significant deformations in the resulting structural model. Selecting the spatial restraints that should be transferred from the templates to the query is another difficult issue when query and templates are only distantly related. In such cases, only a small subset of conserved geometrical features is shared between query and templates, and these features can be spread over several different structures.
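One simple way to picture the restraint-selection problem is to pool candidate distance restraints from several templates and keep only those on which the templates agree. The sketch below is an illustration under stated assumptions, not the method described here: it assumes each template's inter-residue distances have already been mapped onto query positions through the alignment, and the function name `consensus_restraints` and the thresholds `min_support` and `max_spread` are hypothetical.

```python
from statistics import mean, pstdev


def consensus_restraints(
    template_distances: dict[str, dict[tuple[int, int], float]],
    min_support: int = 2,
    max_spread: float = 1.0,
) -> dict[tuple[int, int], tuple[float, float]]:
    """Keep distance restraints that several templates agree on.

    template_distances maps a template name to {(i, j): distance}, where i and j
    are query positions. A pair is kept when at least `min_support` templates
    provide it and the population standard deviation of the distances is at most
    `max_spread` (same unit as the distances, e.g. angstroms).
    Returns {(i, j): (mean_distance, spread)}.
    """
    pooled: dict[tuple[int, int], list[float]] = {}
    for distances in template_distances.values():
        for pair, distance in distances.items():
            pooled.setdefault(pair, []).append(distance)

    restraints = {}
    for pair, values in pooled.items():
        if len(values) < min_support:
            continue  # too few templates share this feature
        spread = pstdev(values)
        if spread <= max_spread:
            restraints[pair] = (mean(values), spread)
    return restraints
```

With distantly related templates, few pairs are expected to pass both filters, which reflects the small subset of conserved features mentioned above.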
Insufficient or incompatible spatial restraints extracted from the templates may then yield large geometrical variations among the generated models, which in turn require further refinement steps, such as minimization or loop modeling, and accurate structure evaluation to select the best models. Analyses of known knottin sequences and structures indicate that roughly half of the knottin sequences have to be modeled from weakly related templates. To address this challenge, we designed a fully automated modeling procedure whose processing steps were optimized on a test set of 34 known knottin structures. We paid particular attention to making optimal use of the structural information that can be obtained from the available knottin structures. We sought to use the conserved geometrical features derived from the comparative analysis of knottin structures as a bias to select templates closer to the query, as anchors to improve sequence alignments, or as constraints to guide the modeling and increase accuracy.