
The BriX database


BriX is a structural classification of protein fragments. The library comprises fragments of 4 to 14 amino acids, clustered at six different distance thresholds. This has led to an alphabet of around 2000 frequently observed letters, or structural classes, per chain length. These classes are accessible through the search and browse interfaces on your left.
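As a rough illustration of how a chain breaks down into fragments of 4 to 14 residues, the sketch below enumerates every contiguous window of each length. It assumes overlapping sliding windows, which this page does not state; the function name `enumerate_fragments` and the example sequence are hypothetical and are not taken from the BriX pipeline.

```python
def enumerate_fragments(sequence, min_len=4, max_len=14):
    """Yield (start, length, fragment) for every contiguous window.

    Assumption: fragments are overlapping sliding windows over the chain.
    """
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start, length, sequence[start:start + length]


# Hypothetical chain, used only to exercise the function.
chain = "MKTAYIAKQRQISFVKSHFSRQ"
fragments = list(enumerate_fragments(chain))

# Count how many fragments exist per fragment length.
counts = {}
for _, length, _ in fragments:
    counts[length] = counts.get(length, 0) + 1

for length in sorted(counts):
    print(length, counts[length])
```

A chain of N residues yields N - L + 1 windows of length L, so shorter fragment lengths contribute more fragments per chain than longer ones.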


The data is displayed in a Class view or a Fragment view and arranged in information tabs. Besides information about its content, the Class view provides a sequence alignment through the web applet Jalview, FASTA data, and an image showing the cluster of superimposed fragments. The sequence and DSSP logos are both generated by the application WebLogo. The Fragment view holds general information about the source of the fragment, its sequence and DSSP assignment, and the classes it belongs to. Furthermore, the web applet Jmol presents an interactive view of the protein fragment.

Background

Proteins are organized into globular domains, each consisting of approximately 100 to 150 amino acids. These domains are constructed from a limited number of secondary structures (alpha-helices, beta-sheets, beta-turns, ...), and each secondary structure can in turn be partitioned into a number of fragments of 4 to 14 residues. These fragments are continuously reused to construct new folds. The number of possible folds that can be constructed in this way is estimated to be approximately 5000. Yet there is no reliable estimate of the number of local building blocks required to completely reconstruct all known folds.

Based on this knowledge, and on the idea that local structures in proteins form clusters of reusable building blocks, we constructed a knowledge base of protein fragments in which all fragments are clustered according to a distance function (here, root-mean-square distance). The usefulness of this collection is determined by how adequately the set of clusters describes existing protein structures. BriX has been constructed using the Whatif dataset and is currently evaluated on the Astral database. The results show very good coverage, which scales with the distance threshold used. This means that the representatives of the clusters may be used to reconstruct existing proteins.
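The idea of clustering fragments at a distance threshold and then measuring coverage on a separate set can be sketched as follows. This is a minimal illustration and not the BriX implementation. To stay free of superposition code it swaps in the distance RMSD (dRMSD, computed from intra-fragment pairwise distances) for the root-mean-square distance named above, and it uses a simple greedy "leader" clustering. All function names, coordinates, and the threshold value are hypothetical.

```python
import math


def drmsd(a, b):
    """Distance RMSD between two equal-length lists of (x, y, z) points.

    Stand-in for a superposition-based RMSD: it compares intra-fragment
    pairwise distances, so it is invariant to rotation and translation
    (but cannot tell mirror images apart).
    """
    total, pairs = 0.0, 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            total += diff * diff
            pairs += 1
    return math.sqrt(total / pairs)


def leader_cluster(fragments, threshold):
    """Greedy clustering: a fragment joins the first cluster whose
    representative lies within the threshold, else it founds a new one."""
    clusters = []  # list of (representative, members)
    for frag in fragments:
        for rep, members in clusters:
            if drmsd(rep, frag) <= threshold:
                members.append(frag)
                break
        else:
            clusters.append((frag, [frag]))
    return clusters


def coverage(representatives, test_fragments, threshold):
    """Fraction of test fragments within the threshold of any representative."""
    covered = sum(
        1 for f in test_fragments
        if any(drmsd(r, f) <= threshold for r in representatives)
    )
    return covered / len(test_fragments)


# Toy 4-point fragments (hypothetical coordinates, not real backbone data).
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
rotated = [(5, 5, 5), (5, 6, 5), (5, 7, 5), (5, 8, 5)]  # same shape, moved
bent = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

clusters = leader_cluster([straight, rotated, bent], threshold=0.5)
reps = [rep for rep, _ in clusters]
print(len(clusters), coverage(reps, [rotated, bent], threshold=0.5))
```

In this sketch, a looser threshold merges more fragments into fewer clusters and raises coverage, at the cost of coarser representatives. That is the trade-off implied by coverage scaling with the distance threshold.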

Publications

Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments.
Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J
PLoS Comput Biol. (2008),4: