

We have followed the definition of protein domains as in the SCOP (Structural Classification of Proteins) database to identify domain insertions in known protein structures. Although there are several schemes for protein structure classification, SCOP is central to this work as it is a manually curated classification of proteins of known structures based on their structural and evolutionary relatedness.


Overview

Domains constitute the structural, functional and evolutionary units of proteins. Proteins can be built from a single domain or an assortment of domains. In naturally occurring multi-domain proteins, the combination almost always occurs end-to-end, i.e., one domain following another domain's C-terminal end. However, there are exceptions to this general pattern, where multi-domain proteins are formed by the insertion of a domain (insert domain) into another domain (parent domain).



Access Methods

The information in the database can be accessed using different methods.

1. Search by PDB Code or keyword

A user can submit a PDB code (e.g., 1an9), with or without chain information, a SCOP domain, or a keyword.

2. Browse all entries

All known domain insertions in the PDB can be browsed individually.

3. Browse by type of insertion

We have classified domain insertions based on the number of insert domains seen in a single chain. In single insertions, a domain belonging to a particular superfamily is inserted into another domain of the same or a different superfamily. In multiple insertions, more than one insert domain, of the same or different superfamilies, is inserted into the parent domain.

4. Browse by combination

We have considered the first five classes in SCOP for determining domain insertions. Hence there are 25 possible parent-insert class combinations. This option allows the user to browse based on such combinations, for example, domain insertions where the parent belongs to the alpha/beta class and the insert belongs to the alpha+beta class.


This option also allows the user to browse entries by a single parent class or a single insert class. For example, one query would list all insertions whose parent domain belongs to the All-alpha class.
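As a rough illustration of where the figure of 25 comes from, the sketch below enumerates every ordered (parent class, insert class) pair over the five SCOP classes named on this page. The names `SCOP_CLASSES`, `combinations` and `by_parent` are hypothetical and are not part of the domIns server.

```python
from itertools import product

# The five classes as named on this page (the list order is an assumption).
SCOP_CLASSES = ["All-alpha", "All-beta", "alpha/beta", "alpha+beta", "Small proteins"]

# Ordered (parent class, insert class) pairs; same-class pairs are included.
combinations = list(product(SCOP_CLASSES, repeat=2))


def by_parent(parent_class):
    """Hypothetical helper: all combinations with the given parent class."""
    return [pair for pair in combinations if pair[0] == parent_class]


print(len(combinations))       # number of parent-insert class combinations
print(by_parent("All-alpha"))  # combinations whose parent is All-alpha
```

The pairs are ordered, so a parent in alpha/beta with an insert in alpha+beta is counted separately from the reverse arrangement.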

What is domIns all about?

DomIns is a web resource aimed at providing comprehensive information on domain insertions in proteins of known structure. We have followed the definition of protein domains as in the SCOP (Structural Classification of Proteins) database in order to identify insertions. The server is currently updated to SCOP version 1.71 and the March 2006 version of PDB_Select. The previous version of domIns used SCOP version 1.61 and the April 2002 version of PDB_Select.

What are domain insertions?

In the above figure, the E. coli protein Malonyl-CoA:Acyl Carrier Protein (ACP) transacylase has two domains: the catalytic domain is interrupted by the insertion of the ACP-binding domain. The parent domain (catalytic domain) has two regions, with residue positions 3-127 and 128-307 in the same domain. The parent and insert domains belong to two different superfamilies of proteins. A similar arrangement is seen in Streptomyces coelicolor malonyl-CoA:ACP transacylase as well. This is an example of a single insertion, where the parent domain is interrupted by a single insert domain. In multiple insertions, there is more than one insert domain.


How does one identify domain insertions?


Although there are several schemes for classifying protein structures, SCOP is important because it is a manually curated classification of proteins of known structure from the Protein Data Bank, based on their structural and evolutionary relatedness. In SCOP, a protein domain is considered a unit of evolution if it occurs independently or in combination with other domains, on the basis of evidence from proteins of known structure. SCOP has a hierarchical classification scheme with the principal levels being family, superfamily, fold and class. Proteins clustered together into families are clearly evolutionarily related, and the relationship is usually detectable at the sequence level. Proteins brought together into superfamilies have low sequence identity, but their structural and functional features suggest a common evolutionary origin. Superfamilies with similar topology, but without evidence for evolutionary relatedness, are grouped under a fold. Folds are then classified into classes based on the secondary structure elements present.
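The hierarchy described above can be sketched in code. The example assumes SCOP's dotted classification string (sccs), which has the form class.fold.superfamily.family. The identifiers used are made up for illustration, and the function names are hypothetical rather than part of SCOP or domIns.

```python
# Principal SCOP levels, from broadest to most specific.
LEVELS = ("class", "fold", "superfamily", "family")


def parse_sccs(sccs):
    """Split a dotted classification string into its four levels."""
    parts = sccs.split(".")
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected 4 dot-separated fields, got {sccs!r}")
    return dict(zip(LEVELS, parts))


def deepest_shared_level(sccs_a, sccs_b):
    """Return the most specific level two domains share, or None."""
    a, b = parse_sccs(sccs_a), parse_sccs(sccs_b)
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared


# Illustrative identifiers only, not taken from real entries.
print(deepest_shared_level("c.1.2.3", "c.1.2.7"))  # same superfamily
print(deepest_shared_level("c.1.2.3", "c.1.5.1"))  # same fold only
```

Comparing level by level from the class downwards mirrors the nesting of the hierarchy: two domains can share a superfamily only if they already share a fold and a class.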

We have considered only the first five classes (All-alpha, All-beta, alpha/beta, alpha+beta and Small proteins), together with the fold and superfamily levels of the SCOP hierarchy, for determining insertions. We excluded mono-domain proteins and considered only chains that have at least two domains. In multi-domain proteins, it is usual to have two domains linked in a linear fashion, i.e., the C-terminus of the first domain covalently linked to the N-terminus of the second domain. We instead looked for domains that are interrupted in the middle by the insertion of another domain. Thus, the second domain (insert) begins and ends inside the first domain (parent domain). The domains involved in insertions can come from the same or different SCOP superfamilies.
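The criterion above (the insert begins and ends inside the parent) can be sketched as a simple test on residue ranges. This is a minimal illustration and not the procedure actually used to build domIns. The `Domain` class, the function names and the residue numbers are all hypothetical, and the sketch assumes that each domain is given as a list of inclusive (start, end) segments on one chain.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    superfamily: str
    segments: list  # inclusive (start, end) residue ranges on one chain

    @property
    def span(self):
        """First and last residue covered by any segment of the domain."""
        return (min(s for s, _ in self.segments),
                max(e for _, e in self.segments))


def find_insertions(domains):
    """Map each parent domain name to the domains that lie inside its span."""
    result = {}
    if len(domains) < 2:  # mono-domain chains are excluded
        return result
    for parent in domains:
        p_start, p_end = parent.span
        for other in domains:
            if other is parent:
                continue
            o_start, o_end = other.span
            # The insert begins after the parent begins and ends before it ends.
            if p_start < o_start and o_end < p_end:
                result.setdefault(parent.name, []).append(other.name)
    return result


def insertion_type(inserts):
    """Single insertion if one insert domain, otherwise multiple."""
    return "single" if len(inserts) == 1 else "multiple"


# Hypothetical chain: the parent is split into two regions by one insert.
chain = [
    Domain("parent", "sf1", [(1, 100), (181, 300)]),
    Domain("insert", "sf2", [(101, 180)]),
]
found = find_insertions(chain)
print(found, insertion_type(found["parent"]))
```

Two domains linked end-to-end, such as residues 1-100 followed by 101-200, produce no match because neither span encloses the other. Note that with nested insertions this simple span test would also report the innermost domain as an insert of the outermost parent.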

 

A Web Resource for Domain Insertions in Known Protein Structures