DSSPcont

From Rost Lab Open

Intro

The DSSP program automates protein secondary structure assignment from PDB structures, using an algorithm that assigns every residue to one of eight states. However, any discrete assignment is incomplete, because the continuum of thermal fluctuations cannot be described. Hence, a continuous assignment of secondary structure that replaces 'static' by 'dynamic' states is used here. Continuous DSSP (DSSPcont) assignments are obtained as weighted averages over ten DSSP assignments with different hydrogen bond thresholds. The continuous DSSP assignments calculated from a single set of coordinates reflect the structural variations due to thermal fluctuations.
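The averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `continuous_assignment`, the three example strings and the weights are hypothetical, and the actual ten hydrogen bond thresholds and their weights are not given on this page.

```python
# DSSP state letters; a blank (loop) in DSSP output is written 'L' here.
STATES = "GHITEBSL"

def continuous_assignment(assignments, weights):
    """Weighted average of several discrete DSSP strings, per residue, in percent.

    assignments: equal-length strings, one per hydrogen bond threshold.
    weights:     one non-negative weight per string (hypothetical values).
    """
    if len(assignments) != len(weights):
        raise ValueError("need one weight per assignment")
    length = len(assignments[0])
    if any(len(a) != length for a in assignments):
        raise ValueError("all assignments must cover the same residues")
    total = float(sum(weights))
    profile = []
    for pos in range(length):
        share = dict.fromkeys(STATES, 0.0)
        for string, weight in zip(assignments, weights):
            state = string[pos] if string[pos] != " " else "L"
            share[state] += 100.0 * weight / total
        profile.append(share)
    return profile

# Three made-up runs instead of ten, to keep the example short.
runs = ["HHHT ", "HHTT ", "HHHHE"]
profile = continuous_assignment(runs, weights=[2, 1, 1])
print(profile[2])  # residue 3: 'H' in two runs (weights 2 and 1), 'T' in one
```

A residue that receives the same state at every threshold ends up with 100 in one category; residues whose state changes with the threshold get their weight split across categories.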

Usage

DSSPcont output can be computed for a single X-ray structure, for a single NMR model, or as the average over all NMR models of a given structure.

  • The help document for the DSSPcont program - contains an explanation of the output from DSSPcont.

Please note that the DSSPcont server is currently offline. If you need to use this program, please see the Download section.

  • You may analyze data on our DSSPcont server in one of three ways. The input must be a PDB identifier (ID), a file containing a PDB entry stored locally on your computer, or an entire PDB entry that you 'cut and paste' into the form provided. All three methods perform the same operation and give the same result for a particular entry:

Entering a PDB ID

This analyzes an existing entry from the PDB databank. We maintain an up-to-date databank, DSSPcontDB, at our site, which we update once a week from the PDB site. If you know the PDB ID, type it into the first box on the page. An example PDB ID is '101m'. Alternatively, you can use our SRS server, which holds the DSSPcont entries for the entire PDB in one databank: go to the SRS Query Form and enter the ID of the structure you are looking for. You can also search this database in other ways.

Upload a file

Enter a file name into the second box, including the full path of the file. Alternatively, click on 'Browse' to search your local directory structure for the file.

Paste the PDB structure

'Cut and paste' your entire PDB structure into the form. Be careful to include everything.

Download

Thank you very much for your interest in DSSPcont. If you send me (Claus Andersen) an email at ca2@cbs.dtu.dk, I'll send you the login information for downloading the files via FTP.

Alternatively, you can email assistant at rostlab dot org.

Commercial licenses can be obtained through Biosof LLC.

Help

Authors and Reference

Authors of the DSSP method

Wolfgang Kabsch and Chris Sander, MPI MF, Heidelberg, 1983.

Reference: Kabsch W, Sander C (1983) Biopolymers 22, p2577-2637

Authors of the DSSPcont method

Claus A.F. Andersen, Arthur G. Palmer, Søren Brunak and Burkhard Rost, CUBIC, New York 2001

Reference:

Andersen CAF, Palmer AG, Brunak S, Rost B (2002) Structure 10, p175-184

Synopsis

Definition of secondary structure of proteins given a set of 3D coordinates in PDB format.

Description

Given atomic coordinates in Protein Data Bank format, the DSSP program defines the secondary structure (in eight categories), geometrical features and solvent exposure of proteins. The DSSPcont program extends the discrete DSSP assignments to a continuous assignment in the same eight categories.

Usage and command line options

dsspcont pdb_file|pdbid

Command line options:

pdb_file - Name of the flatfile containing the PDB entry.

pdbid - The four-character PDB identifier, e.g. 101m.

Examples

In this example, the file name of the large photoreaction center entry (1prc) is given as input.

   unix% dsspcont 1prc.pdb > 1prc.dsspc

The output file is 1prc.dsspc.

Output

The output from DSSPcont contains secondary structure assignments and other information, one line per residue. Simplified extract from PDB-ID: 1A53

...
HEADER    SYNTHASE                                19-FEB-98   1A53                                                             .
COMPND   2 MOLECULE: INDOLE-3-GLYCEROLPHOSPHATE SYNTHASE;                                                                      .
SOURCE   2 ORGANISM_SCIENTIFIC: SULFOLOBUS SOLFATARICUS;                                                                       .
AUTHOR    M.HENNIG,B.DARIMONT,K.KIRSCHNER,J.N.JANSONIUS                                                                        .
  247  1  0  0  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN)                .
 10789.0   ACCESSIBLE SURFACE OF PROTEIN (ANGSTROM**2)                                                                         .
  184 74.5   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(J)  , SAME NUMBER PER 100 RESIDUES                              .
   35 14.2   TOTAL NUMBER OF HYDROGEN BONDS IN     PARALLEL BRIDGES, SAME NUMBER PER 100 RESIDUES                              .
...
   81 32.8   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+4), SAME NUMBER PER 100 RESIDUES                              .
    5  2.0   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+5), SAME NUMBER PER 100 RESIDUES                              .
...
    .-- sequential residue number, including chain breaks as extra residues
    |    .-- original PDB residue number, not nec. sequential, may contain letters
    |    |   .-- amino acid sequence in one letter code (where ! means chain break)
    |    |   |  .-- secondary structure summary (eight category discrete assignment) based on columns 19-38
    |    |   |  | 
    |    |   |  | .-- 3-turn-helix H-bond (I+3) or (I-3) forming 3_10-helices  |                      > means H-bond downstream
    |    |   |  | |.-- 4-turn-helix H-bond (I+4) or (I-4) forming alpha-helices| turn-helix columns:  < means H-bond upstream
    |    |   |  | ||.-- 5-turn-helix H-bond (I+5) or (I-5) forming pi-helices  |                      X means both H-bonds
    |    |   |  | |||.-- geometrical bend
    |    |   |  | ||||.-- chirality
    |    |   |  | |||||.-- beta bridge label 
    |    |   |  | ||||||.-- beta bridge label 
    |    |   |  | |||||||   .-- beta bridge partner residue number
    |    |   |  | |||||||   |   .-- beta bridge partner residue number
    |    |   |  | |||||||   |   |.-- beta sheet label 
    |    |   |  | |||||||   |   ||   .-- solvent accessibility
    |    |   |  | |||||||   |   ||   |    
    |    |   |  | |||||||   |   ||   |    .-- continuous assignment of 3_10 helix (DSSP 'G')
    |    |   |  | |||||||   |   ||   |    |   .-- continuous assignment of alpha-helix (DSSP 'H')
    |    |   |  | |||||||   |   ||   |    |   |   .-- continuous assignment of pi-helix (DSSP 'I')
    |    |   |  | |||||||   |   ||   |    |   |   |   .-- continuous assignment of a helix turn (DSSP 'T')
    |    |   |  | |||||||   |   ||   |    |   |   |   |   .-- continuous assignment of an extended beta-sheet (DSSP 'E')
    |    |   |  | |||||||   |   ||   |    |   |   |   |   |   .-- continuous assignment of a beta-bridge (DSSP 'B')
    |    |   |  | |||||||   |   ||   |    |   |   |   |   |   |   .-- continuous assignment of a bend (DSSP 'S')
    |    |   |  | |||||||   |   ||   |    |   |   |   |   |   |   |   .-- continuous assignment of other/loop (DSSP 'L')
    |    |   |  | |||||||   |   ||   |    |   |   |   |   |   |   |   |
  #  RESIDUE AA STRUCTURE BP1 BP2  ACC    G   H   I   T   E   B   S   L  
...
   64   65   D     >  -     0   0   81    0   0   0   0   0   0   0 100 
   65   66   P  H  > S+     0   0   15    0 100   0   0   0   0   0   0 
   66   67   I  H  > S+     0   0   22    0 100   0   0   0   0   0   0 
   67   68   E  H  > S+     0   0  121    0 100   0   0   0   0   0   0 
   68   69   Y  H  X S+     0   0    7    0 100   0   0   0   0   0   0 
   69   70   S  H  X S+     0   0    0    0 100   0   0   0   0   0   0 
   70   71   K  H  < S+     0   0   96    0 100   0   0   0   0   0   0 
   71   72   F  H >< S+     0   0   33    0 100   0   0   0   0   0   0 
   72   73   M  H >X S+     0   0    0    0 100   0   0   0   0   0   0 
   73   74   E  T 3< S+     0   0   49    0  53   0  47   0   0   0   0 
   74   75   R  T <4 S+     0   0  171    0  53   0  47   0   0   0   0 
   75   76   Y  T <4 S+     0   0   52    0  53   0  47   0   0   0   0 
   76   77   A     <  -     0   0    1    0   0   0   0  10   0   0  90 
   77   78   V  S    S-     0   0    0    0   0   0   0  10   0  90   0 
   78   79   G  E     -b   48   0A   0    0   0   0   0 100   0   0   0 
   79   80   L  E     -bc  49 106A   0    0   0   0   0 100   0   0   0 
   80   81   S  E     -bc  50 107A   0    0   0   0   0 100   0   0   0 
   81   82   I  E     -bc  51 108A   0    0   0   0   0 100   0   0   0 
   82   83   L  E     + c   0 109A   5    0   0   0   0 100   0   0   0 
   83   84   T        +     0   0    0    0   0   0   0   0   0   0 100 
   84   85   E        -     0   0    4    0   0   0   0   0   0   0 100 
   85   86   E     >  +     0   0  103    0  10   0   0   0   0   0  90 
   86   87   K  T  4 S+     0   0  124    0  10   0  64   0   0  26   0 
   87   88   Y  T  4 S+     0   0   43    0  10   0  64   0   0  26   0 
   88   89   F  T  4 S-     0   0   19    0  10   0  64   0   0  26   0 
   89   90   N     <  +     0   0   72    0   0   0   0   0  10   0  90 
...

For definitions, see the Biopolymers and Structure articles cited above.
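As a hedged illustration of how the continuous columns can be read, the sketch below parses one residue line of the simplified extract shown above. It assumes that the eight percentages are the last eight whitespace-separated fields, which holds for this extract but not necessarily for a complete DSSPcont file, where further columns follow. The function name `parse_continuous` is hypothetical and not part of DSSPcont.

```python
STATES = "GHITEBSL"

def parse_continuous(line):
    """Return (sequential number, amino acid, {state: percent}) for one residue line.

    Assumes the simplified layout of the extract: the eight percentages are
    the last eight whitespace-separated fields. Chain-break lines ('!') are
    not handled.
    """
    fields = line.split()
    percents = [int(value) for value in fields[-8:]]
    return int(fields[0]), fields[2], dict(zip(STATES, percents))

# Residue 73 from the 1A53 extract.
line = "   73   74   E  T 3< S+     0   0   49    0  53   0  47   0   0   0   0 "
number, amino_acid, percents = parse_continuous(line)
dominant = max(percents, key=percents.get)
print(number, amino_acid, dominant, percents[dominant])
```

For this residue the largest continuous share is H (53), while the discrete summary column reports T; the continuous columns make such borderline cases visible.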

Each line contains the following residue information

RESIDUE

two columns of residue numbers. The first column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout. The second column gives the crystallographers' 'residue sequence number', 'insertion code' and 'chain identifier' (see the Protein Data Bank file record format manual), given for reference only.

AA

one-letter amino acid code. NB: lower-case letters indicate cysteine (CYS) residue pairs that are covalently bonded by an SS-bridge.

S (first column in STRUCTURE block)

compromise summary of secondary structure (eight categories), intended to approximate crystallographers' intuition, based on columns 19-38, which are the principal result of the DSSP analysis of the 3D atomic coordinates. These eight categories are:

  • 3_10-helix (DSSP 'G')
  • alpha-helix (DSSP 'H')
  • pi-helix (DSSP 'I')
  • helix turn (DSSP 'T')
  • extended beta-sheet (DSSP 'E')
  • beta-bridge (DSSP 'B')
  • bend (DSSP 'S')
  • other/loop (DSSP ' ' i.e. a space. In the DSSPcont column 'L')

BP1 BP2

residue numbers of the first and second bridge partners, followed by a one-letter sheet label

ACC

number of water molecules in contact with this residue *10, or residue water-exposed surface in Angstrom**2.

N-H-->O (donor) and O-->H-N (acceptor) etc.

hydrogen bonds; e.g. -3,-1.4 means: if this residue is residue I, then N-H of I is H-bonded to C=O of I-3 with an electrostatic H-bond energy of -1.4 kcal/mol. There are two columns for each type of H-bond, to allow for bifurcated H-bonds.

TCO

cosine of angle between C=O of residue I and C=O of residue I-1. For alpha-helices, TCO is near +1; for beta-sheets, TCO is near -1. Not used for structure definition.

KAPPA

virtual bond angle (bend angle) defined by the three C-alpha atoms of residues I-2,I,I+2. Used to define bend (structure code 'S').
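A minimal sketch of such a bend angle is given below. It assumes the convention that the angle is measured between the direction CA(I-2) to CA(I) and the direction CA(I) to CA(I+2), so that a straight chain gives 0 degrees; this convention is an assumption here, not something stated on this page. The function name `bend_angle` and the coordinates are made up for illustration.

```python
import math

def bend_angle(ca_prev2, ca, ca_next2):
    """Angle in degrees between the vectors CA(I-2)->CA(I) and CA(I)->CA(I+2)."""
    v1 = [b - a for a, b in zip(ca_prev2, ca)]
    v2 = [b - a for a, b in zip(ca, ca_next2)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp against rounding errors before taking the arc cosine.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

# Made-up coordinates: a straight segment and a right-angle kink.
straight = bend_angle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0))
right_angle = bend_angle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
print(straight, right_angle)
```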

ALPHA

virtual torsion angle (dihedral angle) defined by the four C-alpha atoms of residues I-1,I,I+1,I+2. Used to define chirality (structure code '+' or '-').
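The torsion angle over four points can be computed with the usual atan2 formulation, sketched below with made-up coordinates. The function names are hypothetical, and mapping a positive angle to '+' and a negative angle to '-' is an assumption about how the chirality code is derived rather than something stated on this page.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def chirality(alpha):
    """Assumed mapping: '+' for a positive torsion angle, '-' otherwise."""
    return "+" if alpha > 0 else "-"

# Made-up stand-ins for CA(I-1), CA(I), CA(I+1), CA(I+2).
alpha = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
mirrored = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, -1.0, 1.0))
print(alpha, chirality(alpha), mirrored, chirality(mirrored))
```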

PHI PSI

IUPAC peptide backbone torsion angles

X-CA Y-CA Z-CA

echo of C-alpha atom coordinates

Warnings

The values for solvent exposure may not mean what you think!

  • effects leading to larger than expected values: the solvent exposure calculation ignores unusual residues, like ACE, and residues with incomplete backbone, like ALA 1 of data set 1CPA. It also ignores HETATOMS, like a heme or metal ligands. Also, side chains may be incomplete (an error message is written).
  • effects leading to smaller than expected values: if you apply this program to Protein Data Bank data sets containing oligomers, solvent exposure is for the entire assembly, not for the monomer. Also, atom OXT of C-terminal residues is treated like a side chain atom if it is listed as part of the last residue. Also, peptide substrates, when listed as atoms rather than hetatoms, are treated as part of the protein, e.g. residues 499 s and 500 s in 1CPA.

Unknown or unusual residues are named X on output and are not checked for standard number of sidechain atoms. All explicit water molecules, like other hetatoms, are ignored.

Input file

Coordinate file in PDB format.
