NORSp - predictor of NOn-Regular Secondary Structure

Intro

NORSp is an online predictor of NOn-Regular Secondary Structure (NORS), developed by Jinfeng Liu in the Rost Group at Columbia University, New York.

Many structurally flexible regions play important roles in biological processes. Extended loopy regions have been shown to be very abundant in nature and evolutionarily conserved. NORSp is a publicly available predictor of disordered regions in proteins; specifically, it predicts long regions with no regular secondary structure. When a user submits a protein sequence, NORSp analyses its secondary structure and the presence of transmembrane helices and coiled-coil regions. It then e-mails the user the presence and position of any disordered regions.

NORSp can be useful to biologists in several ways. For example, crystallographers can check whether their proteins contain NORS regions before deciding whether to proceed with experiments, since NORS proteins may be difficult to crystallise, as demonstrated by their low occurrence in the PDB. Biologists interested in protein structure-function relationships may also want to check whether protein-protein interaction sites coincide with NORS regions.

Availability/Web server

This program can be accessed via the PredictProtein service.

Installation with aptitude (Debian, Ubuntu, etc.)

Software Installation

  1. If you have not already done so, add the rostlab repository to your package manager's list of sources. This is how it's done: Debian_repository#sources.list.d
  2. aptitude update
  3. aptitude (search for the rostlab keyring package, mark it with '+', and press 'g' twice to install it)
  4. aptitude update (so that all rostlab packages become available for installation)
  5. aptitude install norsp. Here is a step-by-step guide: Debian_repository#Installing_a_package_step_by_step

Help

Required Input

NORSp requires the following input files:

  • amino acid sequence file (FASTA format optional)
  • secondary structure prediction in RDB format; can be generated by PROFphd
  • transmembrane helix prediction in RDB format; can be generated by PROFphd
  • HSSP alignment file; see how to generate an hssp file
  • coiled-coil prediction; generate it using the coiledcoils package. IMPORTANT: you need to include both the coils and the coils_raw file in the input folder. norsp needs an explicit reference to the .coils file but will also look for the coils_raw file in the same folder.
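
Before running norsp it can help to confirm that every input file is in place, in particular the coils_raw file that must sit next to the .coils file. The following is a minimal sketch, not part of NORSp: the function name `check_norsp_inputs`, the role labels, and the assumption that the raw file shares the stem of the .coils file and ends in `.coils_raw` are all hypothetical and should be adapted to your own file names.

```python
from pathlib import Path


def check_norsp_inputs(paths):
    """Return a list of problems for a mapping of role -> input file path."""
    problems = []
    for role, path in paths.items():
        if not Path(path).is_file():
            problems.append(f"{role}: missing file {path}")

    coils = paths.get("coils")
    if coils is not None:
        # Assumption: "query.coils" is accompanied by "query.coils_raw"
        # in the same folder. Adjust if your files are named differently.
        raw = Path(coils).with_suffix(".coils_raw")
        if not raw.is_file():
            problems.append(f"coils_raw: missing file {raw}")
    return problems
```

An empty list means that all listed files, plus the sibling coils_raw file, were found.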

Output verbosity

  • succinct: only shows the position of the NORS region in the context of the submitted sequence
  • verbose: includes the intermediate data used by NORSp: secondary structure, solvent accessibility, transmembrane helices, and coiled-coil prediction

Output format

  • TEXT: standard TXT (ASCII) format that can be displayed in any text editor
  • HTML: pretty output optimized for web browsers

Output delivery

  • our web site: the full results will be available for download on our website; only the URL will be e-mailed to you
  • mail: the full results will be sent by e-mail

Window size

Length of the sequence window used to calculate the structural content. We recommend a value larger than 50.

Structural content

Maximum percentage of secondary structure content allowed in a disordered region; default = 12%.

Consecutive exposed residues

Minimum length of a run of consecutive residues in a NORS region that are exposed to solvent. Allowed values are integers between 1 and the window size.
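
To illustrate how the three parameters above interact, here is a minimal sketch of a sliding-window scan. It is not the NORSp implementation: the function names are hypothetical, the inclusive comparison against the cutoff is an assumption, and the transmembrane and coiled-coil checks are left out. It expects one-letter tracks like those in the TEXT output (h=helix, e=strand, l=loop for structure; e=exposed, b=buried for accessibility).

```python
def longest_run(track, symbol):
    """Length of the longest run of `symbol` in `track`."""
    best = run = 0
    for ch in track:
        run = run + 1 if ch == symbol else 0
        best = max(best, run)
    return best


def nors_windows(sec, acc, window=70, max_structure=0.12, min_exposed=10):
    """Return 1-based (start, end) pairs of windows passing both criteria.

    Sketch only: a window passes if its helix+strand fraction is at most
    `max_structure` (inclusive cutoff is an assumption) and it contains at
    least `min_exposed` consecutive exposed residues.
    """
    hits = []
    for start in range(len(sec) - window + 1):
        sec_win = sec[start:start + window]
        acc_win = acc[start:start + window]
        structured = sum(ch in "he" for ch in sec_win) / window
        if structured <= max_structure and longest_run(acc_win, "e") >= min_exposed:
            hits.append((start + 1, start + window))
    return hits
```

Overlapping windows that pass would still need to be merged into a single region, and sequences shorter than the window yield no hits.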

References

If you find NORSp useful for your research, please cite:

  • Liu J, Tan H, Rost B (2002) Loopy proteins appear conserved in evolution. J Mol Biol 322(1):53-64.
  • Liu J, Rost B (2003) NORSp: predictions of long regions without regular secondary structure. Nucleic Acids Research 31(13):3833-3835.


Sample output

Input sequence

 >test sequence
 MARAEEVDGP APGEVLLSPV DGLHNHVIHV ALQEHGWATY AVHPVEAQPA
 PHPGALLHQV EVPAPLDRVD PYPLIALYHH PRLECPPYSL PNTLLSLPPP
 HITRRYIEYY GYVTPQPLLI LYHLPLAQLH PTVLEYLVGP RVRHNNTREP
 EDPVYTLLGR LPSKALPKEV GVCNNLAEPG GPNLIHTNLL PIHVYDRKEG
 GRLHNTMLCI DPADPPRQID IPNLKHRPGP TRNPSRLPTL IAPESKPPFE
 GWMSVGQEA
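
The sequence above is grouped in blocks of ten residues separated by spaces. If you need a single contiguous string, for example to compare its length with the "Sequence length" reported in the output, a small helper along these lines may be useful. This is an illustrative sketch; `read_sequence` is a hypothetical name and not part of NORSp.

```python
def read_sequence(text):
    """Return (header, sequence) from FASTA-like text; header may be None."""
    header = None
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep the description line without the leading '>'.
            header = line[1:].strip()
        else:
            # Drop the spaces that separate the blocks of ten residues.
            residues.append("".join(line.split()))
    return header, "".join(residues).upper()
```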

TEXT output
This is an example of the "succinct" output.
See the example of "verbose" output here

Result of NORS prediction (Jinfeng Liu & Burkhard Rost)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Jinfeng Liu, Hepan Tan & Burkhard Rost
J. Mol. Biol. (2002) 322: 53-64
________________________________________________________________________________

Sequence length     : 259
Secondary structure : Helix=9.7%, Strand=19.3%, Loop=71.0%

window size         : 70
Structure content cutoff: 12%
Minimum consecutive exposed residues: 10

NORS                 : n=NORS region
Secondary structure  : h=helix, e=strand, l=loop
Transmembrane helix  : m=transmembrane helix
Solvent accessibility: e=exposed, b=buried


NORS region          : 187-259

           .    :    .    :    .    :    .    :    .    5
seq    MARAEEVDGPAPGEVLLSPVDGLHNHVIHVALQEHGWATYAVHPVEAQPA
NORS   ..................................................
SEC    lllleellllllleeeeellllllleeeeeehhhllleeeeelleellll
COILS  ..................................................
HTM    ..................................................
ACC    eeebeebeeebebebbbbebeebeebbbbbbbeeeebbbbbbbebebeeb
           .    :    .    :    .    :    .    :    .    10
seq    PHPGALLHQVEVPAPLDRVDPYPLIALYHHPRLECPPYSLPNTLLSLPPP
NORS   ..................................................
SEC    lllllleeeeelllllllllllleeeeellllllllllllllllllllll
COILS  ..................................................
HTM    ..................................................
ACC    eeeebbbbebebebebeebeebbbbbbbbbeebebeebebeeebbebeee
           .    :    .    :    .    :    .    :    .    15
seq    HITRRYIEYYGYVTPQPLLILYHLPLAQLHPTVLEYLVGPRVRHNNTREP
NORS   ..................................................
SEC    lllhhhhhhhleellhhhhhhhlllhhhllllhhhhhlllllllllllll
COILS  ..................................................
HTM    ..................................................
ACC    ebbeebbebbeebeebbbbbbbbbebbebbeebbeebbebebeeeeeeee
           .    :    .    :    .    :    .    :    .    20
seq    EDPVYTLLGRLPSKALPKEVGVCNNLAEPGGPNLIHTNLLPIHVYDRKEG
NORS   ....................................nnnnnnnnnnnnnn
SEC    llleeeeelllllllllllleellllllllllleeellllleeeeellll
COILS  ..................................................
HTM    ..................................................
ACC    eeebbbbbeebeeeebeeebebbeebeeeeeeebbbbebbbbbbbeeeee
           .    :    .    :    .    :    .    :    .    25
seq    GRLHNTMLCIDPADPPRQIDIPNLKHRPGPTRNPSRLPTLIAPESKPPFE
NORS   nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
SEC    lllllleeelllllllllllllllllllllllllllllllllllllllll
COILS  ..................................................
HTM    ..................................................
ACC    eebbbbbbbbbeeeeeeebebeebeeeeeeeeeeeebeebbbeeeeeebe
           .    :    .    :    .    :    .    :    .    30
seq    GWMSVGQEA
NORS   nnnnnnnnn
SEC    lllllllll
COILS  .........
HTM    .........
ACC    ebbeeeeee
//
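
The per-residue tracks in the TEXT output are split into blocks of 50 residues. A minimal sketch for joining them back together and reading the NORS regions off the NORS track is shown below. The function names are hypothetical, and the parser assumes the succinct TEXT layout shown above, where each track line consists of a label followed by one unbroken string.

```python
import re

TRACKS = ("seq", "NORS", "SEC", "COILS", "HTM", "ACC")


def parse_tracks(report):
    """Concatenate the per-block track lines of a TEXT report."""
    tracks = {name: "" for name in TRACKS}
    for line in report.splitlines():
        parts = line.split()
        # Track lines have exactly two fields: label and residue string.
        # Legend lines such as "NORS : n=NORS region" have more and are skipped.
        if len(parts) == 2 and parts[0] in tracks:
            tracks[parts[0]] += parts[1]
    return tracks


def nors_regions(nors_track):
    """Return 1-based inclusive (start, end) pairs for each run of 'n'."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"n+", nors_track)]
```

Applied to a report like the one above, the regions returned by `nors_regions(parse_tracks(report)["NORS"])` can be compared with the "NORS region" line in the header as a consistency check.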



HTML output
This is an example of the "succinct" output.
See the example of "verbose" output here .

 

 
Sequence length 	259
Secondary structure 	Helix=9.7%, Strand=19.3%, Loop=71.0%
window size 	70
Structure content cutoff 	12%
Minimum consecutive exposed residues 	10
NORS 	N=NORS region
Secondary structure 	H=helix, E=strand, ' '=loop
Coiled-coil region 	c=coils
Transmembrane helix 	m=transmembrane helix
Solvent accessibility 	e=exposed, ' '=buried

 
NORS region predicted: 	187-259

           .    :    .    :    .    :    .    :    .    5
SEQ    MARAEEVDGPAPGEVLLSPVDGLHNHVIHVALQEHGWATYAVHPVEAQPA
NORS                                                     
SEC        EE       EEEEE       EEEEEEHHH   EEEEE  EE    

COILS                                                    
HTM                                                      
ACC    eee ee eee e e    e ee ee       eeee       e e ee 
           .    :    .    :    .    :    .    :    .    10
SEQ    PHPGALLHQVEVPAPLDRVDPYPLIALYHHPRLECPPYSLPNTLLSLPPP
NORS                                                     
SEC          EEEEE            EEEEE                      

COILS                                                    
HTM                                                      
ACC    eeee    e e e e ee ee         ee e ee e eee  e eee

           .    :    .    :    .    :    .    :    .    15
SEQ    HITRRYIEYYGYVTPQPLLILYHLPLAQLHPTVLEYLVGPRVRHNNTREP
NORS                                                     
SEC       HHHHHHH EE  HHHHHHH   HHH    HHHHH             

COILS                                                    
HTM                                                      
ACC    e  ee  e  ee ee         e  e  ee  ee  e e eeeeeeee

           .    :    .    :    .    :    .    :    .    20
SEQ    EDPVYTLLGRLPSKALPKEVGVCNNLAEPGGPNLIHTNLLPIHVYDRKEG
NORS                                       NNNNNNNNNNNNNN

SEC       EEEEE            EE           EEE     EEEEE    

COILS                                                    
HTM                                                      
ACC    eee     ee eeee eee e  ee eeeeeee    e       eeeee

           .    :    .    :    .    :    .    :    .    25
SEQ    GRLHNTMLCIDPADPPRQIDIPNLKHRPGPTRNPSRLPTLIAPESKPPFE
NORS   NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

SEC          EEE                                         
COILS                                                    
HTM                                                      
ACC    ee         eeeeeee e ee eeeeeeeeeeee ee   eeeeee e

           .    :    .    :    .    :    .    :    .    30
SEQ    GWMSVGQEA
NORS   NNNNNNNNN
SEC             
COILS           
HTM             
ACC    e  eeeeee

//