

UniqueProt is intended for researchers who want to analyze a sequence set containing proteins of a particular functional class or cellular location. It removes the bias introduced by sequence-redundant proteins, with the aim that the resulting sequence-unique subset approximates the protein universe more accurately.

UniqueProt takes a FASTA file of protein sequences and calculates a set of sequence-unique proteins. It first compares the sequences with BLAST and then uses a greedy algorithm to derive a representative set with maximum coverage and minimum redundancy. The number of output sequences (clusters) depends on the HSSP-value, which the program requires as a cutoff parameter. Please refer to the UniqueProt man page for further information.
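
The following Python sketch illustrates the general idea of a greedy redundancy reduction driven by an HSSP-value cutoff. It is not the UniqueProt2 code. The function names (`hssp_threshold`, `hssp_value`, `greedy_unique_set`), the input format (tuples of pairwise hits with percentage identity and alignment length, as could be extracted from BLAST output), the use of `>=` for the cutoff comparison, and the ordering by number of neighbours are all assumptions made for illustration. The threshold curve follows the commonly cited form of the HSSP curve (Rost, 1999); whether UniqueProt uses exactly these constants should be checked against the man page.

```python
import math


def hssp_threshold(aligned_length):
    """Percentage-identity threshold of the HSSP curve for a given
    alignment length (assumed form, after Rost 1999)."""
    if aligned_length <= 11:
        return 100.0
    if aligned_length <= 450:
        exponent = -0.32 * (1.0 + math.exp(-aligned_length / 1000.0))
        return 480.0 * aligned_length ** exponent
    return 19.5


def hssp_value(percent_identity, aligned_length):
    """Distance of an alignment from the HSSP curve, in percentage points."""
    return percent_identity - hssp_threshold(aligned_length)


def greedy_unique_set(ids, hits, cutoff=0.0):
    """Pick representatives so that no two of them are linked by a hit
    whose HSSP-value reaches the cutoff (hypothetical helper).

    ids  -- iterable of sequence identifiers
    hits -- iterable of (id_a, id_b, percent_identity, aligned_length),
            where both identifiers are expected to occur in ids
    """
    ids = list(ids)
    neighbors = {seq_id: set() for seq_id in ids}
    for a, b, pid, length in hits:
        if a != b and hssp_value(pid, length) >= cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)

    representatives = []
    covered = set()
    # Visit sequences with many neighbours first so that each chosen
    # representative covers as many redundant sequences as possible.
    for seq_id in sorted(ids, key=lambda i: (-len(neighbors[i]), i)):
        if seq_id in covered:
            continue
        representatives.append(seq_id)
        covered.add(seq_id)
        covered |= neighbors[seq_id]
    return representatives
```

In this sketch, raising `cutoff` makes fewer pairs count as redundant, so more sequences are kept; lowering it has the opposite effect. This mirrors the dependence of the output size on the HSSP-value described above.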

Author and References

The original UniqueProt has been thoroughly revised and ported to Python by T. Hamp. This new implementation is called UniqueProt2. It is available via the Rostlab repository but has not been published as a paper. Therefore, please still cite the original implementation when using UniqueProt2:

For questions contact: hampt - at - rostlab.org

Availability and download

This method is available through the Rostlab repository under the GPL license.

Source packages are available via FTP.

Commercial licenses can be obtained through Biosof LLC.
