pEffect's data sets for development and evaluation |
Data sets are based on Swiss-Prot release 2012_01:
- 1388 type III effector proteins (sequence-redundant set that forms the database for PSI-BLAST searches)
- 115 type III effector and 3460 non-effector proteins (homology-reduced at HVAL=0; used for the development of the SVM)
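
For readers unfamiliar with the HVAL=0 criterion, the sketch below shows how an HVAL score is commonly computed. It assumes the usual definition of HVAL as the distance of a pairwise alignment from the HSSP curve (Rost, 1999). The function names are illustrative and are not part of pEffect. Under this assumption, a set reduced at HVAL=0 contains no pair of proteins whose alignment lies above the curve.

```python
import math

def hssp_threshold(length):
    """Percentage identity on the HSSP curve for an alignment of `length` residues.

    Assumes the curve definition from Rost (1999).
    """
    if length <= 11:
        return 100.0
    if length <= 450:
        return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))
    return 19.5

def hval(pide, length):
    """Distance from the HSSP curve: percentage identity minus the threshold."""
    return pide - hssp_threshold(length)

# An alignment of 100 residues at 35% identity lies above the curve (HVAL > 0),
# so under the assumption above one of the two proteins would be removed.
print(hval(35.0, 100) > 0)
```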
|
Independent test sets:
- UniProt'15 HVAL0: sequence-unique sets of 51 type III effector proteins (added after the UniProt 2014_02 release) and 691 non-effector proteins
(added after the same release of Swiss-Prot)
- UniProt'15 HVAL0 fully independent: the UniProt'15 HVAL0 set further homology-reduced against pEffect's development data set,
leaving 10 type III effector and 390 non-effector proteins
- UniProt'15 Full: sequence-redundant sets of 498 type III effectors (added to UniProt) and 1509 non-effectors (added to Swiss-Prot) after the 2014_08 release
- T3DB Full: sequence-redundant set of 218 type III effector and 831 non-effector proteins from the T3DB database
- T3DB HVAL0: sequence-unique set of 66 type III effector and 128 non-effector proteins from the T3DB database
|
Sequence fragments:
- 30 N-terminal residues cleaved
- 30 C-terminal residues cleaved
- Randomly selected two thirds of the sequence
- Randomly selected sequence fragments of typical translated read length
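
The four fragment types above could be generated roughly as in the following sketch. It is not the script used to build the published sets. It assumes that "cleaved" means the 30 terminal residues are removed and that random fragments are contiguous windows. All function names are hypothetical, and the read-length fragment size is left as a parameter because the page does not state it.

```python
import random

def cleave_n_terminal(seq, n=30):
    """Remove the first n residues (assumed meaning of 'N-terminal cleaved')."""
    return seq[n:]

def cleave_c_terminal(seq, n=30):
    """Remove the last n residues (assumed meaning of 'C-terminal cleaved')."""
    return seq[:max(len(seq) - n, 0)]

def random_window(seq, length, rng):
    """Return a random contiguous window of at most `length` residues."""
    length = min(length, len(seq))
    start = rng.randint(0, len(seq) - length)  # randint is inclusive on both ends
    return seq[start:start + length]

def random_two_thirds(seq, rng):
    """Return a random contiguous window covering two thirds of the sequence."""
    return random_window(seq, (2 * len(seq)) // 3, rng)

# Toy sequence for illustration only, not taken from any data set.
toy = "ACDEFGHIKLMNPQRSTVWY" * 3
rng = random.Random(0)
print(len(cleave_n_terminal(toy)), len(cleave_c_terminal(toy)))
print(len(random_two_thirds(toy, rng)))
```

Passing a seeded `random.Random` instance keeps the fragment selection reproducible between runs.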
|
pEffect predictions for entire proteomes |
|
pEffect: the prediction method |
|