ProTstab

Stability of biomolecules, especially proteins, is of great interest and significance. Protein stability has been a major target of protein engineering, mainly to increase stability but sometimes also to destabilize proteins. Effects on stability are among the most common consequences of disease-related variations, so stability is of interest in variation interpretation as an explanation for the effects of harmful variants.

We trained a novel machine learning tool for predicting protein stability, in particular the melting temperature (Tm). The tool is based on amino acid sequence information and uses the gradient boosting of regression trees (GBRT) algorithm.
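To illustrate the general idea of gradient boosting of regression trees on sequence-derived features, here is a minimal sketch in plain Python. It is not the ProTstab implementation: the feature set (amino acid composition), the depth-1 trees, the hyperparameters, the toy sequences and the `toy_target` rule are all assumptions chosen for illustration, and the target values are synthetic numbers with no biological meaning.

```python
from dataclasses import dataclass
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# (feature index, threshold, value if feature <= threshold, value otherwise)
Stump = Tuple[int, float, float, float]


def composition(seq: str) -> List[float]:
    """Fraction of each of the 20 standard amino acids (assumed feature set)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def fit_stump(X: List[List[float]], r: List[float]) -> Stump:
    """Fit a depth-1 regression tree to residuals r by least squares."""
    mean = sum(r) / len(r)
    best: Stump = (0, 0.0, mean, mean)  # fallback when no split is possible
    best_err = float("inf")
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if err < best_err:
                best_err, best = err, (j, t, lm, rm)
    return best


def predict_stump(stump: Stump, x: List[float]) -> float:
    j, t, left_value, right_value = stump
    return left_value if x[j] <= t else right_value


@dataclass
class GBRT:
    base: float
    learning_rate: float
    stumps: List[Stump]

    def predict(self, x: List[float]) -> float:
        boost = sum(predict_stump(s, x) for s in self.stumps)
        return self.base + self.learning_rate * boost


def fit_gbrt(X: List[List[float]], y: List[float],
             n_trees: int = 30, learning_rate: float = 0.1) -> GBRT:
    """Squared-error boosting: each tree is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, x) for p, x in zip(pred, X)]
    return GBRT(base, learning_rate, stumps)


def toy_target(seq: str) -> float:
    """Synthetic target from an arbitrary rule; NOT a measured Tm."""
    return 40.0 + 40.0 * sum(seq.count(a) for a in "AILMFVW") / len(seq)


# Hypothetical toy sequences, used only to exercise the code.
TOY_SEQS = ["AAAAKKKK", "LLLLVVVV", "GGSSGGSS", "MKLVAGSE", "LLLLLLKK", "AVAVAVGG"]

X_toy = [composition(s) for s in TOY_SEQS]
y_toy = [toy_target(s) for s in TOY_SEQS]
model = fit_gbrt(X_toy, y_toy)
print(model.predict(composition("LLLLAAKK")))
```

A production model would normally use deeper trees, many more features and a tested library implementation; the sketch only shows how each new tree corrects the residual error of the ensemble built so far.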

Predictions for the whole human proteome are available for download here.

How to use the predictor
1. For a single prediction, the following information should be supplied:
  • Protein name;
  • Protein sequence, in FASTA format.
2. For multiple predictions, a file in FASTA format is required, as follows:
>Q9UKP3
MSLLCRNKGCGQHFDPNTNLPDSCCHHPGVPIFHDALKGWSCCRKRTVDFSEFLNIKGCT
MGPHCAEKLPEAPQPEGPATSSSLQEQKPLNVIPKSAETLRRERPKSELPLKLLPLNISQ
ALEMALEQKELDQEPGAGLDSLIRTGSSCQNPGCDAVYQGPESDATPCTYHPGAPRFHEG
MKSWSCCGIQTLDFGAFLAQPGCRVGRHDWGKQLPASCRHDWHQTDSLVVVTVYGQIPLP
AFNWVKASQTELHVHIVFDGNRVFQAQMKLWGVINVEQSSVFLMPSRVEISLVKADPGSW
AQLEHPDALAKKARAGVVLEMDEEESDDSDDDLSWTEEEEEEEAMGE
>O95965
MRPPGFRNFLLLASSLLFAGLSAVPQSFSPSLRSWPGAACRLSRAESERRCRAPGQPPGA
ALCHGRGRCDCGVCICHVTEPGMFFGPLCECHEWVCETYDGSTCAGHGKCDCGKCKCDQG
WYGDACQYPTNCDLTKKKSNQMCKNSQDIICSNAGTCHCGRCKCDNSDGSGLVYGKFCEC
DDRECIDDETEEICGGHGKCYCGNCYCKAGWHGDKCEFQCDITPWESKRRCTSPDGKICS
NRGTCVCGECTCHDVDPTGDWGDIHGDTCECDERDCRAVYDRYSDDFCSGHGQCNCGRCD
CKAGWYGKKCEHPQSCTLSAEESIRKCQGSSDLPCSGRGKCECGKCTCYPPGDRRVYGKT
CECDDRRCEDLDGVVCGGHGTCSCGRCVCERGWFGKLCQHPRKCNMTEEQSKNLCESADG
ILCSGKGSCHCGKCICSAEEWYISGEFCDCDDRDCDKHDGLICTGNGICSCGNCECWDGW
NGNACEIWLGSEYP

The FASTA file should not contain more than 10 protein sequences.
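
Before uploading, it can help to check a file locally against the requirements above. The following is a minimal sketch, not part of the ProTstab service: the function names `parse_fasta` and `check_records` are hypothetical, and the restriction to the 20 standard amino acid letters is an assumption, since the page does not say how other symbols are handled.

```python
from typing import Dict, List

MAX_SEQUENCES = 10  # limit stated on this page
# Assumption: only the 20 standard amino acid letters are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line has no identifier")
            current = fields[0]
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current] += line.upper()
    return records


def check_records(records: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"file contains {len(records)} sequences, more than {MAX_SEQUENCES}"
        )
    for name, seq in records.items():
        if not seq:
            problems.append(f"{name}: empty sequence")
        unexpected = sorted(set(seq) - VALID_RESIDUES)
        if unexpected:
            problems.append(f"{name}: unexpected symbols {''.join(unexpected)}")
    return problems
```

A typical use would be `check_records(parse_fasta(open(path).read()))`, where `path` points to the file to be submitted, followed by fixing any reported problems before upload.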