PON-P2

PON-P2 predicts the pathogenicity (harmfulness) of amino acid substitutions. It is a machine learning-based method that uses amino acid features, Gene Ontology (GO) annotations, evolutionary conservation and, if available, annotations of functional sites. Note that PON-P2 is NOT a meta-predictor. PON-P2 estimates the reliability of each prediction and groups the variants into pathogenic, neutral and unknown classes.

The performance of PON-P2 has been extensively tested. For details, see here. The performance of PON-P2 on additional datasets, namely predictSNPSelected and SwissVarSelected, is also available here. PON-P2 has also been shown to work on cancer variants. PON-P2 predictions for amino acid substitutions in COSMIC (v68) and the data published in "Harmful somatic amino acid substitutions affect key pathways in cancers" are publicly available here.

PON-P2 was the best-performing method in a recent comparison and outperformed protein-specific predictors in 85% of the proteins (Riera et al. 2016).

NEWS: PON-P2 predictions for the entire human proteome are available here.


PON-P2 API released
2016-03-24

The PON-P2 API has now been released, so you can access PON-P2 programmatically. Read more about how to use the PON-P2 API here.

PON-P2 predictions for cancer variants

PON-P2 predictions for amino acid substitutions in COSMIC (v68) and for a separate dataset consisting of variants from 7,042 cancer samples are now available. The article describing the data has been published here.

PON-P2 article published

An article describing PON-P2 has been published. Read the article here.

Performance of PON-P2 on additional datasets

We estimated the performance of PON-P2 on additional data by predicting the variations in the predictSNPSelected and SwissVarSelected datasets described in Grimm et al. The datasets are available in VariBench.


predictSNPSelected
Variation set                                       TP     TN     FP   FN   Unknown  PPV   NPV   Sensitivity  Specificity  Accuracy  MCC
All variations predicted by PON-P2                  5,124  3,173  345  590  6,445    0.94  0.84  0.90         0.90         0.90      0.79
Variations not in PON-P2 training data              5,116  3,173  341  590  5,575    0.94  0.77  0.90         0.86         0.88      0.73
Variations in proteins not in PON-P2 training data  1,385  1,243  186  210  2,126    0.88  0.86  0.87         0.87         0.87      0.74

SwissVarSelected
Variation set                                       TP     TN     FP   FN   Unknown  PPV   NPV   Sensitivity  Specificity  Accuracy  MCC
All variations predicted by PON-P2                  1,566  3,412  818  773  5,221    0.66  0.82  0.67         0.81         0.76      0.47
Variations not in PON-P2 training data              1,551  3,194  818  773  5,036    0.65  0.81  0.67         0.80         0.75      0.46
Variations in proteins not in PON-P2 training data  737    1,751  417  414  2,596    0.64  0.81  0.64         0.81         0.75      0.45

Note: Grimm et al. used these datasets to evaluate the performance of MutationTaster2, PolyPhen-2, Mutation Assessor, CADD, SIFT, LRT, FatHMM-U and FatHMM-W. The performance scores of those methods are presented in Supp. Table S1. The accuracy and MCC of PON-P2 are higher than those of the compared methods, even for variations in proteins that were not present in the PON-P2 training dataset (the circularity-free dataset).
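
For readers who want to relate the TP, TN, FP and FN counts to the reported measures, the following is a minimal sketch in Python. It assumes the standard textbook definitions of PPV, NPV, sensitivity, specificity, accuracy and MCC; the page itself does not state the formulas, and the function name confusion_metrics is hypothetical and not part of PON-P2 or its API. Variants in the Unknown column are assumed to be excluded from all measures.

```python
from math import sqrt


def confusion_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    Assumption: these are the usual textbook definitions, not formulas
    taken from the PON-P2 documentation. Unclassified ("Unknown")
    variants are not part of the counts passed in.
    """
    ppv = tp / (tp + fp)            # positive predictive value
    npv = tn / (tn + fn)            # negative predictive value
    sensitivity = tp / (tp + fn)    # true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)

    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "PPV": ppv,
        "NPV": npv,
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Accuracy": accuracy,
        "MCC": mcc,
    }


if __name__ == "__main__":
    # Counts from the "All variations predicted by PON-P2" row of predictSNPSelected.
    metrics = confusion_metrics(tp=5124, tn=3173, fp=345, fn=590)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

With the counts from the first predictSNPSelected row, the values rounded to two decimals agree with the figures reported for that row.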