PON-P2
PON-P2 predicts the pathogenicity (harmfulness) of amino acid substitutions. It is a machine learning-based method that uses amino acid features, Gene Ontology (GO) annotations, evolutionary conservation and, if available, annotations of functional sites. Note that PON-P2 is NOT a meta-predictor. PON-P2 estimates the reliability of its predictions and groups the variants into pathogenic, neutral and unknown classes.

The performance of PON-P2 has been extensively tested. Results on additional datasets, such as predictSNPSelected and SwissVarSelected, are also available. PON-P2 has also been shown to work on cancer variants. PON-P2 predictions for amino acid substitutions in COSMIC (v68), together with the data published in "Harmful somatic amino acid substitutions affect key pathways in cancers", are publicly available. PON-P2 was the best performing method in a recent comparison and outperformed protein-specific predictors in 85% of the proteins (Riera et al. 2016).

NEWS: PON-P2 predictions for the total human proteome are available.
Instructions for submitting queries

PON-P2 allows users to submit queries in three formats: identifier submission, genomic submission and sequence submission.

Identifier submission

Protein or gene identifier(s) and variation(s) are required in a fasta-like format. Ensembl gene identifiers, NCBI gene IDs and UniProtKB/Swiss-Prot accessions can be used as identifiers. When using Ensembl or NCBI gene identifiers, the variations have to be mapped to the longest isoform of the gene. Each identifier should be preceded by a greater-than sign (>). Only one variation should be placed on each line. Multiple variations in a single protein or in multiple proteins can be submitted in a single query. Alternatively, a file containing identifier(s) and variation(s) in the same format can be uploaded.

Example:

    >ENSG00000165816 #Ensembl gene identifier
    I75F #reference amino acid,position in the sequence(1 based),variant amino acid
    V366M
    >Q16518 #UniProtKB/Swiss-Prot accession identifier
    P363T
    R44Q
    >151194 #NCBI gene identifier
    T9N
    P111Q

Variations given for a UniProtKB/Swiss-Prot accession or for the longest isoform of an NCBI gene are mapped to the longest isoform of the corresponding Ensembl gene. If the variations cannot be mapped, they are reported in the error log file, and we recommend that users resubmit these variations using the Sequence submission service.

Genomic submission

This format requires variation(s) at the genomic level. Users are required to submit the chromosome number, chromosome location, strand, reference nucleotide and variant nucleotide in the format shown in the example below. Each line should contain only one variation. Users can either paste the variation(s) in the text box or upload a file containing the variation(s).

Example:

    3:49044874,+1,C,T
    11:108368994,-1,G,C
    16:11363014,-1,A,G

Users can also upload a Variant Call Format (VCF) file directly using the VCF file submission link. PON-P2 filters the non-synonymous variations from the VCF file and makes predictions for them.
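The identifier and genomic formats described above are simple enough to check locally before submitting. The following is a minimal sketch, not part of PON-P2 itself: the function names, the regular expressions, the accepted chromosome names and the handling of `#` comments are all assumptions inferred from the examples on this page.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acid codes
# Assumed shape of a substitution such as "I75F":
# reference residue, 1-based position, variant residue.
VARIATION_RE = re.compile(rf"^([{AA}])([1-9]\d*)([{AA}])$")
# Assumed shape of a genomic line such as "3:49044874,+1,C,T".
# The accepted chromosome names are a guess, not taken from the PON-P2 site.
GENOMIC_RE = re.compile(r"^([0-9]{1,2}|X|Y|MT?):(\d+),([+-]1),([ACGT]),([ACGT])$")


def parse_identifier_query(text):
    """Return {identifier: [(ref, position, variant), ...]} for a fasta-like query."""
    result = {}
    current = None
    for raw in text.splitlines():
        # Assumption: anything after '#' is a comment, as in the example above.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            result.setdefault(current, [])
            continue
        match = VARIATION_RE.match(line)
        if current is None or match is None:
            raise ValueError(f"cannot parse line: {raw!r}")
        ref, pos, alt = match.groups()
        result[current].append((ref, int(pos), alt))
    return result


def parse_genomic_line(line):
    """Return (chromosome, location, strand, reference, variant) for one genomic line."""
    match = GENOMIC_RE.match(line.strip())
    if match is None:
        raise ValueError(f"cannot parse genomic variation: {line!r}")
    chrom, pos, strand, ref, alt = match.groups()
    return chrom, int(pos), int(strand), ref, alt
```

For example, `parse_genomic_line("11:108368994,-1,G,C")` splits the second genomic example into its five fields, with the strand returned as the integer -1.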
Note: The chromosome location and reference alleles have to be given relative to the Genome Reference Consortium human genome build 37 (GRCh37).

Sequence submission

This format requires users to submit amino acid sequence(s) in fasta format and the variation(s) corresponding to the sequence(s). Each sequence should have a header line starting with a greater-than sign (>) followed by a description. The sequence, in upper-case characters, follows the header line. No characters except the 20 standard amino acid codes are accepted in the sequence(s). The variation(s) corresponding to a sequence should be given under the same header line as the sequence. Variation(s) follow the header line, and only one variation is allowed per line. The sequence(s) and variation(s) can be pasted in the corresponding text boxes, or separate files containing the sequence(s) and variation(s) can be submitted.

Example sequences:

    >ADA_HUMAN
    MAQTPAFDKPKVELHVHLDGSIKPETILYYGRRRGIALPANTAEGLLNVIGMDKPLTLPD
    FLAKFDYYMPAIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVEPIPWNQA
    EGDLTPDEVVALVGQGLQEGERDFGVKARSILCCMRHQPNWSPKVVELCKKYQQQTVVAI
    DLAGDETIPGSSLLPGHVQAYQEAVKSGIHRTVHAGEVGSAEVVKEAVDILKTERLGHGY
    HTLEDQALYNRLRQENMHFEICPWSSYLTGAWKPDTEHAVIRLKNDQANYSLNTDDPLIF
    KSTLDTDYQMTKRDMGFTEEEFKRLNINAAKSSFLPEDEKRELLDLLYKAYGMPPSASAG
    QNL
    >Retinal pigment
    MSIQVEHPAGGYKKLFETVEELSSPLTAHVTGRIPLWLTGSLLRCGPGLFEVGSEPFYHL
    FDGQALLHKFDFKEGHVTYHRRFIRTDAYVRAMTEKRIVITEFGTCAFPDPCKNIFSRFF
    SYFRGVEVTDNALVNVYPVGEDYYACTETNFITKINPETLETIKQVDLCNYVSVNGATAH
    PHIENDGTVYNIGNCFGKNFSIAYNIVKIPPLQADKEDPISKSEIVVQFPCSDRFKPSYV
    HSFGLTPNYIVFVETPVKINLFKFLSSWSLWGANYMDCFESNETMGVWLHIADKKRKKYL
    NNKYRTSPFNLFHHINTYEDNGFLIVDLCCWKGFEFVYNYLYLANLRENWEEVKKNARKA
    PQPEVRRYVLPLNIDKADTGKNLVTLPNTTATAILCSDETIWLEPEVLFSGPRQAFEFPQ
    INYQKYCGKPYTYAYGLGLNHFVPDRLCKLNVKTKETWVWQEPDSYPSEPIFVSHPDALE
    EDDGVVLSVVVSPGAGQKPAYLLILNAKDLSEVARAEVEINIPVTFHGLFKKS

Variation examples:

    >ADA_HUMAN
    R101H #reference amino acid,position in the sequence(1 based),variant amino acid
    R101L
    S291L
    >Retinal pigment
    G75R
    R97P

Email:
Users are required to submit a valid email address, to which the results will be sent when they are ready.

How to cite?

Niroula A, Urolagin S, Vihinen M (2015) PON-P2: Prediction Method for Fast and Reliable Identification of Harmful Variants. PLoS ONE 10(2): e0117380. doi:10.1371/journal.pone.0117380
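The rules given under Sequence submission (upper-case sequence, only the 20 standard amino acid codes, 1-based positions) can be checked before a query is sent. The sketch below is an illustration under those stated rules only; `validate_variation` is a hypothetical helper, and the checks PON-P2 actually performs on the server are not documented here.

```python
import re

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")
# Assumed variation shape: reference residue, 1-based position, variant residue.
VARIATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def validate_variation(sequence, variation):
    """Return a list of problems; an empty list means the pair looks consistent."""
    problems = []
    # The sequence may contain only the 20 standard upper-case amino acid codes.
    bad = sorted(set(sequence) - STANDARD_AA)
    if bad:
        problems.append("sequence contains unsupported characters: " + "".join(bad))
    match = VARIATION_RE.match(variation)
    if match is None:
        problems.append(f"variation {variation!r} is not of the form R101H")
        return problems
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if alt not in STANDARD_AA:
        problems.append(f"variant residue {alt!r} is not a standard amino acid")
    # Positions are 1-based, so position 1 is sequence[0].
    if not 1 <= pos <= len(sequence):
        problems.append(f"position {pos} is outside the sequence (length {len(sequence)})")
    elif sequence[pos - 1] != ref:
        problems.append(
            f"reference residue {ref!r} does not match {sequence[pos - 1]!r} at position {pos}"
        )
    return problems
```

Running such a check on each variation against its own sequence catches reference mismatches and off-by-one positions early.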
Performance of PON-P2 on additional datasets

We estimated the performance of PON-P2 on additional data by predicting the variations in the predictSNPSelected and SwissVarSelected datasets described in Grimm et al. The datasets are available in VariBench.
Note: Grimm et al. used these datasets to evaluate the performance of MutationTaster2, PolyPhen-2, Mutation Assessor, CADD, SIFT, LRT, FatHMM-U and FatHMM-W. The performance scores of the methods are presented in Supplementary Table S1. The accuracy and MCC of PON-P2 are higher than those of the compared methods, even for variations in proteins that were not present in the PON-P2 training dataset (the circularity-free dataset).
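For readers unfamiliar with the two measures named above, accuracy and the Matthews correlation coefficient (MCC) can both be computed from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function names are illustrative, and whether the reported scores involved any additional normalisation is not stated on this page.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Here `tp` and `tn` are correctly predicted pathogenic and neutral variants, and `fp` and `fn` are the corresponding misclassifications. Unlike accuracy, MCC stays informative when the two classes differ greatly in size.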
If you have any queries, please feel free to contact us.