PON-Del predictor for short protein deletions


PON-Del predicts the pathogenicity of short (1-10 amino acid) protein deletions. It was trained with LightGBM, a gradient boosting machine learning algorithm, on an extensive set of variations and showed superior performance compared to other tools. Only sequence-retaining deletions are predicted.

PON-Del Overview

Input deletions

Deletion type
Note: A maximum of 1000 lines is allowed. Deletions starting at position 1 are reported as Pathogenic because deleting the first amino acid (usually methionine) may prevent normal protein expression; such deletions are beyond the scope of the current prediction model.
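The input rules above can be checked before submitting a batch. The following is a minimal sketch, not the server's own validation code: it assumes each deletion is represented as a (start, end) pair of 1-based inclusive protein positions, and the function names and status strings are hypothetical. The order of the checks (length before the position-1 rule) is also an assumption.

```python
MAX_LINES = 1000           # stated limit on input lines
MIN_LEN, MAX_LEN = 1, 10   # PON-Del covers deletions of 1-10 amino acids


def check_deletion(start, end):
    """Classify one deletion given 1-based inclusive positions.

    The (start, end) representation and the returned status strings are
    illustrative assumptions, not the tool's actual input format.
    """
    if start < 1 or end < start:
        return "invalid"
    length = end - start + 1
    if length > MAX_LEN:
        return "too_long"
    if start == 1:
        # Reported as Pathogenic by rule; outside the scope of the model.
        return "pathogenic_by_rule"
    return "predict"


def check_batch(deletions):
    """Check a whole batch, enforcing the stated 1000-line limit."""
    if len(deletions) > MAX_LINES:
        raise ValueError(f"at most {MAX_LINES} lines are allowed")
    return [check_deletion(start, end) for start, end in deletions]
```

For example, `check_batch([(1, 3), (5, 5), (5, 15)])` separates the deletion handled by the position-1 rule, a deletion that would go to the model, and one that exceeds the 10-residue limit.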
Prediction log

              

Prediction results

Note: Deletions starting at position 1 are reported as Pathogenic because deleting the first amino acid (usually methionine) may prevent normal protein expression; such deletions are beyond the scope of the current prediction model.
Column Descriptions:
  • Input: Original input coordinates in the format you provided
  • RefSeq Protein: Corresponding RefSeq protein identifier
  • Deletion Start/End: Protein positions of the deletion
  • Predicted Probability: Two-state model prediction score (0-1, higher = more pathogenic)
  • Predicted Label: Two-state predicted label: P = Pathogenic, B = Benign
  • Predicted Probability (CV): Three-state prediction score (0-1, higher = more pathogenic)
  • Predicted Probability Std (CV): Three-state prediction standard deviation
  • Predicted Probability P (CV): Three-state prediction P-value (P > 0.05 = Uncertain)
  • Predicted Probability Label (CV): Three-state predicted label: P = Pathogenic, B = Benign, U = Uncertain
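
The relationship between the three-state columns can be sketched in code. This is an illustration under stated assumptions, not the predictor's implementation: the only rule taken from the descriptions above is that P > 0.05 yields Uncertain. The 0.5 probability cutoff between Pathogenic and Benign and the function name are hypothetical.

```python
P_VALUE_LIMIT = 0.05  # from the column description: P > 0.05 = Uncertain


def three_state_label(probability, p_value, cutoff=0.5):
    """Return 'P', 'B' or 'U' for one prediction.

    `cutoff` is an assumed threshold for illustration only; the actual
    value used by PON-Del is not stated in the column descriptions.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p_value > P_VALUE_LIMIT:
        return "U"  # prediction not considered reliable
    return "P" if probability >= cutoff else "B"
```

Under these assumptions, a score of 0.9 with a P-value of 0.01 is labelled P, while the same score with a P-value of 0.2 is labelled U.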

Single amino acid deletions

This page provides precalculated results for all possible single amino acid deletions in proteins coded by MANE transcripts.
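
To make "all possible single amino acid deletions" concrete, here is a minimal sketch that enumerates them for one protein sequence. It only illustrates the enumeration; it does not reproduce how the precalculated data were generated, and the function name is hypothetical.

```python
def single_deletions(sequence):
    """Yield (position, deleted_residue, remaining_sequence) for every
    1-based position in a protein sequence."""
    for position, residue in enumerate(sequence, start=1):
        # Drop the residue at `position` and join the flanking parts.
        yield position, residue, sequence[:position - 1] + sequence[position:]


for position, residue, remaining in single_deletions("MKT"):
    print(position, residue, remaining)
# prints:
# 1 M KT
# 2 K MT
# 3 T MK
```

A protein of length n therefore has n single amino acid deletions, one per position.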

Deletions starting at position 1 return Pathogenic because deletion of the first amino acid (usually methionine) may prevent normal protein expression.

You can search the data in several different ways.

Select identifier

1. Choose identifier type
2. Enter ID

Note:
PON-Del is based on MANE-selected RefSeq protein identifiers.

If you select a different identifier type, the corresponding RefSeq protein will be displayed.

The one-to-one mapping is defined by MANE v1.4.
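
The identifier lookup described in this note can be sketched as a dictionary that resolves any supported identifier to its RefSeq protein. This is a simplified illustration, not the site's code: the row layout, the field names and the placeholder identifiers are all hypothetical, and real mappings would come from the MANE v1.4 release.

```python
def build_lookup(rows):
    """Map every identifier in each row to that row's RefSeq protein.

    Each row is a dict with a 'refseq_protein' field plus any other
    identifier fields (hypothetical layout). Raises ValueError if an
    identifier would map to two different RefSeq proteins, which would
    break the one-to-one mapping.
    """
    lookup = {}
    for row in rows:
        refseq = row["refseq_protein"]
        for identifier in row.values():
            if lookup.get(identifier, refseq) != refseq:
                raise ValueError(f"{identifier} maps to more than one protein")
            lookup[identifier] = refseq
    return lookup


# Placeholder identifiers for illustration only; these are not real accessions.
rows = [
    {"refseq_protein": "NP_EXAMPLE1", "ensembl_protein": "ENSP_EXAMPLE1", "symbol": "GENE1"},
    {"refseq_protein": "NP_EXAMPLE2", "ensembl_protein": "ENSP_EXAMPLE2", "symbol": "GENE2"},
]
lookup = build_lookup(rows)
```

With this table, `lookup["ENSP_EXAMPLE1"]` and `lookup["GENE1"]` both resolve to `"NP_EXAMPLE1"`, and a RefSeq protein identifier resolves to itself.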

Predicted deletion pathogenicity

Heatmap for predicted pathogenicity

About

Datasets for PON-Del

The datasets used for training and testing the tool are available here: data_pondel.csv

Citing PON-Del

A manuscript describing the predictor has been submitted. In the meantime, please cite the URL of this website.

Contact

If you have any problems, please contact Haoyang (haoyang.zhang@med.lu.se) or Mauno (mauno.vihinen@med.lu.se).