PON-MMR2

PON-MMR2 classifies amino acid substitutions in the mismatch repair (MMR) proteins MLH1, MSH2, MSH6 and PMS2. It is a machine learning-based method that uses amino acid features and evolutionary information. It was trained and tested on variants from the InSiGHT database that are classified as benign (Classes 1 and 2) or harmful (Classes 4 and 5) (Thompson et al.), together with variants from the VariBench database that were used to train PON-MMR (Ali et al.).
If you use PON-MMR2, please cite the following publication.
Niroula A, Vihinen M. 2015. Classification of amino acid substitutions in mismatch repair proteins using PON-MMR2. Hum Mutat 36(12):1128-1134.


PON-MMR2

PON-MMR2 is a machine learning-based tool that classifies amino acid substitutions in the mismatch repair (MMR) proteins MLH1, MSH2, MSH6 and PMS2. It is trained using the random forest algorithm and uses amino acid features and evolutionary information. It was trained and tested on variants from the InSiGHT database that are classified as benign (Classes 1 and 2) or harmful (Classes 4 and 5) (Thompson et al.), together with variants from the VariBench database that were used to train PON-MMR (Ali et al.). The balanced accuracy and Matthews correlation coefficient (MCC) of PON-MMR2 are 0.89 and 0.78, respectively, in leave-one-out cross-validation and 0.85 and 0.67, respectively, on an independent test dataset.
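
For readers unfamiliar with the two performance measures quoted above, the following is a minimal sketch of how balanced accuracy and MCC are commonly computed from a binary confusion matrix. The function names and the counts in the example are illustrative assumptions; they are not the PON-MMR2 evaluation data and do not reproduce the reported values.

```python
import math


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity for a binary classifier."""
    sensitivity = tp / (tp + fn)  # fraction of harmful variants found
    specificity = tn / (tn + fp)  # fraction of benign variants found
    return (sensitivity + specificity) / 2


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Made-up counts for illustration only (NOT PON-MMR2 results).
counts = dict(tp=30, tn=50, fp=10, fn=10)
ba_value = balanced_accuracy(**counts)
mcc_value = mcc(**counts)
print(f"balanced accuracy = {ba_value:.3f}, MCC = {mcc_value:.3f}")
```

Balanced accuracy is useful when the benign and harmful classes differ in size, because each class contributes equally to the score.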


Reference sequences used in PON-MMR2

The reference protein sequences used in PON-MMR2 were obtained from UniProtKB. The gene names and UniProtKB accession numbers are listed below.
MLH1|P40692
MSH2|P43246
MSH6|P52701
PMS2|P54278
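
If you prepare queries programmatically, the four gene names and accession numbers listed above can be kept in a small lookup table. This is only a sketch: the helper name `resolve_identifier` is hypothetical, and the case-insensitive matching is an assumption made for convenience, not documented PON-MMR2 behaviour.

```python
# Gene name -> UniProtKB accession, as listed on this page.
REFERENCE_ACCESSIONS = {
    "MLH1": "P40692",
    "MSH2": "P43246",
    "MSH6": "P52701",
    "PMS2": "P54278",
}


def resolve_identifier(identifier):
    """Return (gene, accession) for a gene name or a UniProtKB accession.

    Hypothetical helper; matching is case-insensitive by assumption.
    """
    key = identifier.strip().upper()
    if key in REFERENCE_ACCESSIONS:
        return key, REFERENCE_ACCESSIONS[key]
    for gene, accession in REFERENCE_ACCESSIONS.items():
        if key == accession:
            return gene, accession
    raise ValueError(f"{identifier!r} is not one of the four MMR references")


print(resolve_identifier("MSH2"))
print(resolve_identifier("P54278"))
```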


Submit queries to PON-MMR2

Protein sequence and variations
The sequence identifier and variations should be submitted in FASTA format. Either the gene name or the UniProtKB accession number can be used as the sequence identifier; the reference sequences used in PON-MMR2 are listed above. Multiple variations in a single protein or in multiple proteins can be submitted as a single query. Alternatively, a file containing the identifier(s) and variation(s) in the same format can be uploaded.
Example:
>P40692 #UniProtKB accession
G101S
T116P
>MSH2 #Gene name
L310R
D487E
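
Before submitting, it can help to check a query locally. The sketch below parses the example above into identifiers and variations and rejects lines that do not look like a single amino acid substitution. It is not part of PON-MMR2: the function name `parse_query` is hypothetical, and treating everything after `#` as an annotation, as well as restricting residues to the 20 standard one-letter codes, are assumptions.

```python
import re

# One-letter original residue, 1-based position, one-letter new residue.
VARIANT_RE = re.compile(
    r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$"
)


def parse_query(text):
    """Parse '>identifier' blocks followed by one variation per line.

    Returns {identifier: [(original, position, new), ...]}.
    """
    queries = {}
    current = None
    for raw in text.splitlines():
        # Assumption: text after '#' is an annotation and can be dropped.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            queries.setdefault(current, [])
            continue
        if current is None:
            raise ValueError("variation found before any '>' identifier line")
        match = VARIANT_RE.match(line)
        if match is None:
            raise ValueError(f"not a valid substitution: {line!r}")
        original, position, new = match.groups()
        queries[current].append((original, int(position), new))
    return queries


EXAMPLE = """>P40692 #UniProtKB accession
G101S
T116P
>MSH2 #Gene name
L310R
D487E
"""

parsed = parse_query(EXAMPLE)
for identifier, variations in parsed.items():
    print(identifier, variations)
```

The check is purely syntactic; it does not verify that the original residue matches the reference sequence at that position.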

Email
The email field is required. The predictions will be sent to the email address you provide.


PON-MMR2 output

An email with the result file attached is sent to the address provided during job submission. The result file contains the following:
Prediction
1. Gene name(s)
2. UniProtKB accession number(s) for the reference protein sequence
3. Amino acid substitution(s)
4. Position of the amino acid substitution in the reference protein sequence
5. Original amino acid in the reference protein sequence
6. New amino acid that substitutes the original amino acid
7. Probability of pathogenicity predicted by PON-MMR2. It ranges from 0 (benign) to 1 (harmful).
8. Classification of the variation as pathogenic or neutral based on the probability of pathogenicity (Column 7).
9. InSiGHT Class. It is provided for the amino acid substitutions that are present in PON-MMR2 training and test datasets.
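
If you process the result file in a script, the nine columns above can be represented as one record per variation. The sketch below is an assumption-laden illustration: the class name `PredictionRow` and its field names are hypothetical, the delimiter and exact layout of the result file are not described here and are not assumed, and the values in the example row are placeholders rather than real PON-MMR2 output. The only checks performed follow from the column descriptions: columns 4 to 6 must agree with column 3, and the probability must lie between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PredictionRow:
    gene: str                             # column 1
    accession: str                        # column 2
    substitution: str                     # column 3, e.g. "L310R"
    position: int                         # column 4
    original: str                         # column 5
    new: str                              # column 6
    probability: float                    # column 7, 0 (benign) to 1 (harmful)
    classification: str                   # column 8
    insight_class: Optional[str] = None   # column 9, may be absent

    def __post_init__(self):
        expected = f"{self.original}{self.position}{self.new}"
        if self.substitution != expected:
            raise ValueError(
                f"columns 4-6 give {expected}, column 3 is {self.substitution}"
            )
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")


# Placeholder values only; this is NOT a real PON-MMR2 prediction.
row = PredictionRow("MSH2", "P43246", "L310R", 310, "L", "R",
                    0.5, "placeholder")
print(row.substitution, row.probability)
```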

Other details
1. How to cite PON-MMR2?
2. Disclaimer notice
3. Liability notice

Download PON-MMR2 predictions

We predicted all possible amino acid substitutions at each position in the four MMR proteins using PON-MMR2. You can download the predictions using the following link.
PON-MMR2 predictions
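
To give a sense of the size of this set, the sketch below enumerates every single amino acid substitution for a sequence. It assumes that "all possible" means the 19 alternative standard amino acids at each position; the function name and the three-residue toy sequence are made up for illustration and are not taken from a reference protein.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def all_substitutions(sequence):
    """Yield every substitution, e.g. 'M1A', using 1-based positions."""
    for position, original in enumerate(sequence, start=1):
        for new in AMINO_ACIDS:
            if new != original:
                yield f"{original}{position}{new}"


toy_sequence = "MKV"  # made-up fragment, not a real reference sequence
substitutions = list(all_substitutions(toy_sequence))
print(len(substitutions))  # 3 positions x 19 alternatives = 57
print(substitutions[:3])
```

Under this assumption, a protein of length N has 19 × N possible substitutions.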



If you have any queries, please feel free to contact us.