A benchmark database for variations


Structure mapped variants


Dataset used for PON-SC

Datasets for residue side-chain clashes: 7796 variations from the PDB in F1 and 350 variations from 5 test datasets in F2.

  1. Download: F1, F2

Reference: Čalyševa, J., & Vihinen, M. (2017). PON-SC - program for identifying steric clashes caused by amino acid substitutions. BMC Bioinformatics, 18(1), 531. doi:10.1186/s12859-017-1947-7. PUBMED


Semi-automatically derived and hand-curated collection of proteins that contain an amino acid changed by a single-nucleotide variant (SNV) and for which 3D atomic coordinates are available in the PDB. F1 contains a benchmark dataset of 374 unique human variants, each corresponding to a different PDB entry.

  1. Download: F1

Reference: Bhattacharya, R., Rose, P. W., Burley, S. K., & Prlić, A. (2017). Impact of genetic variation on three dimensional structure and function of proteins. PLoS ONE, 12(3), e0171355. doi:10.1371/journal.pone.0171355. PUBMED


Membrane protein datasets with a total of 2058 variants in F1.

  1. HTPd_variants_info.csv
  2. HTPd.fasta
  3. DS508.fasta
  4. DS1289.fasta
  5. mpHTP.fasta
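
The .fasta files listed above can presumably be read with any standard FASTA reader. The following is a minimal sketch in plain Python, assuming the files follow the usual FASTA layout (a header line starting with ">" followed by one or more sequence lines); the function name read_fasta and the two example records are illustrative only and are not taken from the datasets.

```python
from io import StringIO
from typing import Dict, Optional, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Parse FASTA text into a {header: sequence} dictionary.

    Assumes conventional FASTA: '>' starts a header line, and all
    following lines up to the next header belong to that sequence.
    """
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; join them back together.
            records[header] += line
    return records


# Made-up two-record example (hypothetical, not from the datasets).
example = StringIO(">protA\nMKT\nLLV\n>protB\nGAV\n")
sequences = read_fasta(example)
print(len(sequences))
```

For a downloaded file, the same function could be called on an open file handle, for example one obtained with open("HTPd.fasta"), assuming the file has been saved to the working directory.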

Reference: Orioli and Vihinen, in press.

Last updated: 2019-04-09 by Anasua Sarkar.