BIOINFORMATICS BENCHMARKS

Benchmark datasets available in bioinformatics.
Collected by the Lund University Protein Structure and Bioinformatics Group (LU PSB).

MULTIPLE SEQUENCE ALIGNMENT

BAliBASE Thompson JD, Plewniak F, Poch O: BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 1999, 15:87-88.

HOMSTRAD Mizuguchi K, Deane CM, Blundell TL, Overington JP: HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci 1998, 7:2469-2471.

OxBench suite Raghava GP, Searle SM, Audley PC, Barber JD, Barton GJ: OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinformatics 2003, 4:47.

PREFAB Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32:1792-1797.

SABmark Van Walle I, Lasters I, Wyns L: SABmark--a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 2005, 21:1267-1268.


PROTEIN THREE DIMENSIONAL STRUCTURES, NON-HOMOLOGOUS

PDBselect Griep S, Hobohm U: PDBselect 1992-2009 and PDBfilter-select. Nucleic Acids Res 2010, 38:D318-319.


PROTEIN THREE DIMENSIONAL STRUCTURES, CLASSIFICATION

CATH Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM: CATH--a hierarchic classification of protein domain structures. Structure 1997, 5:1093-1108.

SCOP Lo Conte L, Ailey B, Hubbard TJ, Brenner SE, Murzin AG, Chothia C: SCOP: a structural classification of proteins database. Nucleic Acids Res 2000, 28:257-259.


PROTEIN STRUCTURE AND FUNCTION PREDICTION

Protein Classification Benchmark Collection. Sonego P, Pacurar M, Dhir S, Kertesz-Farkas A, Kocsor A, Gaspari Z, Leunissen JA, Pongor S: A protein classification benchmark collection for machine learning. Nucleic Acids Res 2007, 35:D232-236.


PROTEIN-PROTEIN DOCKING

Docking benchmark. Hwang H, Vreven T, Janin J, Weng Z: Protein-protein docking benchmark version 4.0. Proteins 2010, 78:3111-3114.


GENE EXPRESSION ANALYSIS

Cope LM, Irizarry RA, Jaffee HA, Wu Z, Speed TP: A benchmark for Affymetrix GeneChip expression measures. Bioinformatics 2004, 20:323-331.

Zhu Q, Miecznikowski JC, Halfon MS: Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset. BMC Bioinformatics 2010, 11:285.


VARIATION DATASETS

VariBench Nair PS and Vihinen M: VariBench: A benchmark database for variations. Hum Mutat 2013, 34:42-49.

VariSNP Schaafsma GCP and Vihinen M: VariSNP: A benchmark database for variations from dbSNP. Hum Mutat 2015, 36:161-166.