Datasets

The database is currently hosted on the following remote file share server: https://bwsyncandshare.kit.edu/s/98NgmGCDty54kik/download/.

Available Datasets

The following datasets can be downloaded either through the command line interface (CLI) or directly through the Python API.

| Name | Description | No. Elements | Target Type |
| --- | --- | --- | --- |
| MCF_7 | - | 26776 | Regression |
| _price_small | - | 80000 | |
| _test | - | 3 | Regression |
| _test2 | - | 3 | Regression |
| ames | Ames Mutagenicity Assays | 6512 | Classification |
| aqsoldb | Aqueous Solubility | 9889 | Regression |
| bace | BACE-1 Binding Affinity | 1513 | Regression, Classification |
| bbbp | Blood-Brain Barrier Penetration | 1934 | Classification |
| beet | Honey Bee Toxicity | 254 | Classification |
| clintox | Clinical Toxicity | 1465 | Classification |
| compas_1x | DFT properties of polycyclic aromatic hydrocarbons | 34072 | Regression |
| compas_3x | DFT properties of polycyclic aromatic hydrocarbons | 39482 | Regression |
| dpp4 | DPP-4 inhibitors | 3933 | Classification |
| elanos_bp | Boiling Point | 5431 | Regression |
| elanos_vp | Vapor Pressure | 2704 | Regression |
| esol | Water Solubility | 1127 | Regression |
| freesolv | Hydration Free Energy | 639 | Regression |
| hiv | HIV Inhibitors | 38040 | Classification |
| lipophilicity | Octanol/Water Distribution Coefficient | 4199 | Regression |
| muv | MUV Benchmark | 93087 | Classification |
| pcqm4mv2 | - | 3378606 | Regression |
| qm9 | DFT properties of small molecules | 134000 | Regression |
| qm9_smiles | DFT properties of small molecules | 133882 | Regression |
| sider | Drug Side Effects | 1220 | Classification |
| skin_irritation | Skin Irritation | 1263 | Classification |
| skin_sensitizers | Skin Sensitization | 1263 | Classification |
| synth_binary_global | - | 249455 | Classification |
| synth_binary_local | - | 249455 | Classification |
| tox21 | Toxicology | 7570 | Classification |
| toxcast | Toxicology | 6842 | Classification |
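
When choosing a dataset for a particular task, it can help to group the entries by target type. The following is a minimal sketch in plain Python, not part of the package's API: the `DATASETS` list and the `group_by_target` helper are hypothetical names introduced only for illustration, and the rows are copied by hand from a few entries of the listing above. Target types are compared case-insensitively, and a comma-separated entry such as that of `bace` is counted under each of its types.

```python
from collections import defaultdict

# A few rows copied by hand from the listing above:
# (name, number of elements, target type). Hypothetical structure.
DATASETS = [
    ("esol", 1127, "Regression"),
    ("freesolv", 639, "Regression"),
    ("bace", 1513, "Regression, Classification"),
    ("bbbp", 1934, "Classification"),
    ("clintox", 1465, "Classification"),
]


def group_by_target(rows):
    """Map each lowercased target type to the dataset names that list it."""
    groups = defaultdict(list)
    for name, _count, targets in rows:
        # A dataset may list several target types separated by commas.
        for target in targets.split(","):
            groups[target.strip().lower()].append(name)
    return dict(groups)


groups = group_by_target(DATASETS)
print(groups["regression"])
print(groups["classification"])
```

The resulting names are the identifiers from the Name column, which is what the CLI or the Python API would be given when requesting a download.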