Datasets
The database is currently hosted on the following remote file share server: https://bwsyncandshare.kit.edu/s/98NgmGCDty54kik/download/
Available Datasets
The following datasets can be downloaded using either the command line interface (CLI) or the Python API directly.
Name | Description | No. Elements | Target Type |
---|---|---|---|
MCF_7 | - | 26776 | Regression |
_price_small | - | 80000 | |
_test | - | 3 | Regression |
_test2 | - | 3 | Regression |
ames | Ames Mutagenicity Assays | 6512 | Classification |
aqsoldb | Aqueous Solubility | 9889 | Regression |
bace | BACE-1 Binding Affinity | 1513 | Regression, Classification |
bbbp | Blood-Brain Barrier Penetration | 1934 | Classification |
beet | Honey Bee Toxicity | 254 | Classification |
clintox | Clinical Toxicity | 1465 | Classification |
compas_1x | DFT properties of polycyclic aromatic hydrocarbons | 34072 | Regression |
compas_3x | DFT properties of polycyclic aromatic hydrocarbons | 39482 | Regression |
dpp4 | DPP-4 inhibitors | 3933 | Classification |
elanos_bp | Boiling Point | 5431 | Regression |
elanos_vp | Vapor Pressure | 2704 | Regression |
esol | Water Solubility | 1127 | Regression |
freesolv | Hydration Free Energy | 639 | Regression |
hiv | HIV Inhibitors | 38040 | Classification |
lipophilicity | Octanol/Water Distribution Coefficient | 4199 | Regression |
muv | MUV Benchmark | 93087 | Classification |
pcqm4mv2 | - | 3378606 | Regression |
qm9 | DFT properties of small molecules | 134000 | Regression |
qm9_smiles | DFT properties of small molecules | 133882 | Regression |
sider | Drug Side Effects | 1220 | Classification |
skin_irritation | Skin Irritation | 1263 | Classification |
skin_sensitizers | Skin Sensitization | 1263 | Classification |
synth_binary_global | - | 249455 | Classification |
synth_binary_local | - | 249455 | Classification |
tox21 | Toxicology | 7570 | Classification |
toxcast | Toxicology | 6842 | Classification |
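
When choosing which dataset to download, it can help to filter the table by target type and size first. The following is a minimal sketch using only the Python standard library. It is not part of the package API: `DatasetInfo`, `CATALOG`, and `select_datasets` are hypothetical names introduced here for illustration, and the catalog holds only a handful of rows copied by hand from the table above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetInfo:
    """One row of the dataset table (hypothetical helper type)."""
    name: str
    description: str
    num_elements: int
    target_types: Tuple[str, ...]


# A few rows copied by hand from the table above; this is not the full list.
CATALOG = [
    DatasetInfo("ames", "Ames Mutagenicity Assays", 6512, ("classification",)),
    DatasetInfo("aqsoldb", "Aqueous Solubility", 9889, ("regression",)),
    DatasetInfo("bace", "BACE-1 Binding Affinity", 1513, ("regression", "classification")),
    DatasetInfo("esol", "Water Solubility", 1127, ("regression",)),
    DatasetInfo("freesolv", "Hydration Free Energy", 639, ("regression",)),
    DatasetInfo("qm9", "DFT properties of small molecules", 134000, ("regression",)),
]


def select_datasets(
    catalog: List[DatasetInfo],
    target_type: str,
    max_elements: Optional[int] = None,
) -> List[str]:
    """Return the names of datasets matching a target type and an optional size limit."""
    wanted = target_type.lower()
    return [
        info.name
        for info in catalog
        if wanted in info.target_types
        and (max_elements is None or info.num_elements <= max_elements)
    ]


# Regression datasets with at most 10,000 elements.
small_regression = select_datasets(CATALOG, "regression", max_elements=10_000)
print(small_regression)
```

The resulting names can then be passed to the CLI or the Python API to download the corresponding datasets.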