
This dataset has not yet been certified by approved reviewers. It may contain issues related to data completeness and quality.

Dataset

tdcommons/cyp3a4-substrate-carbonmangels

CYP3A4 substrate.

Created on: July 22, 2024
Dataset size: 27 KB
Number of datapoints: 670
Public

Tags

ADME

Modalities

MOLECULE

Details

README

Background

CYP3A4 is an important enzyme in the body, found mainly in the liver and the intestine. It oxidizes small foreign organic molecules (xenobiotics), such as toxins or drugs, so that they can be removed from the body. TDC used the dataset from [1], which merged information on substrates and nonsubstrates from six publications.

Description of readout

Task Description: Binary classification. Given a drug's SMILES string, predict whether the drug is a substrate of the CYP3A4 enzyme.
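To make the shape of the task concrete, here is a minimal Python sketch of a majority-class baseline: it takes SMILES strings as input, returns a 0/1 label, and is scored by accuracy. Everything in it is an assumption for illustration. The rows in `TOY_ROWS` are invented placeholders and are not taken from the dataset, the encoding 1 = substrate and 0 = nonsubstrate is assumed, and the helper names `fit_majority_baseline` and `accuracy` are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Placeholder rows of (SMILES, label). Assumed encoding: 1 = substrate, 0 = nonsubstrate.
# These labels are invented for illustration and are NOT taken from the dataset.
TOY_ROWS: List[Tuple[str, int]] = [
    ("CCO", 0),
    ("c1ccccc1", 0),
    ("CC(=O)O", 0),
    ("CCN", 1),
]


def fit_majority_baseline(labels: Sequence[int]) -> Callable[[str], int]:
    """Return a classifier that ignores the SMILES and predicts the most common label."""
    majority_label = Counter(labels).most_common(1)[0][0]
    return lambda smiles: majority_label


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


smiles = [s for s, _ in TOY_ROWS]
labels = [y for _, y in TOY_ROWS]

predict = fit_majority_baseline(labels)
predictions = [predict(s) for s in smiles]
toy_accuracy = accuracy(labels, predictions)
```

A baseline like this gives a floor that any model using the molecular structure should beat. A real model would replace the lambda with a function that featurizes the SMILES string before predicting.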

Data resource

References:

[1] Selecting relevant descriptors for classification by Bayesian estimates: a comparison with decision trees and support vector machines approaches for disparate data sets.

[2] admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties.