ASAP Discovery x OpenADMET Competition: Take part in the first prospective benchmark on Polaris.

Dataset

asap-discovery/antiviral-potency-2025-sample

Sample dataset for the ASAP Discovery x OpenADMET potency challenge. Represents a portion of the training data.

Created on: December 03, 2024
Number of datapoints: 207
Public
V2

Status

Uncertified

This artifact has not been certified by approved reviewers. It may contain issues related to data quality.


Tags

MERS-CoV
SARS-CoV-2
Mpro
potency

Modalities

MOLECULE

Related benchmarks

No related benchmarks yet.

You're looking at a v2.0 dataset!

Our goal at Polaris is to build a universal format for ML-ready datasets in drug discovery. With our V2 implementation, we're drastically improving scalability, but there's still work to be done!

Details

README

Potency Challenge (Sample Dataset)

This dataset is made available as part of the ASAP Discovery x OpenADMET competition. It's a small portion of the training data, released early so that teams can prepare data loaders and other utilities.

Structure

This dataset has the following columns:

| Column | Dtype | Description |
| --- | --- | --- |
| Molecule Name | str | Internal identifier at ASAP Discovery for this molecule |
| CXSMILES | str | Text representation of the 2D molecular structure |
| pIC50 (SARS-CoV-2 Mpro) | float | Negative log10 of the IC50 values of the dose-response curve |
| pIC50 (MERS-CoV Mpro) | float | Negative log10 of the IC50 values of the dose-response curve |
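
Since both target columns are described as the negative log10 of the IC50, a small conversion helper may be useful when comparing against raw dose-response values. The sketch below assumes the IC50 is expressed in mol/L, which is the usual convention but is not stated on this page; the function names are illustrative.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50). Assumption: IC50 is given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 = 10 ** (-pIC50), in mol/L under the same assumption."""
    return 10.0 ** (-pic50)


# A 1 micromolar IC50 corresponds to a pIC50 of about 6.
print(ic50_to_pic50(1e-6))
```

Higher pIC50 values therefore mean a lower IC50, i.e. a more potent compound.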

For the challenge, we will provide all these columns for the train set. At test time, we will only provide the CXSMILES.
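
As a starting point for a data loader, here is a minimal sketch that reads records with the columns above and tolerates the test-time case where only CXSMILES is present. It assumes the data has been exported to CSV, and it treats empty cells as missing values; neither is confirmed by this page. The `Record` class, the helper names, and the example rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional

TARGET_COLUMNS = ("pIC50 (SARS-CoV-2 Mpro)", "pIC50 (MERS-CoV Mpro)")


@dataclass
class Record:
    cxsmiles: str
    name: Optional[str]
    targets: Dict[str, Optional[float]]


def _to_float(value: Optional[str]) -> Optional[float]:
    # Absent columns (test set) and empty cells both map to None.
    if value is None or value.strip() == "":
        return None
    return float(value)


def read_records(handle) -> List[Record]:
    """Parse CSV rows into records; only the CXSMILES column is required."""
    records = []
    for row in csv.DictReader(handle):
        targets = {col: _to_float(row.get(col)) for col in TARGET_COLUMNS}
        records.append(
            Record(cxsmiles=row["CXSMILES"], name=row.get("Molecule Name"), targets=targets)
        )
    return records


# Made-up rows for illustration only.
TRAIN_CSV = """Molecule Name,CXSMILES,pIC50 (SARS-CoV-2 Mpro),pIC50 (MERS-CoV Mpro)
EXAMPLE-1,CCO,5.0,4.5
EXAMPLE-2,c1ccccc1,6.2,
"""
TEST_CSV = """CXSMILES
CCN
"""

train = read_records(io.StringIO(TRAIN_CSV))
test = read_records(io.StringIO(TEST_CSV))
```

Keeping the targets optional lets the same reader serve both the training data and the CXSMILES-only test data.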

📦 Raw data package

We've sacrificed the completeness of the scientific data to improve ease of use. However, for those who are interested, the raw data package from which this dataset was created is also available here.