This is the ADMET challenge, part of the ASAP Discovery x OpenADMET challenge.

ADMET

Absorption, Distribution, Metabolism, Excretion, Toxicology - or ADMET - endpoints sit in the middle of the assay cascade and can make or break preclinical candidate molecules. For this blind challenge we selected several crucial endpoints for the community to predict:

Mouse Liver Microsomal stability (MLM, protocol): a stability assay that measures how quickly a molecule is broken down by mouse liver microsomes. It serves as an estimate of how long a molecule will reside in the mouse body before it is cleared.
Human Liver Microsomal stability (HLM, protocol): a stability assay that measures how quickly a molecule is broken down by human liver microsomes. It serves as an estimate of how long a molecule will reside in the human body before it is cleared.
Solubility (KSOL, protocol): solubility is essential for drug molecules, as it heavily affects the pharmacokinetics and pharmacodynamics ('PK/PD') of the molecule in the human body.
LogD (protocol): LogD is a measure of a molecule's lipophilicity, or how well it dissolves in fat. It is similar to solubility, but for fatty tissue. LogD is determined by comparing how a molecule distributes between octanol, a fat-like substance, and water.
Cell permeation (MDR1-MDCKII, protocol): MDR1-MDCKII is a cell line used to model cell permeation, i.e. how well drug compounds permeate cell layers. For coronaviruses this is a critical endpoint because there is increasing evidence that afflictions such as long COVID are caused by (remnant) virus particles in the brain, and blood-brain-barrier (BBB) permeation is critical for drug candidates to reach the brain.
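
For readers who prefer a formula, the octanol/water comparison behind LogD is commonly written as below. This is the standard textbook definition, not a description of the ASAP assay itself: the pH and buffer are set by the linked protocol and are not stated on this page, and the symbols are our own notation.

```latex
\mathrm{LogD}_{\mathrm{pH}}
  = \log_{10}
    \frac{[\text{compound}]_{\text{octanol}}}
         {[\text{compound}]_{\text{aqueous}}}
% Both concentrations count all forms of the compound (ionised and neutral)
% at the stated pH. A LogD of 2 means roughly 100x more compound in the
% octanol phase than in the aqueous phase.
```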

📊 Data

The training set will have the following variables:

Column	Unit	Dtype	Description
Molecule Name		str	Internal identifier at ASAP Discovery for this molecule
CXSMILES		str	Text representation of the 2D molecular structure
MLM	uL/min/mg	float	MLM assay readouts for stability
HLM	uL/min/mg	float	HLM assay readouts for stability
KSOL	uM	float	KSOL assay readouts for solubility
LogD		float	LogD calculation
MDR1-MDCKII	10^-6 cm/s	float	MDR1-MDCKII assay readouts for permeability

At test time, we will only provide the CXSMILES.

🧑‍💻 Get started

We provide notebooks to get started with the challenge that cover several important topics, such as how to prepare your submission and the data format. Get started here.

✂️ Split

We used a temporal split. This accurately represents a real-life drug discovery situation where you train on historical data to be able to make the optimal model for the next period of prospective predictions. This means that there may be some overlap in the chemical space between the training and test set.
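
If you want to mimic this setup for your own validation, a time-ordered hold-out can be sketched as follows. This is only an illustration of the idea: the training table described above has no date column, so the `measured_on` key, the record layout and the cutoff are hypothetical, and this is not the procedure the organisers used to build the test set.

```python
from datetime import date


def temporal_split(records, cutoff, date_key="measured_on"):
    """Split records into (train, test) around an ISO-format cutoff date.

    `records` is a list of dicts; `date_key` is a hypothetical field holding
    an ISO date string such as "2024-05-01". Records dated before the cutoff
    go to train, records on or after the cutoff go to test.
    """
    cutoff_date = date.fromisoformat(cutoff)
    train, test = [], []
    for record in records:
        measured = date.fromisoformat(record[date_key])
        (train if measured < cutoff_date else test).append(record)
    return train, test


# Toy example with made-up molecules and dates.
toy_records = [
    {"CXSMILES": "CCO", "measured_on": "2023-11-02"},
    {"CXSMILES": "CCN", "measured_on": "2024-03-15"},
    {"CXSMILES": "c1ccccc1", "measured_on": "2024-06-01"},
]
toy_train, toy_test = temporal_split(toy_records, cutoff="2024-03-15")
```

Unlike a random split, this keeps every validation molecule later in time than every training molecule, which is closer to the prospective setting described above.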

✅ Evaluation criteria

Submissions will be judged on the criteria outlined here.

You must provide predictions for every endpoint.
We will evaluate your submission using the mean absolute error (MAE) on the log-transformed endpoints, after clipping to the strictly positive detection limit. This minimises the effect of massive outliers on the MAE. For LogD, which is already in log units, we will use MAE directly.
You can enter as many times as desired, but we will only evaluate your last submission.
In the open science spirit of ASAP Discovery we would love to see open code showing how you created your submission if possible. If not, we require at least a written report.
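
To check your own predictions locally, the metric described above can be approximated with the sketch below. It is not the official scoring code: the base-10 logarithm, the `floor` value standing in for the detection limit, and the function names are all assumptions made for illustration.

```python
import math

# Endpoints assumed to be scored on a log scale; LogD is already in log units.
LOG_ENDPOINTS = ("MLM", "HLM", "KSOL", "MDR1-MDCKII")


def endpoint_mae(y_true, y_pred, log_transform=True, floor=1e-3):
    """Mean absolute error, optionally on log10 values clipped at `floor`.

    `floor` is a placeholder for the strictly positive detection limit;
    the real limit is defined by the organisers, not by this sketch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one value to score")

    def transform(value):
        # Clip first so that zero or negative values cannot break log10.
        return math.log10(max(value, floor)) if log_transform else value

    errors = [abs(transform(t) - transform(p)) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


def score_submission(truth, preds, floor=1e-3):
    """Per-endpoint MAE for dicts mapping endpoint name -> list of values."""
    return {
        name: endpoint_mae(
            truth[name],
            preds[name],
            log_transform=name in LOG_ENDPOINTS,
            floor=floor,
        )
        for name in truth
    }
```

On the log scale, a prediction that is off by a factor of ten contributes an error of 1 regardless of the magnitude of the measurement, which is why a few very large readouts do not dominate the score.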

🏆 Prizes

We will be offering Polaris merch packs to the three top performing teams for each sub-challenge. We will also be writing our conclusions up as a paper, to which all submitting teams are invited to share co-authorship.

Post-challenge virtual workshop

Participants with strong performance or notable learnings will have the opportunity to present their work at a special blind-challenge workshop, hosted by the NIH ASAP Open Science Forum.

A special issue in the Journal of Chemical Information and Modeling

We encourage everyone participating in this challenge to share a preprint that describes their approach, results, performance and learnings from comparing their approach with other groups. We will track these preprints during the challenge so that participants can compare learnings on the fly. We are working with the editors of the Journal of Chemical Information and Modeling (JCIM; a high-impact journal in chemical informatics and molecular modeling) on a collection of papers or special issue that will report on the breadth of learnings and outcomes from this challenge.

💭 Any feedback?

If you have any suggestions on how we could evaluate the submitted predictions to further improve our shared understanding, we'd love to hear them! Please open a GitHub issue or discussion in this repository to share your ideas with the community.

About the ASAP Discovery x OpenADMET challenge

ASAP Discovery is an NIH-funded consortium leveraging open science for antiviral drug discovery, with the goal of equitable and affordable global access to effective antivirals. ASAP has pursued several programs and targets, the most advanced being ASAP's dual SARS-CoV-2 and MERS-CoV main protease (Mpro) program, which has reached preclinical candidate nomination. You can see a full list of ASAP's programs on the website. ASAP Discovery is passionate about open science and has put a huge amount of effort into sharing its outputs in a digestible way with the community. For example, if you navigate to ASAP's website, the drug discovery pipeline is fully interactive for users. Clicking any filled box will take you to the continuously published data for those experiments and the experimental protocols used.

ASAP Discovery is approaching a patent disclosure for the preclinical candidates of its two coronavirus Mpro drug discovery programs (see blogpost for a high-level overview). There is a batch of data in these projects that ASAP Discovery has not publicly disclosed at this point; this will be the blind test data of this challenge. The blind challenge will mirror some of the real-world drug discovery challenges that ASAP has had to overcome in the last three years. We would love to challenge the community with the same hurdles: can you use your models to solve these problems better than we have? You will be working with active, real drug discovery data of a kind that is normally restricted to large pharmaceutical companies!

The ASAP Discovery Consortium group meeting in NYC, May 2023

Timeline

Sample data released: December 3 (2024)
Challenge start: Jan 13 (2025)
Walk-in online sessions: Jan-Feb (2025)
Challenge end: March 10 (2025)
Winners announced: March 25 (2025)

Endpoints included in this challenge

We have designed this challenge to let you experience a diverse set of computational drug discovery problems that are pivotal in pushing the pharmaceutical decision-making process forward. To understand the typical medicinal-chemistry way of thinking about making a preclinical candidate, it's best to start at the top. Target Candidate Profiles (TCPs) are internal documents that pharmaceutical companies draw up to set a series of goals or must-haves (and sometimes nice-to-haves) that the intended preclinical candidate has to meet. With ASAP, these are public. Our SARS-CoV-2/MERS-Mpro dual inhibitor TCP is available here. You'll see there are many goals: the set of goals and their values depend heavily on the target indication (the disease that we're trying to treat).

You'll also notice that potency (IC50 or Kd) is only a small part of this TCP. That is typical: in close-to-preclinical stages such as lead optimization, potency is no longer the main challenge. Rather, the challenge is to balance a wide array of more complex parameters such as cell potency, formulation, pharmacokinetics/dynamics and safety. These are all part of the 'assay cascade': promisingly potent lead molecules are subjected to a first tier of affordable follow-up assays. Molecules that come out of those assays as acceptable (i.e. within the bounds of the TCP requirements) are followed up in subsequent assay tiers. In this way, lead molecules follow the cascade from simple biochemical potency assays all the way to more involved assays and ultimately animal studies.
