The benchmarking platform for drug discovery.
Polaris makes it easy for the community working on machine learning in drug discovery to share and access datasets and benchmarks.
Increase your impact.
Our aim is to improve the state of benchmarking so that ML can have a greater impact on real-world drug discovery. As a first step, we hope to provide a single source of truth that aggregates datasets and benchmarks and makes them simple to access.
Download a dataset from the Hub
import polaris as po
# Load the dataset from the Hub
dataset = po.load_dataset("polaris/my-first-dataset")
# Get information on the dataset size
dataset.size()
# Load a datapoint in memory
dataset.get_data(
    row=dataset.rows[0],
    col=dataset.columns[0],
)
# Or, similarly:
dataset[dataset.rows[0], dataset.columns[0]]
# Get an entire data point
dataset[0]
Evaluate your method on a benchmark
import polaris as po
import numpy as np
# Load the benchmark from the Hub
benchmark = po.load_benchmark("polaris/my-first-benchmark")
# Get the train and test data-loaders
train, test = benchmark.get_train_test_split()
# Use the training data to train your model
# Get the inputs and targets as arrays with 'train.inputs' and 'train.targets'
# Or simply iterate over the train object.
for x, y in train:
    ...
# Work your magic to accurately predict the test set
predictions = [0.0 for x in test]
# Evaluate your predictions
results = benchmark.evaluate(predictions)
# Submit your results
results.upload_to_hub(owner="dummy-user")
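Before submitting, it can be useful to sanity-check a method locally against a trivial baseline. The sketch below is illustrative only and does not use the Polaris API: the `inputs` and `targets` lists are hypothetical stand-ins for what `train.inputs` and `train.targets` might hold, and `split_train_validation` and `mean_absolute_error` are helper names introduced here, not Polaris functions. It holds out part of the training data, predicts the training mean for every held-out point, and scores that constant baseline with the mean absolute error. The metric a given benchmark actually uses may differ.

```python
import random
from statistics import mean


def split_train_validation(inputs, targets, validation_fraction=0.2, seed=0):
    """Shuffle paired inputs/targets and hold out a fraction for validation."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)
    # Always keep at least one point for validation.
    n_val = max(1, int(len(indices) * validation_fraction))
    val_idx, fit_idx = indices[:n_val], indices[n_val:]
    fit = ([inputs[i] for i in fit_idx], [targets[i] for i in fit_idx])
    val = ([inputs[i] for i in val_idx], [targets[i] for i in val_idx])
    return fit, val


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical stand-in data (assumption): SMILES strings with made-up targets.
inputs = ["CCO", "CCN", "CCC", "c1ccccc1", "CC(=O)O"]
targets = [0.5, 1.0, 1.5, 2.0, 2.5]

(fit_x, fit_y), (val_x, val_y) = split_train_validation(inputs, targets)

# Constant baseline: predict the mean of the fitting targets for every point.
baseline = mean(fit_y)
val_predictions = [baseline for _ in val_x]
val_mae = mean_absolute_error(val_y, val_predictions)
print(f"Baseline validation MAE: {val_mae:.3f}")
```

A model that cannot beat this constant baseline on held-out training data is unlikely to do well on the benchmark's test set, so the comparison is a cheap check to run before calling `benchmark.evaluate`.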
Guidelines for dataset curation and method evaluation & comparison.
Starting with small molecules
Through a unique, cross-industry collaboration involving representatives from Recursion Pharmaceuticals, AstraZeneca, Relay Therapeutics, Pfizer, Merck, Nimbus Therapeutics, Blueprint Medicines, Johnson & Johnson, Novartis, Bayer, and Valence Labs, we'll be releasing recommended benchmarks and datasets plus guidelines for dataset curation, method evaluation, and comparison.