Background
EGFR (Epidermal Growth Factor Receptor) kinase is a type of receptor tyrosine kinase that plays a significant role in cell growth, proliferation, and survival. Mutations or overexpression of EGFR have been associated with various diseases, particularly cancer.
Benchmarking
EGFR Wild type: Targeting wild-type EGFR with small-molecule inhibitors, such as erlotinib, is an ongoing area of research in the treatment of glioblastoma. While early findings are promising, the complexity of glioblastoma biology presents challenges that require further investigation to improve treatment outcomes for patients.
The goal of this benchmark is to perform a single task, which is to find the best predictive model for:
- Optimization of the bioactivity (% inhibition) for wild-type EGFR.
- Discovery of potential hits in new chemical space.
Description of readout
- Readouts: CLASS_EGFR
- Bioassay readout: percentage of inhibition.
- Optimization objective: positive label (1)
- Number of data points: train: 496, test: 144
- Thresholds: > 80
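The readout above implies that the continuous percentage-of-inhibition value is turned into a binary label using the > 80 threshold, with 1 as the positive class. A minimal sketch of that binarization follows. The function name `binarize_inhibition` and the example values are hypothetical, and treating exactly 80 as negative is an assumption based on the strict ">" in the threshold.

```python
from typing import Iterable, List

# Threshold taken from the "Thresholds: > 80" entry (percent inhibition).
INHIBITION_THRESHOLD = 80.0


def binarize_inhibition(
    values: Iterable[float], threshold: float = INHIBITION_THRESHOLD
) -> List[int]:
    """Return 1 (positive) for values strictly above the threshold, else 0."""
    return [1 if value > threshold else 0 for value in values]


# Hypothetical percent-inhibition values, not taken from the dataset.
example_values = [12.5, 80.0, 80.1, 97.3]
labels = binarize_inhibition(example_values)
print(labels)  # [0, 0, 1, 1]
```

Because the comparison is strict, a compound measured at exactly 80% inhibition is labeled 0 in this sketch.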
Data resource:
Train/test split
Given the benchmarking goal, a scaffold-based splitting approach was applied to ensure training and test sets contain distinct chemical structures while maintaining the diversity of scaffolds.
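To illustrate the idea, here is a minimal sketch of one common greedy scaffold split. It is not necessarily the procedure used to build this benchmark, which is documented in the curation notebook. It assumes the scaffold of each molecule has already been computed as a string (for example with RDKit's `MurckoScaffold.MurckoScaffoldSmiles`). The function name `scaffold_split`, the `test_fraction` parameter, and the example scaffolds are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def scaffold_split(
    scaffolds: Sequence[str], test_fraction: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Assign whole scaffold groups to train or test so no scaffold is shared."""
    # Group molecule indices by their (precomputed) scaffold string.
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest groups first; ties are broken by scaffold string for determinism.
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))

    n_total = len(scaffolds)
    n_train_target = n_total - int(round(test_fraction * n_total))

    train: List[int] = []
    test: List[int] = []
    for _, members in ordered:
        # A group goes to train only if it fits entirely within the train budget.
        if len(train) + len(members) <= n_train_target:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Hypothetical scaffold strings for six molecules.
example_scaffolds = [
    "c1ccccc1", "c1ccccc1", "c1ccncc1", "c1ccccc1", "C1CCCCC1", "c1ccncc1",
]
train_idx, test_idx = scaffold_split(example_scaffolds, test_fraction=0.33)
print(train_idx, test_idx)  # [0, 1, 3, 4] [2, 5]
```

Since groups are never divided, the realized test fraction can differ from the requested one when a few scaffolds dominate the dataset.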
Distribution of the train/test in the chemical space
Related links
The full curation and creation process is documented in the notebook.
Related benchmarks
- polaris/drewry_egfr_wildtype_singletask_reg_v1
- polaris/egfr_wt_l858r_v1
Note: It's recommended to evaluate your methods against all the benchmarks related to this dataset.