Protein Design CompetitionRound 2 with Adaptyv Bio and Dimension. Win and present at NeurIPS!

This dataset has been certified!

Dataset

adaptyv-bio/egfr-binders-v0

This dataset includes binding protein designs targeting the Epidermal growth factor receptor(EGFR), a drug target associated with various diseases.

Created on: September 26, 2024Dataset size: 61 KBNumber of datapoints: 202
Public
Certified

Tags

protein-design

Modalities

No modalities found.

Details

README

Background

Target details

Epidermal Growth Factor Receptor (EGFR) is a transmembrane protein that plays a critical role in cell growth, differentiation, and survival. It is frequently overexpressed or mutated in various cancers, including non-small cell lung cancer, colorectal cancer, and head and neck cancer. This makes EGFR a crucial target for cancer therapies such as Cetuximab, an antibody with more than 1B USD in annual revenue.

  • Target Protein: EGFR
  • Organism: HUMAN
  • Uniprot Accession ID: P00533
  • Protein sequence: LEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCKATGQVCHALCSPEGCWGPEPRDCVSCRNVSRGRECVDKCNLLEGEPREFVENSECIQCHPECLPQAMNITCTGRGPDNCIQCAHYIDGPHCVKTCPAGVMGENNTLVWKYADAGHVCHLCHPNCTYGCTGPGLEGCPTNGPKIPS
  • Structure PDB: 6ARU

64ru

Binding protein designs

This dataset contains 202 designed EGFR-binding protein sequences, along with experimental binding affinity results tested by the AdaptyvBio team.

The affinity constants (KD) in this dataset were calculated as the average of the experimental replicates. The KD values were then converted into binary labels binding_class.

An additional data package containing raw lab data and kinetic curves can be downloaded here.

Reference: