Submission Form is now open!


Objective

We are collecting challenging questions that will test AI systems’ ability to surface and reason over key information necessary for developing personalized therapeutic approaches for patients with genetic disease.

Your submission will contribute to a benchmark for evaluating AI capabilities in therapeutic actionability. If your question is selected for inclusion, your name will be associated with it in the dataset, and you will be invited as an author on the corresponding paper.

Submission Process

  1. Develop Your Question: Create a question in English focused on therapeutic actionability for genetic diagnoses. This could involve identifying appropriate treatments, assessing clinical trial eligibility, interpreting genetic information relevant to treatment, or applying principles of drug development to rare diseases. Review the contest description, including the Frequently Asked Questions subsection, to understand the scope.

  2. Test with Current AI Systems: We recommend testing your question with available AI systems to gauge its difficulty level. Questions that stump current models or reveal common LLM “hallucinations” (false trial IDs, numbers, etc.) are especially valuable for our benchmark.

  3. Provide a Comprehensive Solution: Include a detailed yet concise explanation of the correct answer, including reasoning steps and necessary information sources.

  4. Submit for Review: After submission, your question will undergo expert review to ensure quality, accuracy, and alignment with the challenge objectives.

  5. Publication and Attribution: Selected questions will be included in the dataset with proper attribution to you as the author. Contributors with more accepted questions will be listed earlier in the author list of the resulting paper. Note that although your question may be kept private for detecting AI memorization, you will still receive credit.

Please review our example questions to understand the expected format and challenge level.

Guidelines

1. Original & Interesting

  • Questions must be original and authored by you (not copied from other sources).
  • Consider submitting multiple versions of a question in which a minor detail (e.g., specific variant, patient age) is modified so that the answer changes.
  • Write questions that you would find it impressive for an automated system to answer consistently and accurately. Consider testing current LLMs to confirm your questions present meaningful challenges!

2. Medically Relevant

  • Questions should reflect realistic therapeutic decision points.
  • Focus on information that would help clinicians and researchers make treatment decisions.
  • Questions should span the full spectrum of therapeutic strategies (approved therapeutics, clinical trials, off-label drug repurposing opportunities, and personalized therapeutic feasibility).

3. LLM-Challenging

The following suggestions are based on known LLM failure modes. Try to design questions that:

  • Require integration or comparison of information from different sources (e.g., public databases that may not be fully indexed by web crawlers).
  • Necessitate multi-step reasoning about the potential for variant pathogenicity, genetic mechanisms of action, or clinical applications.
  • Have uncommon correct answers (e.g., from peer-reviewed publications or GeneReviews) that contradict popular or common misinformation.
  • Involve different identifiers, such as a preclinical drug name that differs from an approved drug name.

4. Objective & Close-Ended

  • Questions must have answers that would be accepted by other experts with relevant expertise.
  • All necessary context and definitions must be included within the input JSON object.
  • Good: Answers should be specific, unambiguous, and concise (e.g., “Yes”/“No”, gene name, numeric values).
  • Bad: Questions asking to “explain,” “discuss,” or “describe” are unsuitable.
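
One way to screen a draft prompt against the "Bad" guideline above is a simple keyword check. This is only a minimal sketch: the function name `open_ended_verbs` and the verb list are illustrative assumptions based on the three verbs quoted above, not part of any official submission tooling.

```python
import re

# Verbs the guidelines call out as unsuitable for close-ended questions.
OPEN_ENDED_VERBS = ("explain", "discuss", "describe")


def open_ended_verbs(prompt: str) -> list[str]:
    """Return the open-ended verbs found in a prompt (whole words, any case)."""
    found = []
    for verb in OPEN_ENDED_VERBS:
        if re.search(rf"\b{verb}\b", prompt, flags=re.IGNORECASE):
            found.append(verb)
    return found


# A close-ended prompt passes; an open-ended one is flagged.
print(open_ended_verbs('Is this variant eligible? Respond with "Yes" or "No".'))
print(open_ended_verbs("Explain why this patient is not eligible."))
```

A check like this only catches the wording; whether the answer is truly unambiguous still needs expert judgment.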

Categories

Questions will be crafted to elicit short, scorable answers suitable for automated evaluation. Example questions are below.

  • Category: Established, Targeted Therapies
    • Tasks: Accurately retrieve and apply key information regarding established therapeutics for this genetic diagnosis.
    • Example Questions:
      • Are there any FDA-approved targeted therapies for this genetic condition? If yes, respond with the generic name of the most recently approved therapy. If not, respond with “No”.
      • Is this patient’s CF variant eligible for treatment with Trikafta (elexacaftor/tezacaftor/ivacaftor)? Respond with “Yes” or “No”.
  • Category: Established, Supportive Therapies
    • Example Questions:
      • Are there any recommended supportive or preventative pharmaceutical therapies for this condition? If yes, respond with the generic name of one such drug with the strongest evidence base. If not, respond with “No”.
  • Category: Clinical Trials
    • Tasks: Identify ongoing clinical trials and assess patient eligibility based on available patient information.
    • Example Questions:
      • Are there any active clinical trials testing therapeutics for this patient’s genetic diagnosis? If yes, respond with the ClinicalTrials.gov ID for the most recent trial. If not, respond with “No”.
      • Based on this patient’s variant and clinical description, do they meet published inclusion criteria for the OTC-HOPE trial (ClinicalTrials.gov ID NCT06255782)? Respond with “Yes” or “No”.
  • Category: Drug Development and Repurposing
    • Tasks: Reasoning over drugs, targets, pathways, and assays.
    • Example Questions:
      • What FDA-approved drug targets the gene in which this patient has a pathogenic variant? Respond with the name of the drug.
      • Would the mechanism of action of sirolimus be most likely predicted to improve or worsen the phenotype based on this diagnosis? Respond with “Improve”, “Worsen”, or “No impact expected”.
  • Category: Variant Assessment
    • Tasks: Applied / molecular biology reasoning, with a focus on questions relevant to determining the feasibility of precision therapeutics.
    • Example Questions:
      • Is this combination of genotype and phenotype more consistent with a LOF or a GOF variant? Respond with “LOF” or “GOF”.
      • How many amino acids are canonically encoded by the exon in which this variant is found? Respond with a number.
      • What functional domain is this variant within? Respond with the name of a functional domain or with “None”.
      • How frequently is this variant observed in the gnomAD v4.1 database? Respond with a frequency ranging from 0 to 1.
      • What is the AlphaMissense score for this variant? Respond with a number.
      • Based on GTEx v10 data, is the gene of interest more highly expressed in the brain or in the liver? Respond with “Brain” or “Liver”.
      • Is there strong evidence that this exon is frequently skipped in healthy populations? Respond with “Yes” or “No”.
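
Short answers like those requested in the example questions above lend themselves to automated scoring. The sketch below shows one plausible approach, assuming normalized exact match for text answers and a tolerance-based comparison for numeric ones. The function names `normalize` and `is_correct` and the `rel_tol` parameter are hypothetical; the actual scoring procedure for the benchmark is not specified here.

```python
import math


def normalize(answer: str) -> str:
    """Trim whitespace, surrounding quotes, and periods, then lowercase."""
    return answer.strip().strip("\"'“”.").strip().lower()


def is_correct(predicted: str, expected: str, rel_tol: float = 0.0) -> bool:
    """Compare numerically when both answers parse as numbers, else as text."""
    try:
        return math.isclose(float(predicted), float(expected), rel_tol=rel_tol)
    except ValueError:
        # At least one answer is not a number: fall back to normalized text match.
        return normalize(predicted) == normalize(expected)


print(is_correct("“No”.", "No"))      # text answer with stray quotes and period
print(is_correct("0.50", "0.5"))      # numeric answer with different formatting
print(is_correct("Brain", "Liver"))   # mismatched text answer
```

Constraining each prompt to a fixed response vocabulary (for example “Yes” or “No”) is what makes a comparison this simple viable.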


Examples

✔ Example 1: Eligibility for OTC-HOPE Trial Based on Gestational Age

  • Patient Genotype:
    • Gene: OTC
    • Transcript: NM_000531.5
    • Variant (cDNA): c.386G>A
    • Variant (protein): p.Arg129His
    • Zygosity: hemizygous
  • Patient Phenotype: 3-month-old male infant with hyperammonemia, lethargy, and poor feeding requiring hospitalization shortly after birth. The patient was born at 35 weeks’ gestation and was diagnosed with OTC deficiency following genetic testing. Currently on protein-restricted diet, arginine supplementation, and alternative pathway therapy.
  • Question:
    • Category: Clinical Trials
    • Prompt: Based on this patient’s gestational age at birth and genetic status, do they meet the inclusion criteria for the OTC-HOPE trial (NCT06255782) testing ECUR-506 gene therapy? Answer “Yes” or “No”.
  • Expected Answer: No
  • Primary Data Source:
    • ClinicalTrials.gov
    • Literature on OTC deficiency
    • Other (Specific URL): https://clinicaltrials.gov/ct2/show/NCT06255782
  • Required Resources: ClinicalTrials.gov is essential for accessing current trial eligibility criteria.
  • Answer Explanation: Although the patient has a pathogenic variant in OTC and a phenotype consistent with OTC deficiency (OTCD), they do not meet key inclusion criteria for the OTC-HOPE trial. Specifically, the trial requires a gestational age of ≥ 37 weeks at birth, and this patient was born at 35 weeks’ gestation. Answering correctly requires combining knowledge of the specific trial inclusion criteria with the interpretation of the genetic variant and patient history.

Note: This example shows how a small change in the patient details (gestational age at birth) would alter the eligibility answer. A companion question with a patient born at term (≥37 weeks) would have the opposite answer.
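
The companion-question idea can be sketched as a small generator that varies only the gestational age. This is an illustration under stated assumptions: the field names (`genotype`, `phenotype`, `prompt`, `expected_answer`, and so on) are hypothetical and do not reflect the official input JSON schema, the phenotype text is abridged, and every eligibility criterion other than gestational age is assumed to be met.

```python
import json

# Gestational-age threshold (weeks) quoted in the answer explanation above.
MIN_GESTATIONAL_AGE_WEEKS = 37


def build_record(gestational_age_weeks: int) -> dict:
    """Build one question record; only gestational age varies between companions."""
    # Simplification: all other inclusion criteria are assumed to be satisfied,
    # so the expected answer depends on gestational age alone.
    eligible = gestational_age_weeks >= MIN_GESTATIONAL_AGE_WEEKS
    return {
        "genotype": {
            "gene": "OTC",
            "transcript": "NM_000531.5",
            "variant_cdna": "c.386G>A",
            "variant_protein": "p.Arg129His",
            "zygosity": "hemizygous",
        },
        # Abridged from the example's phenotype description.
        "phenotype": (
            "3-month-old male infant with hyperammonemia, lethargy, and poor "
            f"feeding; born at {gestational_age_weeks} weeks' gestation; "
            "diagnosed with OTC deficiency."
        ),
        "category": "Clinical Trials",
        "prompt": (
            "Based on this patient's gestational age at birth and genetic "
            "status, do they meet the inclusion criteria for the OTC-HOPE "
            'trial (NCT06255782)? Answer "Yes" or "No".'
        ),
        "expected_answer": "Yes" if eligible else "No",
    }


original = build_record(35)    # the example above
companion = build_record(39)   # hypothetical companion born at term
print(json.dumps(original, indent=2))
print(original["expected_answer"], companion["expected_answer"])
```

Keeping every field except one fixed makes it easy to see which detail drives the change in the expected answer.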

