Submission Form is now open!


Objective

We are collecting challenging questions to test AI systems’ ability to surface and reason over the information needed to develop personalized therapies for patients with genetic diseases. These questions will form a benchmark for evaluating AI capabilities in therapeutic actionability.


Incentives

  • ✔ Expert Review & Feedback. Your question will be evaluated by domain experts, providing a second set of eyes and valuable input on a problem you care about.
  • ✔ Publication & Attribution. If selected for the benchmark, your name will appear with your question in the dataset, and you’ll be invited to co-author the resulting paper. Author order will reflect the number of high-quality questions contributed. Note that although your question may be kept private for detecting AI memorization, you will still receive credit.
  • ✔ Cash Awards. Twenty prizes of $500* will be awarded to questions randomly sampled from the pool of accepted submissions.
  • ✔ Impact on Clinical Practice. Your contribution will help shape AI tools that clinicians can use to solve this critical problem in therapeutic actionability.

*Exact award amounts to be confirmed


Guidelines

  • Original & Interesting. Questions must be original and authored by you. Consider submitting multiple questions with minor modifications (e.g., specific variant, patient age) that change the answer.
  • Medically Relevant. Questions should reflect realistic therapeutic decision points. Focus on information that would help clinicians and researchers make treatment decisions.
  • Objective & Close-Ended.
    • ✔ Questions should have answers that would be accepted by others with relevant expertise.
    • ✔ Questions should have answers that are specific, unambiguous, and concise (e.g., yes/no, generic drug name, numeric value).
    • ✘ Questions asking to explain, discuss or describe are unsuitable. See our Frequently Asked Questions for further clarifications.
  • Challenging for AI. Consider testing your question with free AI models (e.g., ChatGPT, Stanford’s Biomni, and/or Google DeepMind’s TxGemma). Questions that probe known AI failure modes might:
    • Combine info from multiple sources, especially those not fully indexed online.
    • Require multistep reasoning about variant impact, mechanisms, or clinical use.
    • Have authoritative answers (e.g., from GeneReviews) that contradict popular/common misinformation.
    • Use mismatched identifiers (e.g., preclinical vs. approved drug names).

Example Questions across Categories

Questions will require retrieving and accurately applying authoritative information across five therapeutic categories. Our Tutorials Page lists authoritative sources to consider during question development. See the example questions below for guidance on format, scope, and difficulty.

Established_Targeted

Retrieve/apply info on established targeted therapeutics for the specific genetic diagnosis.

Example 1. Gene therapy identification for pediatric patient
{
  "patient": {
    "genotype": [
      {
        "gene": "AGXT",
        "transcript": "NM_000030.3",
        "variant_cdna": "c.33dup",
        "variant_protein": "p.(Lys12GlnfsTer156)",
        "zygosity": "homozygous"
      }
    ],
    "clinical_context": "7-year-old with recurrent nephrocalcinosis and chronic kidney disease"
  },
  "question": {
    "category": "Established_Targeted",
    "answer_format": "string_match",
    "prompt": "What targeted, genetic therapies are approved for this patient in the US? Provide the alphabetized list of generic names.",
    "date_submitted": "2025-01-01"  
  }
}
  • Answer:
    • Expected: Lumasiran
    • Explanation: Lumasiran is approved for patients of all ages, whereas nedosiran is approved only for patients aged 9 years and older.
    • Challenging Rationale: Minor change in patient clinical context (current age) would change response.
  • Resources:
    • ☑ GeneReviews
    • ☑ PubMed/Literature
    • Specific Citations: https://www.ncbi.nlm.nih.gov/books/NBK1283/

Example 2. Teenage Cystic Fibrosis patient's eligibility for Trikafta
{
  "patient": {
    "genotype": [
      {
        "gene": "CFTR",
        "transcript": "NM_000492.4",
        "variant_cdna": "c.1521_1523del",
        "variant_protein": "p.Phe508del",
        "zygosity": "homozygous"
      }
    ],
    "clinical_context": "A 15-year-old male with a diagnosis of cystic fibrosis presents with chronic sinopulmonary disease characterized by persistent respiratory symptoms and recurrent infections. He also has exocrine pancreatic insufficiency requiring enzyme replacement therapy and exhibits features of male infertility consistent with congenital bilateral absence of the vas deferens (CBAVD)."
  },
  "question": {
    "category": "Established_Targeted",
    "answer_format": "binary",    
    "prompt": "Is the c.1521_1523del (F508del) variant eligible for treatment with Trikafta (elexacaftor/tezacaftor/ivacaftor)? Respond with “Yes” or “No”.",
    "date_submitted": "2024-12-10"  
  }
}
  • Answer:
    • Expected: Yes
    • Explanation: Trikafta (elexacaftor/tezacaftor/ivacaftor) was approved by the FDA in 2019 for patients aged 12 and older with at least one copy of the p.Phe508del mutation. As this patient has two copies of the mutation and meets the age requirement, he is eligible for Trikafta therapy.
    • Challenging Rationale: A small change in the patient details (current age) would alter the eligibility answer.
  • Resources:
    • ☑ GeneReviews
    • ☑ PubMed/Literature
    • Specific Citations: https://www.ncbi.nlm.nih.gov/books/NBK1250/#cf.Management, https://www.nature.com/articles/s41434-022-00347-0

Example 3. Variant eligibility for targeted drug treatment
{
  "patient": {
    "genotype": [
      {
        "gene": "AGXT",
        "transcript": "NM_000030.3",
        "variant_cdna": "c.508G>A",
        "variant_protein": "p.(Gly170Arg)",
        "zygosity": "homozygous"
      }
    ],
    "clinical_context": "Recurrent nephrocalcinosis and chronic kidney disease"
  },
  "question": {
    "category": "Established_Targeted",
    "answer_format": "string_match",
    "prompt": "What targeted, small molecule therapy is available for this patient? Provide the generic name or None.",
    "date_submitted": "2025-07-15"
  }
}
  • Answer:
    • Expected: Pyridoxine
    • Explanation: Certain AGXT missense variants, including p.(Gly170Arg), are amenable to pyridoxine treatment.
    • Challenging Rationale: Requires multistep reasoning; the variant has a missense impact which is relevant for targeted treatment eligibility.
  • Resources:
    • ☑ GeneReviews
    • Specific Citations: https://www.ncbi.nlm.nih.gov/books/NBK1283/

Example 4. Multiple choice question for treatment selection
{
  "patient": {
    "genotype": [
      {
        "gene": "DMD",
        "transcript": "NM_004006.2",
        "variant_cdna": "c.7544_9286del",
        "variant_protein": "p.(Thr2516_Ala3096del)",
        "zygosity": "hemizygous"
      }
    ],
    "clinical_context": "Progressive muscle weakness"
  },
  "question": {
    "category": "Established_Targeted",
    "answer_format": "multiple_choice",
    "prompt": "To which of the following targeted therapies would this variant be most likely amenable: Golodirsen, Viltolarsen, Eteplirsen, Casimersen, Ataluren, or None?",
    "date_submitted": "2025-07-15"
  }
}  
  • Answer:
    • Expected: Eteplirsen
    • Explanation: The variant results in deletion of exons 52-63, which is amenable to exon 51 skipping. Eteplirsen is the listed exon 51 skipping therapy.
    • Challenging Rationale: Multistep reasoning is required to recognize that exon skipping is relevant, determine which exons are deleted, and match that range to the exon each therapy skips.
  • Resources:
    • ☑ GeneReviews
    • Specific Citations: https://www.ncbi.nlm.nih.gov/books/NBK482346/#article-20747.s9


Established_Supportive

Retrieve/apply info on established supportive therapies for the condition/symptoms.

Example 5. Supportive Therapy for COL1A1-Related Bruising
{
  "patient": {
    "genotype": [
      {
        "gene": "COL1A1",
        "transcript": "NM_000088.4",
        "variant_cdna": "c.1678G>A",
        "variant_protein": "p.(Gly560Ser)",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "Joint hypermobility, skin hyperextensibility, and easy bruising."
  },
  "question": {
    "category": "Established_Supportive",
    "answer_format": "string_match",
    "prompt": "What two medications are most established for decreasing bruising? List generic names in alphabetical order.",
    "date_submitted": "2025-01-15"  
  }
}
  • Answer:
    • Expected: ascorbic acid, desmopressin
    • Explanation: There is some literature support for improvements in bleeding time, wound healing, and muscle strength after 1 year of daily oral, high-dose vitamin C (ascorbic acid) therapy. There is also literature support suggesting that desmopressin may normalize bleeding time, but the safety and efficacy of this drug for this genetic condition remain to be established.
    • Challenging Rationale: Authoritative sources (GeneReviews) might contradict some information found online for non-targeted, supportive treatment.
  • Resources:
    • ☑ GeneReviews
    • ☑ PubMed/Literature
    • Specific Citations: https://www.ncbi.nlm.nih.gov/books/NBK1244/#eds.Management, https://pubmed.ncbi.nlm.nih.gov/19036109/, https://pubmed.ncbi.nlm.nih.gov/3110540/


Clinical_Trials

Retrieve/apply info on ongoing clinical trials for the specific genetic diagnosis and assess patient eligibility for these trials.

Example 6. Eligibility for OTC-HOPE Trial Based on Gestational Age
{
  "patient": {
    "genotype": [
      {
        "gene": "OTC",
        "transcript": "NM_000531.5",
        "variant_cdna": "c.386G>A",
        "variant_protein": "p.Arg129His",
        "zygosity": "hemizygous"
      }
    ],
    "clinical_context": "3-month-old male infant with hyperammonemia, lethargy, and poor feeding requiring hospitalization shortly after birth. The patient was born at 35 weeks' gestation and was diagnosed with OTC deficiency following genetic testing. Currently on protein-restricted diet, arginine supplementation, and alternative pathway therapy."
  },
  "question": {
    "category": "Clinical_Trials",
    "answer_format": "binary",
    "prompt": "Based on this patient's gestational age at birth and genetic status, do they meet the inclusion criteria for the OTC-HOPE trial (NCT06255782) testing ECUR-506 gene therapy? Answer \"Yes\" or \"No\".",
    "date_submitted": "2025-02-19"  
  }
}
  • Answer:
    • Expected: No
    • Explanation: Although the patient has a pathogenic variant in OTC and a phenotype consistent with OTCD, they do not meet key inclusion criteria for the OTC-HOPE trial. Specifically, the trial requires a gestational age of ≥ 37 weeks at birth, and this patient was born at 35 weeks’ gestation. This requires combining knowledge of the specific trial inclusion criteria with the interpretation of the genetic variant and patient history.
    • Challenging Rationale: A small change in the patient details (gestational age at birth) would alter the eligibility answer.
  • Resources:
    • ☑ PubMed/Literature
    • ☑ ClinicalTrials.gov
    • Specific Citations: https://clinicaltrials.gov/ct2/show/NCT06255782
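The "small change alters the answer" property of Example 6 can be sketched as a single threshold check. This is only an illustration: the function names are hypothetical, the check covers just the gestational-age criterion quoted in the explanation (≥ 37 weeks), and a real eligibility assessment would need every inclusion and exclusion criterion from the trial record.

```python
def meets_gestational_age_criterion(gestational_age_weeks: int,
                                    minimum_weeks: int = 37) -> bool:
    """Check only the gestational-age criterion (>= 37 weeks per the explanation)."""
    return gestational_age_weeks >= minimum_weeks


def gestational_age_answer(gestational_age_weeks: int) -> str:
    """Format the result as a binary answer, assuming all other criteria are met."""
    return "Yes" if meets_gestational_age_criterion(gestational_age_weeks) else "No"


# The patient in Example 6 was born at 35 weeks; a term birth flips the answer.
print(gestational_age_answer(35))
print(gestational_age_answer(37))
```

Pairs of questions that sit on either side of such a threshold are one way to produce the minor-modification variants encouraged in the guidelines.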

Example 7. Identifying upcoming clinical trials for a condition
{
  "patient": {
    "genotype": [
      {
        "gene": "SLC35A2",
        "transcript": "NM_005660.3",
        "variant_cdna": "c.3G>A",
        "variant_protein": "p.Met1Ile",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "Patient with SLC35A2-CDG who is experiencing seizures and global developmental delay"
  },
  "question": {
    "category": "Clinical_Trials",
    "answer_format": "string_match",
    "prompt": "What clinical trial developing a new therapeutic for this condition is recruiting or listed as upcoming/not yet recruiting? Return a clinical trials ID",
    "date_submitted": "2025-07-15"
  }
}
  • Answer:
    • Expected: NCT05402384
    • Explanation: There are two clinical trials for SLC35A2-CDG (Solute carrier family 35 member A2 congenital disorder of glycosylation). One has an “Unknown Status”, and the other (NCT05402384) is listed as “Not yet recruiting”.
    • Challenging Rationale: Terms to identify and reason over (e.g., “Unknown Status”) are not listed in the question.
  • Resources:
    • ☑ ClinicalTrials.gov
    • Specific Citations: https://clinicaltrials.gov/search?cond=SLC35A2-CDG

Example 8. Patient ineligible for clinical trial due to exclusion criteria
{
  "patient": {
    "genotype": [
      {
        "gene": "SLC35A2",
        "transcript": "NM_005660.3",
        "variant_cdna": "c.3G>A",
        "variant_protein": "p.Met1Ile",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "seizures and global developmental delay. Age 2 months, Hemoglobin 5, Normal liver labs, Not enrolled in other trials"
  },
  "question": {
    "category": "Clinical_Trials",
    "answer_format": "binary",
    "prompt": "Is this patient eligible for clinical trial NCT05402384? Answer yes or no.",
    "date_submitted": "2025-07-15"
  }
}
  • Answer:
    • Expected: No
    • Explanation: The exclusion criteria list hemoglobin <7.
    • Challenging Rationale: Slight change in patient details (hemoglobin level) would alter response.
  • Resources:
    • ☑ ClinicalTrials.gov
    • Specific Citations: https://clinicaltrials.gov/study/NCT05402384


Drug_Development_and_Repurposing

Retrieve/apply info on approved drugs, their gene & pathway targets, mechanisms of action, and assays (e.g., gene expression) for repurposing or development.

Example 9. Drug mechanism requires Loss-of-Function variant
{
  "patient": {
    "genotype": [
      {
        "gene": "GRIN2B",
        "transcript": "NM_000834.5",
        "variant_cdna": "c.2755C>T",
        "variant_protein": "p.Gln919Ter",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "intellectual disability, seizures, and developmental delays"
  },
  "question": {
    "category": "Drug_Development_and_Repurposing",
    "answer_format": "multiple_choice",
    "prompt": "Is this patient's diagnostic genetic variant more likely amenable to treatment with Memantine, L-serine, or Radiprodil?",
    "date_submitted": "2024-05-05"  
  }
}
  • Answer:
    • Expected: L-serine
    • Explanation: The variant is a loss-of-function (LOF) variant. L-serine is being used for LOF variants, whereas the others are being used for gain-of-function (GOF) variants.
    • Challenging Rationale: Requires multistep reasoning over multiple terms/definitions for loss-of-function. First, variant impact (LOF vs. GOF) determines which medication is appropriate. Second, the variant introduces a termination codon, which makes it a nonsense variant, which in turn counts as LOF.
  • Resources:
    • ☑ PubMed/Literature
    • ☑ UniProt
    • Specific Citations: https://pubmed.ncbi.nlm.nih.gov/38380699/
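The reasoning chain in Example 9 (termination codon, therefore nonsense, therefore LOF) can be sketched with a rough parser for the `variant_protein` strings used on this page. This is a simplified heuristic, not a full HGVS parser: the function names are hypothetical, it only recognizes simple three-letter substitutions and frameshifts, and treating every truncating variant as LOF is an assumption that a real assessment would need to confirm.

```python
import re


def protein_consequence(variant_protein: str) -> str:
    """Rough consequence class from an HGVS-style protein string."""
    hgvs = variant_protein.replace("(", "").replace(")", "")
    if "fs" in hgvs:
        return "frameshift"
    # Simple substitution: reference residue, position, new residue or stop.
    match = re.fullmatch(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)", hgvs)
    if match is None:
        return "other"
    ref, position, alt = match.groups()
    if alt in ("Ter", "*"):
        return "nonsense"
    if ref == "Met" and position == "1":
        return "start_loss"
    return "synonymous" if ref == alt else "missense"


def is_likely_lof(variant_protein: str) -> bool:
    """Heuristic assumption: truncating variants are treated as loss-of-function."""
    return protein_consequence(variant_protein) in {"nonsense", "frameshift"}


print(protein_consequence("p.Gln919Ter"), is_likely_lof("p.Gln919Ter"))
```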


Variant_Assessment

Retrieve/apply info to assess variant-specific feasibility for precision therapeutics (e.g., ASOs, gene therapies).

Note: The N=1 Collaborative (“N1C”) has put together guidelines for assessing variant eligibility for ASO treatment with an accompanying video tutorial and developed a variant eligibility calculator with troubleshooting tips.

Example 10. Exon length is relevant for assessing ASO feasibility
{
  "patient": {
    "genotype": [
      {
        "gene": "ANO10",
        "transcript": "NM_018075.5",
        "variant_cdna": "c.289del",
        "variant_protein": "p.(Met97Ter)",
        "zygosity": "homozygous"
      }
    ],
    "clinical_context": "progressive cerebellar ataxia and peripheral neuropathy"
  },
  "question": {
    "category": "Variant_Assessment",
    "answer_format": "numeric_match",
    "prompt": "What proportion of the total coding transcript for this gene is encoded by the exon in which this variant occurs? Answer with a decimal to nearest tenth.",
    "date_submitted": "2024-05-05"
  }
}
  • Answer:
    • Expected: 0.1
    • Explanation: 66/660 = 0.1
    • Challenging Rationale: Requires retrieving the protein length and the exon length, then computing their ratio.
  • Resources:
    • ☑ Ensembl
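The arithmetic in Example 10 can be written out as a small helper. The function name is hypothetical; the values 66 and 660 are taken from the explanation above, and the sketch assumes both lengths are expressed in the same unit (both in amino acids or both in nucleotides).

```python
def exon_proportion(exon_length: int, total_length: int, ndigits: int = 1) -> float:
    """Fraction of the coding sequence contributed by one exon, rounded."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    if not 0 <= exon_length <= total_length:
        raise ValueError("exon_length must lie between 0 and total_length")
    return round(exon_length / total_length, ndigits)


# Values from the explanation in Example 10.
print(exon_proportion(66, 660))  # prints 0.1
```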

Example 11. Restoration of Transcript Impacted by Nonsense-Mediated Decay
{
  "patient": {
    "genotype": [
      {
        "gene": "KMT2B",
        "transcript": "NM_014727.3",
        "variant_cdna": "c.8079delC",
        "variant_protein": "p.(Ile2694SerfsTer44)",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "childhood-onset generalized dystonia"
  },
  "question": {
    "category": "Variant_Assessment",
    "answer_format": "binary",
    "prompt": "Based on typical prediction rules, is this variant likely to result in nonsense-mediated decay? Answer yes or no.",
    "date_submitted": "2024-05-05"
  }
}
  • Answer:
    • Expected: No
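    • Explanation: The variant is located near the end of the last exon, after the main domain. Premature termination codons in the last exon typically escape nonsense-mediated decay (NMD).
    • Challenging Rationale: Requires multistep reasoning: knowing what NMD is and that the variant's location affects the likelihood of NMD.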
  • Resources:
    • ☑ UniProt

Example 12. Multiple choice identification of functional domains impacted
{
  "patient": {
    "genotype": [
      {
        "gene": "NF1",
        "transcript": "NM_001042492.3",
        "variant_cdna": "c.3728T>C",
        "variant_protein": "p.(Leu1243Pro)",
        "zygosity": "heterozygous"
      }
    ],
    "clinical_context": "Malignant Peripheral Nerve Sheath Tumor and Pheochromocytoma"
  },
  "question": {
    "category": "Variant_Assessment",
    "answer_format": "multiple_choice",
    "prompt": "In which functional domain does this variant occur? Answer choices: CSRD, TBD, GRD, Sec14-PH, HLR, NLS, SBR.",
    "date_submitted": "2024-05-05"
  }
}
  • Answer:
    • Expected: GRD
    • Explanation: The GRD (GAP-related domain) spans residues 1198–1549 of the protein. This variant affects residue 1243.
    • Challenging Rationale: Need to access domain ranges then determine if variant lands within one of those ranges.
  • Resources:
    • ☑ PubMed/Literature
    • ☑ UniProt
    • Specific Citations: https://www.mdpi.com/2073-4425/13/7/1130#
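The lookup described in Example 12 amounts to extracting the residue number from the protein change and testing it against domain ranges. The sketch below assumes a hypothetical `DOMAINS` table holding only the GRD range quoted in the explanation; ranges for the other answer choices would have to be taken from the cited source, and the function names are illustrative.

```python
import re

# Only the GRD range is given in the explanation above; add others from the source.
DOMAINS = {"GRD": (1198, 1549)}


def residue_position(variant_protein: str) -> int:
    """Return the first residue number in a string such as 'p.(Leu1243Pro)'."""
    match = re.search(r"[A-Z][a-z]{2}(\d+)", variant_protein)
    if match is None:
        raise ValueError(f"no residue position found in {variant_protein!r}")
    return int(match.group(1))


def domains_containing(position: int, domains: dict = DOMAINS) -> list:
    """List every domain whose inclusive range contains the position."""
    return [name for name, (start, end) in domains.items() if start <= position <= end]


position = residue_position("p.(Leu1243Pro)")
print(position, domains_containing(position))  # prints 1243 ['GRD']
```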



Question Submission Form

Note that you can submit multiple questions using the templates linked in the Batch Question Submission section below. We suggest submitting question(s) using this form first to familiarize yourself with the required fields.

I. Patient Genetic Diagnosis

List all relevant SNV/indel genetic variants associated with this patient's diagnosis. You may enter multiple variants.

II. Patient Clinical Context

Provide a brief, clinically relevant description of the patient's symptoms/condition. Feel free to include "distractor" information and/or design the description such that small changes to specific details may change the answer. Clinical context should include, if relevant, patient age, family history, care setting, comorbidities and/or other medications.

III. Question, Answer & Rationale

List all questions for this particular patient and genetic diagnosis. You may enter multiple questions.

  • Question Category *: See the five question category definitions above.
  • Answer Format *: Ensure the Question Prompt ends with instructions matching the selected format and that the Answer Expected also follows it.
    Selection | Example Question Prompt | Example Answer Expected
    binary | Is there at least one ongoing clinical trial that targets this gene? Answer "yes" or "no". | yes
    multiple_choice | In which functional domain does this variant occur? Answer choices: CSRD, TBD, GRD, Sec14-PH, HLR, NLS, SBR. | GRD
    numeric_match | What proportion of this gene's coding sequence is encoded by the exon with this variant? Answer with a decimal to nearest tenth. | 0.4
    string_match | What targeted, small-molecule therapy is available for this genetic diagnosis? Answer with the generic drug name. | Pyridoxine
  • Answer Expected *: Provide the exact, unambiguous answer that would be considered correct.
  • Answer Explanation *: Provide a concise explanation for the correct answer. See example questions above for expectations.
  • Challenging for AI: See our question guidelines above for known AI failure modes and free tools to test the difficulty of your question.
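Before submitting, it may help to check that each Answer Expected is consistent with the selected Answer Format. The following is a minimal sketch under stated assumptions: the function name and the `choices` parameter are hypothetical, and the checks are a plausible reading of the four formats listed above rather than the organizers' actual grading rules.

```python
import math


def answer_matches_format(answer_format: str, expected: str, choices=None) -> bool:
    """Return True if the expected answer is plausible for the chosen format."""
    answer = expected.strip()
    if answer_format == "binary":
        return answer.lower() in {"yes", "no"}
    if answer_format == "multiple_choice":
        # The answer must be one of the choices listed in the prompt.
        return choices is not None and answer in choices
    if answer_format == "numeric_match":
        try:
            return math.isfinite(float(answer))
        except ValueError:
            return False
    if answer_format == "string_match":
        return bool(answer)
    raise ValueError(f"unknown answer_format: {answer_format!r}")


nf1_choices = ["CSRD", "TBD", "GRD", "Sec14-PH", "HLR", "NLS", "SBR"]
print(answer_matches_format("multiple_choice", "GRD", nf1_choices))
print(answer_matches_format("numeric_match", "0.4"))
```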

IV. Author Information

Please provide your name, email, and affiliation exactly as you would like them to appear in publications or acknowledgments.

V. Verification

  • My question(s) and answer(s) do not contain Protected Health Information for any real individuals without their written consent.
  • My question(s) and answer(s) are factually correct and would be agreed upon by experts in the field.
  • The question(s) have single, unambiguous correct answers.
  • I authorize having my name associated with this question and inclusion as a co-author on the resulting paper. (optional)
  • I have tested my question(s) with current AI systems and found them to be challenging. (optional)

Batch Question Submission

You can submit multiple questions, including groups of questions with minor differences, by emailing a properly formatted JSON file or Excel spreadsheet to aitxchallenge@gmail.com.

⚠ Your JSON/XLSX must include all the same fields as the form above for each question. We strongly encourage you to use the templates provided above!
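A quick structural check before emailing a batch file might look like the sketch below. It is an assumption-laden illustration: the key names mirror the example JSON records on this page, the official templates may use different keys and will also need the answer, explanation, and author fields from the form, and `find_problems` is a hypothetical helper rather than part of any provided tooling.

```python
import json

GENOTYPE_KEYS = {"gene", "transcript", "variant_cdna", "variant_protein", "zygosity"}
QUESTION_KEYS = {"category", "answer_format", "prompt", "date_submitted"}
CATEGORIES = {"Established_Targeted", "Established_Supportive", "Clinical_Trials",
              "Drug_Development_and_Repurposing", "Variant_Assessment"}
ANSWER_FORMATS = {"binary", "multiple_choice", "numeric_match", "string_match"}


def find_problems(record: dict) -> list:
    """Return human-readable problems found in one question record."""
    problems = []
    patient = record.get("patient", {})
    question = record.get("question", {})
    genotype = patient.get("genotype", [])
    if not genotype:
        problems.append("patient.genotype is missing or empty")
    for index, variant in enumerate(genotype):
        missing = GENOTYPE_KEYS - set(variant)
        if missing:
            problems.append(f"genotype[{index}] is missing {sorted(missing)}")
    if not patient.get("clinical_context"):
        problems.append("patient.clinical_context is missing or empty")
    missing = QUESTION_KEYS - set(question)
    if missing:
        problems.append(f"question is missing {sorted(missing)}")
    if "category" in question and question["category"] not in CATEGORIES:
        problems.append(f"unknown category {question['category']!r}")
    if "answer_format" in question and question["answer_format"] not in ANSWER_FORMATS:
        problems.append(f"unknown answer_format {question['answer_format']!r}")
    return problems


# Record adapted from Example 10 on this page.
EXAMPLE = json.loads("""
{
  "patient": {
    "genotype": [
      {"gene": "ANO10", "transcript": "NM_018075.5", "variant_cdna": "c.289del",
       "variant_protein": "p.(Met97Ter)", "zygosity": "homozygous"}
    ],
    "clinical_context": "progressive cerebellar ataxia and peripheral neuropathy"
  },
  "question": {
    "category": "Variant_Assessment",
    "answer_format": "numeric_match",
    "prompt": "What proportion of the total coding transcript for this gene is encoded by the exon in which this variant occurs? Answer with a decimal to nearest tenth.",
    "date_submitted": "2024-05-05"
  }
}
""")

print(find_problems(EXAMPLE))  # prints []
```

Running the check over every record in a batch file and fixing anything it reports reduces the chance of a submission being returned for missing fields.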


Submission questions?

Direct your questions to aitxchallenge@gmail.com or visit our Frequently Asked Questions page.


[Logos: SAIL, Hugging Face, Google, NEJM-AI]