Phase 1: AI-Tx Knowledge

Deadline: February 11, 2026

Overview

AI-Tx Knowledge is a question-answering (QA) task that tests a model’s ability to answer binary, categorical, and multiple-choice questions about a patient’s genetic test result. Each question should be directly answerable using information in public databases and the scientific literature. The task focuses on questions covering five major categories: (1) Established, Targeted Therapies; (2) Established, Supportive Therapies; (3) eligibility for Clinical Trials; (4) Drug Development and Repurposing; and (5) Variant Assessment, with an emphasis on information relevant to amenability for genetic therapies.

Our explicit goal is to design the challenge so that successful models are immediately useful as information-processing engines in downstream tasks, including genetic test report generation and drug development.

Input Format

During evaluation, systems will be provided with an input JSON object containing entries for hypothetical patients, including one or more genetic variants, a brief clinical description, and a question relevant to the individual/variant’s candidacy for precision therapeutics. You should expect each input task to be formatted as follows:

Field	Type	Description
`id`	string	Unique identifier for the task instance (used to link input and output)
`patient.genotype`*	array of objects	List of one or more variant objects, where each object includes: `gene` (string) - HGNC gene symbol; `transcript` (string) - RefSeq or Ensembl transcript identifier; `variant_hgvs` (string) - cDNA HGVS notation; `protein_hgvs` (string) - protein HGVS notation; `zygosity` (string) - e.g., “heterozygous”, “homozygous”, or “hemizygous”
`patient.phenotype`	string	Brief free-text clinical description
`question.category`	string	High-level question category, one of the following options: `Established, Targeted Therapies`; `Established, Supportive Therapies`; `Clinical Trials`; `Drug Development and Repurposing`; `Variant Assessment`
`question.prompt`	string	Direct, short-answer question about precision therapy relevance

*Users can assume that `patient.genotype` corresponds to a true diagnosis.

For example,

{
  "id": "AITX-00123",
  "patient": {
    "genotype": [
      {
        "gene": "CFTR",
        "transcript": "NM_000492.4",
        "variant_hgvs": "c.1521_1523del",
        "protein_hgvs": "p.Phe508del",
        "zygosity": "homozygous"
      }
    ],
    "phenotype": "A 15-year-old male with a diagnosis of cystic fibrosis presents with chronic sinopulmonary disease characterized by persistent respiratory symptoms and recurrent infections. He also has exocrine pancreatic insufficiency requiring enzyme replacement therapy and exhibits features of male infertility consistent with congenital bilateral absence of the vas deferens (CBAVD)."
  },
  "question": {
    "category": "Established, Targeted Therapies",
    "prompt": "Is the c.1521_1523del (F508del) variant eligible for treatment with Trikafta (elexacaftor/tezacaftor/ivacaftor)? Respond with “Yes” or “No”."
  }
}
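
As a minimal sketch of how a system might check an incoming task against the field table above, the Python below validates one input object. The field names and category strings come from the table; the function name `validate_task`, the wording of the error messages, and the abbreviated sample task are assumptions made for illustration, not part of the challenge tooling.

```python
import json

# Field names and categories are copied from the input table above.
VARIANT_FIELDS = ("gene", "transcript", "variant_hgvs", "protein_hgvs", "zygosity")
CATEGORIES = {
    "Established, Targeted Therapies",
    "Established, Supportive Therapies",
    "Clinical Trials",
    "Drug Development and Repurposing",
    "Variant Assessment",
}


def validate_task(task: dict) -> list[str]:
    """Return a list of problems found in one Phase 1 task (empty if none).

    Hypothetical helper: the challenge does not prescribe a validator.
    """
    problems = []
    if not isinstance(task.get("id"), str):
        problems.append("id must be a string")

    patient = task.get("patient")
    patient = patient if isinstance(patient, dict) else {}
    genotype = patient.get("genotype")
    if not isinstance(genotype, list) or not genotype:
        problems.append("patient.genotype must be a non-empty list")
    else:
        for i, variant in enumerate(genotype):
            for field in VARIANT_FIELDS:
                if not isinstance(variant, dict) or not isinstance(variant.get(field), str):
                    problems.append(f"patient.genotype[{i}].{field} must be a string")
    if not isinstance(patient.get("phenotype"), str):
        problems.append("patient.phenotype must be a string")

    question = task.get("question")
    question = question if isinstance(question, dict) else {}
    if question.get("category") not in CATEGORIES:
        problems.append("question.category is not one of the listed options")
    if not isinstance(question.get("prompt"), str):
        problems.append("question.prompt must be a string")
    return problems


# Abbreviated copy of the example task above (phenotype and prompt shortened).
raw = """
{
  "id": "AITX-00123",
  "patient": {
    "genotype": [
      {"gene": "CFTR", "transcript": "NM_000492.4",
       "variant_hgvs": "c.1521_1523del", "protein_hgvs": "p.Phe508del",
       "zygosity": "homozygous"}
    ],
    "phenotype": "A 15-year-old male with a diagnosis of cystic fibrosis."
  },
  "question": {
    "category": "Established, Targeted Therapies",
    "prompt": "Is the c.1521_1523del (F508del) variant eligible for treatment with Trikafta?"
  }
}
"""
task = json.loads(raw)
print(validate_task(task))
```

Returning a list of problems rather than raising on the first one makes it easier to log every malformed field of a task in a single pass.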

Questions are being crowdsourced from a community of experts in clinical genetics and drug development. We would love your contributions!

Output Format & Assessment

Models are expected to output a JSON object containing the response to the question, a supporting source, and a justification. Outputs will be scored automatically by comparing the `response` field to a reference answer, using exact matching or format-constrained numeric matching depending on the question. The `source` and `justification` fields are required to ensure transparency and factual grounding, and will be reviewed as needed by automated or human judges. Example output specifications are provided below.

Field	Type	Description
`id`	string	Must match the corresponding input task `id`
`response`	string	A concise answer (e.g., “Yes”, “GOF”, a number, drug name)
`source`	string	A URL or reference supporting the answer (e.g., ClinVar, PubMed, OMIM)
`time_accessed`	int	Unix epoch time in seconds (UTC) at which the `source` URL was accessed; e.g., `int(time.time())` in Python (`time.time()` itself returns a float)
`justification`	string	A brief natural language explanation linking the source to the response
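
A record with these fields could be assembled along the following lines. This is a sketch only: the helper name `build_answer` and its optional `accessed_at` parameter are assumptions, and the argument values are placeholders rather than a real answer.

```python
import json
import time
from typing import Optional


def build_answer(task_id: str, response: str, source: str,
                 justification: str, accessed_at: Optional[float] = None) -> dict:
    """Assemble one Phase 1 output record (hypothetical helper).

    accessed_at should be the moment the source was actually retrieved;
    if omitted, the current time is used as a stand-in.
    """
    if accessed_at is None:
        accessed_at = time.time()
    return {
        "id": task_id,
        "response": response,
        "source": source,
        # time.time() returns a float; the table specifies an int, so truncate.
        "time_accessed": int(accessed_at),
        "justification": justification,
    }


record = build_answer(
    task_id="AITX-00123",
    response="Yes",
    source="https://example.org/placeholder-source",  # placeholder URL
    justification="Placeholder explanation linking the source to the response.",
)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

Recording the timestamp when the source is fetched, and passing it in as `accessed_at`, keeps `time_accessed` accurate even if the record is built later in the pipeline.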

For example,

{
  "id": "AITX-00123",
  "response": "Yes",
  "source": "https://www.nature.com/articles/s41434-022-00347-0",
  "time_accessed": 1749151852,
  "justification": "Trikafta (elexacaftor/tezacaftor/ivacaftor) was approved by the FDA in 2019 for patients aged 12 and older with at least one copy of the p.Phe508del mutation. As this patient has two copies of the mutation and meets the age requirement, he is eligible for Trikafta therapy."
}
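
The automatic comparison of `response` against a reference answer might resemble the sketch below. The official scorer is not specified here, so every detail is an assumption: the function name `is_match`, the case and whitespace normalization applied before the exact match, and the use of a relative tolerance for numeric answers.

```python
import math


def is_match(response: str, reference: str,
             numeric: bool = False, rel_tol: float = 1e-9) -> bool:
    """Compare a model response with a reference answer (illustrative only)."""
    if numeric:
        # Format-constrained numeric match: both sides must parse as numbers.
        try:
            return math.isclose(float(response), float(reference), rel_tol=rel_tol)
        except ValueError:
            return False
    # Exact match after trimming whitespace and ignoring case (an assumption;
    # the real scorer may be stricter).
    return response.strip().casefold() == reference.strip().casefold()


print(is_match("Yes", "yes"))
print(is_match("0.50", "0.5", numeric=True))
print(is_match("about half", "0.5", numeric=True))
```

Because the real normalization rules are unknown, the safest strategy for participants is to return the answer in exactly the form the prompt requests (for example, “Yes” or “No”).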

Phase 2: AI-Tx Actionability Report

Invitation: March 4-11, 2026
Deadline: TBD (by invite only)

Top performers in Phase 1 (AI-Tx Knowledge) will be invited to submit to this invite-only challenge.

Overview

The AI-Tx Actionability Report task evaluates a system’s ability to synthesize therapeutic knowledge into a comprehensive, clinically meaningful report for a hypothetical patient. Unlike Phase 1, which focuses on short-answer queries, this task is open-ended and requires narrative generation. Participants will generate a comprehensive therapeutic actionability report addressing established therapies, investigational options, and drug repurposing candidates. The inclusion of assessments for emerging gene-targeted therapies (e.g., antisense oligonucleotides or gene editing) is optional but encouraged.

Input Format

See the Phase 1 Input Format section. The format is identical except that the `question` object (i.e., `question.category` and `question.prompt`) is absent.

Output Format

Systems must return a structured, Markdown-formatted actionability report as a string within a JSON object, with the fields detailed below.

Field	Type	Description
`id`	string	Must match the corresponding input task `id`
`report_markdown`	string	A Markdown-formatted therapeutic actionability report for the patient

The required Markdown formatting ensures that the report will be clear, logically structured, and professional. Use of subheadings, bullet points, hyperlinks, and references beyond the required shell is encouraged.

For example,

{
  "id": "AITX-30001",
  "report_markdown": "## Therapeutic Actionability Report for Patient AITX-30001\n\n### Summary..."
}
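
One way to produce such an object is to build the Markdown first and let a JSON library handle escaping of newlines and quotes, as in this sketch. The title and `Summary` heading follow the example above; the remaining section titles, the helper name `build_report`, and the placeholder bodies are assumptions and do not represent the required shell.

```python
import json


def build_report(task_id: str, sections: dict[str, str]) -> str:
    """Serialize a Markdown report into the Phase 2 output JSON (hypothetical helper)."""
    lines = [f"## Therapeutic Actionability Report for Patient {task_id}", ""]
    for title, body in sections.items():
        lines += [f"### {title}", "", body.strip(), ""]
    report = "\n".join(lines).rstrip() + "\n"
    # json.dumps escapes newlines and quotes, so the report stays a single string.
    return json.dumps({"id": task_id, "report_markdown": report}, ensure_ascii=False)


payload = build_report(
    "AITX-30001",
    {
        "Summary": "(to be written)",
        # Section titles below are illustrative, not the official shell.
        "Established Therapies": "(to be written)",
        "Investigational Options": "(to be written)",
        "Drug Repurposing Candidates": "(to be written)",
        "References": "(to be written)",
    },
)
print(payload)
```

Writing the report as ordinary multi-line text and serializing at the end avoids hand-escaping `\n` sequences, which is a common source of invalid JSON.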

Scoring

Submissions will be evaluated by a panel of experts in clinical genetics, translational research, and drug development. Reports will be scored based on the following dimensions:

Criterion	Description
Factuality	Scientific and clinical accuracy of all therapeutic claims and evidence
Comprehensiveness	Coverage of relevant therapeutic domains: approved drugs, trials, repurposing, (optionally) gene therapy
Readability	Clarity, structure, and appropriateness of tone for clinical/scientific use
Citations	Use of reputable sources (e.g., PubMed, ClinVar, ClinicalTrials.gov) with proper referencing
Helpfulness	Practical value for a clinical or translational team making therapy decisions

Tutorials: links to databases and video tutorials on utilizing compute resources.
Submit Questions: example input questions which are being crowdsourced from a community of experts in clinical genetics and drug development.
FAQ: responses to questions regarding challenge scope and requirements.