Have questions? Email aitxchallenge@gmail.com!

Frequently Asked Questions


Genetic Diagnoses

Are we interested in targeted treatments for cancer?

Tumor somatic sequencing results and precision oncology targeted treatments are out of scope. However, germline variants that indicate a predisposition to cancers, including rare cancers, are within scope. Other somatic variant testing (e.g., to detect mosaicism or guide treatment decisions for congenital vascular anomalies) that is mentioned in the patient.clinical_context input field and/or in the question stem is also in scope.

What about structural variants?

We are limiting question submissions to simple (single-nucleotide and insertion/deletion) variants that can be represented in HGVS cDNA and protein formats.
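As a rough illustration of what "simple" means here, the sketch below checks whether a string resembles an HGVS cDNA substitution, deletion, duplication, insertion, or deletion-insertion. It is an assumption-laden convenience check, not the challenge's validator: the function name `looks_like_simple_cdna` is hypothetical, the patterns cover only a small part of the HGVS grammar, and protein-level descriptions are not checked at all.

```python
import re

# Loose, illustrative patterns only; they do NOT implement the full HGVS grammar.
# A position may be coding (76), 5' UTR (-14), 3' UTR (*32), or intronic (88+1).
_POS = r"[-*]?\d+(?:[+-]\d+)?"

_CDNA_SIMPLE = re.compile(
    rf"^c\.(?:"
    rf"{_POS}[ACGT]>[ACGT]"                   # substitution, e.g. c.76A>T
    rf"|{_POS}(?:_{_POS})?(?:del|dup)"        # deletion or duplication
    rf"|{_POS}_{_POS}ins[ACGT]+"              # insertion between two positions
    rf"|{_POS}(?:_{_POS})?delins[ACGT]+"      # deletion-insertion
    rf")$"
)


def looks_like_simple_cdna(hgvs_c: str) -> bool:
    """Return True if the string resembles a simple HGVS c. SNV or indel.

    Hypothetical helper: a syntactic sanity check, not a validator.
    """
    # Drop an optional reference-sequence prefix (anything before the colon).
    description = hgvs_c.split(":", 1)[-1]
    return _CDNA_SIMPLE.match(description) is not None
```

Under these assumptions, `looks_like_simple_cdna("c.76_78del")` is accepted, while an inversion such as `"c.100_200inv"` or a genomic description such as `"g.123A>T"` is rejected.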

Why would there be multiple diagnostic variants?

Our input patient.genotype field is a list of variants, allowing for multiple diagnostic variants.

In recessive disorders, compound heterozygous variants (i.e., two different variants in the same gene inherited in trans, one from each parent) might be necessary for disease manifestation. It is also possible that a patient is affected by multiple monogenic conditions simultaneously.
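To make the two cases above concrete, here is a minimal sketch of an input with more than one diagnostic variant. Only the field names `genotype` and `clinical_context` come from this FAQ; the per-variant keys (`gene`, `hgvs_c`, `hgvs_p`), the gene names, the variants, and the helper `group_variants_by_gene` are all hypothetical placeholders, and the real input schema may differ.

```python
from collections import defaultdict

# Hypothetical patient input. Only "genotype" and "clinical_context" are named
# in the FAQ; every other key and value below is a placeholder assumption.
patient = {
    "genotype": [
        # Two different variants in the same gene: consistent with
        # compound heterozygosity in a recessive disorder.
        {"gene": "GENE_A", "hgvs_c": "c.100A>G", "hgvs_p": "p.(Lys34Glu)"},
        {"gene": "GENE_A", "hgvs_c": "c.250_251del", "hgvs_p": "p.(Leu84fs)"},
        # A variant in a second gene: a possible second monogenic condition.
        {"gene": "GENE_B", "hgvs_c": "c.76C>T", "hgvs_p": "p.(Arg26Ter)"},
    ],
    "clinical_context": "developmental delay, seizures, age 4",
}


def group_variants_by_gene(genotype):
    """Group cDNA descriptions by gene so multi-variant genes stand out."""
    grouped = defaultdict(list)
    for variant in genotype:
        grouped[variant["gene"]].append(variant["hgvs_c"])
    return dict(grouped)


grouped = group_variants_by_gene(patient["genotype"])
```

Grouping by gene is just one convenient way to see at a glance which genes carry two variants and which carry one; phase (cis vs. trans) cannot be inferred from such a list alone.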

Can we assume that the input genotype (list of variants) is confirmed diagnostic?

Yes.

Why are we focusing on just rare genetic diseases?
  1. Rare genetic diseases represent a uniquely underserved area in therapeutic development. Fewer than 5% of known monogenic disorders have targeted treatments. Unlike relatively common genetic disorders (e.g., cancers with recurrent “hotspot” mutations across patients), rare diseases often involve private or ultra-rare variants unique to individuals or families. This means that therapeutic strategies must account for variant-specific impacts.

  2. Some classes of personalized treatments (e.g., antisense oligonucleotides [ASOs], gene therapies) may be especially well suited to treating rare genetic diseases. However, determining the feasibility of such treatments requires answering complex molecular eligibility questions based on diverse, dynamic (up-to-date) data sources.


Question Scope

Will we be asked to recommend non-intervention medical tests?

We will not ask you to recommend surveillance steps (e.g., imaging studies) following a diagnosis. However, results of such studies that are provided in the input patient.clinical_context field and/or in the question stem are in scope. For example, MRI evidence of gadolinium enhancement in a patient with X-linked adrenoleukodystrophy indicates eligibility for haematopoietic stem cell transplantation, whereas an MRI showing no contrast enhancement would not.

What if there are no established targeted therapies or clinical trials?

Established therapies and active clinical trials may not exist for every case, but when they are available, identifying them accurately is especially important.

Are only established (i.e., not under development) drug repurposing methods valid?

For questions in the Drug Development and Repurposing category, retrieval of pertinent information using any scientific strategy (e.g., target/pathway reasoning, connectivity mapping, structural modeling) is welcome, provided the reasoning is clearly explained and evidence-based. Synthesis of insights across multiple data sources for specific candidates is highly encouraged, as is the analysis of candidate groups to identify shared mechanisms, patterns, or therapeutic opportunities.

What will be included in the patient.clinical_context field?

You can expect a brief description of the patient’s symptoms/condition and, if relevant, the patient’s age, family history, care setting, comorbidities, and/or other medications as they pertain to the question. This description is free text and may just be a comma-separated list rather than complete sentences.


Model Development Specifications

Can we use closed models?

Yes. However, we will evaluate and award prizes to submitted models across different tiers; use of large vs. small models as well as use of open-source vs. proprietary models will be distinguished during evaluation.

Will there be a dev dataset?

Yes. This dataset will be made available through our AI-Tx Challenge Hugging Face organization; you must first join the organization as a member to access it.

Which resource tier would a mixture-of-experts model fall into?

A mixture-of-experts (MoE) model can be run with far less compute than a dense model of the same total size, because only a subset of experts is active for each token. It still requires high VRAM, however, since all experts are loaded into memory. Resource tiers are therefore determined by the total (not active) number of parameters. For example, Mixtral 8x7B would fall into the “Tier 2: Large” tier, as we would count it as having ~8 × 7B = 56B parameters.
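The counting rule in the Mixtral example can be written out as a one-line calculation. This is only a sketch of the arithmetic stated above: the function name `counted_parameters_b` is hypothetical, and the tier boundaries themselves are not given in this FAQ, so the sketch stops at the counted total rather than assigning a tier.

```python
def counted_parameters_b(num_experts: int, params_per_expert_b: float) -> float:
    """Counted size in billions of parameters: experts times nominal expert size.

    Mirrors the FAQ's Mixtral example (total, not active, parameters).
    Hypothetical helper; tier thresholds are intentionally not modeled.
    """
    return num_experts * params_per_expert_b


# Mixtral 8x7B as counted in the FAQ: 8 experts x 7B each.
mixtral_counted_b = counted_parameters_b(8, 7)
print(f"Counted as ~{mixtral_counted_b:g}B parameters")
```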

How soon can I submit my model?

All deadlines are listed on our Participate Landing Page.

What else is happening in this space?

There are other complementary challenges relating to therapeutics and rare disease:


