Have questions? Email aitxchallenge@gmail.com!

:thinking: :speech_balloon: Frequently Asked Questions


Genetic Diagnoses

Are we interested in targeted treatments for cancer?

Tumor somatic sequencing results and precision oncology targeted treatments are out of scope. However, germline variants that indicate a predisposition to cancers, including rare cancers, are within scope. Results of non-cancer somatic variant testing (e.g., to detect mosaicism or guide treatment decisions for congenital vascular anomalies) that appear in the input patient.phenotype field and/or as part of the question stem are also in scope.

What about structural variants?

We are limiting question submissions to simple (single-nucleotide and insertion/deletion) variants that can be represented in HGVS cDNA and protein formats.
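
As a rough illustration of what "simple" means here, the following minimal sketch checks whether a coding-DNA (c.) description is a substitution, deletion, duplication, insertion, or deletion-insertion. It is not an official validator for the challenge: the function name `is_simple_cdna_variant` and the regular expression are assumptions made for this example, the pattern covers only a small subset of HGVS syntax, and it does not check protein (p.) descriptions.

```python
import re

# Hypothetical helper, not part of any challenge tooling.
# A position may be in the 5' UTR ("-14"), the 3' UTR ("*32"),
# or carry an intronic offset ("88+1").
_POS = r"[-*]?\d+(?:[+-]\d+)?"
_RANGE = rf"{_POS}(?:_{_POS})?"

_SIMPLE_CDNA = re.compile(
    rf"c\.(?:"
    rf"{_POS}[ACGT]>[ACGT]"        # substitution, e.g. c.76A>T
    rf"|{_RANGE}delins[ACGT]+"     # deletion-insertion, e.g. c.76_78delinsTG
    rf"|{_RANGE}del"               # deletion, e.g. c.76_78del
    rf"|{_RANGE}dup"               # duplication, e.g. c.76dup
    rf"|{_POS}_{_POS}ins[ACGT]+"   # insertion, e.g. c.76_77insT
    rf")"
)


def is_simple_cdna_variant(hgvs_c: str) -> bool:
    """Return True if the description looks like a simple SNV or indel.

    An optional reference-sequence prefix (everything before the first
    colon) is ignored; only the "c." description is inspected.
    """
    description = hgvs_c.split(":", 1)[-1].strip()
    return _SIMPLE_CDNA.fullmatch(description) is not None


if __name__ == "__main__":
    for example in ["c.76A>T", "c.76_77insT", "c.76_78inv"]:
        print(example, is_simple_cdna_variant(example))
```

Inversions, copy-number changes, and other structural descriptions are rejected by this pattern, which mirrors the restriction described above.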

What if there are multiple diagnostic variants?

Recessive disorders sometimes require compound heterozygous variants (i.e., two different variants inherited in trans, one from each parent) for the disease to manifest. It is also possible for a single patient to be affected by multiple monogenic conditions simultaneously. Nevertheless, only one diagnostic variant will be provided for each input question, and the question will pertain to that specific variant (even if other variants are playing a role in the patient’s condition).

Can we assume that the input genotype (variant) is confirmed diagnostic?

Yes.

Why are we focusing on just rare genetic diseases?

Our challenge focuses on rare genetic diseases because they represent a uniquely underserved area in therapeutic development. Fewer than 5% of known monogenic disorders have targeted treatments, largely due to limited commercial incentives for pharmaceutical companies to invest in therapies for small, heterogeneous patient populations. Unlike more common genetic conditions such as cancer, where recurrent hotspot mutations can be targeted across many patients, rare diseases often involve private or ultra-rare variants unique to individuals or families. This means that therapeutic strategies must account for variant-specific impacts.

Emerging classes of personalized treatments (e.g., antisense oligonucleotides [ASOs], gene replacement therapies, and RNA-modifying drugs) introduce complex molecular eligibility questions. For example, determining whether a splice-modifying ASO is viable requires evaluating the variant’s effect on RNA splicing, identifying targetable exons, and assessing tissue-specific expression of the gene. Similarly, gene therapy suitability may depend on gene size, loss-of-function mechanisms, and delivery vectors. Answering these questions requires integrating diverse, dynamic (up-to-date) data sources.

Currently, identifying therapeutic strategies for rare disease patients remains a highly manual process.

What else is happening in this space?

There are other complementary challenges relating to therapeutics and rare disease:

There are also existing agents in this space, which may be called via API and/or used to assess input question difficulty.


Question Scope

Will we be asked to recommend non-intervention medical tests?

We will not ask you to recommend surveillance steps (e.g., imaging studies) following a diagnosis. However, results of such studies provided in the input patient.phenotype field and/or as part of the question stem are considered in scope. For example, MRI evidence of gadolinium enhancement in a patient with X-linked adrenoleukodystrophy indicates eligibility for haematopoietic stem cell transplantation, whereas an MRI showing no contrast enhancement would not indicate treatment.

What if there are no established targeted therapies or clinical trials?

Established therapies and active clinical trials may not exist for every case, but when they are available, identifying them accurately is especially important. Phase 2 submissions that miss these key opportunities may score notably lower on helpfulness and comprehensiveness.

Are only established (i.e., not under development) drug repurposing methods valid?

For questions in the Drug Development and Repurposing category, retrieval of pertinent information for all scientific strategies (e.g., target/pathway reasoning, connectivity mapping, structural modeling) is welcome, provided the reasoning is clearly explained and evidence-based. Synthesis of insights across multiple data sources for specific candidates is highly encouraged, as is the analysis of candidate groups to identify shared mechanisms, patterns, or therapeutic opportunities.

Do we have to include personalized therapies in our Phase 2 Actionability Report?

Assessment of amenability to custom genetic therapies (e.g., ASOs, gene editing) is not required but may enhance overall scoring if included rigorously and appropriately.


Model Development Specifications

Can we use closed models?

Yes. However, we will evaluate and award prizes to submitted models across different tiers; use of large vs. small models as well as use of open-source vs. proprietary models will be distinguished during evaluation.

Will there be a dev dataset?

Yes. This dataset will be made available through our AI-Tx Challenge Hugging Face organization. You must first join the organization as a member to access it.

Which resource tier would a mixture-of-experts model fall into?

A mixture-of-experts (MoE) model can be pretrained with far less compute than a dense model of the same total size, but it still requires high VRAM to run since all experts are loaded into memory. Resource tiers are therefore determined by the total (not active) number of parameters. For example, Mixtral 8x7B would fall into “Tier 2: GPU”, as we would count it as having ~8 × 7B = 56B parameters.
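
The counting rule in this answer can be sketched as a small calculation. This is only an illustration of the arithmetic, not an official tier checker: the function name `nominal_total_params_billions` and the name-parsing pattern are assumptions made for this example, and the tier thresholds themselves are not reproduced here. Note that a checkpoint's published parameter count may differ from this nominal product, since experts typically share some layers.

```python
import re


def nominal_total_params_billions(model_name: str) -> float:
    """Estimate total parameters (in billions) from an 'NxMB' style name.

    Hypothetical helper: multiplies the number of experts by the
    per-expert size, as in the "8x7B = 56B" example above. All experts
    are counted, not only those active for a given token.
    """
    match = re.search(r"(\d+)x(\d+(?:\.\d+)?)B", model_name)
    if match is None:
        raise ValueError(f"no 'NxMB' pattern found in {model_name!r}")
    num_experts = int(match.group(1))
    billions_per_expert = float(match.group(2))
    return num_experts * billions_per_expert


if __name__ == "__main__":
    print(nominal_total_params_billions("Mixtral 8x7B"))  # 56.0
```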

How soon can I submit my model?

All deadlines are listed on our Participate Landing Page.

Still have questions? Email aitxchallenge@gmail.com!