Submit Your Model

Participants in the AI-Tx Challenge will submit their models via Hugging Face Spaces, hosted within our official organization: aitxchallenge.


Step 1: Join the Hugging Face Organization

All participants must join our Hugging Face organization as members:

➡️ https://huggingface.co/aitxchallenge

This ensures access to development datasets and enables proper organization of submissions.


Step 2: Review Available Data

  • A small development dataset will be made available publicly within the organization for all members.
  • The test dataset will be hosted privately within Hugging Face and used for final evaluation. It will not be accessible to participants.

We will programmatically call submitted models on this hidden test dataset during final scoring.


Step 3: Understand Submission Tiers

When submitting a model, you must choose one of three resource tiers (defined by the total, rather than active, number of model parameters). These tiers help us ensure a fair evaluation across models of different sizes and computational complexity. Each tier corresponds to a collection within our Hugging Face organization and has its own evaluation track.

Tier | Description | Intended Use
Tier 1: Small | Models that can run efficiently in low-resource environments, typically with ≤8 billion parameters.* All model weights must be open-source. | Ideal for small LLMs like Qwen3-8B.
Tier 2: Large | Larger models with ≤70 billion parameters.* All model weights must be open-source. | Good for models like LLaMA-3.3-70B or Mixtral 8x7B.
Tier 3: Unrestricted | Any model, including those that have >70 billion parameters,* are closed-source, or rely on private/commercial APIs. | Includes OpenAI’s GPT-4, Anthropic’s Claude, Google’s Gemini, etc.

*Resource tiers are specified by the total number of parameters, as opposed to the active number of parameters.
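
As a rough illustration of the tier rules above, the sketch below picks the lowest tier a model could enter from its total parameter count and whether its weights are open. The function name and constants are hypothetical, not part of any official tooling. It also treats 8 billion and 70 billion as strict cutoffs, which is an assumption: the tier descriptions say "typically", so a model whose exact count sits slightly above its nominal size may be handled differently by the organizers.

```python
# Assumed strict cutoffs, taken from the tier descriptions (total parameters).
SMALL_LIMIT = 8_000_000_000    # Tier 1 ceiling: 8 billion total parameters
LARGE_LIMIT = 70_000_000_000   # Tier 2 ceiling: 70 billion total parameters


def lowest_eligible_tier(total_params: int, open_weights: bool) -> str:
    """Return the smallest tier a model qualifies for (hypothetical helper)."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    # Closed-source or API-only models can only enter the unrestricted tier.
    if not open_weights:
        return "Tier 3: Unrestricted"
    if total_params <= SMALL_LIMIT:
        return "Tier 1: Small"
    if total_params <= LARGE_LIMIT:
        return "Tier 2: Large"
    return "Tier 3: Unrestricted"


# Hypothetical mixture-of-experts model: 30B total parameters, 3B active.
# The total count decides the tier, so the 3B active figure is ignored.
print(lowest_eligible_tier(30_000_000_000, open_weights=True))
```

With these made-up numbers the model lands in Tier 2: Large, even though its active parameter count alone would fit under the Tier 1 limit.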


Step 4: Submit Your Model

Submissions must be in the form of a Hugging Face Space:

  • The Space may invoke your own models, including private ones (examples listed in Tier 3), as long as the response adheres to the expected format.
  • For Tier 1: Small and Tier 2: Large, you must also submit your training data and your code. You can, and should, keep your own licensing terms on both.
    • Exceptions to the data-sharing requirement can be made with reasonable justification (e.g., patient consent, third-party licensing). At minimum, you must include a dummy dataset with your codebase.
  • Once submitted, your model will be tested automatically on a few example questions.
  • You will be notified whether your model passes basic application checks (e.g., valid response structure, accessibility).
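
Before submitting, it may help to run a similar check locally. The sketch below is one possible self-test, not the organizers' actual checker: it calls a prediction function on a few example questions and reports structural problems. It assumes responses are JSON-serializable dictionaries, and the key name "answer" is only a placeholder; substitute whatever fields the official response format requires.

```python
import json
from typing import Any, Callable, Iterable


def check_response(response: Any, required_keys: Iterable[str]) -> list[str]:
    """Return a list of problems found in one response (empty means OK)."""
    if not isinstance(response, dict):
        return [f"expected a dict, got {type(response).__name__}"]
    problems = [f"missing key: {key}" for key in required_keys if key not in response]
    try:
        json.dumps(response)  # the response must survive serialization
    except (TypeError, ValueError) as exc:
        problems.append(f"not JSON-serializable: {exc}")
    return problems


def smoke_test(
    predict: Callable[[str], Any],
    questions: Iterable[str],
    required_keys: Iterable[str],
) -> dict[str, list[str]]:
    """Run predict on each question and collect problems per question."""
    keys = tuple(required_keys)  # materialize once so it can be reused
    report = {}
    for question in questions:
        try:
            response = predict(question)
        except Exception as exc:  # a crash counts as a failed check
            report[question] = [f"predict raised {type(exc).__name__}: {exc}"]
            continue
        report[question] = check_response(response, keys)
    return report


def dummy_predict(question: str) -> dict:
    """Stand-in for your model; replace with the function your Space exposes."""
    return {"answer": "placeholder"}


# "answer" is a placeholder key, not the challenge's real schema.
report = smoke_test(dummy_predict, ["Example question 1"], required_keys=["answer"])
print(report)
```

An empty list for every question means the responses passed these structural checks; it says nothing about answer quality or about checks the organizers run that are not modeled here.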

Evaluation Process

We will evaluate models by calling each Hugging Face Space on the hidden test dataset. Models must return predictions in the required format to be eligible for scoring.