AIRR-ML-25: Adaptive Immune Profiling Challenge
Overview
In this competition, you’ll develop machine learning models to simultaneously perform two tasks: (a) predict the immune state (e.g. disease, healthy) of individuals based on so-called adaptive immune repertoires (sets of protein sequences), and (b) identify immune state-associated receptor sequences (those that explain immune state in the first task). The goal is to expedite ML-based solutions for immunodiagnostics and therapeutics discovery.

Timeline
- November 05, 2025 - Start Date. (opens at 08:00 AM CET)
- November 19, 2025 - Entry Deadline. You must accept the competition rules before this date in order to compete.
- November 19, 2025 - Team Merger Deadline. This is the last day participants may join or merge teams.
- December 17, 2025 - Final Submission Deadline (closes at 07:59 AM CET).
All deadlines are at 11:59 PM CET on the corresponding day unless otherwise noted. The competition organizers reserve the right to update the contest timeline if they deem it necessary.
How to Participate
The competition is open to everyone, and will be hosted on the popular Kaggle platform. All you need to do is create a Kaggle account, accept the competition rules, and start coding! The competition will be live on November 05 at the following URL: https://www.kaggle.com/competitions/adaptive-immune-profiling-challenge-2025.
Prizes
Monetary rewards
- 1st Place - $5,000
- 2nd Place - $3,000
- 3rd Place - $2,000
Eligibility
To be eligible for the prize money, participants must make their code open source.
Sponsorship
Competition prizes are kindly sponsored by The Research Council of Norway.
Scientific manuscript authorship
The 10 top-performing participants on the final leaderboard will be invited to contribute their model descriptions, related discussions, and code to a scientific paper summarizing the competition's scientific outcome. Nature Methods has "accepted in principle" to publish this work.
Organizers
Many awesome people have contributed to making this community challenge happen, including:
Chakravarthi Kanduri1,2, Thomas Konstantinovsky3,†, Puneet Rawat4,†, Milena Pavlovic1,2, Damon H. May5, Rebecca Elyanow5, Bryan Howie5, Harlan S. Robins5, Crina Curca6, Bryan Hariadi6, Ashwath Kumar6, Jose Jacob6, Efthymia Papalexi6, Charles Roco6, Alex Rosenberg6, AIRR-Community Machine Learning Working Group, Justin Barton7, Günter Klambauer8, Encarnita Mariotti-Ferrandiz9, Pieter Meysman10, Eline T. Luning Prak11, Lindsay G. Cowell12, Todd M. Brusko13,14,15, Gur Yaari3,16,‡, Victor Greiff4,17,‡, Geir Kjetil Sandve1,2,‡
1 Scientific Computing and Machine Learning section, Department of Informatics, University of Oslo, Norway
2 UiORealArt Convergence Environment, University of Oslo, Norway
3 Faculty of Engineering and Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar-Ilan University, Israel
4 Department of Immunology, University of Oslo, Oslo, Norway
5 Adaptive Biotechnologies, Seattle, WA, USA
6 Parse Biosciences, Seattle, WA, USA
7 Institute of Structural and Molecular Biology, University of London, United Kingdom
8 Institute for Machine Learning, Johannes Kepler University Linz, Austria
9 Sorbonne Université, INSERM, UMRS959, Immunology-Immunopathology-Immunotherapy (i3) lab, Paris, France
10 Adrem Data Lab, Department of Computer Science, University of Antwerp, Belgium
11 Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
12 Department of Health Data Science and Biostatistics, Peter O'Donnell Jr. School of Public Health; Department of Immunology, School of Biomedical Sciences; UT Southwestern Medical Center, Dallas, TX, USA
13 Department of Pathology, Immunology, and Laboratory Medicine, Diabetes Institute, College of Medicine, University of Florida, Gainesville, FL, USA
14 Department of Pediatrics, College of Medicine, University of Florida, Gainesville, FL, USA
15 Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
16 Department of Pathology, Yale School of Medicine, New Haven, CT, USA
17 Imprint Labs, LLC, New York, NY, USA
† Equal contribution
‡ Equal contribution
Correspondence: geirksa@ifi.uio.no, victor.greiff@medisin.uio.no, gur.yaari@yale.edu
Note: The contributor list shown above does not reflect the final author list or authorship order of the scientific manuscript summarizing the competition's scientific outcome. As described above, the 10 top-performing participants on the final leaderboard will be invited to contribute to this manuscript and become co-authors.
Thanks to the AIRR-community for the shared vision and collective perspective in organizing this challenge.

Further details
Description of problem
Imagine your body's immune system as a vast, personal army, constantly on guard against invaders like viruses and bacteria. Each soldier in this army is an "immune receptor," a tiny protein designed to recognize and fight off threats. What is truly incredible is the sheer variety of these soldiers: you have billions of unique immune receptors, each one a potential weapon against a new disease!
When a new enemy (what researchers call an "antigen," like a specific virus variant) attacks, only a tiny handful of these billions of immune receptors are the perfect match to bind to it and neutralize the threat. It is like finding a needle in a haystack, but your body does it all the time.
Now, here is the exciting challenge: What if we could peek into this personal army of immune receptors from many different people? We will have collections of their unique immune receptors (called "repertoires"), and we will also know if those individuals have a certain immune state (e.g. diseased or healthy).
The big questions for this competition:
- Can we predict a person's disease just by looking at their immune receptor "fingerprint"? Without knowing which receptors fight which diseases, can your machine learning models learn to identify patterns in these immune receptor collections that tell us if someone is sick or healthy?
- Can we identify the "contributing" immune receptors? If our models can predict disease, can they also tell us which specific immune receptors are most strongly linked to a particular disease? This would be like finding the star soldiers in the immune army!
Solving these problems is a huge step forward for medicine. It could lead to new ways to diagnose diseases earlier and even develop targeted treatments based on our own immune system's unique capabilities.
Evaluation
The competition includes a total of eight training datasets and ten test datasets. For each repertoire_id across all test datasets, participants must return the probability that the repertoire is label-positive. In addition, for each training dataset, participants must return a ranked list of the top 50,000 unique rows (including junction_aa, v_call, and j_call) that contribute most to the optimal classification, regardless of the data encoding used. These label-associated sequences must be sorted by some form of importance score, from most to least important; we may use only the top-n sequences from the ordered list of 50,000 for evaluation. The probabilities and the ranked lists will be used to compute the area under the ROC curve (AUC) and the Jaccard similarity, respectively, for each dataset. A weighted average of both measures across all included datasets will be the basis for ranking on the competition leaderboard.
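To make the two metrics concrete, here is a minimal Python sketch of how they could be computed. It is not the official scoring code. The function names (`roc_auc`, `jaccard`, `top_ranked_rows`, `combined_score`) and the `auc_weight` parameter are hypothetical, the equal weighting is only an assumption because the actual weights are not stated here, and the Jaccard similarity is assumed to compare the submitted rows against a reference set of label-associated sequences held by the organizers.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

Row = Tuple[str, str, str]  # (junction_aa, v_call, j_call)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of
    repertoires, which is fine for a sketch but slow for large inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def jaccard(predicted: Iterable[Hashable], reference: Iterable[Hashable]) -> float:
    """Size of the intersection divided by the size of the union."""
    a, b = set(predicted), set(reference)
    if not a and not b:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


def top_ranked_rows(scored_rows: Iterable[Tuple[Row, float]], k: int = 50_000) -> List[Row]:
    """Deduplicate rows, keep each row's highest score, return the k best.

    The result is ordered from most to least important.
    """
    best: Dict[Row, float] = {}
    for row, score in scored_rows:
        if row not in best or score > best[row]:
            best[row] = score
    ranked = sorted(best, key=lambda row: best[row], reverse=True)
    return ranked[:k]


def combined_score(per_dataset: Sequence[Tuple[float, float]], auc_weight: float = 0.5) -> float:
    """Average a weighted mix of (AUC, Jaccard) pairs over datasets.

    ASSUMPTION: the real weights are not given in this document;
    auc_weight=0.5 is a placeholder.
    """
    mixed = [auc_weight * auc + (1.0 - auc_weight) * jac for auc, jac in per_dataset]
    return sum(mixed) / len(mixed)
```

For example, `roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])` ranks three of the four positive-negative pairs correctly and so returns 0.75. Because `top_ranked_rows` deduplicates on the full (junction_aa, v_call, j_call) triple, the same junction_aa with a different V or J gene counts as a separate row.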

Additional resources
Link to come: A pre-registered protocol describing all the details of the competition including extensive background information, dataset descriptions, evaluation process, and pilot data providing reference benchmarks
What's the state-of-the-art in mining Adaptive Immune Repertoires?
Examples of state-of-the-art methods
- Modern Hopfield Networks and Attention for Immune Repertoire Classification
- Disease diagnostics using machine learning of B cell and T cell receptor sequences
- A platform for ML on adaptive immune repertoires with a wide collection of encodings and ML methods
📧 Stay Updated
Don't miss any important updates about the challenge! Subscribe to our newsletter for:
- Competition announcements and timeline updates
- Technical insights and tips from organizers
- Community highlights and participant spotlights
- Results and findings from the challenge
Acknowledgements
Adaptive Biotechnologies has generously provided ~500 unpublished TCRβ repertoires from a cohort of donors with known HSV-2 infection status.

Parse Biosciences has generously provided unpublished experimental antigen-specific TCR sequences for use in synthetic datasets. TCR Sequencing of 1 Million Antigen-Reactive Human T Cells in a Single Experiment, https://www.parsebiosciences.com/datasets/tcr-sequencing-of-1-million-antigen-reactive-human-t-cells-in-a-single-experiment/; Parse Biosciences, Seattle, USA, Accessed 13 March 2025.

Citation
AIRR-ML-2025 Organizers. AIRR-ML-2025: Adaptive Immune Profiling Challenge. https://www.kaggle.com/competitions/adaptive-immune-profiling-challenge-2025, 2025. Kaggle.