
- The Cancer Research Institute (CRI) launches a first-of-its-kind AI-ready immunotherapy database designed to accelerate research and treatment development.
- The collaborative initiative aims to overcome long-standing problems in cancer research by standardizing and sharing data globally.
- The first phase of the database will focus on melanoma and colorectal cancer, including data from failed treatments as well as successful ones to help uncover why therapies work or fail.
Researchers have launched a new open-access database, designed as a living resource, to help scientists better understand how the immune system responds to cancer treatments over time, a longstanding challenge in immunotherapy research.
The CRI, in collaboration with Stanford University School of Medicine, the University of Pennsylvania Perelman School of Medicine, Memorial Sloan Kettering Cancer Center, and biotechnology company 10x Genomics, has unveiled the CRI Discovery Engine, a centralized, AI-ready research platform for cancer immunotherapy.
The Discovery Engine is led by three principal investigators: Andrea Schietinger, PhD, and Ansuman Satpathy, MD, PhD, both CRI STARs investigators, and E. John Wherry, PhD, associate director of CRI’s Scientific Advisory Council.
The initiative aims to address two major barriers in academia that slow progress in oncology research: limited data sharing and poor reproducibility of experimental results.
The Reproducibility Project: Cancer Biology, an 8-year effort to replicate findings from cancer biology papers published between 2010 and 2012, found that fewer than half of these findings could be reliably reproduced.
Although researchers generate large volumes of oncology data each year, only a small fraction is publicly available, and even less is accessible in formats that allow other scientists to reuse it effectively.
Research suggests that only 16% of oncology data is publicly available, and the CRI notes that just 1% of cancer research data meets standards that allow meaningful reuse by external researchers.
The CRI Discovery Engine seeks to change that by providing standardized, high-resolution data on how immune cells and cancer cells respond to immunotherapy interventions over time.
By making these datasets openly available and optimized for AI and machine learning tools, the platform is intended to allow researchers worldwide to analyze the same biological processes using consistent methods.
In a press release, Alicia Zhou, PhD, CEO of CRI, said: “The goal of the CRI Discovery Engine really is to accelerate discovery in the immunotherapy space.”
She explained that immunotherapy is often described as a “living therapy,” meaning its effects evolve dynamically as immune cells interact with tumors. Capturing these interactions in real time and in three-dimensional space has historically been difficult, but recent advances in spatial sequencing technology now make it possible.
Rather than relying on isolated experiments conducted in individual laboratories, the platform is designed as a shared foundation for immunotherapy research.
CRI will initially seed the database with its own studies, while external researchers will be able to contribute additional data over time. This will create a living resource that continually grows in value to accelerate the path from lab to life-saving treatment.
“One of the biggest challenges in academic research is that we work in silos,” said Wherry in a press release.
“There’s competition and proprietary knowledge that institutions feel they need to protect. But that approach slows everyone down. This collaboration represents a commitment to breaking down those barriers because we all share the same goal: getting better treatments to patients faster.”
The first phase of the CRI Discovery Engine will focus on melanoma and colorectal cancer. Although immunotherapy has already transformed patient outcomes for these two cancer types, significant knowledge gaps remain.
Importantly, the database will also include data from treatments that failed. Such negative results are rarely shared publicly, despite their value in helping researchers understand why certain approaches may not work.
By capturing both successful and unsuccessful interventions, the platform aims to provide a more complete picture of immune responses and guide the development of new treatment combinations.
“Someday we’ll look back on this as a turning point for immunotherapy,” Satpathy stated in a press release.
“By building a shared, high-resolution understanding of how the human immune system responds to interventions over time, we’re unlocking a new era of discovery — one that shows us why treatments work, why they fail, and how to design what comes next.”
The database is designed with AI and machine learning applications in mind. That design is intended to let computational tools identify biological patterns more efficiently, potentially shortening the timeline from laboratory discovery to clinical application.
The initial dataset is expected to be made publicly available within the first year.
As funding pressures and public skepticism toward science grow, CRI leaders say collaborative efforts like the Discovery Engine are increasingly important.
“Cancer doesn’t care about institutional egos or proprietary data,” Zhou said. “Neither do we.”