Extreme-Scale Computing Project Aims to Advance Precision Oncology

By Kaylee Towey, Student Intern

Two government agencies and five national laboratories are collaborating to develop extremely high-performance computing capabilities that will analyze mountains of research and clinical data to improve scientific understanding of cancer, predict drug response, and improve treatments for patients.

The cross-agency collaboration, known collectively as the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program, includes the National Cancer Institute and the Department of Energy along with five national labs: Argonne, Oak Ridge, Lawrence Livermore, Los Alamos, and Frederick.

JDACS4C is a DOE-NCI partnership formed in response to three national initiatives: the Precision Medicine Initiative, the Beau Biden Cancer Moonshot, and the National Strategic Computing Initiative (NSCI), which was established by an executive order signed on July 30, 2015. The executive order stated that "it is the policy of the United States to sustain and enhance its scientific, technological, and economic leadership position in [high-performance computing] research, development, and deployment."

One of the JDACS4C projects, the Cancer Distributed Learning Environment (CANDLE), is creating the capability to use cancer data to build predictive models for drug response, provide better molecular understanding of disease growth, and support decisions on individualized treatment, all while supporting the three JDACS4C pilot programs.

CANDLE will be implemented as a widely accessible open-source computer environment that individual cancer researchers across the country will be able to install on their own systems. The easy-to-access program will encourage development of next-generation computer simulations that can be used to better understand biological processes in cancer and, ultimately, to predict which drugs would be most effective against which cancers.

“The early uses will likely be for studying the complexities of cancer, with potential for extending impact into support for precision oncology treatment decisions,” said Eric Stahlberg, Ph.D., director, Strategic and Data Science Initiatives in the Data Science and IT program at Frederick National Lab (FNL).

CANDLE is more than a platform for sharing. The system is programmed to “learn.” The software is designed to detect complex patterns in large data sets that may be invisible to researchers. This is called deep learning. The software will link this new information to known concepts, thereby extending the scientific understanding of processes involved in cancer.

Each cancer model’s parameters can be continually adjusted to bring the machine’s predicted responses closer to observed responses. The computer predictions will be validated by scientists to ensure accuracy. The ability to explore a broad range of models and data enables CANDLE to identify potential novel solutions and insights in unanticipated areas.
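
To make the idea of adjusting parameters concrete, here is a minimal sketch in Python. It is not CANDLE code: the one-variable linear model, the function names, and the dose and response numbers are all made up for illustration. It only shows the general technique, gradient descent, in which each pass nudges the parameters so that predicted responses move closer to observed ones.

```python
# Illustrative only: a toy model, not part of CANDLE.
# All names and numbers here are hypothetical.

def mean_squared_error(weight, bias, doses, observed):
    """Average squared gap between predicted and observed responses."""
    return sum((weight * d + bias - o) ** 2
               for d, o in zip(doses, observed)) / len(doses)


def fit_linear_model(doses, observed, learning_rate=0.05, steps=5000):
    """Fit response = weight * dose + bias by gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(doses)
    for _ in range(steps):
        # Gap between what the model predicts and what was observed.
        errors = [weight * d + bias - o for d, o in zip(doses, observed)]
        # Gradients of the mean squared error for each parameter.
        grad_weight = 2 * sum(e * d for e, d in zip(errors, doses)) / n
        grad_bias = 2 * sum(errors) / n
        # Step each parameter in the direction that shrinks the error.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Made-up data that follows response = 2 * dose + 1 exactly.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [1.0, 3.0, 5.0, 7.0, 9.0]

weight, bias = fit_linear_model(doses, observed)
print(f"weight={weight:.3f}, bias={bias:.3f}")
```

Deep learning models apply the same loop to millions of parameters rather than two, which is one reason the project relies on national-laboratory computing resources.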

“The research community has collected thousands of experiments with hundreds of thousands of data points characterizing tumors and their response to the drugs,” said Rick Stevens, an associate laboratory director at the DOE’s Argonne National Laboratory and professor of computer science at the University of Chicago. “By working with the national laboratories, the National Cancer Institute can now use the computing resources of the national labs to build scalable predictive models for the cancer problem.”

A significant milestone in the use of predictive models is evident in one of the largest precision medicine clinical trials in the nation, the NCI-Molecular Analysis for Therapy Choice (NCI-MATCH) led by ECOG-ACRIN (part of NCI’s National Clinical Trials Network). The trial incorporates expertly defined algorithms to help match the genetic mutations found in the tumors of individual patients with drugs available to target those mutations, and to do so accurately and rapidly.
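
As a rough illustration of what rule-based matching can look like, the Python sketch below looks up each alteration found in a tumor in a fixed table of rules. The gene, variant, and drug names are placeholders, and the table and function are assumptions made for this example; they are not the trial's actual rules or software.

```python
# Illustrative only: placeholder names, not real NCI-MATCH rules.

# Hypothetical rule table: (gene, variant) -> candidate targeted drug.
MATCH_RULES = {
    ("GENE_A", "variant_1"): "drug_x",
    ("GENE_B", "variant_2"): "drug_y",
}


def match_treatments(tumor_alterations, rules=MATCH_RULES):
    """Return (alteration, drug) pairs for alterations that have a rule."""
    return [(alteration, rules[alteration])
            for alteration in tumor_alterations
            if alteration in rules]


# A made-up tumor profile with one matchable alteration and one without a rule.
profile = [("GENE_A", "variant_1"), ("GENE_C", "variant_9")]
print(match_treatments(profile))
```

A table like this can only match what experts have already written into it, whereas a learned model can draw on patterns across many experiments.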

The deep learning approach underlying CANDLE builds upon this important advance, enabling an even greater range of potential models and data to be incorporated into the development of predictive models. Deep learning has already enabled recent advances in identifying skin cancer, improving breast cancer diagnosis, and predicting mutation rates in prostate cancer, Stahlberg said.

CANDLE will also build upon emerging data resources being initiated by NCI and supported by the Frederick National Lab, such as the Genomic Data Commons, a genomic data repository at the University of Chicago that supports the NCI Cancer Moonshot.

CANDLE will ultimately enable scientists to analyze information from many sources to look for key cancer biomarkers (molecules in the bloodstream that indicate the presence of the disease) and other information that may predict treatment response, Stahlberg said. Scalability is an important aspect of the DOE design for CANDLE: the system is meant to accommodate the analysis of millions of clinical patient records, which can aid the development of databases of disease metastasis and recurrence.

The Frederick National Lab has installed and tested early versions of CANDLE. Planned incremental releases of the software will enable broad use of the technology and allow scientists and researchers to provide insight and feedback to guide future development of the software.

“The exciting potential behind the development of CANDLE is the emerging set of unique and creative ways in which the deep learning capability will be applied in and shared among the cancer research community,” Stahlberg said.