Teaching a computer to read doctors’ notes will capture valuable data for cancer registries

Algorithm will scan pathology and radiology reports for information that could help researchers and policy-makers improve patient outcomes.

Yan Yuan is leading a project using AI to scan doctors’ written notes for valuable diagnostic information to add to provincial cancer registries, with the goal of improving treatments and outcomes for patients. (Photo: School of Public Health)

Every time you enter a phrase or a sentence into Google search, algorithms kick in using a technique called natural language processing to understand what you really want to know and then find you an answer.

Now University of Alberta researchers will use a similar approach to develop a computer program that can “read” doctors’ written notes to help improve our understanding of how cancer spreads, track how well cancer treatments work and ultimately make them more effective.

Scanning written pathology and radiology reports in this way will allow researchers to ferret out valuable diagnostic information that is currently missing from cancer registries. Researchers and policy-makers use the registries extensively, for example to evaluate the real-world performance of treatments and to identify gaps in resources for treating rare cancers.

“Because cancer has become a more chronic disease, we need better information on how cancer progresses and recurs in order to advance our treatments,” said project lead Yan Yuan, a biostatistician and associate professor in the School of Public Health.

“Radiology and pathology reports contain rich clinical information in unstructured human text language,” she said.

Supported by a grant of nearly $450,000 just announced by the Canadian Cancer Society, the research team will prove the concept by searching the records of Alberta cancer patients diagnosed between 2010 and 2021 for diagnoses of brain metastases and the molecular markers for brain cancers. The project will link the original cancer diagnosis data to the followup health record.

Critical data for improving treatments

Provincial cancer registries contain manually entered information about patient demographics, cancer type and stage, and first-line treatments such as chemotherapy, surgery or radiation, and are linked to vital statistics records to infer survival rates.

Each province collects slightly different data for its cancer registry, but most do not update the data after the initial cancer diagnosis. Most also do not include the molecular markers found in the tumour tissue, which can indicate how aggressive a tumour is and what treatment might work. They also don’t track metastases: new tumours found in other parts of the body when the original cancer spreads.

Yuan’s project aims to close those gaps.

“If you don’t have the surveillance, you don’t know the burden on the health-care system,” she said. “Part of the power of the population-level database is that now we can have data from all of Canada on rare cancers such as brain cancer.”

Yuan said Alberta is ideal for the project because the province already has the most complete cancer registry in the country. Brain scans are done only in hospitals, so the records are centralized and accessible. And there is a strong group of experts in brain tumours and artificial intelligence working together at the U of A, including computer scientist Lili Mou, radiology and diagnostic imaging chair Derek Emery and neuro-oncologist Jacob Easaw. Yuan expanded the research team that was originally established by Faith Davis, professor emeritus and former vice-dean of the School of Public Health, who was the founding research director for the Central Brain Tumor Registry of the United States. Yuan is also a member of the Women and Children’s Health Research Institute.

Yuan’s goal for the three-year project is to establish how well the algorithm works for brain tumour records in Alberta, then share it with British Columbia and Ontario to see how it performs there, and then expand to more jurisdictions and other types of cancer.