How machine learning can revolutionize health care

Tools and technologies in the health care industry are evolving rapidly. Machine learning (ML) is changing medicine, advancing care and streamlining data handling to offer unprecedented opportunities for improved diagnostics, personalized treatment, and operational efficiency.

But what exactly is machine learning? It is the science of programming computers to identify patterns within data without direct instructions. Put simply, computers learn and improve from experience in order to generate insights and make predictions. In the past decade, ML has given us self-driving cars, spam email filters and image recognition.

Harnessing the power of health data

In health care, ML relies on vast amounts of patient data, and our health care system generates billions of gigabytes of health-related data annually, from genetics to imaging. Using systems and tools designed to sort and categorize data, ML algorithms can discover patterns in datasets that allow medical professionals to make more accurate predictions, identify at-risk patients, and optimize wait times. Although a wide range of data is collected about patients, it isn’t always analyzed and used to answer life-changing questions such as: which treatment approach for arthritis, diabetes, or tuberculosis will work better in a particular patient’s case?

Advancing Health Trainee Belal Hossain is bridging this gap by using ML techniques to analyze this wealth of health administrative data. His goal is to derive insights about the comparative effectiveness of various health care interventions, with the aim of determining which treatment options work best for different conditions or populations.

Predicting mortality risk in tuberculosis

For his doctoral research, supervised by Advancing Health Scientist Dr. Ehsan Karim, Belal is working with a team including Program Head for Biostatistics at Advancing Health, Dr. Hubert Wong. Together, they’re using ML to answer questions relating to tuberculosis (TB). Specifically, they’re exploring whether latent TB or TB disease (where the TB bacteria are active) is associated with cardiovascular disease. To answer this question, Belal is studying data from a group of people who immigrated to British Columbia from other countries between 1985 and 2019. Using the same dataset, he is also working on creating risk prediction models for long-term mortality of people diagnosed with TB.

Belal points out that while health administrative data contains a lot of information, it’s not collected for research purposes. This means that some important contextual information is missing. “For example, when we are predicting mortality rate in people diagnosed with TB, we know that factors like smoking or diet are associated with mortality, but these details aren’t recorded in the [health administrative] data,” Belal explained. “So, when we make risk predictions, our predictions might not be as reliable or accurate as possible because we are missing these variables. Fortunately, machine learning can assist us in the process.”

In his data analysis, Belal is using the vast amount of information already available in the health administrative data to compensate for what is missing and make predictions based on the linkages between variables. “For example, if we don’t have data on smoking, we can look at other factors like having diabetes or hypertension, or some other diagnoses or procedures, which are often linked to smoking,” he explained. “So, if a person’s data shows they have one of those variables, we know that, although they might not be a smoker, they share many traits that a smoker might have. We can use that information as a stand-in, or proxy, for the missing smoking data in our ML models and make our prediction more robust.”
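The proxy idea can be illustrated with a small simulation. The sketch below is not the research team's model: it uses synthetic data, and every function name and probability in it is a hypothetical assumption chosen for illustration. Smoking is simulated but treated as unrecorded, while hypertension and diabetes are recorded and occur more often among smokers. Grouping people by how many of these recorded diagnoses they have shows that the recorded variables carry information about mortality risk even though smoking itself is never used.

```python
import random


def simulate_cohort(n=20000, seed=0):
    """Generate synthetic records; smoking is simulated but treated as unrecorded."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # Hidden variable: not available in the "administrative" data.
        smoker = rng.random() < 0.3
        # Recorded diagnoses are assumed to be more common among smokers.
        hypertension = rng.random() < (0.5 if smoker else 0.2)
        diabetes = rng.random() < (0.3 if smoker else 0.1)
        # In this toy setup, mortality depends on smoking only.
        died = rng.random() < (0.25 if smoker else 0.08)
        records.append(
            {"hypertension": hypertension, "diabetes": diabetes, "died": died}
        )
    return records


def mortality_by_proxy_count(records):
    """Observed mortality rate grouped by number of recorded proxy diagnoses."""
    totals = {0: 0, 1: 0, 2: 0}
    deaths = {0: 0, 1: 0, 2: 0}
    for r in records:
        k = int(r["hypertension"]) + int(r["diabetes"])
        totals[k] += 1
        deaths[k] += int(r["died"])
    return {k: deaths[k] / totals[k] for k in totals}


cohort = simulate_cohort()
rates = mortality_by_proxy_count(cohort)
for k in sorted(rates):
    print(f"{k} proxy diagnoses: observed mortality {rates[k]:.3f}")
```

In this toy cohort, observed mortality rises with the number of proxy diagnoses, which is why a prediction model can draw on them when the smoking variable is missing. Real administrative data would involve many more variables and weaker, noisier relationships.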

The future of machine learning in health care

Beyond risk prediction modeling, integrating ML with causal inference presents new opportunities for research, especially in assessing how effective treatments can be. Leveraging ML allows researchers to identify high-risk groups, design tailored interventions, and then assess their effectiveness in real-world scenarios.

In his supervisory role, Dr. Karim brings a wealth of experience in ML to Belal’s doctoral research. He has used ML methods in previous research to evaluate drug effectiveness, particularly in chronic diseases like multiple sclerosis, where treatment durations extend far beyond typical clinical trial periods. Dr. Karim emphasizes the importance of using longitudinal data (data collected over a long period of time) in chronic disease research to enhance analysis. For comparative effectiveness studies, which compare the benefits and harms of different interventions used to diagnose, treat, monitor or prevent a condition, he says that ML can significantly improve the depth of analysis by identifying patterns in the data with more precision than was previously possible. However, Dr. Karim urges caution when interpreting results from complex ML approaches and algorithms that adapt and learn from data based on a wide range of inputs. “We are currently exploring various statistical dimensions within the causal inference framework to accommodate a variety of ML algorithms in health care analysis properly. This ensures we can draw reliable conclusions about the effectiveness of drugs,” he said.

While Belal’s doctoral research focuses on risk prediction for people with TB, his role at Advancing Health has him collaborating with another Advancing Health Scientist, Dr. Bohdan Nosyk. Here, Belal is working with big data to strengthen the evidence base on the clinical management of opioid use disorder. Their work aims to assess the effectiveness of different interventions. Working on such a range of questions is exciting, and Belal thinks the future in this area is promising.

“There are huge opportunities for statisticians and epidemiologists in the use of ML in the causal inference framework because this is an active area of research, and there is huge potential public health value of this type of analysis,” shared Belal. “And potential for evolving or emerging clinical practices to alter results or refine our understanding of a particular question.” But he urges caution, as distinguishing between predictive and causal questions is crucial: “I think ML could help us if we use it appropriately. ML is mostly used for prediction, but if we want to use ML for a causal question, exploring effectiveness, we need to be careful because the analysis strategies for a causal question and for a predictive question are totally different.”
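A toy simulation can show why predictive and causal questions call for different analysis strategies. The sketch below is an illustration under stated assumptions, not the method used in the research described here: the data are synthetic, and the function names, probabilities, and the single confounder (disease severity) are all hypothetical. Sicker patients are more likely to be treated, so a crude comparison of treated and untreated patients makes a beneficial treatment look harmful. Stratifying on severity and averaging the stratum-specific differences (a simple form of standardization) recovers the effect built into the simulation.

```python
import random


def simulate(n=50000, seed=1):
    """Synthetic cohort as (severe, treated, died) tuples with confounding by severity."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.4
        # Sicker patients are more likely to receive the treatment.
        treated = rng.random() < (0.7 if severe else 0.2)
        # Built-in effect: treatment lowers death risk by 0.05 in both strata.
        base = 0.30 if severe else 0.10
        died = rng.random() < (base - 0.05 if treated else base)
        rows.append((severe, treated, died))
    return rows


def risk(rows, treated, severe=None):
    """Observed death risk among rows matching the treatment (and severity) filter."""
    selected = [
        d for s, t, d in rows
        if t == treated and (severe is None or s == severe)
    ]
    return sum(selected) / len(selected)


def crude_difference(rows):
    """Treated minus untreated risk, ignoring severity."""
    return risk(rows, True) - risk(rows, False)


def standardized_difference(rows):
    """Stratum-specific risk differences, weighted by stratum size."""
    n = len(rows)
    total = 0.0
    for severe in (False, True):
        weight = sum(1 for s, _, _ in rows if s == severe) / n
        total += weight * (risk(rows, True, severe) - risk(rows, False, severe))
    return total


rows = simulate()
crude = crude_difference(rows)
adjusted = standardized_difference(rows)
print(f"crude risk difference:        {crude:+.3f}")
print(f"standardized risk difference: {adjusted:+.3f}")
```

The crude difference is useful for prediction (treated patients in this cohort really do die more often), but it answers the causal question incorrectly. Adjustment of this kind only works when the relevant confounders are measured, which is one reason missing variables in administrative data matter.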
