How bias can creep into medical databanks that drive precision health and clinical AI

Findings have already prompted improvements in how the University of Michigan recruits new participants for its biobank.

Author | Kara Gavin

In the race to harness medical data for artificial intelligence tools and personalized health care, a new study shows how easily unintentional design bias can affect those efforts.

It also points to specific ways to increase the chances that patients who are traditionally underrepresented in research can be included in the massive banks of genetic samples and data from digital medical records that underlie these efforts.

Not only could that be important to the accuracy of the tools based on those data, but it would also make it more likely that they'd benefit diverse patient communities.

The study, in the December issue of Health Affairs, comes from a team at the University of Michigan and Michigan State University that studied U-M's efforts to build a large bank of data and samples for researchers to use.

The findings have already led to improvements in how Precision Health at U-M recruits participants, and in the racial and ethnic categories that patients can select for themselves to be added to their records.

Key findings

The study focuses on the Michigan Genomics Initiative, which originally designed its recruitment effort around asking patients waiting for surgery at Michigan Medicine, U-M's academic medical center, to donate a small amount of blood for the research biobank. Trained recruiters aimed to approach all adult surgical patients in the preoperative setting during typical surgical hours.

There were several reasons why the initiative used this approach — including the fact that patients in such settings have time to engage in recruitment and enrollment procedures, and that they often already have an intravenous line placed in preparation for their treatment, so it's convenient to draw a blood sample for research use if they consent.

But the new study found that the surgical patients from whom the Michigan Genomics Initiative staff recruited were more likely to be older, white and socioeconomically advantaged men when compared to the general Michigan Medicine patient population.

In addition, when approached, patients who consented to enroll in the biobank were younger than the average patient waiting for surgery, and less likely to be Black or African American, Asian or Hispanic.

The result: The blood samples collected for the biobank came from a sub-population that was less demographically diverse than Michigan Medicine's overall patient population.

Changing the approach

While recruiting surgical patients remains a key component of the Michigan Genomics Initiative's recruitment strategy, Precision Health has since expanded its recruiting efforts to include a mail-in saliva-collection kit — giving a broader patient population the opportunity to engage in the research if they choose. Precision Health's MY PART effort aims to recruit a nationally representative study population into the university's biobank.

The authors hope that by sharing their deep-dive into differences in recruitment and consent rates, they can help other institutions, organizations and companies design more equitable databanks of their own.

If they don't, the tools and products that emerge from research using those databanks will reflect demographic biases, making those tools less accessible or generalizable for underrepresented communities, the researchers say.

"We know that large research datasets often do not reflect the diversity of the patient population across the United States, but our study gives a detailed analysis about how these disparities become embedded in scientific advances from the ground up," said Kayte Spector-Bagdady, J.D., M.B.E., co-first author of the new paper and a research ethicist at Michigan Medicine. "This way we were able to highlight practical improvements that we could implement immediately," she added.

Downstream effects

Spector-Bagdady, a U-M Medical School assistant professor who is the associate director of U-M's Center for Bioethics and Social Sciences in Medicine, led the study along with senior author Jenna Wiens, Ph.D., one of the co-directors of Precision Health and an associate professor of computer science and engineering at the U-M College of Engineering. Both are members of the U-M Institute for Healthcare Policy and Innovation.

"A lot of the research that goes on in precision health, machine learning, and AI for health care across the country leverages data from the electronic health records of major health systems, and data from the subset of patients who have consented to give biospecimens," Wiens explained. "For an AI researcher who builds machine learning and clinical decision support tools, generalizability is so important. Otherwise, we risk building tools that perpetuate disparities in care and outcomes."

Levels of consent unlock more precision

The authors note that many academic medical centers, including Michigan Medicine, inform patients when they consent to receive care that their medical records might be used by researchers. At U-M, such use is permitted with authorization from the Institutional Review Boards at the Medical School.

Taking part in the Michigan Genomics Initiative (MGI) involves consenting to allow those records to be used in conjunction with a sample of the participant's DNA.

For instance, researchers might analyze part of a participant's genetic sequence and look at how that person's genetic traits relate to the conditions they have or how well they do when given certain treatments.

This is a powerful tool for understanding what drives certain diseases, or what treatments work best for people with different characteristics who have the same type of cancer, for instance.

It could also form the basis for AI tools that can predict which patients will suffer certain complications, or help doctors pick from among various treatments for them.

Using just the Michigan Medicine electronic medical record data would mean capturing a patient population with more demographic diversity, but it would not offer patients the same research-level informed consent as the biobank consent process.

Records-based research also means less precision for some studies, because it doesn't include the ability to study genetic variation and biomarkers – such as proteins in the blood that could be associated with disease.

That means biobank teams must go to extra lengths to recruit people from groups that are less likely to give consent.

"Building long-term trust between healthcare systems and those underrepresented in biobanks, and the research enterprise in general, is a task that must be prioritized. Any attempts at equity building must be hyper-localized, attentive to historical neglect, and situated in justice considerations beyond the research question," added co-author Melissa Creary, Ph.D., who is an assistant professor at the U-M School of Public Health and the senior director of Public Health Initiatives at the American Thrombosis and Hemostasis Network, and who has written extensively on these issues.

Making it clear to participants how their data will be used if they give consent, including any commercial uses, and being careful about sharing data with industry is crucial for earning trust and is already a top priority at U-M. Michigan Medicine's leader, Marschall Runge, M.D., Ph.D., recently wrote on this topic.

"There's an important tension between respecting patients' informed consent and also supporting generalizable research," Spector-Bagdady said. "The ideal resolution is a structure that doesn't put those two in tension to begin with."

Paper cited: "Respecting Autonomy And Enabling Diversity: The Effect Of Eligibility And Enrollment On Research Data Demographics," Health Affairs. DOI: 10.1377/hlthaff.2021.01197
