When Data Parasites Are a Positive

A growing trend of information sharing will move science forward, one cardiologist says.

4:42 PM

Author | Haley Otman


It started as an insult, but advocates for data sharing are encouraging people to become research parasites and are inspiring symbiosis in science.

With the increasing sophistication of methods in disciplines like computational biology and epidemiology, and the vast possibilities that come with newer fields like deep learning, there's more of a need than ever for skilled professionals who focus full-time on data analysis.

That means existing data sets from studies, including clinical trials, can now become meaningful in ways far beyond the original idea that sparked the research, says cardiologist J. Brian Byrd, M.D., M.S., at Michigan Medicine's Frankel Cardiovascular Center.

That's true as long as more researchers start to become interested in sharing their data for others to use, he says.

"Over time, I anticipate funding agencies will begin to reward those who share more and share better with an increased chance of future funding," Byrd says.

To help foster an environment where people want to share their data well, he hosted a podcast with The Lancet about open data last year and co-founded the Research Symbiont Awards, which he currently chairs. The annual award recognizes someone who goes beyond typical standards for data sharing, making it easy for others to use the data.

Byrd says he isn't worried that growing demands to share data openly could lead some researchers to analyze existing data exclusively, without generating any of their own. That kind of analysis is a growing need in the medical community, now that we can do more with large amounts of data, he says.

Byrd discusses the award he created and why he's passionate about shifting the culture toward more sharing.

What's the current status of data sharing?

Byrd: While I'd still say there is no roadmap for data sharing, things are changing. I see a confluence of interests toward more, better, safe, ethical and careful data sharing today than even months ago or last year.

For example, I've personally found basic infrastructure is coming along. I had an experience several months ago while starting to plan a study, in which I know we'd like to share the data, where it seemed unclear who could figure out the logistics of doing that and make sure any regulatory implications were managed. Now, I've circled back months later and have found more people are able to help us figure this out.

I think we're at a tipping point where sharing becomes more normal and more desirable for the person who holds the data.

What makes for successful data sharing?

Byrd: The most important thing is to make it as easy as possible, within legal and ethical constraints, for other people to use your data. That could look like a downloadable data set with additional information on a website.

For the Research Symbiont Awards, we look at researchers who went beyond typical standards to create openly shared resources or data sets that could allow other people to take science further using what was shared.

I find that what people do currently varies quite a bit, and it varies by field. For example, it's very common today for people who do sequencing to upload the data to a public repository. But there are other fields in which sharing is still more unusual.

What are the main advantages for researchers to share their data?

Byrd: You allow for a longer life cycle of your research. Other people could use what you've gathered in ways you may not ever have thought of, or may not have time to collaborate on right now.

Some newer forms of analysis, like deep learning, require a vast amount of data to train the models that will then be used for a helpful application in the medical sphere. To the extent that data sets become available, we can do interesting things beyond the original research aims.

For example, I worked with colleagues at the University of Pennsylvania to create a synthetic dataset from the original data out of the SPRINT blood pressure trial. We used a novel method called generative adversarial neural networks, in which two neural networks train each other to make synthetic data similar to the original data, but not the original data. Researchers can do analyses to find real meaningful results that don't contain any trial participant's real data.

What risks or concerns come with data sharing at this point?

Byrd: We must be concerned about any threats to the privacy of participants in a study. This is something we're giving a lot of thought to: how can we best enable sharing of data without compromising privacy?

What people originally consented to matters a lot. The literature shows that the large majority of clinical trial participants are open to their data being shared for a variety of purposes. People may in fact expect broader use of the information generated through their process of volunteerism for a study. It's important to discuss these topics in a public conversation so everyone has the same information.
