Covid Seed Grant Yields Data Mining Discoveries

As coronavirus infections exploded in the spring of 2020, everyone was looking for ideas about how to fight what had become a full-blown pandemic. The Wallace H. Coulter Department of Biomedical Engineering put out the call for ideas and offered faculty members seed funding to pursue them.

Turns out, Cassie Mitchell already was on the case.

“We’d been working for a few weeks on something, since the White House started asking data scientists to do analysis on old SARS data, but also on the emerging Covid dataset that was being updated weekly,” said Mitchell, assistant professor in Coulter BME who specializes in using data to forecast disease. “We had already started adapting our text-mining architecture for Covid-19.”

Text mining is what it sounds like: an artificial intelligence process that involves analyzing a lot of existing text for useful data that could lead to new discoveries. Mitchell’s lab received a $10,000 seed grant to use her tools to dig through millions of peer-reviewed articles, seeking hidden patterns that would be relevant to Covid-19 — perhaps identifying patient risk factors or even drugs that might be repurposed to treat the virus.

Using Covid-19 as a test case, the lab adapted a process called link prediction, an important tool in artificial intelligence and machine learning that predicts whether a link exists between two entities. It is a bit like finding the blanks in a dataset and then filling them in.

Link prediction is at play when your social media platform suggests a new friend for you, or when an online marketplace predicts which customers will buy what products.
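To make the friend-suggestion example concrete, here is a minimal sketch of one of the simplest link prediction heuristics, common-neighbor counting. It is not the method Mitchell's lab used, which the article does not detail; the names, the toy graph, and the `predict_links` function are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy friendship graph; every name here is illustrative only.
edges = [
    ("ana", "ben"), ("ana", "cal"), ("ben", "cal"),
    ("ben", "dee"), ("cal", "dee"), ("dee", "eli"),
]

def build_adjacency(edges):
    """Map each node to the set of nodes it is directly linked to."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def predict_links(edges):
    """Score every unlinked pair by how many neighbors the two nodes share."""
    adj = build_adjacency(edges)
    scores = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        shared = adj[u] & adj[v]
        if shared:
            scores.append((u, v, len(shared)))
    # Highest score first; ties broken alphabetically for stable output.
    scores.sort(key=lambda t: (-t[2], t[0], t[1]))
    return scores

suggestions = predict_links(edges)
for u, v, score in suggestions:
    print(f"{u} - {v}: {score} shared connection(s)")
```

In this toy graph the top suggestion is the pair "ana" and "dee", who are not linked but share two connections. Production systems typically use richer scores or learned models, but the idea of ranking missing links by evidence from the surrounding network is the same.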

“Though it’s used for other things, we adapted it to biomedical text — as you might imagine, a biomedical application is more difficult than dealing with customer segmentation data,” Mitchell said.

Mitchell’s team excavated information from the articles and built a “knowledge graph, or network that links symptoms, drugs, antecedent diseases, genes, proteins, and much more to Covid-19 or similar coronaviruses,” she said.

The team ranked each entity's relationship with the coronavirus to find the most promising research paths, with the intent of expediting translational research. They highlighted thousands of drugs that could potentially be repurposed, flagging them for further research.
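The general idea of building a knowledge graph from text and ranking what is linked to a target can be sketched in a few lines. This is only an illustration under simple assumptions: it counts how often two entities are mentioned together, which is far cruder than the lab's pipeline, and the entity labels (`drug_a`, `protein_x`, and so on) are synthetic placeholders, not findings from the study.

```python
from collections import Counter
from itertools import combinations

# Synthetic stand-ins for the entities mentioned together in one passage.
# In a real pipeline these would come from an entity-extraction step.
documents = [
    {"covid-19", "drug_a", "protein_x"},
    {"covid-19", "drug_a"},
    {"covid-19", "symptom_s"},
    {"drug_b", "protein_x"},
    {"covid-19", "drug_a", "symptom_s"},
]

def build_graph(documents):
    """Return edge weights: how many passages mention both entities."""
    weights = Counter()
    for entities in documents:
        for a, b in combinations(sorted(entities), 2):
            weights[(a, b)] += 1
    return weights

def rank_neighbors(weights, target):
    """List the entities linked to `target`, strongest link first."""
    ranked = []
    for (a, b), weight in weights.items():
        if target in (a, b):
            other = b if a == target else a
            ranked.append((other, weight))
    ranked.sort(key=lambda t: (-t[1], t[0]))
    return ranked

graph = build_graph(documents)
ranking = rank_neighbors(graph, "covid-19")
for entity, weight in ranking:
    print(f"{entity}: co-mentioned {weight} time(s)")
```

Here `drug_a` ranks first because it appears alongside the target in three of the five passages. Raw co-mention counts favor frequently studied entities, so a realistic ranking would normalize for how common each entity is overall.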

“The process can be applied to any emergent or poorly understood biomedical issue for a quicker and more diligent meta-analysis of research, compared to existing methods,” said Kevin McCoy, a third-year BME student who was technical team lead of the lab’s Covid-19 study and co-authored a journal article on the work that’s currently under review.

“The most important finding is that machine learning techniques can be used to rapidly ingest and summarize biomedical literature to generate insightful and accurate summaries of the current research.”

Mitchell said the grant primarily supported student researchers, who took part in several conferences during the year in addition to writing their study.

And what started as a modest seed grant could yield results for years to come: the data mining technology, “which we intend to use for other biomedical issues,” Mitchell said, and a new course for McCoy’s future.

“I quickly fell in love with the applications of data science to biomedical engineering,” McCoy said.

He joined Mitchell’s lab in summer 2020 and is now looking at Ph.D. programs in statistics and machine learning, “to continue learning about and using their respective techniques to solve pressing biomedical problems.”


  • Created By: Joshua Stewart
  • Created: 04/13/2021
  • Modified By: Joshua Stewart
  • Modified: 04/13/2021

