PhD Defense by Toyya Pujol

Event Details
  • Date/Time:
    • Thursday August 6, 2020
      10:00 am - 11:00 am
  • Location: REMOTE: BLUE JEANS
  • URL: https://bluejeans.com/244818890/1358?src=join_info
  • Fee(s):
    N/A
Summaries

Summary Sentence: Analytics and Machine Learning for Health Care Data


Dear faculty members and fellow students,

 

You are cordially invited to attend my upcoming thesis defense.

 

Thesis Title: Analytics and Machine Learning for Health Care Data

 

Advisor

Dr. Nicoleta Serban, School of Industrial and Systems Engineering, Georgia Tech

 

Committee members:

Dr. Sherri Rose, School of Medicine, Stanford University

Dr. Julie Swann, Department of Industrial and Systems Engineering, North Carolina State University (Adjunct Professor at Georgia Tech ISyE)

Dr. Branislav Vidakovic, School of Industrial and Systems Engineering, Georgia Tech

Dr. Greg Gibson, School of Biological Sciences, Georgia Tech

 

Date: Thursday, August 6th

Time: 10 am-11 am, EDT

 

Meeting URL (for BlueJeans):

https://bluejeans.com/244818890/1358?src=join_info

 

Meeting ID (for BlueJeans):

244 818 890

 

Abstract:

The volume of data is expected to grow faster in health care than in any other industry. This growth creates demand for rigorous analytics and machine learning methods that can be applied to large health data sets. These data sets come with privacy protections that limit data visibility and release, which can complicate analysis and restrict the use of out-of-the-box solutions. Health care research also has very high stakes: it can be the difference between life and death, or it can have a major impact on an individual’s quality of life. For these reasons, the development of statistically sound technical solutions is all the more critical. This thesis applies analytics and machine learning to applied research problems involving health care data.

Chapter 1 is an introduction to each study in the thesis. It presents the research objectives and contributions.  The chapter also discusses the value of the methods used in each study and the benefits of using administrative claims data.

In chapter 2, we determine the level of uptake by clinicians of new CDC contraceptive recommendations, the Medical Eligibility Criteria for Contraceptive Use (MEC). Using Medicaid claims data for 14 states, the study included Medicaid-enrolled women of reproductive age during the two years before and the two years after the MEC release. We focused on two outcome measures: (1) overall contraception use and (2) the use of CDC-recommended contraception (i.e., methods of the highest efficacy). We evaluated each outcome for the entire study population and by health condition. The ratio of the after-guideline rate to the before-guideline rate was used to determine the statistical significance of the MEC uptake. The results showed an increase in the overall use of contraception among women with these health conditions, both collectively and for each condition individually. However, the use of the highest-efficacy methods increased overall but not for every condition. The chapter gives suggestions for further increasing the use of the highest-efficacy methods within this population.
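
The after-to-before rate ratio described above can be illustrated with a minimal sketch. The abstract does not say how significance was assessed, so the log-scale Wald interval below is an assumption, as is treating the two periods as independent samples. The function name and all counts are hypothetical and are not taken from the study.

```python
import math

def rate_ratio_ci(events_after, n_after, events_before, n_before, z=1.96):
    """Ratio of the after rate to the before rate, with a Wald interval
    computed on the log scale (assumes two independent binomial samples)."""
    rate_after = events_after / n_after
    rate_before = events_before / n_before
    ratio = rate_after / rate_before
    # Standard error of log(ratio) for two independent proportions.
    se_log = math.sqrt(
        1 / events_after - 1 / n_after + 1 / events_before - 1 / n_before
    )
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper

# Hypothetical counts: 250 of 1,000 women before, 300 of 1,000 after.
ratio, lower, upper = rate_ratio_ci(300, 1000, 250, 1000)

# The change is flagged as significant when the interval excludes 1.
significant = lower > 1.0 or upper < 1.0
```

A ratio above 1 indicates higher use after the guideline release. Under these assumptions, the increase is treated as statistically significant only if the whole interval lies above 1.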

In chapter 3, we assess the health and wellness outcomes of infants born to adolescent mothers. Our nationwide study assesses the association between adolescent pregnancy and the health and wellness of infants within their first year of life. Each infant in the study group (infants born to adolescent mothers) is matched with the control group (infants born to adult mothers) based on the mother’s demographics. The outcomes assessed are low birth weight, substance exposure, foster care, health status, mortality, emergency department visits, and wellness visits. The results suggested differences between the two groups, especially for emergency department visits. However, the differences were not as large as previous research has found, which suggests that the gap between the two groups may be closing. The chapter also includes recommendations to support adolescent mothers.
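
Matching on the mother's demographics could look roughly like the sketch below. The abstract does not specify the matching procedure, so this assumes exact one-to-one matching without replacement. The covariates ("state", "year"), the record layout, and the function name are hypothetical.

```python
from collections import defaultdict

def match_exact(study, controls, keys):
    """Pair each study record with one unused control that has identical
    values for every covariate in `keys`."""
    pool = defaultdict(list)
    for record in controls:
        pool[tuple(record[k] for k in keys)].append(record)

    pairs = []
    for record in study:
        candidates = pool[tuple(record[k] for k in keys)]
        if candidates:
            # Matching without replacement: each control is used at most once.
            pairs.append((record, candidates.pop()))
    return pairs

# Hypothetical records; the covariates are illustrative only.
study = [
    {"id": "s1", "state": "GA", "year": 2012},
    {"id": "s2", "state": "NC", "year": 2012},
]
controls = [
    {"id": "c1", "state": "GA", "year": 2012},
    {"id": "c2", "state": "GA", "year": 2013},
]

pairs = match_exact(study, controls, keys=("state", "year"))
matched_ids = [(s["id"], c["id"]) for s, c in pairs]
```

Here only the first study record finds a control with the same covariate values. Unmatched study records are dropped, which is one common design choice but not necessarily the one used in the thesis.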

In chapter 4, we assess statistical learning methods for a difference-in-differences (DID) study setting. DID analyses typically rely on parametric statistical models that make strong assumptions about the unknown underlying functional form of the data. In this study, we extend existing statistical machine learning methods to target a DID parameter, defined nonparametrically, while considering a larger nonparametric model space that makes fewer assumptions. We develop a general framework for DID designs that allows researchers to estimate causal or statistical effect quantities using machine learning while providing statistical inference. We demonstrate its performance through a simulation in which we compare it to more traditional methods. We then apply the method to estimate the effects of episode-based bundled payments on perinatal spending. In the study, we find that our approach reduces bias (by upward of 50%) for lower effect sizes.
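
For readers unfamiliar with DID, the sketch below shows the classical two-group, two-period estimate: the change in the treated group minus the change in the comparison group. This is the simple difference-of-means version, not the machine learning estimator developed in the thesis. The function name and the values are hypothetical.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classical 2x2 difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # Subtracting the control change removes the shared time trend,
    # assuming both groups would otherwise have moved in parallel.
    return treated_change - control_change

# Hypothetical spending values (arbitrary units), not from the study.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [9.0, 10.0, 8.0]
control_pre = [10.0, 11.0, 12.0]
control_post = [11.0, 12.0, 13.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
```

In this toy example the treated group falls by 2 units while the comparison group rises by 1, so the estimate is a 3-unit reduction. The estimate is only as credible as the parallel-trends assumption noted in the comment.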

Chapter 5 applies machine learning to the problem of edge weight estimation for social networks. Social network analysis can be used to visualize, quantify, and assess relationships between entities. Within healthcare, social networks can be used to quantify the impact of social influence on healthcare interventions. Algorithms have been used to predict information on social networks, such as edge existence, or similarity measures, such as common neighbors. However, little research focuses on weighted graphs, and even less addresses the estimation of their edge weights. Accurate weight estimation can serve as a data quality tool to check whether the weights in the data are correct, or to indicate where we would expect new stronger (or weaker) relationships to occur next. This study evaluates the performance of three estimators, including an ensemble machine learning approach, to predict the edge weights of a weighted social network. We use a faculty hiring example to compare the three methods.

Chapter 6 is the conclusion of the thesis. It discusses the overall impact of the research with respect to health care policy and techniques for administrative claims data. It also proposes future work and additional applications of the research.

 

Additional Information

In Campus Calendar
No
Groups

Graduate Studies

Invited Audience
Faculty/Staff, Public, Graduate students, Undergraduate students
Categories
Other/Miscellaneous
Keywords
PhD Defense
Status
  • Created By: Tatianna Richardson
  • Workflow Status: Published
  • Created On: Jul 28, 2020 - 5:12pm
  • Last Updated: Jul 28, 2020 - 5:12pm