About the Author
My name is Vi Lam, and I am currently in my last year of medical school at the Philadelphia College of Osteopathic Medicine (PCOM). Before attending medical school, I was involved in research projects involving big data and machine learning applications. This experience led me to City of Hope, and I carried out my project under the guidance of Dr. James Lacey of the California Teachers Study (CTS). I got to work with the CTS questionnaire data and the California Department of Health Care Access and Information (HCAI) hospitalization data. This project taught me to quickly navigate large data sets, perfect my data storytelling, and create compelling visualizations.
My Research Project
During my clinical rotations for school, I had the chance to assist in multiple surgeries. One of the most common procedures was cholecystectomy, which is the removal of the gallbladder. This summer, I wanted to learn more about the outcomes of this routine procedure, specifically the risk factors for post-cholecystectomy complications. I was curious to see which factors might impact a patient's surgical outcomes.
To help answer this question, I used the cholecystectomy procedure codes from the Agency for Healthcare Research and Quality (AHRQ) Quality Indicators to select all CTS participants who underwent cholecystectomy between 1995 and 2020. The study population consisted of 4,314 women who underwent cholecystectomy as a primary procedure.
Within this cohort, participants were divided into two groups: those who had 'Complications' (1,500 women) and those who had 'No complications' (2,814 women). Participants were considered to have 'Complications' if they met any of the following criteria: length of stay > 3 days; hospital readmission within 30 days; or a hospital discharge other than “Home”, meaning they were discharged to a location that was not their home.
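The three criteria above amount to a simple "any of" rule. The sketch below shows one way that rule could be written in Python. The `Stay` record and its field names are hypothetical and chosen only for illustration; they are not the actual CTS or HCAI variable names, and the example stays are made up.

```python
from dataclasses import dataclass


@dataclass
class Stay:
    """One hospitalization (hypothetical field names, not real HCAI columns)."""
    length_of_stay_days: int
    readmitted_within_30_days: bool
    discharge_disposition: str


def has_complication(stay: Stay) -> bool:
    """Return True if the stay meets ANY of the three criteria in the text."""
    return (
        stay.length_of_stay_days > 3
        or stay.readmitted_within_30_days
        or stay.discharge_disposition != "Home"
    )


# Made-up example stays, not study data.
stays = [
    Stay(2, False, "Home"),                      # meets none of the criteria
    Stay(5, False, "Home"),                      # length of stay > 3 days
    Stay(1, True, "Home"),                       # readmitted within 30 days
    Stay(3, False, "Skilled nursing facility"),  # not discharged home
]

# 1 = 'Complications', 0 = 'No complications'
labels = [int(has_complication(s)) for s in stays]
print(labels)  # [0, 1, 1, 1]
```

Note that a stay of exactly 3 days does not count on its own, because the criterion is strictly greater than 3 days.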
Patient-level features (age, race, obesity status, smoking, diabetes, cardiovascular events, and gallbladder disease) and hospital-level features (elective vs. emergent procedure, time to procedure, and weekday vs. weekend procedure) were compared between the two groups. Among the population, we found that patient-level characteristics such as age at admission, smoking history, history of hypertension, and diagnosis of morbid obesity at the time of admission are potential indicators of post-op complications. On the other hand, the hospital-level characteristics we examined were not markedly different between the two groups.
I also created two logistic regression (LR) models to estimate the relationship between patient-level characteristics, hospital-level characteristics, and post-procedure complications. Logistic regression is a classification algorithm used to predict a binary outcome (an outcome that can only have one of two values). In this project, the binary outcome is defined as 1 = 'Complications' and 0 = 'No complications'.
These models are used to predict the surgery outcome using historical data (e.g., age, race, existing diagnoses, and day of the surgery). Since LR models do not handle null (missing) values well, only a limited set of inputs was used to train them. Age, race, history of hypertension, and self-reported BMI were used for the ‘Patient Features’ model; admission type, days to procedure, and day of the procedure were used for the ‘Hospital Features’ model.
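To make the idea of a logistic regression concrete, here is a minimal sketch in plain Python that fits one by gradient descent. It is not the software or the settings used in this project, which are not described here. The two input features (a centered age value and a hypertension flag), the toy data, the learning rate, and the number of epochs are all assumptions made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for very negative z
    return ez / (1.0 + ez)


def predict_proba(weights, bias, row):
    """Estimated probability that the outcome is 1 ('Complications')."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(X, y):
            err = predict_proba(weights, bias, row) - target
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up toy data: [age in decades relative to the cohort mean, hypertension 0/1]
X = [[-1.5, 0], [-1.0, 0], [-0.5, 1], [0.0, 0],
     [0.5, 1], [1.0, 1], [1.5, 1], [-0.8, 0]]
y = [0, 0, 0, 0, 1, 1, 1, 0]  # 1 = 'Complications', 0 = 'No complications'

weights, bias = fit_logistic(X, y)
p_low = predict_proba(weights, bias, [-1.5, 0])   # younger, no hypertension
p_high = predict_proba(weights, bias, [1.5, 1])   # older, with hypertension
print(round(p_low, 3), round(p_high, 3))
```

In this toy data the outcome follows age almost perfectly, so the fitted model separates the two example patients easily. Real data are rarely this clean.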
LR models are then evaluated by a metric called the area under the ROC curve (AUC), which ranges from 0 to 1: a value of 0.5 corresponds to random guessing, and a value of 1 corresponds to perfect discrimination. In this case, both of our model AUCs are very close to 0.5, which means that the models are unable to distinguish the two outcomes given the limited input.
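One way to build intuition for AUC is to read it as the probability that a randomly chosen participant with complications receives a higher predicted score than a randomly chosen participant without them. The sketch below computes AUC directly from that pairwise definition. The function name `roc_auc` and the example scores are made up for illustration and are not results from this project.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each outcome")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]  # 1 = 'Complications', 0 = 'No complications'

print(roc_auc(y_true, [0.1, 0.2, 0.8, 0.9]))  # 1.0: every pair ranked correctly
print(roc_auc(y_true, [0.5, 0.5, 0.5, 0.5]))  # 0.5: scores carry no information
print(roc_auc(y_true, [0.9, 0.8, 0.2, 0.1]))  # 0.0: every pair ranked backwards
```

A model whose AUC sits near 0.5 behaves like the middle case: its scores say almost nothing about which outcome a participant will have.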
Future Goals
In the future, I will continue expanding the project by exploring our geographical database. Using the hospital zip codes, we can categorize the hospitals into 'Urban' and 'Rural' areas and investigate the relationship between surgery outcome and hospital location. Furthermore, the time of day when the procedure occurs (morning vs. night shift) could also be a valuable predictor of post-op complications. Lastly, developing a more robust predictive model that uses all of the patient-level and hospital-level features could aid in predicting adverse outcomes and complications after cholecystectomy.