Informatics Insights with the NMEDW Team

The Northwestern Medicine Enterprise Data Warehouse (NMEDW) is a comprehensive and integrated repository of all clinical and research data sources on campus. The EDW facilitates research, clinical quality improvement, healthcare operations, and medical education. It is a joint initiative between Feinberg School of Medicine and Northwestern Memorial.

In this Q&A, we learn about some of the people who work on the EDW every day, innovating to make sure the Northwestern research community has access to high quality data and analysis.

What’s your background?

Daniel Schneider, Director of Research Analytics
I have a background in molecular genetics, where I spent over 10 years at the University of Chicago working within the Department of Human Genetics. Over the years, I gradually took on more of the computational aspects of the lab by developing code for various bioinformatic analyses. After completing an MS in Computational Biology & Bioinformatics from Northwestern University in 2008, I transitioned over to the healthcare informatics space and began my career as a data analyst within the newly formed Enterprise Data Warehouse. I am now the Director of Research Analytics at FSM and lead a diverse team of research data analysts, data scientists and data architects with different areas of expertise to help handle the many analytic, machine learning (ML)/AI and data visualization needs seen across campus.

Prasanth Nannapaneni, Senior Technical Architect - Research Analytics
I have been with Northwestern University for over 13 years, currently serving as the lead architect for data architecture in research analytics. I lead a dedicated team of data engineers focused on supporting Northwestern's research mission. Together, we manage a range of projects that benefit various initiatives across the Feinberg School of Medicine, including FSM Finance architecture, the AWOME student data warehouse, common data models, and architecture support for research analysts. I hold a BS in Computer Science from DePaul University.

Zachary Hafen-Saavedra, PhD, Manager of Specialty Analytics - Research Analytics
I am one of the most recent additions to the Research Analytics team, but I first came to Northwestern University in 2014 to pursue a PhD in Computational Astrophysics. After defending my PhD thesis in 2020, a few weeks into the lockdown, I spent a few years leading a research group as a postdoctoral fellow at the University of California, Irvine, before returning to Chicago and taking on a data scientist position at the Adler Planetarium. This is my first position working in a healthcare context, and I am delighted by both how much my skillset transfers and how very much there is for me to learn. I've already gained so much from working with my colleagues in the Research Analytics team, including the five analytics developers on my team.

Philip Silberman, Manager of Research Analytics - Core Team
Prior to joining Northwestern, I worked for Wayfair, where I initially developed a knowledge of programming, leaving that position to pursue a PhD in Mathematics at Indiana University - Bloomington. I left that program after two years and joined the EDW team as an analyst in 2015. Since then, I've built a wealth of knowledge around electronic health records and medical data, moving to a managerial position in 2021. I manage a team of seven analysts who work on a variety of projects to support the research ecosystem at Feinberg.

“The EDW has been a key element of research at FSM for almost 20 years, enabling NU researchers to perform data-intensive research while protecting patient data.”

Philip Silberman (right), pictured with (from left to right) Zachary Hafen-Saavedra, Prasanth Nannapaneni and Daniel Schneider

What does the EDW do?

The Research Analytics team is dedicated to supporting faculty, students and staff researchers by providing high-quality, secure, HIPAA-compliant data and analysis to the Northwestern research community. We are embedded within the Northwestern Medicine Enterprise Data Warehouse and conform to its best practices and data standards. We offer flexible solutions to handle everything from the delivery of small one-time data pulls to larger multi-institutional initiatives requiring ongoing support, analysis and data architecture.

How did the EDW get started? What were some early services?

The EDW was formed around 2006 as a 3-year pilot project with funds to develop a unified data platform that would support the tripartite mission of an academic medical center: to enable research, medical education and clinical operations. This was spearheaded by the then Dean of FSM, Lewis Landsberg, who had the initial vision to understand how important centralizing data could be. Some of the earliest success stories involved combining varying source systems for a more comprehensive view of data across administrative and research purposes. One example of this was the blending of data from the Northwestern Medical Faculty Foundation (NMFF), which housed the majority of the outpatient data for patients, with data from Northwestern Memorial Hospital, which made up most inpatient-related data. Some early services included automated clinical trial patient lists based on inclusion and exclusion criteria, as well as internal hospital dashboards for the Emergency Department and ICU.

What current projects would you like to highlight?

Over the past couple of years, and over the past year especially, the EDW has been migrating to a cloud-hosted framework (see below for more information). The modernized framework will greatly enhance the capabilities of the Research Analytics team and will upgrade data access for FSM researchers. Key features include the ability to securely process data with state-of-the-art ML/AI models, enhanced dashboarding capabilities, and the ability to integrate advanced scientific analyses into the data delivery workflow.

In the Medical Education space, our team developed a process to automatically score student assessments based on a standardized rubric using a large language model (LLM), including a confidence score so that faculty members can review any low-confidence results. This project saved countless hours of manual review for the leadership team and streamlined the entire feedback process.
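For readers curious how a confidence score can route results to a human reviewer, here is a minimal Python sketch. It is an illustration only, not the team's actual pipeline: the `ScoredAssessment` record, the `split_for_review` helper and the 0.8 threshold are all hypothetical, and the LLM scoring step itself is assumed to have already produced a rubric score and a confidence value between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class ScoredAssessment:
    """One LLM-scored assessment (hypothetical record layout)."""
    student_id: str
    rubric_score: int
    confidence: float  # assumed to range from 0.0 to 1.0


def split_for_review(results, threshold=0.8):
    """Separate results into auto-accepted and flagged-for-faculty-review.

    The threshold is an assumed, tunable value; anything scored with
    confidence below it is routed to a human reviewer.
    """
    accepted, needs_review = [], []
    for result in results:
        if result.confidence >= threshold:
            accepted.append(result)
        else:
            needs_review.append(result)
    return accepted, needs_review


results = [
    ScoredAssessment("s1", rubric_score=4, confidence=0.95),
    ScoredAssessment("s2", rubric_score=2, confidence=0.55),
    ScoredAssessment("s3", rubric_score=3, confidence=0.80),
]
accepted, needs_review = split_for_review(results)
print([r.student_id for r in needs_review])
```

In a setup like this, lowering the threshold sends fewer assessments to faculty, while raising it trades reviewer time for more caution.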

“Having a way to seamlessly follow a patient and their data from various outpatient visits and hospital stays gives researchers a more comprehensive picture of a patient's medical experience.”

Daniel Schneider, Director of Research Analytics

Are there any publications you'd like to feature?

Here are some of our favorites. It was hard to narrow down to just three of them!

What has been the greatest challenge for EDW?

The greatest challenge for the NMEDW Research Analytics team is that we sit and operate in the middle of two different organizations, Northwestern University and Northwestern Memorial Healthcare. These two organizations have different governing bodies as well as different legal and administrative policies, and sharing data between them can hit many roadblocks because of those differences. Because we serve as an honest broker between the two organizations, navigating the varying technical and administrative barriers to release data and analyses in time to meet overall project goals can be quite challenging.

NMEDW Research Analytics Cloud Modernization

Over the last two years, the NMEDW Research Analytics team has undergone a complete modernization of its infrastructure and services. The team has migrated the majority of the infrastructure that was on premises over to the NMHC Microsoft Azure cloud tenant.

In doing so, the team has rolled out a completely new data experience for users of NMEDW services.

Some of the Research Analytics team's expanded analytic functionality:

Full integration of LLMs within our analysis pipeline
Integration of PowerBI for reporting and dashboarding
Automated data transfer to study-specific fileshare locations, secured per study team
Creation of research-specific code libraries
Comprehensive machine learning capabilities beyond LLMs
Statistical analysis capabilities
Scaling up of the volume of data that can be processed

A new intake process and end user interface are also available.