Modernized Data Lake Platform

For an Academic Medical Center on the East Coast

The client envisioned a data platform that would deliver consistent, accurate analytics and insights to the right person, in the right format, at the right time, every time, securely and easily. The organization has around 13,000 employees, 2,000 physicians, 1,116 licensed beds, and 400,000 lives in value-based contracts. A core aim was to create a unified platform for data management, analytics, and advanced AI to address use cases such as disease prediction, medical image classification, clinical operations optimization, and biomarker discovery.

To fulfill this vision, the client awarded 314e the build and implementation services contract to develop a modern business intelligence platform with the Databricks Enterprise Lakehouse as the underlying technology. The first use case for the Modernized Data Lake Platform was the Epic EMR implementation. As part of that implementation, hospital executives collectively decided to perform data and document (scanned notes and images) conversion using the Modernized Data Lake Platform, delivering much-needed efficiencies and cost savings for the system.

Epic was implemented across all of the client's entities to address several challenges: sudden patient surges caused by the COVID-19 pandemic, high patient volume and acuity, and staffing shortages across the system.

Solution Approach

Following the vendor partnership with the client, 314e began implementing and continuously improving the Modernized Data Lake Platform, using the open FHIR specification (R4; US Profile) as the underlying schema and mapping each FHIR resource to a corresponding Delta table.
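The resource-to-table mapping described above can be illustrated with a minimal sketch. It is not 314e's implementation: the `fhir_r4` schema name, the snake_case naming rule, and the chosen Patient columns are all assumptions made for illustration, and the code uses plain Python rather than Spark so that it runs anywhere.

```python
import re


def delta_table_name(resource_type: str, schema: str = "fhir_r4") -> str:
    """Derive a table name from a FHIR resource type.

    The schema name and the snake_case convention are hypothetical;
    the source only states that each resource maps to one Delta table.
    """
    # Insert "_" before every capital letter except the first, then lowercase.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", resource_type).lower()
    return f"{schema}.{snake}"


def flatten_patient(resource: dict) -> dict:
    """Flatten a FHIR R4 Patient resource into one table row.

    Only a few top-level Patient elements are kept; the column names
    are illustrative, not taken from the source.
    """
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    # Patient.name is a list of HumanName; use the first entry if present.
    name = (resource.get("name") or [{}])[0]
    return {
        "id": resource.get("id"),
        "family_name": name.get("family"),
        "given_name": " ".join(name.get("given", [])),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
    }
```

In a Spark job, rows shaped like the output of `flatten_patient` would be written to the table named by `delta_table_name("Patient")`; the write step is omitted because it depends on the cluster setup.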

Bulk and incremental loads were created to build a single store of enterprise-wide data sourced from disparate systems (50+ EMRs). The implementation covered the configuration and prerequisite work needed to get started with Databricks on Microsoft Azure; setting up ETL pipelines; extracting data and transforming it with Python-based Spark ETL jobs into FHIR (for clinical systems) and the Microsoft Common Data Model (for non-clinical systems); and creating REST-based APIs, using Microsoft Azure App Service for FHIR or Spark SQL/Delta jobs, to support a wide variety of use cases. The Modernized Data Lake Platform also hosted an AI/ML optimization accelerator that offered a feature-rich data-input mechanism for predictive modeling.
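The difference between a bulk load and an incremental load can be sketched as a watermark-based upsert. This is an illustration under stated assumptions, not the platform's pipeline: the `id` and `last_updated` field names are hypothetical, timestamps are assumed to be ISO 8601 strings in the same format and time zone (so string comparison matches chronological order), and an in-memory dictionary stands in for the target table. On Databricks, this kind of upsert is typically expressed with Delta Lake's MERGE operation.

```python
def incremental_upsert(target: dict, source_rows: list, watermark: str) -> str:
    """Apply rows changed after `watermark` to `target`, keyed by row id.

    Returns the new watermark (the latest `last_updated` value applied),
    or the old watermark if nothing changed. Field names are assumptions.
    """
    new_watermark = watermark
    for row in source_rows:
        # Skip rows already covered by a previous load.
        if row["last_updated"] <= watermark:
            continue
        # Insert new keys and overwrite existing ones (upsert semantics).
        target[row["id"]] = row
        if row["last_updated"] > new_watermark:
            new_watermark = row["last_updated"]
    return new_watermark
```

A bulk load corresponds to calling the function with a watermark earlier than every source row, so that all rows are applied; later runs pass the watermark returned by the previous run.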

314e consultants extracted legacy data stored in the Modernized Data Lake Platform using Databricks Spark SQL jobs and generated HL7v2 messages and CCDs for conversion into Epic. The process used a repeatable Databricks-based ETL accelerator to perform conversion tasks for the 60+ systems in scope.
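To make the HL7v2 generation step concrete, the sketch below builds the MSH and PID segments of a message from one flattened patient row. It is a simplified fragment, not the accelerator itself: the source does not say which message types were produced, so the `ADT^A04` type, the `DATALAKE`, `LEGACY`, `EPIC`, and `HOSPITAL` identifiers, the 2.5.1 version, and the input column names are all assumptions. A complete ADT message would carry further segments (such as EVN and PV1) that are omitted here.

```python
from datetime import datetime

# FHIR administrative gender mapped to HL7v2 administrative sex codes.
SEX_CODES = {"female": "F", "male": "M", "other": "O", "unknown": "U"}


def escape_hl7(value: str) -> str:
    """Escape HL7v2 delimiter characters inside a data value."""
    # The escape character itself must be handled first.
    for char, code in (("\\", "\\E\\"), ("|", "\\F\\"), ("^", "\\S\\"),
                       ("&", "\\T\\"), ("~", "\\R\\")):
        value = value.replace(char, code)
    return value


def build_adt_message(patient: dict, control_id: str, timestamp: datetime) -> str:
    """Build MSH and PID segments for one patient row (illustrative only)."""
    ts = timestamp.strftime("%Y%m%d%H%M%S")
    msh = "|".join([
        "MSH", "^~\\&",            # MSH-1 is the "|" itself; MSH-2 encoding chars
        "DATALAKE", "LEGACY",      # sending application / facility (hypothetical)
        "EPIC", "HOSPITAL",        # receiving application / facility (hypothetical)
        ts, "",                    # message date/time, security (empty)
        "ADT^A04", control_id,     # message type (assumed), control ID
        "P", "2.5.1",              # processing ID, version (assumed)
    ])
    pid = "|".join([
        "PID", "1", "",
        f"{escape_hl7(patient['mrn'])}^^^LEGACY^MR",   # PID-3 identifier
        "",
        f"{escape_hl7(patient['family_name'])}^{escape_hl7(patient['given_name'])}",
        "",
        patient["birth_date"].replace("-", ""),        # PID-7 as YYYYMMDD
        SEX_CODES.get(patient["gender"], "U"),         # PID-8
    ])
    # HL7v2 segments are terminated by a carriage return.
    return "\r".join([msh, pid]) + "\r"
```

Escaping matters in conversion work because legacy free-text values can contain `|` or `^`, which would otherwise shift every field that follows.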

Business Outcomes

  • Built on the Databricks Lakehouse, the Modernized Data Lake Platform fostered interoperability by enabling collaboration between CTSI’s modules and the platform’s FHIR-based solutions.
  • This arrangement eliminated the need to stand up a separate data platform for research purposes, thereby saving millions of dollars in research costs.
  • The Modernized Data Lake Platform also supported several use cases, including regulatory/compliance reporting, data archival, machine learning and artificial intelligence processes, and real-time disease surveillance.