Complete Guide to Data Model Conversion to the OMOP Common Data Model

Akshita Mehta
May 6, 2021

Introduction

The OMOP (Observational Medical Outcomes Partnership) Common Data Model (CDM) supports systematic analysis of disparate observational databases. The idea is to transform the data in these databases into a standard format (a common data model) and standard representations (terminologies, vocabularies, and coding schemes), and then to run systematic analyses using a library of standard analytical routines written against that format.
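As a minimal sketch of what "standard format and standard representations" means in practice, the Python snippet below maps one row from a hypothetical source system to the shape of the OMOP person table. The source field names (pat_id, sex, dob) are assumptions made for illustration. The gender concept IDs 8507 (male) and 8532 (female) are the standard OMOP concepts, and 0 is the usual convention for a value with no standard mapping.

```python
from datetime import date

# Standard OMOP gender concepts; 0 marks a value with no standard mapping.
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def to_omop_person(source_row: dict) -> dict:
    """Map one hypothetical source patient row to OMOP person columns."""
    birth = date.fromisoformat(source_row["dob"])
    sex = source_row["sex"].strip().upper()
    return {
        "person_id": int(source_row["pat_id"]),
        "gender_concept_id": GENDER_CONCEPTS.get(sex, 0),
        "year_of_birth": birth.year,
        "month_of_birth": birth.month,
        "day_of_birth": birth.day,
        # Keep the original value so the mapping stays auditable.
        "gender_source_value": source_row["sex"],
    }


# Hypothetical source row, not real patient data.
person = to_omop_person({"pat_id": "1001", "sex": "f", "dob": "1984-03-17"})
print(person)
```

The format part is the fixed set of column names; the representation part is the replacement of the free-text "f" with a shared concept ID that every OMOP database uses.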

OMOP V6 Data Model

Observational Health Data Sciences and Informatics (OHDSI) is an independent, multi-stakeholder, interdisciplinary collaborative rather than a project of any single company. Its participants include IQVIA, a global health information technology and clinical research corporation based in the United States. Stakeholders collaborate to perform outcomes-based research at scale as part of an international research ecosystem that supports open science.

For systematic data characterization and analytics, data obtained from a data provider can be transformed into the standard OMOP CDM. By combining standardized analytical tools and methods with the OMOP CDM and its Standardized Vocabularies, OHDSI standardizes the way research is conducted.

What is the Need?

Observational data, such as electronic health records (EHRs) and claims databases, can help researchers better understand patients and support better healthcare decisions. However, because these databases are so disparate, almost every one represents patient data in its own way.

The datasets of healthcare organizations are typically built on an organization-by-organization basis. This makes them difficult to analyze: on the one hand, one must understand the structure of the dataset in question, and on the other, analytics tools must be tailored to each study. Comparing datasets from two different organizations is harder still.

Converting these varied clinical data formats into a Common Data Model allows questions, analyses, and studies to be defined once and run against any OMOP CDM database.
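The "define once, run anywhere" idea can be sketched with two in-memory SQLite databases standing in for two organizations. This is an illustration under stated assumptions: the site names and rows are invented, only a few condition_occurrence columns are created, and the concept ID is a placeholder that would be looked up in the vocabulary in a real study.

```python
import sqlite3

# One analysis, written once against OMOP CDM table and column names.
PERSONS_WITH_CONDITION = """
    SELECT COUNT(DISTINCT person_id)
    FROM condition_occurrence
    WHERE condition_concept_id = ?
"""

# Placeholder concept ID for the condition of interest (an assumption).
CONCEPT_OF_INTEREST = 201826


def make_site(rows):
    """Build an in-memory stand-in for one site's OMOP CDM database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "condition_occurrence_id INTEGER PRIMARY KEY, "
        "person_id INTEGER, "
        "condition_concept_id INTEGER, "
        "condition_start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)", rows
    )
    return conn


# Invented rows: (occurrence id, person id, concept id, start date).
sites = {
    "site_a": make_site([
        (1, 1, CONCEPT_OF_INTEREST, "2020-01-05"),
        (2, 1, CONCEPT_OF_INTEREST, "2020-06-01"),
        (3, 2, 999999, "2020-02-11"),
    ]),
    "site_b": make_site([
        (1, 10, CONCEPT_OF_INTEREST, "2019-03-20"),
        (2, 11, CONCEPT_OF_INTEREST, "2019-08-02"),
    ]),
}

# The same query text runs unchanged at every site.
counts = {
    name: conn.execute(PERSONS_WITH_CONDITION, (CONCEPT_OF_INTEREST,)).fetchone()[0]
    for name, conn in sites.items()
}
print(counts)
```

Because both sites expose the same table and column names, nothing in the query has to be adapted per organization; only the connection changes.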

Its standardized schema makes the OMOP Common Data Model versatile. It provides tables for data commonly required in clinical trials and observational research, such as drug use and procedures performed. These tables sit in the clinical data section of the clinical data warehouse. In some cases it may be necessary to extend the current OMOP Common Data Model. The model also incorporates the most widely used terminologies, such as RxNorm, ICD10CM, and SNOMED CT.
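To make the role of the bundled terminologies more concrete, here is a small sketch of looking up a code in a cut-down concept table. It assumes only five of the concept table's columns, and the three rows are a hand-typed illustration rather than an extract from the real vocabulary; in particular, the concept IDs 1 to 3 are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    "concept_id INTEGER PRIMARY KEY, concept_name TEXT, "
    "vocabulary_id TEXT, concept_code TEXT, standard_concept TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        # Placeholder concept IDs; 'S' flags a standard concept.
        (1, "Type 2 diabetes mellitus", "SNOMED", "44054006", "S"),
        (2, "Type 2 diabetes mellitus without complications",
         "ICD10CM", "E11.9", None),
        (3, "metformin", "RxNorm", "6809", "S"),
    ],
)


def find_concept(vocabulary_id, concept_code):
    """Return (concept_id, concept_name, standard_concept) or None."""
    return conn.execute(
        "SELECT concept_id, concept_name, standard_concept "
        "FROM concept WHERE vocabulary_id = ? AND concept_code = ?",
        (vocabulary_id, concept_code),
    ).fetchone()


print(find_concept("RxNorm", "6809"))
print(find_concept("ICD10CM", "E11.9"))
```

Keeping every terminology in one table, distinguished by vocabulary_id, is what lets a single lookup work the same way for drug, diagnosis, and procedure codes.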

Its Implementation Explained

The main steps for performing OMOP CDM conversion for a new data source (received from the client or the vendor), including those that require manual intervention, are as follows:

Data Model Conversion

STEP-1: Import the source data set into an RDBMS or Hadoop environment.

STEP-2: Analyze the source data.

STEP-3: Perform data quality management (DQM) on the source data.

STEP-4: Identify the business and ETL (Extract, Transform, Load) conversion rules.

STEP-5: Write the ETL scripts that convert the source data to the OMOP format.

STEP-6: Map source codes to the OMOP vocabulary.

STEP-7: Execute the ETL scripts.

STEP-8: Perform data quality management and data profiling on the OMOP data.

STEP-9: Export the converted OMOP data set for analytics.
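Steps 5 through 8 can be sketched in a few lines of Python. This is a toy illustration, not a production ETL: the source field names (patient_id, code_system, code, diagnosed_on), the lookup table, and its concept IDs are all hypothetical, and a real conversion would derive the mapping from the OMOP vocabulary tables and load into a database instead of a list.

```python
from datetime import date

# Step 6: hypothetical source-code to standard-concept lookup.
# The concept IDs are placeholders, not real vocabulary entries.
SOURCE_TO_STANDARD = {
    ("ICD10CM", "E11.9"): 1001,
    ("ICD10CM", "I10"): 1002,
}


def transform(source_rows):
    """Steps 5 and 7: build condition_occurrence rows from source rows."""
    out = []
    for i, row in enumerate(source_rows, start=1):
        key = (row["code_system"], row["code"])
        out.append({
            "condition_occurrence_id": i,
            "person_id": row["patient_id"],
            # 0 marks a source code with no standard mapping.
            "condition_concept_id": SOURCE_TO_STANDARD.get(key, 0),
            "condition_start_date": date.fromisoformat(row["diagnosed_on"]),
            "condition_source_value": row["code"],
        })
    return out


def quality_report(omop_rows):
    """Step 8: two simple checks on the converted rows."""
    total = len(omop_rows)
    unmapped = sum(1 for r in omop_rows if r["condition_concept_id"] == 0)
    future = sum(
        1 for r in omop_rows if r["condition_start_date"] > date.today()
    )
    return {
        "rows": total,
        "unmapped": unmapped,
        "future_dates": future,
        "mapped_fraction": (total - unmapped) / total if total else 0.0,
    }


# Invented source rows; "XYZ" has no entry in the lookup.
source = [
    {"patient_id": 1, "code_system": "ICD10CM", "code": "E11.9", "diagnosed_on": "2019-04-02"},
    {"patient_id": 1, "code_system": "ICD10CM", "code": "I10", "diagnosed_on": "2019-05-10"},
    {"patient_id": 2, "code_system": "ICD10CM", "code": "E11.9", "diagnosed_on": "2020-01-15"},
    {"patient_id": 3, "code_system": "LOCAL", "code": "XYZ", "diagnosed_on": "2020-02-20"},
]

omop_rows = transform(source)
report = quality_report(omop_rows)
print(report)
```

Unmapped codes are kept with concept ID 0 and their original source value instead of being dropped, so the quality report can surface them and the mapping rules from step 4 can be revised.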

Benefits of OMOP Conversion

  • Harmonization of data
  • Reducing or eliminating the challenges posed by unique data sources
  • Enabling systematic and collaborative research
