Data Sources
In collaboration with data core resources and services throughout The Ohio State University, CTSI gives investigators, trainees and research staff at Ohio State and Nationwide Children’s Hospital access to a variety of data sources, many of which are showcased here.
Administrative Databases
Administrative/billing databases are large, complex databases that offer near-limitless possibilities for studying research questions. Although they have shortcomings, their flexibility, size and scope make them ideal candidates for many study questions.
CMS Standard Analytic Files
The Centers for Medicare and Medicaid Services (CMS) Standard Analytic Files (SAFs) contain inpatient, outpatient, skilled nursing facility and hospice data for millions of publicly insured individuals across the United States.
IBM MarketScan
IBM's MarketScan contains de-identified inpatient, outpatient, skilled nursing facility and drug billing data for millions of privately insured individuals across the United States.
Healthcare Cost and Utilization Project (HCUP)
The Healthcare Cost and Utilization Project (HCUP) is a collection of healthcare databases that were developed through a Federal-State-Industry partnership and contain a large collection of longitudinal hospital care data.
Electronic Health Record (EHR) Databases
EHR-based databases extract data directly from electronic health records in medical institutions.
The Ohio State University Wexner Medical Center EHR
At Ohio State, the primary source of EHR data is the Ohio State Wexner Medical Center EHR, housed in Epic databases. Clinical data for all Ohio State Wexner Medical Center patients from 2011 onwards are periodically extracted, transformed and loaded into the Epic Clarity and Caboodle databases. In addition to their operational use, these data are available for research via Honest Brokers, with IRB and Honest Broker Committee approval. Researchers can submit an Honest Broker request using the Research IT Data Request Form.
The other EHR-based databases showcased below can offer data to answer similar research questions, with varying levels of accessibility, scope and flexibility.
Epic Cosmos
Epic Cosmos is a continuously updated limited data set, contributed to Epic by participating organizations (including Ohio State Wexner Medical Center), that includes the electronic health records of hundreds of millions of patients.
LifeScale
LifeScale is an Ohio State-specific, honest broker-mediated, coded-limited database developed in partnership with Microsoft, the College of Optometry, the College of Dentistry and OSUCCC-James (including Tumor and Registry data). It includes data not often found in electronic medical records.
PCORnet CDM
The Ohio State University is part of the National Patient-Centered Clinical Research Network (PCORnet), which is funded by the Patient-Centered Outcomes Research Institute (PCORI). PCORnet comprises ~75 academic medical centers and health systems, grouped into eight Clinical Research Networks (CRNs) spread across the United States.
Cancer Databases
National Cancer Database (NCDB)
The National Cancer Database is a clinical oncology database sourced from hospital registries in more than 1,500 accredited cancer facilities. As such, it is a good resource for researchers looking to study a particular cancer population.
NIH NCI Surveillance, Epidemiology and End Results (SEER)
The Surveillance, Epidemiology and End Results (SEER) Program was developed by the National Cancer Institute (NCI) to capture cancer incidence and survival in the United States. The data in SEER databases are collected from population-based cancer registries representing approximately 48% of the U.S. population. Using SEER data, investigators can assess the associations between demographic, clinical and treatment characteristics and survival outcomes.
Oncology Research Information Exchange Network (ORIEN)
ORIEN is a unique alliance to integrate “big data” and data sharing for cancer research and care. ORIEN was founded by The Ohio State University Comprehensive Cancer Center and Moffitt Cancer Center in May 2014 and now includes 19 member institutions.
Other Nationwide Data Sources
Ohio State Federal Statistics Research Data
The Federal Statistics Research Data Center (FSRDC) is a collaboration with the U.S. Census Bureau that provides access to many restricted-access datasets, including individual and firm-level data from federal statistical agencies.
All of Us
The NIH’s All of Us Research Program is one of the largest biomedical data resources, with health data from a diverse group of participants across the United States.
National (Nationwide) Inpatient Sample (NIS)
As a member of the HCUP family of databases, the National (Nationwide) Inpatient Sample (NIS) is the largest publicly available all-payer inpatient healthcare database in the United States, providing national estimates of hospital inpatient stays.
Nationwide Readmissions Database (NRD)
As a member of the HCUP family of databases, the Nationwide Readmissions Database (NRD) is designed to support different types of analyses involving national readmission rates for all payers and the uninsured.
Nationwide Emergency Department Sample Database (NEDS)
The Nationwide Emergency Department Sample (NEDS) is designed to support different types of analyses involving all-payer, hospital-owned emergency department visits. NEDS is part of the family of databases developed for HCUP.
Kids’ Inpatient Database (KID)
The Kids’ Inpatient Database (KID) is designed to support different types of analyses involving national hospital stays for patients younger than 21. It contains pediatric discharges for all payers (including Medicaid and private insurance) as well as for uninsured patients. KID is part of the family of databases developed for HCUP.
Need Data Support? Ask the Data Navigator.
The Ohio State University offers numerous data sources for researchers, but navigating these resources effectively can be daunting. To assist faculty and staff in this process, our Data Navigator acts as the first point of contact. The Data Navigator provides a high-level explanation of the regulatory and institutional processes for these databases, identifies potential collaborators and addresses initial data-related questions. The Data Navigator also helps clarify requests, coordinates efforts and links investigators with domain experts for the different data sources.