Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: Datasets

Basic Details
Date Posted
Friday, December 11, 2020

Medicare Claims Synthetic Public Use Files (SynPUFs) were created to allow interested parties to gain familiarity with Medicare claims data while protecting beneficiary privacy. These files are intended to promote development of software and applications that utilize files in this format, train researchers on the use and complexities of Centers for Medicare and Medicaid Services (CMS) claims, and support safe data mining innovations. The SynPUFs were created by combining randomized information from multiple unique beneficiaries and changing variable values. This randomization and combining of beneficiary information ensures privacy of health information.

Sentinel uses a distributed data approach in which Data Partners maintain physical and operational control over electronic data in their existing environments. The distributed approach is achieved by using a standardized data structure referred to as the Sentinel Common Data Model (SCDM). Sentinel’s Cohort Identification and Descriptive Analysis (CIDA) tool is a set of SAS macros that allows users to select a cohort of interest. The CIDA tool specifically reads data that is structured in the SCDM.

The Sentinel Operations Center (SOC) has transformed the CMS SynPUFs into the SCDM format as part of an ongoing effort to make Sentinel resources available to external investigators, with the goal of creating a community of investigators who can understand, utilize, and contribute to the Sentinel enterprise.

This page contains:

  • SCDM-formatted SynPUFs datasets in the form of 20 subsamples and their related data element tables: death, demographic, diagnosis, dispensing, encounter, enrollment, and procedure, provided in zipped files.
  • Descriptive statistics of each SynPUFs SCDM subsample and a corresponding data dictionary.
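
After downloading a subsample, it can be useful to confirm that all seven tables listed above are present before running a query against it. The following Python snippet is a minimal sketch of such a check, not part of the Sentinel tooling. It assumes each table is stored as a file whose base name matches the table name; the folder name `subsample_1` and the `.sas7bdat` extension in the demo are hypothetical and may differ from the actual archive layout.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The seven SCDM table names listed on this page.
SCDM_TABLES = ("death", "demographic", "diagnosis", "dispensing",
               "encounter", "enrollment", "procedure")


def missing_tables(archive):
    """Return the SCDM tables with no matching file in the zip archive.

    Assumption: each table is stored as a file whose base name
    (without extension) equals the table name, ignoring case.
    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        stems = {PurePosixPath(name).stem.lower() for name in zf.namelist()}
    return [table for table in SCDM_TABLES if table not in stems]


# Demo on an in-memory archive; the file names here are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("demographic.sas7bdat", "Encounter.sas7bdat"):
        zf.writestr(f"subsample_1/{name}", b"")
buffer.seek(0)
print(missing_tables(buffer))
```

For a downloaded file, pass its path instead of the in-memory buffer. An empty list means every expected table name was found.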

SOC will continue to maintain SCDM 7.0.0-formatted SynPUFs datasets. These are available upon request by e-mailing

Refer to the Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: User Documentation and Example Routine Querying Package page for user documentation, technical specifications, an example routine querying package, and the SynPUFs demonstration report.

Time Period
January 1, 2008 - December 31, 2010
Population / Cohort
Individuals 18 years of age or older
Data Source(s)
SCDM-formatted SynPUFs