SEER-Medicare: About the Data Files
The SEER-Medicare data cover a large number of people, with many records per person. Because of this volume, the term "SEER-Medicare data" actually refers to a series of files. One file contains the SEER data. The remaining files are Medicare files, each covering a specific type of service, such as hospital, physician, or outpatient.
There are two cohorts of people included in the SEER-Medicare data: persons with cancer and a random sample of Medicare beneficiaries who do not have cancer. The "non-cancer" group is drawn from a random 5 percent sample of Medicare beneficiaries residing in the SEER areas. Persons in the 5 percent sample who also appear in the SEER data are removed, leaving a sample of non-cancer cases. Medicare claims are available for the non-cancer cases in the same format as for the cancer cases. Information from the non-cancer group can be used for comparison, for example to estimate the cost of care or the use of specific tests or procedures among Medicare beneficiaries who do not have cancer. Data for the non-cancer cases can also be combined with the data for the cancer cases to conduct population-based analyses of testing, treatment, and costs within the SEER areas.
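The way the non-cancer group is derived can be pictured as a simple exclusion step. The following is only a minimal sketch in Python, not the procedure used to build the actual files: the function name `non_cancer_sample` and the identifier values are hypothetical, and it assumes each person carries a single identifier that is comparable between the 5 percent sample and the SEER data.

```python
def non_cancer_sample(five_percent_ids, seer_ids):
    """Keep people from the 5 percent sample who do not appear in SEER.

    Both arguments are iterables of person identifiers. The identifier
    format is an assumption made for illustration only.
    """
    seer = set(seer_ids)  # set membership keeps the exclusion fast
    return [pid for pid in five_percent_ids if pid not in seer]


# Made-up identifiers, purely illustrative.
sample = ["A01", "A02", "A03", "A04"]   # random 5 percent sample
seer_cases = ["A02", "A04", "B09"]      # persons who appear in SEER

non_cancer = non_cancer_sample(sample, seer_cases)
print(non_cancer)
```

Here the two people who also appear in the SEER list are dropped, and the rest of the sample forms the non-cancer group.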
For the cancer cases, investigators may link individual patients across files using the unique SEER case ID number. The SEER case ID number consists of an 8-digit case number and a 2-digit registry ID that, when combined, uniquely identify each individual in the data. For persons in the non-cancer group, investigators may link their files using an encrypted Health Insurance Claim number that is unique to each individual.
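Linking cancer cases across files amounts to matching on both parts of the SEER case ID together. The sketch below illustrates that idea in Python under stated assumptions: the field names `registry_id` and `case_number`, the helper names `link_key` and `link_claims`, and all values are hypothetical and do not reflect the actual file layouts. The two parts are kept as a pair rather than concatenated, because the order in which they are combined is not stated here.

```python
from collections import defaultdict


def link_key(record):
    """Build the composite key from a record (hypothetical field names).

    Only the combination of registry ID and case number is described as
    unique, so both parts are always used together.
    """
    return (record["registry_id"], record["case_number"])


def link_claims(seer_records, claims):
    """Group claim records under the SEER person they belong to."""
    claims_by_person = defaultdict(list)
    for claim in claims:
        claims_by_person[link_key(claim)].append(claim)
    # Persons with no matching claims get an empty list.
    return {link_key(p): claims_by_person.get(link_key(p), [])
            for p in seer_records}


# Made-up records: the same case number under two different registry IDs.
seer_records = [
    {"registry_id": "02", "case_number": "00000123"},
    {"registry_id": "20", "case_number": "00000123"},
]
claims = [
    {"registry_id": "02", "case_number": "00000123", "file": "hospital"},
    {"registry_id": "20", "case_number": "00000123", "file": "physician"},
    {"registry_id": "02", "case_number": "00000123", "file": "outpatient"},
]

linked = link_claims(seer_records, claims)
for key, person_claims in linked.items():
    print(key, [c["file"] for c in person_claims])
```

Matching on the case number alone would merge the two people in this example, which is why the key uses both fields. For the non-cancer group, the same grouping would apply with the encrypted Health Insurance Claim number as a single-field key.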