Encrypted or Restricted Variables

Encrypted Physician and Hospital Identifiers

The SEER registries require that the identity of providers (physicians and hospitals) be protected. Therefore, provider identifiers included in the SEER-Medicare claims are encrypted. These include the Unique Physician Identification Number (UPIN), National Provider Identifier (NPI), the provider Taxpayer ID number (tax_num), and hospital provider number (hospital NPI). These numbers are encrypted in a similar manner across files and years, making it possible to track the same hospital or physician in the SEER-Medicare data over time.
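Because the encryption is consistent across files and years, an encrypted identifier can serve as a stable grouping key. The following is a minimal Python sketch of that idea, not an NCI-provided method. The field names "enc_npi" and "year" and the sample values are hypothetical and do not correspond to actual SEER-Medicare variable names.

```python
from collections import defaultdict

def years_by_provider(claims):
    """Map each encrypted provider ID to the sorted years in which it appears."""
    seen = defaultdict(set)
    for claim in claims:
        # "enc_npi" and "year" are assumed, illustrative field names.
        seen[claim["enc_npi"]].add(claim["year"])
    return {provider: sorted(years) for provider, years in seen.items()}

# Made-up records; the identifiers are placeholders, not real encrypted values.
claims = [
    {"enc_npi": "A1X9", "year": 2011},
    {"enc_npi": "A1X9", "year": 2013},
    {"enc_npi": "Q7Z2", "year": 2012},
    {"enc_npi": "A1X9", "year": 2011},
]
provider_years = years_by_provider(claims)
```

The same encrypted value identifies the same provider in every file, so no decryption is needed for within-data longitudinal analyses.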

Investigators may want information about providers that requires linkage to other data sources by using unencrypted provider numbers; therefore, NCI has identified processes to facilitate such linkages.

Physician Identifiers

Many investigators want to link to data about physicians from the American Medical Association (AMA). Because NCI no longer releases unencrypted physician identifiers, it has established a process to support such linkages. To link to the AMA data, investigators should complete the following steps:

  1. Identify the encrypted provider numbers from the Medicare data. Physicians' identifiers are the UPINs or NPIs found on the carrier and outpatient files.
  2. Send the encrypted provider numbers to NCI's information technology contractor, IMS Inc. Please e-mail the provider numbers to Bob Banks. IMS will unencrypt the provider numbers and send them to the AMA.
  3. AMA will return to IMS their data linked to the unencrypted provider number.
  4. IMS will re-encrypt the file and return to the investigator a file with encrypted provider numbers and the selected AMA variables.
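Step 1 above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the field names "enc_upin" and "enc_npi", the output column name, and the sample records are all hypothetical, and the format IMS expects for the submitted list is not specified here.

```python
import csv
import io

def collect_provider_ids(*claim_files, id_fields=("enc_upin", "enc_npi")):
    """Return the sorted, de-duplicated encrypted IDs found in the given files."""
    ids = set()
    for records in claim_files:
        for rec in records:
            for field in id_fields:
                # Treat missing, None, and blank values as absent.
                value = (rec.get(field) or "").strip()
                if value:
                    ids.add(value)
    return sorted(ids)

def write_id_list(ids, handle):
    """Write one encrypted ID per row under an assumed header name."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["encrypted_provider_id"])
    for pid in ids:
        writer.writerow([pid])

# Made-up carrier and outpatient records with placeholder identifiers.
carrier = [{"enc_upin": "U001", "enc_npi": "N100"}, {"enc_upin": "", "enc_npi": "N200"}]
outpatient = [{"enc_npi": "N100"}, {"enc_upin": "U002", "enc_npi": None}]

provider_ids = collect_provider_ids(carrier, outpatient)
buffer = io.StringIO()
write_id_list(provider_ids, buffer)
```

De-duplicating before submission keeps the list limited to the distinct providers that actually appear in the investigator's claims.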

Researchers who are seeking AMA data should direct any inquiries to AMA's programming contractor, Medical Marketing Services, Inc.:

Tom Lorge
Medical Marketing Services, Inc.
185 Hansen Court, Suite 110
Wood Dale, IL 60191
Phone: 630-477-1564
Fax: 630-350-1896
t-lorge@mmslists.com

Hospital Identifiers

In order for NCI to release unencrypted hospital numbers, investigators must obtain permission from each of the SEER registries as described below.

Geographic Identifiers

The patient's county of residence is available on the PEDSF (FIPS codes) and in the Medicare files (SSA codes). To protect patient and provider identification, NCI encrypts other geographic variables, including the patient's census tract and ZIP code, the physician ZIP code, and the hospital ZIP code. Separate files are provided that contain geographically-based (ZIP code and census tract level) socioeconomic information from the 1990 and 2000 Censuses and the 2008-2012 American Community Survey. These files can be matched to the claims files by the encrypted patient census tract and ZIP code. Unencrypted ZIP codes and census tracts can be released only if the investigator obtains permission from each SEER cancer registry.
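The match between the claims files and the socioeconomic files is a lookup on the encrypted geographic key. The Python sketch below illustrates one way to do this. The key name "enc_zip", the measure "median_income", the "area_matched" flag, and the sample records are assumptions made for the example and are not actual file variables.

```python
def attach_area_measures(claims, area_file, key="enc_zip"):
    """Left-join area-level measures onto claims using the encrypted key."""
    lookup = {row[key]: row for row in area_file}
    merged = []
    for claim in claims:
        area = lookup.get(claim.get(key))
        # Copy every area-level column except the join key itself.
        extra = {k: v for k, v in area.items() if k != key} if area else {}
        merged.append({**claim, **extra, "area_matched": area is not None})
    return merged

# Made-up records; encrypted values are placeholders.
claims = [
    {"patient_id": "P1", "enc_zip": "Z9K"},
    {"patient_id": "P2", "enc_zip": "ZZZ"},
]
area_file = [{"enc_zip": "Z9K", "median_income": 52000}]

merged = attach_area_measures(claims, area_file)
```

Claims without a matching area record are kept and flagged rather than dropped, so unmatched records can be counted before analysis.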

HRR-Encrypted ZIP Code Crosswalk

NCI can provide a ZIP code crosswalk file that links the encrypted ZIP codes to the Dartmouth Atlas of Health Care Hospital Referral Regions (HRRs).

More geographic information is available at Geographic Area Data.

Restricted Variables

Oncotype Dx

Genomic Health, Inc. (GHI) developed the Oncotype DX Breast Recurrence Score® assay (Assay), a commercial diagnostic test that predicts 10-year distant recurrence risk based on the expression of 21 genes. The resulting recurrence score is used to better weigh the harms and benefits of chemotherapy, thereby informing treatment decisions. The Assay data are linked to SEER data via a collaboration between NCI and GHI, with IMS acting as the third-party honest broker. The Assay variables that have been linked to SEER data include: Assay, Assay risk group, Assay reason no score, Assay test report date, and Assay months since diagnosis (Appendix A).

Note: Per the agreement with GHI, all approved applications requesting Oncotype Dx variables, and any manuscripts or reports that result from analyses of such data, will be shared with GHI. These documents will be shared with GHI for informational purposes only; all approval decisions will be handled by NCI.

Prostate Cancer - Watchful Waiting Data

Active surveillance (watchful waiting) information has been collected by SEER for prostate cancer cases diagnosed from 2010 to 2015. The data include a variable that indicates whether the initial intent of the physician and patient was to manage the disease by active surveillance or watchful waiting. The decision or plan for active surveillance had to be documented. Information was collected as North American Association of Central Cancer Registries (NAACCR) item RX SUMM-Treatment status. Cases coded as Active Surveillance by at least one facility, with no known reports of initial curative intent treatment, have been recoded as Watchful Waiting = “yes.” All other cases diagnosed 2010 and later have been recoded by default to “No/Unknown.” This latter category includes patients who were recommended treatment but refused it. It also includes patients whose physicians decided not to treat for reasons such as the presence of comorbidities.
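The recode rule described above can be expressed as a short Python sketch. It is an illustration of the stated logic only and is not the code used by SEER. The record layout, the status label "active_surveillance", and the "curative_treatment" flag are hypothetical stand-ins for the underlying NAACCR item values, which are not listed here.

```python
def recode_watchful_waiting(facility_reports):
    """Apply the stated rule to one case's list of facility reports.

    Each report is a dict with an assumed 'status' label and a boolean
    'curative_treatment' flag for initial curative intent treatment.
    """
    any_surveillance = any(
        r["status"] == "active_surveillance" for r in facility_reports
    )
    any_curative = any(r["curative_treatment"] for r in facility_reports)
    # "yes" needs at least one surveillance report and no curative treatment.
    return "yes" if any_surveillance and not any_curative else "No/Unknown"

# Made-up cases for illustration.
surveillance_only = [{"status": "active_surveillance", "curative_treatment": False}]
later_treated = [
    {"status": "active_surveillance", "curative_treatment": False},
    {"status": "treated", "curative_treatment": True},
]
refused = [{"status": "refused", "curative_treatment": False}]

results = [
    recode_watchful_waiting(surveillance_only),
    recode_watchful_waiting(later_treated),
    recode_watchful_waiting(refused),
]
```

As the text notes, refusals and decisions not to treat fall into “No/Unknown,” so that category should not be read as evidence that treatment was given.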

Head and Neck Cancer - HPV

Human papillomavirus (HPV) infection is a prognostic factor for certain Head and Neck malignant tumors. SEER has collected data on the HPV status of patients with Head and Neck tumors, as defined by the following Collaborative Stage (CS) Data Collection System, version 02.05, schemas: Hypopharynx, Nasopharynx, Oropharynx, Pharyngeal Tonsil, Pharynx Other, Palate Soft, Tongue Base. Currently, data are available for patients diagnosed between 2013 and 2015 in SEER areas. The HPV status information received from SEER registries has been recoded into the following categories: 1) HPV Negative; 2) HPV Positive; 3) Unknown/NA.

Alaska Native Tumor Registry Data

In addition to needing NCI approval, all requests for Alaska Native Tumor Registry Data, which include incident cancer among Alaska Natives (Eskimo, Indian, Aleut), require tribal review. As discussed below, if a request is approved at NCI, the investigator will be provided contact information for the Alaska Native Tumor Registry to submit their request.

Requests for unencrypted and/or restricted variables

If investigators determine that unencrypted and/or restricted variables are needed for their analysis, they must go through a special approval process. Investigators must submit their completed application form (DOCX, 34 KB) to the SEER-Medicare contact with a detailed justification for access to the unencrypted and/or restricted variable(s). A completed and signed request form (DOCX, 20 KB) and a list of people that will have access to these data must be included with the request. An NCI staff member will review the application. If requesting unencrypted variables, once NCI supports the request for these variables, the investigators must also obtain permission from each of the registries prior to release of unencrypted variables for that registry. If requesting Alaska Native Tumor Registry Data, investigators must also obtain permission from the Alaska Native Tumor Registry. The SEER-Medicare contact will provide investigators with contact information for the SEER registries. Investigators who are requesting unencrypted and/or restricted variables are encouraged to allow sufficient time to obtain the necessary approvals.

Note: Files with unencrypted variables cannot be stored with regular SEER-Medicare data. In order to combine multiple requests when purchasing data, all requests must have the same permissions for access to any unencrypted variable.