Using the SEER-CAHPS Sample Size Estimator

Before you request SEER-CAHPS data, you can use the Sample Size Estimator to estimate the number of records that will match your cohort selection criteria. Please note that a SEER-CAHPS data request uses only the following variables: cancer site, years of cancer diagnosis, survey type, and survey year. When you are conducting your analysis, however, you will want to use additional cohort selection criteria. The Sample Size Estimator can help you estimate your sample size based on these additional criteria. This is only a rough estimate, however, so please note that your actual sample size may differ.

The Sample Size Estimator is a filter. It starts out including all records in the database. Your choices of specific variables and values determine which records will be included in your final sample size estimate. The "See the Cost Calculator" link takes you to the SEER-CAHPS Cost Calculator, where you can estimate the data cost.

SEER-CAHPS Cohort Data

When using data from the SEER-CAHPS database, it’s important to understand a few things about how the Sample Size Estimator defines its cohort. SEER-CAHPS links data from the SEER Cancer Registry with Medicare CAHPS surveys and Medicare claims (for Fee-for-Service beneficiaries only). We’ve limited the potential cohort to patients who have a sequence number of 00 or 01, so only first primary cancers are included. Additionally, patients with unknown Month of Diagnosis or Age are removed.
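The cohort restrictions above can be pictured as a simple eligibility check. The following is only an illustrative sketch: the `Record` class, its field names, and the use of `None` for "unknown" are assumptions made for this example and do not reflect the actual SEER-CAHPS file layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # All field names are hypothetical, chosen for illustration only.
    patient_id: str
    sequence_number: str               # "00" or "01" marks a first primary cancer
    month_of_diagnosis: Optional[int]  # None stands in for "unknown"
    age: Optional[int]                 # None stands in for "unknown"


def in_potential_cohort(rec: Record) -> bool:
    """Return True if the record passes the cohort restrictions described above."""
    return (
        rec.sequence_number in {"00", "01"}
        and rec.month_of_diagnosis is not None
        and rec.age is not None
    )


# Made-up records, not real data.
records = [
    Record("A", "00", 5, 71),
    Record("B", "02", 3, 68),     # not a first primary: excluded
    Record("C", "01", None, 80),  # unknown month of diagnosis: excluded
    Record("D", "01", 11, None),  # unknown age: excluded
    Record("E", "01", 7, 66),
]

cohort = [r.patient_id for r in records if in_potential_cohort(r)]
print(cohort)
```

In this toy data only patients A and E remain, because each of the others fails one of the three conditions.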

The diagnosis years available in SEER are from 1998 to 2019. However, there is no CAHPS survey data for the year 2006. The table below shows the types of survey data available and the years for which they have been collected.

Survey Type | Administered to Following Coverage Type | Care Addressed | Years
FFS-Only | FFS-Only | All aspects of care (there is no Part D) | 2000-2004;
FFS+PDP | FFS+PDP | All aspects of care | 2007-2010
FFS | FFS-Only and FFS+PDP | Non-Part D aspects of care | 2011-2019
Prescription Drug Plan | FFS+PDP | Part D aspects of care | 2011-2019
Medicare Advantage Only (MA-Only) | MA-Only | All aspects of care (there is no Part D) | 1997-2005;
Medicare Advantage Prescription Drug Plan (MA-PD) | MA-PD | All aspects of care | 2007-2019
Medicare Advantage Preferred Provider Organization (PPO) | MA-PPO | All aspects of care | 2009-2012
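
When planning which survey years to combine, it can help to treat the table above as a lookup. The sketch below copies the year ranges exactly as listed in the table; the dictionary name, the shortened survey labels, and the function `surveys_available` are assumptions made for this example and are not part of the Estimator.

```python
from typing import Dict, List, Tuple

# Survey years as listed in the table above (inclusive ranges).
# Keys are shorthand labels chosen for this example.
SURVEY_YEARS: Dict[str, Tuple[int, int]] = {
    "FFS-Only": (2000, 2004),
    "FFS+PDP": (2007, 2010),
    "FFS": (2011, 2019),
    "Prescription Drug Plan": (2011, 2019),
    "MA-Only": (1997, 2005),
    "MA-PD": (2007, 2019),
    "MA-PPO": (2009, 2012),
}


def surveys_available(year: int) -> List[str]:
    """Return the survey types whose listed range includes the given survey year."""
    return [
        name
        for name, (first, last) in SURVEY_YEARS.items()
        if first <= year <= last
    ]


print(surveys_available(2003))
print(surveys_available(2006))
print(surveys_available(2009))
```

The lookup for 2006 returns an empty list, which matches the statement that no CAHPS survey data exists for that year.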

Some patients in the database have participated in more than one of the patient surveys. However, each patient is counted only once, so the number of surveys in an individual’s record does not affect the patient count.

Sample Estimator Controls

  • Dropdown lists are used for all variables, all operators, and some variable values.
  • Text fields are used to key in single numbers or a range of numbers.
  • Drag and drop controls allow you to change the order of your variables. Changing the order will not change the final results but can provide more specific information on the number of cases in the database having particular variable values.
  • Slider controls are used for coverage variables such as Continuous Medicare Parts A & B, Continuous Medicare Part D, and Continuous Period Without HMO Coverage. Drag the dots along the line with your mouse, key the number of months into the text fields, or use the up and down arrows that appear on the right of a selected field. The sliders and the numbers in the fields change together.
    Please note: Persons who are lost to follow-up (e.g., die) within the selected post-diagnosis time window are retained in the sample estimate.
    [Image: slider controls with input boxes underneath]
  1. Click on the Add Variable button to add a new variable. A dropdown list of available variables will open with a filter field at the top. Scroll to the desired variable to select it or begin keying it into the filter field to bring the desired variable to the top of the list to select.
    [Image: adding a variable]
  2. Select an operator from the first dropdown list in the Category section on the left. The operators offered in the dropdown list are dependent on the variable you selected.
  3. Select the variable value(s) for the operator on the right of the Category section. Depending on the operator selected, you may see one or two sliding scale controls of numbers, or a dropdown list. If you have a sliding scale control you can move the blue dot on the control and the value will appear in the text field, or you can key the value into the text field and the blue dot will move to the appropriate point on the line.
    [Image: operator selection]
  4. Select the run search icon (✓) to run the search for that variable line. Each time a variable line is run, the search statement replaces the operator in the Category section and the estimated number of cases for that variable is shown. In the Sample Size Estimate section, a line will appear representing the portion of the estimate and the number of cases. Each new variable line further filters the cases left by the variable lines that come before it.
    [Image: search statements]
  5. To change the order of the variables, click on the triple bar icon in the Order column and drag the variable to a new position. The estimates on the individual variable lines will change based on the order, but the final result will always remain the same.
  6. To edit a line of the search, select the edit icon. The line will open to allow changes to the variable, the operator, and the value(s). If you decide to cancel the edits you have made and keep the original search statement, select the cancel icon (✖).
  7. To eliminate a line of the search, select the trash icon. Removing a line cannot be undone. To get the line back, you must re-enter it.
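
Steps 4 and 5 describe a chain of filters: each variable line only sees the cases left by the lines above it, so reordering changes the intermediate counts but not the final one. The sketch below illustrates that behavior with toy data. The record fields, filter labels, and the `running_counts` helper are assumptions made for this example; they are not how the Estimator is implemented.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Filter = Tuple[str, Callable[[Record], bool]]


def running_counts(records: List[Record], filters: List[Filter]) -> List[Tuple[str, int]]:
    """Apply filters in order and report how many records remain after each line."""
    remaining = records
    counts = []
    for label, keep in filters:
        remaining = [r for r in remaining if keep(r)]
        counts.append((label, len(remaining)))
    return counts


# Toy records with hypothetical field names and made-up values.
records = [
    {"site": "Breast", "dx_year": 2010},
    {"site": "Breast", "dx_year": 2003},
    {"site": "Colon", "dx_year": 2012},
    {"site": "Lung", "dx_year": 2011},
]

by_site = ("Cancer Site is Breast", lambda r: r["site"] == "Breast")
by_year = ("Year of diagnosis >= 2010", lambda r: r["dx_year"] >= 2010)

site_first = running_counts(records, [by_site, by_year])
year_first = running_counts(records, [by_year, by_site])

print(site_first)  # first line leaves 2 records, second leaves 1
print(year_first)  # first line leaves 3 records, second leaves 1
```

Both orderings end with the same single record, but the first line reports 2 cases when Cancer Site comes first and 3 cases when the diagnosis year comes first.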

Information about Variables and Category Selection

Variables

Multiple variables can be added to the filter. The Estimator starts with ALL possible records in the database. For example, until you select one or more specific cancer sites, all cancer sites will be included in the estimate. Every time a new variable is added, the records that do not match the selected value(s) for that variable are removed from the sample size.

Note: Whenever you are trying to estimate the number of records that have complete claims coverage data, you should always use the Continuous Medicare Parts A & B and Continuous Period Without HMO Coverage variables together.

Category Selection

The Category section contains the operators and the variable values. All variables except Comorbidity can only be OR searches. For example, when you select Cancer Site as the variable with multiple variable values (the image below shows Breast and Cervix Uteri as the values), the Sample Size Estimator searches for patients who were diagnosed with either Breast OR Cervix Uteri, as shown in the Category column.
[Image: Category column with different selections]

The Comorbidity variable (#2 in both examples) can be either an AND or OR search. The operator Is All creates an AND search, and the operators Is not and Is at least one create an OR search as shown in the image above.

Once the category selections for the variable have been run, the search statement appears as shown above, with the estimated number of cases under the Sample Size column.
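
The difference between AND and OR searches for the Comorbidity variable can be expressed with set operations. This is a sketch under stated assumptions: the condition names are made up, the `matches` function is hypothetical, and the reading of "Is not" as "has none of the selected values" is an interpretation, since the page only states that it is an OR search.

```python
from typing import Dict, Set


def matches(patient_conditions: Set[str], operator: str, selected: Set[str]) -> bool:
    """Decide whether a patient matches a Comorbidity search line."""
    if operator == "Is All":
        # AND search: every selected value must be present.
        return selected <= patient_conditions
    if operator == "Is at least one":
        # OR search: any one selected value is enough.
        return bool(selected & patient_conditions)
    if operator == "Is not":
        # Assumption: the patient has none of the selected values.
        return not (selected & patient_conditions)
    raise ValueError(f"unknown operator: {operator}")


# Made-up patients and condition names, for illustration only.
patients: Dict[str, Set[str]] = {
    "P1": {"Diabetes", "COPD"},
    "P2": {"Diabetes"},
    "P3": set(),
}
selected = {"Diabetes", "COPD"}

for op in ("Is All", "Is at least one", "Is not"):
    hits = sorted(pid for pid, conds in patients.items() if matches(conds, op, selected))
    print(op, hits)
```

With these toy patients, "Is All" keeps only P1, "Is at least one" keeps P1 and P2, and "Is not" (under the stated assumption) keeps only P3.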

Sharing the Results

The results of a search can be shared in two different ways.

  1. The Download as CSV button saves a .csv file that can be opened in a spreadsheet program like Excel and attached to an e-mail.
  2. The Copy Selection Link option copies a link to the search to the clipboard, where it can be pasted into an e-mail or other document. The link opens the search exactly as it appeared when the link was copied.

Requirements for Making a SEER-CAHPS Data Request

When requesting SEER-CAHPS data, certain variables have specific limitations or requirements.

  • Cancer Site - All requests require this variable. No more than 10 cancer sites may be included in a single data request.
  • Year of cancer diagnosis - All requests require this variable. The selection of the operator determines how years may be designated. They can indicate a single year, years before or after a certain point in time, or a range of years.
  • Survey Type - All requests require this variable. The available survey data is listed in the table in the SEER-CAHPS Cohort Data section on this web page.
  • Survey Year - All requests require this variable. The years for which specific survey data is available and the care it addresses are listed in the table in the SEER-CAHPS Cohort Data section above. There is no data for the year 2006 for any survey type.
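
The requirements above can be checked before a request is submitted. The following is a minimal sketch, assuming a request is represented as a plain dictionary; the key names and the `check_request` function are hypothetical and are not part of any SEER-CAHPS tool.

```python
from typing import Dict, List

# Hypothetical key names for the four required request variables.
REQUIRED = ("cancer_sites", "diagnosis_years", "survey_types", "survey_years")
MAX_CANCER_SITES = 10


def check_request(request: Dict[str, list]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for key in REQUIRED:
        if not request.get(key):
            problems.append(f"missing required variable: {key}")
    if len(request.get("cancer_sites", [])) > MAX_CANCER_SITES:
        problems.append(f"more than {MAX_CANCER_SITES} cancer sites requested")
    if 2006 in request.get("survey_years", []):
        problems.append("no survey data exists for 2006")
    return problems


good = {
    "cancer_sites": ["Breast", "Cervix Uteri"],
    "diagnosis_years": [2008, 2009, 2010],
    "survey_types": ["MA-PD"],
    "survey_years": [2010, 2011],
}
bad = {
    "cancer_sites": [f"Site {i}" for i in range(11)],  # 11 placeholder sites
    "diagnosis_years": [2005],
    "survey_types": [],
    "survey_years": [2006],
}

print(check_request(good))
print(check_request(bad))
```

The first request passes all checks. The second is flagged for a missing survey type, too many cancer sites, and the 2006 survey year.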
Last Updated: 08 Feb, 2023