Using the SEER-MHOS Sample Size Estimator
The SEER-MHOS Sample Size Estimator is a web tool that estimates the number of MHOS respondents who match your cohort selection criteria. It helps investigators generate sample size estimates for SEER-MHOS projects, with tailoring by cancer site, survey timing, and Medicare enrollment.
The SEER-MHOS Sample Size Estimator works as a filter. It starts with all survey records in the database. The variables and values you choose determine which survey records are used for the person-level sample size estimate. Each participant is counted only once.
Please note: the SEER-MHOS Sample Size Estimator provides only a rough estimate; the actual sample size may differ. In accordance with the SEER-MHOS Data Use Agreement (DUA), cell sizes less than 11 (eleven) are suppressed. Additionally, sample sizes are randomly adjusted within a +/- 10 threshold per query. Cancer site is limited to first primary cancers ONLY (SEER Cancer Sequence Number = 00 or 01).
For SEER-MHOS cost estimates, please use the Data Cost Calculator.
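To make the suppression and random adjustment rules in the note above concrete, here is a minimal Python sketch. It is not the Estimator's actual code: the function name `displayed_estimate`, the use of a uniformly drawn integer, and the choice to check suppression both before and after the adjustment are all assumptions made for illustration.

```python
import random

SUPPRESSION_THRESHOLD = 11  # per the DUA, cell sizes less than 11 are suppressed
MAX_ADJUSTMENT = 10         # each query shifts the count by at most +/- 10


def displayed_estimate(true_count, rng=None):
    """Return the count a query might display, or None if it is suppressed.

    Assumptions of this sketch: the adjustment is a uniformly drawn integer,
    and suppression is checked both before and after the adjustment.
    """
    if rng is None:
        rng = random.Random()
    if true_count < SUPPRESSION_THRESHOLD:
        return None
    adjusted = true_count + rng.randint(-MAX_ADJUSTMENT, MAX_ADJUSTMENT)
    return adjusted if adjusted >= SUPPRESSION_THRESHOLD else None


if __name__ == "__main__":
    for n in (5, 40, 1200):
        print(n, "->", displayed_estimate(n))
```

Because the adjustment is drawn per query, running the same selection twice can show slightly different numbers, which is one reason to treat the result as a rough estimate only.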
SEER-MHOS Data Sources
The SEER-MHOS Data Resource incorporates the following information. Please restrict each data source in your sample size estimate to reflect data availability to answer your proposed research question.
For example:
- Proposed studies that include medication information (Part D) will need to restrict relevant MHOS survey cohorts to 2007-2020.
- Proposed studies looking at cancer survivors may choose to limit their sample based on time from cancer diagnosis to survey date. The survivorship period covers over 40 years from an identified cancer diagnosis (dx).
- Cancer sites have been grouped according to SEER*Explorer definitions.
Data Source | Description | Years |
---|---|---|
SEER | Cancer Clinical Data | 1973 – 2019 |
MHOS | Survey Cohorts | 1998 – 2021 |
Medicare | MA Enrollment Files | 1999 – 2020 |
Medicare Part D | Prescription Drug Claims | 2007 – 2020 |
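As a rough illustration of restricting a sample to years in which all needed data sources are available, the sketch below intersects the year ranges from the table above. The dictionary `DATA_YEARS` and the helper `common_years` are hypothetical names, not part of the Estimator. A simple intersection is most meaningful for sources that must cover the same period (for example, surveys and Part D claims); a SEER diagnosis can precede the survey by many years.

```python
# Year ranges copied from the data source table above.
DATA_YEARS = {
    "SEER": (1973, 2019),
    "MHOS": (1998, 2021),
    "Medicare": (1999, 2020),
    "Medicare Part D": (2007, 2020),
}


def common_years(sources):
    """Return the (first, last) year covered by every listed source, or None."""
    start = max(DATA_YEARS[name][0] for name in sources)
    end = min(DATA_YEARS[name][1] for name in sources)
    return (start, end) if start <= end else None


if __name__ == "__main__":
    # A study using Part D medication data alongside MHOS surveys
    print(common_years(["MHOS", "Medicare Part D"]))
```

For MHOS combined with Medicare Part D, the overlap is 2007 through 2020, which matches the restriction given in the example list above.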
SEER-MHOS Cohort Data
The SEER-MHOS data resource includes all participants in MHOS cohorts 1-22 who have completed at least one survey. Each cohort consists of a baseline survey and a two-year follow-up survey. Participants who responded to a baseline MHOS survey may not have completed a follow-up survey. While not common, participants may have been selected to participate in multiple MHOS cohorts, resulting in the completion of multiple baseline and follow-up surveys.
Participation in MHOS is not indexed against a cancer diagnosis. Participation in an MHOS cohort can therefore occur at any time a person is enrolled in a Medicare Advantage plan.
Sample selection criteria can include survey timing relative to a cancer diagnosis in the following ways:
Survey Timing | Definition |
---|---|
Any Cohort | |
Survey Timing=Pre-Cancer Diagnosis | Participants with any survey(s) prior to a cancer dx |
Survey Timing=Post-Cancer Diagnosis | Participants with any survey(s) after a cancer dx |
Survey Timing= Pre & Post Cancer Diagnosis | Participants with at least one survey pre and post cancer dx across any cohort |
Same Cohort | |
Survey Timing= Pre & Post Cancer Diagnosis (Same Cohort) | Participants with at least one survey pre and post cancer dx in the SAME cohort (two years) |
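The survey timing definitions in the table above can be sketched in Python as follows. This is an illustrative reading of the definitions, not the Estimator's implementation: the function name, the `(cohort, survey_date)` record layout, and the example dates are hypothetical, and a survey dated exactly on the diagnosis date is treated as neither pre nor post, which is an assumption.

```python
from datetime import date


def timing_categories(dx_date, surveys):
    """Classify one participant's surveys relative to the cancer dx date.

    surveys is a list of (cohort_number, survey_date) tuples.
    """
    pre_cohorts = {cohort for cohort, when in surveys if when < dx_date}
    post_cohorts = {cohort for cohort, when in surveys if when > dx_date}
    return {
        "pre": bool(pre_cohorts),
        "post": bool(post_cohorts),
        # at least one survey before and one after, in any cohorts
        "pre_and_post_any_cohort": bool(pre_cohorts) and bool(post_cohorts),
        # baseline before and follow-up after dx within one cohort
        "pre_and_post_same_cohort": bool(pre_cohorts & post_cohorts),
    }


if __name__ == "__main__":
    dx = date(2010, 6, 1)  # made-up diagnosis date
    across = [(10, date(2007, 5, 1)), (14, date(2011, 5, 1))]
    within = [(13, date(2010, 4, 1)), (13, date(2012, 4, 1))]
    print(timing_categories(dx, across))
    print(timing_categories(dx, within))
```

In the first example the participant qualifies for Pre & Post Cancer Diagnosis only across cohorts; in the second, the baseline and follow-up surveys of one cohort bracket the diagnosis, so the Same Cohort definition is also met.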
Sample Estimator Controls
- Dropdown lists are used for all variables, all operators, and some variable values.
- Text fields are used to key in single numbers or a range of numbers.
- Drag and drop controls allow you to change the order of your variables. Changing the order will not change the final result but can provide more specific information on the number of cases in the database having particular variable values.
- Slider controls are used for coverage variables such as Continuous Medicare Part D. Use your mouse to slide the dots along the line, or key the number of months into the text fields. You can also change the numbers with the up and down arrows that appear on the right of a field when it is selected. The sliders and the numbers in the fields update together as you enter the months.
Please note: Persons who are lost to follow-up (e.g., die) within the selected post diagnosis time window are retained in the sample estimate.
Searching the Database for Sample Sizes
- Click on the Add Variable button to add a new variable. A dropdown list of available variables will open with a filter field at the top. Scroll to the desired variable to select it or begin keying it into the filter field to bring the desired variable to the top of the list to select.
- Select an operator from the first dropdown list in the Category section on the left. The operators offered in the dropdown list are dependent on the variable you selected.
- Select the variable value(s) for the operator on the right of the Category section. Depending on the operator selected, you may see one or two sliding scale controls of numbers, or a dropdown list. If you have a sliding scale control you can move the blue dot on the control and the value will appear in the text field, or you can key the value into the text field and the blue dot will move to the appropriate point on the line.
- Select the run search icon (✓) to run the search for that variable line. Each time a variable line is run, the search statement will replace the operator in the Category section with the estimated number of cases for that variable. In the Sample Size Estimate section, a line will appear representing the portion of the estimate and number of cases. Each new variable line filters further cases out of the result of the lines that come before it.
- To change the order of the variables, click on the triple bar icon in the Order column and drag the variable to a new position. The estimates for the individual variable lines will change based on the order, but the final result will always remain the same.
- To edit a line of the search, select the edit icon. The line will open to allow changes to the variable, the operator, and the value(s). If you decide to cancel the edits you have made and keep the original search statement, select the cancel icon (✖).
- To eliminate a line of the search, select the trash icon. Removing a line cannot be undone; to get it back, you must re-enter it.
Information about Variables and Category Selection
Variables
Multiple variables can be added to the filter. The Estimator starts with ALL possible records in the database. For example, until you select one or more specific cancer sites, all cancer sites will be included in the estimate. Every time a new variable is added, the records that do not match its selected value(s) are removed from the sample size.
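The filtering behavior described above, including the fact that reordering variable lines changes the intermediate counts but not the final result, can be sketched as below. The record fields (`person_id`, `cancer_site`, `age_at_survey`), the sample records, and the function `run_filters` are hypothetical and only illustrate the idea of successive filters with person-level counting.

```python
def run_filters(records, filters):
    """Apply each filter in order; return the person-level count after each line."""
    remaining = records
    counts = []
    for predicate in filters:
        remaining = [r for r in remaining if predicate(r)]
        # each participant is counted once, however many surveys remain
        counts.append(len({r["person_id"] for r in remaining}))
    return counts


# Made-up survey records; person 1 has two surveys.
records = [
    {"person_id": 1, "cancer_site": "Breast", "age_at_survey": 70},
    {"person_id": 1, "cancer_site": "Breast", "age_at_survey": 72},
    {"person_id": 2, "cancer_site": "Prostate", "age_at_survey": 68},
    {"person_id": 3, "cancer_site": "Breast", "age_at_survey": 64},
    {"person_id": 4, "cancer_site": "Lung and Bronchus", "age_at_survey": 75},
]


def is_breast(r):
    return r["cancer_site"] == "Breast"


def is_65_plus(r):
    return r["age_at_survey"] >= 65


if __name__ == "__main__":
    print(run_filters(records, [is_breast, is_65_plus]))  # site first
    print(run_filters(records, [is_65_plus, is_breast]))  # age first
```

With these records, filtering by site first gives intermediate counts of 2 and then 1, while filtering by age first gives 3 and then 1. The last number is the same in both orders because every record must satisfy all of the filters regardless of the order in which they are applied.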
Category Selection
The Category section contains the operators and the variable values. If multiple values are selected for a variable, an OR search is performed between those values. For example, when you select Cancer Site as the variable with multiple variable values (the image below shows Breast and Cervix Uteri as the values), the Sample Size Estimator searches for patients who were diagnosed with either Breast OR Cervix Uteri cancer, as shown in the Category column.
The Comorbidity variable (#2 in both examples) can be either an AND or an OR search. The operator option At least one of the above allows users to search for patients who had a survey with any, but not necessarily all, of the selected comorbidity values. The All of the above option requires a patient to have a survey with all of the selected comorbidity values.
Once the category selections for the variable have been run, the search statement appears as shown above, with the estimated number of cases under the Sample Size column.
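The difference between the two Comorbidity operators can be sketched with set operations. This is an illustration only: the function `comorbidity_match` and the condition names are hypothetical, and the sketch assumes that All of the above is evaluated against the conditions reported on a single survey.

```python
def comorbidity_match(reported, selected, operator):
    """Check one survey's reported conditions against the selected values."""
    reported, selected = set(reported), set(selected)
    if operator == "At least one of the above":
        return bool(reported & selected)   # OR: any selected value is present
    if operator == "All of the above":
        return selected <= reported        # AND: every selected value is present
    raise ValueError(f"unknown operator: {operator}")


if __name__ == "__main__":
    selected = ["Diabetes", "Hypertension"]   # illustrative values
    survey = ["Hypertension", "Stroke"]
    print(comorbidity_match(survey, selected, "At least one of the above"))
    print(comorbidity_match(survey, selected, "All of the above"))
```

Here the survey reports only one of the two selected conditions, so it satisfies At least one of the above but not All of the above. Selecting multiple values for Cancer Site behaves like the first case, since those values are always combined with OR.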
Requirements for Making a SEER-MHOS Data Request
When requesting SEER-MHOS data, the cancer site(s) must be specified on every request. No more than 10 cancer sites may be included in a single data request.