The SEER-CAHPS Illness Burden Index (SCIBI)

The SEER-CAHPS team has developed the SEER-CAHPS Illness Burden Index (SCIBI) to help data users understand relative illness burdens among different groups of survey respondents. The SCIBI is a machine-learning-derived summary score that approximates relative risk of mortality within 12 months after survey response. The SCIBI allows researchers to analyze illness burden information for Medicare Advantage (MA) enrollees as well as fee-for-service (FFS) enrollees. Please view the 2020 methods paper for an overview of how the SCIBI was developed and validated.1

SCIBI scores are now available for individuals surveyed between 2007 and 2017. They incorporate all available self-reported and claims information for a respondent, including activities of daily living (ADL) limitations, other limitations in activities, self-reported conditions, and healthcare utilization (both claims-based and self-reported).

Please see the detailed information below to learn more about the SCIBI measure included in SEER-CAHPS.

Versions

Two versions are available to users requesting the SEER-CAHPS data:

  • Concurrent Basic (SCIBI-CB): Available for all respondents. Includes self-reported survey information, for which respondents were asked to think about the prior 6 months. For FFS enrollees, also includes claims-based predictors drawn from a 24-month period: the 12 months before and the 12 months after the survey. Indicators such as the number of hospitalizations and durable medical equipment (DME) use are measured across the full 24-month period (i.e., the predictors are measured concurrently with the outcome).
  • Prospective Basic (SCIBI-PB): Available only for the FFS population. These alternate scores use claims-based predictor data from the 12 months before survey response only, which increases the error rate but may be preferable for some analyses. Because claims are not available for MA enrollees in the SEER-CAHPS linkage, these scores are not generated for MA enrollees.
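
The difference between the two claims windows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "CB"/"PB" and "FFS"/"MA" codes, and the treatment of "12 months" as a one-year shift of the survey date are hypothetical and are not taken from the SEER-CAHPS documentation.

```python
from datetime import date

def shift_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # Feb 29 shifted into a non-leap year
        return d.replace(year=d.year + years, day=28)

def claims_window(survey_date: date, version: str, enrollment: str):
    """Return (start, end) of the claims look-up window, or None if no claims are used.

    version: "CB" (concurrent basic) or "PB" (prospective basic); hypothetical codes.
    enrollment: "FFS" or "MA"; hypothetical codes.
    """
    if version not in ("CB", "PB"):
        raise ValueError(f"unknown version: {version}")
    if enrollment == "MA":
        # Claims are not available for MA enrollees in the linkage.
        return None
    start = shift_years(survey_date, -1)  # 12 months before the survey
    if version == "CB":
        return start, shift_years(survey_date, 1)  # plus 12 months after
    return start, survey_date  # PB: look-back only

print(claims_window(date(2012, 3, 15), "CB", "FFS"))
print(claims_window(date(2012, 3, 15), "PB", "FFS"))
```

Note that a `None` window means different things for the two versions: SCIBI-CB is still produced for MA enrollees from non-claims predictors, whereas SCIBI-PB is not generated for them at all.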

Note that case-mix adjustors are not included as predictor variables in either version of the scores. Refer to the website and section 6.3 of the User Guide for more information on case-mix adjustor variables.

See below for a list of all predictor variables used to generate prediction scores for SCIBI-CB and SCIBI-PB. The variables used for each individual differ depending on whether the beneficiary is enrolled in Medicare FFS or MA and whether the beneficiary was diagnosed with cancer.

Predictor variables (source in parentheses)

All beneficiaries (cancer or non-cancer, Medicare FFS or MA)

  • Continuous enrollment in Medicare Part A and B (MBSF data)
  • Indicator for current smoking (survey data file)
  • Indicator for needing help with personal care needs (survey data file)
  • Indicator for needing help with routine activities (survey data file)
  • Indicator for physical condition interfering with independence (survey data file)
  • Indicator for trouble bathing (survey data file)
  • Indicator for trouble dressing (survey data file)
  • Indicator for trouble eating (survey data file)
  • Indicator for trouble using chairs (survey data file)
  • Indicator for trouble walking (survey data file)
  • Indicator for trouble using toilet (survey data file)
  • Indicator for trouble climbing stairs (survey data file)
  • Indicator for limited moderate activity (survey data file)
  • Indicator for limited regular activity (survey data file)
  • Indicator for limited social activity (survey data file)
  • Indicator for pain interfering with normal activities (survey data file)
  • Indicator for lethargy (energy level rated 4, 5, or 6) (survey data file)
  • Doctor ever told you: Heart attack or angina (survey data file)
  • Doctor ever told you: Stroke (survey data file)
  • Doctor ever told you: Chronic obstructive pulmonary disease (survey data file)
  • Doctor ever told you: Diabetes (survey data file)
  • Doctor ever told you: Cancer (other than skin cancer) (survey data file)
  • Same condition lasted at least 3 months (survey data file)
  • 2 or more conditions (heart, stroke, COPD, diabetes) (survey data file)
  • Seen doctor 3 or more times for same condition (survey data file)
  • Take medicine for condition prescribed by doctor (survey data file)
  • Medicine taken for condition lasting at least 3 months (survey data file)
  • Ever delay filling prescribed medicines because of cost (survey data file)
  • Spent one or more nights in a hospital in past 6 months (survey data file)

Medicare FFS beneficiaries only, cancer or non-cancer

  • Hospice (any) (Medicare FFS claims)
  • Skilled nursing facility (any) (Medicare FFS claims)
  • Hospitalizations (any, count) (Medicare FFS claims)
  • Home health (any) (Medicare FFS claims)
  • Home hospital bed (Medicare FFS claims)
  • Wheelchair (Medicare FFS claims)
  • Oxygen (Medicare FFS claims)
  • Ambulance/Life support (Medicare FFS claims)
  • Paralysis (Medicare FFS claims)
  • Parkinson’s (Medicare FFS claims)
  • Dementia (Medicare FFS claims)
  • Psychiatric diagnosis (Medicare FFS claims)
  • Stroke or brain injury (Medicare FFS claims)
  • Skin ulcer (Medicare FFS claims)
  • Heart failure (Medicare FFS claims)
  • Difficulty walking (Medicare FFS claims)
  • Diabetes mellitus complications (Medicare FFS claims)
  • Weakness (Medicare FFS claims)
  • Bladder dysfunction (Medicare FFS claims)
  • Any durable medical equipment (Medicare FFS claims)

Beneficiaries with cancer only, Medicare FFS or MA

  • Cancer site (SEER registry data)
  • Cancer stage at diagnosis (SEER registry data)
  • Indicator for first malignant primary cancer (SEER registry data)
  • Indicator for multiple primary cancer sites (SEER registry data)
  • Number of primary cancer sites (SEER registry data)
  • Married at time of cancer diagnosis (SEER registry data)
  • Urban residence at time of cancer diagnosis (SEER registry data)

Standardized Z-scores

The SCIBI scores vary in their distributions depending on:

  • year,
  • whether a person is in MA or FFS, and
  • whether they were surveyed before or after their cancer diagnosis.

Thus, in addition to the cohort-specific raw out-of-bag scores* (developed within each year-group slice), we provide standardized z-scores centered on the population mean (i.e., people with and without cancer, MA and FFS, all years) for each version of the scores (SCIBI-CB and SCIBI-PB). These z-scores, which have a mean of 0 and an SD of 1, can thus be used to compare illness burden using the same “measuring stick” for every person in the linked data.
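
As a rough illustration of this standardization, the sketch below pools a handful of raw scores and converts them to z-scores. The raw values are invented, the function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption; the source does not state which formula was applied.

```python
from statistics import fmean, pstdev

def standardize(raw_scores):
    """Convert raw scores to z-scores using the mean and SD of the pooled input."""
    mu = fmean(raw_scores)
    sigma = pstdev(raw_scores)  # population SD (assumption)
    return [(x - mu) / sigma for x in raw_scores]

# Invented raw scores, pooled across all cohorts (cancer/non-cancer, MA/FFS, all years).
pooled_raw = [0.02, 0.05, 0.10, 0.40, 0.03, 0.08]
z = standardize(pooled_raw)
print([round(v, 2) for v in z])
```

The key point is that the mean and SD come from the pooled population rather than from each year-group slice, which is what puts every respondent on the same scale.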


* Out-of-bag (OOB) predictions take advantage of bootstrap aggregation (bagging). Each tree in the forest is grown on a bootstrap sample of the data, so any given data point is left out of roughly one-third of the trees (about 37% in expectation). The OOB prediction for a data point is generated using only those trees, and this is repeated for all data points. This provides a less biased estimate of the prediction error of the random forest than standard prediction scores, which are susceptible to overfitting. For this reason, the values in the SCIBI dataset are all based on out-of-bag prediction scores.
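
The out-of-bag mechanism can be illustrated with a toy bagged ensemble. This is not the random survival forest used to build the SCIBI: each "tree" here is a deliberately trivial learner that predicts the mean of its bootstrap sample, and all names and data are hypothetical. The sketch only shows how an observation's OOB prediction is averaged over the learners that never saw it, and how large that share of learners is.

```python
import random

def oob_predictions(y, n_trees=500, seed=0):
    """Bagged toy learners, each predicting the mean outcome of its bootstrap sample.

    Returns (per-observation OOB predictions, average share of learners
    for which an observation was out-of-bag).
    """
    rng = random.Random(seed)
    n = len(y)
    sums = [0.0] * n   # running sum of OOB predictions per observation
    counts = [0] * n   # number of learners for which the observation was OOB
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # draw n indices with replacement
        in_bag = set(sample)
        prediction = sum(y[i] for i in sample) / n     # the toy learner's prediction
        for i in range(n):
            if i not in in_bag:                        # learner never saw observation i
                sums[i] += prediction
                counts[i] += 1
    preds = [s / c if c else None for s, c in zip(sums, counts)]
    oob_share = sum(counts) / (n * n_trees)
    return preds, oob_share

y = [float(i % 7) for i in range(200)]  # invented outcomes
preds, oob_share = oob_predictions(y)
print(f"average out-of-bag share: {oob_share:.3f}")
```

With bootstrap samples of the same size as the data, the expected out-of-bag share is (1 - 1/n)^n, which approaches 1/e (about 0.37) as n grows.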


The SCIBI dataset contains 6 variables in total:

  • Patient_id: a unique, encrypted identifier, specific to SEER-CAHPS, that can be used to link the scores to the other SEER-CAHPS data files
  • Scibi_year: the year of the CAHPS survey on which an individual's SCIBI scores are based. Individuals may have taken more than one survey during 2007-2017, but the SCIBI scores are unique at the patient_id level and pertain to only a single survey for each patient_id.
  • Scibi_cb: the raw prediction scores for the concurrent basic version
  • Scibi_cb_z: the standardized prediction scores for the concurrent basic version
  • Scibi_pb: the raw prediction scores for the prospective basic version
  • Scibi_pb_z: the standardized prediction scores for the prospective basic version
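
Because the scores are unique per patient_id but tied to one survey year, linking them to survey records takes some care when a person has more than one survey. The sketch below shows one possible approach using plain dictionaries. The lowercase field names, the `survey_year` field, and the sample records are hypothetical; the actual layout of the SEER-CAHPS files may differ.

```python
def attach_scibi(survey_rows, scibi_rows):
    """Left-join SCIBI z-scores onto survey records by patient_id.

    Scores are attached only to the survey whose year matches scibi_year,
    since the scores pertain to a single survey per person.
    """
    scores = {row["patient_id"]: row for row in scibi_rows}
    linked = []
    for rec in survey_rows:
        score = scores.get(rec["patient_id"])
        matches = score is not None and score["scibi_year"] == rec["survey_year"]
        linked.append({
            **rec,
            "scibi_cb_z": score["scibi_cb_z"] if matches else None,
            "scibi_pb_z": score["scibi_pb_z"] if matches else None,
        })
    return linked

# Invented records: patient A1 was surveyed twice; B2 is an MA enrollee (no PB score).
survey_rows = [
    {"patient_id": "A1", "survey_year": 2010},
    {"patient_id": "A1", "survey_year": 2014},
    {"patient_id": "B2", "survey_year": 2012},
]
scibi_rows = [
    {"patient_id": "A1", "scibi_year": 2014, "scibi_cb_z": 0.8, "scibi_pb_z": 0.5},
    {"patient_id": "B2", "scibi_year": 2012, "scibi_cb_z": -0.3, "scibi_pb_z": None},
]
linked = attach_scibi(survey_rows, scibi_rows)
for row in linked:
    print(row)
```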

SCIBI Performance

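SCIBI-CB

| Group | N | Percent who died within 12 months of survey | Percent who died in bottom 25th percentile | Percent who died in 99th percentile or above (top percentile) | Mean error rate (2007-2017) |
| --- | --- | --- | --- | --- | --- |
| People with cancer in SEER, surveyed pre-diagnosis, MA | 49,439 | 4% | 2% | 19% | 42% |
| People with cancer in SEER, surveyed pre-diagnosis, FFS | 42,138 | 3% | 0% | 76% | 7% |
| People with cancer in SEER, surveyed post-diagnosis, MA | 85,329 | 6% | 2% | 32% | 28% |
| People with cancer in SEER, surveyed post-diagnosis, FFS | 78,748 | 6% | 0% | 84% | 8% |
| People without cancer in SEER, MA | 482,775 | 2% | 1% | 16% | 31% |
| People without cancer in SEER, FFS | 380,577 | 3% | 0% | 70% | 10% |

SCIBI-PB

| Group | N | Percent who died within 12 months of survey | Percent who died in bottom 25th percentile | Percent who died in 99th percentile or above (top percentile) | Mean error rate (2007-2017) |
| --- | --- | --- | --- | --- | --- |
| People with cancer in SEER, surveyed pre-diagnosis, MA | NA | - | - | - | - |
| People with cancer in SEER, surveyed pre-diagnosis, FFS | 42,138 | 3% | 1% | 23% | 34% |
| People with cancer in SEER, surveyed post-diagnosis, MA | NA | - | - | - | - |
| People with cancer in SEER, surveyed post-diagnosis, FFS | 78,748 | 6% | 1% | 51% | 20% |
| People without cancer in SEER, MA | NA | - | - | - | - |
| People without cancer in SEER, FFS | 78,748 | 3% | 1% | 28% | 20% |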
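
The percentile rows in these tables compare mortality among respondents with the lowest and highest scores. A calculation of that kind could look like the sketch below, which runs on synthetic data only. The function name, the rank-based percentile cutoffs, and the handling of ties are assumptions; the source does not describe how the cutoffs were computed.

```python
import random

def death_rate(scores, died, lower_pct, upper_pct):
    """Share who died among respondents ranked in [lower_pct, upper_pct) by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # low to high score
    n = len(order)
    group = order[n * lower_pct // 100 : n * upper_pct // 100]
    return sum(died[i] for i in group) / len(group)

# Synthetic data for illustration only: death becomes more likely as the score rises.
rng = random.Random(1)
scores = [rng.random() for _ in range(10_000)]
died = [1 if rng.random() < s ** 3 else 0 for s in scores]

bottom_quartile = death_rate(scores, died, 0, 25)
top_percentile = death_rate(scores, died, 99, 100)
print(f"bottom 25%: {bottom_quartile:.1%}, top 1%: {top_percentile:.1%}")
```

A score that discriminates well should show a low death rate in the bottom quartile and a much higher one in the top percentile, which is the pattern the tables report for FFS enrollees in particular.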

SCIBI in Action

The SCIBI was designed specifically for this data resource to be used as a comorbidity adjustor. It can be used for both MA and FFS enrollees surveyed between 2007 and 2017. SCIBI scores are particularly useful for analyses that encompass both FFS and MA beneficiaries because they use information that is found not just in claims but also in enrollment files, CAHPS surveys, and cancer registries. Given that the SEER-CAHPS data resource does not include claims data for MA beneficiaries, the scores may allow researchers to account for illness burden among MA enrollees as well as FFS enrollees.

SCIBI scores can be used as a substitute for, or adjunct to, the NCI Combined or Charlson comorbidity indices. As an omnibus measure incorporating multiple individual indicators of physical function, frailty, and being at higher risk for mortality, the SCIBI scores allow researchers to substitute a single variable for the dozens of individual variables that exist to measure illness burden. Many of the individual variables have high rates of missing information; in addition, it can be helpful to have an omnibus measure for estimating more parsimonious models.

Researchers may also find it helpful to use SCIBI scores as predictor variables. For example, the SCIBI has been used in an analysis of SEER-CAHPS data from 2007-2015.2 The research team found that, after controlling for other case-mix adjustors, higher SCIBI scores (indicating greater illness burdens) were significantly associated with better ratings of Health Plan and better Getting Care Quickly scores. These results suggest that illness burden may influence how people experience care or report those experiences, independently of standard case-mix adjustors (such as self-reported general and mental health status).

In another study, in a sample of Medicare FFS and MA enrollees with regional, distant, or unstaged cancers, the SCIBI was shown to be significantly associated with forgone care (surgery or radiation contraindicated, not recommended, or refused), independent of other factors.3 The study also reported that care experience measures were not associated with forgone care.

Please contact the SEER-CAHPS team with questions.

References

1 Lines LM, Cohen J, Kirschner J, Halpern MT, Kent EE, Mollica M, et al. Random Survival Forests Using Linked Data to Measure Illness Burden Among Individuals Before or After a Cancer Diagnosis: Development and Internal Validation of the SEER-CAHPS Illness Burden Index. Int J Medical Informatics. 2020.
2 Lines L, Cohen J, Kirschner J, Barch DH, Halpern M, Kent EE, et al. Associations between illness burden and care experiences among Medicare beneficiaries before or after a cancer diagnosis. Cancer Causes & Control. 2022.
3 Lines LM, Danforth K, Zabala D, Halpern MT, Mollica M. Illness Burden is Associated with Forgone Care Among Medicare Beneficiaries with Cancer. American Public Health Association Annual Meeting; Oct. 22, 2021; Denver/Online. Cancer Forum; 2021.
Last Updated: 18 Apr, 2022