Comorbidity SAS Macro (2014 version)

The macro below calculates comorbidity scores for both the Charlson and the NCI Comorbidity Indices. The macro considers only ICD-9 diagnosis codes and ICD-9 procedure codes on the claims (ICD-10 and HCPCS are not used).

Investigators who want to use the macro must decide whether to use claims only from the hospital file (MEDPAR) or also to include the diagnoses on claims submitted by physicians (NCH) and outpatient facilities (OUTPAT), as described in Klabunde et al. The rationale for including the latter claims is that many more people see a physician or receive care in an outpatient clinic than are hospitalized, which increases the chance of identifying comorbid conditions.

If both hospital claims and physician/outpatient claims are being used to identify comorbid conditions, then the rule-out algorithm, which is built into the 2014 version of the comorbidity macro, should be used. For physician and outpatient claims, this algorithm requires that a patient's diagnosis code appear on at least two different claims that are more than 30 days apart. The reason is that diagnoses on physician and outpatient claims have not been validated, and a physician may have recorded a diagnosis as present when the correct coding would have been to “rule out” the condition. Codes that do not appear on two such claims are considered “rule-out” diagnoses and are not counted as comorbid conditions. This is necessary to prevent overestimation of comorbidity when using physician or outpatient claims.
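
The rule can be illustrated with a small sketch. This is not the macro's internal code; the data set claims_dx and its variables (patient_id, condition, claim_date) are hypothetical and stand in for physician and outpatient claims that have already been mapped to conditions. A condition has two claims more than 30 days apart exactly when its earliest and latest claim dates are more than 30 days apart, so the sketch compares those two dates.

```sas
/* Hypothetical long-format data: one row per patient, condition, and claim date */
data claims_dx;
  input patient_id $ condition $ claim_date :date9.;
  format claim_date date9.;
  datalines;
A CHF 01JAN2010
A CHF 15MAR2010
A COPD 01JAN2010
A COPD 20JAN2010
B CHF 05FEB2010
;
run;

/* confirmed = 1 when the earliest and latest claims are more than 30 days apart */
proc sql;
  create table ruleout_check as
  select patient_id,
         condition,
         min(claim_date) as first_date format=date9.,
         max(claim_date) as last_date  format=date9.,
         (max(claim_date) - min(claim_date) > 30) as confirmed
  from claims_dx
  group by patient_id, condition;
quit;
```

In this toy data only patient A's CHF (claims 73 days apart) is confirmed. Patient A's COPD claims are 19 days apart and patient B has a single CHF claim, so both would be treated as rule-out diagnoses.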

The macro is available to download here: charlson.comorbidity.macro.2014.sas (SAS, 26 KB).

Building an Input File for the Macro

Regardless of which Medicare files an investigator decides to use as input, the files should be subset to a limited number of variables, as described below. All ICD-9-CM diagnosis codes on these records should be 5 characters long, and ICD-9-CM procedure codes should be 4 characters long. Before invoking the SAS macro, remove any decimal points or blanks occurring within a code (e.g., diagnosis code '123.4' becomes '1234').
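
One way to strip decimal points and blanks is the COMPRESS function applied across an array of code variables. The sketch below uses a hypothetical three-variable data set (raw_codes) with a length of 6 so that the raw values can hold a decimal point; real files would use the full variable lists.

```sas
/* Hypothetical raw codes that still contain decimal points */
data raw_codes;
  length dgn_cd1-dgn_cd3 $6;
  input dgn_cd1 $ dgn_cd2 $ dgn_cd3 $;
  datalines;
123.4 250.00 4280
;
run;

data clean_codes;
  set raw_codes;
  array dx{*} dgn_cd1-dgn_cd3;
  do i = 1 to dim(dx);
    /* COMPRESS removes every period and blank from the value */
    dx{i} = compress(dx{i}, '. ');
  end;
  drop i;
run;
```

After this step the three values are '1234', '25000', and '4280'.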

Variables to keep for the macro:

  • MEDPAR - keep Patient_ID, admission date (ADM_M, ADM_D, ADM_Y), admitting diagnosis code (ADMDXCDE), diagnosis codes 1-25 (DGN_CD1-DGN_CD25), surgery codes 1-25 (SRGCDE1-SRGCDE25), and length of stay (LOS). Set filetype='M'.
  • NCH - keep Patient_ID, claim from date (from_dtm, from_dtd, from_dty), 14 diagnosis codes (pdgns_cd, dgn_cd1-dgn_cd12, linediag). The carrier data can have more than one claim for the same date of service and all claims for each date should be included. Set filetype='N'.
  • OUTPAT - keep Patient_ID, claim from date (from_dtm, from_dtd, from_dty), diagnosis codes 1-25 (dgn_cd1-dgn_cd25), ICD procedure codes 1-13 (prcdr_cd1-prcdr_cd13). Set filetype='O'.
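
As a rough illustration of the MEDPAR item above, the following sketch keeps the listed variables, sets filetype, and builds a SAS date from the three admission date parts. The data set names (medpar_raw, medpar_in), the variable lengths, and the sample record are hypothetical, and the date parts are assumed to be numeric; if they are stored as character they would first need converting with INPUT().

```sas
/* Tiny stand-in for a MEDPAR extract; real files carry many more variables */
data medpar_raw;
  length Patient_ID $10 ADMDXCDE DGN_CD1-DGN_CD25 $5 SRGCDE1-SRGCDE25 $4;
  input Patient_ID $ ADM_M ADM_D ADM_Y ADMDXCDE $ DGN_CD1 $ SRGCDE1 $ LOS;
  datalines;
P001 3 15 2010 4280 25000 3601 4
;
run;

data medpar_in;
  set medpar_raw (keep=Patient_ID ADM_M ADM_D ADM_Y ADMDXCDE
                       DGN_CD1-DGN_CD25 SRGCDE1-SRGCDE25 LOS);
  length filetype $1;
  filetype   = 'M';                       /* claim source: MEDPAR */
  claim_date = mdy(ADM_M, ADM_D, ADM_Y);  /* SAS date from month, day, year */
  format claim_date date9.;
run;
```

The NCH and OUTPAT files could be handled the same way, using from_dtm, from_dtd, and from_dty for the date and 'N' or 'O' for filetype.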

The final SAS file, which combines data from any of the above sources, must include the variables used in the macro call. If the comorbidity score for the 12 months prior to diagnosis is to be calculated, then only records with claim dates falling within that window should be kept. However, if the rule-out algorithm is invoked, claims from the 30 days before and the 30 days after the window of analysis should also be kept.
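
The date filtering described above might look like the following sketch. The data set claims, its variable names, and the macro variable pad are hypothetical; pad widens the window by 30 days on each side for the rule-out algorithm and could be set to 0 otherwise.

```sas
/* Hypothetical claims with a 2009 comorbidity window for one patient */
data claims;
  input patient_id $ claim_date :date9. start_date :date9. end_date :date9.;
  format claim_date start_date end_date date9.;
  datalines;
P001 10DEC2008 01JAN2009 31DEC2009
P001 15JUN2009 01JAN2009 31DEC2009
P001 20JAN2010 01JAN2009 31DEC2009
P001 15MAR2010 01JAN2009 31DEC2009
;
run;

%let pad = 30;  /* days kept on each side of the window; 0 if rule-out is not used */

data claims_window;
  set claims;
  /* Subsetting IF: keep claims inside the padded window only */
  if (start_date - &pad) <= claim_date <= (end_date + &pad);
run;
```

With a 30-day pad the first three claims are kept and the 15MAR2010 claim is dropped.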

An example SAS program to build an input file is available here: comorbidity.input.file.example.2014.sas (SAS, 4 KB).

These variables are needed for the macro:

PATID - Variable name: Unique ID for each patient.

STARTDATE - Variable name: Date the comorbidity window starts, in SAS date format.

ENDDATE - Variable name: Date the comorbidity window ends, in SAS date format.

CLAIMDATE - Variable name: Date of the claim found on the claim file, in SAS date format. This can be created by using the MDY() function.

CLAIMTYPE - Variable name: the source of the claim record ('M'=MEDPAR, 'O'=OUTPAT, 'N'=NCH). Note: do not use DME claims.

DAYS - Variable name: contains the length of stay (in days) for hospital visits from MEDPAR.

DXVARLIST - List of variable names: the diagnosis codes in ICD-9 (e.g. DGN_CD1-DGN_CD25). If there are multiple variables, some of which cannot be included in a range, please list them using spaces to separate each single element or range (e.g. DGN_CD1-DGN_CD25 ADMDXCDE).

SXVARLIST - List of variable names: the surgery or procedure codes in ICD-9 (e.g. SRGCDE1-SRGCDE25). If there are multiple variables, some of which cannot be included in a range, please list them using spaces to separate each single element or range (e.g. SRGCDE1-SRGCDE25 PRCDR_CD1).

RULEOUT - Flag: Set this to 1 (or R) if the rule-out algorithm should be invoked; otherwise set it to 0 (or leave it blank).

An example call statement for the macro would be:

%COMORB(Claims,patient_id,start_date,end_date,claim_date,filetype,los,admdxcde dgn_cd1-dgn_cd25,surg1-surg25,R,Comorb);

Last Updated: 24 Sep, 2021