Comorbidity SAS Macro (2014 version)

The macro below calculates Charlson comorbidity weights from claims data. It was initially developed in 2014 after NCI re-evaluated the included codes. NCI does not accept responsibility for the completeness or accuracy of the codes and weights used in the macro. Investigators may modify the macro if they wish to include different diagnosis codes or condition weights.

The first decision that an investigator must make is whether to use claims only from the hospital file (MEDPAR) or to also include the diagnoses on claims submitted by physicians (NCH) and outpatient facilities (OUTPAT), as described in Klabunde et al. The rationale for including the latter claims is that many more people see a physician or receive care in an outpatient clinic than are hospitalized, thus increasing the possibility of identifying more comorbid conditions.

If both hospital claims and physician/outpatient claims are used to identify comorbid conditions, the rule-out algorithm, which is built into this updated version of the comorbidity macro, is needed. For physician and outpatient claims, this algorithm requires that a patient's diagnosis appear on at least two different claims that are more than 30 days apart. The reason is that the diagnoses on physician and outpatient claims have not been validated, and a physician may have recorded a diagnosis as present when the visit was actually intended to rule out the condition. Conditions that do not appear on two such claims are considered "rule-out" diagnoses and are not counted as comorbid conditions. This prevents overestimation of comorbidity when using physician or outpatient claims.
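The rule-out requirement can be illustrated with a minimal sketch. This is not the macro's own code. The dataset dx_claims, the variable condition, and the toy records are hypothetical, and the sketch assumes each record has already been flagged with the condition its diagnosis code maps to. It relies on the fact that two claims more than 30 days apart exist exactly when the earliest and latest claims for a condition are more than 30 days apart.

```sas
/* Hypothetical toy data: one row per patient, claim source, and flagged condition. */
data dx_claims;
  input patient_id $ filetype $ condition $ claim_date :date9.;
  format claim_date date9.;
  datalines;
A N CHF 05JAN2010
A O CHF 20JAN2010
B N CHF 05JAN2010
B N CHF 10MAR2010
C N CHF 05JAN2010
;
run;

/* Keep a condition only if physician (N) or outpatient (O) claims for it
   span more than 30 days between the earliest and the latest claim. */
proc sql;
  create table confirmed as
  select patient_id,
         condition,
         min(claim_date) as first_claim format=date9.,
         max(claim_date) as last_claim  format=date9.
  from dx_claims
  where filetype in ('N', 'O')
  group by patient_id, condition
  having max(claim_date) - min(claim_date) > 30;
quit;
```

In this toy data only patient B is retained: patient A's two claims are 15 days apart, and patient C has a single claim.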

The macro considers only the ICD-9 diagnosis codes and ICD-9 procedure codes on the claims (HCPCS codes are no longer used).

The macro is available to download here: charlson.comorbidity.macro.sas.

Building an Input File for the Macro

Regardless of which files an investigator decides to use as input, the files should be subset to include a limited number of variables as described below. All ICD-9-CM diagnosis codes on these records should be 5 characters long and ICD-9-CM procedure codes should be 4 characters long. Before invoking the SAS macro, remove any decimal points or embedded blanks from the codes (e.g., diagnosis code '123.4' becomes '1234 ').
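One way to clean the codes is sketched below. The dataset names, the two-variable layout, and the sample codes are hypothetical and chosen only to keep the example short. COMPRESS removes the characters listed in its second argument, and assigning the result to a $5 variable pads shorter codes with trailing blanks.

```sas
/* Hypothetical toy data: raw diagnosis codes that may contain decimal points. */
data claims_raw;
  length dgn_cd1 dgn_cd2 $6;
  input dgn_cd1 $ dgn_cd2 $;
  datalines;
123.4 250.00
4280 V45.1
;
run;

data claims_clean;
  set claims_raw;
  length dx1 dx2 $5;
  array rawdx {2} $ dgn_cd1 dgn_cd2;
  array dx    {2} $ dx1 dx2;
  do i = 1 to 2;
    /* Strip periods and blanks; the $5 length left-justifies and pads the result. */
    dx{i} = compress(rawdx{i}, '. ');
  end;
  drop i;
run;
```

With real claims the arrays would span the full variable lists, for example dgn_cd1-dgn_cd25, and procedure codes would be handled the same way with a $4 length.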

Variables to retain for the macro:

  • MEDPAR - retain Patient_ID, admission date (ADM_M, ADM_D, ADM_Y), admitting diagnosis code (ADMDXCDE), diagnosis codes 1-25 (DGN_CD1-DGN_CD25), surgery codes 1-25 (SRGCDE1-SRGCDE25), and length of stay (LOS). Set filetype='M'.
  • NCH - retain Patient_ID, claim from date (from_dtm, from_dtd, from_dty), 14 diagnosis codes (pdgns_cd, dgn_cd1-dgn_cd12, linediag). The carrier data can have more than one claim for the same date of service and all claims for each date should be included. Set filetype='N'.
  • OUTPAT - retain Patient_ID, claim from date (from_dtm, from_dtd, from_dty), diagnosis codes 1-25 (dgn_cd1-dgn_cd25), ICD procedure codes 1-13 (prcdr_cd1-prcdr_cd13). Set filetype='O'.
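A minimal sketch of stacking the three sources and tagging each record follows. The toy records are hypothetical, and the diagnosis and procedure code variables are left out for brevity. The variable claim_date is an illustrative name for a SAS date built with the MDY() function.

```sas
/* Hypothetical toy extracts, one record each. */
data medpar;
  input patient_id $ adm_m adm_d adm_y los;
  datalines;
A 3 15 2010 4
;
run;

data nch;
  input patient_id $ from_dtm from_dtd from_dty;
  datalines;
A 6 2 2010
;
run;

data outpat;
  input patient_id $ from_dtm from_dtd from_dty;
  datalines;
B 9 20 2010
;
run;

/* Stack the files, record the source, and build one SAS date per claim. */
data claims;
  set medpar (in=in_m) nch (in=in_n) outpat (in=in_o);
  length filetype $1;
  if in_m then do;
    filetype = 'M';
    claim_date = mdy(adm_m, adm_d, adm_y);
  end;
  else do;
    if in_n then filetype = 'N';
    else filetype = 'O';
    claim_date = mdy(from_dtm, from_dtd, from_dty);
  end;
  format claim_date date9.;
run;
```

LOS is populated only on the MEDPAR records and is missing on the NCH and OUTPAT records.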

The final SAS file that combines data from any of the above sources must include the variables used in the macro call. If the comorbidity score for the 12 months prior to diagnosis is to be calculated, then only records with claim dates falling within that window should be kept. However, if the rule-out algorithm is invoked, claims from the 30 days before and the 30 days after the window of analysis should also be kept.
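The date filter described above might look like the following sketch. The variable dx_date, the macro variable ruleout_pad, and the toy records are hypothetical. The sketch also assumes the window runs from 12 months before the diagnosis date through the day before diagnosis, which is only one possible definition.

```sas
/* Hypothetical toy data: claim dates for one patient diagnosed on 01FEB2010. */
data claims;
  input patient_id $ claim_date :date9. dx_date :date9.;
  format claim_date dx_date date9.;
  datalines;
A 15DEC2008 01FEB2010
A 20JAN2009 01FEB2010
A 15JUN2009 01FEB2010
A 20MAR2010 01FEB2010
;
run;

/* Extra days kept on each side of the window; use 0 when rule-out is not invoked. */
%let ruleout_pad = 30;

data claims_window;
  set claims;
  /* Assumed window: same calendar day 12 months earlier through the day before diagnosis. */
  start_date = intnx('month', dx_date, -12, 'same');
  end_date   = dx_date - 1;
  format start_date end_date date9.;
  /* Subsetting IF: keep claims inside the padded window only. */
  if start_date - &ruleout_pad <= claim_date <= end_date + &ruleout_pad;
run;
```

Here the window is 01FEB2009 through 31JAN2010, so with the 30-day padding the claims dated 20JAN2009 and 15JUN2009 are kept and the other two are dropped.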

An example SAS program to build an input file is available here: comorbidity.input.file.example.sas.

These parameters are needed for the macro:

PATID - Variable name: Unique ID for each patient.

STARTDATE - Variable name: Date the comorbidity window starts, in SAS date format.

ENDDATE - Variable name: Date the comorbidity window ends, in SAS date format.

CLAIMDATE - Variable name: Date of the claim found on the claim file, in SAS date format. This can be created by using the MDY() function.

CLAIMTYPE - Variable name: the source of the claim record ('M'=MEDPAR, 'O'=OUTPAT, 'N'=NCH). Note: do not use DME claims.

DAYS - Variable name: contains the length of stay (in days) for hospital visits from MEDPAR.

DXVARLIST - List of variable names: the diagnosis codes in ICD-9 (e.g. DGN_CD1-DGN_CD25). If there are multiple variables, some of which cannot be included in a range, please list them using spaces to separate each single element or range (e.g. DGN_CD1-DGN_CD25 ADMDXCDE).

SXVARLIST - List of variable names: the surgery or procedure codes in ICD-9 (e.g. SRGCDE1-SRGCDE25). If there are multiple variables, some of which cannot be included in a range, please list them using spaces to separate each single element or range (e.g. SRGCDE1-SRGCDE25 PRCDR_CD1).

RULEOUT - Flag: Set this to 1 (or R) if the rule-out algorithm should be invoked; otherwise set it to 0 (or leave it blank).

An example call statement for the macro, with the input dataset name given first and the output dataset name given last, would be:

%COMORB(Claims,patient_id,start_date,end_date,claim_date,filetype,los,admdxcde dgn_cd1-dgn_cd25,surg1-surg25,R,Comorb);