HD*Calc Tutorial: Preparing Input Files Exercise 3
Measuring Health Disparities By Race using SEER Incidence Data showing 3-Year Moving Average
In this exercise, we will calculate several measures of health disparities by racial/ethnic group for lung and bronchus cancer for ages 45 and older using SEER data. Use incidence data from 1992 to 2005 to see if the disparities have narrowed or widened over that time period.
For this example, we will generate age-adjusted rates by sex, racial/ethnic group, and year for the cancer site of interest. Instead of showing changes in disparity for every individual year, we will use a 3-year moving average. This approach is useful for minimizing fluctuations in the standard errors calculated by HD*Calc.
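As a minimal sketch of what a 3-year moving average means for the 1992 to 2005 data, the Python snippet below lists overlapping 3-year windows, each labeled by its middle year. The function name `moving_windows` is hypothetical and is not part of SEER*Stat or HD*Calc; it only illustrates the year groupings.

```python
def moving_windows(first_year, last_year, width=3):
    """Return (label, start, end) tuples for overlapping year windows.

    Each window spans `width` consecutive years and is labeled by its
    middle year. Hypothetical helper for illustration only.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so that a middle year exists")
    half = width // 2
    windows = []
    # The first and last data years cannot be the middle of a full window.
    for middle in range(first_year + half, last_year - half + 1):
        windows.append((middle, middle - half, middle + half))
    return windows


windows = moving_windows(1992, 2005)
for label, start, end in windows:
    print(f"For {label} use {start}-{end}")
```

With data from 1992 to 2005, this gives 12 time points (1993 through 2004) rather than 14, because the first and last years have no complete 3-year window around them.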
Step A: Create a Rate File in SEER*Stat
- Open a new Rate session in SEER*Stat.
- On the Data tab, select an incidence database that covers the time period for which you want statistics. For example: Incidence - SEER 13 Regs Research Data, Nov 2007 Sub (1992-2005)
- On the Statistic tab, select Rates (Age-Adjusted) as the statistic and choose to Show Standard Errors and Confidence Intervals.
- Define the records to include in your analysis on the Selection tab. For this exercise, we want to include only ages 45 and older, and malignant lung and bronchus cancer.
- Use the Table tab to choose the variables by which you want your data organized.
- Create a user-defined variable for Sex that excludes the total rates for Both Sexes.
- You will need to create a merged variable for the following racial/ethnic groupings:
- White Non-Hispanic (excl AK)
- Black Non-Hispanic (excl AK)
- Asian or Pacific Islander
- American Indian (CHSDA only)
- Total Hispanic (excl AK)
- You need to define a user-defined variable for Year of Diagnosis. Define 3-year groupings with two years overlapping, and use the middle year as your label. For example:
- For 1993 use 1992-1994
- For 1994 use 1993-1995
- For 1995 use 1994-1996 and so on.
- Last Two Variables: HD*Calc expects the Disparity Groups (Race/Ethnicity) and the Time Points (Year of Diagnosis) to be the last two variables in the table, in that order.
- On the Output tab, set Display Rates as Cases Per to "100,000" and set the Number of Decimal Places for Rates/Trends to "0.000000001".
- Execute the session to create the results matrix of age-adjusted rates. You can download this sample matrix* and compare it with your results matrix: health.disp.table1.race.moving.average.sim
- Once the results matrix is created, the age-adjusted rates can be exported to a text (*.txt) data file. When you export the data, the associated information about the variable names, format, and the name and location of the data file will be saved in the dictionary (*.dic) file. Export the matrices with the following settings:
- Output Variables as: Numeric Representation
- Line Delimiter: DOS/Windows (CR/LF)
- Missing Character: Space
- Field Delimiter: Tab
- Check the boxes to Remove All Thousands Separators (Commas) and Remove Flags (Footnote), Prefix and Suffix Characters. Leave the other checkboxes unmarked.
* Your results may differ if you are using a different SEER Incidence database.
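Before importing the file into HD*Calc, it can help to confirm that the export matches the settings above (tab-delimited, CR/LF line endings, a space for missing values, no header row). The Python sketch below is one way to do that. The column names and the sample rows are made up for illustration; the actual fields and their order are listed in your dictionary (*.dic) file.

```python
import csv
import io

# Hypothetical column names; check the .dic file for the real layout.
COLUMNS = ["sex", "race", "year", "rate", "std_error"]

# Made-up sample rows in the exported format: tab-delimited, CR/LF line
# endings, numeric codes, and a single space for missing values.
SAMPLE = "0\t0\t1993\t100.5\t1.2\r\n0\t1\t1993\t \t \r\n"


def parse_export(text, columns):
    """Parse header-less, tab-delimited rows; blank fields become None."""
    rows = []
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip empty lines
        if len(fields) != len(columns):
            raise ValueError(
                f"line {line_number}: expected {len(columns)} fields, "
                f"found {len(fields)}"
            )
        rows.append(
            {
                name: (float(value) if value.strip() else None)
                for name, value in zip(columns, fields)
            }
        )
    return rows


rows = parse_export(SAMPLE, COLUMNS)
for row in rows:
    print(row)
```

To check a real export, read the text file with `open(path, newline="")` and pass its contents to `parse_export` together with the field names from your dictionary file.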
Step B: Import the Data into the HD*Calc Program
- When you start the HD*Calc application you will get a message that reminds you to open a data file in order to view the disparity measures.
- Select Open... from the File menu to open your data file. You will be prompted for a dictionary (*.dic) file. Select the dictionary file that was created when you exported the data in Step A above.
- When your file is opened, you will be taken to the HD*Calc Data Import dialog where you will provide all the information needed to identify the fields in your input file. In the edit box at the top please provide a Title for your input data. This title will be displayed with the resulting disparity measures.
- Use the Dictionary edit box to select a file for storing your data input specifications.
- The checkbox indicating that your Data File Contains Column Headers should not be checked and is disabled by default since headers were not exported with your data.
- The Statistics are Sorted by All Variables In Their Order box has been checked by default and is disabled. This will speed up retrieval of records during computation.
- The Fields Are Character Delimited radio button should be selected, and Tab should be selected as the Delimiter. No change is needed.
- Your fields should have the correct field type by default, but you can select them individually (one field at a time) and press the Change button to the right to view the details.
- Click OK.
Step C: View Disparity Measures In HD*Calc
- When the dialog opens, you will be asked to specify whether the disparity groups in your data are ranked (e.g., by income or education). Some disparity measures are presented only if the groups are ranked. Since this example uses Race/Ethnicity as the basis for the disparity groups, there is no inherent ranking, so press No as your response.
- Use the Selection dropdown list to select Male or Female sex from your input data. Whenever you choose from that list, all the disparity measures are re-calculated for the selected set of records.
- On the Disparity Groups tab, in the Ranking Disparity Groups box, confirm that the groups in your file are not ranked. The checkbox below it, which indicates that a higher rate means less healthy (more disease cases), should also be checked.
- On the Disparity Table tab you will see all the measures that have been calculated for your data. If you click on the title of a disparity measure, the help system will display a description of that measure.
- On the Disparity Chart tab, you can select any measures you wish to view in the graph. You can use the checkbox labeled Show Relative Change Over Time (top right) to show the percent change over time instead of the actual measures.
- The Data Table and Data Chart tabs show the rates read from your data, along with some additional fields calculated from your data for use in the computation of disparity measures.
- The Combined Chart tab can be used to present input data values and disparity measures on the same graph at the same time.
- The Pair Comparison tab allows you to select any two disparity groups to be compared. For the two groups, the Rate Difference and Rate Ratio are calculated and presented on the graph.
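The two measures on the Pair Comparison tab can be illustrated with a short Python sketch using their standard definitions: the rate difference is one group's rate minus the other's, and the rate ratio is one rate divided by the other. The function names and the rates below are made up for illustration, and HD*Calc's own output may include details (such as standard errors) that this sketch does not cover.

```python
def rate_difference(rate_a, rate_b):
    """Absolute disparity: rate of group A minus rate of group B."""
    return rate_a - rate_b


def rate_ratio(rate_a, rate_b):
    """Relative disparity: rate of group A divided by rate of group B."""
    if rate_b == 0:
        raise ZeroDivisionError("rate ratio is undefined when rate_b is 0")
    return rate_a / rate_b


# Made-up age-adjusted rates per 100,000 for two groups at two time points.
example_rates = {
    1993: (150.0, 100.0),
    1994: (120.0, 100.0),
}

for year, (group_a, group_b) in example_rates.items():
    rd = rate_difference(group_a, group_b)
    rr = rate_ratio(group_a, group_b)
    print(f"{year}: rate difference = {rd:.1f}, rate ratio = {rr:.2f}")
```

Looking at both measures together is useful because they can move differently over time: an absolute gap may shrink while the relative gap stays the same or grows.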