SEER*Stat Rate Exercise 3: Merged Variables

Merged variables are created by defining groupings from one or more variables in the database. Up to this point, we have stratified our results with user-defined variables based on groupings from a single standard variable, but in some cases it is impossible to get the exact groupings needed from one variable. The example below illustrates the need for merged variables.

Problem Statement

Use a merged variable to create a table showing 5 age-adjusted incidence rates for:

  • Breast cancer among women ages less than 50
  • Breast cancer among women ages 50 and older
  • Prostate cancer for men ages less than 65
  • Prostate cancer for men ages 65 and older
  • Lung and bronchus cancer (both sexes, all ages)

These rates should be age-adjusted to the 2000 US standard population and should be based on malignant cases diagnosed from 2000 through 2011 in the 18 SEER registries. Only include data in the research database, i.e. do not include July through December 2005 cases/populations from Louisiana.

Key Points

The sample output (Matrix 3a) is provided only as an illustration; you do not need to create that matrix. It shows why SEER*Stat's "merged variables" are needed. The problem can be solved correctly using the steps in the sections that follow.

  • A user-defined variable is an edited version of a single standard variable in the database. Merged variables are user-defined variables based on one or more variables.
  • Merged variables are typically used to stratify results by groupings that require more than one variable, such as site and histology, or site and extent of disease.
  • You can use standard user-defined variables to calculate the statistics for this problem. However, the matrix will contain other unwanted and possibly misleading statistics (like the rate for prostate cancer for both sexes in the sample output).
  • If you are not using merged variables, two user-defined variables must be created (see results matrix 3a): an age variable with five groupings (all ages, <50, 50+, <65, 65+) and a cancer site variable with three groupings (lung and bronchus, breast, prostate). The results are displayed for every combination of these variables and sex.
  • In this exercise we will create a single variable based on cancer site, sex, and age. It will have one grouping for each of the five rates listed above ("Lung and Bronchus Cancer for All Ages, Both Sexes", "Prostate Cancer for Men Age <65", etc.).
  • Due to the impact of Hurricane Katrina on Louisiana's population for the July - December 2005 time period, Louisiana cases diagnosed in that six-month period have been excluded from the research database. These cases are provided with the data, but they are considered supplemental data. SEER does not include these cases in most analyses; therefore, the default in SEER*Stat is to exclude them with the "Cases in Research Database" checkbox on the Selection tab. For more information, see Adjustments for Areas Impacted by Hurricanes Katrina and Rita on the SEER Web site.
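
To make the point about unwanted statistics concrete, the following Python sketch counts the cells produced when the two user-defined variables are crossed with sex. It is only an illustration, not part of SEER*Stat: the variable names are hypothetical, and it assumes the sex variable has three levels (both sexes combined, male, female).

```python
from itertools import product

# Groupings of the two user-defined variables described above.
ages = ["All ages", "<50", "50+", "<65", "65+"]
sites = ["Lung and Bronchus", "Breast", "Prostate"]
# Assumption: sex is displayed with a combined level plus each sex.
sexes = ["Male and female", "Male", "Female"]

# Without a merged variable, a rate is shown for every combination.
all_cells = list(product(sites, sexes, ages))

# The five rates the problem statement actually asks for.
wanted = {
    ("Breast", "Female", "<50"),
    ("Breast", "Female", "50+"),
    ("Prostate", "Male", "<65"),
    ("Prostate", "Male", "65+"),
    ("Lung and Bronchus", "Male and female", "All ages"),
}

unwanted = [cell for cell in all_cells if cell not in wanted]
print(len(all_cells), len(wanted), len(unwanted))  # 45 5 40
```

Under these assumptions only 5 of the 45 cells are needed; the other 40 include combinations such as prostate cancer for both sexes.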

Step 1:  Create a Rate Session

  • Start SEER*Stat.
  • From the File menu select New > Rate Session or use the Rate button on the toolbar.

Step 2:  Select a Database (Data Tab)

  • On the Data Tab select "Incidence - SEER 18 Regs Research Data + Hurricane Katrina Impacted Louisiana Cases, Nov 2013 Sub (2000-2011) <Katrina/Rita Population Adjustment>".
    • Due to the impact of Hurricane Katrina on Gulf state populations, SEER has created adjusted populations for 2005. All November 2013 databases that contain populations include "<Katrina/Rita Population Adjustment>" in their name, even if the geographies covered were unaffected. In this exercise we are including the affected areas. For more information, see Adjustments for Areas Impacted by Hurricanes Katrina and Rita on the SEER Web site.

Step 3:  Choose the Statistics to Display (Statistic Tab)

    • In the Statistics box, select Rates (Age-Adjusted).
    • In the Parameters box:
      • Make sure that the Standard Population is set to "2000 US Std Population (19 age groups - Census P25-1130)".
      • Make sure the Age Variable is set to "Age recode with <1 year olds"
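
For readers who want to see what an age-adjusted rate is, here is a minimal Python sketch of direct standardization: each age-specific rate is weighted by that age group's share of the standard population. The function name and the three-group numbers are made up for illustration; the exercise itself uses the 19 age groups of the 2000 US standard population, and SEER*Stat's own computation may differ in detail.

```python
def age_adjusted_rate(cases, populations, std_populations, per=100_000):
    """Directly standardized rate per `per` persons.

    Each argument is a sequence with one entry per age group, in the
    same order. Weights are normalized over the groups passed in.
    """
    if not (len(cases) == len(populations) == len(std_populations)):
        raise ValueError("all inputs must cover the same age groups")
    total_std = sum(std_populations)
    weighted = sum(
        (d / n) * (s / total_std)
        for d, n, s in zip(cases, populations, std_populations)
    )
    return weighted * per


# Hypothetical data for three age groups (not SEER data).
cases = [10, 50, 200]
populations = [100_000, 80_000, 50_000]
std_populations = [500, 300, 200]

adjusted = age_adjusted_rate(cases, populations, std_populations)
crude = sum(cases) / sum(populations) * 100_000
print(round(adjusted, 2), round(crude, 2))
```

The adjusted and crude rates differ because the hypothetical study population has a different age distribution from the hypothetical standard.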

Step 4:  Defining the Analysis Cohort (Selection Tab)

    Specific click-by-click instructions for creating individual selection statements were given in previous tutorials (see Frequency Exercise 1a). Use those techniques to create your selection statement.

    Make sure that the Malignant Behavior and the Cases in Research Database options are checked in the Select Only box. The Known Age option is always checked and disabled in rate sessions because all records must have values that are included in the US Population and Standard Population data. Unknown age is not a valid value, so records with unknown ages are excluded from the analysis.

    For this problem you should create a selection statement based on cancer site. The system will process the data more efficiently if you specify the cancer sites as selections on the Selection Tab rather than relying on the definitions of variables used on the Table Tab.

    Make the following selections in the Other (Case Files) box. Hold down the Ctrl key to select multiple items from the list.

    {Site and Morphology.Site recode ICD-O-3/WHO 2008} = 'Lung and Bronchus','Breast','Prostate'

Step 5:  Create a Merged Variable

    Use the dictionary editor to create a merged variable with five groupings: Breast Cancer (females, <50), Breast Cancer (females, 50+), Prostate Cancer (males, <65), Prostate Cancer (males, 65+), and Lung and Bronchus Cancer (both sexes, all ages).

    1. Open the dictionary editor by selecting Dictionary from the File menu, or using the Dictionary button on the toolbar.
    2. Click the Merge button to open the Edit Merged Variable window.
    3. Enter the following name for the merged variable in the Name field: "Breast, Prostate, Lung (by age, sex)"
    4. Click the Add button to open the New Merged Grouping window. Notice that the New Merged Grouping window is the same as the Selection window used to define your analysis cohort on the Selection Tab. Selection statements are used to define the groupings of a merged variable.
    5. Add the first grouping (Breast Cancer - females, <50) by creating a selection statement for the subset of cases the new grouping will contain. You must use three variables to define this breast cancer grouping: Site recode ICD-O-3/WHO 2008, Age recode with <1 year olds, and Sex. Your selection statement should look like this:
      {Age at Diagnosis.Age recode with <1 year olds} = '00 years','01-04 years','05-09 years','10-14 years','15-19 years','20-24 years','25-29 years','30-34 years','35-39 years','40-44 years','45-49 years'
      And {Race, Sex, Year Dx, Registry, County.Sex} = ' Female'
      And {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Breast'
    6. Click OK to return to the Edit Merged Variable window.
    7. Give your new grouping a meaningful name, such as Breast (females, <50), since the name will be used as the label in your output matrix.
    8. Click the Add button and repeat these steps to create the four additional groupings needed for this exercise:
      Breast (females, 50+)
      {Age at Diagnosis.Age recode with <1 year olds} = '50-54 years','55-59 years','60-64 years','65-69 years','70-74 years','75-79 years','80-84 years','85+ years'
      And {Race, Sex, Year Dx, Registry, County.Sex} = ' Female'
      And {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Breast'

      Prostate (males, <65)
      {Age at Diagnosis.Age recode with <1 year olds} = '00 years','01-04 years','05-09 years','10-14 years','15-19 years','20-24 years','25-29 years','30-34 years','35-39 years','40-44 years','45-49 years','50-54 years','55-59 years','60-64 years'
      And {Race, Sex, Year Dx, Registry, County.Sex} = ' Male'
      And {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Prostate'

      Prostate (males, 65+)
      {Age at Diagnosis.Age recode with <1 year olds} = '65-69 years','70-74 years','75-79 years','80-84 years','85+ years'
      And {Race, Sex, Year Dx, Registry, County.Sex} = ' Male'
      And {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Prostate'

      Lung (both sexes, all ages)
      {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Lung and Bronchus'

      The lung and bronchus grouping should be defined using only the Site recode ICD-O-3/WHO 2008 variable. There is no need to use the age or sex variables to define this grouping since we are showing the lung and bronchus rate for all ages and both sexes.
    9. When you have finished defining the merged variable, close the Edit Merged Variable window by clicking OK. Notice that a new folder named "Merged" has been added to the dictionary.
    10. Close the dictionary window and add your new merged variable as a Row variable on the Table Tab.
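
The logic of the merged variable built in the steps above can be sketched in Python as a set of named selection statements, each one a test applied to a case record. This is only an illustration of the idea: the record keys ("site", "sex", "age") and the helper `assign` are hypothetical, and SEER*Stat does not expose such an interface. The age labels are those used in the selection statements above.

```python
# The 19 groups of "Age recode with <1 year olds", in order.
AGE_GROUPS = [
    "00 years", "01-04 years", "05-09 years", "10-14 years", "15-19 years",
    "20-24 years", "25-29 years", "30-34 years", "35-39 years", "40-44 years",
    "45-49 years", "50-54 years", "55-59 years", "60-64 years", "65-69 years",
    "70-74 years", "75-79 years", "80-84 years", "85+ years",
]

# Split the ordered list at the first group of the older range.
cut_50 = AGE_GROUPS.index("50-54 years")
cut_65 = AGE_GROUPS.index("65-69 years")
UNDER_50, AGE_50_PLUS = set(AGE_GROUPS[:cut_50]), set(AGE_GROUPS[cut_50:])
UNDER_65, AGE_65_PLUS = set(AGE_GROUPS[:cut_65]), set(AGE_GROUPS[cut_65:])

# Each grouping is a name plus a selection test on a record (a dict).
GROUPINGS = {
    "Breast (females, <50)": lambda r: (
        r["site"] == "Breast" and r["sex"] == "Female" and r["age"] in UNDER_50),
    "Breast (females, 50+)": lambda r: (
        r["site"] == "Breast" and r["sex"] == "Female" and r["age"] in AGE_50_PLUS),
    "Prostate (males, <65)": lambda r: (
        r["site"] == "Prostate" and r["sex"] == "Male" and r["age"] in UNDER_65),
    "Prostate (males, 65+)": lambda r: (
        r["site"] == "Prostate" and r["sex"] == "Male" and r["age"] in AGE_65_PLUS),
    # Site only: no age or sex restriction is needed for this grouping.
    "Lung (both sexes, all ages)": lambda r: r["site"] == "Lung and Bronchus",
}


def assign(record):
    """Return the names of all groupings whose test the record passes."""
    return [name for name, test in GROUPINGS.items() if test(record)]


print(assign({"site": "Breast", "sex": "Female", "age": "45-49 years"}))
print(assign({"site": "Lung and Bronchus", "sex": "Male", "age": "70-74 years"}))
print(assign({"site": "Breast", "sex": "Male", "age": "60-64 years"}))
```

In this sketch a male breast cancer record matches no grouping, so it would not contribute to any of the five rows.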

    Learn More...

    The main features of the Edit Merged Variable window are defined in the SEER*Stat Help system.

Step 6:  Specify a Title (Output Tab)

    • Move to the Output Tab.
    • Enter the following title:
      SEER 18 Incidence, 2000-2011
      Rate Exercise 3b (merged variables)

Step 7:  Execute SEER*Stat

    • Use the Execute button or select Execute from the Session menu to execute the session.
    • Compare your results to this SEER*Stat matrix file: Rate Exercise 3 Results Matrix.