SEER*Stat Rate Exercise 4a: Complex Selection Statements

In this exercise, you will define a complex selection statement to produce statistics by expanded race using the SEER Program's guidelines. You will also create selection statements using variables with unlabeled values.

If you are just getting started with SEER*Stat, be sure to do the introductory tutorials first.

Problem Statement

Create a table showing frequencies and incidence rates (age-adjusted to the 2000 US standard population) for malignant esophageal squamous cell carcinoma. Include only microscopically confirmed cases. Calculate these statistics for persons diagnosed from 1992 through 2011 in the SEER 13 Registries. Do not show statistics based on fewer than 30 cases.

Display the statistics by race and sex. Show data for males and females separately but not combined. Use the following races: "White", "Black", "American Indian", "Asian or Pacific Islander". Include standard errors and confidence intervals in the table.

Define squamous cell carcinoma as: Histologic Type ICD-O-3 = 8070-8078,8083-8084

Key Points and Reminders

  • This exercise requires that you create a complex selection statement to include the correct race and region combinations. When producing statistics for American Indians using SEER Incidence data, SEER frequently includes only cases in a Contract Health Service Delivery Area (CHSDA). The selection statement will use parentheses and the "OR" conjunction.
    • Starting with the November 2012 submission, the CHSDA 2012 variable is used. In the November 2006-2011 submissions, CHSDA 2006 was used. Refer to the County Attributes web page for more information.
  • In this exercise, you will make selections by specifying a range of numeric values to define squamous cell carcinoma using the Histologic Type ICD-O-3 variable. In previous exercises, you selected values from a list of values with labels. The ICD-O-3 Hist/behav variable has labeled values and could also be used for this selection.

Step 1:  Create a new Rate Session

  • Start SEER*Stat.
  • From the File menu select New > Rate Session or use the Rate button on the toolbar.

Step 2:  Select a Database (Data Tab)

  • On the Data Tab select "Incidence - SEER 13 Regs Research Data, Nov 2013 Sub (1992-2011) <Katrina/Rita Population Adjustment>".
    • Due to the impact of Hurricane Katrina on Gulf state populations, SEER has created adjusted populations for 2005. All databases that contain populations include "<Katrina/Rita Population Adjustment>" in their name, even if the geographies covered were unaffected (which is the case in this exercise). For more information, see Adjustments for Areas Impacted by Hurricanes Katrina and Rita on the SEER Web site.
  • Make sure the Age Variable is set to "Age recode with <1 year olds."

Learn More...

  • Databases distributed with SEER*Stat use names designed to describe the data. The various parts of this exercise's database name indicate the following:
    • Incidence - The database contains cancer incidence data.
    • SEER 13 Regs - The database contains data for the "SEER 13 Registries" as defined in SEER Registry Groupings for Analyses.
    • Research Data, Nov 2013 Sub - This is the version of the database available to researchers outside of the SEER program. The data were submitted to the SEER program by the registries in November 2013.
    • (1992-2011) - These are the years of diagnosis for the cases included in the database.
  • The suggested citation for the database selected on the Data Tab is shown at the bottom of the screen. For more information, see Citations for SEER Databases and SEER*Stat Software.

Step 3:  Choose the Statistics to Display (Statistic Tab)

  • Move to the Statistic Tab.
  • In the Statistics box, select Rates (Age-Adjusted).
  • In the Parameters box:
    • Make sure that the Standard Population is set to "2000 US Std Population (19 age groups - Census P25-1130)".
    • Make sure the Age Variable is set to "Age recode with <1 year olds."
    • Check the Show Standard Errors and Confidence Intervals box.
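For readers who want to see what "age-adjusted" means numerically, here is a minimal sketch of direct standardization: each age group's crude rate is weighted by that group's share of a standard population, and the weighted rates are summed. The function name, the three age groups, and all numbers are hypothetical illustrations. They are not SEER data and not the 19-group 2000 US standard, and the sketch does not reproduce SEER*Stat's standard error or confidence interval calculations.

```python
def age_adjusted_rate(counts, populations, std_weights, per=100_000):
    """Directly standardized rate: sum of (standard weight * crude rate) by age group."""
    total_weight = sum(std_weights)
    rate = 0.0
    for cases, pop, weight in zip(counts, populations, std_weights):
        # Normalize the weight so the weights sum to 1, then apply it
        # to the crude rate (cases / population) for this age group.
        rate += (weight / total_weight) * (cases / pop)
    return rate * per


# Hypothetical three-group example (illustrative numbers only).
counts = [2, 10, 30]                       # cases per age group
populations = [100_000, 50_000, 20_000]    # population at risk per age group
std_weights = [0.5, 0.3, 0.2]              # made-up standard population shares

rate = age_adjusted_rate(counts, populations, std_weights)
print(f"Age-adjusted rate: {rate:.1f} per 100,000")
```

Because the weights come from a fixed standard rather than from each group's own age structure, rates for different races or sexes can be compared without differences in age distribution driving the result.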

Step 4:  Define the Analysis Cohort (Selection Tab)

  • Move to the Selection Tab. Specific click-by-click instructions for creating individual selection statements were given in previous tutorials (see Frequency Exercise 1a).
  • Make sure that the Malignant Behavior option is checked in the Select Only box at the top of the tab.
  • For this problem you will need to select based on race, CHSDA region, behavior, cancer site, histology, and diagnostic confirmation. Use the Find button to locate a variable based on its name or a value (for example, if you search for "microscopically confirmed" you will find the "Diagnostic Confirmation" variable).
  • In the "Race, Sex, Year Dx, Registry, County (Pop, Case Files)" box, use the conjunctions "AND" and "OR", and group lines using parentheses, to make the following selections:
  • {Race, Sex, Year Dx, Registry, County.Race recode (W, B, AI, API)} = 'White','Black','Asian or Pacific Islander'
    OR ({Race, Sex, Year Dx, Registry, County.Race recode (W, B, AI, API)} = 'American Indian/Alaska Native'
    AND {Race, Sex, Year Dx, Registry, County.CHSDA 2012} = 'CHSDA' )

    Note: Parentheses around a group of lines tell SEER*Stat to evaluate those lines first when processing the selection statement. When using parentheses, you must first create the selection statement lines and then add the parentheses. To add parentheses to a selection statement, click and drag your cursor to highlight the lines you want to work with, then click Add (...) to enclose those lines in parentheses.
  • In the "Other (Case Files)" box, make the following case selections:
  • {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Esophagus'
    AND {Site and Morphology.Histologic Type ICD-O-3} = 8070-8078,8083-8084
    AND {Site and Morphology.Diagnostic Confirmation} = 'Microscopically confirmed'
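The logic of the selections above can be sketched as a single boolean test on one record. This is only an illustration of how the parentheses group the "AND" with the American Indian/Alaska Native line before the "OR" is applied. The record field names, the "Non-CHSDA" label, and the "Malignant" behavior label are assumptions made for this sketch, not SEER*Stat identifiers.

```python
# Races selected for every registry and year in the database.
WHOLE_DATABASE_RACES = {"White", "Black", "Asian or Pacific Islander"}

# Histologic Type ICD-O-3 = 8070-8078,8083-8084
SCC_HISTOLOGIES = set(range(8070, 8079)) | {8083, 8084}


def in_cohort(record):
    """Return True if a (hypothetical) case record meets every selection."""
    # The inner parentheses mirror the Add (...) grouping: the CHSDA
    # restriction applies only to American Indian/Alaska Native records.
    race_ok = (
        record["race"] in WHOLE_DATABASE_RACES
        or (
            record["race"] == "American Indian/Alaska Native"
            and record["chsda"] == "CHSDA"
        )
    )
    case_ok = (
        record["behavior"] == "Malignant"
        and record["site"] == "Esophagus"
        and record["histology"] in SCC_HISTOLOGIES
        and record["dx_confirmation"] == "Microscopically confirmed"
    )
    return race_ok and case_ok


example = {
    "race": "American Indian/Alaska Native",
    "chsda": "Non-CHSDA",          # assumed label for counties outside a CHSDA
    "behavior": "Malignant",
    "site": "Esophagus",
    "histology": 8070,
    "dx_confirmation": "Microscopically confirmed",
}
print(in_cohort(example))                       # excluded: not in a CHSDA county
print(in_cohort({**example, "chsda": "CHSDA"}))  # included
```

Without the inner grouping, the CHSDA condition could end up attached to the wrong lines, for example restricting White, Black, and Asian or Pacific Islander records as well.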

Learn More...

  • Through the use of the complex selection statements, you were able to define an analysis cohort which includes:
    1. All records for Whites, Blacks, and Asian/Pacific Islanders for all registries and years in the selected database (SEER 13 registries, 1992-2011)
    2. All records for American Indians within the CHSDA regions.
  • When you selected the "Histologic Type" variable, the Values box in the Selection window changed format. The valid values for the Histologic Type variable are shown just above the Values box. It is not practical to list all values for variables with a large number of numeric values. If you want to specify a range of values for an unlabeled variable, use a hyphen to define the range and use commas to separate multiple values or ranges (e.g. 1-5,8-19).
  • The "ICD-O-3 Hist/behav" variable has labeled values for each ICD-O-3 code, which is useful if you want to learn more about the squamous cell carcinoma definition. If you are unsure which ranges define squamous cell carcinoma, you could use this variable in the case selection instead. The selection statement would be:
    {Site and Morphology.ICD-O-3 Hist/behav} =
    '8070/2: Squamous cell carcinoma in situ, NOS',
    '8070/3: Squamous cell carcinoma, NOS',
    '8071/2: Squamous cell carcinoma in situ, keratinizing, NOS',
    '8071/3: Squamous cell carcinoma, keratinizing, NOS',
    '8072/2: Squamous cell CIS, large cell, nonkeratinizing',
    '8072/3: Squamous cell ca., large cell, nonkeratinizing',
    '8073/2: Squamous cell CIS, small cell, nonkeratinizing',
    '8073/3: Squamous cell ca., small cell, nonkeratinizing',
    '8074/2: Squamous cell carcinoma in situ, spindle cell',
    '8074/3: Squamous cell carcinoma, spindle cell',
    '8075/2: Squamous cell carcinoma in situ, adenoid',
    '8075/3: Squamous cell carcinoma, adenoid',
    '8076/2: Squamous cell CIS with questionable stromal invasion',
    '8076/3: Squamous cell carcinoma, micro-invasive',
    '8077/2: Squamous intraepithelial neoplasia, grade III',
    '8077/3: Squamous cell ca. & Grade III',
    '8078/3: Squamous cell carcinoma with horn formation',
    '8083/2: Basaloid squamous cell carcinoma in situ',
    '8083/3: Basaloid squamous cell carcinoma',
    '8084/3: Squamous cell carcinoma, clear cell type'
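To make the hyphen-and-comma range syntax concrete, the following sketch expands a range string into the individual numeric values it covers. The function name `expand_ranges` is hypothetical, and the parsing is an assumption about how such a string can be read. It is not SEER*Stat's own implementation.

```python
def expand_ranges(spec):
    """Expand a string such as '1-5,8-19' into the set of integers it covers."""
    values = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A hyphen defines an inclusive range: low through high.
            low, high = part.split("-")
            values.update(range(int(low), int(high) + 1))
        else:
            # A bare number is a single value.
            values.add(int(part))
    return values


scc_codes = expand_ranges("8070-8078,8083-8084")
print(sorted(scc_codes))
```

The expansion yields 11 histology codes (8070 through 8078, plus 8083 and 8084), the same histology numbers that appear before the slash in the labeled "ICD-O-3 Hist/behav" values listed above.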

Step 5:  Create User-Defined Variables to use on the Table Tab

For this exercise, you need to define two new variables, one for race and one for sex.

Open the Data Dictionary.

  1. Select the "Race recode (W,B,AI,API)" variable from the "Race, Sex, Year Dx, Registry, County" category and use the Create button to open the Edit Variable window.
    • Change the Name of the variable to: "Race recode (W,B,AI,API) w/o unks".
    • Delete the "Unknown" grouping in the Groupings box.
    • When you are finished, click the OK button.
  2. Select the "Sex" variable from the "Race, Sex, Year Dx, Registry, County" category and use the Create button to open the Edit Variable window.
    • Change the Name of the variable to: "Sex (no total)".
    • Delete the "Male and Female" grouping in the Groupings box.
    • When you are finished, click the OK button.
    • Close the dictionary.

Step 6:  Set the User-Defined Variables as Row Variables (Table Tab)

  • Use the "+" symbol to expand the User-defined category in the Available Variables box at the bottom of the Table Tab.
  • Select "Race recode (W,B,AI,API) w/o unks" from the "User-Defined" category, then add it as a row variable.
  • Next, select "Sex (no total)" and add it to the row dimension as well.

Step 7:  Specify a Title and Hide Statistics (Output Tab)

  • Move to the Output Tab.
  • Enter the following title:
  • Malignant Esophageal Squamous Cell Carcinoma
    Microscopically Confirmed Cases Only, 1992-2011
    SEER 13 for White, Black, API
    SEER 13 (incl. CHSDA only) for AI/AN
    Rate Exercise 4a
  • Check the option to "Hide Statistics When Fewer Than 25 Cases", and change the limit from 25 to 30 cases.
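The effect of the case-count limit can be sketched as a simple threshold check: a cell with fewer than 30 cases shows no statistic, while a cell with exactly 30 cases is displayed. The function name and the use of `None` for a hidden cell are assumptions for illustration. Exactly which statistics SEER*Stat hides and how it marks them in the matrix are not shown here.

```python
def displayed_rate(cases, rate, min_cases=30):
    """Return the rate, or None when it is based on fewer than min_cases cases."""
    return rate if cases >= min_cases else None


# Hypothetical cells: (case count, rate per 100,000).
for cases, rate in [(29, 1.2), (30, 1.3), (412, 4.8)]:
    print(cases, displayed_rate(cases, rate))
```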

Step 8:  Create the Matrix and Re-order the Rows

  • Use the Execute button or select Execute from the Session menu to execute the session.
  • A dialog will display the progress of the job. When the job completes, a SEER*Stat matrix window will open containing the output table.
  • The output table contains two row variables (race and sex). The outermost row variable is the first variable listed as a row variable on the session's Table Tab. The innermost is the second row variable on the Table Tab.
  • Change the order of the row variables. From the Matrix menu, select Order and then Row.
  • Select the first variable listed and click the Move Down button to switch the order of the variables.
  • Click OK.
  • The variable you moved down is now the inner row variable in your results matrix.
  • Use the Save As command on the File menu to save the matrix for use in the next exercise. Enter "Rate Exercise 4a" as the filename. SEER*Stat will assign the "sim" extension to indicate that this is a "SEER*Stat Rate Matrix" file.
  • Compare your results to this SEER*Stat matrix file: Exercise Matrix 4a Results.
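The difference between the outer and inner row variable can be sketched as a nested ordering: the outer variable changes slowest and the inner variable cycles within each outer value. The function name `nested_rows` and the "Male" and "Female" labels are assumptions for illustration, and the race labels are taken from the problem statement. This shows only the ordering idea, not the layout of the actual matrix.

```python
from itertools import product

races = ["White", "Black", "American Indian", "Asian or Pacific Islander"]
sexes = ["Male", "Female"]


def nested_rows(outer, inner):
    """List (outer, inner) row labels; the inner variable varies fastest."""
    return list(product(outer, inner))


# Race listed first on the Table Tab: race is the outer row variable.
race_outer = nested_rows(races, sexes)

# After moving race down: sex becomes the outer row variable.
sex_outer = nested_rows(sexes, races)

for row in sex_outer:
    print(row)
```

Both orderings contain the same eight race-and-sex combinations. Only the grouping of the rows changes.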

Learn More...

  • The Matrix menu gives you the opportunity to customize your results, as well as export the results for use in other applications. See Results Matrix in the SEER*Stat help system for more information.