Race Recode Changes
For the 1973-2005 SEER Research Data (November 2007 Submission) and Later Releases

The algorithms for creating the race recode variables in the SEER incidence and US mortality data were modified starting with the November 2005 submission of data.  All of the variable names within the SEER*Stat and SEER*Prep software were modified for clarity and to avoid compatibility issues between submissions of data.

Available Race Variables

Race Recode (White, Black, Other)

  • White
  • Black
  • Other (American Indian/AK Native, Asian/Pacific Islander)
  • Unknown

Race Recode (W, B, AI, API)

  • White
  • Black
  • American Indian/Alaska Native
  • Asian or Pacific Islander
  • Unknown

Origin Recode NHIA (Hispanic, Non-Hisp)

  • Non-Spanish-Hispanic-Latino
  • Spanish-Hispanic-Latino
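
The relationship between the two race recodes listed above can be sketched as a simple lookup: the three-category recode folds American Indian/Alaska Native and Asian or Pacific Islander into a single "Other" value. This is only an illustration; the string labels and the function name collapse_race_recode are assumptions made here, not SEER*Stat's internal codes.

```python
# Illustrative sketch only: labels are taken from the value lists above,
# but the actual numeric codes used by SEER*Stat are not shown in this document.
OTHER = "Other (American Indian/AK Native, Asian/Pacific Islander)"

FOUR_TO_THREE = {
    "White": "White",
    "Black": "Black",
    "American Indian/Alaska Native": OTHER,
    "Asian or Pacific Islander": OTHER,
    "Unknown": "Unknown",
}


def collapse_race_recode(four_category: str) -> str:
    """Map a Race Recode (W, B, AI, API) value to Race Recode (White, Black, Other)."""
    return FOUR_TO_THREE[four_category]
```

A value outside the five listed categories raises KeyError, which makes unexpected input visible instead of silently assigning it to a category.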

Race Recode Definition

For incidence and mortality rate calculations, we recoded detailed race information into four major categories in order to make them compatible with available annual population estimates used as denominators for the rates. These categories are:

  • White
  • Black
  • American Indian/Alaska Native
  • Asian or Pacific Islander

The available race codes for the fields in the underlying incidence and mortality data have changed over the years. For some years, both the SEER incidence and NCHS mortality data included a code for “all other races” even though every race already had its own code, so the “all other races” code was not needed. Nevertheless, some cases and deaths were coded to this category. Starting with the 2010 data (November 2012 submission), these incidence cases are coded as “unknown” race. In prior incidence databases, these cases were coded as “Other - unspecified (1991+)”. In mortality databases, these deaths are coded as “Other - unspecified (1978-1991)”.

Starting with data through 2005 (November 2007 submission), the “Race/ethnicity” variable used to create the race recodes in the SEER incidence data was revised. This field is created from the Race1 and Indian Health Service (IHS) Link variables. If Race1 is white, unknown, or other and the IHS Link is positive, Race/ethnicity is set to American Indian/Alaska Native; otherwise, Race/ethnicity is set to the Race1 value. The previous method is described in the documentation for the 1973-2004 SEER Research Data (November 2006 submission).
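
The rule in the preceding paragraph can be written as a short function. This is a minimal sketch under stated assumptions: Race1 is represented by plain text labels and the IHS Link by a boolean, whereas the real fields use coded values that are not given in this document. The name derive_race_ethnicity is hypothetical.

```python
# Illustrative sketch only: real Race1 and IHS Link fields are coded values;
# plain labels and a boolean are used here as stand-ins.
AIAN = "American Indian/Alaska Native"

# Race1 values that a positive IHS Link is allowed to override.
OVERRIDABLE_RACE1 = {"White", "Unknown", "Other"}


def derive_race_ethnicity(race1: str, ihs_link_positive: bool) -> str:
    """Return the Race/ethnicity value derived from Race1 and the IHS Link."""
    if ihs_link_positive and race1 in OVERRIDABLE_RACE1:
        return AIAN
    # In every other situation the Race1 value is carried through unchanged.
    return race1
```

For example, a case with Race1 of White and a positive IHS Link is assigned American Indian/Alaska Native, while a case with Race1 of Black keeps its Race1 value whatever the link result.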

Spanish-Hispanic-Latino Ethnicity

Hispanic ethnicity is not mutually exclusive of the race categories: a person classified as Hispanic may also be White, Black, Asian/Pacific Islander, or American Indian/Alaska Native.

Incidence data for Hispanics are based on the NAACCR Hispanic Identification Algorithm (NHIA). When producing statistics using SEER incidence data for Hispanic ethnicity, we exclude cases from the Alaska Native Registry.

For state exclusions that SEER uses when producing Hispanic (and non-Hispanic) mortality rates, see Policy for Calculating Hispanic Mortality.

Combining Race and Ethnicity in Rate Analyses

Some SEER incidence and mortality databases in SEER*Stat are now linked to both race (White, Black, AI/AN, API) and Hispanic origin within the same database. While this makes it possible to produce rates for the 8 combinations of these variables (4 race categories by 2 origin categories), the SEER Program does not recommend using all of the combinations. SEER reports Hispanic/non-Hispanic rates only for all races combined, white, and non-white.
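
To make the count concrete, the sketch below enumerates the race-by-origin combinations and shows one way to fold a race category into the coarser groups for which SEER reports Hispanic/non-Hispanic rates. The labels, the function name reporting_groups, and the treatment of "non-white" as every listed race other than White are assumptions made for illustration.

```python
from itertools import product

# Illustrative labels taken from the recode lists in this document.
RACES = [
    "White",
    "Black",
    "American Indian/Alaska Native",
    "Asian or Pacific Islander",
]
ORIGINS = ["Spanish-Hispanic-Latino", "Non-Spanish-Hispanic-Latino"]

# Every (race, origin) pair that a linked database makes available.
ALL_COMBINATIONS = list(product(RACES, ORIGINS))


def reporting_groups(race: str) -> list[str]:
    """Return the coarser race groups a case contributes to when rates
    are split by Hispanic origin (assumption: non-white means not White)."""
    return ["All races", "White" if race == "White" else "Non-White"]
```

Here len(ALL_COMBINATIONS) is 8, while reporting_groups yields only three distinct race groups, each of which can then be split by Hispanic origin.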

American Indian/Alaska Native Statistics

When producing statistics using SEER incidence data for American Indians/Alaska Natives, SEER frequently includes only cases that are in a Contract Health Service Delivery Area (CHSDA).

Starting with the 1973-2010 SEER Research Data (November 2012 submission), CHSDA 2012 is used. For data through 2004-2009 (November 2006-2011 submissions), CHSDA 2006 was used.
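
The correspondence between submission year and CHSDA definition described above can be sketched as a lookup. This is an illustration only: the function name chsda_version is hypothetical, the mapping covers just the submissions named in this document, and releases after this document was written may use a different CHSDA definition.

```python
from typing import Optional


def chsda_version(submission_year: int) -> Optional[str]:
    """Return the CHSDA definition for a November submission year,
    based only on the ranges stated in this document."""
    if submission_year >= 2012:
        # Assumption: "starting with" is read as 2012 and later.
        return "CHSDA 2012"
    if 2006 <= submission_year <= 2011:
        return "CHSDA 2006"
    # Earlier submissions are not covered by the text, so nothing is guessed.
    return None
```

Returning None for submissions before November 2006 keeps the sketch from asserting anything the document does not state.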

The following spreadsheet has the CHSDA variable definitions used in SEER*Stat: [MS Excel File] [PDF File]