Updates to SEER Data, 1973-2008

This page details revisions to the 1973-2008 SEER Research Data, November 2010 Submission.

October 28, 2011

Hispanic Coding Issue

A problem was identified in the coding of Hispanic ethnicity in the SEER incidence data from the November 2010 submission. Because of an error in calculating the NAACCR Hispanic Identification Algorithm (NHIA)-derived Hispanic origin for the four California registries, many cases were coded incorrectly; most of these were Hispanic cases coded as non-Hispanic. The problem involves two fields, "NHIA Derived Hisp Origin" and "Origin recode NHIA (Hispanic, Non-Hisp)", and is limited to data from the four California registries. Any analysis conducted before October 28, 2011 that uses either of these fields together with data from any of these registries will need to be updated.
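The condition in the last sentence above can be written as a small check. The following Python sketch is illustrative only: the helper name and the idea of describing an analysis as a set of field names plus a set of registry names are assumptions, not part of any SEER tool. The four registry names are taken from the California registries listed in the October 6, 2011 entry on this page, on the assumption that these are the four registries meant here.

```python
# Fields named in the notice above.
AFFECTED_FIELDS = {
    "NHIA Derived Hisp Origin",
    "Origin recode NHIA (Hispanic, Non-Hisp)",
}

# Assumption: the four California registries, named as in the
# October 6, 2011 entry on this page.
CALIFORNIA_REGISTRIES = {
    "Greater California",
    "Los Angeles",
    "San Francisco-Oakland",
    "San Jose-Monterey",
}


def needs_hispanic_update(fields_used, registries_used):
    """Return True if an analysis run before October 28, 2011 should be redone.

    Hypothetical helper: an analysis is affected only when it uses at least
    one affected field AND includes data from at least one California registry.
    """
    uses_field = bool(AFFECTED_FIELDS & set(fields_used))
    uses_registry = bool(CALIFORNIA_REGISTRIES & set(registries_used))
    return uses_field and uses_registry
```

An analysis that uses one of the two fields but only non-California registries, or California data without either field, would not be flagged by this check.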

Scope of Regional Lymph Node Surgery Issue

A change was made to the November 2010 SEER research data to remove information pertaining to scope of regional lymph node surgery. The change was for breast cancer cases only, defined strictly by primary site (C50.0-C50.9; represented as 500-509 in the files). For more information, refer to the Scope of Regional Lymph Node Surgery page.
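The site-based definition above can be sketched as a simple range test. This Python example is a minimal illustration; the function name is hypothetical, and it assumes the primary site arrives either in the dotted form ("C50.3") or in the three-digit form used in the files ("503").

```python
def is_breast_primary_site(code):
    """Return True if a primary site code falls in C50.0-C50.9 (500-509).

    Hypothetical helper. Accepts either the dotted form ("C50.3")
    or the three-digit form used in the files ("503").
    """
    # Normalize "C50.3" -> "503"; leave "503" unchanged.
    digits = str(code).strip().upper().lstrip("C").replace(".", "")
    if len(digits) != 3 or not digits.isdigit():
        return False
    return 500 <= int(digits) <= 509
```

Because the definition is strictly by primary site, no other field (such as histology) enters the test.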

SEER Cancer Statistics (updated 11/10/2011)

The Hispanic coding issue affected calculated statistics provided throughout the SEER web site. On November 10, 2011, the statistics were updated to correct the Hispanic incidence problem; the corrected statistics are available in the SEER Cancer Statistics Review 1975-2008, Cancer Stat Fact Sheets, Fast Stats, and the Cancer Query Systems (CanQues). Lifetime risk statistics through 2008 were also released.

The error affected only statistics calculated from incidence data; the 2008 mortality statistics released on October 20, 2011 were not affected.

SEER Research Data Users

If you requested the data prior to October 28, 2011:

  • If you use client-server mode, you automatically have access to the updated data.
  • If you received the data on DVD or as downloaded compressed data files:
    • To get a new DVD, submit a request. If you have already submitted an agreement form for the November 2010 submission, you do not need to submit another.
    • Alternatively, download the updated compressed data files from the Access Options page.

If you never ordered the 1973-2008 SEER research data, you must use the request form.

October 6, 2011

The SEER incidence databases were updated on October 6, 2011, with the following changes.

SEER*Stat databases and ASCII text data files

  1. The files were updated to correct a problem with two fields, "Derived SS2000 (2004+)" and "Summary stage 2000 (1998+)". The problem was limited to 2004-2008 cases from six registries (Atlanta, Rural Georgia, Greater California, Los Angeles, San Francisco-Oakland, and San Jose-Monterey). If you ran analyses that included these fields before the update, you will need to rerun them.
  2. The "CS site-specific factor 1 (2004+)" variable was updated to include information for Ovary and Intracranial Gland tumors.
  3. The "CS site-specific factor 2 (2004+)" variable was updated to include information for Prostate tumors.
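The scope of item 1 above can be expressed as a record-level test. The Python sketch below is illustrative: the function and constant names are hypothetical, registry names are spelled as in the list above, and it assumes that "2004-2008 cases" refers to year of diagnosis.

```python
# The six registries named in item 1.
AFFECTED_STAGE_REGISTRIES = {
    "Atlanta",
    "Rural Georgia",
    "Greater California",
    "Los Angeles",
    "San Francisco-Oakland",
    "San Jose-Monterey",
}


def in_affected_stage_subset(year_of_diagnosis, registry):
    """Return True if a case falls in the subset whose stage fields were corrected.

    Assumption: the 2004-2008 range is applied to the year of diagnosis.
    """
    return 2004 <= year_of_diagnosis <= 2008 and registry in AFFECTED_STAGE_REGISTRIES
```

An analysis of "Derived SS2000 (2004+)" or "Summary stage 2000 (1998+)" that includes no such cases would be unchanged by the correction.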

SEER*Stat databases only

  • The format for the Collaborative Stage schema variable, "CS Schema v0202", was modified to include all valid values, not just those that occur in the particular database. This changed only the format, not the data.

Statistics on the SEER web site

Some of these changes affect the statistics by stage at diagnosis presented throughout the SEER web site. These statistics were updated on October 20, 2011, together with the 2008 US Mortality release.