de-identified health information (HIPAA)

Under HIPAA's Privacy Rule, there are two ways to de-identify health information so that it is no longer protected health information (PHI).

Protected health information under HIPAA is individually identifiable health information. "Identifiable" covers more than data that is explicitly linked to a particular individual (that is, identified information). It also includes health information containing data items that could reasonably be expected to allow an individual to be identified.

Potential identifiers include obvious ones like name and social security number, and also:

  • all geographic subdivisions smaller than a state, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code if, according to the current publicly available data from the Bureau of the Census, the geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; the initial three digits of a zip code for all such geographic units containing 20,000 or fewer people are changed to 000;
  • all elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
  • voice and fax telephone numbers;
  • electronic mail addresses;
  • medical record numbers, health plan beneficiary numbers, or other health plan account numbers;
  • certificate/license numbers;
  • vehicle identifiers and serial numbers, including license plate numbers;
  • device identifiers and serial numbers;
  • Internet Protocol (IP) address numbers and Universal Resource Locators (URLs);
  • biometric identifiers, including finger and voice prints;
  • full face photographic images and any comparable images; and
  • any other unique identifying number, characteristic, or code.

Under HIPAA's "safe harbor" standard, information is considered de-identified if all of the above have been removed and the covered entity has no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify a person.
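The generalization rules for zip codes, dates, and ages in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a complete de-identification tool: the function names are hypothetical, and the set of low-population three-digit zip prefixes is a placeholder that would have to be derived from current Census Bureau data.

```python
from datetime import date

# Placeholder values for illustration only. A real list of 3-digit zip
# prefixes covering 20,000 or fewer people must come from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059", "102"}

def generalize_zip(zip_code: str) -> str:
    """Keep only the first three digits, or 000 for low-population areas."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in LOW_POPULATION_ZIP3 else prefix

def generalize_date(d: date) -> int:
    """Drop month and day, keeping only the year."""
    return d.year

def generalize_age(age: int) -> str:
    """Aggregate all ages over 89 into a single '90+' category."""
    return "90+" if age >= 90 else str(age)

print(generalize_zip("33136"))             # 331
print(generalize_zip("03601"))             # 000
print(generalize_date(date(2004, 5, 17)))  # 2004
print(generalize_age(93))                  # 90+
```

Note that this sketch does not handle the case where a year alone (such as a birth year) would reveal an age over 89; such years would also need to be suppressed or aggregated.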

The covered entity may assign a code or other means of record identification to allow de-identified information to be re-identified, if the code is not derived from, or related to, the removed identifiers. (Only the covered entity will have the re-linking information.)
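One way to satisfy the "not derived from, or related to" condition is to generate the code at random rather than computing it from an identifier (for example, by hashing a medical record number). The sketch below assumes a hypothetical helper, assign_record_codes, and uses Python's standard secrets module; the resulting crosswalk is the re-linking information that stays with the covered entity.

```python
import secrets

def assign_record_codes(record_ids):
    """Map each original record ID to a random code unrelated to the ID."""
    crosswalk = {}
    used = set()
    for rid in record_ids:
        code = secrets.token_hex(8)      # 16 random hex characters
        while code in used:              # regenerate on the (unlikely) collision
            code = secrets.token_hex(8)
        used.add(code)
        crosswalk[rid] = code
    return crosswalk

# The crosswalk is kept by the covered entity; only the codes travel
# with the de-identified records.
crosswalk = assign_record_codes(["MRN-001", "MRN-002", "MRN-003"])
```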

Alternatively, under the "statistical" standard, a covered entity may determine that health information is not individually identifiable, and thus not protected, if:

A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable, [a]pplying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and [that person] documents the methods and results of the analysis that justify such determination.
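The rule does not prescribe a particular statistical method. As one illustration only, an analyst might start by counting how many records share each combination of indirectly identifying fields, since a combination held by a single record is easier to link to a person. The function name and field names below are assumptions made for this sketch, and a real determination would involve considerably more than this count.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given indirectly identifying fields."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(counts.values())

# Hypothetical, already-generalized records.
records = [
    {"zip3": "331", "birth_year": 1950, "sex": "F"},
    {"zip3": "331", "birth_year": 1950, "sex": "F"},
    {"zip3": "331", "birth_year": 1950, "sex": "F"},
    {"zip3": "331", "birth_year": 1972, "sex": "M"},
]

print(smallest_group_size(records, ["zip3", "birth_year", "sex"]))  # 1
```

A result of 1 means at least one record is unique on those fields, which an expert would likely treat as a signal to generalize or suppress further.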

As an alternative to using fully de-identified information, HIPAA provides for a limited data set, from which direct identifiers (like name and address) have been removed but indirect ones (such as age) have not. A limited data set may be disclosed only under a data use agreement with the recipient.

   © 2002-2006 Contributing authors and University of Miami School of Medicine