Data Broker Services: Frequently Asked Questions

What are the HIPAA compliant methods of de-identification?

HIPAA Privacy Rule & De-Identification in Depth

Section 164.514(a) of the HIPAA Privacy Rule provides the standard for de-identification of protected health information.  Under this standard, health information is not individually identifiable if it does not identify an individual and the covered entity has no reasonable basis to believe it can be used to identify an individual.

Sections 164.514(b) and (c) of the Privacy Rule contain the implementation specifications that a covered entity must follow to meet the de-identification standard. As shown in Figure 1 from the HHS guidance, the Privacy Rule provides two methods by which health information can be designated as de-identified.

Figure 1. Two methods to achieve de-identification in accordance with the HIPAA Privacy Rule.

The first is the “Expert Determination” method

Implementation specifications: requirements for de-identification of protected health information. A covered entity may determine that health information is not individually identifiable health information only if:

1. A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:

  • Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
  • Documents the methods and results of the analysis that justify such determination; or
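The Privacy Rule does not prescribe a particular statistical test for the "very small" risk judgment; that is left to the expert. As one hedged illustration of the kind of analysis an expert might document, the sketch below counts how many records share each combination of indirect identifiers and reports the size of the smallest group. The function name, the field names, and the choice of this uniqueness measure are all assumptions made for this example and are not taken from the regulation.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given fields (0 if there are no records)."""
    if not records:
        return 0
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    return min(Counter(keys).values())


# Hypothetical sample data, for illustration only.
records = [
    {"zip3": "902", "birth_year": 1980, "sex": "F"},
    {"zip3": "902", "birth_year": 1980, "sex": "F"},
    {"zip3": "100", "birth_year": 1975, "sex": "M"},
]

# A smallest group of size 1 means at least one record is unique on
# these fields and would deserve closer review.
print(smallest_group_size(records, ["zip3", "birth_year", "sex"]))
```

A small group size is only one signal. An expert would also weigh what other information an anticipated recipient could reasonably obtain.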

The second is the “Safe Harbor” method

The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:

1. Names

2. All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code if, according to the current publicly available data from the Bureau of the Census:

  • The geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and
  • The initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people is changed to 000

3. All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older

4. Telephone numbers

5. Vehicle identifiers and serial numbers, including license plate numbers

6. Fax numbers

7. Device identifiers and serial numbers

8. Email addresses

9. Web Universal Resource Locators (URLs)

10. Social security numbers

11. Internet Protocol (IP) addresses

12. Medical record numbers

13. Biometric identifiers, including finger and voice prints

14. Health plan beneficiary numbers

15. Full-face photographs and any comparable images

16. Account numbers

17. Certificate/license numbers

18. Any other unique identifying number, characteristic, or code, except as permitted by §164.514(c)

In addition, the covered entity must have no actual knowledge that the information could be used, alone or in combination with other information, to identify an individual who is a subject of the information.
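The ZIP code and date rules in items 2 and 3 are the most mechanical parts of the Safe Harbor list, so a minimal sketch may help. The function names are hypothetical, and the set of restricted three-digit prefixes is a placeholder: the real set must be derived from current Census Bureau data. The sketch covers only these two items, not the other identifiers in the list.

```python
from datetime import date

# Placeholder set of 3-digit ZIP prefixes whose combined area contains
# 20,000 or fewer people. These values are illustrative only and must be
# replaced with a list built from current Census Bureau data.
RESTRICTED_ZIP3 = {"036", "059", "102"}


def generalize_zip(zip_code, restricted=RESTRICTED_ZIP3):
    """Keep the first three ZIP digits, or return 000 for restricted prefixes."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in restricted else prefix


def age_on(birth, as_of):
    """Age in whole years on the given date."""
    years = as_of.year - birth.year
    if (as_of.month, as_of.day) < (birth.month, birth.day):
        years -= 1
    return years


def generalize_birth_date(birth, as_of):
    """Return (birth_year, age) with month and day dropped.

    For ages over 89 the year is suppressed as well and the age is
    reported in a single '90+' category.
    """
    age = age_on(birth, as_of)
    if age > 89:
        return None, "90+"
    return birth.year, str(age)
```

For example, `generalize_zip("90210")` keeps only `902`, and a birth date of 1 May 1930 evaluated on 1 January 2024 yields no year and the age category `90+`.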

Satisfying either method would demonstrate that a covered entity has met the standard in §164.514(a) above.  De-identified health information created following these methods is no longer protected by the Privacy Rule because it does not fall within the definition of PHI.

Limited Data Set

A related term is a limited data set (LDS).  A limited data set may be disclosed to an outside party without a patient’s authorization if certain conditions are met. First, the disclosure may be made only for research, public health, or health care operations. Second, the person receiving the information must generally sign a data use agreement. A limited data set differs from a Safe Harbor de-identified data set in that it may retain the following identifiers:

  • dates such as admission, discharge, service, date of birth, and date of death;
  • city, state, and five-digit or longer ZIP code; and
  • ages in years, months, days, or hours.

All other identifiers, such as names, telephone numbers, email addresses, social security numbers, and medical record numbers, must be removed. It is important to note that a limited data set is still protected health information (“PHI”) under HIPAA.  It is not de-identified information and remains subject to the requirements of the Privacy Rule.
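The distinction between the three cases can be sketched as a simple first-pass screen over the kinds of fields a data set contains. The category labels and the function name below are hypothetical, and the "must remove" set holds only the examples named in this answer, not the complete list of identifiers. State is left out of the retainable set because Safe Harbor also permits it, so it does not help tell the two apart. The result is a rough label for triage, not a compliance determination.

```python
# Field categories that a limited data set may retain but a Safe Harbor
# data set may not (labels are hypothetical, chosen for this sketch).
LDS_RETAINABLE = {"full_date", "city", "zip5", "detailed_age"}

# Examples of identifiers that must be removed from a limited data set.
# This is a partial list based only on the examples given in the text.
MUST_REMOVE_EXAMPLES = {"name", "telephone", "email", "ssn", "medical_record_number"}


def screen_fields(field_categories):
    """Give a first-pass label for a data set based on its field categories."""
    present = set(field_categories)
    if present & MUST_REMOVE_EXAMPLES:
        return "contains identifiers that must be removed"
    if present & LDS_RETAINABLE:
        return "possible limited data set (still PHI)"
    return "no listed identifiers found"


print(screen_fields(["zip5", "full_date", "diagnosis"]))
```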