Types of PII de-identification techniques
The de-identification transformation you choose depends on the kind of data you want to de-identify and your purpose for de-identifying it. The de-identification techniques that Sensitive Data Protection supports fall into the following general categories:
Redaction: Deletes all or part of a detected sensitive value.
Replacement: Replaces a detected sensitive value with a specified surrogate value.
Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Sensitive Data Protection supports several types of tokenization, including transformations that can be reversed, or “re-identified.”
Bucketing: “Generalizes” a sensitive value by replacing it with a range of values. (For example, replacing a specific age with an age range, or temperatures with ranges corresponding to “Hot,” “Medium,” and “Cold.”)
Date shifting: Shifts sensitive date values by a random amount of time.
Time extraction: Extracts or preserves specified portions of date and time values.
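To make the categories above more concrete, the following is a minimal sketch of each technique in plain Python. It does not use the Sensitive Data Protection API: every function name and parameter here (redact, replace_value, mask, tokenize, bucket_age, shift_date, extract_year) is hypothetical and exists only for illustration. The tokenization example assumes a one-way keyed hash (HMAC-SHA256), which cannot be re-identified; reversible tokenization would require an actual encryption scheme instead.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta


def redact(text: str, sensitive: str) -> str:
    """Redaction: delete the detected value from the text."""
    return text.replace(sensitive, "")


def replace_value(text: str, sensitive: str, surrogate: str) -> str:
    """Replacement: swap the detected value for a specified surrogate."""
    return text.replace(sensitive, surrogate)


def mask(value: str, mask_char: str = "#", keep_last: int = 4) -> str:
    """Masking: overwrite all but the last `keep_last` characters."""
    hidden = max(len(value) - keep_last, 0)
    return mask_char * hidden + value[hidden:]


def tokenize(value: str, key: bytes) -> str:
    """Tokenization (assumed one-way variant): keyed HMAC-SHA256 digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def bucket_age(age: int, width: int = 10) -> str:
    """Bucketing: generalize an exact age to a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def shift_date(d: date, max_days: int, rng: random.Random) -> date:
    """Date shifting: move a date by a random offset within +/- max_days."""
    return d + timedelta(days=rng.randint(-max_days, max_days))


def extract_year(d: date) -> int:
    """Time extraction: keep only the year portion of a date."""
    return d.year


if __name__ == "__main__":
    print(replace_value("Call 555-0100 now", "555-0100", "[PHONE_NUMBER]"))
    # prints: Call [PHONE_NUMBER] now
    print(mask("4111111111111111"))  # prints: ############1111
    print(bucket_age(37))            # prints: 30-39
    print(extract_year(date(1984, 5, 17)))  # prints: 1984
```

In this sketch, the same input and key always produce the same token, so tokenized values can still be joined or counted across records. The date shift, by contrast, depends on the random generator passed in; a real deployment would need to decide whether related dates should share the same offset.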