This is the second part of a two-part blog post on Personally Identifiable Information (PII). In part 1, PII - The Black the White and the Gray (include link), we went over the basics of PII. In this part, we will look at using PII safely.
Every year, hundreds of millions of people are affected by identity fraud or identity theft. According to a 2017 survey, there were 16.7 million victims of identity theft in the United States alone[1], amounting to approximately $17 billion USD. However common it has become, nobody wants their PII to end up in the hands of someone who would use it for nefarious purposes. When designing applications that work with PII, a few simple concepts can make that PII harder to get at. Take the following example: if my name were John Doe, I lived at 123 Main Street, Anytown CO, 12345, my Social Security number were 123-45-6789, and my membership number were 987654321, you could store the record like this:
Doe ← not a complete name
Anytown CO 12345 ← not a complete address
***-**-6789 ← a partial SSN
987654321 ← a full membership number, because this is not sensitive information
Because the membership number is unique only within the business, it can serve as the single piece of information that identifies the user as a customer. It can then be used as a record locator within the company for encrypted sensitive information, should that be required for things like billing or transactions.
Storing personal data this way reduces the risk that a person can be singled out by criminals, while the data remains useful to the business: a user who calls in can still provide enough information for their record to be retrieved.
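As a rough illustration of the storage format described above, the following Python sketch reduces a full customer record to its partial form. The `FullRecord` type, the `to_partial_record` function, and the field names are hypothetical and chosen only for this example; a real application would adapt them to its own schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FullRecord:
    """Hypothetical full customer record, as collected at sign-up."""
    first_name: str
    last_name: str
    street: str
    city: str
    state: str
    postal_code: str
    ssn: str
    membership_id: str


def to_partial_record(rec: FullRecord) -> dict:
    """Keep only the fragments that are safe to store in searchable form."""
    digits = "".join(ch for ch in rec.ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return {
        "last_name": rec.last_name,             # not a complete name
        "city": rec.city,                       # no street: not a complete address
        "state": rec.state,
        "postal_code": rec.postal_code,
        "ssn_masked": "***-**-" + digits[-4:],  # a partial SSN
        "membership_id": rec.membership_id,     # unique only within the business
    }


john = FullRecord("John", "Doe", "123 Main Street", "Anytown", "CO",
                  "12345", "123-45-6789", "987654321")
partial = to_partial_record(john)
print(partial["ssn_masked"])  # prints ***-**-6789
```

The first name, street address, and first five SSN digits never reach the partial record, so a leak of this data alone does not expose a complete identity.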
Perhaps you have made it this far and are wondering what to do if your application needs to store and retrieve PII, yet you want to keep it safe. It is possible to shield PII so that unauthorized access becomes extremely difficult. The simplest way is to apply a few concepts that make information hard to find. People often misunderstand how to use encryption to secure data. The concept most often missed is that once information is encrypted, you can generally no longer search by that information. This caveat can actually be used to help secure the data.
One way to do this is to use two logical systems instead of one. The first system is a record locator that holds incomplete information. The second holds the sensitive information, is accessible only to an application, and returns data only by way of a record locator number. In other words, it takes precise information to retrieve anything from the second system. The information in the first system is searchable, while searching in the second system is severely limited. A person who calls in to retrieve information would need to know enough to locate the record. The application could require the person to know all four pieces of information to locate and return the membership ID, or it could require only a name and the last four digits of the SSN.
Fname.initial | Lname | City | State | last4SSN | MembershipID |
---|---|---|---|---|---|
J | Doe | Anytown | CO | 6789 | 987654321 |
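The record locator table above could be queried along these lines. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the table name `locator`, the column names, and the `find_membership_id` function are assumptions made for illustration, not part of any particular product.

```python
import sqlite3


def build_locator(conn: sqlite3.Connection) -> None:
    """Create the searchable locator table and load the sample row."""
    conn.execute(
        """CREATE TABLE locator (
               first_initial TEXT NOT NULL,
               last_name     TEXT NOT NULL,
               city          TEXT NOT NULL,
               state         TEXT NOT NULL,
               last4_ssn     TEXT NOT NULL,
               membership_id TEXT PRIMARY KEY
           )"""
    )
    conn.execute(
        "INSERT INTO locator VALUES (?, ?, ?, ?, ?, ?)",
        ("J", "Doe", "Anytown", "CO", "6789", "987654321"),
    )


def find_membership_id(conn, last_name, last4_ssn, city=None, state=None):
    """Return the membership ID only if the criteria match exactly one row."""
    sql = "SELECT membership_id FROM locator WHERE last_name = ? AND last4_ssn = ?"
    params = [last_name, last4_ssn]
    if city is not None:
        sql += " AND city = ?"
        params.append(city)
    if state is not None:
        sql += " AND state = ?"
        params.append(state)
    rows = conn.execute(sql, params).fetchall()
    # An ambiguous or empty result returns nothing rather than a guess.
    return rows[0][0] if len(rows) == 1 else None


conn = sqlite3.connect(":memory:")
build_locator(conn)
print(find_membership_id(conn, "Doe", "6789"))  # prints 987654321
print(find_membership_id(conn, "Doe", "0000"))  # prints None
```

Passing `city` and `state` corresponds to the stricter policy of requiring more pieces of information; omitting them corresponds to the name plus last-four-digits policy.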
The sensitive information storage is laid out so that MembershipID is the only searchable (indexed) field.
MembershipID | FirstName | LastName | Address | City | State | SSN | AccountNumber |
---|---|---|---|---|---|---|---|
987654321 | *********** | *********** | *********** | *********** | *********** | *********** | *********** |
If the information is stored in this manner, the only piece of information that can be used to retrieve an individual record is the Membership ID. The database of sensitive information can then be secured by writing a data retrieval system in which the login used to access information can retrieve only a single record per query, so the exact membership number is required to return one decrypted record. You might be thinking that it would be easy to join these two sets of records on the MembershipID column, but that becomes significantly more difficult if you restrict access to the sensitive database to the application that retrieves the information.
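The retrieval system described above might look something like the following sketch. Several things here are assumptions for illustration only: the `SensitiveStore` class, its `put` and `get` methods, and the choice to store each record as a single encrypted payload rather than one encrypted column per field. Encryption is deliberately left as a pair of functions supplied by the caller. The demo passes Base64 encoding purely as a placeholder so the example runs with the standard library; Base64 is not encryption, and a real system would supply a vetted cipher with proper key management.

```python
import base64
import json
import sqlite3
from typing import Callable, Optional


class SensitiveStore:
    """Hypothetical store that only answers exact membership ID lookups."""

    def __init__(self, conn: sqlite3.Connection,
                 encrypt: Callable[[bytes], bytes],
                 decrypt: Callable[[bytes], bytes]) -> None:
        self._conn = conn
        self._encrypt = encrypt
        self._decrypt = decrypt
        # membership_id is the only indexed, searchable column.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sensitive ("
            "membership_id TEXT PRIMARY KEY, payload BLOB NOT NULL)"
        )

    def put(self, membership_id: str, record: dict) -> None:
        """Encrypt the whole record and store it under its membership ID."""
        blob = self._encrypt(json.dumps(record).encode("utf-8"))
        self._conn.execute(
            "INSERT OR REPLACE INTO sensitive VALUES (?, ?)",
            (membership_id, blob),
        )

    def get(self, membership_id: str) -> Optional[dict]:
        """Return one decrypted record for an exact ID match, else None."""
        row = self._conn.execute(
            "SELECT payload FROM sensitive WHERE membership_id = ?",
            (membership_id,),
        ).fetchone()
        if row is None:
            return None
        return json.loads(self._decrypt(row[0]).decode("utf-8"))


# PLACEHOLDER ONLY: Base64 stands in for a real cipher in this demo.
store = SensitiveStore(sqlite3.connect(":memory:"),
                       base64.b64encode, base64.b64decode)
store.put("987654321", {
    "first_name": "John", "last_name": "Doe",
    "address": "123 Main Street", "city": "Anytown", "state": "CO",
    "ssn": "123-45-6789",
})
print(store.get("987654321")["ssn"])  # prints 123-45-6789
print(store.get("98765"))             # prints None
```

Because the class exposes no method for listing, scanning, or searching by any other field, code that talks to the sensitive data through it can only ever pull one record at a time, and only when it already holds the exact membership ID.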
By understanding what PII is and how to work with it carefully, you can help reduce the chance of a data breach occurring.
[1] 2018 Identity Fraud: Fraud Enters a New Era of Complexity