What Everyone Needs to Know About Data Masking
In this podcast a data masking expert discusses why data masking is critical to protecting our data privacy, how data masking works, and use cases to prevent devastating data breaches.
Summary: In December the Marriott data breach exposed 500 million customers’ sensitive data. Data breaches like Marriott, Equifax and so many others have exposed our personal data to theft and abuse. Listen to this podcast to find out why data masking is critical to protecting our sensitive personal data. You will learn what data masking is, how it really works and how it differs from other data protection approaches like encryption. You will also learn about the key data masking use cases for deploying this essential technology to prevent data breaches that put all our personal data at risk.
Podcast Show Notes
Chris Doolittle, VP Marketing Teleran, Interviews Tim Gorman, Data Masking Expert
Chris Doolittle: In December, it was revealed that Marriott hotels suffered a monumental breach that compromised the personal data of over 500 million Marriott customers. The Marriott incident and other massive data breaches at companies like Equifax, Facebook, and Yahoo have hugely increased the risk of misuse of our personal data, and they have had a material impact on these companies’ reputations, financial condition, and legal liabilities.
I’m here with Tim Gorman, a leading expert, practitioner and educator in data masking and information management. Tim will share with us the key essentials of data masking, how data masking really works, including data masking use cases, and why it is such a critical component in protecting all our personal data in light of devastating data breaches like Marriott’s and so many others.
Tim works for Delphix (www.delphix.com), a leading data masking and virtualization software company. He has over 30 years’ experience in information management, working for companies like Oracle, SageLogix, and others in software engineering, automation and data management. For 15 years, Tim was president of Evergreen Technologies, a database software consultancy. Now Tim travels around the globe using data masking techniques to enable organizations to protect their sensitive data from data breaches, hacking, and misuse.
Let me start with a simple question. What is data masking?
Data Masking Defined
Tim Gorman: Data masking is either obfuscating or replacing confidential or sensitive data with either fictitious or obscured data, usually for the purpose of anonymity in the case of PII, or personally identifiable information, or for the purpose of obfuscation for other types of confidential or sensitive data. In this way, the data retains the realism and usability of the original set of data without jeopardizing the original, sensitive, confidential data. Masking is irreversible, so it pretty much eliminates the possibility of the confidential data being used in ways that are not intended.
Chris Doolittle: Encrypting data is also a common strategy to protect information. How is encryption different from masking?
Data Masking – Different from Encryption
Tim Gorman: Encryption is the encoding of all data into a format that’s totally unusable. If you’ve seen encrypted data after it’s been encrypted, it’s pretty much gobbledygook, and that’s its protection, by the way: being gobbledygook. But encrypted data has to be decrypted using a key to turn it back into something that’s usable. And there are many different kinds of encryption ciphers that are available, and they’re pretty uncrackable without brute force computing power. But the basic idea behind making encryption ciphers so complicated is, again, just to prevent them from being reversed, to make them difficult if not impossible to decrypt without the key.
In contrast, masking data just means obscuring or obfuscating parts of the data. The data’s still usable, but we’re just changing the data values to be different from the original so that, again, it can’t be used in ways that are unintended. So masked data is typically irreversible. Masking could be made reversible, if necessary, but the way data masking is typically done, there’s no way to reverse it. There’s no decryption step. So, encryption is encoding to make data totally unusable; masking just changes the data while keeping it useful.
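To make the contrast concrete, here is a minimal Python sketch. It is an illustration only, not something described in the conversation: the "encryption" is a toy one-time-pad XOR standing in for a real cipher, and the function names `xor_bytes` and `mask_ssn` and the sample value are assumptions made for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching key byte (toy one-time pad)."""
    return bytes(d ^ k for d, k in zip(data, key))

def mask_ssn(ssn: str) -> str:
    """Replace every digit except the last four with 'X'. No key is kept."""
    total_digits = sum(ch.isdigit() for ch in ssn)
    digits_seen = 0
    out = []
    for ch in ssn:
        if ch.isdigit():
            digits_seen += 1
            # Keep only the final four digits; obscure the rest.
            out.append(ch if digits_seen > total_digits - 4 else "X")
        else:
            out.append(ch)  # separators such as '-' are left in place
    return "".join(out)

ssn = "123-45-6789"  # made-up sample value

# Encryption: unreadable bytes, but fully recoverable by whoever holds the key.
key = secrets.token_bytes(len(ssn))
ciphertext = xor_bytes(ssn.encode(), key)
recovered = xor_bytes(ciphertext, key).decode()

# Masking: still shaped like the original, but the hidden digits are gone.
masked = mask_ssn(ssn)

print(ciphertext)  # random-looking bytes, different on every run
print(recovered)   # the original value again
print(masked)      # XXX-XX-6789
```

The encrypted value can always be turned back into the original with the key, while the masked value has no decryption step because the replaced digits were never stored anywhere.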
Chris Doolittle: Tim, what methods of data masking are used?
Data Masking Applications – Data Inflight and Data at Rest
Tim Gorman: In general, data masking can be applied in two ways: on in-flight data, which is dynamic data masking, or on data at-rest, which is static data masking. Dynamic masking is when the data is obfuscated or obscured after retrieval from a database and prior to being presented to an application. So during that period in-flight, the data is dynamically changed. However, at-rest or static masking is when the confidential data is masked permanently in the database or in the storage so that there’s really no copy of unmasked data available for presentation.
There are various techniques for masking in-flight or at-rest, but mostly the algorithms involve either a replacement scheme, where we obscure all or part of a data item with a series of patterns from a list that we generate, or simple randomization or scrambling. But the replacement scheme is by far more useful than the randomization scheme, because randomization leaves the data rather unusable and it doesn’t really look like its original. If we’re simply replacing the simple data components with items from a list, they can look very realistic, but they are nonetheless masked.
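The difference between the two schemes can be sketched in a few lines of Python. This is a hedged illustration, not the algorithm of any particular product: the replacement list, the salt, and the choice to pick a replacement by hashing the original value (so the same input always gets the same substitute) are all assumptions made for this example.

```python
import hashlib
import random

# Hypothetical replacement list; a real one would be far larger.
REPLACEMENT_FIRST_NAMES = ["Alice", "Brian", "Carmen", "Deepak", "Elena", "Farid"]

def mask_by_replacement(value: str, salt: str = "demo-salt") -> str:
    """Pick a substitute from the list, selected by a salted hash of the original."""
    digest = hashlib.sha256((salt + value).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(REPLACEMENT_FIRST_NAMES)
    return REPLACEMENT_FIRST_NAMES[index]

def mask_by_scrambling(value: str, rng: random.Random) -> str:
    """Shuffle the characters of the value; the result rarely looks like a name."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)

original = "Margaret"  # made-up sample value
print(mask_by_replacement(original))                # a realistic name from the list
print(mask_by_scrambling(original, random.Random()))  # the same letters in random order
```

The replaced value still reads like a first name, which keeps test data and reports believable, whereas the scrambled value is hidden but no longer resembles real data.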
Encryption Use Case
So, a good use case for encryption is to create a barrier between systems, applications, and external intruders by making all the data completely unusable. Encryption is a great tool for creating a wall, if you will, around your data center. Encryption at-rest is recommended for all types of storage for production and non-production systems. And encryption in-flight is recommended for all communications to prevent an external intruder from intercepting any communications.
Dynamic Data Masking Use Case
Tim Gorman: A good use case for masking in-flight data in a production system is for dealing with different levels of access or privilege by different users. In a production application, some users are permitted to see all the sensitive data, but dynamic or in-flight data masking is useful where some users are not permitted to see sensitive data. So dynamically or on-the-fly, we obscure or obfuscate that data.
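As a rough sketch of this idea in Python: the stored data is never changed, and the masking is applied on the way out depending on who is asking. The role name `billing_admin`, the in-memory `CUSTOMERS` list standing in for a production table, and the function names are hypothetical choices for this example.

```python
# Stand-in for a production table; the stored values are never modified.
CUSTOMERS = [
    {"name": "Margaret Hale", "card_number": "4111111111111111"},
]

# Hypothetical role names chosen for this sketch.
PRIVILEGED_ROLES = {"billing_admin"}

def mask_card(card_number: str) -> str:
    """Show only the last four digits of a card number."""
    return "*" * (len(card_number) - 4) + card_number[-4:]

def fetch_customers(role: str) -> list:
    """Return rows, masking the card number in flight for unprivileged roles."""
    rows = []
    for row in CUSTOMERS:
        view = dict(row)  # copy, so the stored row stays untouched
        if role not in PRIVILEGED_ROLES:
            view["card_number"] = mask_card(view["card_number"])
        rows.append(view)
    return rows

print(fetch_customers("billing_admin"))  # full card number
print(fetch_customers("support_agent"))  # card number shown as ************1111
```

Because the masking happens between retrieval and presentation, the same underlying record serves both kinds of users.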
Static Data Masking Use Case
A good use case for at-rest masking, permanently masking the data, is when we’re cloning production data to non-production systems for software development or for software testing. In those situations, we’re copying the realistic production data because we need data “realism” in software development and testing. But, we don’t need to expose the sensitive data. So, production users may be permitted to see the sensitive data, but developers, testers, and administrators, to do their job, just need the realistic data itself. They don’t need to see the actual sensitive data that’s personally identifiable or confidential.
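A minimal sketch of that cloning workflow, using Python's built-in `sqlite3` module with in-memory databases: the table layout, the sample rows, the `FAKE_NAMES` list, and the placeholder SSN pattern are all assumptions made for this illustration, not a description of any vendor's tooling.

```python
import sqlite3

# Hypothetical replacement values for this sketch.
FAKE_NAMES = ["Alice Morgan", "Brian Ortiz", "Carmen Diaz"]

def build_production() -> sqlite3.Connection:
    """Create a tiny stand-in for a production database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT)")
    conn.executemany(
        "INSERT INTO customers (id, name, ssn) VALUES (?, ?, ?)",
        [(1, "Margaret Hale", "123-45-6789"), (2, "John Thornton", "987-65-4321")],
    )
    conn.commit()
    return conn

def clone_and_mask(prod: sqlite3.Connection) -> sqlite3.Connection:
    """Copy production into a new database, then overwrite the sensitive columns."""
    clone = sqlite3.connect(":memory:")
    prod.backup(clone)  # full copy of the production database
    ids = [row[0] for row in clone.execute("SELECT id FROM customers ORDER BY id")]
    for cid in ids:
        fake_name = FAKE_NAMES[cid % len(FAKE_NAMES)]
        fake_ssn = f"900-00-{cid:04d}"  # placeholder pattern, not a real scheme
        clone.execute(
            "UPDATE customers SET name = ?, ssn = ? WHERE id = ?",
            (fake_name, fake_ssn, cid),
        )
    clone.commit()
    return clone

prod = build_production()
dev = clone_and_mask(prod)
print(dev.execute("SELECT id, name, ssn FROM customers ORDER BY id").fetchall())
```

After the update, the non-production copy holds only the substituted values, so developers and testers work with realistic rows while the original names and numbers exist only in production.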
Chris Doolittle: Tim, thank you. You’ve done a great job defining data masking and how it differs from encryption, applications for dynamic and at-rest masking methods, and key use cases of these two data masking approaches. Thanks so much for sharing your knowledge and experience on the essentials of data masking.
In our next conversation Tim and I will be revealing the critical information you need to know about dynamic and at-rest data masking.