A survey of European IT executives in 2014 revealed that 72% of businesses didn’t trust cloud vendors to obey data protection laws and regulations, and that 53% of respondents believed that using the cloud increases the likelihood of a data breach.
In October 2015, Rob Enderle, president and principal analyst of the Enderle Group and previously Senior Research Fellow for Forrester Research and the Giga Information Group, wrote in a CIO.com post, “Simply stated, you can’t trust the employees of cloud service providers. Frankly, I don’t think we can really trust our own employees anymore either, but at least our capability to monitor them is far greater.”
This line of thought applies to big data and analytics as much as it does to transactional data, but for the purpose of this column, I am going to argue that there might be a place for the cloud in the production of analytics and analytics applications that will not trigger alarms in the minds of IT decision makers.
“Public clouds like Microsoft Azure and Amazon AWS have grown because companies understand the economics of using the cloud—but they still have major fears when it comes to the security of their data on public cloud platforms,” said JT Sison, vice president of marketing at Dataguise, which provides security solutions to protect sensitive data, no matter where that data is stored. “When you store your data internally, you have direct responsibility for all of your data, but when you use the cloud, this data security becomes a shared responsibility.”
Despite this understandable anxiety, companies shouldn’t give up on using public clouds in their big data strategies. Instead, they need to look at their data processing needs and determine the best places to deploy and to act on data for these various activities.
The activities that immediately stand out as candidates for the cloud are the development and testing of applications. In these cases, the test data that is prepared is not your production data, so there is less (and in many cases, no) risk of a data breach or security exposure.
“Using a public cloud for application development and test is one of the best use cases that we see,” said Venkat Subramanian, Dataguise’s chief technology officer. “To assist companies so they can take advantage of the economics and the speed of the cloud in application development and test activities, we have intelligence that is built into our software that can detect and encrypt sensitive data before it is ever passed into the cloud. This enables applications to be tested in the cloud against realistic data, but not data that is in production or that has security sensitivities.”
To accomplish this level of security over data, the Dataguise software masks data into usable but fictionalized data for application testing. “For example, if a company’s real customer lives in Chicago, the data might be masked in the software to instead read Springfield, Illinois as the home residence of the customer,” said Subramanian. “Or, if the customer’s real first name is Mary, the software might change the name to Jane.”
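The substitution Subramanian describes can be sketched in a few lines of code. This is not Dataguise’s implementation—just a minimal illustration of deterministic masking, in which each real value (Mary, Chicago) always maps to the same fictional stand-in so that test data stays internally consistent across runs. The substitute pools below are hypothetical examples.

```python
import hashlib

# Hypothetical substitute pools for illustration only.
FAKE_FIRST_NAMES = ["Jane", "Alex", "Sam", "Pat", "Chris"]
FAKE_CITIES = ["Springfield, Illinois", "Dayton, Ohio", "Reno, Nevada"]

def mask(value: str, pool: list[str]) -> str:
    """Deterministically map a real value to a fake one from the pool.

    Hashing the input means the same real value always masks to the
    same fictional value, which keeps relationships in the test data
    consistent without storing a lookup table of sensitive values.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return pool[digest[0] % len(pool)]

record = {"first_name": "Mary", "city": "Chicago, Illinois"}
masked = {
    "first_name": mask(record["first_name"], FAKE_FIRST_NAMES),
    "city": mask(record["city"], FAKE_CITIES),
}
```

Because the real value never appears in the output pools, the masked record is realistic enough to exercise application logic but useless to anyone who intercepts it in the cloud.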
The process seems simple enough. More important, it can save enterprises many hours of preparing test data, or of refreshing that test data when it grows stale.
Techniques like this also address IT’s most prominent objection to using the cloud for any kind of sensitive data storage: The data is fictionalized to the point where it functions realistically in a test and development environment, but does nothing to satisfy the whims of a hacker.