Thanks to advances in technology, we might soon be able to use sensitive data for machine learning without customers having to reveal their confidential information.
Machine learning systems need access to huge volumes of data in order to learn effectively. But how secure is the data used to train these models, especially if it’s confidential information? Can it be traced or even hacked? Should we use sensitive data for machine learning at all?
SAP announced its guiding principles on artificial intelligence (AI) in 2018. One example of how SAP lives by these principles itself is homomorphic encryption.
Unlike other techniques, homomorphic encryption allows encrypted data that customers send to the cloud to be analyzed without having to decrypt it first. The results returned are also encrypted. That way only the customer—and not even the cloud provider—knows what information is contained in the data and the results of the analysis. The data remains the property of the customer. This approach complies with SAP’s principle of placing data protection and privacy at its core.
Homomorphic encryption offers customers guaranteed data privacy and data protection when querying databases and performing analytics. Furthermore, it could help refine AI capabilities and open up even more opportunities to use secure customer data.
Homomorphic encryption will be a technology under the hood rather than an actual product on SAP’s price list. Still, it contributes to the success of SAP’s cloud offerings by enhancing the security and privacy of customers’ data.
Homomorphic encryption enables programmers to use original data without the “noise” caused by traditional anonymization methods. “It doesn’t get more precise than that,” says Axel Schroepfer, lead for Security and Privacy in the SAP Innovation Center Dresden and one of the main protagonists in this field. As a result, organizations can analyze a shared pool of encrypted data without having to disclose information to each other.
This way of pooling data allows Schroepfer and his team to pilot a secure benchmarking solution for customers. A cryptographic protocol enables enterprises to compare key performance indicators (KPIs) with competitors without involving a third party or running the risk of disclosing sensitive data.
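The article does not describe the protocol itself, but one common building block for this kind of benchmarking is a secure sum: each participant splits its confidential KPI into random shares, and only the combined shares reveal the aggregate. The sketch below illustrates that principle with additive secret sharing; the variable names and three-party setup are illustrative, not SAP's actual protocol.

```python
# Sketch: secure sum via additive secret sharing.
# Illustrates the principle behind privacy-preserving benchmarking;
# this is NOT SAP's actual protocol, just a minimal teaching example.
import random

MOD = 2**61 - 1  # shares live in a finite ring, so each share alone leaks nothing

def share(value, n_parties):
    """Split a value into n random shares that sum to the value mod MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

# Three companies with confidential KPIs (hypothetical cost-per-unit figures):
kpis = [120, 95, 145]
n = len(kpis)

# Each company distributes one share of its KPI to every participant.
all_shares = [share(k, n) for k in kpis]

# Each participant sums only the shares it received...
partial_sums = [sum(all_shares[i][j] for i in range(n)) % MOD for j in range(n)]

# ...and only the combined partial sums reveal the total (and thus the average).
total = sum(partial_sums) % MOD
assert total == sum(kpis)  # no party ever saw another party's individual KPI
```

Each company can then compare its own KPI against the average `total / n` without any individual value having been disclosed.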
The technology could allow the use of protected data to train machine learning models. Ideally, this will make more companies willing to share their data, because they can do so without worrying about breaching the General Data Protection Regulation (GDPR) and other data privacy rules.
Homomorphic encryption could thereby help prevent AI systems from making biased decisions based on data that is unrepresentative or otherwise imperfect. Put simply: the more reliable input data you have, the better a system learns.
“Only if the quality of the data for the model is good can you assume that it will make good decisions in the long run,” says Schroepfer. Eventually, the work on homomorphic encryption can also support another AI guiding principle: enabling business beyond bias.
Experts have known for 40 years that it is possible to process data even while it is encrypted. But early efforts were hampered by limited computing power, and it wasn’t until 2009 that the technique was sufficiently advanced to produce the first fully homomorphic encryption scheme.
Homomorphic Encryption in a Nutshell
Homomorphic encryption allows computation on encrypted data, yielding the same result as if the operations had been performed on the plaintext; the results are returned in encrypted form. From just two operations, addition and multiplication, any computable function can be constructed. Modern encryption schemes that support an unlimited number of both operations are therefore called fully homomorphic encryption schemes.
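The additive half of this can be demonstrated with the well-known Paillier cryptosystem, where multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The sketch below uses deliberately tiny key sizes so it runs instantly; a real deployment would use keys thousands of bits long and a vetted library.

```python
# Minimal Paillier demo: additively homomorphic encryption.
# Toy key sizes for illustration only; NOT secure for real use.
import math
import random

def keygen(p=10007, q=10009):  # small primes chosen for the demo
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)       # valid because we use generator g = n + 1
    return (n,), (lam, mu, n)  # (public key), (private key)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    # ciphertext c = g^m * r^n mod n^2, with g = n + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n  # the "L function" L(x) = (x - 1) / n
    return (L * mu) % n

pub, priv = keygen()
c1 = encrypt(pub, 17)
c2 = encrypt(pub, 25)
c_sum = (c1 * c2) % (pub[0] ** 2)  # multiply ciphertexts = add plaintexts
assert decrypt(priv, c_sum) == 42  # 17 + 25, computed without ever decrypting
```

Paillier supports only homomorphic addition (plus multiplication by a known constant); a fully homomorphic scheme additionally supports multiplication of two ciphertexts, which is what makes arbitrary functions possible.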
On the downside, processing power and memory are still significant technical hurdles, since processing encrypted data requires vastly more resources: in fact, 1 million times more than what is required for processing non-encrypted data, or plaintext. Thanks to advances in technology over the past 10 years, homomorphic encryption is now not only feasible; there is also a growing number of use cases in which it is a real help.
Powerful and flexible hardware solutions as well as innovative computing machines are important prerequisites to keeping up with the latest developments in the field. SAP is therefore teaming up with major companies, universities and startups in regular workshops to standardize homomorphic encryption.
Beyond these technical preconditions, the exchange with other experts is key to success. SAP Security Research and the SAP Leonardo Machine Learning Research team, for example, are working on similar topics and looking into other privacy-preserving machine learning trends.
The team at the SAP Innovation Center Network is in conversations with these units to gather input and feedback. As Schroepfer points out: “We want to build on the strengths of SAP. It’s therefore crucial to collaborate right from the start and tap into the specific expertise inside the company and drive the topic forward in a meaningful way.”
The mission of the SAP Innovation Center Network is to solve emerging business challenges by pioneering new technologies and creating transformative innovation at SAP. The unit focuses on areas that complement the unique strengths of SAP and collaborates closely with the company’s industries and lines of business. With 180 employees at eight locations worldwide, the SAP Innovation Center Network has been spearheading the company’s efforts around machine learning and blockchain. The unit has been headed by Torsten Zube since January 2019.