Spending five minutes in the marketing or sales material of any enterprise SaaS company can be enough to overwhelm you with technical speak about security and privacy.
It is only natural that a common first reaction to Confidential Computing and encryption in-use is "I am already encrypting my data!"
In this blog, we are going to walk through why what is "secure" and "private" has transformed in the last few years, and why "standard" cloud encryption is not enough for a lot of use cases today.
A story of three data states
Data exists in three states, and so does the encryption applied to this data. To really understand the value-add of Confidential Computing, one first has to be familiar with these three states and what they mean.
Data and encryption at-rest
When data sits on a USB stick, a hard drive, or cloud storage, that data is at-rest.
Leaving that data unencrypted means that whoever gets access to the storage automatically gets access to the underlying data as well. However, encrypting that data is a problem that has been solved for many years now.
Think "password protected USB sticks" or "encrypted cloud storage". The industry calls it "encryption at-rest", and it is nowadays a heavily recommended step for any data sitting on premises or in the cloud. Even if someone gets access to the storage, they will not be able to see the data there without the proper keys.
Data and encryption in-transit
When data is moving from one place to another, we consider it to be in-transit.
Failing to encrypt data in-transit means that third parties can intercept it while it is being sent from one place to another, regardless of whether the storage (or the compute) it is going to is encrypted.
In other words, if your use case involves data being sent around, not having encryption in-transit leaves the data vulnerable, even if your storage is encrypted. Encrypting that data (encryption in-transit) is a more recent achievement (think HTTPS and TLS), but it is considered the internet standard today.
Data and encryption in-use
But there is a third state, when some computation is performed on the data (any computation; from simply reading a file, to Machine Learning and AI). At that point, the data is considered in-use.
Failing to encrypt that data means that anyone with administrator rights on the computer that performs the computation can see the data. It does not matter if the data is encrypted where it comes from, where it goes, or while it is being transmitted.
At the place of the computation, it is unencrypted. And today that is the case for almost all data computation.
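To make this concrete, here is a minimal sketch of the problem in Python. It is an illustration under loud assumptions, not how any real product works: the XOR "cipher" is a toy stand-in for real encryption (it reuses one key and must never be used in practice), and the names `encrypt`, `decrypt`, and `add_salaries` are hypothetical. The point is only that the values stay encrypted while stored, yet have to be decrypted into memory before they can be added.

```python
import secrets
from typing import List


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR every byte with the key. Illustration only, not secure."""
    return bytes(d ^ k for d, k in zip(data, key))


def encrypt(value: int, key: bytes) -> bytes:
    # Encode the integer to the same length as the key, then XOR it.
    return xor_bytes(value.to_bytes(len(key), "big"), key)


def decrypt(blob: bytes, key: bytes) -> int:
    return int.from_bytes(xor_bytes(blob, key), "big")


def add_salaries(stored: List[bytes], key: bytes) -> int:
    # In-use: every value is decrypted into memory before the addition.
    # Anyone who can inspect this process at this point sees the plaintext.
    plaintexts = [decrypt(blob, key) for blob in stored]
    return sum(plaintexts)


key = secrets.token_bytes(8)

# At-rest: only ciphertext is stored.
stored = [encrypt(value, key) for value in (52000, 61000)]

total = add_salaries(stored, key)
print(total)  # 113000
```

Encryption at-rest and in-transit protect `stored` and any copy of it sent over the network, but neither protects the `plaintexts` list that exists while `add_salaries` runs.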
Ok but if it’s such a big deal, why do I only hear about it now?
If you are wondering about this, you are not alone. It is not by accident that this type of encryption has only become truly relevant recently. There are four reasons for this, and they are connected.
1. Before cloud, on-premise infrastructure acted as a natural defense
Traditionally, organizations deployed software and held data on their own premises, allowing them to physically control and secure their infrastructure. So if you could make sure that the data was encrypted when it was stored and while it was being sent over the network, you would not mind it being unencrypted in-use, because the "attack surface" was relatively limited. The admin with such access would probably be a senior security administrator in your company.
However, this has been slowly changing over the past years as more and more companies move to the cloud. Now the computation no longer happens within the secure perimeter of the company, and the admin with that access is not a senior security admin of the company, but an unknown cloud admin and/or the SaaS vendor's admin. This means that more people have access, including people who are not part of the data-owning organization.
2. Encryption in-use used to be difficult and expensive
Encrypting data while it is in-use is a much bigger challenge than encrypting it at-rest or in-transit. And it makes intuitive sense why: when you want to compute on some data, you need to know what that data is. In order to find the result of a simple addition, I need to know what elements I am adding. And in that case, "to know" means the data has to be unencrypted.
So, the question really becomes:
How can we get the result of a computation (and know it is correct), without knowing the data underneath?
While the technology to support encryption in-use is not new, until now the tradeoffs were just not worth it. Only recently have advancements in cryptography enabled technologies such as Confidential Computing to be almost as performant as unencrypted methods.
You can read more about these different technologies in some of our previous blogs.
3. Stricter data privacy and data security regulations globally
In the wake of events such as the Cambridge Analytica scandal, data privacy and awareness of data misuse became a mainstream public concern, alongside regulations such as the GDPR and the CCPA. Similar data privacy regulations are now being rapidly adopted by countries all over the world, creating a new landscape where data privacy must be thought through completely before any data is used, and making most current analysis platforms unsuitable for sensitive data analytics.
4. Data collaboration is eating the data world, and it is almost impossible without encryption in-use
The need becomes even more apparent when we talk about data collaboration. Without encryption in-use, there is currently no way to collaborate on data without giving access to that data to a trusted third party, such as a consulting firm or a SaaS data collaboration vendor. This adds yet another entity to the mix.
In the end, a company that just wants to provide its data to a collaborator ends up having to give access to two more actors, namely the cloud admins and the trusted third party.
And on top of all of this, it has to make sure that data privacy is upheld and no sensitive information is being extracted.
All in all, a data owner, such as a first-party data provider (or a hospital or a bank), who wants to collaborate with someone else today, needs to:
- Give data access to the vendor providing the software that facilitates this (or the consultancy)
- Give sensitive data access to the "impartial" infrastructure provider
- Trust the privacy guarantees of the software provider to make sure that whoever analyses that data is not ending up with more information than they should (output privacy)
So, encryption in-use is basically another layer of protection for my data?
Yes and no.
Yes, it is an extra layer of protection for your data because it closes a previously overlooked hole in the security assumptions.
But it is also much more than that. By removing the need to trust the infrastructure provider and the vendor (Decentriq), we enable some very exciting use cases that were just not possible before, such as:
- Data Clean Rooms that don’t require organizations to trust the software provider
- Cyber defence data intelligence and collaboration
- Pharmaceutical companies developing models from hospital data without anyone ever seeing the data
and many more...
So yes, it is an extra protection layer, but that is not the main reason you need it. You need it because it unlocks a completely different way to interact and collaborate on data. And that collaborative way is taking the industry by storm.
For more on keeping your data secure while in-use, please reach out to us here.