From AWS to Snowflake, Data Clean Rooms are trapped between privacy and progress

Venn diagram with google and AWS logos on the left, Decentriq in the middle, and the snowflake and databricks logos on the right

Written by

Nikolas Molyndris

Published on

December 15, 2022

Users shouldn't have to choose between utility and privacy when it comes to the security of their datasets. Decentriq Data Clean Rooms (DCRs) are the missing link.

First of all, a Data Clean Room without encryption in use is not a Data Clean Room; it’s just another trusted party

In the early stages of Data Clean Room development, companies such as Snowflake and Databricks relied on a framework of predetermined computations and strict access controls to secure data. However, because they neither encrypt user data nor hide it from themselves, they must ultimately be trusted with that data. As a result, users of these Data Clean Rooms are exposed to significant privacy and security risks.

That being said, with their Data Clean Room announcements, Google and AWS demonstrated the importance of protecting user IDs from even the DCR provider, emphasizing the need for cryptographic guarantees to adequately safeguard sensitive information. And while most solutions in that space still leave some questions unanswered, they clearly show that the future of Data Clean Rooms lies in enhanced encryption.

Just like an airlock is necessary for a clean room to be truly a clean room, cryptographic guarantees are necessary for a data clean room to truly protect sensitive information.

We built our Data Clean Rooms from the ground up with verifiable trust as the goal. We go to great lengths, re-thinking fundamental computing problems along the way, to make sure that our users don’t have to choose between utility and privacy.


The need for utility and why Data Clean Rooms are not just about ID matching

In the first-party data era, simple ID matching is not sufficient to achieve meaningful reach for advertising campaigns. Opt-in first-party data is limited compared to the broader "tag everyone" approach of third-party cookies, leading to lower match rates. To expand the matched audience, advanced capabilities such as lookalike modeling are required.

Trusted-party Data Clean Rooms provide these capabilities, but they lack the necessary privacy technology to protect sensitive information. AWS cryptographic computing and PAIR from Google offer cryptographic protection for matching IDs but do not provide the advanced features needed for audience expansion. As a result, companies must either cede their data to walled gardens or trust third parties to act as Data Clean Rooms in order to access these (now needed) capabilities.
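To make the limitation concrete, here is a minimal Python sketch of matching on blinded IDs. It assumes both parties share a secret key that the matching party never sees and uses a keyed hash (HMAC-SHA256) as a stand-in for cryptographic matching. It illustrates the general idea only and does not describe how PAIR or AWS cryptographic computing actually work. The function names, key, and email addresses are all hypothetical.

```python
import hashlib
import hmac


def blind_ids(ids, shared_key):
    """Replace each raw ID with a keyed hash (HMAC-SHA256) of its normalized form."""
    return {
        hmac.new(shared_key, i.strip().lower().encode("utf-8"), hashlib.sha256).hexdigest()
        for i in ids
    }


def match_rate(advertiser_ids, publisher_ids, shared_key):
    """Share of the advertiser's IDs that also appear in the publisher's data."""
    adv = blind_ids(advertiser_ids, shared_key)
    pub = blind_ids(publisher_ids, shared_key)
    # Whoever performs the comparison only ever sees blinded tokens.
    return len(adv & pub) / len(adv) if adv else 0.0


# Hypothetical example data, for illustration only.
key = b"secret agreed between advertiser and publisher"
advertiser = ["ann@example.com", "bob@example.com", "cat@example.com", "dan@example.com"]
publisher = ["Bob@example.com ", "dan@example.com", "eve@example.com"]

rate = match_rate(advertiser, publisher, key)
print(f"match rate: {rate:.0%}")
```

In this toy example only two of the four advertiser IDs are found on the publisher side. The matching step is protected, but all it can return is the overlap; expanding the audience beyond that overlap requires further computation on the underlying data, which is exactly where the protection stops.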


The need for privacy and why Data Clean Rooms are not to be trusted

When you reduce the privacy problem to a matching problem, the solutions seem “easy”. Some magic during matching…and done. The reality, however, is very different. Big tech overemphasizes the “PR security and privacy” of simple operations like matching while completely ignoring the need for privacy and security in more complex operations.

When operating in a walled garden, you know your data is accessible to the garden operator. Their data clean room now tells you that this is no longer the case. Can you trust that without being able to verify it?


The first victim of the dilemma between utility and privacy is trust

Overall, we see that the current data clean room solutions fail to offer a significant departure from the status quo. They still require companies to choose between the features they realistically need and their users' privacy. This dilemma forces companies into a difficult decision and undermines trust in data clean rooms in general.

Trust is far more than PR, and most Data Clean Room providers are not used to operating in such environments. This holds across the board: from Snowflake and Databricks, which created Data Clean Rooms that are essentially walled-gardens-as-a-service, to Amazon, which is not exactly known for treating business customer data privately, to startups that offer magic black-box solutions. Security and privacy depend on transparency and verifiability. A Data Clean Room provider must offer technical and easy-to-understand guarantees on both in order to be truly independent. Current solutions cannot.


How is Decentriq unique?

We go further than just matching, with a no-code UI even for AI-enhanced targeting and segmentation that is already used by major publishers and brands.
Using confidential computing, a technology that keeps data encrypted even while it is in use, we guarantee that all data is end-to-end encrypted, even from Decentriq. Our users can verify this themselves without our involvement.
We don’t require technical resources for setup or use: everything is completely cloud- and browser-based.
We offer unlimited analytical flexibility even to the most advanced users, through a data science environment for custom Python scripting on encrypted data.
We bring extensive experience from production deployments in regulated industries like healthcare and FSI. We can handle sensitive data and go beyond even the strictest regulations to enable otherwise-impossible workflows.

We believe that the emerging privacy landscape is a great reset for the media and advertising space. Organizations that embrace the changes and shape a strategy that holistically addresses their users’ concerns can benefit massively, both by gaining invaluable user trust and by using their data in ways that were previously gated and available only to a select few.
