Product Update: Decentriq platform v2.1 brings Synthetic data, R scripts and more

Written by Sergio Giannone

Published on April 14, 2022

An update from our product team – David Sturzenegger, Head of Product, Decentriq

1. Explore sensitive data with synthetic data generation

To us, data privacy is not negotiable. However, we strive to minimize its impact on the analyst's user experience. Data scientists, especially in machine learning workflows, are used to working iteratively with data, something that until now was not possible in the Decentriq platform.

This changes with the introduction of synthetic data generation. From any table, users can now generate a differentially private synthetic copy whose statistical properties are similar to those of the original; a dedicated report produced by the platform compares the two. This allows you to prototype your scripts locally before running them on the original data in a dedicated data clean room.
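To give a feel for the general idea, here is a minimal sketch of one textbook way to produce a differentially private synthetic column: add Laplace noise to a histogram of category counts, then sample new values from the noisy histogram. This is an illustration under stated assumptions, not a description of how the Decentriq platform generates synthetic data. The function names, the `epsilon` parameter, and the example categories are hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def synthesize_column(values, categories, epsilon, n_samples, seed=0):
    """Sample a synthetic categorical column from a noisy histogram.

    Assumptions (illustrative only):
    - `categories` is fixed in advance and not derived from the data.
    - Adding or removing one record changes one count by 1, so Laplace
      noise with scale 1/epsilon is applied to each count.
    """
    rng = random.Random(seed)
    counts = Counter(values)
    noisy = [
        max(counts.get(c, 0) + laplace_noise(1.0 / epsilon, rng), 0.0)
        for c in categories
    ]
    if sum(noisy) == 0:
        # Fall back to a uniform distribution if all noisy counts were clipped to zero.
        noisy = [1.0] * len(categories)
    return rng.choices(categories, weights=noisy, k=n_samples)


# Hypothetical example: a small column of blood types.
original = ["A", "A", "B", "O", "O", "O", "AB", "A"]
synthetic = synthesize_column(original, ["A", "B", "AB", "O"], epsilon=1.0, n_samples=20)
```

The synthetic column never copies individual records; it only reflects the noisy category frequencies, which is why a script prototyped on it can later be run unchanged on the original table.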

UI screenshot showing synthetic patient data generation

2. R language support

Like Python, R support was a highly requested feature for our platform. Many of our users rely on R to perform complex statistical analyses on sensitive data, and we want to make it easy for them to migrate those analyses into a data clean room without changing their code.

code snippet showing a script used for cohort analysis

3. Support for unstructured data

So far, we have covered use cases that depend on structured data with a strict schema definition. However, our customers recently suggested some exciting use cases for unstructured data.

With this release, we are unlocking them by supporting datasets of any kind. Think of JSON, plain text, or even a ZIP archive containing several images. Computations are now more powerful because they can run on both tables and files.
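As a rough illustration of what a computation over such files might look like, the sketch below inspects a file based on its extension using only the Python standard library. How files are exposed to a script inside a data clean room is not covered here; the `describe_file` helper and the idea of passing it a local path are assumptions made for this example.

```python
import json
import zipfile
from pathlib import Path


def describe_file(path):
    """Return a small summary of a JSON, ZIP, or plain-text file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        return {"kind": "json", "top_level_type": type(data).__name__}
    if suffix == ".zip":
        with zipfile.ZipFile(path) as archive:
            # List archive members without extracting them.
            return {"kind": "zip", "members": sorted(archive.namelist())}
    # Anything else is treated as plain text in this sketch.
    text = path.read_text(encoding="utf-8", errors="replace")
    return {"kind": "text", "lines": len(text.splitlines())}
```

Calling `describe_file("scans.zip")` on a hypothetical archive of images would return the sorted member names, which a later step could then open one by one.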

Snippet from UI showing the text "files of any kind can be provisioned, e.g. JSON, ZIP, TXT."

If you would like to learn more about these updates, let us know and we’ll be more than happy to walk you through them.

We’re always looking for ways to improve our data clean rooms to make it even easier for you to unlock new value from sensitive data assets.

4. Splitting scripts into multiple files

We are proud that our data clean rooms now support computations written in both Python and R. Sometimes, though, users prefer to structure their analyses across multiple files. We now support such workflows for both languages, so you can split your analysis however you like.
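The pattern itself is ordinary Python: shared helpers live in one file and the main script imports them. The sketch below is a generic illustration, not platform-specific code. To keep it self-contained, it writes a hypothetical helper module, `stats_helpers.py`, to a temporary directory before importing it; in a real project that file would simply sit next to the main script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of the hypothetical helper file "stats_helpers.py".
HELPER_SOURCE = '''
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)
'''


def run_split_analysis(values):
    """Main script: import the helper module from its own file and use it."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "stats_helpers.py").write_text(HELPER_SOURCE, encoding="utf-8")
        sys.path.insert(0, workdir)
        try:
            importlib.invalidate_caches()
            helpers = importlib.import_module("stats_helpers")
            return helpers.mean(values)
        finally:
            # Clean up so the temporary module does not leak into later imports.
            sys.path.remove(workdir)
            sys.modules.pop("stats_helpers", None)
```

Keeping helpers in their own file lets several analysis scripts reuse the same functions without copying code between them.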