TL;DR
To enable ad hoc yet trustworthy collaborations and transactions between entities that have not established trust a priori, we need to rethink the design of data and compute. We need to advance Web3 and Privacy-Enhancing Technologies to build more scalable, trustworthy, and verifiable compute and data fabrics. This will minimize breaches of trust, cut costs, and reduce business friction.
Our research enables the training of ML models while keeping the model confidential and the training data private.
Motivation
It is not too bold a claim to state that the Industry 4.0 revolution is transforming not only the technological foundations but often the fundamental business models on which existing vertical industries (e.g., energy, transportation, healthcare, manufacturing) currently thrive. Increasingly, these traditionally rather closed industries are characterized by fragmented, open ecosystems of actors providing solution ingredients such as sensors, software components, services, AI/ML models, data, and integration services. Industry 4.0 automation services typically rely on large amounts of heterogeneous data processed by many different AI/ML models to deliver timely, situation-aware automation actions.
In many cases, these AI/ML models are provided by a diverse set of actors from the vertical ecosystem, introducing the need for so-called collaborative business creation. These models are increasingly recognized as critical Intellectual Property (IP), given the substantial investment and skills required to design, develop, and deploy them. Not only the ML model itself but also the way the model can be efficiently trained and deployed is considered key IP.
This highlights the main challenge of this project: to enable the training of ML models while keeping the model confidential and the training data private.
Our research
We create technologies that enable sharing of data and models between parties that have not established a priori trust, making it possible to run and train AI/ML models confidentially over private datasets. We target an emerging scenario in which data silos hold valuable private datasets (either static or continuously generated), while, at the same time, various analysts develop their own proprietary AI/ML models and want to run these models over such siloed datasets.
We aim to design technical means to perform AI/ML jobs (including training and inference) while protecting both the privacy of the siloed datasets and the confidentiality of the AI/ML models being used. In doing so, we also plan to make an informed trade-off between privacy guarantees and ML model utility, and to ensure that the entire ML pipeline satisfies the requirements of privacy laws (e.g., GDPR and CCPA).
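To make the privacy/utility trade-off concrete, one widely used Privacy-Enhancing Technology is differentially private training in the style of DP-SGD: each example's gradient is clipped to bound its influence, and calibrated Gaussian noise is added before the model update. The sketch below is illustrative only and is not the project's implementation; the model (linear regression), function names, and parameter values are all assumptions chosen for brevity. The noise multiplier knob directly expresses the trade-off the text describes: more noise means stronger privacy guarantees but lower model utility.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One differentially-private SGD step for linear regression.

    Per-example gradients are clipped to `clip_norm` (bounding each
    record's sensitivity), averaged, and perturbed with Gaussian noise
    scaled by `noise_multiplier` -- the core DP-SGD mechanism.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(X)
    # Per-example gradients of squared error: 2 * (x.w - y) * x
    grads = 2.0 * (X @ w - y)[:, None] * X
    # Clip each example's gradient norm to at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip_norm)
    # Average the clipped gradients and add calibrated Gaussian noise.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    g = grads.mean(axis=0) + noise / n
    return w - lr * g
```

Raising `noise_multiplier` tightens the privacy guarantee (lower effective epsilon over a training run, via composition accounting) at the cost of noisier updates and thus lower utility; lowering it does the reverse.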
Decentralized Systems team