Lab overview
We are building a data fabric system that is distributed and federated by design and that adjusts dynamically to changes in its environment, offering real-time data processing for federated Industry 4.0 ecosystems. The system adapts and grows without service disruption, enabling seamless and trusted distribution of analytics functions and machine learning models across federated actors and clouds.
Our research
Objectives
Industry 4.0 ecosystems are becoming increasingly fragmented, dynamic, and collaborative. For instance, widely distributed remote data sources such as sensors and cameras produce data that is handled by multiple applications owned by different participants, who aim to collaborate yet sometimes have conflicts of interest. These intricacies heighten the need for more intelligent, federated systems that can self-adapt to enable efficient and secure collaboration while preserving data locality and without disrupting running systems.
Currently, production-grade data systems focus on centralized setups and therefore struggle to provide a federated solution that addresses these challenges. These systems rely on the traditional central cloud, which hurts application performance and profitability because of bandwidth-limited and latency-prone edge links, network and computation costs, and limits on knowledge sharing. Hence, systems must make adaptive use of all layers of the network, leveraging federated application deployments across clouds, edges, and far edges.
Achievements and projects
The Nokia Bell Labs team has created World Wide Streams (WWS), a large-scale, geo-distributed stream processing platform that facilitates the development of real-time applications, handles high volumes of data and media streams, and deploys applications across central clouds, edge clouds, or the far edge.
WWS applications are authored in XStream, a novel Bell Labs language that provides a flexible way to build dataflows from built-in and external operators. XStream eases interoperation with other programming languages: operators that are not implemented in XStream can be declared as external operators and linked into a dataflow alongside the built-in ones. XStream primitives include stream processing operators (e.g., map, alter, join, and partition) and ML operators (e.g., object tracking, face recognition).
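The idea of chaining built-in and external operators into one dataflow can be illustrated with a minimal Python sketch. This is not XStream syntax or the WWS API; the names `map_op`, `external_op`, `dataflow`, and `celsius_to_fahrenheit` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical operator type: consumes a stream and produces a stream.
Operator = Callable[[Iterator], Iterator]


def map_op(fn: Callable) -> Operator:
    """Built-in style operator: apply fn to every item of the stream."""
    def run(stream: Iterator) -> Iterator:
        for item in stream:
            yield fn(item)
    return run


def external_op(fn: Callable) -> Operator:
    """Wrap a function implemented outside the dataflow layer as an operator.

    Assumption: the external code is a plain Python callable; in a real
    system it could live in another language or process.
    """
    return map_op(fn)


def dataflow(*ops: Operator) -> Callable[[Iterable], Iterator]:
    """Link operators into a single pipeline, applied in the given order."""
    def run(source: Iterable) -> Iterator:
        stream = iter(source)
        for op in ops:
            stream = op(stream)
        return stream
    return run


def celsius_to_fahrenheit(reading: dict) -> dict:
    """Stand-in for an external operator."""
    return {**reading, "temp_f": reading["temp_c"] * 9 / 5 + 32}


pipeline = dataflow(
    map_op(lambda r: {"sensor": r[0], "temp_c": r[1]}),  # built-in operator
    external_op(celsius_to_fahrenheit),                   # external operator
)
result = list(pipeline([("s1", 20.0), ("s2", 100.0)]))
```

Because every operator shares the same stream-in, stream-out shape, the pipeline does not need to know whether a stage is built in or external.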
Current projects of the Federated Data Systems team go beyond the functionality of WWS and XStream:
- OmniDT - a federated and distributed data fabric system for Digital Twins that enables interoperability across vendors
- Resilience and elasticity/evolvability of stateful applications (i.e., dataflows that maintain the application context and contain window, join, and other stateful operators) with minimal service disruption
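To make the notion of a stateful operator concrete, the following is a minimal Python sketch of a count-based tumbling window whose state can be captured and restored on another instance, for example when an operator is moved or rescaled. It does not describe how the team's systems work; the class `TumblingWindowSum` and its `snapshot` and `restore` methods are hypothetical names used only for illustration.

```python
from typing import List, Optional


class TumblingWindowSum:
    """Stateful operator: emits the sum of every `size` consecutive values."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buffer: List[float] = []  # application context held between events

    def process(self, value: float) -> Optional[float]:
        """Add a value; return the window sum when the window is full."""
        self.buffer.append(value)
        if len(self.buffer) == self.size:
            total = sum(self.buffer)
            self.buffer = []  # start the next window
            return total
        return None

    def snapshot(self) -> dict:
        """Capture the operator state so another instance can take over."""
        return {"size": self.size, "buffer": list(self.buffer)}

    @classmethod
    def restore(cls, state: dict) -> "TumblingWindowSum":
        """Rebuild an operator from a previously captured snapshot."""
        op = cls(state["size"])
        op.buffer = list(state["buffer"])
        return op


old = TumblingWindowSum(size=3)
outputs = [old.process(v) for v in [1, 2]]   # window not yet full
state = old.snapshot()                       # hand over mid-window
new = TumblingWindowSum.restore(state)
outputs.append(new.process(3))               # window completes on the new instance
```

The partially filled window survives the handover, so the result is the same as if a single instance had processed all three values.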
Applications
Federated and distributed digital twins have applications in data-intensive sectors such as industrial automation, energy grid orchestration, intelligent transport systems, and smart cities.
Publications
- Minchen Yu, Tingjia Cao, Wei Wang, Ruichuan Chen, "Following the Data, Not the Function: Rethinking Function Orchestration in Serverless Computing", in USENIX NSDI 2023, Boston, MA, USA, April 2023.
- Ismi Abidi, Ishan Nangia, Paarijaat Aditya, Rijurekha Sen, "Privacy in Urban Sensing with Instrumented Fleets, Using Air Pollution Monitoring As A Usecase", in NDSS 2022, San Diego, California, USA, April 2022.
- Chengliang Zhang, Junzhe Xia, Baichen Yang, Huancheng Puyang, Wei Wang, Ruichuan Chen, Istemi Ekin Akkus, Paarijaat Aditya, Feng Yan, "Citadel: Protecting Data Privacy and Model Confidentiality for Collaborative Learning", in ACM SoCC 2021, Seattle, WA, USA, November 2021.
- Youhui Bai, Cheng Li, Quan Zhou, Jun Yi, Ping Gong, Feng Yan, Ruichuan Chen, Yinlong Xu, "Gradient Compression Supercharged High-Performance Data Parallel DNN Training", in ACM SOSP 2021, Virtual, October 2021.
- Zewen Jin, Yiming Zhu, Jiaan Zhu, Dongbo Yu, Cheng Li, Ruichuan Chen, Istemi Ekin Akkus, Yinlong Xu, "Lessons Learned from Migrating Complex Stateful Applications onto Serverless Platforms", in ACM APSys 2021, Virtual, August 2021.
- Minchen Yu, Zhifeng Jiang, Hok Chun Ng, Wei Wang, Ruichuan Chen, Bo Li, "Gillis: Serving Large Neural Networks in Serverless Functions with Automatic Model Partitioning", in IEEE ICDCS 2021, Virtual, July 2021.
- Klaus Satzke, Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Andre Beck, Paarijaat Aditya, Manohar Vanga, Volker Hilt, "Efficient GPU Sharing for Serverless Workflows", in High Performance Serverless 2021, Virtual, June 2021.
- Chengru Yang, Zhehao Li, Chaoyi Ruan, Guanbin Xu, Cheng Li, Ruichuan Chen, Feng Yan, "PerfEstimator: A Generic and Extensible Performance Estimator for Data Parallel DNN Training", in ICSE Workshop on Cloud Intelligence 2021, Virtual, May 2021.
- Paarijaat Aditya, Istemi Ekin Akkus, Andre Beck, Ruichuan Chen, Volker Hilt, Ivica Rimac, Klaus Satzke, Manuel Stein, "Will Serverless Computing Revolutionize NFV?", in Proceedings of the IEEE 107(4): 667-678, April 2019.
- Do Le Quoc, Istemi Ekin Akkus, Pramod Bhatotia, Spyros Blanas, Ruichuan Chen, Christof Fetzer, Thorsten Strufe, "ApproxJoin: Approximate Distributed Joins", in ACM SoCC 2018, Carlsbad, CA, USA, October 2018.
- Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Klaus Satzke, Andre Beck, Paarijaat Aditya, Volker Hilt, "SAND: Towards High-Performance Serverless Computing", in USENIX ATC 2018, Boston, MA, USA, July 2018.
- Zhenyu Wen, Do Le Quoc, Pramod Bhatotia, Ruichuan Chen, Myungjin Lee, "ApproxIoT: Approximate Analytics for Edge Computing", in IEEE ICDCS 2018, Vienna, Austria, July 2018.
- Wolfgang Van Raemdonck, Tom Van Cutsem, Kyumars Sheykh Esmaili, Mauricio Cortes, Philippe Dobbelaere, Lode Hoste, Eline Philips, Marc Roelands, Lieven Trappeniers, "Building Connected Car Applications on Top of the World-Wide Streams Platform: Demo", in ACM DEBS 2017 (Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems), pp. 315–318, 2017.
- Do Le Quoc, Ruichuan Chen, Pramod Bhatotia, Christof Fetzer, Volker Hilt, Thorsten Strufe, "StreamApprox: Approximate Computing for Stream Analytics", in ACM/IFIP/USENIX Middleware 2017, Las Vegas, NV, USA, December 2017.
- Jörg Thalheim, Antonio Rodrigues, Istemi Ekin Akkus, Pramod Bhatotia, Ruichuan Chen, Bimal Viswanath, Lei Jiao, Christof Fetzer, "Sieve: Actionable Insights from Monitored Metrics in Distributed Systems", in ACM/IFIP/USENIX Middleware 2017, Las Vegas, NV, USA, December 2017.
- Do Le Quoc, Martin Beck, Pramod Bhatotia, Ruichuan Chen, Christof Fetzer, Thorsten Strufe, "PrivApprox: Privacy-Preserving Stream Analytics", in USENIX ATC 2017, Santa Clara, CA, USA, July 2017.
- Chun-Nam Yu, Michael Crouch, Ruichuan Chen, Alessandra Sala, "Online Algorithm for Approximate Quantile Queries on Sliding Windows", in SEA 2016, St. Petersburg, Russia, June 2016.
- Ennan Zhai, David Isaac Wolinsky, Ruichuan Chen, Ewa Syta, Chao Teng, Bryan Ford, "AnonRep: Towards Tracking-Resistant Anonymous Reputation", in USENIX NSDI 2016, Santa Clara, CA, USA, March 2016.
- Ruichuan Chen, Istemi Ekin Akkus, Paul Francis, "SplitX: High-Performance Private Analytics", in ACM SIGCOMM 2013, Hong Kong, China, August 2013.