Project overview
MLaaS - an end-to-end MLOps platform with pluggable components
MLOps is an engineering practice that leverages three contributing disciplines: machine learning, software engineering (especially DevOps), and data engineering. MLOps brings ML systems into production by bridging the gap between development (Dev) and operations (Ops). The MLOps space offers a plethora of tools, and new platforms emerge continuously.
Unlike traditional code development, ML model development is predominantly data-driven and requires version control of both the model and the data. In addition to the common DevOps challenges of scalability and optimal technical performance (latency and throughput), ML-specific challenges arise around model monitoring, model reproducibility, and lifecycle management.
Machine Learning as a Service (MLaaS) is an MLOps platform that addresses the challenges above, streamlines the process of taking ML models to production, and assists in maintaining and monitoring these models. The MLaaS architecture employs an orchestrator that operationalizes the five main pillars of ML: data processing, training, inferencing, model performance monitoring, and model lineage. MLaaS also provides a means to perform hyperparameter optimization, experiment tracking, model registration and deployment, as well as monitoring for drift, out-of-distribution scenarios, adversarial attacks, and model accuracy.
Project details
Objectives
MLaaS is designed and developed with the following objectives in mind:
- Consistency and high quality, with a unified approach to integrating AI/ML features.
- Addressing repeatability, reliability, traceability, explainability, scalability, flexibility, efficiency, and reproducibility.
- A decentralized, cloud and component-agnostic design, allowing for pluggable components.
- Ease of use, significantly reducing implementation efforts and supporting a seamless transition from development to production, enabling faster time to market.
- Rapid, simplified integration of the plethora of available and emerging ML components.
Significant achievements
Development of core assets:
- MLaaS Pipeline Creation Tool: A Python-based tool that simplifies the development and deployment of the MLaaS workflows executed under the orchestrator. By providing a uniform interface to the underlying components that accomplish a specific task, the MLaaS Pipeline Creation Tool minimizes the technology-specific knowledge required to develop, deploy, and maintain the pipelines (a conceptual usage sketch follows this list). Use of our tool results in a 50% reduction in the lines of code required for end-to-end operationalization of use cases.
- MLaaS Python Training Library: A Python-based library that enables users to make their ML model training-related code compatible with MLaaS. Use of the library simplifies and abstracts out the interaction with the underlying Experiment Tracking and Model Registration System(s) and minimizes the changes to the user's original training-related code (see the second sketch after this list).
- End-to-End MLOps Blueprints: We consider three different use cases and illustrate the following end-to-end MLOps functions: data processing, data version control, model training, experiment tracking, model registration, model deployment, inferencing, model monitoring, drift detection, ground truth labeling, model retraining, and model promotion. We also provide the associated workflow specifications, the task manifest files, the MLaaS configuration files, the application source code and Docker images.
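The actual Pipeline Creation Tool API is not reproduced on this page. The following is a self-contained, conceptual sketch of the pattern it describes: tasks are declared once through a uniform interface and compiled into an orchestrator-facing workflow specification. All names here (Pipeline, Task, compile) are invented for illustration, not the real API.

```python
# Conceptual sketch only -- not the actual MLaaS Pipeline Creation Tool API.
# It illustrates declaring tasks through a uniform interface and rendering a
# generic workflow specification that an orchestrator-specific backend could consume.
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    image: str                              # container image implementing the step
    params: dict = field(default_factory=dict)


@dataclass
class Pipeline:
    name: str
    tasks: list = field(default_factory=list)

    def add(self, task: Task) -> "Pipeline":
        self.tasks.append(task)
        return self

    def compile(self) -> str:
        # A real tool would emit the format expected by the target orchestrator.
        spec = {"pipeline": self.name,
                "tasks": [{"name": t.name, "image": t.image, "params": t.params}
                          for t in self.tasks]}
        return json.dumps(spec, indent=2)


pipeline = (Pipeline("churn-prediction")
            .add(Task("preprocess", "registry.example.com/churn/preprocess:1.0"))
            .add(Task("train", "registry.example.com/churn/train:1.0",
                      {"epochs": 20, "learning_rate": 1e-3}))
            .add(Task("register", "registry.example.com/churn/register:1.0")))
print(pipeline.compile())
```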
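Similarly, the Training Library API is not shown here. The sketch below only illustrates the wrapping pattern it describes, in which a user's existing training loop is changed minimally while experiment tracking and model registration happen behind a single abstraction; the context manager name and behavior are hypothetical.

```python
# Conceptual sketch only -- not the real MLaaS Python Training Library API.
# A hypothetical context manager stands in for the calls a real library would make
# to the configured experiment-tracking and model-registration backends.
from contextlib import contextmanager


@contextmanager
def mlaas_run(experiment: str):
    run = {"experiment": experiment, "params": {}, "metrics": {}}
    try:
        yield run
    finally:
        # Stand-in for logging the run and registering artifacts on exit.
        print(f"logged run for {experiment}: {run}")


# The user's original training loop needs only minimal changes:
with mlaas_run("churn-prediction") as run:
    run["params"]["learning_rate"] = 1e-3
    for epoch in range(3):
        loss = 1.0 / (epoch + 1)            # placeholder for the real training step
        run["metrics"][f"loss_epoch_{epoch}"] = loss
```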
Research on Large Language Models (LLMs) for Simplifying MLOps:
In this work, we study the utilization of Large Language Models (LLMs) for automatically adapting:
- Existing ML use-case code to new MLOps functionalities.
- Existing MLaaS codebase to new components to support the pluggability feature.
We conduct extensive benchmarking studies on LLM-based adaptation of use-case code for MLOps functionalities such as experiment tracking, hyperparameter optimization, version control, manifest generation, and model optimization. We utilize several in-context learning methods, including few-shot learning, retrieval augmentation, and task-specific knowledge enrichment techniques. We evaluate OpenAI and open-source LLMs. In this process, we create a data curation workflow that operates on a given tool documentation site, post-processes the content, and populates the results into a vector store (a minimal sketch follows).
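The sketch below illustrates that curation-and-retrieval flow under simplifying assumptions: the embedding function is a toy stand-in (a real pipeline would use a proper embedding model), the "vector store" is an in-memory list rather than a database, and the documentation chunks are invented examples.

```python
# Minimal sketch of the documentation -> vector store -> retrieval-augmented prompt flow.
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic embedding; replace with a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)


# 1. Curate: chunk the post-processed tool-documentation pages (scraping omitted).
doc_chunks = [
    "How to log parameters and metrics for experiment tracking.",
    "Configuring hyperparameter optimization sweeps.",
    "Registering a trained model and promoting it between stages.",
]

# 2. Populate the vector store with (chunk, embedding) pairs.
vector_store = [(chunk, embed(chunk)) for chunk in doc_chunks]

# 3. Retrieve the most relevant chunks for a request and build an augmented prompt.
query = "Add experiment tracking to my training script."
q_vec = embed(query)
ranked = sorted(vector_store, key=lambda item: float(item[1] @ q_vec), reverse=True)
context = "\n".join(chunk for chunk, _ in ranked[:2])
prompt = f"Using the documentation below, adapt the user's code.\n\n{context}\n\nTask: {query}"
print(prompt)
```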
We provide a comprehensive guide with valuable insights and best practices for working with LLMs and for adapting use-case code with MLOps functions. We show that the MLaaS codebase can be automatically adapted to support new components more than 5x faster than with conventional methods.
Drift Detection Research and Development in MLaaS
Once the ML models are deployed in production, one pressing concern is their degradation. Over time, as data distributions evolve and user behaviors shift, the efficacy of ML models can deteriorate. Drift detection is the process of continuously monitoring a model's performance in real-world scenarios and identifying deviations from its expected behavior. This proactive approach serves as an early warning system, alerting data scientists and engineers to the need for model retraining or adjustment.
Several techniques exist for drift detection, including statistical and ML-based methods, each with its unique advantages and applications. MLaaS supports canned (statistical) and customized (ML and non-ML based) drift detection methods and provides use-case blueprints for incorporating them into production workflows.
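For illustration, a minimal sketch of one such canned statistical check on a single numeric feature is shown below. It assumes scipy is available and uses a two-sample Kolmogorov-Smirnov test with an illustrative significance threshold; the data, window sizes, and threshold are not MLaaS defaults.

```python
# Minimal sketch of a statistical drift check on one numeric feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)   # training-time feature values
production = rng.normal(loc=0.4, scale=1.0, size=5_000)  # recent serving-time values (shifted)

result = ks_2samp(reference, production)
if result.pvalue < 0.01:                                  # illustrative significance level
    print(f"Drift suspected: KS statistic={result.statistic:.3f}, p={result.pvalue:.2e}")
else:
    print("No drift detected on this feature.")
```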
Our team is also working on innovative drift detection techniques. We introduce an application of the Chi-squared Goodness of Fit (GoF) Hypothesis Test as a drift detection algorithm in a Deep Learning environment. We avoid the vanishing p-value problem associated with hypothesis testing on big data by using proprietary techniques. We demonstrate the flexibility of the Chi-squared GoF test in detecting drift across various drift types, architectures, datasets, and applications.
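A bare-bones sketch of a chi-squared GoF drift check on binned values follows. It does not include the proprietary mitigation of the vanishing p-value problem mentioned above; the synthetic data, bin count, and threshold are illustrative choices only.

```python
# Sketch of a chi-squared goodness-of-fit drift check on binned feature/output values.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=20_000)    # e.g. training-time model outputs
production = rng.normal(0.3, 1.1, size=2_000)    # a recent, smaller serving window

# Bin edges are fixed from the reference distribution.
edges = np.histogram_bin_edges(reference, bins=10)
ref_counts, _ = np.histogram(reference, bins=edges)
prod_counts, _ = np.histogram(production, bins=edges)

# Expected frequencies come from reference proportions, scaled to the production count.
expected = ref_counts / ref_counts.sum() * prod_counts.sum()
result = chisquare(f_obs=prod_counts, f_exp=expected)
print(f"chi2={result.statistic:.1f}, p={result.pvalue:.3e}",
      "-> drift suspected" if result.pvalue < 0.01 else "-> no drift detected")
```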
Real world applications
As a flexible MLOps platform, MLaaS is applicable in many different use case scenarios. A few examples include:
- Customer churn prediction: By implementing end-to-end MLOps, businesses can build models to predict customer churn. MLaaS assists in gathering, cleaning, and analyzing customer data, training models to identify churn patterns, and deploying them to monitor customer behavior in real time. This allows managers to proactively intervene and retain at-risk customers, reducing customer churn and increasing revenue.
- Predictive maintenance: Enables the implementation of predictive maintenance strategies for machinery and equipment. By analyzing sensor data and historical maintenance records, models can be trained to proactively predict when maintenance is required to prevent costly breakdowns. MLaaS automates the monitoring of sensor data, triggers alerts for maintenance actions, and optimizes maintenance schedules, thereby reducing downtime and maintenance costs.
- Fraud detection: Automates the process of training models on large datasets, continuously monitoring for new fraud patterns, and deploying updated models to production. This ensures timely identification and prevention of fraudulent activities, saving a company from financial losses and maintaining customer trust.
Additional research areas
The MLaaS team is currently conducting research related to the following activities:
- Orchestration of Federated Learning and energy trading in the energy management space.
- Integration with a home-grown factory monitoring and management system.
- Integration with a home-grown privacy-preserving Federated Learning / distributed training system.
- Integration with open-source and home-grown drift detection and adaptation systems.
Project members
APA style publications
Sontakke, S. A., Ramanan, B., Itti, L., & Woo, T. (2022). Model2Detector: Widening the information bottleneck for out-of-distribution detection using a handful of gradient steps. RAISA Workshop at AAAI 2022. https://doi.org/10.48550/arXiv.2202.11226