CHive: Bandwidth Optimized Continuous Querying in Distributed Clouds

01 January 2015

Bandwidth-efficient application of online big data analytics in telecom networks demands tailored solutions. Existing streaming analytics systems are designed to operate in large data centers: they assume unlimited bandwidth between processing nodes and ignore the bandwidth required to get all data from the event sources into the data center. Applying these solutions to telecom networks without modification ignores that bandwidth is the precious resource that makes the telecom network valuable to end-users. CHive (Continuous Hive) is a tool being developed inside Bell Laboratories that seeks to bridge this gap by offering a Hive-like solution tailored for distributed telco clouds. Similar to Hive, CHive aims to facilitate the execution of CQL/SQL-like analytics queries. In contrast to Hive, however, CHive is not designed to execute ad-hoc queries on large datasets stored in Hadoop, but to continuously execute queries on data collected in an online fashion, e.g. using Storm as the underlying Event Stream Processing (ESP) framework. Furthermore, when deploying and operating continuous queries in a telecommunication network, CHive aims to optimise the deployment of the associated query plans so as to minimise their overall bandwidth consumption.
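To make the bandwidth argument concrete, the following is a minimal sketch of a toy cost model. It is not CHive's optimiser. It assumes the query is a decomposable aggregate (such as a windowed COUNT), so each source can send one partial result per window instead of its raw events. The `Source` type, the cost functions, the hop-weighted cost metric, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical event source at the network edge."""
    name: str
    events_per_sec: float  # raw event rate at this source
    event_bytes: int       # size of one raw event
    hops_to_dc: int        # network hops to the central data center


def centralized_cost(sources):
    """Hop-weighted bytes/s when every raw event is shipped to the data center."""
    return sum(s.events_per_sec * s.event_bytes * s.hops_to_dc for s in sources)


def edge_aggregated_cost(sources, window_sec, summary_bytes):
    """Hop-weighted bytes/s when each source sends one partial aggregate per window."""
    return sum(summary_bytes / window_sec * s.hops_to_dc for s in sources)


# Illustrative numbers only (assumptions, not measurements).
sources = [
    Source("cell-a", events_per_sec=1000.0, event_bytes=200, hops_to_dc=3),
    Source("cell-b", events_per_sec=500.0, event_bytes=200, hops_to_dc=2),
]

central = centralized_cost(sources)
edge = edge_aggregated_cost(sources, window_sec=10.0, summary_bytes=100)
print(f"centralized: {central:.0f}, edge-aggregated: {edge:.0f}")
```

Under these assumed numbers, placing the aggregation step at the sources reduces the hop-weighted traffic by several orders of magnitude. A real deployment optimiser would also have to account for operators that cannot be decomposed this way, for node capacity, and for the network topology.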