Adaptive Thresholds: Monitoring Streams of Network Counts Online

01 January 2006

This paper describes a fast, statistically principled method for monitoring streams of network counts, which have long-term trends, rough cyclical patterns, outliers and missing data. The key step is to build a reference (predictive) model for the counts that captures their complex, salient features but has just a few parameters that can be kept up to date as the counts flow by, without requiring access to past data. This paper justifies using a negative binomial reference distribution with parameters that capture trends and patterns, together with method-of-moments estimators that can be computed quickly enough to keep up with the data flow. The reference distribution may be of interest in its own right for traffic engineering and load balancing, but a more challenging task is to use it to identify degraded network performance as quickly as possible. Here we detect changes in network performance not by monitoring quantiles of the predictive distribution directly but by applying control chart methodology to normal scores of the p-values of the counts. Using p-values adjusts for the lack of stationarity from one count to the next. Compared to thresholding isolated counts, control charting reduces the false alarm rate, increases the chance of detecting ongoing low-level events, and reduces the time to detection of long events. This adaptive count thresholding procedure is shown to perform well on both real and simulated data.
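
The pipeline summarized above (online moment updates, a negative binomial reference distribution, normal scores of p-values, and a control chart) can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several choices are assumptions made only for illustration: a single exponentially weighted mean and second moment stand in for the trend and cyclical parameters, a mid-p-value handles the discreteness of the counts, the chart is a one-sided EWMA chart, and the names `nb_params`, `nb_mid_p`, `CountMonitor` and the tuning values `w`, `lam`, `limit` are all hypothetical.

```python
import math
from statistics import NormalDist


def nb_params(mean, var):
    """Method-of-moments negative binomial parameters (size r, probability q).

    The variance is floored slightly above the mean (an assumption of this
    sketch) so that the distribution stays overdispersed and r stays finite.
    """
    var = max(var, 1.05 * mean)
    return mean * mean / (var - mean), mean / var


def nb_mid_p(x, r, q):
    """Mid-p-value P(X < x) + 0.5 * P(X = x) for a negative binomial count."""
    pmf = math.exp(r * math.log(q))  # P(X = 0)
    below = 0.0
    for k in range(1, x + 1):
        below += pmf                       # accumulate P(X = k - 1)
        pmf *= (k - 1 + r) / k * (1 - q)   # step to P(X = k)
    return below + 0.5 * pmf


class CountMonitor:
    """Online monitor that stores only a few numbers, never past counts."""

    def __init__(self, mean0, var0, w=0.05, lam=0.2, limit=3.0):
        self.mean = mean0                  # running mean of the counts
        self.m2 = var0 + mean0 ** 2        # running second moment
        self.w = w                         # weight for moment updates
        self.lam = lam                     # weight for the control chart
        self.threshold = limit * math.sqrt(lam / (2 - lam))
        self.chart = 0.0                   # EWMA of the normal scores

    def update(self, x):
        """Process one count; return True if the chart signals an alarm."""
        if x is None:                      # missing data: leave state as is
            return False
        r, q = nb_params(self.mean, self.m2 - self.mean ** 2)
        p = min(max(nb_mid_p(x, r, q), 1e-12), 1 - 1e-12)
        z = NormalDist().inv_cdf(p)        # normal score of the p-value
        self.chart = (1 - self.lam) * self.chart + self.lam * z
        alarm = self.chart > self.threshold
        if not alarm:                      # do not learn from suspect counts
            self.mean = (1 - self.w) * self.mean + self.w * x
            self.m2 = (1 - self.w) * self.m2 + self.w * x * x
        return alarm
```

A monitor would be created once, for example `CountMonitor(mean0=20, var0=40)`, and `update` called on each new count as it arrives. Freezing the reference model while the chart is signaling is one simple way to keep an ongoing event from being absorbed into the baseline; it is a design choice of this sketch, not something stated in the abstract.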