Adaptive Failure Detection Timers for IGP Networks
22 May 2013
Carrier networks need to provide their customers with highly available communication services. When a failure occurs, the IGP starts its convergence process in order to re-establish a consistent view of the network. In this process, failure detection, which is performed by the Hello protocol, accounts for a significant share of the unavailability. Indeed, quick failure detection requires fast Hellos, which cause false detections and instability. However, there are often warning signs that a network device is about to stop working properly. Based on an embedded, real-time risk-level assessment, we can adapt the Hello frequency of the sick nodes in real time and thus reduce unavailability while maintaining routing stability. The consequences in terms of Hello volume and availability have been estimated with an analytical model and then simulated to measure the expected benefits of the proposed proactive self-healing function.
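To make the idea of risk-driven Hello adaptation concrete, the following is a minimal sketch and not the model evaluated in this work. It assumes a risk level normalised to [0, 1], a linear mapping between a slow and a fast Hello interval, and a dead interval equal to a fixed multiple of the Hello interval. The function names, the 10 s and 1 s bounds, and the multiplier of 4 are all illustrative assumptions.

```python
# Illustrative sketch only: the names, bounds and linear mapping below are
# assumptions made for this example, not values taken from the paper.

def hello_interval(risk, slow=10.0, fast=1.0):
    """Return a Hello interval in seconds for a risk level in [0, 1].

    A healthy node (risk 0) keeps the slow interval; a node judged very
    likely to fail (risk 1) is probed at the fast interval.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be in [0, 1]")
    return slow - risk * (slow - fast)


def dead_interval(risk, multiplier=4, slow=10.0, fast=1.0):
    """Approximate worst-case detection time: a neighbour is declared
    down after `multiplier` consecutive Hello intervals without a Hello."""
    return multiplier * hello_interval(risk, slow, fast)


def hellos_per_hour(risk, slow=10.0, fast=1.0):
    """Hello messages sent per hour on one adjacency at this risk level."""
    return 3600.0 / hello_interval(risk, slow, fast)
```

Under these assumptions, only nodes with a high risk level pay the cost of frequent Hellos, while the rest of the network keeps the slow, stable setting. This is the trade-off between Hello volume and detection time that the abstract describes.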