In today’s fast-paced digital landscape, Communication Service Providers (CSPs) face immense pressure to maintain seamless connectivity and minimize downtime. When network disruptions occur, identifying the root cause, especially in complex multi-vendor IP/MPLS network environments, can be daunting. This is where Blue Planet’s Route Optimization and Analysis (ROA) proves invaluable. It provides customers with an unmatched ability to correlate performance degradation with routing events and helps them get to the bottom of critical issues as quickly as possible.

Blue Planet success story: helping a major CSP identify the root cause of BGP peering issues

Figure 1. ROA with Route Analysis can monitor and analyze complex IP networks running OSPF, IS-IS, EIGRP, and BGP routing protocols across multiple autonomous systems.

In a recent incident, Blue Planet ROA helped a major CSP customer identify the root cause of a service-impacting BGP peering issue that might otherwise have gone unresolved. I describe what transpired in more detail below.

What happened

A major CSP recently encountered a challenging BGP peering issue that caused significant disruptions in its network. The problem arose between route reflectors from Vendor A and provider edge (PE) routers from Vendor B.

At first glance, everything seemed normal; Interior Gateway Protocol (IGP) connectivity was intact, and the CSP’s monitoring tools indicated no apparent faults. Yet, earlier this year, the first PE router abruptly lost all its prefixes, followed by the second PE router just seconds later. This brief but impactful disruption raised alarms, and neither router vendor could pinpoint the root cause of the issue.

Facing a dead end, the CSP turned to Blue Planet ROA for a deeper forensic analysis, a decision that would ultimately reveal the hidden fault that other systems had missed.

How ROA helped

As I mentioned in my blog last January, Blue Planet ROA is known for its deep visibility into network events and its advanced analytics capabilities, particularly for correlating network performance with routing behavior. This case exemplifies how ROA’s capabilities help identify and address complex network issues.

  1. Correlating performance degradation with routing events: One of ROA’s standout features is its ability to correlate performance metrics, such as CPU utilization, with routing events like BGP peering flaps. In this instance, analysis in ROA revealed a critical CPU spike on PE routers that coincided precisely with the BGP session flaps. Though not reaching 100%, this CPU spike was nearly double the regular utilization rate and sufficient to disrupt BGP peering, leading to the withdrawal of prefixes.
  2. Forensic analysis of network events: Using ROA, the CSP reconstructed the sequence of events leading to the disruption, offering a detailed historical timeline that linked the CPU spike directly to the BGP issues. This level of insight was instrumental in identifying the root cause, something other monitoring tools and vendor analyses had failed to achieve.
  3. Giving you the clarity you need: ROA’s ability to correlate seemingly unrelated data points gave the CSP the clarity needed to address the issue. This highlights ROA’s proficiency in helping operators detect and correlate subtle performance degradations with network events.
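To make the correlation idea concrete, here is a minimal Python sketch of matching CPU spikes to BGP session events by router and time. It is not ROA’s implementation or API. The record types, the function names, the 1.8x-of-median spike threshold (chosen to echo "nearly double the regular utilization rate"), the 30-second matching window, and the sample values are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class CpuSample:          # hypothetical record type, not an ROA data model
    router: str
    timestamp: datetime
    utilization: float    # percent, 0-100


@dataclass(frozen=True)
class BgpEvent:           # hypothetical record type, not an ROA data model
    router: str
    timestamp: datetime
    description: str


def find_cpu_spikes(samples, ratio=1.8):
    """Return samples at or above `ratio` times their router's median utilization.

    The median stands in for the "regular utilization rate"; the ratio is an
    assumed threshold, not a value taken from the incident.
    """
    by_router = {}
    for sample in samples:
        by_router.setdefault(sample.router, []).append(sample)

    spikes = []
    for router_samples in by_router.values():
        baseline = median(s.utilization for s in router_samples)
        if baseline <= 0:
            continue  # no meaningful baseline to compare against
        spikes.extend(s for s in router_samples
                      if s.utilization >= ratio * baseline)
    return spikes


def correlate(spikes, events, window=timedelta(seconds=30)):
    """Pair each BGP event with CPU spikes on the same router within `window`."""
    return [(event, spike)
            for event in events
            for spike in spikes
            if spike.router == event.router
            and abs(spike.timestamp - event.timestamp) <= window]


# Synthetic example data (invented values, one sample per minute).
t0 = datetime(2024, 1, 1, 12, 0, 0)
samples = [CpuSample("pe1", t0 + timedelta(minutes=i), u)
           for i, u in enumerate([30, 32, 31, 60, 30])]
events = [BgpEvent("pe1", t0 + timedelta(minutes=3, seconds=5), "session down")]

for event, spike in correlate(find_cpu_spikes(samples), events):
    print(f"{event.router}: {event.description} at {event.timestamp:%H:%M:%S}, "
          f"CPU {spike.utilization:.0f}%")
# prints: pe1: session down at 12:03:05, CPU 60%
```

A spike that stays below 100% would never trip a simple saturation alarm, which is why this sketch compares each sample against the router’s own baseline instead of a fixed ceiling.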

Proactively preventing future network issues

This incident is another example of how Blue Planet ROA’s capabilities empower our customers to detect, diagnose, and resolve network issues that might otherwise remain hidden. By leveraging ROA’s advanced analytics capabilities, the CSP was not only able to understand the root cause of the disruption but also took proactive steps to prevent future occurrences.

In complex multi-vendor networks, having a tool like Blue Planet ROA is essential. It provides the deep insights necessary to correlate performance degradation with routing events, insights that are critical for maintaining network stability and ensuring service reliability.

The takeaway for CSPs facing similar challenges is clear: Blue Planet ROA is the go-to solution for uncovering hidden faults and guiding operators to resolution, especially in multi-vendor environments. Its ability to correlate network performance with routing events makes it a critical asset for operators aiming to deliver reliable, uninterrupted service in their increasingly complex IP/MPLS network environments.