The growing role of Artificial Intelligence in telecom operations
The role of AI in telecom operations is growing, which is fueling the need for operators to future-proof their networks to support the adoption of current and emerging AI models. Peter Briscoe, Vice President of Solution Architecture & Pre-Sales at Blue Planet, explains.
The Artificial Intelligence (AI) hype cycle has taken us from the early days of data warehousing through the promise of fully lights-out operations centers. Over the last few years, it has begun to deliver measurable value in domain-based AI Operations (AIOps). Yet key questions remain: What are the best use cases for your operations teams, and what do you need to do to future-proof AI adoption in your organization?
The drive for data-driven decision-making
The boardroom has been driving data use within companies' operations for many years. Many business reports have found that data-driven decision-making has a significant impact on a company's performance. A recent Harvard Business Review survey of more than 360 executives found that data and AI leaders outperformed their peers across a range of key business metrics, such as operational efficiency (81% vs. 58%), revenues (77% vs. 61%), customer loyalty and retention (77% vs. 45%), employee satisfaction (68% vs. 39%), and IT cost predictability (59% vs. 44%).
The approach has senior management support. The question, however, is how your organization can strike the right balance of data and tooling so that it helps critical decision-makers rather than overwhelming them.
Types of AI for telecom operations
All forms of AI, Machine Learning (ML) and other types of computer-driven data analysis can compress the time it takes to understand and react to operational status. AI technologies can generate insights from data faster than humans can, and often more accurately. Traditionally, the focus was on achieving a better understanding of the current state by generating insights from historical data (hindsight). These methods have been deployed for several years to help assure services and better determine the root cause of network issues.
Figure 1. Different types of insights
New growth areas for AI insight generation include building foresight algorithms for events that have yet to happen (i.e., predictions). Acting on these predictions can proactively prevent service impacts or service delivery fallout. This, in turn, helps maintain SLAs for customers, increase customer satisfaction, and reduce churn, which supports Average Revenue Per User (ARPU) levels and subscriber counts in an increasingly competitive landscape.
Generative AI (GenAI) uses language models to introduce a new way of interacting with large data sets and increase the usability of automated systems. Initial deployments are focused on customer care functions. Outside of customer care, this technology can help in many exciting ways. Let’s explore some of these further below.
Data and AI architecture approaches
The drive for modern central data warehouses within telecommunication companies began almost two decades ago. These warehouses help centralize insights. However, gathering data into large single locations has limited the speed of action, as collection might only occur every 24 hours. The technologies used and the amount of data collected have also started to push these systems to their capacity. To cope, we’re seeing some operators limit both the volume of data captured and the length of time for which it is retained, in some cases to only a couple of weeks. This hinders the ability of modern AI techniques to build insights. These limits, combined with data quality issues and technology barriers, reduce the overall value of some of these systems.
As a result, there has been a rise in domain-based AI and data analysis, where actionable insights are kept as close to the data collection source as possible. This approach can still use the information in a larger company-wide warehouse to build new algorithms but enables a faster reaction to the data within that domain. In network operations, speed is critical in reducing operational costs. This distributed approach to AI is crucial and can complement existing central warehouse systems.
Determining the value of AI in operations (AIOps)
Blue Planet has deployed AI at a large North American operator to predict fiber failures in the customer’s multi-vendor optical network. In this case, we identify indicators that predict fiber degradation with a probability of greater than 95%, up to 7 days in advance. This allows the operator to validate and mitigate potential service impacts by proactively addressing the root cause before any outage occurs. This has a significant benefit in avoiding SLA penalties for high-value services and out-of-hours operational costs.
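To make the idea of predicting degradation ahead of time more concrete, here is a minimal sketch of one very simple approach: fitting a straight line to daily received optical power readings and estimating how many days remain before the trend crosses an alarm threshold. This is an illustration only and not the model described above; the function names, the dBm readings, the threshold, and the 7-day horizon default are all assumptions chosen for the example.

```python
from statistics import mean


def days_until_threshold(readings_dbm, threshold_dbm):
    """Estimate days until a downward power trend crosses the threshold.

    readings_dbm: one received-power reading per day, oldest first (hypothetical data).
    Returns None when there is too little data or no downward trend.
    """
    n = len(readings_dbm)
    if n < 2:
        return None
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(readings_dbm)
    # Ordinary least-squares slope and intercept of power versus day index.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, readings_dbm))
    slope = sxy / sxx
    if slope >= 0:
        return None  # signal is flat or improving
    intercept = y_mean - slope * x_mean
    # Day index where the fitted line meets the threshold, relative to the last reading.
    crossing_day = (threshold_dbm - intercept) / slope
    return max(0.0, crossing_day - (n - 1))


def needs_proactive_ticket(readings_dbm, threshold_dbm, horizon_days=7):
    """Flag a fiber when the projected crossing falls within the horizon."""
    remaining = days_until_threshold(readings_dbm, threshold_dbm)
    return remaining is not None and remaining <= horizon_days
```

A production model would use many more signals than a single power reading and would report a calibrated probability rather than a straight-line projection, but the operational pattern is the same: project forward, compare against a horizon, and act before the outage.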
Figure 2. Example predictions from a fiber network
Another pre-packaged AI application Blue Planet has developed builds an understanding of the relationships between different network layers and domains. Here, we use ML to discover cross-layer connections by analyzing the performance patterns of network traffic traversing IP and underlay technologies. These insights lead to a more accurate multi-layer network model that allows faster correlation of events across the network. This AIOps innovation also stops multiple network operations teams from chasing the same fault at different technology layers and can effectively halve network troubleshooting costs while speeding up mean-time-to-repair (MTTR).
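The cross-layer discovery idea can be sketched with a simple statistical stand-in: if the traffic pattern on an IP link closely tracks the pattern on an underlay channel, the two are probably related. The example below uses Pearson correlation for that comparison. It is a simplified illustration under stated assumptions, not the ML method described above; the link names, the sample series, and the 0.9 cutoff are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def infer_cross_layer_links(ip_series, underlay_series, min_corr=0.9):
    """Map each IP link to the underlay channel whose pattern matches it best.

    Both arguments are dicts of name -> list of samples taken at the same times.
    Links with no candidate at or above min_corr are left unmapped.
    """
    links = {}
    for ip_name, ip_values in ip_series.items():
        best_name, best_corr = None, min_corr
        for underlay_name, underlay_values in underlay_series.items():
            corr = pearson(ip_values, underlay_values)
            if corr >= best_corr:
                best_name, best_corr = underlay_name, corr
        if best_name is not None:
            links[ip_name] = best_name
    return links


# Hypothetical utilization samples collected over the same five intervals.
ip_links = {"ip-link-1": [10, 20, 30, 40, 50], "ip-link-2": [50, 40, 45, 20, 10]}
underlay = {"och-A": [11, 19, 31, 42, 49], "och-B": [52, 38, 44, 22, 9]}
inferred = infer_cross_layer_links(ip_links, underlay)
```

Once such a mapping exists, an alarm on an underlay channel and an alarm on the IP link it carries can be grouped into one incident, which is what keeps separate teams from chasing the same fault.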
An area we are researching now is focused on using GenAI to simplify the building of targeted queries and dashboards within our products. Instead of needing IT expertise or advanced system skills to manually create database queries and new reports, users can employ natural language prompts that are automatically crafted into a view for their convenience. This helps answer questions like: “Show me how many times there has been a threshold SLA event on the Ethernet network in the NY metro area over the last 6 months”. This removes additional IT costs and leads to faster user- and data-driven decision-making.
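To show what sits behind such a prompt, the sketch below writes out by hand the kind of structured query that the example question would need to resolve to. The language-model translation step itself is not shown. The table name, column names, and sample rows are hypothetical, and an in-memory SQLite database stands in for whatever data store a real product would use.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few hypothetical SLA threshold events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sla_events (network TEXT, region TEXT, occurred_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO sla_events VALUES (?, ?, ?)",
        [
            ("ethernet", "NY metro", "2024-05-02"),
            ("ethernet", "NY metro", "2024-03-18"),
            ("ethernet", "NY metro", "2023-09-30"),  # outside the time window
            ("ethernet", "Boston metro", "2024-04-11"),  # different region
            ("optical", "NY metro", "2024-04-20"),  # different network
        ],
    )
    return conn


def count_sla_events(conn, network, region, since_iso):
    """Count threshold SLA events for one network and region since a given ISO date."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sla_events "
        "WHERE network = ? AND region = ? AND occurred_at >= ?",
        (network, region, since_iso),
    ).fetchone()
    return row[0]
```

With this shape in mind, the value of the natural-language layer is that the user supplies only the question, while the system fills in the table, filters, and date window that an analyst would otherwise have to write manually.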
AI and service orchestration
We see exciting ways that GenAI can be applied to other, more structured data, such as device configuration. Building a model of all valid device configurations within a network context could simplify intent-based orchestration, with natural language requests being used to drive service delivery in entirely new ways, including the configuration of automation platforms.
This AI-enhanced orchestration approach could help address one of the most significant ongoing automation system costs: device upgrades and technology version changes. Device and EMS upgrades frequently change how you configure a device or the structure of API calls. Typically, these changes require manual reprogramming of automation systems and extensive testing. This manual, error-prone process could benefit from AI-driven configuration, enabling consistent intent-based orchestration and accelerating the transition to a single, AI-driven change process. As shown in the figure below, IT would use AI to evaluate what needs to be modified and suggest or create new automation configurations to match the latest device versions or APIs.
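The first step in that process, working out what changed between two device versions, can be illustrated with a plain comparison of configuration schemas. The sketch below is deterministic and contains no AI; it only shows the kind of change report that an AI-assisted workflow might start from when proposing updated automation configurations. The schema format (parameter name mapped to a type name) and the parameter names are assumptions made for the example.

```python
def diff_config_schema(old_schema, new_schema):
    """Compare two schemas given as dicts of parameter name -> type name.

    Returns sorted lists of parameters that were added, removed, or changed type.
    """
    old_keys, new_keys = set(old_schema), set(new_schema)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(
            key for key in old_keys & new_keys if old_schema[key] != new_schema[key]
        ),
    }


# Hypothetical interface configuration schemas for two software versions.
schema_v1 = {"interface": "string", "mtu": "int", "vlan": "int"}
schema_v2 = {"interface": "string", "mtu": "string", "vlan_id": "int"}
report = diff_config_schema(schema_v1, schema_v2)
```

In this example the report shows `vlan` removed, `vlan_id` added, and `mtu` changed type. Recognizing that `vlan_id` is most likely a rename of `vlan`, and rewriting the affected automation templates accordingly, is the part that currently takes manual effort and where AI assistance could help.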
This is just one example of how AI might extend into orchestration processes. There are many more possible opportunities. To make them a reality, however, an open AI framework, like the recently announced AI Studio for the Blue Planet Cloud Native Platform, is central to building AI into domain automation systems. This flexible approach allows any mix of vendor-based AI models like those described above and independently developed AI models to be deployed for optimizing various operational processes using your specific situational knowledge. These tools widen the application and value of AI in the telecom operations space while speeding up the ability to use them and providing a level of future-proofing as new models are developed.
Glitches in the AI system
As with any new technology early on the maturity curve, some areas of AI need improvement.
Internal experimentation with GenAI has shown that the technology can produce phantom answers to technical questions that are not fully factually correct. Also known as “hallucinations,” these happen when the GenAI tries to form the best response possible but cannot find a direct match in the original data set, which is based on sources such as product documentation and knowledge-based answers. Others have also experienced issues with using AI for code generation. A Purdue University study of ChatGPT answers to 517 Stack Overflow questions found that 52% of the answers to programming-related queries were inaccurate. In addition, more than three-quarters (77%) of the answers were deemed “verbose.”
The increasing trend of AI-driven learning shows the potential to improve the accuracy of responses. However, model training needs checks and balances to provide the most accurate and insightful results. In telecom operations, getting it right the first time is the primary objective, as each misdirected insight costs additional operational expenditure (OPEX) as teams chase down false leads.
Converting hype to reality
AI and data-driven decision-making have come a long way within telecom operations functions, and there is only more growth to come. Ensuring the speed and accuracy of insights is critical to realizing the necessary cost-benefit return on AI investments. Domain-based AI and AIOps have a solid foundation to drive actual value. With care, the application of GenAI will further increase the number of use cases that AI can support. Finally, an open framework approach like our AI Studio will help future-proof operators' AIOps roadmaps and put them in the best position to convert the hype to reality.