April 01, 2024
The customization of services in Fifth-generation (5G) and Beyond 5G (B5G) networks relies heavily on network slicing, which creates multiple virtual networks on a shared physical infrastructure, tailored to meet the specific requirements of distinct applications, using Software Defined Networking (SDN) and Network Function Virtualization (NFV). Because network services must meet the performance and reliability requirements of various applications and users, service assurance is a critical component of network slicing. One of the key functionalities of network slicing is the ability to scale Virtualized Network Functions (VNFs) in response to changing resource demand and to meet customer Service Level Agreements (SLAs).
In this paper, we introduce a proactive closed-loop algorithm for end-to-end network orchestration, designed to provide service assurance in 5G and B5G networks. The algorithm dynamically scales resources to meet key performance indicators (KPIs) specific to each network slice and operates in parallel across multiple slices, making it scalable and capable of fully automated, real-time service assurance. Through our experiments, we demonstrate that the proposed algorithm effectively fulfills service assurance requirements for different network slice types, while minimizing network resource utilization and reducing the over-provisioning of spare resources.
5G/B5G resource allocation, 5G and B5G network slicing, closed-loop algorithm, end-to-end network slicing, proactive service assurance, quality of service, service assurance.
5G and B5G cellular networks have introduced significant improvements in wireless communication technology, providing higher transmission rates, ultra-low latency, and enhanced capacity. However, with the ever-increasing demand for low-latency and high-bandwidth applications, the current 5G architecture faces a range of challenges, including resource allocation for network slicing, ensuring quality of service, and maintaining security and isolation among network slices, as reported in recent studies [1], [2]. These challenges must be effectively addressed in order to satisfy the changing requirements of both network operators and end-users. Furthermore, B5G networks will likely require load adaptability and scalability to meet the growing demand for 5G and B5G applications. To meet the ever-increasing demand for high-speed, highly reliable, and responsive data transmission, telecommunication networks have undergone significant transformations in recent years. Consequently, wireless networks must support dynamic network slicing, VNFs, and SDN [3] to allow vendors to provide different kinds of services and fulfill the specific needs of each application type. The concept of network slicing, which enables a single physical network infrastructure to be partitioned into multiple virtual networks, has emerged as a promising solution to meet the distinct requirements of various applications and customers; see Zhang [4].
To effectively manage and utilize network slices, the infrastructure should be able to dynamically allocate resources based on the service requirements of each slice. These requirements span services with a wide variety of quality of service (QoS) needs, such as enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (uRLLC), and massive machine-type communications (mMTC) [5]. In the context of VNF scaling within Service Assurance (SA), resource allocation is the process of assigning specific resources, such as CPU, memory, and storage, to VNF instances. This allocation can be done manually or automatically. Manual allocation requires more time and is more susceptible to errors, but it provides greater control over resource usage; automated allocation can be more efficient, but it does not always allocate resources optimally. In particular, the VNF auto-scaling process involves balancing network actions and spare resources to meet the QoS targets while also achieving cost savings [6]. To guarantee QoS, a common practice is to allocate additional resources to network components to accommodate the traffic load demand. However, excessive allocation of resources reduces cost savings. Achieving the right balance between resource allocation and cost savings is therefore crucial in the auto-scaling of VNF instances: proper resource allocation can ensure that performance requirements are met, while cost savings can be achieved by avoiding excessive allocation of resources.
To maintain optimal network performance and resource utilization, closed-loop control mechanisms must be implemented in network slicing [7]. A closed-loop algorithm is a feedback control system used in the service assurance of wireless networks to improve network performance and maintain service quality. Closed-loop control mechanisms play a vital role in continuously monitoring performance and resource utilization, enabling real-time responses to satisfy the distinctive requirements of each slice in functional domains such as the radio access network (RAN), transport network (TN), and core network (CN). Achieving service assurance can involve modifying network configurations, reallocating resources, or improving the management of traffic flows. For instance, if a slice needs more capacity, the closed-loop management and orchestration can assign additional resources to the slice in real time without affecting the effectiveness of other slices. A closed-loop algorithm in a wireless network typically involves three steps, illustrated by the code sketch after the list:
Monitoring: Monitor the network performance and collect data on KPIs, such as network throughput, latency, and packet loss. This data is then analyzed to identify any issues that may be affecting network performance.
Analysis: Analyze the data and identify the root cause of any issues that are affecting the network performance. This may involve analyzing network traffic patterns, identifying bottlenecks, or diagnosing specific network components that are causing issues.
Control: Implement control mechanisms to address the issues identified in the analysis phase. This may involve adjusting network configurations, rerouting traffic, or implementing QoS measures to prioritize traffic.
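To make the loop concrete, the minimal Python sketch below shows one possible shape of such a monitor-analyze-control cycle for a single slice. The KPI targets and the `telemetry`/`orchestrator` interfaces are illustrative placeholders and are not part of the architecture described in this paper.

```python
# Hypothetical sketch of a monitor-analyze-control loop for one slice.
import time

KPI_TARGETS = {"latency_ms": 20.0, "packet_loss": 0.01}   # assumed per-slice targets

def monitor(telemetry, slice_id):
    """Collect the current KPI samples for the slice."""
    return telemetry.read_kpis(slice_id)                   # e.g. {"latency_ms": 12.3, ...}

def analyze(kpis):
    """Return the KPIs that violate their targets."""
    return [k for k, target in KPI_TARGETS.items() if kpis.get(k, 0.0) > target]

def control(orchestrator, slice_id, violations):
    """Trigger a corrective action when a violation is detected."""
    if violations:
        orchestrator.scale_up(slice_id)                     # e.g. add resources or instances
    # otherwise leave the slice configuration unchanged

def closed_loop(telemetry, orchestrator, slice_id, period_s=300):
    while True:
        violations = analyze(monitor(telemetry, slice_id))
        control(orchestrator, slice_id, violations)
        time.sleep(period_s)                                # one control-loop time window
```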
The utilization of a closed-loop algorithm for 5G and B5G network slicing yields numerous advantages. Implementing such a mechanism helps guarantee that resources are allocated optimally, so network efficiency can be improved. Additionally, network performance can be enhanced by reducing latency and jitter, and the network's effectiveness can be improved by mitigating congestion and service disruptions. However, the development of a closed-loop algorithm also presents several challenges [8]. The network often has a high degree of complexity and is subject to various factors that may impact its operational efficiency. Furthermore, the addition of new hardware and applications causes the network to constantly evolve. The 5G network is made up of a variety of different components across the RAN, TN, and CN, which may also include cloud computing resources, making it heterogeneous. Thus, numerous research activities are currently in progress to devise closed-loop algorithms for network slicing in 5G and B5G networks, carried out by diverse entities such as academic institutions, industrial bodies, and governmental agencies. A closed-loop algorithm should possess certain characteristics such as:
Scalability: The algorithm should have the capability to scale and accommodate the numerous devices and applications anticipated to connect to the network.
Reliability: The algorithm should operate dependably despite network failures and congestion.
Security: The algorithm should protect the network against security risks, including denial-of-service attacks, and preserve slice isolation.
Efficiency: The algorithm should utilize resources efficiently.
This paper conducts an in-depth examination and enhancement of a scalable proactive closed-loop algorithm, named PCLANSA (Proactive Closed-Loop Algorithm for Network Slicing Assurance), with a focus on proactive service assurance in 5G and B5G networks enabled with network slicing. The algorithm is designed to optimize, on a best-effort basis, the utilization of network resources while ensuring compliance with the QoS requirements of the different slices. Moreover, PCLANSA is designed with flexible parameters that help it adapt seamlessly to varying conditions and diverse network resources. This adaptability enhances its performance across various scenarios and contributes to its versatility in addressing different QoS requirements. The paper is organized as follows. An overview of related work is presented in Section 2, followed by a discussion of End-to-End (E2E) orchestration in Section 3, where we also present our E2E network slicing infrastructure. The design of our proposed proactive closed-loop algorithm for service assurance in 5G/B5G networks is detailed in Section 4. In Section 5, we present an analysis of the efficiency of PCLANSA on E2E network slicing. Finally, we draw conclusions and discuss the benefits of this approach in Section 6.
The management of service assurance in 5G and B5G networks is currently an important area of investigation. It implies the efficient management of network resources, and various algorithms have been proposed to dynamically allocate compute resources to VNF instances and manage network resources. The objective is to fulfill the SLAs while simultaneously minimizing resource utilization, operating costs, and energy consumption.
To effectively handle compute resources within 5G networks, Ren et al. [9] proposed the use of NFV to virtualize network components in the core network, commonly referred to as the virtual Evolved Packet Core (vEPC). The authors also designed the Dynamic Auto Scaling Algorithm (DASA) for virtualizing the EPC in 5G networks. This algorithm uses a combination of queuing theory and cost analysis to balance the trade-off between performance and operation cost while considering the capacity of legacy network equipment. Several subsequent algorithms (see, e.g., [10]) have since been proposed for dynamically scaling VNF instances in the vEPC in order to meet performance requirements while minimizing resource usage. Ren et al. [11] also proposed the Adaptive VNF Scaling Algorithm (ASA), designed to balance the cost-performance trade-off in 5G mobile networks. This algorithm uses an analytical model to determine the optimal number of VNF instances needed to handle data traffic while minimizing the cost.
While the aforementioned algorithms demonstrate considerable potential in addressing the challenges of dynamically scaling VNF instances, their primary focus remains within one network domain, with limited configurable parameters. Furthermore, their efficiency is heavily dependent on the current traffic load, without taking into account other factors such as delay, jitter, packet loss, etc. This makes them less adaptable to the varying requirements posed by diverse network applications. For instance, accommodating the distinct requirements of applications such as eMBB and mMTC might prove challenging.
Considering the literature on cellular networks, various categories of "slicing problems" have received attention. One central area is the challenge of allocating physical node resources between slices, including the allocation of resource blocks within the CN and RAN [12]–[16]. In the context of 5G network slicing, a VNF refers to a software-based network function (instantiated as a virtual machine or container) that operates on virtualized infrastructure, delivering services and functionality within network slices. The challenge of VNF resource allocation in SA mainly focuses on optimizing the utilization of existing resources while satisfying the various requirements of distinct network slices. These requirements include, but are not limited to:
Resource scarcity: competition for limited resources, such as computing power, memory, storage, and network bandwidth, is intense among network slices. Efficient resource allocation is therefore essential to achieve optimal performance and avoid resource conflicts.
QoS requirements: network slices may exhibit diverse QoS requirements, that include factors such as latency, throughput, reliability, and availability. In order to fulfill the service-level agreements (SLAs) for each slice, it is essential that resource allocation takes into account these particular requirements.
Dynamic resource demands: the demands for resources may vary dynamically depending on factors such as network traffic patterns, user behaviours, and application requirements. Thus, the adaptability and real-time responsiveness of the resource allocation mechanism is important.
Multi-dimensional resource optimization: the process of resource allocation within the context of 5G network slicing entails the simultaneous optimization of various dimensions, including but not restricted to CPU utilization, memory usage, power consumption, and network bandwidth. Therefore, the optimization problem for satisfying the QoS requirements for multiple slices while maintaining a balance between the different dimensions is a complex challenge.
Isolation and security: ensuring proper isolation between slices is crucial to prevent interference, unauthorized access, and data breaches. Additionally, maintaining robust security measures within each slice is imperative to protect sensitive information and mitigate potential vulnerabilities.
Resource allocation policies: to ensure optimal performance, it is essential to design resource allocation policies that effectively allocate resources based on the specific needs (such as those derived from SLA) and priorities of each slice. This requires considering factors such as QoS requirements, traffic demands, latency constraints, and dynamic resource allocation.
Some research efforts use closed-loop mechanisms to address network slicing SA challenges [17]–[19]. The majority of the current research on slice embedding concentrates on addressing a one-shot optimization problem, which involves optimizing resource allocation based on average and/or static demands. The primary goal of SA, however, is to dynamically allocate resources in real time to VNF instances or network installations in order to meet SA requirements while minimizing resource usage.
In summary, the aforementioned articles exhibit some limitations, notably the lack of consideration of network KPIs based on the various QoS requirements of different slices. Our study aims to address these limitations and to determine an efficient approach to resource allocation based on a forecasting model. In particular, it takes into account a wider range of factors, including the link data rate, the concept of E2E network slicing, and the distinct quality of service of each slice.
The trend of network softwarization involves an extensive redesign of the creation, implementation, deployment, management, and maintenance of network equipment and components through the use of software programming. This approach leverages the inherent characteristics of software, such as flexibility and rapid design, development, and deployment, throughout the whole life cycle of network equipment and components. Two distinct architectures for the 5G core network have been established by the 3rd Generation Partnership Project (3GPP), namely the reference point architecture and the service-based architecture [20]. Within the reference point architecture, a distinct reference point is established between two distinct network functions, thereby enabling the functions to communicate with one another via these reference points.
In the service-based architecture, network functions expose their capabilities to one another through common service-based interfaces. One defining aspect of the 5G CN specified by 3GPP is the decoupling of the user plane function (UPF) from the control plane functions. Through this approach, the new architecture achieves flexibility, efficiency, and scalability in both the development and operation of 5G/B5G networks. Moreover, the system can enhance resource allocation by exploiting traffic patterns and demands. The control plane functions dynamically deliver and distribute resources, including radio bearers and QoS parameters, whereas the UPF focuses on optimizing data transmission efficiency.
With this new design, the concept of E2E orchestration has recently gained prominence in the domain of 5G and B5G networks [21]. Orchestration refers to the comprehensive management and coordination of multiple network functions, resources, and services across the network infrastructure, providing a high degree of flexibility, efficiency, and operational automation. As illustrated in Fig. 1, taken from 3GPP, end-to-end orchestration provides a holistic strategy for handling network operations, supporting operators in efficiently managing and enhancing all network components, from the RAN through the TN to the CN. The E2E network slicing infrastructure employs various management domains and utilizes modern SDN and NFV technologies to facilitate flexible resource allocation, service chaining, and policy enforcement. Thus, E2E network slicing involves a physical infrastructure comprising network, computing, and storage resources that are programmable and embedded throughout the end-to-end communication paths.
The studies [23], [24] provide a thorough overview of E2E network slicing in both vertical and horizontal directions, with a detailed discussion of slice isolation and of application use cases that enable a comprehensive infrastructure for network slicing in a 5G network. These use cases highlight the significance of slice isolation, which guarantees the independent and secure operation of each network slice without interference from other slices or external factors. Ensuring the confidentiality, integrity, and efficiency of each slice is of the highest priority in situations where sensitive or vital applications are deployed. Thus, mechanisms and techniques need to be revisited to enforce slice isolation effectively and to address the various challenges that arise in this scenario, including resource allocation, traffic management, and security enforcement. As per the definition provided in [25], the concept of network slicing comprises three distinct layers.
The Service Instance Layer refers to the provision of services to end-users or businesses that are supported. A service instance is the representation of each individual service.
The Network Slice Instance Layer covers the various network slice instances that are available for provisioning. A network slice instance is responsible for delivering the necessary network functionalities to support the service instance.
The Resource Layer is responsible for providing all requisite virtual or physical resources and network functions essential for the instantiation of a network slice.
Despite the numerous benefits that E2E network slicing offers for 5G and B5G networks, there remain certain gaps in knowledge and research opportunities [26], such as RAN virtualization and slicing, holistic and intelligent slice orchestration, secure sliced networks, and quality of service in network slicing. Drawing from the above review, we construct below a 5G E2E network slicing architecture in a simulator environment, with the goal of implementing and addressing intelligent network management in the context of service assurance.
Fig. 2 depicts a high-level view of our E2E network slicing infrastructure with a closed-loop algorithm on a 5G network.
It consists of five primary components, including slice control, MANagement and Orchestration (MANO), virtualized networks/platforms, physical infrastructure, and a closed-loop algorithm. In the virtualized networks/platform, there exists a set of commonly shared network functions (NFs) [27], [28], including but not limited to the Network Slice Selection Function (NSSF), Policy Control Function (PCF), and Access and Mobility Management Function (AMF). This approach offers several advantages, such as cost savings on hardware and software, enhanced network efficiency via a reduction in the number of VNF instances that must be deployed, and increased network scalability by facilitating the creation of new network slices. In addition, slice control is used to establish and manage network slices, enforce slice policies, and monitor slice performance. In cooperation with slice control, the MANO component is in charge of ensuring optimal network performance and functionality. This includes facilitating network visibility, equipping network administrators with effective management tools, and automating network management processes [29]. Next, a closed-loop algorithm has been integrated with the goal of improving coordination between the slice control and MANO components. This integration enables the management of network slicing in a reliable and efficient way, while also aligning with customer requirements and satisfying QoS. Finally, the aforementioned components control and manage a shared physical infrastructure in order to establish an E2E network that exploits the entirety of the network capability, from the RAN through the TN to the CN. Thus, every E2E network slice is created with an isolated virtual network, a set of VNF instances, and dedicated virtual computing and storage resources, along with several shared common NFs.
Currently, the research on 5G network slicing is concentrated on creating a multi-service network capable of accommodating a broad spectrum of verticals, each with unique performance and service prerequisites [30]. The implementation of network slicing results in raised complexity when managing the network, particularly in scenarios involving a substantial quantity of VNF instances. Therefore, it is necessary to develop automated management and orchestration solutions. At this point, there is a lack of a cohesive consensus regarding the precise structure and extent of MANO for network slicing [5], [21], [31].
When auto-scaling a VNF instance, there is a trade-off between QoS and cost saving. Within the context of network resources, the operator often allocates more resources to guarantee QoS, but allocating too many resources reduces cost savings. In network slicing, each slice is indexed by \(s\in S\), where \(S\) is the set of network slice instances, and it may serve multiple applications. Let \(R\) be the set of compute resources, i.e., \(R= \{\mathrm{\small cpu}, \mathrm{\small ram}, \mathrm{\small sto}\}\), indexed by \(r\). At any given time, each slice \(s\) requires a set of VNF instances, denoted by \(\text{VNFset}\textit{i}_{s}\), where \(\text{VNFset}\textit{i}= \bigcup_{s\in S} {\text{VNFset}\textit{i}}_{s}\) is the set of all the VNF instances in the network. In addition, each VNF instance in \(\text{VNFset}\textit{i}\) is an occurrence of a VNF, indexed by \(\mathrm{\small v}\). For each slice \(s\), each VNF instance vnfi is allocated a specific amount of each resource, i.e., cpu, ram, and sto (storage) capacity; each resource \(r\in R^{\mathrm{\small v}}\), where \(R^{\mathrm{\small v}}\) is the set of resources required by a VNF instance, has a capacity \(\mathrm{\small cap}_{r}\) and a resource utilization (load) \({U}_{r}\). In any part of the network from the RAN to the CN, there is a collection of physical machines (PMs). Note that VNF instances are instantiated on physical machines through the virtualization platform, and each physical machine pm provides a capacity \(\mathrm{\small cap}_{r}^{\mathrm{\small pm}}\) of each resource. At all times, the resources utilized across all slices should not exceed those provided by the physical machines:
\[\label{eq:check_total_slice_cap_with_phy} \sum\limits_{\mathrm{\small v}\in \text{VNFset}\textit{i}} \sum\limits_{\mathrm{\small vnf}\textit{i}} \mathrm{\small cap}^{\mathrm{\small vnf}\textit{i}}_{r} \leq \sum\limits_{\mathrm{\small pm}\in \text{PMset}} \mathrm{\small cap}_{r}^{\mathrm{\small pm}} \qquad \forall r\in R.\tag{1}\]
For instance, consider a CN with a data center configuration of 2 PMs. Each PM has a capacity of 2 cpu(s), 3 GB of ram, and 5 GB of sto. At any time, the total amount of resources used by network slices, allocated to VNF instances, must not exceed 4 cpu(s), 6 GB of ram, and 10 GB of sto, according to Formula 1.
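As a small illustration of the capacity check in Formula 1, the snippet below encodes the 2-PM example above; the dictionary layout and the VNF allocations are invented for illustration only.

```python
# Check Formula (1): per resource, total VNF allocation <= total PM capacity.
pm_capacity = [
    {"cpu": 2, "ram": 3, "sto": 5},               # PM 1
    {"cpu": 2, "ram": 3, "sto": 5},               # PM 2
]
vnf_capacity = [
    {"cpu": 1.0, "ram": 1.5, "sto": 2.0},         # VNF instance of slice A
    {"cpu": 1.5, "ram": 2.0, "sto": 3.0},         # VNF instance of slice B
]

def satisfies_formula_1(vnf_capacity, pm_capacity):
    resources = pm_capacity[0].keys()
    return all(
        sum(v[r] for v in vnf_capacity) <= sum(p[r] for p in pm_capacity)
        for r in resources
    )

print(satisfies_formula_1(vnf_capacity, pm_capacity))   # True: 2.5<=4, 3.5<=6, 5.0<=10
```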
Nowadays, 5G/B5G networks are expected to provide network connectivity not only to classical devices (e.g., tablets, smartphones) but also to the Internet of Things (IoT), which will drastically increase the number of connected devices on the network [32]. In order to assess the efficiency of a network, it is necessary to conduct regular performance monitoring. The orchestration framework often utilizes real-time monitoring, analytics, and predictive capabilities to effectively allocate network resources in accordance with traffic patterns, customer behaviour, and service demands. Dynamic resource optimization is envisioned to improve network efficiency, minimize packet loss, and reduce latency to enhance user experience. During operation, the monitoring component collects network information including throughput, E2E latency, packet loss, and VNF instance information in all slices. Depending on the KPI requirements of each slice, the QoS model may vary and have different priorities. Within the context of this research, a proactive closed-loop algorithm is proposed to operate within the CN and the TN. The algorithm is used to optimize the number of VNF instances, as well as the required cpu, ram, and storage resources and the network bandwidth, for E2E network slicing. It is assumed that:
All requests originating from a UE and its applications have been pre-categorized and accurately assigned to their respective slices (e.g., video streaming, latency-sensitive applications, and so on). The implementation of this task may vary depending on the system’s design and can be accomplished through various methods [33], [34].
Each E2E network slice is isolated [35] by virtualization technology including network switching, router, and compute resources and can be controlled by software.
The scaling process is performed with a dedicated algorithm by MANO: the method of scaling may vary based on the algorithm used in MANO, e.g., creating a dedicated VNF instance with a new configuration, then switching the traffic to a new VNF instance and removing the old one. Alternatively, the existing VNF instance could be updated directly (hot scaling) during this dynamic period.
The monitoring component and MANO provide all the network information used in PCLANSA: network KPIs, number of VNF instances for each slice, and slice information (throughput, E2E latency, packet loss, etc.).
The proposed architecture in Fig. 3 leverages the closed-loop algorithm at the slice level to enable parallel SA processing and to minimize the complexity of the algorithm in terms of development and scalability. This approach enhances the network design’s ability to expedite the processing and execution of actions. At each time window \(t\), PCLANSA estimates the slice resource consumption per throughput unit (e.g., Mbps, Gbps) as follows: \[\label{eq:estimate_resources_per_th_unit} \text{Req}_{s}= \sum\limits_{\mathrm{\small v}\in \text{VNFset}\textit{i}_{s}} \sum\limits_{r\in R^{\mathrm{\small v}}} \frac{ \mathrm{\small cap}_{r} \cdot U_{r}}{\mathrm{\small th}_{s}}\tag{2}\] where \(\mathrm{\small th}_{s}\) is the throughput of the given slice. By tracking the slice resource consumption per throughput unit, PCLANSA can efficiently identify changes in traffic load and adjust resource allocation in response, whether resources need to be increased or decreased. The number of VNF instances currently assigned to a specific network slice can be determined by dividing the total resource configuration of the slice by the maximum physical resources per VNF instance. The number of VNF instances needed for a given slice during the upcoming time window can then be computed as follows: \[\label{eq:determine_number_of_vnf_instances} \gamma_{s} = \ceil*{\max\limits_{R} \left\{ \frac{\sum\limits_{\mathrm{\small v}\in \text{VNFset}\textit{i}_{s}} \sum\limits_{\mathrm{\small vnf}\textit{i}} \sum\limits_{r\in R^{\mathrm{\small v}}} \mathrm{\small cap}_{r}^{\mathrm{\small vnf}\textit{i}}}{\mathrm{\small cap}^{\max}_{sr}} \right\}}\tag{3}\] where \(\mathrm{\small cap}^{\max}_{sr}\) is the maximum allowed resource capacity per VNF instance that can be instantiated in the slice.
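The following sketch shows how Equations 2 and 3 can be evaluated for a slice. The per-instance capacity and utilization values are invented for illustration, and the dictionary-based representation is an assumption rather than the paper's data model.

```python
import math

def resource_per_throughput(vnf_instances, th_s):
    """Eq. (2): slice resource consumption per throughput unit (th_s in Mbps)."""
    return sum(
        inst["cap"][r] * inst["util"][r] / th_s
        for inst in vnf_instances
        for r in inst["cap"]
    )

def required_vnf_instances(vnf_instances, cap_max):
    """Eq. (3): number of instances needed, driven by the most constrained resource."""
    totals = {}
    for inst in vnf_instances:
        for r, c in inst["cap"].items():
            totals[r] = totals.get(r, 0.0) + c
    return math.ceil(max(totals[r] / cap_max[r] for r in totals))

slice_vnfs = [
    {"cap": {"cpu": 2.0, "ram": 1.0, "sto": 1.2},
     "util": {"cpu": 0.7, "ram": 0.5, "sto": 0.3}},
    {"cap": {"cpu": 1.0, "ram": 0.5, "sto": 0.6},
     "util": {"cpu": 0.6, "ram": 0.4, "sto": 0.2}},
]
print(resource_per_throughput(slice_vnfs, th_s=100.0))                           # 0.0318 per Mbps
print(required_vnf_instances(slice_vnfs, {"cpu": 3.0, "ram": 1.0, "sto": 1.2}))  # 2
```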
Relying upon the results derived from Formula 2, PCLANSA is able to compute the amount of resources required to process a single throughput unit. Subsequently, this information can be utilized in combination with a machine learning (ML) agent model to forecast the amount of compute resources necessary for a given slice. Thus, with the assistance of an ML agent, we can predict the throughput at the next time step. This time step can be configured as \(t+1\) or \(t+n\), and PCLANSA can estimate the amount of compute and link resources \(s^{r, \ell}\) needed for a given slice. Please note that \(s^{r, \ell}\) is a vector formed by combining compute and link resources (creating a higher-dimensional vector), defined as: \[\label{eq:calculate_slice_resources} s^{r, \ell}= (\widehat{\mathrm{\small th}}_{s} \cdot \text{Req}_{s}, B^{\widehat{\mathrm{\small th}}}_{s})\tag{4}\] where:
\(\widehat{\mathrm{\small th}}_{s}\): Predicted throughput in the time window \(t+ 1\) (or \(t+ n\) depending on the model configuration).
\(B^{\widehat{\mathrm{\small th}}}_{s}\): throughput boundary obtained from a traffic prediction model; see the next paragraph for clarification and Equation 5 for its value.
To determine the predicted throughput \(\widehat{\mathrm{\small th}}_{s}\) of a given slice, a machine learning model, LSTM-FSD, is used to analyze the historical data; the model is deployed in conjunction with PCLANSA to obtain the predicted traffic. It is important to acknowledge that the ML agent cannot guarantee 100% accuracy. Therefore, PCLANSA implements a monitoring interval \(t'\) to prevent abnormal traffic or incorrect predictions from triggering excessive scaling. During the monitoring interval \(t'\), an error \(e\) is computed at each timestamp and used to establish the traffic boundary. Through this methodology, it is possible to maintain a more consistent prediction of traffic fluctuations and a more stable scaling procedure. Consequently, a traffic boundary based on the forecasting model is defined as follows (a numerical sketch follows the definitions below): \[\label{eq:traffic_boundary} B^{\widehat{\mathrm{\small th}}}_{s}= \varepsilon\cdot \widehat{\mathrm{\small th}}_{s} + E_{t'}\tag{5}\]
\(\varepsilon\): the accuracy of the traffic prediction model, e.g., the accuracy of the LSTM-FSD model.
\(E_{t'}\): is the average error rate between the actual and the predicted traffic in the monitoring time window \(t'\).
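A small numerical sketch of Equations 4 and 5 is given below; the predicted throughput, accuracy, error samples, and per-unit requirement are made-up values standing in for the LSTM-FSD output and for Equation 2.

```python
def traffic_boundary(predicted_th, epsilon, errors):
    """Eq. (5): boundary around the predicted throughput."""
    avg_error = sum(errors) / len(errors)          # E_t' over the monitoring window
    return epsilon * predicted_th + avg_error

def slice_resource_vector(predicted_th, req_per_unit, boundary):
    """Eq. (4): compute resources plus link resources for the next window."""
    return (predicted_th * req_per_unit, boundary)

predicted_th = 120.0                               # Mbps forecast for t+1
boundary = traffic_boundary(predicted_th, epsilon=0.814, errors=[4.0, 6.5, 5.5])
print(slice_resource_vector(predicted_th, req_per_unit=0.0318, boundary=boundary))
```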
On the other hand, the algorithm can calculate and dynamically allocate the link resources for individual slices within the network, based on the traffic load forecast; see lines 23 to 32 in Algorithm 4 for more details. At any timestamp, the total link capacity across network slices should not exceed the link capacity provided by the network and must satisfy Equation 6.
Assume that we have access only to the links connected to the core network. At any given time, each slice \(s\) requires a link instance. Each link instance is allocated a specific amount of resources, denoted by \(\ell_{s}\), and there is also a collection of physical links \(\ell^{\mathrm{\small PHY}} \in L^{\mathrm{\small PHY}}\). Consequently, the total link capacity allocated to network slices must not exceed the total physical link capacity provided by the network infrastructure at any given time, as expressed by Equation 6:
\[\label{eq:constrant_link_capacity} \sum_{s\in S} \mathrm{\small cap}_{\ell_{s}}\leq \mathrm{\small cap}_{\ell^{\mathrm{\small PHY}}}\tag{6}\] where \(\mathrm{\small cap}_{\ell_{s}}\) is the virtual link capacity and \(\mathrm{\small cap}_{\ell^{\mathrm{\small PHY}}}\) is the physical link capacity provided by the network infrastructure. Equation 6 is utilized to verify the link configuration at any point within the network, from the RAN to the CN and from the CN to the data network, where the algorithm is executed. Finally, a proactive closed-loop algorithm, PCLANSA, was developed to address the issue of scaling in resource allocation for a specific slice. This algorithm, capable of running parallel instances across different slices, enables efficient resource allocation and scaling for network slices. By utilizing a closed-loop approach, the system can proactively allocate resources in response to changing network conditions, optimizing performance and reducing resource under-utilization. This leads to considerable enhancements in network efficiency and dependability.
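The link-capacity check of Equation 6 reduces to a simple inequality, as in the following snippet, where all capacities (in Mbps) are invented for illustration:

```python
# Equation (6): the virtual link capacities of all slices must fit in the physical link.
slice_link_caps = {"eMBB": 220.0, "uRLLC": 120.0, "mMTC": 60.0, "VoIP": 40.0}
physical_link_cap = 500.0

total = sum(slice_link_caps.values())
assert total <= physical_link_cap, "Equation (6) violated"
print(f"Total virtual link capacity: {total} Mbps out of {physical_link_cap} Mbps")
```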
Definition | Notation
---|---
Accepted over-provisioning resource ratio among slices | \(\rho^{\mathrm{\small op}}\)
Minimum scaling step ratio | \(\rho^{\mathrm{\small s}}\)
Traffic prediction accuracy ratio | \(\varepsilon\)
Expected resource utilization ratio | \(\rho^{\mathrm{\small ru}}\)
Resources validation scaling ratio | \(\rho^{\mathrm{\small d}}\)
Maximum allocated resources per VNF instance for a slice | \(\mathrm{\small cap}^{\max}_{sr}\)
Minimum allocated resources per VNF instance for a slice | \(\mathrm{\small cap}^{\min}_{sr}\)
Number of sampling data | \(\kappa\)
Physical nodes configuration | \(\mathrm{\small cap}_{\text{PMset}}\)
Total physical link capacity | \(L\)
Time window | \(t\)
Monitoring traffic time window | \(t'\)
Set of target slice KPIs | \(KPI^{s}\)
Domain | No. | Action (\(\alpha\)) | Description
---|---|---|---
Core Network | 1 | \(scale\_up\) | Scale up slice resources.
 | 2 | \(scale\_down\) | Scale down slice resources.
 | 3 | \(scale\_out\) | Add VNF instance(s).
 | 4 | \(scale\_in\) | Remove VNF instance(s).
Transport Network | 5 | \(scale\_up\_link\) | Increase virtual link capacity.
 | 6 | \(scale\_down\_link\) | Decrease virtual link capacity.
Both | 7 | \(no\_action\) | No action needed.
The high-level flowchart in Fig. 7 illustrates how our closed-loop algorithm works. A closed-loop algorithm in network slicing aims to provide efficient and dynamic service deployment and management, capable of maintaining the QoS of multiple slices by detecting KPI violations quickly and accurately, taking corrective actions, and assigning appropriate resources to resolve KPI violations in a timely manner. Thus, the algorithm has been developed with flexible parameters, shown in Table 1, enabling operators to customize PCLANSA to meet the specific requirements of each slice and to align it with their infrastructure. To keep things simple, we split PCLANSA into two main algorithms:
The first, Algorithm 4, forecasts the upcoming traffic (per slice), see lines 3 to 6. By combining historical information with Equations 2 and 4, the algorithm estimates the resources required for the forthcoming timestamp, as shown in lines 7 to 11. Before optimizing resources and performing scaling, the algorithm conducts a KPI violation check in line 12. If all resources are below the \(\rho^{\mathrm{\small ru}}\) rate but KPI violations occur, there is a possibility of an abnormal event in the system, such as a dropped link connection or loss of power in a node; in such cases, the algorithm triggers an alarm in the system. In the event of a KPI violation, the algorithm attempts to increase resources by a factor of \(\rho^{\mathrm{\small op}}\) to mitigate potential bottlenecks caused by insufficient resources. Subsequently, the configuration of the slice is evaluated using the \(LP_{\mathrm{\small kpi}}\) model to estimate the necessary resources, with the objective of preventing KPI violations associated with the slice. Afterwards, the final slice configuration is processed by Algorithm 6, which performs precise network checks: it verifies the network infrastructure constraints and ensures that there is sufficient network capacity for the slice before executing any operations.
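The sketch below condenses this decision phase into a few lines of Python. It is a hypothetical paraphrase rather than the paper's Algorithm 4, and `forecast`, `estimate_resources`, `lp_kpi`, and `raise_alarm` stand in for the LSTM-FSD model, Equations 2 and 4, the \(LP_{\mathrm{\small kpi}}\) model, and the alarm hook, respectively.

```python
def decision_phase(history, utilization, kpi_violated, rho_ru, rho_op,
                   forecast, estimate_resources, lp_kpi, raise_alarm):
    th_hat = forecast(history)                      # predicted slice traffic
    config = estimate_resources(history, th_hat)    # Eqs. (2) and (4)
    if kpi_violated:
        if all(u < rho_ru for u in utilization.values()):
            raise_alarm()                           # under-used resources yet KPIs violated
        # Inflate the estimate to relieve a possible bottleneck, then refine it.
        config = {r: c * (1.0 + rho_op) for r, c in config.items()}
        config = lp_kpi(config)                     # avoid slice KPI violations
    return config                                   # handed to the scaling phase

cfg = decision_phase(
    history=[90.0, 110.0, 120.0], utilization={"cpu": 0.92, "ram": 0.85},
    kpi_violated=True, rho_ru=0.8, rho_op=0.15,
    forecast=lambda h: h[-1] * 1.1,                               # stand-in for LSTM-FSD
    estimate_resources=lambda h, th: {"cpu": th * 0.02, "ram": th * 0.01},
    lp_kpi=lambda c: c,                                           # stand-in for LP_kpi
    raise_alarm=lambda: print("abnormal event suspected"))
print(cfg)
```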
The second, Algorithm 5, optimizes the slice configuration obtained in the first phase and performs the appropriate actions, shown in Table 2. To mitigate the issue of frequent scaling, the algorithm utilizes a minimum scaling increment denoted by \(\rho^{\mathrm{\small s}}\), which determines the minimum quantity of resources required in the event of infrastructure expansion. During scaling up or scaling down, the algorithm allocates resources for each compute resource type independently, in order to ensure that each resource type is sized in accordance with the upcoming traffic. In the scaling-out phase, the algorithm tries to maximize the resources of the last VNF while maintaining a consistent ratio of values among compute resources; it then computes and adds a new VNF instance to the slice. The same mechanism applies to the scaling-in phase, but in the reverse direction. The algorithm computes the optimal resources required for the VNFs in the subsequent period while aligning them with the slice KPIs. More details on the \(\textit{LSTM-FSD}\) and \(LP_{\mathrm{\small kpi}}\) models can be found in our previous work [36].
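A simplified sketch of how such a scaling phase could map a target configuration onto the actions of Table 2 is shown below; the thresholds and comparisons are assumptions for illustration, not the exact logic of Algorithm 5.

```python
def choose_actions(current, target, rho_s, cap_max, n_instances):
    """Return (action, resource) pairs for one slice, per compute resource type."""
    actions = []
    for r, tgt in target.items():
        delta = tgt - current[r]
        if abs(delta) < rho_s * current[r]:
            continue                                   # below the minimum scaling step
        if delta > 0 and tgt <= cap_max[r] * n_instances:
            actions.append(("scale_up", r))            # grow existing instances
        elif delta > 0:
            actions.append(("scale_out", r))           # add a VNF instance
        elif tgt >= cap_max[r] * (n_instances - 1):
            actions.append(("scale_down", r))          # shrink existing instances
        else:
            actions.append(("scale_in", r))            # remove a VNF instance
    return actions or [("no_action", None)]

print(choose_actions(
    current={"cpu": 2.0, "ram": 1.0}, target={"cpu": 3.5, "ram": 1.02},
    rho_s=0.05, cap_max={"cpu": 3.0, "ram": 1.0}, n_instances=1))
# [('scale_out', 'cpu')]: the cpu target exceeds one instance's cap; the ram change is too small
```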
While the closed-loop algorithm for E2E network slicing in 5G/B5G networks is promising, there are several potential limitations and challenges associated with implementing it in a real-world network. Some of these challenges may include:
Data availability and quality: The accuracy of the algorithm's predictions depends on the availability and quality of data relating to network traffic and resource utilization. In a real 5G network, gathering and analyzing data in real time, especially E2E data, may be difficult, potentially impacting the precision of the prediction phase of the algorithm.
Computational complexity: The algorithm integrates machine learning and linear programming methodologies, which can result in significant computational demands. The computational complexity of the algorithm in a network with numerous VNFs and slices may pose a significant challenge, potentially impacting its performance and scalability.
Parameter tuning: The algorithm has a total of 13 configurable parameters, which need to be carefully tuned in order to optimize its performance. In a real-world network, finding the optimal values for these parameters could be challenging, as it may require extensive experimentation and testing.
Adaptability to changing conditions: 5G/B5G networks are dynamic and can experience rapid changes in traffic patterns and resource requirements. Therefore, the algorithm needs to adapt quickly to these changing conditions to continue making accurate predictions and allocating resources effectively. Consequently, enabling the algorithm to promptly adapt to real-world network fluctuations may pose a challenge.
The next part of this paper will present a comprehensive experiment utilizing PCLANSA and demonstrate our research results.
This section will provide an overview of our experimental setup and showcase our service assurance algorithm designed to support 5G networks with network slicing, and encompassing diverse slice categories.
A simulation at the packet level has been developed utilizing [37] to replicate a 5G network environment that includes support for slicing features. The 5G E2E network slicing simulation starts from an initial configuration comprising four distinct slices: uRLLC (video gaming), mMTC (IoT), eMBB (HD video), and an intermediate application service, VoIP. Each slice has different service assurance requirements and resource demands. Fig. 8 shows the implementation of our 5G network, which is reinforced by an isolated E2E network slicing mechanism that leverages virtualization technology. The 5G CN enables the construction of VNFs in a dynamic manner, as exemplified by the User Plane Function (UPF) in our experiments. This enables a single slice to accommodate either a single VNF or a group of VNFs supported by a load balancer. Hence, the CN can facilitate the scaling of VNFs in both the vertical and horizontal dimensions. To balance traffic between VNF instances within a slice, a weighted round-robin load balancing algorithm [38] was integrated into the network slice manager. To deal with the guaranteed bit rate requirements in network slicing, we implemented the Hierarchical Token Bucket (HTB) queue [39] in the router, which helps to isolate virtual network links in both the TN and the CN and to optimize resources.
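For reference, a minimal weighted round-robin selector of the kind cited in [38] can be sketched as follows; the instance names and weights are illustrative.

```python
import itertools

def weighted_round_robin(instances):
    """instances: dict mapping VNF instance name -> integer weight."""
    schedule = [name for name, w in instances.items() for _ in range(w)]
    return itertools.cycle(schedule)

picker = weighted_round_robin({"upf-1": 3, "upf-2": 1})
print([next(picker) for _ in range(8)])
# ['upf-1', 'upf-1', 'upf-1', 'upf-2', 'upf-1', 'upf-1', 'upf-1', 'upf-2']
```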
Slice | Transport type | Total UEs | Scale factor | Network direction
---|---|---|---|---
eMBB | Cars | 9,075 | 1/25 | Downlink
uRLLC | All truck categories | 7,995 | 1/15 | Uplink
mMTC | Bikes and motorcycles | 2,200 | 1/10 | Both
VoIP | Buses | 885 | 1/3 | Both
The simulation was configured to generate traffic patterns that are representative of real-world networks. Table 3 lists the number of UEs employed in the four slices; the UE types were sourced from the open dataset [40]. Using these data, we compiled a summary of the number of mobile UEs present during each hour and allocated their respective start positions randomly within our 5G network. Fig. 9 depicts the testing environment, which accommodates diverse scenarios, including low and high peak traffic in both downlink and uplink directions (slice 3 - mMTC), stable traffic (slice 4 - VoIP), downlink traffic exclusively (slice 1 - eMBB), and uplink traffic exclusively (slice 2 - uRLLC).
The closed-loop algorithm is a crucial component of the service assurance mechanisms in 5G and B5G networks: it is designed to oversee and enhance the performance of multiple elements of the network and to ensure that there are enough resources to satisfy the network SLAs. This section evaluates PCLANSA's performance in the service assurance domain within a 5G network. PCLANSA was designed to provide flexible configurations that align with the requirements of infrastructure and network planning, to enable dynamic monitoring of network conditions, to identify potential resource issues, and to take appropriate actions to maintain high service quality. Thus, the outcome of the algorithm may vary based on the configuration parameters, and there exists a certain degree of compromise between over-provisioning and the number of network actions executed. For the sake of simplicity and to easily observe the results of our experiments, we use the same settings for all slices. However, in practice, it is possible to configure different parameters for each individual slice.
PCLANSA was extensively examined to assess its effectiveness in two distinct layers: the CN and TN layers. The evaluation covered various factors, including latency, throughput, jitter, and packet loss, in order to maximize user experience. To conduct a comprehensive analysis of the algorithm's capabilities, a set of E2E KPI limits and scale factors for the different slices was established; detailed information about these KPI limits can be found in Table [tab:kpiLimits]. These factors play an important role in provisioning optimal service in 5G/B5G networks and provide insights into the algorithm's ability to detect anomalies, adapt to new network parameters, and make real-time adjustments to optimize service performance.

By inspecting the algorithm's performance with respect to latency (the delay experienced during data transmission), we aim to determine whether PCLANSA is capable of minimizing delays and enhancing the overall responsiveness of the network. Throughput, the rate of data transmission through the network, is also a crucial factor under consideration; it is an essential measure of the algorithm's capacity to efficiently manage large data volumes and sustain a consistent information flow. Jitter, the irregular variation of packet arrival times, is also taken into account during the evaluation, since the algorithm's ability to mitigate jitter directly impacts the stability and consistency of data transmission. Finally, packet loss, the loss of data packets during transmission, is another critical parameter: the capacity of the algorithm to detect and minimize packet loss is vital for maintaining the integrity and completeness of the transmitted information, ensuring that data is reliably and accurately delivered without information loss or service disruption.
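For concreteness, the sketch below shows one way a monitoring component could derive these four KPIs from per-packet records; the field names (`sent_time`, `recv_time`, `size_bits`) are assumed for illustration and are not taken from our implementation.

```python
def compute_kpis(packets, window_s):
    """packets: list of dicts; recv_time is None for lost packets.
    Assumes at least two delivered packets in the window."""
    delivered = [p for p in packets if p["recv_time"] is not None]
    delays = [p["recv_time"] - p["sent_time"] for p in delivered]
    return {
        "throughput_bps": sum(p["size_bits"] for p in delivered) / window_s,
        "latency_s": sum(delays) / len(delays),
        "jitter_s": sum(abs(b - a) for a, b in zip(delays, delays[1:])) / (len(delays) - 1),
        "packet_loss": 1.0 - len(delivered) / len(packets),
    }

sample = [
    {"sent_time": 0.00, "recv_time": 0.012, "size_bits": 12000},
    {"sent_time": 0.02, "recv_time": 0.035, "size_bits": 12000},
    {"sent_time": 0.04, "recv_time": None,  "size_bits": 12000},   # lost packet
    {"sent_time": 0.06, "recv_time": 0.071, "size_bits": 12000},
]
print(compute_kpis(sample, window_s=0.08))
```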
Notation | Setting 1 | Setting 2 | Setting 3
---|---|---|---
\(\rho^{\mathrm{\small op}}\) | .15 | |
\(\rho^{\mathrm{\small s}}\) | .05 | - | -
\(\varepsilon\) | .814 | - | -
\(\rho^{\mathrm{\small ru}}\) | .8 | - | -
\(\rho^{\mathrm{\small d}}\) | .02 | - | -
\(\mathrm{\small cap}^{\max}_{sr}\) | 3 CPUs, 1 GB, 1.2 GB | - | -
\(\mathrm{\small cap}^{\min}_{sr}\) | .1 CPU, 15 MB, 20 MB | - | -
\(\kappa\) | 15-minute samples | - | -
\(\mathrm{\small cap}_{\text{PMset}}\) | 39 CPUs, 13 GB, 15 GB | - | -
\(L\) | 500 Mbps | - | -
\(t\) | 5 mins | - | -
\(t'\) | 2 \(\cdot t\) | - | -
Setting | Slice | KPI violation | Total action | Scale_down | Scale_up | Scale_out | Scale_in | Actions / simulation time | Actions hourly
---|---|---|---|---|---|---|---|---|---
1 | uRLLC | 0 | 130 | 29 | 101 | 0 | 0 | 18.2% | .76%
 | mMTC | 0 | 223 | 8 | 211 | 2 | 2 | 30% | 1.25%
 | VoIP | 0 | 136 | 20 | 116 | 0 | 0 | 19.7% | .86%
2 | uRLLC | 0 | 154 | 33 | 121 | 0 | 0 | 21.5% | .9%
 | mMTC | 0 | 265 | 20 | 232 | 7 | 6 | 37% | 1.54%
 | VoIP | 0 | 245 | 63 | 182 | 0 | 0 | 35.3% | 1.47%
3 | uRLLC | 0 | 168 | 45 | 123 | 0 | 0 | 23.5% | .99%
 | mMTC | 1 (delay) | 321 | 50 | 258 | 7 | 6 | 44.8% | 1.86%
 | VoIP | 0 | 294 | 101 | 193 | 0 | 0 | 42.3% | 1.76%
Table 4 outlines the configuration parameters used in the testing environment. During the testing phase, the algorithm successfully determined the data rate to be used as a traffic boundary for each slice, even when the slice exhibited a significantly high data rate. By leveraging the ML traffic-forecasting model, PCLANSA reliably offers the guaranteed bit rate and ensures a seamless flow of traffic; see Fig. 10 (a) for details of the output configuration on the eMBB slice. On the other hand, the algorithm efficiently provides an optimized allocation of resources to the VNF instances, as can be seen in Fig. 11, where the orange colour represents the actual resource utilization and the blue colour represents the resources configured for the VNF instances. It can be observed that the algorithm is able to predict resource utilization and proactively allocate resources. As illustrated in Fig. 11 (a) for all VNF instances of the eMBB slice, the algorithm closely tracks the optimal resources for the slice and always allocates spare resources in advance in accordance with the slice's requirements. As shown in Fig. 11 (b to f), the algorithm provides the necessary resources for each VNF instance while dynamically adding or removing instances to optimize resources in accordance with the slice's requirements. This indicates that the algorithm can identify and allocate resources effectively and in an optimized way. Nevertheless, it is important to note that the distribution of spare resources and the actions executed by the algorithm may vary based on the parameters \(\rho^{\mathrm{\small op}}, \rho^{\mathrm{\small ru}}, \rho^{\mathrm{\small d}}\) and \(\rho^{\mathrm{\small s}}\). If their values are sufficiently small, the algorithm will likely execute actions more frequently, as the available resources will be depleted sooner, but the network will save more resources. In addition, the algorithm's ability to utilize resources effectively can enhance the dependability and scalability of VNF deployments. Thus, the algorithm has the potential to mitigate bottlenecks and outages by promoting the timely and efficient utilization of resources. Furthermore, this can enhance the resilience of VNF deployments in the face of disruptions and improve their ability to handle high volumes of network traffic.
The results of the algorithm in the simulation are shown in Table 5 and Fig. 12. The closed-loop algorithm is effective at minimizing the number of KPI violations, even in the face of high traffic spikes and changes in network conditions, while balancing the trade-off between the number of actions taken (around 0.86% to 1.38% hourly in setting 1) and the spare resources allocated to the slice. As depicted in Fig. 12, the algorithm with setting 1 successfully prevents violations of KPIs for all slices in terms of packet loss (a, b), delay (c, d), and jitter (e, f), thereby meeting our QoS targets. The simulations show that the closed-loop algorithm is capable of maintaining the QoS of multiple slices: it rapidly and accurately identified KPI violations, took corrective actions, and allocated suitable resources to resolve KPI violations in a timely manner. Consequently, the algorithm was able to reduce the number of KPI violations over time, enhancing the overall QoS of the network.
Assume that the highest peak of network traffic is known and that enough resources are configured for the slice to run smoothly during the simulation without any KPI violation; as an example, in Fig. 13 the green colour represents the spare resources and the orange colour represents the actual resource consumption of eMBB in this case, without our closed-loop algorithm. Our closed-loop algorithm can save 54.85% of overall resources compared with the case run without PCLANSA (see Fig. 11 (a) and Fig. 13). The overall resource savings is the mean difference between the total resources used without the algorithm and the total resources used with the algorithm enabled in a given slice over the simulation time. Furthermore, the algorithm achieves significant overall resource savings of 50.87% for mMTC, 57.1% for uRLLC, and 23.63% for VoIP. The relatively lower savings for the VoIP slice can be explained by its stable traffic, as described in Section 5-A.

An inverse correlation between the accepted over-provisioning rate and the number of scaling actions was observed across the various settings: when the accepted over-provisioning rate increases, the number of scaling actions decreases, whereas a reduction in the accepted over-provisioning rate results in an increase in scaling actions. This correlation can be understood by considering the trade-off between resource utilization and service assurance. When the accepted over-provisioning rate is high, the algorithm allows a larger buffer of resources to be allocated to VNFs; this buffer can absorb oscillations in demand without additional scaling actions, resulting in fewer scaling actions. Conversely, when the accepted over-provisioning rate is low, the algorithm allocates fewer resources to VNFs, leaving less room for absorbing rapid fluctuations in demand, so the algorithm may need to take more frequent scaling actions in order to maintain service assurance.

The closed-loop algorithm is also able to successfully scale resources in response to changes in traffic load, without experiencing any performance bottlenecks or outages, in both horizontal (Fig. 10 (b)) and vertical (Fig. 11) scaling. In particular, it was able to accurately predict future resource demand and allocate resources accordingly, which allowed it to minimize resource usage while still meeting service assurance requirements. Additionally, because the algorithm performs parallel tasks on different slices, it can simultaneously allocate resources to VNF instances on multiple slices and thus manage resource allocation across all slices in real time. For example, when there was a sudden increase in traffic on one slice, the algorithm quickly allocated additional resources to that slice to avoid any KPI violation; at the same time, if the traffic decreased on another slice, the algorithm reduced its resource allocation accordingly to minimize resource usage.
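As a small illustration of how the overall resource savings figure is obtained, the snippet below computes the mean relative difference between a static peak-provisioned allocation and the dynamic allocation over the simulation windows; the sample values are made up.

```python
def overall_resource_savings(static_alloc, pclansa_alloc):
    """Mean relative saving (%) over per-time-window total allocations."""
    diffs = [(s - p) / s for s, p in zip(static_alloc, pclansa_alloc)]
    return 100.0 * sum(diffs) / len(diffs)

static  = [10.0, 10.0, 10.0, 10.0]     # peak-provisioned resources per window
dynamic = [4.0, 5.5, 4.5, 4.0]         # PCLANSA allocations per window
print(f"{overall_resource_savings(static, dynamic):.1f}% saved")   # 55.0% saved
```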
The proactive closed-loop algorithm for service assurance described in this paper, PCLANSA, represents a promising approach to effectively managing resource allocation in 5G/B5G networks. By combining ML and linear programming techniques, the algorithm is able to accurately predict future resource demand and allocate resources accordingly. An important advantage of the algorithm is its ability to effectively manage the trade-off between the number of actions performed and the allocation of spare resources to the network slice. This allows the algorithm to minimize resource usage while still meeting service assurance requirements in different slices. Another important strength of the algorithm is its ability to simultaneously allocate resources to VNFs on multiple slices, taking into account the specific service assurance requirements and resource demands of each slice. Additionally, its ability to run in parallel on different slices allows it to manage resource allocation across all slices in real time and at scale. Our results demonstrate the reliability, robustness, and efficacy of PCLANSA in managing resource allocation across multiple network slices. Specifically, PCLANSA has demonstrated significant resource savings across all slices, with up to 54.85% savings in the eMBB slice, 50.67% in the mMTC slice, 57.1% in the uRLLC slice, and 23.63% in the VoIP slice. Even when decreasing the spare resources configuration to a small ratio (5%), the algorithm remains highly effective, with very few KPI violations. Overall, our proposed proactive closed-loop algorithm for service assurance represents a significant advance in the management of resource allocation in the network and has the potential to significantly improve the efficiency and effectiveness of 5G/B5G networks.
The first two authors of this paper received support for their internship from MITACS & Ciena.
Nguyen Phuc Tran received the B.E. degree in information technology from Dong Nai Technology University, Viet Nam, in 2017 and M.S. degree in Computer Science from University Of Information Technology, HCM-VNU, Viet Nam, in 2020. He has been a PhD Student at Concordia University, Montreal, Quebec, Canada since 2021. As a senior software engineer with over 5 years of experience in system development and telecommunication technology, he has refined his expertise in system optimization and security, quality assurance, data analysis, team leadership, and stakeholder management. His current research interest is service assurance in wireless mobile network technologies including resource allocation, energy efficiency, green mobile network, system optimization and AI/machine learning.
Oscar Delgado (Member, IEEE) received the M.A.Sc. degree from Concordia University, Montreal, QC, Canada, in 2010, and the Ph.D. degree in electrical engineering from McGill University, Montreal, in 2016. In 2017, he joined the Telecommunications and Signal Processing Laboratory (TSP), Department of Electrical and Computer Engineering, McGill University, where he is a Postdoctoral Researcher. His current research interests are in the applications of 5G wireless mobile communication technologies, including AI/machine learning, software-defined networks, network virtualization, and green wireless systems, and the analysis and design of video traffic management techniques, resource allocation strategies, and energy efficiency algorithms.
Brigitte Jaumard (Senior Member, IEEE) is a professor in the Computer Science and Software Engineering (CSE) Department at Concordia University. Her research focuses on mathematical modelling and algorithm design (large-scale optimization and machine learning) for problems arising in communication, transportation and logistics networks. She is also a senior advisor for the Montreal Ericsson GAIA (Global Artificial Intelligence Accelerator) research center and the chief scientist of CRIM. Brigitte Jaumard was ranked among the top 2% of scientists in her field of research according to a 2021 study based on research citations. She was awarded several research chairs (Canada Research Chair and Concordia Research Chair, both Tier I during the years 2000-2019). B. Jaumard has published over 300 papers in international journals in operations research and telecommunications.