Abstract

The fifth generation (5G) of cellular networks promises to be a major step in the evolution of wireless technology. 5G is planned to be used in a very broad set of application scenarios. These scenarios have strict and heterogeneous requirements that will be met by enhancements to the radio access network and by a collection of innovative wireless technologies. Softwarization technologies, such as Software-Defined Networking (SDN) and Network Function Virtualization (NFV), will play a key role in integrating these different technologies. Network slicing emerges as a cost-efficient solution for the implementation of the diverse 5G requirements and verticals. The 5G radio access and core networks will be based on an SDN/NFV infrastructure, which will be able to orchestrate the resources and control the network in order to provide network services in an efficient, flexible, and scalable way. In this paper, we present the up-to-date status of the software-defined 5G radio access and core networks and a broad range of future research challenges on the orchestration and control aspects.

1. Introduction

The fifth generation (5G) of cellular networks aims at revolutionizing the world of wireless communication. 5G will be characterized by ubiquitous connectivity, extremely low latency, and very high-speed data transfer. These characteristics will enable the use of 5G in a very broad set of application scenarios: from pervasive video to high user mobility; from broadband access everywhere to lifeline communications; from massive Internet of Things (mIoT) to broadcast-like services; from tactile Internet to ultrareliable communications [1]. For enabling this variety of applications, ambitious improvements with respect to 4G are needed: 10-100 times more connected devices; 1000 times higher mobile data volume per area; 10-100 times higher data rate; 1 ms latency; 99.999% availability; 10 times less energy consumption; 5 times less network management operation expenses. The motivation and explanation of such requirements are given for some specific use cases in [2].
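
As a rough illustration of what the 99.999% availability target listed above implies, the following minimal sketch converts an availability figure into the largest tolerable cumulative outage. The function name and the choice of a 365-day year are assumptions made for this example only.

```python
# Assumption: a non-leap year of 365 days is used as the reference period.
SECONDS_PER_YEAR = 365 * 24 * 3600


def max_downtime_seconds(availability: float,
                         period_s: float = SECONDS_PER_YEAR) -> float:
    """Largest cumulative outage compatible with an availability target."""
    return (1.0 - availability) * period_s


# "Five nines" leaves a little over five minutes of outage per year.
downtime = max_downtime_seconds(0.99999)
print(f"{downtime / 60:.2f} minutes per year")
```

Under these assumptions, the budget for all outages combined, planned or not, is roughly 315 seconds per year.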

A key aspect of 5G is the radical network transformation required for offering vertical services, with connectivity, storage, and computing solutions tailored to the specific digital business case of different industries (e.g., health care, energy, multimedia, and automotive). A vertical service relaxes the conceptual restriction of VNFs to networking functions, which is the main characteristic of a network service. In a vertical service, the virtual functions may perform arbitrary functionality in the application domain. A vertical service may also include the end-user devices or applications within them, which can be considered as physical or virtual functions. New business models will be developed using the integration of the requirements of multiple vertical industries. In some cases, stakeholders from vertical industries can take the role of service providers for end-users by exploiting the infrastructure and connectivity services of network providers. In others, they will instantiate a vertical service for improving the efficiency of their business infrastructure and for developing new production models. For example, in [3], the authors describe a vertical proof of concept aimed at providing more flexibility and higher efficiency to the “factory of the future” by integrating robots, machine intelligence, and 5G. In [4], the authors describe how the 5G infrastructure implementing the vertical service concept can be used to offer Vehicle-to-Everything (V2X) safety applications. In this vertical service of the automotive domain, applications within a vehicle may be part of the service.

The transformation concerns two main aspects. The first is an evolutionary aspect, where 5G will allow the same applications as today, but with much better performance. The work of 3GPP on New Radio (NR) and Next-Generation Radio Access Network (NG-RAN) [5] aims at designing new radio interfaces for increasing the available data rate by taking into account the recent results on millimeter wave (mmWave) communications and massive Multiple Input and Multiple Output (MIMO).

The second is a revolutionary aspect given by the 5G vertical concept, the key new business paradigm that implies the support of very heterogeneous services on the same infrastructure. Different services, such as vehicular networking and e-health, require very different Key Performance Indicators (KPIs) from the mobile network (e.g., low latency, high capacity, or service continuity). Supporting all these requirements on the same infrastructure entails a disruptive reengineering of the network architecture. This scenario requires a key feature of 5G: it should represent a holistic orchestration platform, where computing resources are distributed within the network (at sites of the vertical industry stakeholders, within the base stations, in edge clouds at central offices, and in regional and central clouds) and managed by different stakeholders. To this aim, the introduction and implementation of the network slicing concept is a key tool that can enable operators to deliver tailored and customized connectivity and services for different business verticals and use cases.

For achieving the challenging objectives of 5G, the research community is working in two complementary directions: (i) the NG-RAN in order to integrate heterogeneous radio network technologies, i.e., NR, massive MIMO, mmWave, and multitier architecture [6, 7]; (ii) Software-Defined Networking (SDN) and Network Function Virtualization (NFV), which are rising as key components that will enable the orchestration and control of the technological resources in the RAN, transport, and cloud network, in a flexible, scalable, efficient, and agile way [8].

This paper presents software-defined 5G radio access and core networks (see Figure 1) from a general point of view, including the latest updates on the topic. Moreover, this paper extends the previous considerations [9] and outlines a broad range of future research challenges in orchestration and control in the 5G system. Several research challenges are emerging in order to optimally manage the resources and provide advanced 5G services to the end-users [10]. Current works are highly specific, mainly addressing a particular task with a particular objective. In contrast, this paper addresses orchestration and control in 5G from a holistic view by presenting its enabling technologies, tasks, and objectives and providing the details of the challenges related to each task.

This paper is organized as follows. In the next section, the main features and concepts of SDN and NFV are presented. In Section 3, a more holistic view of 5G is presented, including RAN, core, softwarization, orchestration, and management concepts. Subsequently, Section 4 proposes a classification of the 5G implementation tasks and objectives, and for each class it describes the respective challenges of orchestrating and controlling 5G networks. Finally, Section 5 concludes the paper.

2. Softwarization Technologies

In recent years, SDN and NFV emerged as innovative network paradigms that have attracted the interest of network operators and service providers. In this section, their main features and global architectures are presented in order to provide the background needed for the concepts presented in the paper.

2.1. Software-Defined Networking (SDN)

SDN is a networking paradigm that promises to improve the programmability and flexibility of networks. SDN assumes programmable network devices in which the forwarding plane is decoupled from the control plane. In addition, the control plane is logically centralized in a software-based controller (“network brain”), while the data plane is composed of network devices (“network arms”) that forward packets.

Figure 2 depicts the SDN architecture. The control plane includes both northbound and southbound interfaces. The northbound interface provides a network abstraction to network applications. The southbound interface standardizes the information exchange between the control and data planes. SDN has been successfully introduced in data center environments by Internet content providers such as Facebook and Google.
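
The separation between the planes can be sketched with a toy model. This is a minimal illustration, not an OpenFlow implementation: the class names, the `set_path` and `install_rule` methods, and the addresses are all hypothetical. The controller holds the decision logic, the switches only apply the rules they receive, `set_path` stands in for a northbound request, and `install_rule` stands in for a southbound message.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Data-plane device: forwards using rules it is given, decides nothing."""
    name: str
    flow_table: dict = field(default_factory=dict)  # destination -> output port

    def install_rule(self, dst: str, out_port: int) -> None:
        # Stand-in for a southbound message from the controller.
        self.flow_table[dst] = out_port

    def forward(self, dst: str):
        # Unknown destinations yield None: the switch has no local logic.
        return self.flow_table.get(dst)


class Controller:
    """Logically centralized control plane with a view of all switches."""

    def __init__(self) -> None:
        self.switches = {}

    def register(self, switch: Switch) -> None:
        self.switches[switch.name] = switch

    def set_path(self, dst: str, hops: list) -> None:
        # Stand-in for a northbound request: an application asks for a path,
        # expressed as (switch name, output port) pairs.
        for name, port in hops:
            self.switches[name].install_rule(dst, port)


ctrl = Controller()
s1, s2 = Switch("s1"), Switch("s2")
ctrl.register(s1)
ctrl.register(s2)
ctrl.set_path("10.0.0.2", [("s1", 2), ("s2", 1)])
```

After the call to `set_path`, both switches forward traffic for the destination without having taken any routing decision themselves.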

As for its deployment, the software-defined approach should allow platform-agnostic implementations of the monitoring functions. However, after several years, the ultimate target of fully programmable devices still partially collides with the vendors' need for closed platforms, and portability is still mostly limited to software platforms. Nevertheless, significant steps forward have recently been made. As an example, the H2020 BEBA (Behavioral Based Forwarding) [11] project has successfully delivered monitoring and security applications to both OpenFlow-controlled hardware and software platforms.

From a technological point of view, the maturity level attained by commodity hardware enables even standard PCs to handle the traffic volume of multi- (10+) gigabit links. Indeed, the large number of cores available on affordable CPUs and the new generation of network interfaces that support multiple queues have generated significant interest in software-accelerated solutions for high-speed traffic retrieval. Software frameworks, like PF_RING [12], Netmap [13], DPDK [14], and PFQ [15, 16], currently allow very high-speed data capture and processing on reasonably cheap commodity PCs. For example, in the aforementioned BEBA project, PFQ has been used to accelerate a repurposed version of the OpenFlow-compliant software switch OFSoftSwitch [17] to perform packet processing up to full 10 Gbps line speed [18].

2.2. Network Function Virtualization (NFV)

NFV is about transforming the way network operators design and operate networks and network services. NFV consists of applying IT virtualization technology to consolidate many specialized network equipment types onto industry-standard high-volume servers, switches, and storage units.

The NFV architecture (see Figure 3) defines the following main architectural elements. The Network Functions Virtualization Infrastructure (NFVI) provides the virtual resources required to support the execution of the Virtual Network Functions (VNFs). The VNF is the software implementation of a network function that runs over the NFVI. It is the entity corresponding to a function of today’s network nodes, which is now expected to be delivered as a software module running independently of the hardware. The NFV management and orchestration (MANO) covers the orchestration and lifecycle management of physical and/or software resources in order to provide network services. The NFV MANO is composed of three elements: Virtualized Infrastructure Manager (VIM), VNF Manager (VNFM), and NFV Orchestrator (NFVO).

According to the ETSI-NFV specifications, a network service can be defined as the subset of the end-to-end service formed by VNFs and the associated Virtual Links (VLs) instantiated on the NFVI (see Figure 4). This procedure is also known as service function chaining (SFC) [19].

In providing the network service, the NFVO plays a key role by (i) being the single point of access for all service requests; (ii) handling the lifecycle of network services and SFCs; and (iii) having the end-to-end view of the resources being allocated across network services and VNFs by VNFMs (which handle the VNF lifecycle).
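
The notion of a network service as a set of VNFs joined by virtual links can be made concrete with a small data model. This is only a sketch under simplifying assumptions: the class names, the `vcpus` attribute, and the example functions are hypothetical and do not follow the ETSI descriptor formats. It shows how a chain is derived from an ordered list of VNFs and how an orchestrator-like component could keep an end-to-end view of the resources requested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VNF:
    name: str
    vcpus: int  # hypothetical resource demand of this function


@dataclass(frozen=True)
class VirtualLink:
    src: str
    dst: str


def build_chain(vnfs):
    """Connect consecutive VNFs with virtual links to form a service chain."""
    return [VirtualLink(a.name, b.name) for a, b in zip(vnfs, vnfs[1:])]


def total_vcpus(vnfs):
    """End-to-end resource view across all VNFs of the service."""
    return sum(v.vcpus for v in vnfs)


# Example service: traffic traverses the functions in list order.
service = [VNF("firewall", 2), VNF("nat", 1), VNF("load_balancer", 2)]
links = build_chain(service)
```

A chain of n functions yields n - 1 virtual links, which is why a single-VNF service has no links at all.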

Similarities and differences between SDN and NFV are summarized in Table 1.

3. Software-Defined 5G Radio Access and Core Networks

Although SDN and NFV are not dependent on each other, they are certainly mutually beneficial. For this reason, the full transformative value of 5G will require the large-scale adoption of these two technologies to support a redesigned 5G system [21].

In the following, we first introduce the architecture of software-defined 5G radio access and core networks. Next, two essential aspects of the new architecture are discussed: the network slicing and the resource sharing. Finally, the current 5G testbeds are presented.

3.1. Architecture

As already mentioned, 5G systems must go well beyond the design of new high-speed radio interfaces. For instance, the new 5G RAN must be highly flexible to allow operators to manage a heterogeneous set of access technologies and to optimize the access according to the required service capabilities. The 5G RAN is designed considering a large range of deployment scenarios, including a variety of static or moving nodes, with a much denser deployment of access points. Furthermore, 5G considers the software-defined radio access network and in particular refers to the cloud RAN (C-RAN) architectures to increase efficiency and bring down costs in the future 5G RAN [22]. The C-RAN architectures offload baseband signal processing from individual Remote Radio Heads (RRHs) (i.e., the base stations in the “classical” notation) to a Baseband Unit (BBU). On the one hand, this strategy makes it possible to simplify network maintenance, increase the efficiency of processing resource utilization through statistical multiplexing at the BBU, reduce costs at base station sites, and gain spectral efficiency from joint processing, such as coordinated multipoint (CoMP). On the other hand, C-RAN architectures require a very demanding fronthaul (FH) network, which transports the raw in-phase/quadrature-phase (I/Q) samples from the RRHs to the BBUs for processing [23]. In general, current deployments use different transport networks and interfaces for FH (e.g., Common Public Radio Interface, CPRI) and backhaul (BH) traffic. However, the trend towards packet-based FH fosters a unified transport network to fulfil the requirements of all RAN splits (including regular BH traffic). Hence, recent research studies aim at designing an integrated FH and BH network architecture under the control of an SDN Transport Orchestrator [24, 25].

Furthermore, the envisaged deployment scenarios for 5G networks usually entail a variety of transport technologies that, ideally, should be managed homogeneously [26]. Hence, the general 5G RAN scenario requires an end-to-end orchestration of resources across multidomain multitechnology transport networks for offering network slicing and vertical services. The system that is being studied in the 5G-TRANSFORMER project [27] explores how network slicing can help verticals and mobile (virtual) network operators (MVNOs), acting as customers, to deploy their services more quickly [28]. The system is composed of three major components: vertical slicer (VS), service orchestrator (SO), and mobile transport and computing platform (MTP). The VS coordinates and arbitrates the requests for vertical services, mapping their application-level requirements onto a set of VNFs chained with each other and fine-grained instantiation parameters (e.g., deployment flavor) that are sent to the SO. The SO provides end-to-end orchestration of services across multiple administrative domains by interacting with the local MTP and with the SOs of other administrative domains [29]. The MTP is responsible for the orchestration of resources and the instantiation of VNFs over the infrastructure under its control, as well as for managing the underlying physical mobile transport network, computing, and storage infrastructure [4].
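
To give a sense of why transporting raw I/Q samples makes the fronthaul so demanding, the sketch below estimates the line rate of a CPRI-style link. The parameter values are assumptions typical of a 20 MHz LTE carrier and are not taken from the works cited above: 30.72 Msample/s, 15 bits per I or Q component, two antennas, a 16/15 control-word overhead, and 8B/10B line coding. The function name is hypothetical.

```python
def fronthaul_rate_bps(sample_rate_hz: float,
                       bits_per_sample: int,
                       antennas: int,
                       control_overhead: float = 16 / 15,
                       line_coding: float = 10 / 8) -> float:
    """Estimate the line rate needed to carry raw I/Q samples.

    All default values are assumptions for a CPRI-style link.
    """
    iq_components = 2  # one in-phase and one quadrature value per sample
    payload = sample_rate_hz * bits_per_sample * iq_components * antennas
    return payload * control_overhead * line_coding


# Assumed example: one 20 MHz LTE carrier with two antennas.
rate = fronthaul_rate_bps(sample_rate_hz=30.72e6, bits_per_sample=15, antennas=2)
print(f"{rate / 1e9:.2f} Gbit/s")
```

Under these assumptions, a single carrier with two antennas already needs about 2.46 Gbit/s, independently of the user traffic actually carried, and the figure grows linearly with bandwidth and antenna count.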

3.2. Network Slicing

Hosting different services with possibly conflicting requirements on the same infrastructure pushes for technical solutions that allow for both efficient resource sharing and multitenant infrastructure utilization. For this reason, one of the most promising solutions is network slicing [1], which gives future 5G networks the scalability and flexibility needed to support diverse services and scenarios. A network slice can generally be defined as an end-to-end logically isolated network that includes 5G devices in addition to access, transport, and core network functions.

In [30], three slice types are standardized: enhanced mobile broadband (eMBB), ultrareliable low-latency communication (URLLC), and mIoT. There can be several slices of each type and a UE should signal to which slice it should be connected when establishing a session. The network functions in a slice can be deployed differently depending on the requirements of the service. For example, an eMBB core network function, such as a user plane function (UPF), can be deployed in a central cloud to increase scalability, whereas for a slice supporting URLLC the UPF can be deployed in an edge cloud to reduce latencies. For verticals with different needs, different network slices can be provided for each of the slice types.

The optimization of the physical network resources usage can be obtained by sharing the network functions between different slices. The abstraction of different physical infrastructures into a logical virtual network, in which VNFs are operated, allows the sharing of the physical network resources and functions between different slices. To achieve a high flexibility, which, for instance, allows running VNFs at various network locations, the ETSI proposes a logical reference architecture for the NFV MANO [20].

Each tenant should be able to perform both management and orchestration of the shared resources (e.g., transmission points, radio resources, transport, and fronthaul capacity) on a per-slice basis when needed. Meanwhile, the utilization of the shared physical resources should be maximized. Indeed, while resource pooling for storage and processing power may be less demanding due to theoretically large resource pools, the scarcity of radio resources in many cases requires an advanced resource management solution [31]. These goals can be achieved by splitting the orchestration part into two submodules: the interslice and the intraslice orchestration. The former module has a global view of the available resources and optimally selects the resource quotas that are assigned to each slice (or tenant), exploiting the multiplexing gain. The latter module provides isolation between the resources and functions of different slices. Thus, it performs resource orchestration on a per-slice basis, directly acting on the VNFM and VIM modules.
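
The interslice step described above can be illustrated with a simple quota rule. The following is a minimal sketch and not the scheme of any cited work: it assumes that each slice has a guaranteed minimum and a current demand expressed in abstract resource units, serves the guarantees first, and then shares the spare capacity in proportion to unmet demand. The function name and the numbers are hypothetical.

```python
def allocate_quotas(capacity: float, demands: dict, guaranteed: dict) -> dict:
    """Assign per-slice quotas from a shared pool (illustrative rule only)."""
    # Step 1: serve each slice up to its guarantee, but never above its demand.
    quotas = {s: min(demands[s], guaranteed.get(s, 0)) for s in demands}
    reserved = sum(quotas.values())
    if reserved > capacity:
        raise ValueError("guarantees exceed the available capacity")

    # Step 2: share what is left in proportion to the demand still unmet.
    spare = capacity - reserved
    unmet = {s: demands[s] - quotas[s] for s in demands}
    total_unmet = sum(unmet.values())
    if total_unmet > 0:
        share = min(1.0, spare / total_unmet)
        for s in demands:
            quotas[s] += unmet[s] * share
    return quotas


# Assumed example: 100 resource units shared by three slices.
quotas = allocate_quotas(
    capacity=100,
    demands={"eMBB": 80, "URLLC": 20, "mIoT": 40},
    guaranteed={"eMBB": 30, "URLLC": 20, "mIoT": 10},
)
```

In this example the URLLC slice receives exactly its demand, while the 40 spare units are split between eMBB and mIoT according to their unmet demand. Enforcing the resulting quotas within each slice would be the task of the intraslice module.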

3.3. Resource Sharing

The support for multitenancy in 3GPP networks is related to early proposals on active RAN sharing, which enables network sharing based on contractual agreements. Recently, two active network sharing architectures have been specified in 3GPP [32] to allow network operators to connect their own core network to a shared radio access network. These architectures are the Multioperator Core Network (MOCN), in which operators share eNBs that are connected to the separate core network of each operator, and the Gateway Core Network (GWCN), in which operators additionally share the Mobility Management Entity (MME). To enable Mobile Virtual Network Operators (MVNOs) to control the allocated resources, the document [33] describes concepts and high-level requirements for the Operations, Administration, Maintenance, and Provisioning (OAM&P) of network sharing.

These resource sharing architectures give the expected network utilization optimization and monetization only if traffic and QoS control mechanisms and algorithms are added to the network infrastructure to leverage multiplexing gains of traffic among slices. SDN-based monitoring tools are necessary to acquire traffic information per slice in order to perform optimization in the resources allocated for each slice. In particular, the optimization of the physical network infrastructure usage requires new traffic control functions, such as (i) the prediction of “network slice” traffic based on measured traffic data and user mobility, (ii) the admission control for network slicing requests, (iii) the scheduling of network slicing requests in charge of meeting the agreed Service Level Agreements (SLAs), and (iv) the monitoring of traffic and KPIs for each slice.
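
Two of the functions listed above, traffic prediction and admission control, can be combined in a compact sketch. It makes strong simplifying assumptions: the load of each slice is a single scalar, the prediction is an exponentially weighted moving average (chosen here only for brevity), and a fixed headroom is kept free. Function names, parameters, and figures are hypothetical.

```python
def ewma_forecast(samples: list, alpha: float = 0.5) -> float:
    """Predict the next load value from measured samples (oldest first)."""
    forecast = samples[0]
    for x in samples[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast


def admit(request_load: float, forecasts: dict, capacity: float,
          headroom: float = 0.1) -> bool:
    """Admit a new slice only if the predicted total load still fits."""
    usable = capacity * (1 - headroom)
    return sum(forecasts.values()) + request_load <= usable


# Assumed per-slice measurements in abstract load units.
history = {"eMBB": [40, 50, 60], "URLLC": [10, 10, 10]}
forecasts = {name: ewma_forecast(values) for name, values in history.items()}

small_ok = admit(20, forecasts, capacity=100)   # predicted total 82.5
large_ok = admit(30, forecasts, capacity=100)   # predicted total 92.5
```

With these numbers the smaller request is accepted and the larger one is rejected, because only 90 of the 100 units are considered usable. A real system would also have to account for user mobility and for the SLA of each slice, which this sketch ignores.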

3.4. Testbeds

To experimentally evaluate the performance and the functionality of the SDN/NFV 5G networks, a set of toolkits and testbeds is becoming available. These represent a critical enabler of 5G evolution and allow acquiring experimental data for improving the standards and the deployment. To boost the visibility of these initiatives, the IEEE has set up an open catalogue to publicize them (wiki.sdn.ieee.org).

In the following, we describe some testbeds focused on the deployment of SDN/NFV/MEC/5G functionalities.

5G Berlin (5g-berlin.org) is a comprehensive end-to-end testbed infrastructure relying on multiple layers of infrastructure and software components. The testbed allows the direct integration of third-party components and applications. 5G Berlin enables the setup of dedicated, specialized networks via “slicing” as required by general highly reliable networks, by automotive verticals, and by security/safety use cases. One of the two locations of 5G Berlin is 5G Playground (5G-Playground.org), which is an open testbed providing a comprehensive set of software toolkits enabling the setup and the development of 5G applications in an end-to-end testing environment. 5G Playground, located around the Fraunhofer FOKUS campus, offers outdoor radio coverage using experimental licenses for 5G radio spectrum in the 700 MHz and 2600 MHz bands. The system combines 4G, 5G, WiFi, and LoRaWAN access technologies, allowing the outdoor validation of experiments previously feasible in laboratories only.

The 5G Test Network (5gtn.fi) is one of the 5G test systems worldwide with open access [34]. The 5G test system considers all aspects “from infrastructure to applications and services” and allows unique testing possibilities from prototype devices to complete solutions in a controlled environment. The testbed allows acquiring experience in testing and analyzing data-intensive systems, such as 5G networks [35].

5G-EmPOWER (5g-empower.io) is a Multiaccess Edge Computing Operating System (MEC OS), which converges SDN and NFV into a single platform supporting lightweight virtualization and heterogeneous radio access technologies [36]. The platform allows the setup of small testbeds that can be used to implement and experimentally test algorithms and protocols, such as in [37].

Network Implementation Testbed using Open Source (NITOS) (nitlab.inf.uth.gr/NITlab/nitos) is a remotely accessible and configurable testbed equipped with cutting-edge fully programmable networking equipment (LTE-A, LTE, WiMAX, WiFi, ZigBee, Software-Defined Radio equipment, hardware OpenFlow switches, and Cloud Computing infrastructure). The testbed is based on open source software that allows the design and implementation of new algorithms, enabling new functionalities on the existing hardware. NITOS supports evaluation of protocols and applications under real world settings and is also designed to achieve reproducibility of experimentation.

SoftFIRE (softfire.eu) is a federated testbed exploiting the NFV/SDN technologies for creating a secure, interoperable, and programmable experimental infrastructure. The project comprises several independent testbeds that have been set up in different EU countries. Being both a federated testbed and an orchestrated virtualization infrastructure, SoftFIRE paves the way towards the concurrent and conflict-free execution of radically different network scenarios, which can mimic network slicing. By providing the infrastructure for innovative experiments, SoftFIRE has identified and prototyped several levels of programmability that are useful for provisioning and managing 5G network slices. The project discovered and defined sets of problems to deal with during 5G network operation and offers managed solutions that are generic for NFV/SDN/5G environments. For example, in [38], the authors summarized their experimental activity aimed at designing, building, and testing a fully functional virtualized mobile core network, pointing out lessons learned and recommendations for future improvements. In [39], the authors present the concept of a scalable service function chaining (SFC) Orchestrator capable of deploying SF Chains following the ETSI-NFV architectural model, as well as orchestrating the runtime phase by rerouting the traffic to a different path in case of overload of certain SF instances.

The cooperation between Ericsson, Comau, and TIM led to the realization of a proof of concept for the experimentation of an innovative vertical that combines robotics, machine intelligence, and 5G to improve productivity in the Industry 4.0 scenario [3]. The vertical is based on a slice supporting URLLC, where the UPF is deployed in an edge cloud to reduce latencies. The experimental system represents a key element of the factory of the future because it allows moving the control logic of robots from a control cabinet, referred to as the PLC (programmable logic controller) station, to a cloud platform. This new approach allows easier implementation of new control features and adds high flexibility to the factory plant. Moving a relevant part of the control to the cloud enables the virtualization of those functionalities, which can then run as virtual machines on general purpose hardware. When new actuators are deployed, no new PLC hardware is needed. Furthermore, the flexibility is increased given that different levels of control functions (i.e., factory, cell, and actuator level) can be deployed on the same platform.

4. Future Research Challenges in 5G Orchestration and Control

In the following, some of the research challenges in the 5G orchestration and control will be highlighted (see Figure 5 for a graphical summary).

The general objectives of 5G concern performance, dependability, energy efficiency, and economical cost reduction. The tasks to consider on the coordination capabilities of the orchestrator and of the controller are as follows: to manage and orchestrate the different SDN and NFV technologies deployed; to implement a network slicing scheme that allows the efficient realization of the different expected 5G verticals; to allocate the wireless resources needed; finally to monitor the different components of the 5G network.

There are technologies expected to deal with these functions. However, they are not yet proven to work at the scale needed in the future 5G network. In addition, there will be issues introduced by imperfect design, attacks, and operational mistakes, possibly in combination, that may lead to misoperation or a halt of network control and management functions. Multitenancy and multidomain service management and orchestration in 5G require a tighter interaction and larger information exchange between simultaneously competing and cooperating market actors than in current networks. When it comes to realization, these issues are likely to introduce additional constraints, which will make the tasks outlined above even harder.

4.1. SDN/NFV Management and Orchestration

Together with the orchestration, the SDN/NFV management encompasses a wide range of tasks. SDN and NFV address different aspects of providing network-based services and may be operated independently. Their interworking with respect to management and orchestration (M&O) is still an open issue, for which a number of options have been proposed; see, for instance, [40, 41] for an overview. Hence, determining how this interworking is best realized, which depends on the networking context, is a prime research challenge. From the authors’ point of view, it is important to put emphasis on the provisioning of services in a multitenant and/or multidomain context, such as the deployment of the virtual functions, the composition of the service function chaining, and the related routing selection.

The success of the network virtualization relies heavily on the M&O. The reduction of the operating costs (OPEX), by centralization and coordination, and the uniformity of the management interface and the automation were the main motivations for adopting this technology. Furthermore, the increased flexibility and agility in service creation, the reduced time to market, and the ease of providing network services by simple applications are important motivations for the network virtualization. This technology is also an enabler for 5G, where network management is a core functionality in providing the network slicing, discussed in Section 4.2. However, realizing these ambitions is not straightforward and raises a number of research challenges. Among the more important are dealing with the resulting complexity, the provision of sufficient dependability and security, and the ability to provide compound services based on cooperation of several market actors using a common physical infrastructure. These are briefly discussed below.

4.1.1. Complexity

Even though the management responsibility for the various parts of the network and service provisioning is well defined in the SDN and NFV architectures (see, for instance, Figure 3), the overall complexity of the M&O will increase significantly. This has many causes, for instance: (i) the control and management activity is moved to separate platforms that must also be managed, see, e.g., [42, 43]; (ii) the system is in general multivendor; (iii) implementing the improved functionalities of the M&O causes additional complexity; (iv) the assigned resources must be allocated, managed, and controlled efficiently in real time, as discussed, for instance, in [44]. Finally, the need for continuous dynamic adaptation to changing workloads, new services, failures, and repairs, while simultaneously providing service continuity, amplifies the challenges.

Most challenging is perhaps dealing with 5G slicing, where the different slices should provide services with highly differing requirements with respect to functionality, latency, reliability, security, and capacity. There are a number of fundamental questions on how to obtain efficient resource utilization, guaranteed QoS, and proper isolation between slices. Some works proposed to use artificial intelligence and machine learning to deal with these issues; see, for instance, [45, 46]. However, these techniques are unproven and their ability to deal with unforeseen catastrophic events is questioned. This last aspect may be in opposition to the requirement of dependable and secure services, as discussed next.

4.1.2. Dependability and Security

The SDN- and NFV-based 5G network will be a critical infrastructure in itself and will likely become a part of other critical infrastructures, such as energy (power) supply, payments, and transport. The requirements for the dependability and the security of the services provided will be high for some services. For instance, in the case of operation of autonomous vehicles, an outage or widespread malfunctioning of the network could have catastrophic effects [47]. Therefore, the network management should have functionality to meet these requirements.

A centralized management with knowledge of global state and control over the resources represents a good basis for dealing with most “everyday” failures of physical infrastructure elements, as well as crash failures of VNFs. The impact of such failures on the dependability is expected to become smaller than in the current system. However, little attention has been paid to failures of the M&O system itself. These are likely to be unexpected and unforeseen “black swan” failures, caused by the complexity of the system and its composition of software elements from a number of different vendors. Those nonconventional failures are hard to deal with and may yield severe outages [42, 48]. Misoperation of M&O may have severe network-wide consequences. Significant effort must be put into designing an M&O system that is able to deal with failures of the platform where it is executed, including its internal logical and configuration faults. Most importantly, it must not execute incorrect actions. Hence, it should be designed to have failure omission semantics [43].
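
One way to read the requirement of failure omission semantics is that a management function should either issue a correct action or issue none at all. The sketch below illustrates this reading only; it is not the design proposed in [43]. It assumes that the same decision is computed by independently implemented replicas and that a plausibility check is available. The action is released only if every replica succeeds, all results agree, and the result passes the check. All names and the scaling example are hypothetical.

```python
from typing import Any, Callable, Optional, Sequence


def omit_on_doubt(replicas: Sequence[Callable[[Any], Any]],
                  validate: Callable[[Any], bool],
                  request: Any) -> Optional[Any]:
    """Return an action only when it is corroborated; otherwise omit it."""
    results = []
    for compute in replicas:
        try:
            results.append(compute(request))
        except Exception:
            return None  # a crashed replica leads to omission, not to a guess
    if not results:
        return None
    first = results[0]
    if any(r != first for r in results[1:]):
        return None  # disagreement between replicas: omit
    return first if validate(first) else None


# Assumed example: decide how many VNF instances to run for a given load,
# with 100 load units per instance and at most 10 instances allowed.
planner_a = lambda load: -(-load // 100)      # ceiling division
planner_b = lambda load: (load + 99) // 100   # same rule, written differently
faulty = lambda load: 0                       # a misbehaving implementation
plausible = lambda n: 1 <= n <= 10

action = omit_on_doubt([planner_a, planner_b], plausible, 250)
omitted_disagree = omit_on_doubt([planner_a, faulty], plausible, 250)
omitted_implausible = omit_on_doubt([planner_a, planner_b], plausible, 5000)
```

In the two omitted cases the caller receives `None` and the network keeps its previous configuration. The price of this safety is reduced liveness, since a persistent disagreement blocks all further actions until it is resolved.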

In part, these challenges are due to the shift in paradigm from the current networks, whose basic design philosophy is to make it distributed with autonomous network elements giving an inherent robustness [49], to the logically centralized control and management of the emerging SDN-NFV.

The logical centralization also yields an increased security risk. With respect to consequences, rather than “capturing” a single network device, an intruder has the potential to capture the entire network. Gathering all activities on the same platform increases the potential for illegal monitoring and the likelihood of intrusions. This is due to the increased complexity and the interaction between software from various vendors, which may introduce vulnerabilities that are hard to detect. There are efforts towards dealing with these security threats; see [50, 51].

4.1.3. Multiactor Issues

The software-defined 5G is expected to be multitenant; i.e., a number of autonomous network operators and service providers share the same physical infrastructure and computational platform. Furthermore, the services are expected to be provided by operators and providers in different domains, e.g., access domain, one or more core network domains, and content provider domain. The network M&O has to deal with corresponding multitenant and multioperator issues, as illustrated in Figure 6. To be able to do that, there has to be an overall architecture in place that enables cooperation and coordination. In addition to the technical aspects, economic and legal issues must be dealt with in SLAs, which take on an increasingly important role [52]. These issues have been addressed by a number of fora, projects, and research efforts; see, for instance, [41, 53–56]. However, no overarching outcome has emerged, and this is still a major research challenge that must be resolved before the vision of 5G becomes a reality.

As discussed in [43], to deal with the issues raised in this subsection, more attention should be placed on the shift in risk profile; i.e., the use of virtualization may reduce the frequency and consequences of “everyday failures”. At the same time, this increases the probability and consequences of the unforeseen events that the system and operation and maintenance organizations are not prepared to deal with, the so-called black swans [57]. Measures that should be considered are to design critical M&O functions so that it may be guaranteed that they do not misoperate, for instance, by logical fault tolerance or giving them failure omission semantics, and to separate critical M&O from the infrastructure they manage to avoid mutual dependencies [43]. Other means that are important for the SDN/NFV M&O are the ability to manage the software unreliability [58, 59] and to avoid dependent failures [48].

4.2. Network Slicing

Network slicing proposes the splitting of common physical infrastructure and virtual resources into isolated subgroups (slices) able to provide independent network capabilities that can be tailored to specific requirements. Each network slice is an isolated logical network, tailored to different functional and performance requirements, running on a common virtual and physical infrastructure. For the sake of illustration, we mention some relevant slice types defined in [60]: (i) enhanced mobile broadband; (ii) massive machine-type communication; and (iii) ultrareliable low-latency communication.

SDN and NFV are concepts that are fundamental for network slicing. Hence, the latter inherits most of the challenges presented previously. In terms of complexity, network slicing needs to have control over resources at the access, transport, and core network, edge and central cloud computing, and SDN controller and NFV orchestration functionalities. Therefore, the complexity of each of these components is inherited, with still many open issues. In addition, the main factor that makes network slicing very complex is the fact that it needs to articulate and integrate all those individual components, in order to meet the highly differentiated requirements of each slice. The tight coordination required for that is a considerable extra challenge that must be addressed. Dependability and security are also a top priority here. In fact, some slices are expected to deliver ultrareliable communication (URC) with reliability requirements above 99.999%, as explicitly stated in the 5G vision [61]. As described for SDN and NFV, the dependability of network slicing should have two focuses: first, the dependability of the slices and services offered, where the network slice control and management play an important role; second, the dependability of the network slice control and management components as such. A dependable network slicing scheme depends on the design of adequate reaction mechanisms for recovery, based on accurate information about the failure events and the current state of the system. However, due to the size and complexity, having proper and reliable information demands a system with the smartness to efficiently detect and filter all the reported events, which is still an open issue. Finally, multiactor operation is the essence of network slicing and, hence, the multiactor challenge is inherent to it. In this section, we complement this issue by addressing the isolation challenge, which is closely related.
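To give a feeling for what a 99.999% target implies, the short Python sketch below converts an availability figure into allowed downtime per year and composes the availability of components in series. It reads the requirement as steady-state availability and assumes independent component failures; both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
from __future__ import annotations

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget implied by a steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def series_availability(components: list[float]) -> float:
    """Availability of a chain that works only if every component works.

    Assumes independent failures, which is optimistic when components
    share a platform or a management system.
    """
    result = 1.0
    for availability in components:
        result *= availability
    return result


five_nines = allowed_downtime_minutes(0.99999)

# A slice spanning access, transport, and core, each at five nines.
end_to_end = series_availability([0.99999, 0.99999, 0.99999])
end_to_end_downtime = allowed_downtime_minutes(end_to_end)
```

Under these assumptions, five nines correspond to roughly 5.3 minutes of downtime per year, and a slice built from three five-nines domains in series falls below the five-nines target end to end. This is one way to see why the integration of the individual components, and not only each component alone, drives the dependability of a slice.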

In addition to the challenges inherited from SDN and NFV, network slicing has some additional considerations that must be addressed.

4.2.1. Slice Isolation

Isolation is the most important property of network slicing, but also the one that represents the main challenge for its realization. Slices should be independent and have an appropriate degree of isolation from each other, so that no slice can interfere with the traffic in another slice and, in general, no type of event propagates among them. First, it is important to notice that isolation is achieved through a set of several tools that can be selected and tuned in order to reach the isolation desired in the different use cases. Isolation can be classified into four different areas [62]: traffic, bandwidth, processing, and storage isolation. Some of the methods available for achieving isolation are as follows: language and sandbox based isolation for service instances and network slice instances and, for lower levels, approaches such as virtual machine and kernel based isolation, Physical Resource Block (PRB) scheduling, Channel Access Control, SDN-based isolation, and hardware and physical isolation [62]. In this context, the main challenge is the orchestration and control tasks that need to be implemented in order to harmonize the different isolation techniques in the different domains, so as to fulfil the requirements of each specific slice. This is even more complex due to the fact that there is no holistic and final standardized network slice architecture, which leaves open the control and orchestration of as yet unknown isolation techniques. Also, isolation techniques rely considerably on SDN functionalities, but, as explained in [62], some of these functions are not yet mature. Security and access control are also among the isolation requirements that need to be enhanced, since, for instance, devices from group 1 should have access to slices A and B, but never to slice C, which in practice demands diverse architectural and security considerations.
Finally, the inclusion/exclusion of slices is challenging in terms of isolation, since it has to be guaranteed that those kinds of procedures do not have any effect on the currently running slices. This demands up-to-date information on the network status and a proper estimation of all the implications of including/excluding components, in order to avoid undesirable side effects on operational services.
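The access control example above (group 1 may use slices A and B, but never slice C) can be sketched as a default-deny policy table. The group and slice identifiers, the `ALLOWED_SLICES` table, and the `may_attach` helper are hypothetical; the sketch only illustrates the policy logic, not how a 5G core would enforce it.

```python
# Hypothetical policy table: device group -> slices it may attach to.
ALLOWED_SLICES = {
    "group1": frozenset({"A", "B"}),
    "group2": frozenset({"C"}),
}


def may_attach(device_group: str, slice_id: str) -> bool:
    """Default-deny check: unknown groups and unlisted slices are refused."""
    return slice_id in ALLOWED_SLICES.get(device_group, frozenset())


def add_slice(policy: dict, slice_id: str, groups: set) -> dict:
    """Return a new policy that grants `groups` access to `slice_id`.

    The existing table is not modified in place, so the permissions of
    running slices can be compared before and after the change.
    """
    updated = {group: set(slices) for group, slices in policy.items()}
    for group in groups:
        updated.setdefault(group, set()).add(slice_id)
    return {group: frozenset(slices) for group, slices in updated.items()}


new_policy = add_slice(ALLOWED_SLICES, "D", {"group2"})
```

Building a new table instead of mutating the old one makes it straightforward to verify that adding slice D leaves the permissions of group 1 untouched, which mirrors the requirement that inclusion of a slice must not affect the slices already running.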

4.2.2. Standardized Architecture

Another important challenge that needs to be addressed is the definition of a standardized network slicing architecture. The motivation, expectations, and requirements of network slicing are clear. However, the definition of a unified architecture with respective standards is still work in progress. The formalization of this concept was presented in [63], where a three-layer scheme is proposed (Service Instance, Network Slice Instance, and Resources). There are also efforts on standardization such as [60, 64, 65], with important definitions such as (i) the Network Slice Management and Orchestration block with a set of functions and subcomponents; (ii) the definition of subdomains such as edge and central cloud solution, fronthaul, backhaul, and core network; (iii) initial definitions of the roles of SDN and NFV in the overall slicing scheme. Finally, some other works that contribute valuable architectural concepts can be found in [66–68], where details, such as the need for different control domains (global and specific) and the illustration of some specific slice orchestration workflows that include the SDN controller and the NFVO, are addressed. Despite all the different efforts made by industry, academia, and standardization bodies, there is no concrete consensus on a final standardized network slicing architecture. However, the works mentioned are just a small subset of the ongoing and upcoming efforts to progress in this direction.

4.2.3. Performance

The performance offered by virtualized systems has been, from the very beginning, one of the main concerns for their implementation [69]. Therefore, in most of the ETSI-NFV proofs of concept [70], performance has been one of the first issues to test. Through the years, many of those tests have been successful, but each case must be tested separately, based on the specific technical characteristics and requirements. Performance issues need to be assessed at both the control plane [71] and the data plane, where, for instance, the need to meet the performance requirements has promoted the introduction of acceleration modules, such as DPDK (Data Plane Development Kit) [72, 73]. However, challenges such as increased complexity, the respective VNF validation, and the live migration implementation when using DPDK still need to be investigated.

One of the concepts that may make the performance issue very challenging is the 3GPP proposal on recursive network slicing [60], which provides the ability to create new slices from one that has already been created and assigned. This in principle provides a higher flexibility and opens new business opportunities. However, as mentioned in [74, 75], these types of nesting approaches have a negative impact on performance, which needs to be thoroughly assessed case by case.

4.2.4. Resource Optimization

Under normal operations, it is expected that the potential combinations of network services and the respective resources assigned to them will be huge, and hence their distribution needs to be properly planned. The efficient use of the resources that a network operator has available is one of the main motivations for network slicing. Therefore, such planning must have optimization as one of its main strategies. The coming 5G networks are expected to have the capability to scale the deployed services up and down at any time, enable fast testing of new ideas, and in general be programmable and flexible with respect to any new businesses and services. On the other hand, the amount of resources that need to be managed and assigned may be considerably large and diverse. In conclusion, for the implementation of network slicing, it is important to have clear policies and plans for the assignment of resources that (i) enable cost-efficient operations for the network providers; (ii) provide agile solutions that enable quick implementations; and (iii) are capable of adapting to the varying conditions of the network slice environment.
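One simple assignment policy, given here only as an illustrative sketch and not as a method proposed in the cited works, is to serve each slice's guaranteed minimum first and then share the remaining capacity in proportion to the unmet demand. The slice names, the abstract capacity units, and the `allocate` function are assumptions made for this example.

```python
from __future__ import annotations


def allocate(capacity: float, slices: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Split `capacity` among slices given (guaranteed, demand) per slice.

    Step 1: every slice receives min(guaranteed, demand).
    Step 2: leftover capacity is shared in proportion to unmet demand,
            never giving a slice more than it asked for.
    """
    if sum(guaranteed for guaranteed, _ in slices.values()) > capacity:
        raise ValueError("guarantees exceed capacity; a slice must be rejected")

    alloc = {name: min(g, d) for name, (g, d) in slices.items()}
    leftover = capacity - sum(alloc.values())
    unmet = {name: d - alloc[name] for name, (_, d) in slices.items()}
    total_unmet = sum(unmet.values())

    if total_unmet <= leftover:
        # Enough capacity to satisfy every demand in full.
        return {name: d for name, (_, d) in slices.items()}
    for name in alloc:
        alloc[name] += leftover * unmet[name] / total_unmet
    return alloc


# Hypothetical slices: (guaranteed units, currently demanded units).
shares = allocate(100.0, {"eMBB": (30.0, 80.0), "URLLC": (20.0, 20.0), "mMTC": (10.0, 30.0)})
```

Because the function is cheap to re-run, it can be invoked whenever demand changes, which touches on point (iii) above. It deliberately ignores cost, placement, and multi-resource coupling, which are where the real optimization difficulty lies.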

4.2.5. Slices Management Hierarchy

The management and orchestration of slices can be classified into two big groups: interslice and intraslice management. One of the challenges in this regard is the interaction and distribution of domains and responsibilities between them. On one hand, there can be a global management of the entire network slice system, where, despite the local managers, the central entity may have strong influence over all the components inside the existing slices. On the other hand, the local-slice management may have full autonomy and control of all the resources assigned inside a specific slice and its respective operation, so that it never gets any type of interference from the global entity. Having a global management with decision power over the entire system (including the resources already assigned to the slices) is a scheme that allows better control and a more flexible use of the resources. However, this goes against the principle of slice independence. Under some circumstances (failures, emergencies, special events, etc.), a greater capability of the global management is needed, which calls for a design that finds the right balance between these two options.

Table 2 presents the tradeoff of those two types of slice management hierarchies. However, today, it is not clear what should be the right level of influence of the global management on the local slices. This is a topic that needs to be analyzed and addressed, so future implementations may have clear procedures in this regard.

4.3. Wireless Resource Allocation

A set of tasks that should be considered in the orchestration and control of software-defined 5G Radio Access Networks is the wireless resource allocation, which includes user association, power control, and channel selection.

4.3.1. User Association

In 5G, the user association is more complex than in current cellular networks. In existing systems, a user is associated with a base station, mainly based on the received power from the surrounding base stations. Instead, 5G user association will exploit the multiple technologies that will compose the RAN: the user can be associated with a macro-cell, a small-cell, a relay, another user device (i.e., D2D communication), or even multiple combinations of them (i.e., CoMP).

User association in 5G has attracted strong interest in the past few years and a few surveys are already present in the literature [76, 77]. Current works focus on optimizing specific criteria, such as energy consumption, channel condition and bandwidth, interference, and load, using different approaches, such as stochastic geometry, combinatorial optimization, and game theory. In some of the current works, the user association has been jointly addressed with other tasks. As in other wireless contexts, the use of interdisciplinary approaches will be important for the development of efficient and profitable algorithms [78–80].
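The difference between the classical received-power rule and a criterion that also accounts for load can be sketched as follows. The scoring rule, the `penalty_db` parameter, and the station names are illustrative assumptions and do not reproduce any specific algorithm from the surveys cited above.

```python
from __future__ import annotations


def associate_max_power(rx_power_dbm: dict[str, float]) -> str:
    """Classical rule: pick the station with the strongest received power."""
    return max(rx_power_dbm, key=lambda station: rx_power_dbm[station])


def associate_load_aware(
    rx_power_dbm: dict[str, float],
    load: dict[str, float],
    penalty_db: float = 10.0,
) -> str:
    """Pick the station with the best power after a load penalty.

    `load` is the utilization of each station in [0, 1]; a fully loaded
    station is treated as if its signal were `penalty_db` dB weaker.
    """
    def score(station: str) -> float:
        return rx_power_dbm[station] - penalty_db * load[station]

    return max(rx_power_dbm, key=score)


# Hypothetical measurements for one user.
power = {"macro": -70.0, "small": -75.0}
load = {"macro": 0.9, "small": 0.2}

by_power = associate_max_power(power)
by_load = associate_load_aware(power, load)
```

With these numbers the power rule selects the heavily loaded macro-cell, while the load-aware rule prefers the small-cell. Multi-connectivity options such as CoMP would require returning a set of stations rather than a single one.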

4.3.2. Power Control

One of the tasks that can be jointly addressed with the user association is the power control. The power control consists of selecting the transmission power of the base station, which also includes powering it off. For this reason, it is often related to energy efficiency optimization [81]. A coordinated user association and power control is important because the variation of the transmission power of a base station affects the reachability and the data rate between a potential user and the base station [82]. An effective power control will need an accurate power model of all the transmission entities [83] in order to allow the development of well-performing algorithms [84].
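The dependence of the data rate on the transmission power can be illustrated with the Shannon capacity formula applied to the signal-to-interference-plus-noise ratio (SINR). This is a textbook upper bound used here as a stand-in for a real link model; the channel gain, noise level, and bandwidth below are arbitrary illustrative values.

```python
import math


def achievable_rate_bps(
    tx_power_w: float,
    channel_gain: float,
    interference_plus_noise_w: float,
    bandwidth_hz: float,
) -> float:
    """Shannon bound B * log2(1 + SINR) for a single link.

    A transmission power of zero models a base station that is
    powered off and therefore yields a rate of zero.
    """
    if tx_power_w < 0:
        raise ValueError("transmission power cannot be negative")
    sinr = tx_power_w * channel_gain / interference_plus_noise_w
    return bandwidth_hz * math.log2(1.0 + sinr)


# Sweep the transmission power for one hypothetical user.
rates = {
    p: achievable_rate_bps(p, channel_gain=1e-9,
                           interference_plus_noise_w=1e-10,
                           bandwidth_hz=20e6)
    for p in (0.0, 0.5, 1.0, 2.0)
}
```

The sweep shows diminishing returns: doubling the power from 1 W to 2 W increases the rate by much less than a factor of two. In a multicell setting the same power also appears as interference for neighbouring users, which is why power control and user association are better solved jointly.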

4.3.3. Channel Selection

The channel selection is another task which is important together with user association and power control for the interference management [85]. In 5G, the selection of the frequency channel is especially critical because of the high densification [86]. Moreover, one important 5G application will be IoT, where technologies, such as Cognitive Radio Network (CRN) and Software-Defined Radio (SDR), will enable advanced radio spectrum management [87].

Despite the effort of the research community, there are still several challenges that need to be addressed in the above tasks.

(i) The architecture considered in the current works does not usually include all the 5G radio access technologies. Moreover, future works should consider the SDN/NFV operation and management in both access and core networks. The impact of CRN and SDR on accomplishing the above tasks should be also further considered.

(ii) Modelling a generic 5G channel is still an open issue due to the complexity of the 5G scenario, which includes innovative transmission technology, such as massive MIMO, NOMA, and mmWave. Furthermore, a related challenge is the development of learning algorithms for the dynamic management of the spectrum.

(iii) Another aspect is related to the target of the allocation. In the current works the target has mainly been energy efficiency [88, 89]. However, multiple targets need to be taken into account. They can be considered as a constraint, e.g., for avoiding that the reduction of active resources for energy efficiency purposes affects the dependability in case of failures or unexpected changes of traffic pattern.

(iv) The targets of the wireless resource allocation depend on the type of application, and 5G network slicing will enable the provision of heterogeneous network services with different technical requirements. Since the 5G application will (partially) share the same resources of the radio access and mobile core network, the related allocation problem will be complex.

(v) Another aspect that needs to be further investigated is the different time scale of the approaches. The allocation methods can be applied at different intervals of time: real time, daily fluctuation, and long term. The different approaches can be complementary and can focus on different scales (global/local).

4.3.4. Caching

Caching is a new opportunity in 5G wireless resource allocation, which researchers are working on [90, 91]. The idea is to offload traffic by caching contents at the edge (i.e., radio access) of the network, such as in the different base stations, the relays, and potentially even in the end-user devices. Traffic offloading will help to face the increase in traffic demand and to achieve better Quality of Experience (QoE) [92].

In this scenario, the content placement [93] is a relevant task that researchers are working on. In particular, researchers are working on Mobile Edge Computing (MEC) [94] and on what, where, when, and how long to cache contents. In this context, coding [95] and machine learning [96] techniques can be applied. Examples of typical scenarios are vehicular communications [97] and low-latency applications [98].
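A common baseline for the "what to cache" question is to store the most popular items and to estimate the resulting hit ratio under a Zipf popularity law. The sketch below follows this baseline; the Zipf exponent, the catalogue size, and the function names are assumptions for illustration and are not taken from the cited works.

```python
from __future__ import annotations


def zipf_popularity(num_items: int, alpha: float = 0.8) -> list[float]:
    """Request probability of items ranked 1..num_items under a Zipf law."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def hit_ratio_top_k(popularity: list[float], cache_size: int) -> float:
    """Fraction of requests served locally if the top-k items are cached."""
    ranked = sorted(popularity, reverse=True)
    return sum(ranked[:cache_size])


catalogue = zipf_popularity(1000)
hit_small = hit_ratio_top_k(catalogue, 50)    # cache 5% of the catalogue
hit_large = hit_ratio_top_k(catalogue, 200)   # cache 20% of the catalogue
```

The hit ratio is the share of traffic offloaded from the core under this static model. It says nothing about "when" and "how long", since popularity is assumed known and constant; learning-based approaches such as those in [96] target exactly that gap.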

To the best of our knowledge, a task that has not yet been investigated is the delivery of the cached contents. Once it has been decided what and where to cache, the retrieval of the content can be jointly realized through the wireless resource allocation in order to achieve content awareness [99].

The objectives of wireless resource allocation are broad and challenging, and having an interdisciplinary (i.e., signal processing, networking, operation research) approach can be a valuable option.

4.4. Network Monitoring

The huge advance brought by the whole 5G ecosystem raises a number of technological challenges that strongly require an accurate knowledge of the status of the network, in terms of both operational conditions and traffic information. The motivations for network monitoring and measurements are manifold and span from the need to gather low level performance indicators to higher level information for user profiling and defence against cyberattacks.

Troubleshooting has traditionally been one of the key drivers for traffic monitoring. As of today, it is frequent practice for large-scale operators to outsource the management of their communication infrastructures to the manufacturers that provided the network devices. However, the operators themselves are interested in either developing their own monitoring instruments or acquiring third-party tools to check the status of their own networks as well as to verify the adherence to the operational conditions specified in the contracts with the manufacturers.

Performance metrics are also crucial in the 5G technological context, which imposes extremely stringent constraints. Indeed, network QoS parameters, such as packet loss, available bandwidth, and end-to-end latency, must be constantly monitored to detect and promptly react to performance degradation and to assess SLA fulfilment. The latter reason is particularly relevant as multitenancy and network slicing will be common practices in the 5G operational scenario, and contract verification and slice planning will ultimately require real-time traffic information.

At a higher level, traffic monitoring will address the evaluation of the user experience and profiling. In particular, traffic classification plays a key role, as fine-grained knowledge of the users' behavior offers network operators huge opportunities for highly customized offers.

Finally, and perhaps most importantly, traffic monitoring becomes crucial to defend the whole communication infrastructure against cyberattacks, whose effects are magnified in highly interconnected systems. The evidence of such issues is in the everyday news: in 2017 alone, cybercrime reportedly hit more than a billion people in the world and caused damage for around 500 billion dollars [100].

Nowadays, cyberthreats are continuously mutating into sophisticated, stealthy, targeted, and multifaceted attacks to penetrate systems, study the critical components, impact their operations, and minimize the ability of the users to report the system’s malfunction. As an example, this is the case of Advanced Persistent Threats (APTs), in which the attacker is tasked to perform an attack and will use any technique, over long periods of time, until the target is finally accomplished.

4.4.1. Scalability

The advent of 5G cellular networks will further fuel the rapid increase of network traffic volume, which will soon be measured in Zettabytes (1 billion terabytes) [101], with more than 20 billion connected devices forecasted by 2020 [102]. From these astonishing figures, scalability emerges as a primary issue and clearly makes traditional monitoring approaches based on capture, storage, and postprocessing of data practically infeasible and obsolete. Indeed, the large volumes of traffic require the network nodes themselves to be the first monitoring players, by at least providing a first stage of coarse-grained processing over the live data, while delegating more sophisticated analysis to a second tier of specialized systems, such as middleboxes, Security Information and Event Management systems (SIEMs), and Deep Packet Inspection (DPI) modules. This approach is perfectly in line with the SDN paradigm, and the newly introduced flexibility may in fact readily enable the integration of monitoring capability into programmable nodes. As a result, network devices such as programmable switches or network interface cards, while being originally designed for forwarding network packets, can be repurposed to accommodate even complex monitoring tasks. In other words, network monitoring will be no exception and will be treated as one of the network functions envisaged in the SDN/NFV model.
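One well-known way to do coarse-grained processing in bounded memory on a node is a Count-Min sketch, which approximates per-flow counters without storing one entry per flow. It is used below as an illustrative example of a first-stage primitive; the paper does not prescribe it, and the class and flow names are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates never undercount."""

    def __init__(self, width: int = 1024, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # One independent-looking hash per row, derived by prefixing the row id.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key: str) -> int:
        # Collisions can only inflate a cell, so the minimum is the best guess.
        return min(self.rows[row][self._index(key, row)]
                   for row in range(self.depth))


# Hypothetical per-flow byte counts observed on a node.
observed = {"10.0.0.1->10.0.0.9": 150000, "10.0.0.2->10.0.0.9": 1200, "10.0.0.3->10.0.0.7": 64}
sketch = CountMinSketch(width=256, depth=4)
for flow, nbytes in observed.items():
    sketch.add(flow, nbytes)

# Only flows whose estimate exceeds a threshold are reported to the second tier.
suspects = [flow for flow in observed if sketch.estimate(flow) > 100000]
```

The memory footprint is `width * depth` counters regardless of how many flows traverse the node, and only the few flows above the threshold need to be exported for finer analysis.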

4.4.2. Programmable Data Plane

In the SDN model, network monitoring applications will have a portion of their logic offloaded to the programmable network devices. Processing in the infrastructure nodes will be instantiated through the SDN control plane (which still retains the full ownership of the process) by delegating tasks across the network nodes. In other terms, the underlying data plane will provide an Application Programming Interface (API) to expose the set of supported monitoring (or, more generally, processing) primitives as well as a formal way of combining such instructions. In fact, this approach follows the one originally proposed by OpenFlow, which pragmatically adopts a simple Match-and-Action programmable abstraction, and more recently by P4 [103] that, instead, improves the programmability of packet processing pipelines through dedicated processing architectures or Reconfigurable Match-Action Tables as an extension of TCAMs (Ternary Content Addressable Memories). However, when the processing scope becomes more complex and (as in the network monitoring domain) possibly involves stateful operations, the programming abstractions inherited from the switching domain show significant limitations, and more sophisticated approaches, such as the one based on eXtended Finite State Machines (XFSM) [104], show greater potential.
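The limitation of a stateless match-action rule can be seen with port knocking, a classic stateful example: a host may reach a protected port only after contacting a secret sequence of ports. The sketch below keeps one state per source address and drives it with a transition function, in the spirit of a finite state machine abstraction. It is plain Python for illustration; the port numbers and names are hypothetical, and it does not reproduce the API or semantics of [104], OpenFlow, or P4.

```python
from __future__ import annotations

KNOCK_SEQUENCE = (1111, 2222, 3333)  # hypothetical secret sequence
PROTECTED_PORT = 22
OPEN = len(KNOCK_SEQUENCE)           # state reached after all knocks


def step(state: int, dst_port: int) -> tuple[int, str]:
    """Transition function: (state, event) -> (next state, action)."""
    if state == OPEN:
        if dst_port == PROTECTED_PORT:
            return OPEN, "forward"
        return 0, "drop"  # any other port closes the door again
    if dst_port == KNOCK_SEQUENCE[state]:
        return state + 1, "drop"  # correct knock, advance silently
    # Wrong knock: restart, but accept it as a first knock if it matches.
    return (1 if dst_port == KNOCK_SEQUENCE[0] else 0), "drop"


class StatefulSwitch:
    """Keeps one state per flow key (here, the source address)."""

    def __init__(self) -> None:
        self.flow_state: dict[str, int] = {}

    def process(self, src_ip: str, dst_port: int) -> str:
        state = self.flow_state.get(src_ip, 0)
        next_state, action = step(state, dst_port)
        self.flow_state[src_ip] = next_state
        return action


switch = StatefulSwitch()
knocker = [switch.process("10.0.0.5", port) for port in (1111, 2222, 3333, 22)]
intruder = switch.process("10.0.0.6", 22)
```

A single static rule on the destination port cannot distinguish the two hosts, because the decision depends on the history of each flow. Keeping that history in the device avoids a round trip to the controller for every packet.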

On the user side, the monitoring applications running on the end hosts will refine and complement the monitoring activities upon the data received from the inner nodes. At this stage, they may run algorithms and analytics in a centralized fashion on the received subset of predigested data, with beneficial effect in terms of accuracy and overall scalability.

4.4.3. Data Analysis and Protection

The types of applications and services that may benefit from the new programming paradigm are virtually unlimited. As an example, in the security domain, the model easily translates into distributed agents running locally on the nodes that continuously look for signals and threats. Once a signal is detected, the system dynamically “zooms” into the nodes that are affected by the suspected threat, collects the data, and centrally runs more sophisticated analysis, such as anomaly detection, visualization tools, and threat analysis techniques and methods. The scope of applications can be further extended to include more traditional monitoring issues that require updated information about the network status, as in the recent P4-based In-Band Network Telemetry (INT) proposal [105] and in the ETSI driven iOAM [106] approach.

Typically, the centralized network element devoted to running fine-grained data analysis and visualization is the SIEM. In the heterogeneous 5G ecosystem, SIEMs are indeed expected to receive a large amount of data from different types of distributed sensors, including the less traditional outcomes of behavioral and sentiment analysis tasks. Therefore, they must address data homogenization, aggregation, and correlation, as well as the adoption of suitable algorithms for big data analysis to provide the “visual awareness” of the infrastructure status.

SIEMs will also play a central role in Digital Forensics (DF) investigations and will be the technical basis for post-mortem analysis in forensic procedures by facilitating the DF process in determining significance, reconstructing data fragments, and drawing conclusions based on the collected evidence. However, for the above process to be effective and deployable, particular care must be dedicated to the legal implications involved in monitoring the infrastructure and collecting data, in order not to infringe the fundamental rights of potentially involved citizens and to guarantee users' privacy and data protection. As such, SIEMs will be called to implement proper techniques of data protection and anonymization as well as to conform to the international, European, national, and local laws, rules, and regulations.

4.5. Discussion

Ideally, the target of the research in the orchestration and control of software-defined 5G radio access and core networks is to find the optimization mechanism that dynamically allocates the resources associated with the tasks discussed in this section and, at the same time, ensures that the requirements of the services are met and that the operation is within the commercial constraints set by the market actors. In doing so, the researchers will initially focus on subsets of the objectives and tasks, i.e., a specific subset of the tasks-disc presented in Figure 5. Meeting the strictest requirements of all the applications simultaneously is either infeasible (given the inherent tradeoff between them) or, when feasible, technoeconomically inefficient. Hence, there must be a specific optimization objective for each application, and it must be assured that a global optimality is sought for the entire system, which is the main challenge for the orchestration and control. In general, a cross-layer and interdisciplinary approach will be needed for obtaining solutions that are both practical and effective, and thus profitably applicable in the real world.

On the other hand, today it is clear that SDN, NFV and network slicing will be key concepts to achieve the 5G vision. Those technologies have been studied and tested during the last years, and have achieved a reasonable level of maturity. However, there are still many open issues that need to be addressed, such as the definition of an overall architecture of the holistic system, or the evaluation and testing that provide the confidence needed for an implementation phase.

Table 3 summarizes the above discussed challenges to motivate future works that lead to a successful 5G implementation.

5. Conclusions

This paper has presented the software-defined 5G radio access and core networks from a general up-to-date point of view by including both the enabling softwarization technologies, SDN and NFV, and the innovative wireless technologies, such as multi-RAT, multitier architecture, and CoMP and D2D communications.

The future research challenges in the orchestration and control have been highlighted. Different objectives, e.g., energy efficiency, dependability, performance, and cost reduction, have been considered, and different tasks, i.e., SDN/NFV management and orchestration, network slicing, wireless resource allocation, and network monitoring, have been investigated.

The paper shows how the orchestration and control of 5G systems are important to effectively coordinate and exploit the full potential of the 5G technology and identifies a number of fundamental issues that need to be resolved. Further efforts are needed to develop algorithms and methods that are able to manage the resources and provide advanced, dependable 5G services with appropriate QoS to the end-users, and yield a cost-efficient system.

In general, 5G networks are close to becoming a reality, thanks to the technological maturity and the numerous efforts that have been made during the last years. However, the challenges presented here show that additional efforts are still needed.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work has been partially supported by the PRA 2018-2019 Research Project “CONCEPT–COmmunication and Networking for vehicular CybEr-Physical sysTems” funded by the University of Pisa.