Estimating Spectral Efficiency Curves from Connection Traces in a Live LTE Network
In cellular networks, spectral efficiency is a key parameter when designing network infrastructure. Despite the existence of theoretical models for this parameter, experience shows that real spectral efficiency is influenced by multiple factors that vary greatly in space and time and are difficult to characterize. In this paper, an automatic method for deriving the real spectral efficiency curves of a Long Term Evolution (LTE) system on a per-cell basis is proposed. The method is based on a trace processing tool that makes the most of the detailed network performance measurements collected by base stations. The method is conceived as a centralized scheme that can be integrated in commercial network planning tools. Method assessment is carried out with a large dataset of connection traces taken from a live LTE system. Results show that spectral efficiency curves differ largely from cell to cell.
In the coming years, an exponential growth of cellular traffic is expected. Specifically, a 10-fold increase in mobile data traffic is forecast from 2015 to 2021 . Meanwhile, the proliferation of smartphones and tablets has changed the most demanded services in cellular networks. These changes will continue with the massive deployment of machine-type communications in Internet-of-Things applications . To cope with these changes, future mobile networks will have to combine multiple technologies. Thus, service and network heterogeneity has been identified as a critical issue in future 5G networks [3, 4].
In parallel, the increasing size and complexity of cellular networks is making it very difficult for operators to manage their networks. Thus, network management is one of the main bottlenecks for the successful deployment of mobile networks. To tackle this problem, industry fora and standardization bodies set up activities in the field of Self-Organizing Networks (SON) while defining 4G networks . Self-organization refers to the capability of network elements to self-plan, self-configure, self-tune, and self-heal . This need for self-organization has also been identified by vendors, which now offer automated network management solutions to reduce the workload of operational staff.
Legacy SON solutions are restricted to the replication of routine tasks that were done manually in the past. Currently, network planning and optimization are mostly based on performance counters and alarms in the network management system [7–9]. Thus, other data from network equipment and interfaces that could give very detailed information are discarded. Such information is only used in rare cases for troubleshooting, after a tedious analysis. However, with recent advances in information technologies, it is now possible to process all these data on a regular basis by means of Big Data Analytics (BDA) techniques. In cellular networks, “big data” refers to configuration parameter settings, performance counters, alarms, events, charging data records, or trouble tickets.
While BDA has long attracted the attention of the computing research community, this field is relatively new in the telecommunication industry. A generic framework for improving SON algorithms with big data techniques has been proposed to meet the requirements of future 5G networks. With a more limited scope, a self-tuning method based on call traces has been proposed for adjusting antenna tilts in a Long Term Evolution (LTE) system on a cell basis. Likewise, a review of network data used for self-healing in cellular networks is available. However, few works have used BDA for self-planning cellular networks.
In radio network planning, the key figure of merit to evaluate network (or channel) capacity is spectral efficiency (SE). A theoretical upper bound on the channel capacity of a single-input single-output wireless link is given by the Shannon capacity formula . This formula can be adapted to approximate the maximum channel capacity under certain assumptions specific to each radio access technology [15–19]. However, even if channel capacity is mainly determined by signal quality, it is also affected by the radio environment (user speed, propagation channel, etc.), the traffic properties (service type, burstiness, etc.), and the techniques in the different communication layers (multiantenna configuration, interference cancellation, channel coding, radio resource management, etc.). As considering all these factors is extremely difficult, most network planning tools rely on mapping curves relating signal quality to SE (a.k.a. SE curves), generated by link-level simulators [20–22]. This approach is still limited, as simulators make simplifications for computational reasons, and there remains the problem of selecting the right combination of simulation parameters that closely match the reality.
In this work, a new automatic method for deriving the real SE mapping curves for the downlink of an LTE system on a cell-by-cell basis is proposed. The method is based on a trace processing tool that makes the most of detailed network performance measurements collected by base stations (specifically, signal strength, traffic, and resource utilization measurements). Method assessment is carried out with a large dataset of connection traces taken from a live LTE system. The main contributions of this work are (a) a data-driven methodology for deriving SE mapping curves from real network measurements, which can be integrated in commercial network planning tools, and (b) a set of SE curves obtained from connection traces collected in two live LTE systems.
The rest of the paper is organized as follows. Section 2 presents the classical approach to derive SE curves in radio network planning tools. Section 3 explains the trace collection process. Section 4 describes the new methodology to derive SE curves from user connection traces. Section 5 presents the results of the proposed method over a real trace dataset taken from the live network. Finally, Section 6 presents the main conclusions of the study.
2. Current Approach
In wireless technologies, SE is strongly affected by the link adaptation scheme. For clarity, a brief overview of the link adaptation process in LTE is first given. Then, the classical abstraction model of the link layer integrated in most network planning tools is explained.
2.1. Link Adaptation Process
Link Adaptation (LA) aims to ensure the most effective use of radio resources assigned to a user. In LTE, this is achieved by dynamically changing the Modulation and Coding Scheme (MCS) depending on radio link conditions. Figure 1 shows the structure of the classical LA scheme for the downlink of LTE. LA is performed in the eNodeB (eNB) based on the feedback from the UE. The UE estimates downlink channel quality based on the experienced Signal-to-Interference-plus-Noise Ratio (SINR), which is reported to the eNB in the form of a Channel Quality Indicator (CQI). The reported CQI value is processed at the eNB to build an estimate of the measured downlink SINR. This estimate is corrected by an Outer Loop Link Adaptation (OLLA) mechanism to compensate for systematic errors, based on Hybrid Automatic Repeat reQuest (HARQ) positive and negative acknowledgments (ACKs/NACKs), yielding a corrected SINR. Then, an Inner Loop Link Adaptation (ILLA) mechanism determines, from the corrected SINR, the MCS that the eNB should use so that the UE is able to demodulate and decode the transmitted downlink data without exceeding a certain Block Error Rate (BLER) threshold, usually set to 10%.
Better radio link conditions translate into a higher reported CQI, thus allowing the eNB to select more efficient MCSs (i.e., higher order modulations with more bits per symbol and less redundancy). Conversely, in poor radio link conditions, a lower CQI is reported, and more robust MCSs are selected (i.e., lower order modulations with fewer bits per symbol and more redundancy).
The actual SINR values triggering the use of different MCSs in ILLA are vendor-specific and depend on the network conditions assumed by the vendor (radio environment, antenna configuration, traffic properties, network features, etc.).
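The interplay between OLLA and ILLA described above can be illustrated with a minimal sketch. It assumes a commonly used OLLA rule in which the offset is decreased by a fixed step after each NACK and increased by a smaller step after each ACK, with the ratio of the steps set by the BLER target. The threshold table, the step size, and the function names `olla_update` and `illa_select_mcs` are hypothetical, since the actual values are vendor-specific.

```python
from bisect import bisect_right

# Hypothetical SINR switching thresholds (dB), one per MCS index.
# Real thresholds are vendor-specific and are not given in the text.
MCS_THRESHOLDS_DB = [-6.0, -2.0, 2.0, 6.0, 10.0, 14.0, 18.0]

TARGET_BLER = 0.10    # BLER target mentioned in the text
STEP_DOWN_DB = 0.5    # illustrative offset decrease after a NACK
# Step-up chosen so that the offset is stationary when BLER equals the target.
STEP_UP_DB = STEP_DOWN_DB * TARGET_BLER / (1.0 - TARGET_BLER)


def olla_update(offset_db, ack):
    """Return the OLLA offset after one HARQ feedback (ACK or NACK)."""
    return offset_db + STEP_UP_DB if ack else offset_db - STEP_DOWN_DB


def illa_select_mcs(sinr_estimate_db, offset_db):
    """Return the highest MCS index whose threshold does not exceed the corrected SINR.

    Index 0 (the most robust MCS) is returned below the lowest threshold.
    """
    corrected_db = sinr_estimate_db + offset_db
    return max(0, bisect_right(MCS_THRESHOLDS_DB, corrected_db) - 1)
```

With these step sizes, nine ACKs followed by one NACK leave the offset unchanged, which corresponds to a 10% BLER. A connection that ends after only a few HARQ feedbacks has little opportunity to move the offset away from its initial value.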
2.2. Link Abstraction Model
As a result of LA, SE (and link capacity) can be treated as a function of SINR. In most network planning tools, SINR is estimated on a per-location basis. Then, the maximum SE of a single-input single-output system (in bits/s/Hz) for infinite block length and infinite decoding complexity in an Additive White Gaussian Noise (AWGN) channel can be obtained by the Shannon capacity formula as

$$\mathrm{SE} = \log_2\left(1 + \mathrm{SNR}\right), \tag{1}$$

where SNR is the signal-to-noise ratio. For general multiple-input multiple-output systems with perfect channel knowledge at the transmitter, the Shannon capacity is

$$\mathrm{SE} = \sum_{i=1}^{\min(n_T,\, n_R)} \log_2\left(1 + \mathrm{SNR}_i\right), \tag{2}$$

where $n_T$ and $n_R$ are the number of transmit and receive antennas, respectively, and $\mathrm{SNR}_i$ is the SNR of the $i$th spatial subchannel. In practice, real implementations are below the theoretical limit given by (2). Thus, the real SE of the limited set of MCS specified in the standard can be better approximated by the Truncated Shannon Bound (TSB) formula:

$$\mathrm{SE}(\gamma) = \begin{cases} 0, & \gamma < \gamma_{\min}, \\ S(\gamma), & \gamma_{\min} \le \gamma \le \gamma_{\max}, \\ \mathrm{SE}_{\max}, & \gamma > \gamma_{\max}, \end{cases} \tag{3}$$

where $\gamma$ is the SINR of the link, $\gamma_{\min}$ is a lower limit on SINR below which SE is zero, $\gamma_{\max}$ is an upper limit on SINR associated with the SE of the highest implemented MCS (e.g., 64 QAM, rate 4/5, in this work), $\mathrm{SE}_{\max}$, and $S(\gamma)$ is the Shannon bound corrected by $BW_{\mathrm{eff}}$, the system bandwidth efficiency that accounts for different overheads (pilots, cyclic prefix, control channels, etc.), and by $\eta$, a correction factor to reflect implementation losses. Values of $BW_{\mathrm{eff}}$ and $\eta$ for different antenna configurations and packet scheduling schemes have been reported in the literature.
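As an illustration of the truncation described above, the following sketch evaluates a truncated Shannon bound in which the corrections are lumped into a single multiplicative attenuation factor, so that the SE saturates at the value reached at the upper SINR limit. This simplified form, the function names, and any parameter values passed to them are assumptions made for the example, not the fitted values used in this work.

```python
import math


def db_to_linear(value_db):
    """Convert a ratio in dB to linear units."""
    return 10.0 ** (value_db / 10.0)


def shannon_se(snr_linear):
    """Shannon bound for a single-input single-output AWGN link (bits/s/Hz)."""
    return math.log2(1.0 + snr_linear)


def truncated_shannon_se(sinr_db, sinr_min_db, sinr_max_db, attenuation):
    """Truncated Shannon bound with one multiplicative attenuation factor (assumed form).

    SE is zero below sinr_min_db and constant above sinr_max_db.
    """
    if sinr_db < sinr_min_db:
        return 0.0
    clipped_db = min(sinr_db, sinr_max_db)
    return attenuation * shannon_se(db_to_linear(clipped_db))
```

For instance, `truncated_shannon_se(0.0, -10.0, 20.0, 0.6)` evaluates the bound at 0 dB with illustrative limits of -10 dB and 20 dB and an attenuation of 0.6.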
SE estimate in (3) is still an optimistic value of the link SE. Classical LA schemes based on adaptive thresholds (i.e., OLLA + ILLA) suffer from slow convergence with strongly biased CQI reporting [25, 26]. Such a slow convergence is a major issue in current LTE networks due to the prevalence of short connections . Even if more realistic values of SE could be obtained from simulations, these cannot capture all possible factors, which greatly vary from cell to cell and dynamically change with time. As a result, SE and throughput measurements are much lower than expected in live networks .
Network planning is negatively affected by the overestimation of SE, as this parameter controls the expected demand of network resources. Overestimating SE leads to underestimating the average cell load. During network coverage planning, this might lead to an overly optimistic cell radius derived from unrealistic cell-edge performance. Likewise, during network capacity planning, it might lead to an insufficient amount of traffic resources per cell. All these problems can be addressed by deriving a more realistic SINR-to-SE mapping from connection traces.
3. Connection Traces
Data for managing a radio access network includes:
(a) Configuration Management data (CM), consisting of current network parameter settings,
(b) Performance Management data (PM), consisting of counters reflecting the number of times some event has happened per network element and Reporting Output Period (ROP),
(c) Data Trace Files (DTFs), consisting of multiple records (known as events) with radio related measurements stored when some event occurs for a single User Equipment (UE) or a base station.
DTFs can be further classified into User Equipment Traffic Recording (UETR) and Cell Traffic Recording (CTR). UETRs are used to single out a specific user, while CTRs are used to monitor cell performance by monitoring all (or a random subset of) anonymous connections. The former are used for network troubleshooting, whereas the latter are used for network planning and optimization purposes.
Depending on the network entities involved, events can be classified as external or internal. External events include signaling messages that eNBs exchange with other network elements (e.g., UE or eNB) through the Uu, X2, or S1 interfaces [31–33]. Internal events include vendor-specific information about the performance of the eNB.
3.1. Trace Collection
Figure 2 depicts the reference architecture for trace collection in LTE . CTR collection starts by the operator preparing a Configuration Trace File (CTF) in the Operation Support System (OSS), with (a) the event(s) to be monitored, (b) the cells and the ratio of calls for which traces are collected (i.e., UE fraction), (c) the ROP (typically, 15 minutes), (d) the maximum number of traces activated simultaneously in the OSS, and (e) the time period when trace collection is enabled. After enabling trace collection, UEs transfer their event records to their serving eNB. When ROP is finished, the eNB generates CTR files, which are then sent to the OSS asynchronously.
3.2. Trace Preprocessing
Trace files are binary files encoded in ASN.1 format . The structure of events consists of a header and a message container including different attributes (referred to as event parameters). The header contains general attributes associated with the event description, such as the timestamp, the eNB, the UE, the message type, or the event length, while the message container includes specific attributes associated with the message type.
Trace decoding is performed by a parsing tool that extracts the information contained in the event fields. In most cases, the output is one file per event type, eNB, and ROP. Then, traces are synchronized by merging files from different eNBs by event type and ROP and ordering events by the timestamp attribute. Thus, it is possible to link simultaneous events of the same type from different eNBs (e.g., incoming and outgoing handover events).
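The synchronization step can be sketched as a grouped merge. The example below assumes that each decoded file has already been loaded as a list of event dictionaries carrying a `timestamp` key; the tuple layout, the key names, and the function name `synchronize` are hypothetical and do not correspond to any particular vendor tool.

```python
import heapq
from collections import defaultdict
from operator import itemgetter


def synchronize(decoded_files):
    """Merge decoded trace files from different eNBs by event type and ROP.

    decoded_files: iterable of (event_type, rop, enb_id, events) tuples, where
    events is a list of dicts with a "timestamp" key (assumed layout).
    Returns a dict mapping (event_type, rop) to one list of events ordered by
    timestamp, each event tagged with the eNB that reported it.
    """
    groups = defaultdict(list)
    for event_type, rop, enb_id, events in decoded_files:
        tagged = [dict(event, enb=enb_id) for event in events]
        # Sort defensively so that the k-way merge below is valid.
        tagged.sort(key=itemgetter("timestamp"))
        groups[(event_type, rop)].append(tagged)
    return {
        key: list(heapq.merge(*file_lists, key=itemgetter("timestamp")))
        for key, file_lists in groups.items()
    }
```

Because each per-eNB file is ordered before merging, a k-way merge avoids re-sorting the full event list for every ROP.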
4. Estimating Spectral Efficiency from Traces
A method for building a link-layer abstraction model for LTE downlink from network measurements is proposed here. The model relates SINR to SE based on signal strength, traffic, and radio resource measurements obtained from the live network. Such measurements are generated by the UE and the eNB, and later uploaded to the OSS in the form of connection traces. The inputs to the algorithm are CTR files with the following events:
(a) UE Traffic Report: this internal (i.e., Ericsson-specific) event includes the total carried traffic volume and the total amount of used Resource Elements (REs) per connection. An RE consists of 1 subcarrier during 1 OFDM symbol. For most vendors, this event is reported once at the end of each connection, so that there is a one-to-one mapping between traffic reports and connections.
(b) RRC Measurement Report: this standard event includes Reference Signal Received Power (RSRP) measurements from cells detected by the UE. Each measurement record includes the pilot signal level from 1 serving cell and up to 8 neighbor cells. It can be configured to be reported periodically or event-triggered. In the former case, each connection can comprise many records of this event. A measurement report is said to belong to a given connection if it is reported during that connection.
Figure 3 illustrates an example of how these events are distributed within a call. A call starts with a connection setup and ends with a connection release. While in a call, the UE may perform a handover between cells. The term “connection” refers to the time spent by a UE in a cell, until a handover is executed or the call is finished. Therefore, a call may contain more than one connection. A UE traffic event is reported at the end of each connection, while RRC measurements are generated periodically along a connection.
Tables 1 and 2 present the most relevant parameters in the UE Traffic Report and RRC Measurement Report events. In the tables, subindex $i$ refers to the traffic report (i.e., connection), and subindex $j$ refers to the RRC Measurement Report. In Table 1, it is worth noting that the number of used REs only counts REs used for user data transmission in the Physical Downlink Shared CHannel (PDSCH) and thus excludes REs used for Cell Reference Signals (CRS) and other signaling information (e.g., Physical Downlink Control CHannel, PDCCH).
Figure 4 shows the flow diagram of the proposed algorithm. In stage 1, the time distribution of cell load is calculated per cell as the percentage of used REs during a fixed time period based on the information in UE Traffic Report events. In stage 2, the average SINR per connection is calculated as the ratio between the average received power from the serving cell and the sum of the interference power plus background noise (in linear units). To estimate interference levels, RSRP samples in RRC Measurement Report events are combined with cell load estimates computed in stage 1. In stage 3, the average SE per connection is calculated as the ratio between the total carried traffic volume and the amount of used REs based on the information in UE Traffic Report events. In stage 4, a fitting curve is built relating average SINR and average SE estimates from stages 2 and 3. All these operations are described in more detail in the following paragraphs.
4.1. Stage 1: Estimation of Cell Load Distribution over Time
In this work, cell load is defined as the ratio of REs occupied for transmission. In the network, cell load changes every Transmission Time Interval (TTI). As the number of REs used per connection is only available at the end of the connection, cell load cannot be calculated on a TTI basis. Alternatively, cell load is estimated with a lower resolution by defining a fixed time granularity of several TTIs. Then, the total number of REs used by a connection is evenly distributed across the equally spaced time intervals from the start to the end of the connection.
First, the average resource usage rate (in RE/s) in cell $c$ from the $i$th connection (where $c$ is the serving cell of that connection) is computed as

$$r_c(i) = \frac{N_{RE}(i)}{t_{end}(i) - t_{start}(i)}, \tag{4}$$

where $N_{RE}(i)$ is the total amount of resources used by the $i$th connection (in REs), and $t_{start}(i)$ and $t_{end}(i)$ are the start and end times of the $i$th connection (in s), respectively, as illustrated in Figure 5.
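By assuming equally spaced time intervals, the total amount of REs used by connection $i$ in time interval $k$, $N_{RE}(i,k)$, is calculated as

$$N_{RE}(i,k) = r_c(i)\, \max\Bigl(0,\ \min\bigl(t_{end}(i),\, kT\bigr) - \max\bigl(t_{start}(i),\, (k-1)T\bigr)\Bigr), \tag{5}$$

where $r_c(i)$ is the resource usage rate for the $i$th connection, $t_{start}(i)$ and $t_{end}(i)$ are the start and end points of the $i$th connection (in s), and $T$ is the sampling period defining the time resolution (in s), so that interval $k$ spans from $(k-1)T$ to $kT$.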
Finally, the sampled average load distribution of cell $c$, $L_c(k)$, is calculated as the ratio between the sum of resources used by connections and the total amount of available resources in that cell in the $k$th period, as

$$L_c(k) = \frac{\sum_{i} N_{RE}(i,k)}{\rho\, N_{RE}^{slot}\, T / T_{slot}}, \qquad N_{RE}^{slot} = N_{sc}\, N_{PRB}\, N_{sym}, \tag{6}$$

where the sum runs over the connections served by cell $c$, $N_{RE}^{slot}$ is the total number of available REs per time slot, $N_{sc}$ is the number of subcarriers per Physical Resource Block (PRB), set to 12, $N_{PRB}$ is the number of PRBs in the cell, given by the system bandwidth, $N_{sym}$ is the number of OFDM symbols per slot (7 or 6 for normal or extended cyclic prefix, resp.), $T_{slot}$ is the slot duration (0.5 ms), and $T$ is the time interval duration (i.e., the sampling period). Also, $\rho$ is a correcting factor that represents the traced connection ratio configured by the operator. If all connections are traced in the network (i.e., UE fraction is 100%), then $\rho = 1$.
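Stage 1 can be sketched in a few lines. The example below spreads the REs of each connection evenly over the sampling intervals it overlaps and normalizes by the REs available per interval. The input layout, the function name `cell_load`, and the zero-based interval index (interval `k` covers the time from `k * period_s` to `(k + 1) * period_s`) are assumptions made for the illustration.

```python
from collections import defaultdict

N_SC = 12          # subcarriers per PRB
T_SLOT_S = 0.5e-3  # slot duration (s)


def cell_load(connections, n_prb, n_sym, period_s, traced_ratio=1.0):
    """Estimate the load of each cell per sampling interval.

    connections: iterable of (cell, t_start, t_end, n_re) tuples (assumed layout),
    with times in seconds and n_re the REs used by the connection.
    Returns a dict mapping (cell, k) to the load in zero-based interval k.
    """
    # REs available per interval, scaled by the fraction of traced connections.
    available = traced_ratio * N_SC * n_prb * n_sym * period_s / T_SLOT_S
    used = defaultdict(float)
    for cell, t_start, t_end, n_re in connections:
        duration = t_end - t_start
        if duration <= 0:
            continue  # skip malformed records
        rate = n_re / duration  # average resource usage rate (RE/s)
        for k in range(int(t_start // period_s), int(t_end // period_s) + 1):
            overlap = min(t_end, (k + 1) * period_s) - max(t_start, k * period_s)
            if overlap > 0:
                used[(cell, k)] += rate * overlap
    return {key: n_re / available for key, n_re in used.items()}
```

For a 50-PRB cell with 7 symbols per slot and a 1 s sampling period, a single connection lasting from 0.5 s to 2.5 s contributes to three consecutive intervals, with half as much load in the first and last intervals as in the middle one.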
4.2. Stage 2: Estimation of Average SINR per Connection
The SINR is defined as the ratio between the received power from the serving cell and the sum of the interference power (i.e., received power from adjacent cells) plus background noise (in linear units). In LTE, different REs transmit different signals, so that not all REs in a resource block experience the same SINR. 3GPP specifications do not standardize how SINR is measured, so the actual definition is vendor-specific. It can be measured in data REs or in reference signal REs. However, SINR is generally calculated on the REs carrying reference signals. In our case, the average SINR (in natural units) for the $j$th measurement report can be estimated as

$$\gamma(j) = \frac{RSRP_s(j)}{\sum_{n=1}^{N(j)} RSRP_n(j)\, L_n(k_j) + P_N}, \tag{7}$$

where $RSRP_s(j)$ is the RSRP (in mW) of the serving cell in the $j$th measurement report, $RSRP_n(j)$ is the RSRP (in mW) of the $n$th neighbor cell in the $j$th measurement report, $N(j)$ is the number of neighbor cells in the $j$th measurement report, $L_n(k_j)$ is the average load of the $n$th interfering cell in report $j$ at the time interval $k_j$ when the $j$th measurement report was sent, and $P_N$ is the background noise (in mW).
As previously stated, the UE may send more than one RRC Measurement Report per connection. Therefore, it is necessary to obtain an average SINR per connection. The average SINR for the $i$th connection is obtained as

$$\bar{\gamma}(i) = \frac{1}{M(i)} \sum_{j:\, i(j) = i} \gamma(j), \tag{8}$$

where $i(j)$ is the connection to which the $j$th measurement report belongs and $M(i)$ is the number of measurement reports in the $i$th connection.
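Stage 2 can be sketched as follows. Each neighbor RSRP is weighted by the load of that cell at the time of the report, noise is added, and the per-report SINR values are averaged in linear units. The function names, the input layout, and the choice to treat a neighbor with unknown load as unloaded are assumptions made for the example.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to mW."""
    return 10.0 ** (power_dbm / 10.0)


def report_sinr(rsrp_serving_mw, neighbors, load, noise_mw):
    """SINR (linear units) for one RRC Measurement Report.

    neighbors: list of (cell_id, rsrp_mw) pairs for the reported neighbor cells.
    load: dict mapping cell_id to its load in the interval when the report was
    sent; cells missing from the dict are treated as unloaded (an assumption).
    """
    interference_mw = sum(rsrp_mw * load.get(cell, 0.0) for cell, rsrp_mw in neighbors)
    return rsrp_serving_mw / (interference_mw + noise_mw)


def connection_sinr(report_sinrs):
    """Average SINR (linear units) over the reports of one connection."""
    return sum(report_sinrs) / len(report_sinrs)
```

Averaging is done on linear values and converted to dB only afterwards, since averaging dB values would bias the estimate toward lower SINR.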
4.3. Stage 3: Estimation of Average SE per Connection
SE is defined as the data rate that can be transmitted over a given bandwidth in a communication system. Based on the UE Traffic Report, the average SE (in bps/Hz) in REs assigned to the $i$th connection can be estimated as

$$\mathrm{SE}(i) = \frac{8\, V(i)}{N_{RE}(i)\, \Delta f\, T_{slot} / N_{sym}}, \tag{9}$$

where $V(i)$ is the traffic volume in the $i$th connection (in bytes), $N_{RE}(i)$ is the total amount of resources used in the $i$th connection (in REs), $\Delta f$ is the subcarrier bandwidth (15 kHz in LTE), $N_{sym}$ is the number of OFDM symbols per slot (7 or 6 for normal or extended cyclic prefix, resp.), and $T_{slot}$ is the slot duration (0.5 ms).
Note that (9) is restricted to data REs. Thus, it considers the loss of SE due to cyclic prefix, but does not take into account other factors such as (a) the limited BW occupancy to satisfy the Adjacent Channel Leakage Ratio (ACLR), (b) the pilot overhead due to CRSs, and (c) the dedicated and common control channel overhead. All these factors can be added later if needed for planning purposes, based on the values suggested in .
4.4. Stage 4: Construction of Link-Level Mapping Curves
The SINR-to-SE curve is computed by regression analysis of the scatter plot built with the average SINR, $\bar{\gamma}(i)$, and average SE, $\mathrm{SE}(i)$, estimated on a per-connection basis. Depending on the aggregation level, the output of the regression analysis is a single mapping curve for the whole network or a set of curves constructed on a per-cell basis.
In principle, any regression method could be applied as long as it provides good fitting. Previous studies suggest a logarithmic fitting, based on the expression of the Shannon bound , or an arctangent-based approach, based on empirical results . In this work, a simple polynomial regression from logarithmic SINR values is used for simplicity and flexibility, as it is included in most statistical analysis packages and does not presume any shape of the mapping function.
Several factors may add dispersion to the SINR-to-SE estimates, causing two connections with the same average SINR to have different average SE. A first reason is instantaneous SINR fluctuations due to fading, multipath, and other propagation phenomena, which is not reflected in SINR averages. A second reason is the limited time resolution of RRC measurements, which may cause the average SINR estimate to not reflect the true average SINR of the connection. Another reason is the service type, as the LA scheme requires certain time to converge, which might not be satisfied in short connections. All these factors degrade regression performance.
To increase the robustness of regression, several actions are taken. To improve the accuracy of SINR measurements per connection, regression analysis is carried out over connections with more than 1 RRC Measurement Report. Likewise, piecewise regression is used to prevent the most populated SINR values from dominating the regression equation. Thus, SINR measurements are divided into bins of 1 dB, centered at integer SINR values (i.e., the bin for integer $n$ covers SINR values from $n - 0.5$ to $n + 0.5$ dB). Then, a single SE value is computed per bin by averaging the SE of all connections in the bin. Bins with fewer than 50 samples (connections) are discarded for the regression analysis.
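The binning and regression steps can be sketched in plain Python. The example averages SE in 1 dB bins centered at integer SINR values, drops sparsely populated bins, and fits a polynomial to the bin averages by ordinary least squares. The function names are hypothetical, and the small normal-equation solver merely stands in for the polynomial regression routine of a statistical package.

```python
import math
from collections import defaultdict


def bin_average(sinr_db, se, min_samples=50):
    """Average SE per 1 dB SINR bin centered at integer values.

    Returns a sorted list of (bin_center_db, mean_se) pairs, discarding bins
    with fewer than min_samples connections.
    """
    bins = defaultdict(list)
    for s, e in zip(sinr_db, se):
        bins[math.floor(s + 0.5)].append(e)  # bin n covers [n - 0.5, n + 0.5)
    return sorted((b, sum(v) / len(v)) for b, v in bins.items() if len(v) >= min_samples)


def polyfit(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Augmented normal equations: sum(x^(r+c)) * coef_c = sum(y * x^r).
    a = [[sum(xi ** (r + c) for xi in x) for c in range(n)]
         + [sum(yi * xi ** r for xi, yi in zip(x, y))] for r in range(n)]
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[r][n] / a[r][r] for r in range(n)]
```

A curve would then be obtained by calling `polyfit` on the bin centers and bin averages returned by `bin_average`, so that every retained bin carries the same weight regardless of how many connections it contains.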
It should be pointed out that the output of the method is a curve relating the average SINR and SE of a connection. This is the information needed by a network planning tool, where SINR and SE are calculated per location in the form of averages. Thus, the resulting curve might differ from the curves used in system-level simulators, where the link-layer model considers instantaneous SINR and SE values.
The proposed method is tested with trace datasets taken from a live LTE network. For clarity, the analysis methodology is first described and results are presented later. Finally, implementation issues are discussed.
5.1. Analysis Setup
Two trace datasets are used in the analysis, taken from different networks (referred to as Network 1 and Network 2). Table 3 describes their main parameters. The bulk of the analysis is carried out on Network 1, and Network 2 is only used to check the impact of the network configuration and service mix. Even if traces include both downlink and uplink measurements, the analysis presented here is restricted to the downlink.
The proposed trace-based approach to derive the SINR-SE curves is compared with a theoretical bound in the absence of a dynamic system-level simulator that captures the diversity of services, radio environments, and features in the real network. Specifically, the following approaches are evaluated:(a)TSB-MIMO: modified truncated Shannon bound adjusted for best fit to link-level simulation curves for 2 × 2 multiple-input multiple-output antenna configuration with Alamouti Space Time Coding under Typical Urban (TU) channel at 3 km/h and Proportional Fair-Time Dependent Packet Scheduling (PF-TDPS) ; it corresponds to transmission mode 3 (open-loop spatial multiplexing) with Rank 2; specifically, , , , , and bps/Hz;(b)TB-N: the proposed trace-based approach applied to the complete set of traces, resulting in a single mapping curve valid for the whole network;(c)TB-C: the proposed trace-based approach applied to the traces of a single cell, resulting in a mapping curve per cell.
Method assessment is carried out by comparing the shape of the SINR-SE curves. For a fair comparison, the SE of all methods is restricted to data REs. Thus, the bandwidth efficiency parameter in TSB-MIMO only considers the loss due to cyclic prefix; that is, for long prefix. Hence, the maximum achievable SE in antenna configurations with 1 spatial stream, , corresponding to the highest MCS (i.e., 64 QAM, rate 4/5), is 6 (ms·subc.).
Figures 6(a) and 6(b) illustrate how the trace-based approaches work. Figure 6(a) shows the original SINR-SE scatter plot together with a simple polynomial regression. In the figure, each point is a connection. It is observed that connections with the same SINR have very different SE, which is the reason for the low coefficient of determination, $R^2$. Figure 6(b) shows the simplified scatter plot obtained by discretizing SINR values and computing a piecewise regression of order 0 (denoted as piecewise regression). To aid comparison, the curves obtained by polynomial regression on the original and simplified data are also superimposed (denoted as original and piecewise, resp.) and the $x$-axis is restricted to a limited SINR range (in dB). From the figure, it is clear that the regression curve derived from the points computed by piecewise regression better captures the average SE trend. This is confirmed by the large value of $R^2$.
Figure 6: (a) Linear regression. (b) Piecewise regression.
To show the benefit of using real traces, Figure 7 compares the TSB-MIMO (theoretical) and TB-N (practical) approaches. For TB-N, 95% confidence intervals for the average SE in each SINR band are included. Note that both methods result in a single curve for the whole network. It is observed that SE values in traces are consistently below the maximum theoretical values suggested by TSB-MIMO. This gives clear evidence of the need for computing SINR-SE curves from real connection traces.
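The text does not state how the confidence intervals are computed. One common option, shown here purely as an illustration, is the normal approximation for the mean of the SE samples in a bin, which is reasonable when each bin holds at least several tens of connections. The function name and the default multiplier of 1.96 (95% confidence) are assumptions of this sketch.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Normal-approximation confidence interval for the mean of the samples.

    Returns (lower, upper); z=1.96 corresponds to a 95% confidence level.
    """
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width
```

Applied to the SE values of the connections in one SINR bin, the function returns the lower and upper ends of the interval around the bin average.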
The reasons for such differences are the link adaptation process and the transport protocol. It has been shown that connection length has a strong impact on user throughput. Short connections, prevailing in current LTE networks, suffer from reduced user throughput. This is due to the slow convergence of the Outer Loop Link Adaptation (OLLA) process and the slow-start feature of the Transmission Control Protocol (TCP), which causes throughput to ramp up gradually. Figure 8 confirms this observation by showing the SE curve obtained by TB-N for short and long connections. In this work, a connection with fewer than 20 ACK + NACK is classified as a short connection. Conversely, a connection with more than 100 ACK + NACK is classified as a long connection. In the figure, it is observed that the maximum SE for long connections is more than three times larger than for short connections (1.45 versus 0.45 bps/Hz). This is mainly due to OLLA convergence issues, as traffic burstiness caused by TCP ramp-up should not affect the selected MCS. By comparing Figures 7 and 8, it can be deduced that, even for long connections, the theoretical curve is a loose upper bound for the average SE with good radio link conditions.
To show the benefit of computing a curve per cell, Figure 9 compares the output of the trace-based approach executed on a cell basis, TB-C, for two cells in the system. The network-wide curve obtained by TB-N is also included as a reference. It is observed that SE values may differ from cell to cell by up to 150% for the same SINR value. A closer analysis (not presented here) shows that this is because the ratio of long connections is 41% in Cell A and only 20% in Cell B. Recall that connection length has a strong impact on user throughput due to OLLA convergence issues and the TCP slow-start feature. Thus, the connection length distribution in a cell strongly influences the spectral efficiency curve measured for that cell. The observed differences justify the need for deriving SE curves on a cell basis.
Finally, Figure 10 compares the results of the trace-based method in the two datasets from different networks. For brevity, the analysis is restricted to the network-wide solution for long connections. In the figure, even if trends are similar, Network 1 has a lower SE than Network 2 for the same SINR. This might be due to the different service mix in both networks. To back up this statement, a deeper analysis of radio network measurements is done. On the one hand, traces show that 50% of long connections in Network 1 have fewer than 300 ACKs + NACKs, compared to only 10% in Network 2. Thus, the probability that OLLA has reached steady state before the end of a connection is higher in Network 2. On the other hand, network counters show that the percentage of active TTIs where the user buffer is emptied (i.e., last TTI transmissions) is 41% for Network 1 and only 25% for Network 2. In last TTI transmissions, some REs in the PRBs assigned to the user might not carry data because there is not enough data, decreasing the link SE. Thus, the number of underutilized resources for this reason should be larger in Network 1. Both effects indicate that traffic in Network 1 is more bursty than in Network 2. These differences justify the need for deriving a specific SE curve for each network.
5.3. Implementation Issues
The method is designed as a centralized scheme that can be integrated in a commercial radio network planning tool. Its low computational load makes it a good candidate for improving measurement-based replanning algorithms. The worst-case time complexity is linear in the product of the number of cells and trace collection periods. In practice, the most time-consuming process is parsing and synchronizing the traces, which can be done with trace processing tools provided by OSS vendors. The rest of the method can be developed in any programming language (in this work, R). Specifically, the total execution time for the Network 1 dataset on a laptop with a 2.6-GHz quad-core processor is less than 780 s (3 s per 1000 connections).
Link spectral efficiency is a key parameter when designing and optimizing cellular networks. Unfortunately, such a parameter is difficult to estimate, as it depends on multiple factors that cannot be monitored and greatly vary from cell to cell. In this work, a data-driven methodology for deriving the SINR-to-spectral efficiency mapping curves for LTE downlink on a cell basis based on connection traces has been proposed. The method relies on the activation of standard periodic RSRP measurements and the provision of user traffic reporting events by the base station vendor. As these requirements are common, the method can easily be adapted to other radio access technologies, even if it is initially conceived for LTE. The method has been tested with a large dataset of connection traces taken from a live LTE system. Results have shown that the current approach of deriving the spectral efficiency curves, based on Truncated Shannon Bound formula, is too optimistic. Differences with real traces are more significant in large SINR values, for which a fourfold reduction in spectral efficiency has been observed. Likewise, active connection length has been shown to have a strong impact on spectral efficiency due to the OLLA convergence process. In particular, it has been observed that spectral efficiency can be up to three times larger for long connections than for short ones for the same average SINR. Likewise, it has been checked that average connection length largely differs between cells and network operators, which is one of the reasons for the differences in spectral efficiency curves at cell and network level.
The proposed method can be used to build link-layer mapping curves from traces on a per-cell or a network-wide basis. Mapping curves derived from connections with similar radio and traffic conditions are expected to reflect similar link-level performance and therefore to provide more accurate results. Thus, it seems reasonable to define clusters of cells with similar properties, so that data are segregated by cell type according to multiantenna configuration, user mobility, terrain, or service mix. Nonetheless, defining too many cell groups might leave some groups with too little data for a reliable regression, so a trade-off must be found.
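The trade-off between group granularity and sample size can be illustrated with a minimal sketch. It is not the regression used in this work: a simple per-bin average stands in for the curve fit, and the function names, the bin width, and the `min_samples` threshold are hypothetical choices made only for illustration. Groups with too few samples fall back to the network-wide curve.

```python
import math
from collections import defaultdict


def binned_curve(samples, bin_width_db=2.0):
    """Average spectral efficiency per SINR bin.

    samples: iterable of (sinr_db, spectral_efficiency) pairs.
    Returns a dict mapping the lower edge of each bin (dB) to the mean SE.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for sinr_db, se in samples:
        bin_edge = math.floor(sinr_db / bin_width_db) * bin_width_db
        sums[bin_edge][0] += se
        sums[bin_edge][1] += 1
    return {edge: total / count for edge, (total, count) in sorted(sums.items())}


def curves_per_group(records, min_samples=1000, bin_width_db=2.0):
    """Build one curve per cell group, falling back to the network curve.

    records: iterable of (group_label, sinr_db, spectral_efficiency).
    min_samples is a HYPOTHETICAL threshold, not a value from the paper.
    """
    by_group = defaultdict(list)
    all_samples = []
    for group, sinr_db, se in records:
        by_group[group].append((sinr_db, se))
        all_samples.append((sinr_db, se))
    network_curve = binned_curve(all_samples, bin_width_db)
    curves = {}
    for group, samples in by_group.items():
        if len(samples) >= min_samples:
            curves[group] = binned_curve(samples, bin_width_db)
        else:
            # Too little data for a reliable group-specific curve.
            curves[group] = network_curve
    return curves
```

Raising `min_samples` pushes more groups onto the shared curve (more reliable, less specific), whereas lowering it yields more group-specific curves at the risk of noisy estimates.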
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This work has been funded by the Spanish Ministry of Economy and Competitiveness (TEC2015-69982-R) and Ericsson Spain.
Ericsson, “Ericsson Mobility Report,” Tech. Rep., June 2016, https://www.ericsson.com/res/docs/2016/ericsson-mobility-report-2016.pdf.
A. R. Mishra, “Advanced cellular network planning and optimisation: 2G/2.5G/3G. Evolution to 4G,” John Wiley and Sons, New York, NY, USA, 2007.
I. H. Witten, E. Frank, and M. Hall, “Data Mining: Practical Machine Learning Tools and Techniques,” 2011.
N. Baldo, L. Giupponi, and J. Mangues-Bafalluy, “Big data empowered self organized networks,” in Proceedings of the 20th European Wireless Conference (EW '14), pp. 1–8, May 2014.
C. E. Shannon, Claude Elwood Shannon, IEEE Press, New York, NY, USA, 1993.
C. Mehlführer, M. Wrulich, J. C. Ikuno, D. Bosanska, and M. Rupp, “Simulating the long term evolution physical layer,” in Proceedings of the 17th European Signal Processing Conference (EUSIPCO 2009), pp. 1471–1478, Glasgow, UK, August 2009.
3rd Generation Partnership Project, “TS 36.942 V. 9.0.1; Evolved Universal Terrestrial Radio Access (E-UTRA) Radio Frequency (RF) system scenarios (Release 9),” Tech. Rep., 2009.
H.-J. Su, “On adaptive threshold adjustment with error rate constraints for adaptive modulation and coding systems with hybrid ARQ,” in Proceedings of the 2005 Fifth International Conference on Information, Communications and Signal Processing, pp. 786–790, Bangkok, Thailand, December 2005.
K. Aho, O. Alanen, and J. Kaikkonen, “CQI reporting imperfections and their consequences in LTE networks,” in Proceedings of the 10th Int. Conference Networks, pp. 241–245, Taipei, Taiwan, 2011.
V. Niemi and K. Nyberg, Universal Mobile Telecommunications System Security, John Wiley & Sons, Ltd, Chichester, UK, 2003.
3rd Generation Partnership Project, “TS 32.421; Telecommunication management; Subscriber and equipment trace; Trace concepts and requirements (Release 6),” Tech. Rep., 2012.
3rd Generation Partnership Project, “TS 25.331; Technical Specification Group Radio Access Network; Radio Resource Control (RRC); Protocol specification v11.4.0 (Release 11),” Tech. Rep., 2012.
3rd Generation Partnership Project, “TS 36.413; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access Network (E-UTRAN); S1 Application Protocol (S1AP); v8.4.0; (Release 8),” Tech. Rep., 2008–2012.
3rd Generation Partnership Project, “TS 36.423; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access Network (E-UTRAN); X2 application protocol (X2AP); v9.2.0; (Release 9),” Tech. Rep., 2010–2013.
3rd Generation Partnership Project, “TS 36.331; LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Resource Control (RRC); Protocol specification; v. 10.7.0; (Release 10),” Tech. Rep., 2012.
The R Project for Statistical Computing, https://www.r-project.org/.