Abstract

Bit error rate (BER) minimization and SNR-gap maximization, two robustness optimization problems, are solved under average power and bitrate constraints by the waterfilling policy. Under a peak-power constraint the solutions differ, and this paper gives bit-loading solutions of both robustness optimization problems over independent parallel channels. The study is based on an analytical approach, using a generalized Lagrangian relaxation tool, and on a greedy-type algorithmic approach. Tight BER expressions are used for square and rectangular quadrature amplitude modulations. Integer-bit solutions of the analytical continuous bitrates are obtained with a new generalized secant method. The asymptotic convergence of both robustness optimizations is proved for both the analytical and the algorithmic approaches. We also prove that, in the conventional margin maximization problem, the equivalence between SNR-gap maximization and power minimization does not hold under a peak-power limitation. Based on a defined dissimilarity measure, bit-loading solutions are compared over a Rayleigh fading channel for multicarrier systems. Simulation results confirm the asymptotic convergence of both resource allocation policies. In the nonasymptotic regime, the resource allocation policies can be interchanged depending on the robustness measure and on the operating point of the communication system. The low computational effort leads to a good trade-off between performance and complexity.

1. Introduction

In transmitter design, a problem often encountered is resource allocation among multiple independent parallel channels. The resources can be the power, the bits or the data, and the number of channels. Resource allocation policies are performed under constraints and assumptions, and independent parallel channels are encountered, for example, in multitone transmission.

Independent parallel channels result from orthogonal design applied in time, frequency, or spatial domains [1]. They can either be obtained naturally or in a situation where the transmit and receive strategies are to orthogonalize multiple waveforms. The orthogonal design can also be applied in many communication scenarios when there are multiple transmit and receive dimensions. Orthogonal frequency-division multiplexing (OFDM) and digital multitone (DMT) are two successful commercial applications for wireless and wireline communications with orthogonality in the frequency domain.

To perform resource allocation, relations between the various resources are needed, and one of them is the channel capacity. The capacity of $n$ independent parallel Gaussian channels is the well-known sum of the capacities of each channel
$$\mathcal{C}=\sum_{i=1}^{n}\mathcal{C}_i=\sum_{i=1}^{n}\log_2\left(1+\mathrm{snr}_i\right).\tag{1}$$
This relation, which holds for memoryless channels, links the supremum bitrate $\mathcal{C}_i$, here expressed in bits per two dimensions, to the signal-to-noise ratio $\mathrm{snr}_i$ experienced by each channel or subchannel $i$. Any reliable and implementable system must transmit at a bitrate $r_i$ below the capacity $\mathcal{C}_i$ over each subchannel, and the margin, or SNR-gap, $\gamma_i$ is then introduced to analyze such systems [2, 3]:
$$\gamma_i=\frac{2^{\mathcal{C}_i}-1}{2^{r_i}-1}.\tag{2}$$
This SNR-gap is a convenient mechanism for analyzing systems that transmit below capacity or without Gaussian input, and
$$r_i=\log_2\left(1+\frac{\mathrm{snr}_i}{\gamma_i}\right),\tag{3}$$
with $r_i$ the bitrate in bits per two-dimensional symbol (bits per second per subchannel), which is also the number of bits per constellation symbol.
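As a quick numerical illustration of (1) and (3), the sketch below evaluates the sum capacity and the SNR-gap of a subchannel; the function names (`capacity`, `snr_gap`) are ours, not the paper's.

```python
import math

def capacity(snrs):
    # Sum capacity of independent parallel Gaussian channels, eq. (1),
    # in bits per two dimensions (snrs are linear, not dB).
    return sum(math.log2(1 + s) for s in snrs)

def snr_gap(snr, r):
    # SNR-gap of a subchannel transmitting r bits at signal-to-noise
    # ratio snr, eq. (3) rearranged: gamma = snr / (2^r - 1).
    return snr / (2 ** r - 1)

print(capacity([100.0, 30.0, 10.0]))  # about 15 bits per two dimensions
print(snr_gap(100.0, 4))              # 100/15, i.e. roughly 8.2 dB of margin
```

A gap of 1 (0 dB) corresponds to transmitting at capacity; reliable practical systems operate with $\gamma_i > 1$, as stated above.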

Resource allocation is performed using loading algorithms, and diverse criteria can be invoked to decide which portion of the available resource is allocated to each of the subchannels. From an information theory point of view, the criterion is the mutual information, and the optimal resource allocation under average power constraint was first devised in [4] for Gaussian inputs and later for non-Gaussian inputs [5]. Since the performance measure is the capacity, the SNR-gap in (3) is $\gamma_i=1$ for all $i$. In other cases, $\gamma_i$ is higher than 1, and (3) has been exploited in many optimal and suboptimal resource allocation policies. In fact, resource allocation is a constrained optimization problem, and generally two cases are of practical interest: rate maximization (RM) and margin maximization (MM), where the objective is the maximization of the data rate or bitrate, and the maximization of the system margin (or power minimization in practice), respectively [6]. The MM problem gathers all non-RM problems, including power minimization, margin maximization (in its strict sense), and other measures such as error probabilities or goodput. (In this paper the MM abbreviation refers to the general family of non-RM problems and not only to the margin maximization problem in its strict sense. The expanded form is reserved for margin maximization in its strict sense.) It is not necessary to study all the resource allocation strategies, as equivalences or dualities can be found. Families of approaches are defined, and unified processes have been used [7–10]. The loading algorithms are also split into two families. The first is based on a greedy-type approach to iteratively distribute the discrete resources [11], and the second uses Lagrangian relaxation to solve continuous resource adaptation [12]. Both approaches have been compared in terms of performance and complexity [7, 12–14].
All these adaptive resource allocations are possible when channel state information (CSI) is known at both the transmitter and receiver sides. This CSI can be perfect or imperfect, full or partial. The effects of channel estimation error and feedback delay on the performance of adaptively modulated systems can also be considered in the resource allocation process [15–17].

In this paper we shall focus henceforth on MM problems, and the main contributions are (i) a new resource allocation algorithm and (ii) a comparison of the different resource allocation strategies. It is assumed that the channel estimation is perfect and that feedback CSI delay and overhead are negligible. The considered peak-power constraint, instead of the conventional average power constraint or sum power constraint, results from power mask limitation and has been taken into account in resource allocation problems [13, 18–20]. With this peak-power constraint, each channel must satisfy a power constraint. Note that the sum power constraint is historically the first considered constraint [4]. The bitrate constraint comes from communication applications or service requirements, where different flows can exist, but one of them is chosen at the beginning of the communication. In this configuration, the remaining parameter to optimize is the SNR-gap $\gamma_i$, which is also related to the error probability of the communication system.

Two similar problems of MM have the same objective, that is, to maximize the system robustness. What we call robustness in this paper is the capability of a system to maintain acceptable performance with unforeseen disturbances.

The first measure of robustness is the SNR gap, or system margin, and its maximization ensures protection against unforeseen channel impairments or noise. The system margin maximization is the maximization of the minimal SNR-gap 𝛾𝑖 in (3) over the 𝑛 subchannels. In that case the conventional equivalence between margin maximization and power minimization in MM problems is not generally true. In this paper we show that this equivalence can nevertheless be obtained in particular configurations.

The second robustness measure is the bit error rate (BER) and its minimization can reduce the packet error rate and the data retransmissions. In transmitter design, the BER minimization can be realized using uniform bit-loading and adaptive precoding [21, 22]. Analytical studies have been performed with peak-BER or average BER (computed as arithmetic mean) approaches [15, 17]. With nonuniform bit loading, the average BER must be computed as weighted arithmetic mean, and the resource allocation has been performed using a greedy-type algorithm [23]. The first main contribution of this paper is the analytical solution of the resource allocation problem in the case of weighted arithmetic mean BER minimization.

To perform the analytical study, based on a generalized Lagrangian relaxation tool, we develop a new method for finding roots of functions. This method generalizes the secant method to better fit the shape of the objective function and to speed up the root search. Both robustness policies are compared using a new measure, which evaluates the difference of the bit distributions instead of the bitrates. We also prove that both robustness policies provide the same bit distribution in the asymptotic regime, defined as the high-SNR and high-bitrate regime; this is the second main contribution of this paper. The proof is given in the case of unconstrained modulations (i.e., continuous bitrates and analytical solution) and also for QAM constellations and greedy-type algorithms. The convergence is exemplified by simulation in multicarrier communication systems.

The organization of the paper is as follows. In Section 2, the quantities to be used throughout the paper are introduced, and the robustness optimization problem is formulated in a general way for both system margin maximization and BER minimization. The equivalences between margin maximization and power minimization are worked out. Section 3 presents the considered expressions of accurate BER, the new measure of bit distribution differences, and the new root-finding method. The solutions of the formulated problems are given in Section 4 in the form of an optimum resource allocation policy based on greedy-type algorithms. The conditions of equivalence of both margin maximization and BER minimization are given in this section. Section 5 presents the analytical solution, and both greedy-type and analytical methods are compared in Section 6, which exemplifies the application of robustness optimization to multicarrier communication systems. Finally, Section 7 concludes the paper; the proofs of several results are relegated to the appendices.

Notation. The bitrates $\{r_i\}_{i=1}^n$ are defined as a number of bits per two dimensions, and they are simply given by a number of bits (per tone, i.e., per constellation symbol).

2. Problem Formulation

Consider $n$ parallel subchannels. On the $i$th subchannel, the input–output relationship is
$$Y_i=h_iS_i+W_i,\tag{4}$$
where $S_i$ is the transmitted symbol, $Y_i$ is the received one, and $h_i$ the complex scalar channel gain. The complex Gaussian noise $W_i$ is a proper complex random variable with zero mean and variance equal to $\sigma^2_{W_i}$.

The conventional average power constraint is
$$\frac{1}{n}\sum_{i=1}^{n}E\left[\left|S_i\right|^2\right]\le P,\tag{5}$$
whereas the peak-power constraint, or power spectrum density constraint, considered in this paper is
$$E\left[\left|S_i\right|^2\right]\le P,\quad i=1,\ldots,n.\tag{6}$$

It is convenient to use normalized unit-power symbols $\{X_i\}_{i=1}^n$ such that
$$S_i=\sqrt{p_iP}\,X_i,\tag{7}$$
which leads to the peak-power constraint
$$p_i\le1,\quad i=1,\ldots,n.\tag{8}$$

It is also convenient to introduce two other variables. The first one is the conventional SNR
$$\mathrm{snr}_i=\frac{\left|h_i\right|^2p_iP}{\sigma^2_{W_i}},\tag{9}$$
and the second is called the power spectrum density noise ratio (PSDNR)
$$\mathrm{psdnr}=\frac{1}{n}\sum_{i=1}^{n}\frac{\left|h_i\right|^2P}{\sigma^2_{W_i}},\tag{10}$$
which is the mean signal-to-noise ratio over the $n$ subchannels if and only if $p_i=1$ for all $i$. This PSDNR is the ratio between the power mask at the receiver side (the transmitted power mask through the channel) and the power spectrum density of the noise. The system performance will be given according to this parameter to point out the ability of a system to exploit the available power under peak-power constraint.

Using the previous notations, (3) becomes
$$r_i=\log_2\left(1+\frac{\left|h_i\right|^2p_iP}{\gamma_i\sigma^2_{W_i}}\right).\tag{11}$$
With $p_i/\gamma_i=1$ for all $i$, $r_i$ is the subchannel capacity under power constraint $P$. With unconstrained modulations, $r_i$ is defined in $\mathbb{R}_+$, but constrained modulations are used in practice and $r_i$ takes a finite number of nonnegative values. A noninteger number of bits per symbol can also be used with fractional bit constellations [24, 25]. In this paper, modulations defined by discrete points are used with an integer number of bits per symbol. Typically, $r_i\in\{0,\beta,2\beta,\ldots,r_{\max}\}$, where $\beta$ is the granularity in bits and $r_{\max}$ is the number of bits in the richest available constellation. The peak-power and bitrate constraints are then
$$p_i\le1\ \forall i,\quad\sum_{i=1}^{n}r_i=R,\quad r_i\in\left\{0,\beta,2\beta,\ldots,r_{\max}\right\}\ \forall i.\tag{12}$$
Obviously, the exploitation of the available power leads to $p_i=1$ for all $i$, and the constraint is simplified as
$$\sum_{i=1}^{n}r_i=R,\quad r_i\in\left\{0,\beta,2\beta,\ldots,r_{\max}\right\}\ \forall i.\tag{13}$$
With peak-power and bitrate constraints, the resource allocation strategy is then to use all available power and to optimize the robustness.

The problem we pose is to determine the optimal bitrate allocation $\{r_i\}_{i=1}^n$ that maximizes a robustness measure, or inversely minimizes a frailness measure, under the constraints given in (13). In its general form, this problem can be written as
$$\left[r_1,\ldots,r_n\right]=\underset{\sum_{i=1}^{n}r_i=R,\ r_i\in\{0,\beta,2\beta,\ldots,r_{\max}\}}{\arg\min}\ \phi\left(\left\{r_i\right\}_{i=1}^{n}\right),\tag{14}$$
where $\phi(\cdot)$ is the frailness measure. In this paper, this measure is given by the SNR gap or the BER. In addition to the bitrate allocation, the receiver is presumed to have knowledge of the magnitude and phase of the channel gains $\{h_i\}_{i=1}^n$, whereas the transmitter needs only to know the magnitudes $\{|h_i|\}_{i=1}^n$. The objective is to find the data vector $[r_1,\ldots,r_n]$, which is the final relevant information for the transmitter. The resource allocation can then be computed on the receiver side to reduce the feedback data rate from $n$ real numbers to $n$ finite integer numbers. Furthermore, the integer nature of the data rates allows a full CSI at the transmitter, which is not possible with real numbers.

2.1. System Margin Maximization

The SNR-gap $\gamma_i$ of subchannel $i$ is, from (3),
$$\gamma_i=\frac{\mathrm{snr}_i}{2^{r_i}-1}.\tag{15}$$
With reliable communications, $\gamma_i$ is higher than 1 for all subchannels. Let the system margin, or system SNR-gap, be the minimal value of the SNR gap over the subchannels:
$$\gamma=\min_i\gamma_i.\tag{16}$$
Let $\gamma_{\mathrm{init}}$ be the initial system margin of a communication system ensuring a given QoS, and let $\gamma^*$ be the optimized system margin of this system. Then, the system margin improvement ensures system protection against unforeseen channel impairment or noise, for example, impulse noise; bitrate and system performance targets are still reached for an unforeseen SNR reduction of $\gamma^*/\gamma_{\mathrm{init}}$ over all subchannels. This robustness optimization does not depend on constellation and channel-coding types. The system margin $\gamma$ is defined and optimized without knowledge of the used constellations and coding, and the proposed robustness optimization works for any coding and modulation scheme.

The objective is the maximization of the system margin, which is equivalent to the minimization of $\gamma^{-1}$. We note $\gamma_i(r_i)$ the function that associates $r_i$ to $\gamma_i$. The function $\phi(\cdot)$ in (14) is then given by
$$\phi\left(\left\{r_i\right\}_{i=1}^{n}\right)=\max_i\frac{1}{\gamma_i\left(r_i\right)},\tag{17}$$
$$\left[r_1,\ldots,r_n\right]=\underset{\sum_{i=1}^{n}r_i=R,\ r_i\in\{0,\beta,2\beta,\ldots,r_{\max}\}}{\arg\min}\ \max_i\gamma_i^{-1}.\tag{18}$$
This problem is the inverse problem of bitrate maximization under peak-power and SNR-gap constraints. The solution of the bitrate maximization problem is obvious under the said constraints and given by
$$r_i=\beta\left\lfloor\frac{1}{\beta}\log_2\left(1+\frac{\mathrm{snr}_i}{\gamma_i}\right)\right\rfloor\quad\forall i.\tag{19}$$

Following the conventional SNR-gap approximation [2], the symbol error rate (SER) of QAM depending on the SNR-gap is constellation-size independent, with
$$\mathrm{ser}_i\left(r_i\right)=2\,\mathrm{erfc}\left(\sqrt{\frac{3}{2}\gamma_i\left(r_i\right)}\right),\tag{20}$$
where the complementary error function is usually defined as
$$\mathrm{erfc}(x)=\frac{2}{\sqrt{\pi}}\int_x^{\infty}e^{-t^2}\,dt.\tag{21}$$
The system margin maximization is then equivalent to the peak-SER minimization in the high-SNR regime. Note that, with (16), the system margin maximization can also be called a trough-SNR-gap maximization, and it is strongly related to the peak-power minimization. Whereas the bit-loading solution is the same for power minimization and margin maximization with sum-margin or sum-power constraints, instead of peak constraints, the following lemma gives sufficient conditions for equivalence in the case of peak constraints.

Lemma 1. The bit allocation that maximizes the system margin under the peak-power constraint $\{p_i^{\mathrm{margin}}\}_{i=1}^n$ minimizes the peak power under the SNR-gap constraint $\{\gamma_i^{\mathrm{power}}\}_{i=1}^n$ if $p_i^{\mathrm{margin}}\gamma_i^{\mathrm{power}}=\alpha$ for all $i$.

Proof. It is straightforward using (11) and (18). Both problems have the same expression and therefore the same solution.

This lemma provides a sufficient but not necessary condition for the equivalence of solutions, and it says that if the power and the SNR-gap constraints have proportional distributions for margin maximization and peak-power minimization problems, respectively, then both problems have the same optimal bitrate allocation. In the general case, we cannot conclude that both problems have the same solution.

2.2. BER Minimization

In communication systems, the error rate of the transmitted bits is a conventional robustness measure. By definition, the BER is the ratio between the number of wrong bits and the number of transmitted bits. With a multidimensional system, there exist several BER expressions [15, 23]. Let the BER be evaluated over the transmission of $m$ multidimensional symbols. (We suppose that $m$ is high enough to respect the ergodic condition and to make possible the use of error probability.) In our case, the multidimensional symbols are the symbols sent over the $n$ subchannels. Let $e_i$ be the number of erroneous bits received over subchannel $i$ during the transmission. The BER is then given as
$$\mathrm{ber}=\frac{\sum_{i=1}^{n}e_i}{m\sum_{i=1}^{n}r_i}=\frac{\sum_{i=1}^{n}r_i\left(e_i/mr_i\right)}{\sum_{i=1}^{n}r_i}.\tag{22}$$
The BER over subchannel $i$ is $e_i/mr_i$, and the BER of the $n$ subchannels is then
$$\mathrm{ber}\left(\left\{r_i\right\}_{i=1}^{n}\right)=\frac{\sum_{i=1}^{n}r_i\,\mathrm{ber}_i\left(r_i\right)}{R}\tag{23}$$
with $\mathrm{ber}_i(r_i)$ the function that associates the BER of channel $i$ with the bitrate $r_i$. The BER of multiple variable bitrates $r_i$ is then not the arithmetic mean BER but the weighted mean BER. The weighted mean BER and the arithmetic mean BER are equal if $r_i=r_j\ \forall i,j$ or if $\mathrm{ber}_i=0\ \forall i$. As there exist $\mathrm{ber}_i\neq0$, the weighted mean BER and the arithmetic mean BER are equal if and only if $r_i=r_j\ \forall i,j$. Note that if the number $m$ of transmitted multidimensional symbols depends on the subchannel $i$, (23) does not hold anymore. These obvious results on mean measures are sometimes overlooked, and the arithmetic mean BER is erroneously used instead of the weighted mean BER [15, 17].
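The weighted mean of (23) is straightforward to compute; the sketch below (helper name ours) contrasts it with the arithmetic mean for a nonuniform bit loading, where the two differ as stated above.

```python
def weighted_mean_ber(rates, bers):
    # Weighted mean BER over n subchannels, eq. (23): per-subchannel BERs
    # weighted by their bitrates r_i and normalized by R = sum of r_i.
    R = sum(rates)
    return sum(r * b for r, b in zip(rates, bers)) / R

rates = [6, 4, 2]
bers = [1e-5, 1e-4, 1e-3]
print(weighted_mean_ber(rates, bers))  # bitrate-weighted mean
print(sum(bers) / len(bers))           # arithmetic mean: a different value
```

With a uniform loading (`rates = [4, 4, 4]`) the two means coincide, as noted in the text.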

The function $\phi(\cdot)$ in (14) is then given by
$$\phi\left(\left\{r_i\right\}_{i=1}^{n}\right)=\frac{1}{R}\sum_{i=1}^{n}r_i\,\mathrm{ber}_i\left(r_i\right),\tag{24}$$
$$\left[r_1,\ldots,r_n\right]=\underset{\sum_{i=1}^{n}r_i=R,\ r_i\in\{0,\beta,2\beta,\ldots,r_{\max}\}}{\arg\min}\ \mathrm{ber}\left(\left\{r_i\right\}_{i=1}^{n}\right).\tag{25}$$

To simplify the notations, let $\mathrm{ber}(R)$ be the BER of the system. In the high-SNR regime with Gray mapping, $r_i\,\mathrm{ber}_i(r_i)\approx\mathrm{ser}_i(r_i)$, and the weighted mean BER can then be approximated by the arithmetic mean SER divided by the number of transmitted bits.

Contrary to system margin maximization, the BER minimization needs the knowledge of constellation and coding schemes, and it is based on accurate expressions of BER functions. In this paper, the used constellations are QAM, and the optimization is performed without a channel coding scheme. When dealing with practical coded systems, the ultimate measure is the coded BER and not the uncoded BER. However, the coded BER is strongly related to the uncoded BER. It is then generally sufficient to focus on the uncoded BER when optimizing the uncoded part of a communication system [26].

3. Interludes

Before solving the optimization problem, the BER approximation of QAM is presented. This approximation plays a chief role in BER minimization, and a good approximation is therefore needed. Since this paper deals with bitrate allocation, a measure of the difference between bitrate distributions is also proposed and presented in this section. The section finally presents a new method for finding roots of functions; this method generalizes the secant method and converges faster than the secant one.

3.1. BER Approximation

Conventionally, the BER of square QAM has been approximated either by calculating the symbol error probability or by simply estimating it using lower and upper bounds [27]. This conventional approximation tends to deviate from the exact values when the SNR is low and cannot be applied to rectangular QAM. Exact and general closed-form expressions are developed in [28] for arbitrary one- and two-dimensional amplitude modulation schemes.

An approximate BER expression for QAM can be obtained by neglecting the higher-order terms in the exact closed-form expression [28]:
$$\mathrm{ber}_i\approx\frac{1}{r_i}\left(2-\frac{1}{I_i}-\frac{1}{J_i}\right)\mathrm{erfc}\left(\sqrt{\frac{3}{I_i^2+J_i^2-2}\mathrm{snr}_i}\right)\tag{26}$$
with $I_i=2^{\lceil r_i/2\rceil}$, $J_i=2^{\lfloor r_i/2\rfloor}$, and $r_i=\log_2(I_iJ_i)$. By symmetry, $I_i$ and $J_i$ can be inverted. The BER can also be expressed using the SNR-gap $\gamma_i$. Using (3) and (26), the BER is written as
$$\mathrm{ber}_i\approx\frac{1}{r_i}\left(2-\frac{1}{I_i}-\frac{1}{J_i}\right)\mathrm{erfc}\left(\sqrt{\frac{3\left(I_iJ_i-1\right)}{I_i^2+J_i^2-2}\gamma_i}\right).\tag{27}$$

These two approximations allow the extension of the $\mathrm{ber}_i(r_i)$ function to $\mathbb{R}_+$, which is useful for analytical studies. Figure 1 gives the theoretical BER curves and the approximated ones from binary phase shift keying (BPSK) to 32768-QAM. For BER lower than $5\cdot10^{-2}$, the relative error is lower than 1% for all modulations.
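Approximation (26) can be coded directly; a minimal sketch follows, where `qam_ber_approx` is our naming, `r` is an integer number of bits per symbol, and `snr` is the linear (not dB) signal-to-noise ratio.

```python
import math

def qam_ber_approx(r, snr):
    # Dominant-term BER approximation for an I x J rectangular or square
    # QAM carrying r bits per symbol, eq. (26): I = 2^ceil(r/2),
    # J = 2^floor(r/2), so that I * J = 2^r.
    I = 2 ** math.ceil(r / 2)
    J = 2 ** (r // 2)
    return (1.0 / r) * (2.0 - 1.0 / I - 1.0 / J) * math.erfc(
        math.sqrt(3.0 * snr / (I * I + J * J - 2)))

# For r = 2 (4-QAM) this collapses to the familiar 0.5 * erfc(sqrt(snr/2)).
print(qam_ber_approx(2, 10.0))
```

For square constellations ($r$ even, $I=J$) the expression reduces to the classical $\frac{2}{r}(1-2^{-r/2})\,\mathrm{erfc}(\sqrt{3\,\mathrm{snr}/(2(2^{r}-1))})$ form.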

3.2. Dissimilar Resource Allocation Measure

Two resource allocations can have the same bitrate, but this does not mean that the bitrates per subchannel are the same. To measure the difference in the bit distribution between different resource allocation strategies, we need to evaluate the dissimilarity. This dissimilarity measure must verify the following properties: (1) if two resource allocations lead to the same bit distribution, then the measure of dissimilarity must be null, whereas (2) if two resource allocations lead to two completely different bit distributions in loaded subchannels, then the measure of dissimilarity must be equal to one, and (3) the measure is symmetric; that is, the dissimilarity between the resource allocations 𝑋 and 𝑌 must be the same as the dissimilarity between the resource allocations 𝑌 and 𝑋. We choose that the empty subchannels do not impact the measure.

Definition 2. The dissimilarity measure between the resource allocations $X$ and $Y$ is
$$\mu(X,Y)=\frac{\sum_{i=1}^{n}\delta\left(r_i(X)-r_i(Y)\right)}{\max_{j\in\{X,Y\}}\sum_{i=1}^{n}\delta\left(r_i(j)\right)},\tag{28}$$
where $\delta(x)=1$ if $x\neq0$, else $\delta(x)=0$.

This dissimilarity has the following properties.

Property 1. $\mu(X,Y)=0$ iff $r_i(X)=r_i(Y)\ \forall i$.

Property 2. $\mu(X,Y)=1$ iff, $\forall i$, $r_i(X)\neq r_i(Y)$ or $r_i(X)=r_i(Y)=0$.

Property 3. 𝜇(𝑋,𝑌)=𝜇(𝑌,𝑋).

Property 4. If 𝜇(𝑋,𝑌)=0, then, for all resource allocation 𝑍, 𝜇(𝑋,𝑍)=𝜇(𝑌,𝑍).

All these properties are direct consequences of Definition 2. For a null dissimilarity, $\mu(X,Y)=0$, all the subchannels transmit the same number of bits, that is, $r_i(X)=r_i(Y)\ \forall i$. For a full dissimilarity, $\mu(X,Y)=1$, all the nonempty subchannels of both resource allocations $X$ and $Y$ transmit a different number of bits, that is, $\forall i$ such that $r_i(X)\neq0$ and $r_i(Y)\neq0$, $r_i(X)\neq r_i(Y)$. It is obvious that the measure is symmetric: $\mu(X,Y)=\mu(Y,X)$. If two resource allocations have a null dissimilarity, $\mu(X,Y)=0$, then they are identical, and for any resource allocation $Z$, $\mu(X,Z)=\mu(Y,Z)$. The converse of this last property is not true. Note that the dissimilarity is not defined for two empty resource allocations.

For example, let $n=4$ and $[r_1(X),\ldots,r_4(X)]=[4,3,3,0]$. If $[r_1(Y),\ldots,r_4(Y)]=[3,2,2,2]$ or $[r_1(Y),\ldots,r_4(Y)]=[5,5,0,0]$, then $\mu(X,Y)=1$. If $[r_1(Y),\ldots,r_4(Y)]=[4,3,2,1]$, then $\mu(X,Y)=1/2$. The measure $\mu(X,Y)$ is null if and only if $[r_1(Y),\ldots,r_4(Y)]=[4,3,3,0]$. The dissimilarity does not evaluate the total bitrate differences but only the bit distribution differences; the contribution of two bitrates $r_i(X)$ and $r_i(Y)$ to the dissimilarity measure is independent of the bitrate difference $|r_i(X)-r_i(Y)|$.
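The measure (28) reduces to a few lines of code; the sketch below (function name ours) reproduces the example above.

```python
def dissimilarity(X, Y):
    # Dissimilarity between two bit allocations, eq. (28): the number of
    # subchannels carrying different bitrates, normalized by the largest
    # number of loaded (nonzero) subchannels in X or Y.
    differing = sum(1 for x, y in zip(X, Y) if x != y)
    loaded = max(sum(1 for x in X if x != 0), sum(1 for y in Y if y != 0))
    return differing / loaded

X = [4, 3, 3, 0]
print(dissimilarity(X, [3, 2, 2, 2]))  # 1.0
print(dissimilarity(X, [4, 3, 2, 1]))  # 0.5
print(dissimilarity(X, [4, 3, 3, 0]))  # 0.0
```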

3.3. Generalized Secant Method

There are many numerical methods for finding roots of functions. We propose a new method, called the generalized secant method, that is based on the secant method. This new method better fits the shape of the objective function than the secant method does and thus improves the speed of convergence. Before explaining this new method, a brief overview of the secant method is given.

In our case, the objective function $f(x)$ is monotonic, nondifferentiable, and computable over $x\in[x_1,x_2]$, with $f(x_1)/|f(x_1)|=-f(x_2)/|f(x_2)|$. The secant method is as follows for an increasing function $f(x)$:
(1) $i=0$, $y_0=f(x_1)$;
(2) $x_0=\left(x_2f(x_1)-x_1f(x_2)\right)/\left(f(x_1)-f(x_2)\right)$, $y_{i+1}=f(x_0)$;
(3) if $|y_{i+1}-y_i|\le\epsilon$, then $x_0$ is the root of $f(x)$; else if $y_{i+1}<0$ then $x_1=x_0$, if $y_{i+1}>0$ then $x_2=x_0$; $i\leftarrow i+1$ and go to step 2.
The objective of the secant method is to approximate $f(x)$ by a linear function $g_i(x)=a_ix+b_i$ at each iteration $i$, with $g_i(x_1)=f(x_1)$ and $g_i(x_2)=f(x_2)$, and to set $x_0$ as the root of $g_i(x)$. The search for the root of $f(x)$ is completed when the desired precision $\epsilon$ is reached. The precision is given for $y_i$, but it can also be given for $x_i$.

As the function $f(x)$ is computable, it can be plotted, and an a posteriori simple algebraic or elementary transcendental function $\ell(x)$, invertible over $[x_1,x_2]$, can be used to better fit the function $f(x)$. The a posteriori information is then used to improve the search for the root. The function $f(x)$ is iteratively approximated by $a_i\ell(x)+b_i$ instead of $a_ix+b_i$, where $\ell(x)$ is the invertible function. The method is then given as follows for an increasing function $f(x)$:
(1) $i=0$, $y_0=f(x_1)$;
(2) $x_0=\ell^{-1}\left(\left(\ell(x_2)f(x_1)-\ell(x_1)f(x_2)\right)/\left(f(x_1)-f(x_2)\right)\right)$, $y_{i+1}=f(x_0)$;
(3) if $|y_{i+1}-y_i|\le\epsilon$, then $x_0$ is the root of $f(x)$; else if $y_{i+1}<0$ then $x_1=x_0$, if $y_{i+1}>0$ then $x_2=x_0$; $i\leftarrow i+1$ and go to step 2.
Compared to the secant method, only step 2 differs, and the computation of $x_0$ takes into account the approximated shape $\ell(x)$ of the function $f(x)$.

This generalized secant method is used in Section 5 to find the root of the Lagrangian and is compared to the conventional secant method. In our case, $f(x)$ is a sum of logarithmic functions, and the fitting function $\ell(x)$ is then the logarithmic one.
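A minimal sketch of the steps above follows (names and stopping rule on $|y_{i+1}-y_i|$ follow the text; the function name is ours). Passing `shape(x) = x` recovers the plain secant method, while a logarithmic `shape` mirrors the sum-of-logarithms case used in Section 5.

```python
import math

def generalized_secant(f, x1, x2, shape, shape_inv, eps=1e-10, max_iter=100):
    # Root search for a monotonic increasing f with f(x1) < 0 < f(x2).
    # Each iteration fits a*shape(x) + b through the two bracketing points
    # and takes the root of that fit as the next candidate x0.
    y_prev = f(x1)
    for _ in range(max_iter):
        f1, f2 = f(x1), f(x2)
        # Root of the interpolant a*shape(x) + b (step 2 above):
        x0 = shape_inv((shape(x2) * f1 - shape(x1) * f2) / (f1 - f2))
        y = f(x0)
        if abs(y - y_prev) <= eps:
            return x0
        if y < 0:
            x1 = x0  # root is to the right of x0
        else:
            x2 = x0  # root is to the left of x0
        y_prev = y
    return x0

# f(x) = log2(x) - 3 has its root at x = 8; a log-shaped fit nails it fast.
root = generalized_secant(lambda x: math.log2(x) - 3, 1.0, 20.0,
                          math.log, math.exp)
print(root)
```

Because the example objective is exactly of the form $a\ln x+b$, the logarithmic fit locates the root essentially in one iteration, which is the speed-up the text claims for logarithmic-shaped Lagrangians.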

4. Optimal Greedy-Type Resource Allocations

The general problem is to find the optimal resource allocation $[r_1,\ldots,r_n]$ that minimizes $\phi(\cdot)$, the inverse robustness measure, or frailness. This is a combinatorial optimization, or integer programming, problem. The core idea in this iterative resource allocation is that a sequential approach can lead to a globally optimum discrete loading. Greedy-type methods then converge to the optimal solution. Convexity is not required for the convergence of the algorithm; monotonicity is sufficient [29]. This monotonicity ensures that the removal or addition of $\beta$ bits at each iteration converges to the optimal solution. In this paper the used functions $\phi(\cdot)$ are monotonically increasing.

In its general form, and when the objective function $\phi(\cdot)$ is not simply a weighted sum, the iterative algorithm is as follows:
(1) start with the allocation $[r_1(0),\ldots,r_n(0)]=[0,\ldots,0]$;
(2) $k=0$;
(3) allocate $\beta$ more bits to the subchannel $j$ for which
$$\phi\left(\left\{r_i(k+1)\right\}_{i=1}^{n}\right)\tag{29}$$
is minimal, with $r_j(k+1)=r_j(k)+\beta$ and $r_i(k+1)=r_i(k)\ \forall i\neq j$;
(4) if $\sum_ir_i(k+1)=R$, terminate; otherwise $k\leftarrow k+1$ and go to step 3.
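The steps above can be sketched as follows; the example objective is the inverse system margin of Section 2.1, used here purely as an illustration, and all names are ours.

```python
def greedy_allocate(n, R, beta, phi, r_max):
    # Generic greedy loading (steps 1-4 above): at each step, tentatively
    # add beta bits to every subchannel and keep the move yielding the
    # smallest frailness phi({r_i}), until the target bitrate R is reached.
    rates = [0] * n
    for _ in range(R // beta):
        best_j, best_cost = None, None
        for j in range(n):
            if rates[j] + beta > r_max:
                continue  # richest constellation reached on subchannel j
            rates[j] += beta
            cost = phi(rates)
            rates[j] -= beta
            if best_cost is None or cost < best_cost:
                best_j, best_cost = j, cost
        rates[best_j] += beta
    return rates

# Illustration with the inverse system margin, phi = max_i (2^r_i - 1)/snr_i:
snrs = [100.0, 25.0, 6.0]
phi = lambda rates: max((2 ** r - 1) / s for r, s in zip(rates, snrs))
print(greedy_allocate(3, 6, 1, phi, r_max=10))
```

Each of the $R/\beta$ steps re-evaluates $\phi$ for every candidate subchannel, matching the general (non-weighted-sum) formulation; Lemmas 3 and 4 later replace this full evaluation by cheap per-subchannel metrics.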

The obtained resource allocation is then optimal [29] and solves (14). This algorithm needs $R/\beta$ iterations. The target bitrate $R$ is supposed to be feasible; that is, $R$ is a multiple of $\beta$. Note that an equivalent formulation can be given starting with $r_i(0)=r_{\max}$ for all $i$ and using bit removal instead of bit addition, with maximization instead of minimization. For bitrates higher than $(n/2)r_{\max}$, the number of iterations with bit removal is lower than with bit addition; the opposite is true for bitrates lower than $(n/2)r_{\max}$.

Iterative resource allocations were first applied to bitrate maximization under power constraint [11]. Many works have been devoted to complexity reduction of greedy-type algorithms; see, for example, [6, 12, 30, 31] and references therein. In this section, only greedy-type algorithms are presented in order to compare the analytical resource allocation to the optimal iterative one. Note that the analytical solution can also be used as an input of the greedy-type algorithm, to initialize it and to reduce the number of iterations.

4.1. System Margin Maximization

The system margin, or system SNR gap, maximization under bitrate and peak-power constraints is the inverse problem of the bitrate maximization under SNR-gap and peak-power constraints. This inverse problem has been solved, for example, in [18]. To comply with the general problem formulation, the inverse system margin minimization is presented instead of the system margin maximization.

Lemma 3. Under bitrate and peak-power constraints, the greedy-type resource allocation that minimizes the inverse system margin $\gamma^{-1}$ (16) sequentially allocates $\beta$ bits to the subchannel $i$ bearing $r_i$ bits and for which
$$\frac{2^{r_i+\beta}-1}{\mathrm{snr}_i}\tag{30}$$
is minimum.

Proof. It is straightforward using (17) and (29). See Appendix A for an original proof.

The main advantage of system margin maximization is that the optimal resource allocation can be reached independently of the SNR regime. Resource allocation is always possible even for very low SNR, but it can lead to unreliable communication with an SNR gap lower than 1. Lemma 3 is given for unbounded modulation orders, that is, $r_{\max}=\infty$. With the full constraints (13), the subchannels that reach $r_{\max}$ are simply removed from the iterative process.
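A sketch of Lemma 3 follows; keeping the per-subchannel costs in a binary heap, so that each of the $R/\beta$ steps costs $O(\log n)$, is our implementation choice, not taken from the paper.

```python
import heapq

def margin_greedy(snrs, R, beta=1):
    # Lemma 3: repeatedly give beta bits to the subchannel minimizing
    # (2^(r_i + beta) - 1) / snr_i, metric (30). The heap always holds
    # the cost of the *next* beta-bit addition for every subchannel.
    rates = [0] * len(snrs)
    heap = [((2 ** beta - 1) / s, i) for i, s in enumerate(snrs)]
    heapq.heapify(heap)
    for _ in range(R // beta):
        _, i = heapq.heappop(heap)
        rates[i] += beta
        heapq.heappush(heap, ((2 ** (rates[i] + beta) - 1) / snrs[i], i))
    return rates

print(margin_greedy([100.0, 25.0, 6.0], 6))
```

A bounded modulation order is handled as described above: once a subchannel reaches $r_{\max}$, simply do not push it back onto the heap.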

4.2. BER Minimization

The system BER minimization under bitrate and peak-power constraints is the inverse problem of bitrate maximization under peak-power and BER constraints. This inverse problem has been solved, for example, in [23]. Using (29) and (24), the solution of the BER minimization is straightforward, and the corresponding greedy-type algorithm is also known as the Levin-Campello algorithm [5, 32, 33]. The main drawback of this solution is that it requires good approximate BER expressions even in the low-SNR regime. This constraint can be relaxed, and the following lemma gives the optimal greedy-type resource allocation for BER minimization.

Lemma 4. In the high-SNR regime and under bitrate and peak-power constraints, the greedy-type resource allocation that minimizes the BER minimizes $(r_i+\beta)\,\mathrm{ber}_i(r_i+\beta)$ at each step.

Proof. See Appendix B.

Lemma 4 states how to allocate bits without computing the mean BER at each step. It is given without modulation order limitation. As for the system margin maximization solution, a bounded modulation order is simply taken into account using $r_{\max}$ and subchannel removal.
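A sketch of Lemma 4 for square QAM ($\beta=2$), using the dominant term of (26) as $\mathrm{ber}_i$; the helper names are ours.

```python
import math

def ber_sq(r, snr):
    # Square-QAM BER approximation: dominant term of eq. (26) with
    # I = J = 2^(r/2), so the erfc argument is 3*snr / (2*(2^r - 1)).
    m = 2 ** (r // 2)
    return (2.0 / r) * (1.0 - 1.0 / m) * math.erfc(
        math.sqrt(3.0 * snr / (2.0 * (m * m - 1))))

def ber_greedy(snrs, R, beta=2):
    # Lemma 4: at each step, give beta bits to the subchannel minimizing
    # (r_i + beta) * ber_i(r_i + beta), i.e. the added weighted BER term.
    rates = [0] * len(snrs)
    for _ in range(R // beta):
        j = min(range(len(snrs)),
                key=lambda i: (rates[i] + beta) * ber_sq(rates[i] + beta,
                                                         snrs[i]))
        rates[j] += beta
    return rates

print(ber_greedy([100.0, 25.0, 6.0], 8))
```

Note that, unlike the metric (30) of Lemma 3, this metric needs the BER expression, which is the constellation-dependence discussed in Section 2.2.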

4.3. Comparison of Resource Allocations

To compare the two optimization policies, we call $\mathcal{M}$ the resource allocation that maximizes the system margin and $\mathcal{C}$ the resource allocation that minimizes the BER. Table 1 gives an example of bitrate allocation over 20 subchannels where the SNR follows a Rayleigh distribution and with $\beta=1$. In this example, the PSDNR defined in (10) is equal to 25 dB, and the maximum allowed bitrate per subchannel is never reached. As expected, the system margin maximization leads to a minimal SNR gap, $\min_i\gamma_i$, higher than that provided by the BER minimization policy, with a gain of 0.3 dB. On the other hand, the BER minimization policy leads to a BER lower than that provided by the system margin maximization ($2.6\cdot10^{-5}$ versus $3.1\cdot10^{-5}$). In this example, the dissimilarity is $\mu(\mathcal{M},\mathcal{C})=0.1$, and two subchannels convey different bitrates. All these results are obtained with $r_{\max}=10$.

This example shows that the difference between the resource allocation policies can be small. The question is whether both resource allocations converge and, if they converge, in which cases. The following theorem answers this question.

Theorem 5. In high-SNR regime with square QAM and under bitrate and peak-power constraints, the greedy-type resource allocation that maximizes the system margin converges to the greedy-type resource allocation that minimizes the BER.

Proof. See Appendix D.

The consequence of Theorem 5 is that the dissimilarity between the resource allocation that maximizes the system margin and the resource allocation that minimizes the BER is null in high-SNR regime and with square QAM. With square QAM, 𝛽 should be a multiple of 2. Note that with square modulations, 𝛽 can also be equal to 1 if the modulations are, for example, those defined in ADSL [34]. Figure 6 exemplifies the convergence with 𝛽=2 as we will see later in Section 6.

5. Optimal Analytical Resource Allocations

The analytical method is based on convex optimization theory [35]. Unconstrained modulations lead to bitrates $r_i$ defined in $\mathbb{R}_+$. With $r_i\in\mathbb{R}_+$, the solution is the waterfilling one. With bounded modulation order, that is, $0\le r_i\le r_{\max}$, the solution is quite different from the waterfilling one. The solution is obtained in the framework of generalized Lagrangian relaxation using the Karush-Kuhn-Tucker (KKT) conditions [36].

As the bitrates are continuous, and not only integers, in this analytical analysis, the constraints (13) no longer hold and become
$$\sum_{i=1}^{n} r_i = R,\qquad 0\le r_i\le r_{\max}\ \forall i. \qquad(31)$$
The KKT conditions associated with the general problem (14), with (31) instead of (13), write [36]
$$-r_i \le 0,\quad i=1,\dots,n, \qquad(32)$$
$$r_i - r_{\max} \le 0,\quad i=1,\dots,n, \qquad(33)$$
$$R - \sum_{i=1}^{n} r_i = 0, \qquad(34)$$
$$\mu_i \ge 0,\quad i=1,\dots,n, \qquad(35)$$
$$\nu_i \ge 0,\quad i=1,\dots,n, \qquad(36)$$
$$\mu_i\, r_i = 0,\quad i=1,\dots,n, \qquad(37)$$
$$\nu_i\, (r_i - r_{\max}) = 0,\quad i=1,\dots,n, \qquad(38)$$
$$\frac{\partial}{\partial r_i}\phi\bigl(\{r_j\}_{j=1}^{n}\bigr) - \lambda - \mu_i + \nu_i = 0,\quad i=1,\dots,n, \qquad(39)$$
where $\lambda$, $\mu_i$, and $\nu_i$ are the Lagrange multipliers. The first three equations (32)–(34) represent the primal constraints, (35) and (36) the dual constraints, (37) and (38) the complementary slackness, and (39) the cancellation of the gradient of the Lagrangian with respect to $r_i$. When the primal problem is convex and the constraints are linear, the KKT conditions are sufficient for the solution to be primal and dual optimal. For the system margin maximization problem, the function $\phi(\cdot)$ is convex over all input bitrates and SNR, whereas it is no longer convex for the BER minimization problem. Appendix C gives the convex domain of the function $\phi(\cdot)$ in the case of the BER minimization problem.

The properties of the studied function $\phi(\cdot)$ are such that
$$\frac{\partial}{\partial r_i}\phi\bigl(\{r_j\}_{j=1}^{n}\bigr) = \psi_i(r_i) \qquad(40)$$
is independent of $r_j$ for all $j\ne i$. The optimal solution that solves (32)–(39) is then [36]
$$r_i(\lambda) = \begin{cases} 0, & \text{if } \lambda\le\psi_i(0),\\ \psi_i^{-1}(\lambda), & \text{if } \psi_i(0)<\lambda<\psi_i(r_{\max}),\\ r_{\max}, & \text{if } \lambda\ge\psi_i(r_{\max}), \end{cases} \qquad(41)$$
for all $i=1,\dots,n$, with $\lambda$ verifying the constraint
$$\sum_{i=1}^{n} r_i(\lambda) = R. \qquad(42)$$
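The box-constrained solution (41)–(42) can be sketched in a few lines. As an assumption for illustration, we use the margin-maximization weight of Section 5.3, where $\lambda = 2^{r_i}/\mathrm{snr}_i$, so that $\psi_i^{-1}(\lambda) = \log_2(\lambda\,\mathrm{snr}_i)$; the unconstrained waterfilling bitrate is simply clipped to the box $[0, r_{\max}]$.

```python
import math

def bitrate(lam, snr, r_max):
    # Box-constrained solution (41) for the margin problem, assuming
    # psi_i(r) = 2**r / snr_i, hence psi_i^{-1}(lam) = log2(lam * snr_i):
    # the unconstrained bitrate is clipped to [0, r_max].
    if lam * snr <= 1.0:                 # lam <= psi_i(0) = 1/snr_i
        return 0.0
    return min(math.log2(lam * snr), r_max)

def total_bitrate(lam, snrs, r_max):
    # Left-hand side of the constraint (42).
    return sum(bitrate(lam, s, r_max) for s in snrs)
```

The water level $\lambda$ remains to be tuned so that (42) holds, which is the object of Section 5.3.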

It is worth noting that the above general solution is the waterfilling one if $r_{\max}\ge R$. The waterfilling is also the solution in the following case. Let $\mathcal{I}$ be the subchannel index subset such that
$$\mathcal{I} = \bigl\{\, i \mid r_i\in(0,r_{\max})\,\bigr\}, \qquad(43)$$
and let $R_\mathcal{I}$ be the target bitrate over $\mathcal{I}$. In this subset, $\{r_i\}_{i\in\mathcal{I}}$ are solutions of
$$\frac{\partial}{\partial r_i}\phi\bigl(\{r_j\}_{j=1}^{n}\bigr) - \lambda = 0,\quad i\in\mathcal{I},\qquad R_\mathcal{I} - \sum_{i\in\mathcal{I}} r_i(\lambda) = 0. \qquad(44)$$
This is the solution of (14) with unbounded modulations over the subchannel index subset $\mathcal{I}$. If $\mathcal{I}=\{1,\dots,n\}$ and $R_\mathcal{I}=R$, then (44) is also the solution of (14) with unconstrained modulations.

5.1. System Margin Maximization

Theorem 6. Under bitrate and peak-power constraints, the asymptotic bit allocation which minimizes the inverse system margin is given by
$$r_i = \frac{R_\mathcal{I}}{|\mathcal{I}|} + \frac{1}{|\mathcal{I}|}\sum_{j\in\mathcal{I}} \log_2\frac{\mathrm{snr}_i}{\mathrm{snr}_j},\quad \forall i\in\mathcal{I}. \qquad(45)$$

Proof. See Appendix E.

The solution given by Theorem 6 holds for high modulation orders, which define the asymptotic regime; cf. Appendix E. If the set $\mathcal{I}$ is known, then Theorem 6 can be used directly to allocate the subchannel bitrates. Otherwise, $\mathcal{I}$ should be found first.

The expression of $r_i$ in Theorem 6 is a function of the target bitrate $R_\mathcal{I}$, the number $|\mathcal{I}|$ of subchannels, and the ratios of SNR. This expression is independent of the mean received SNR or PSDNR. It does not depend on the link budget but only on the relative distribution of the subchannel coefficients $\{|h_i|^2\}_{i=1}^{n}$.
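The closed form (45) is immediate to evaluate. The sketch below assumes, for simplicity, that every subchannel is active, that is, $\mathcal{I}=\{1,\dots,n\}$ and $0<r_i<r_{\max}$ holds afterwards; it also makes the link-budget independence visible, since scaling all SNRs by a common factor leaves the allocation unchanged.

```python
import math

def asymptotic_bitrates(snrs, R):
    # Closed form (45) of Theorem 6, assuming the active set I is
    # {1, ..., n}: r_i = R/n + log2(snr_i) - mean_j log2(snr_j).
    n = len(snrs)
    mean_log_snr = sum(math.log2(s) for s in snrs) / n
    return [R / n + math.log2(s) - mean_log_snr for s in snrs]
```

By construction the bitrates sum to $R$, and multiplying every SNR by the same constant (a link-budget change) shifts $\log_2 \mathrm{snr}_i$ and its mean identically, so the allocation is unchanged.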

5.2. BER Minimization

The arithmetic mean BER minimization has been analytically solved, for example, in [22, 37]. This arithmetic mean measure needs to employ the same number of bits per constellation which limits the system efficiency. The following theorem gives the solution of the weighted mean BER minimization that allows variable constellation sizes in the multichannel system.

Theorem 7. Under bitrate and peak-power constraints, the asymptotic bit allocation which minimizes the BER is given by
$$r_i = \frac{R_\mathcal{I}}{|\mathcal{I}|} + \frac{1}{|\mathcal{I}|}\sum_{j\in\mathcal{I}} \log_2\frac{\mathrm{snr}_i}{\mathrm{snr}_j},\quad \forall i\in\mathcal{I}, \qquad(46)$$
with equal in-phase and quadrature bitrates.

Proof. See Appendix F.

The solution given by Theorem 7 holds for high modulation orders and for subchannel BER lower than 0.1; these parameters define the asymptotic regime in this case, cf. Appendix F. The optimal asymptotic resource allocation leads to square QAM, with the bitrate $r_i$ split equally between the in-phase and quadrature components of the signal of subchannel $i$. It is important to note that, in the asymptotic regime, BER minimization and system margin maximization lead to the same subchannel bitrate allocation. In that case, the asymptotic regime is defined by the more stringent context, which is the BER minimization. As we will see in Section 6, this asymptotic behavior can be observed when $\beta=2$.

The main drawback of the formulas in Theorems 6 and 7 is that the subset $\mathcal{I}$ must be known. To find this subset, the negative subchannel bitrates and those higher than $r_{\max}$ should be clipped, and $\mathcal{I}$ can be found iteratively [18]. But clipping negative bitrates first can decrease those higher than $r_{\max}$, and clipping bitrates higher than $r_{\max}$ first can increase the negative ones. It is therefore not possible to apply the waterfilling solution first and then clip the bitrates $r_i$ greater than $r_{\max}$ to converge to the optimal solution. Finding the set $\mathcal{I}$ requires many comparisons, and we propose a fast iterative solution based on the generalized secant method.

5.3. Lagrangian Resolution

To solve (41), numerical iterative methods are required. It is important to observe that the function defined in (41) is not differentiable; thus, methods like Newton's cannot be used [18]. We use the proposed generalized secant method to better fit the shape of the function and increase the speed of convergence. An important point for the iterative method is that the initialization values must lead to a feasible solution and should be as close as possible to the final solution.

The root of the function defined by (42) is now calculated. Let
$$f(\lambda) = \sum_{i=1}^{n} r_i(\lambda) - R. \qquad(47)$$
Theorems 6 and 7 show that $r_i(\lambda)$ is a sum of $\log_2(\cdot)$ functions. This is the reason why the function $\log_2(\cdot)$ is used in the generalized secant method. Figure 2 shows three functions versus the parameter $\lambda$. The first is the input function $f(\lambda)$, the second is the function used by the generalized secant method, and the last is the linear function used by the secant method. In this example, the common points are $\lambda=0$ and $\lambda=2.3$. As shown, the generalized secant method fits the input function better than the secant method and can therefore improve the speed of convergence to find the root, which is around $\lambda=1/80$ in this example.

To ensure the convergence of the secant methods, the algorithm should be initialized with $\lambda_1$ and $\lambda_2$ such that $f(\lambda_1)<0$ and $f(\lambda_2)>0$. For both optimization problems, system margin maximization and BER minimization, the parameter $\lambda$ is given by the function $\psi_i(r_i)$, and it can be reduced to $\lambda = 2^{r_i}/\mathrm{snr}_i$, as shown in Appendices E and F. The parameters $\{\lambda_1,\lambda_2\}$ are then chosen as
$$\lambda_1 = \frac{1}{\max_i \mathrm{snr}_i},\qquad \lambda_2 = \frac{2^{r_{\max}}}{\min_i \mathrm{snr}_i}. \qquad(48)$$
Using (41), $\lambda\le\lambda_1$ leads to $r_i(\lambda)=0$ for all $i$, and $\lambda\ge\lambda_2$ leads to $r_i(\lambda)=r_{\max}$ for all $i$. It then follows that $f(\lambda_1)<0$ and $f(\lambda_2)>0$ if $R\in(0,\,n\,r_{\max})$.

Figure 3 shows the number of iterations needed for the convergence of the generalized and conventional secant methods versus the target bitrate $R$. Results are given over a Rayleigh distribution of the subchannel SNR with 1024 subchannels; the possible bitrates are then $R\in[0,\,n\times r_{\max}]$, and $\beta=2$. Here, $r_{\max}=15$ and then $R\le 15360$ bits per multidimensional symbol. For comparison, the number of iterations needed by the greedy-type algorithm is also plotted. Note that the greedy-type algorithm can start with empty bitrates or with full bitrates limited by $r_{\max}$ for each subchannel; the number of iterations is then given by $\min\{R,\,n\,r_{\max}-R\}$. The iterative secant and generalized secant methods are stopped when the bitrate error is lower than 1. Better precision is not necessary since exact bitrates $\{r_i\}_{i\in\mathcal{I}}$ can be computed using Theorems 6 and 7 when $\mathcal{I}$ is known. As shown in Figure 3, the generalized secant method converges faster than the secant method, except for very low target bitrates $R$. For very high target bitrates, close to $n\times r_{\max}$, the number of iterations with the generalized secant method can be higher than that with the greedy-type algorithm. Except for these particular cases, the generalized secant method needs no more than 4-5 iterations to converge. In conclusion, with a Rayleigh distribution of $\{\mathrm{snr}_i\}_{i=1}^{n}$ and for target bitrates $R$ such that $3\%\le R/(n\,r_{\max})\le 97\%$, the generalized secant method converges faster than the secant method or the greedy-type algorithm.
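The generalized secant iteration can be sketched as follows. The exact fitting function of the paper is not reproduced here; as an assumption, we fit $a\log_2\lambda + b$ through the two current bracketing points instead of the straight line of the classical secant method, which is motivated by $f(\lambda)$ being a sum of $\log_2(\cdot)$ terms, and we keep the bracket $f(\lambda_1)<0<f(\lambda_2)$ at each step.

```python
import math

def generalized_secant(f, lam1, lam2, tol=1.0, max_iter=50):
    # Root search for f(lam) = sum_i r_i(lam) - R of (47).  A curve
    # a*log2(lam) + b is fitted through the two current points (assumed
    # realization of the generalized secant method); its root replaces
    # the worse bracket end.  Stops when the bitrate error |f| < tol.
    f1, f2 = f(lam1), f(lam2)
    assert f1 < 0 < f2
    lam = lam1
    for _ in range(max_iter):
        a = (f2 - f1) / (math.log2(lam2) - math.log2(lam1))
        lam = lam1 * 2.0 ** (-f1 / a)        # root of the fitted log2 curve
        fl = f(lam)
        if abs(fl) < tol:
            break
        if fl < 0:
            lam1, f1 = lam, fl
        else:
            lam2, f2 = lam, fl
    return lam

# Toy usage with the clipped bitrate of (41) for the margin problem
# (assumed psi as in Section 5.3) and the initialization (48).
def r_i(lam, snr, r_max=10):
    return 0.0 if lam * snr <= 1.0 else min(math.log2(lam * snr), r_max)

snrs = [2.0 ** k for k in range(1, 9)]       # illustrative SNR spread
R = 20.0
f = lambda lam: sum(r_i(lam, s) for s in snrs) - R
root = generalized_secant(f, 1.0 / max(snrs), 2.0 ** 10 / min(snrs))
```

On this toy profile, $f$ is piecewise linear in $\log_2\lambda$, so the log-domain fit is exact on each segment and the bracket shrinks in very few iterations.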

Using the generalized secant method, the bitrates are not integers, and $r_i\in[0,r_{\max}]$ for all $i$. These solutions have to be completed to obtain integer bitrates.

5.4. Integer-Bit Solution

Starting from the continuous bitrate allocations previously presented, a loading procedure is developed that takes into account the integer nature of the bitrates to be allotted. A simple solution is to consider the integer part of $\{r_i\}_{i\in\mathcal{I}}$ and to complete it with a greedy-type algorithm to achieve the target bitrate $R$; the integer part of $\{r_i\}_{i\in\mathcal{I}}$ is then used as a starting point for the greedy algorithm. This procedure can lead to a high number of iterations. Therefore, the secant or bisection methods are suitable to reduce the number of iterations. The problem to solve is then to find the root of the following function [18]:
$$g(\alpha) = \sum_{i\in\mathcal{I}} \bigl\lfloor r_i + \beta\alpha \bigr\rfloor - R_\mathcal{I}, \qquad(49)$$
where $r_i$, $\mathcal{I}$, and $R_\mathcal{I}$ are given by the continuous Lagrangian solution. This is a suboptimal integer-bitrate problem; the optimal one needs to find $\{\alpha_i\}_{i=1}^{n}$ instead of a unique $\alpha$. As the optimal solution leads to a huge number of iterations, it is not considered. The function (49) is a nondecreasing and nondifferentiable staircase function such that $g(0)<0$ and $g(1)>0$ because $\sum_{i\in\mathcal{I}} r_i = R_\mathcal{I}$. The iterative methods can then be initialized with $\alpha_1=0$ and $\alpha_2=1$.

Two iterative methods are compared: the bisection one and the secant one. Both are also compared to the greedy-type algorithm. Figure 4 presents the number of iterations of the three methods to solve the integer-bit problem of the Lagrangian solution with $\beta=1$. Results are given over a Rayleigh distribution of the subchannel SNR with 1024 subchannels, and the target bitrates are between 0 and $n\times r_{\max}=15360$. As shown, the convergence is faster with the bisection method than with the greedy-type algorithm. For target bitrates between 10% and 90% of the maximal loadable bitrate, the secant method outperforms the bisection one with a mean number of iterations around 4, whereas the number of iterations of the bisection method is higher than 8. Figure 4 also shows that $|g(0)|$ is always lower than half the number of subchannels, and around this value for target bitrates between 10% and 90% of the maximal loadable bitrate. Then, if the number of iterations induced by the greedy-type algorithm to solve the integer-bit problem of the Lagrangian solution is acceptable in a practical communication system, this greedy-type completion can be used and appears to lead to the optimal resource allocation. This result, obtained without proof, means that the greedy-type procedure has enough bits to converge to the optimal solution. If the number of iterations induced by the greedy-type algorithm is too high (this number is around $n/2$), the secant method can be used.
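With $\beta=1$, the root of the staircase (49) can also be located directly, since $g$ gains exactly one bit each time $\alpha$ crosses a threshold $1-\mathrm{frac}(r_i)$: rounding up the subchannels with the largest fractional parts is equivalent to the bisection/secant search. The sketch below uses this shortcut as an illustration, not as the paper's procedure.

```python
import math

def integer_completion(r, R):
    # Integer-bit completion of the continuous solution {r_i}, beta = 1.
    # g(alpha) = sum_i floor(r_i + alpha) - R of (49) increases by one bit
    # at each threshold alpha_i = 1 - frac(r_i), so its root is reached by
    # rounding up the k subchannels with the largest fractional parts
    # (a sketch equivalent to the bisection/secant search of the text).
    base = [math.floor(x) for x in r]
    k = round(R - sum(base))                 # bits still to allot
    order = sorted(range(len(r)), key=lambda i: r[i] - base[i], reverse=True)
    for i in order[:k]:
        base[i] += 1
    return base
```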

The overall analytical resolution of (14) needs few iterations compared to the optimal greedy-type algorithm. Whereas the continuous solution of (14) is optimal, the analytical integer bitrate solution is suboptimal.

6. Greedy-Type versus Analytical Resource Allocations

In the previous section, the numbers of iterations of the algorithms have been compared. In this section, a robustness comparison is presented, and the analytical solutions obtained in the asymptotic regime are also applied in the nonasymptotic regime, which means that $\beta=1$ and modulation orders can be low.

The evaluated OFDM communication system is composed of 1024 subcarriers without interference between the symbols or the subcarriers. The channel is a Rayleigh fading one with independent and identically distributed elements. The richest modulation order is $r_{\max}=10$. The robustness measures are evaluated for different target bitrates, which are given by the following arbitrary equation:
$$R = \sum_{i=1}^{n} \min\Bigl\{ \Bigl\lfloor \log_2\Bigl(1+\frac{\mathrm{snr}_i}{2}\Bigr) \Bigr\rfloor,\; r_{\max} \Bigr\}. \qquad(50)$$
This equation ensures reliable communications for all the input target bitrates or PSDNR. The empirical relationship between PSDNR and target bitrate is given in Figure 5.

Figure 6 presents the output BER and the system margin of three resource allocation policies versus the target bitrate $R$. The first one, $\mathcal{A}$, is obtained using the analytical optimization; the second, $\mathcal{B}$, is the solution of the greedy-type algorithm which maximizes the system margin; and the third, $\mathcal{C}$, is the solution of the greedy-type algorithm which minimizes the BER. Two cases are presented: one with $\beta=1$ and the other with $\beta=2$. All subchannel BER are lower than $2\times10^{-2}$ to use valid BER approximations. Note that, with $\beta=1$, the system margin of allocation $\mathcal{B}$ is almost equal to 8.9 dB for all target bitrates. This constant system margin $\gamma$ is not a feature of the algorithm but only a consequence of the relation between the target bitrate and the PSDNR.

To highlight the equivalences and the differences between the resource allocation policies, the dissimilarity is also given in Figure 6 with $\beta=1$ and $\beta=2$. As expected in both cases, $\beta=1$ and $\beta=2$, the minimal BER is obtained with allocation $\mathcal{C}$, and the maximal system margins with allocation $\mathcal{B}$.

With $\beta=1$ and when the target bitrate increases, the Lagrangian solution converges faster to the optimal system margin maximization solution, $\mathcal{B}$, than to the optimal BER minimization solution, $\mathcal{C}$. Note that Theorem 7 is an asymptotic result valid for square QAM. With $\beta=1$, the QAM can be rectangular, and the asymptotic result of Theorem 7 is not applicable, contrary to the result of Theorem 6, where there is no condition on the modulation order.

The case $\beta=2$ shows the equivalence between the optimal system margin maximization allocation and the optimal BER minimization allocation. In this case, the asymptotic results given by Theorems 5 and 7 can be applied because the modulations are square QAM, and the convergence is ensured with high modulation orders, that is, high target bitrates. Beyond a mean bitrate per subchannel of around 6, which corresponds to a target bitrate around 6000, all the allocations $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ are equivalent, and the dissimilarity is almost equal to zero. In the nonasymptotic regime, the differences in BER and system margin are low: the system margin differences are lower than 1 dB, and the ratios between two BER are around 3. In practical integrated systems, these low differences will not be significant and will lead to similar solutions for both optimization policies. Therefore, these resource allocations can be interchanged.

7. Conclusion

Two robustness optimization problems have been analyzed in this paper. Weighted mean BER minimization and minimal subchannel margin maximization have been solved under peak-power and bitrate constraints. The asymptotic convergence of both robustness optimizations has been proved for the analytical and algorithmic approaches. In the nonasymptotic regime, the resource allocation policies can be interchanged depending on the robustness measure and the operating point of the communication system. We have also proved that the equivalence between SNR-gap maximization and power minimization in the conventional MM problem does not hold with peak-power limitation without additional conditions. The integer-bit solution of the analytical continuous bitrates has been obtained with a new generalized secant method, and bit-loading solutions have been compared with a newly defined dissimilarity measure. The low computational effort of the suboptimal resource allocation strategy, based on the analytical approach, leads to a good trade-off between performance and complexity.

Appendices

A. Proof of Lemma 3

We prove that the optimal allocation is reached starting from empty loading through the same intermediate loadings as when starting from the optimal loading down to empty loading. To simplify the notation and without loss of generality, $\beta=1$.

Let $[r_1,\dots,r_n]$ be the optimal allocation that minimizes the inverse system margin $\gamma(R)^{-1}$ for the target bitrate $R$; then
$$\gamma(R)^{-1} = \max_i \frac{2^{r_i}-1}{\mathrm{snr}_i}. \qquad(\text{A.1})$$
Let $[r'_1,\dots,r'_n]$ be the optimal allocation that minimizes the inverse system margin $\gamma(R+1)^{-1}$ for the target bitrate $R+1$. The optimal allocation for target bitrate $R$ is obtained iteratively by removing one bit at a time from the subchannel $k$ with the highest inverse system margin [38]
$$k = \arg\max_i \frac{2^{r'_i}-1}{\mathrm{snr}_i} \qquad(\text{A.2})$$
or
$$\frac{2^{r'_k}-1}{\mathrm{snr}_k} \ge \frac{2^{r'_i}-1}{\mathrm{snr}_i},\quad i=1,\dots,n. \qquad(\text{A.3})$$
The last bit removed is from the subchannel with the lowest inverse SNR, $\mathrm{snr}_i^{-1}$, because the bits over the highest inverse SNR are removed first.

Now, let $[\tilde r_1,\dots,\tilde r_n]$ be the optimal allocation that minimizes the inverse system margin $\gamma(\tilde R)^{-1}$ for a target bitrate $\tilde R<R$. Following the algorithm strategy, the optimal allocation for target bitrate $\tilde R+1$ is obtained by adding one bit on subchannel $j$ such that
$$j = \arg\min_i \frac{2^{\tilde r_i+1}-1}{\mathrm{snr}_i}. \qquad(\text{A.4})$$
We first prove that
$$\gamma(\tilde R+1)^{-1} = \frac{2^{\tilde r_j+1}-1}{\mathrm{snr}_j}. \qquad(\text{A.5})$$
Suppose that there exists $j'$ such that
$$\frac{2^{\tilde r_{j'}}-1}{\mathrm{snr}_{j'}} > \frac{2^{\tilde r_j+1}-1}{\mathrm{snr}_j}; \qquad(\text{A.6})$$
then one bit must be added to subchannel $j$ to obtain $\tilde r_j+1$ bits before adding one bit to subchannel $j'$ to obtain $\tilde r_{j'}$ bits, which means that $[\tilde r_1,\dots,\tilde r_n]$ is not optimal. As $[\tilde r_1,\dots,\tilde r_n]$ is optimal by definition, it yields
$$\frac{2^{\tilde r_i}-1}{\mathrm{snr}_i} \le \frac{2^{\tilde r_j+1}-1}{\mathrm{snr}_j},\quad i=1,\dots,n, \qquad(\text{A.7})$$
which proves (A.5). The first allocated bit is on the subchannel with the lowest inverse SNR, given by (A.4) with $\tilde r_i=0$ for all $i$.

Comparing (A.3) with (A.7) yields $k=j$: the subchannel index of the first added bit is the same as that of the last removed bit. All the intermediate allocations are then identical with the bit-addition and bit-removal methods. There exists only one way to reach the optimal allocation of target bitrate $R$ starting from the empty loading.
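The equivalence of the two directions can be checked numerically. The sketch below implements the bit-addition metric (A.4) and the bit-removal metric (A.2) on a small toy profile (the SNR values are illustrative assumptions) and verifies that both greedy procedures end in the same allocation.

```python
def greedy_add(snrs, R, r_max):
    # Bit-addition greedy: each new bit goes to the subchannel whose
    # resulting inverse SNR gap (2^(r+1) - 1)/snr is the smallest (A.4).
    r = [0] * len(snrs)
    for _ in range(R):
        j = min((i for i in range(len(snrs)) if r[i] < r_max),
                key=lambda i: (2 ** (r[i] + 1) - 1) / snrs[i])
        r[j] += 1
    return r

def greedy_remove(snrs, R, r_max):
    # Bit-removal greedy: start from full loading and remove the bit of the
    # subchannel with the highest inverse SNR gap (A.2) until R is reached.
    r = [r_max] * len(snrs)
    for _ in range(len(snrs) * r_max - R):
        k = max((i for i in range(len(snrs)) if r[i] > 0),
                key=lambda i: (2 ** r[i] - 1) / snrs[i])
        r[k] -= 1
    return r
```

With distinct metrics (no ties), both directions visit the same intermediate loadings, as the lemma states.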

Proof of Lemma 3 can also be provided in the framework of matroid algebraic theory [19, 39].

B. Proof of Lemma 4

To simplify the notation and without loss of generality, the proof is given with $\beta=1$. Let $[r_1,\dots,r_n]$ be the optimal allocation for the target bitrate $R$ such that $\sum_i r_i = R$. Let $R+1$ be the new target bitrate. We first prove that
$$\Delta_i(r_i) = (r_i+1)\,\mathrm{ber}_i(r_i+1) - r_i\,\mathrm{ber}_i(r_i) \qquad(\text{B.1})$$
is a good measure at each step of the greedy-type algorithm for the BER minimization, and finally that $(r_i+1)\,\mathrm{ber}_i(r_i+1)$ can be used instead of $\Delta_i(r_i)$.

Starting from the optimal allocation of target bitrate $R$, the new target bitrate $R+1$ is obtained by increasing $r_j$ by one bit:
$$\mathrm{ber}(R+1) = \frac{(r_j+1)\,\mathrm{ber}_j(r_j+1) + \sum_{i=1,\,i\ne j}^{n} r_i\,\mathrm{ber}_i(r_i)}{1+\sum_{i=1}^{n} r_i} \qquad(\text{B.2})$$
and, using $\Delta_j$,
$$\mathrm{ber}(R+1) = \frac{\Delta_j(r_j) + R\,\mathrm{ber}(R)}{R+1}. \qquad(\text{B.3})$$
The $\mathrm{ber}(R+1)$, which is equal to $\phi(\{r_i(k+1)\}_{i=1}^{n})$ in (29), is minimized only if $\Delta_j(r_j)$ is minimized. The minimum $\mathrm{ber}(R+1)$ is then obtained with the increase of one bit in the subchannel $j$ such that
$$j = \arg\min_i \Delta_i(r_i). \qquad(\text{B.4})$$

To complete the proof by induction, the relation must be true for $\mathrm{ber}(1)$. This is simply done by recalling that $\mathrm{ber}_i(0)=0$, and then
$$\min \mathrm{ber}(1) = \min_i \mathrm{ber}_i(1) = \min_i \Delta_i(0). \qquad(\text{B.5})$$
The convergence of the algorithm to a unique solution needs the convexity of the function $r_i \mapsto r_i\,\mathrm{ber}_i(r_i)$. This convexity is verified at high SNR; Appendix C provides a more precise domain of validity.

It remains to prove that $(r_i+1)\,\mathrm{ber}_i(r_i+1)$ can be used instead of $\Delta_i(r_i)$. In the high-SNR regime,
$$\mathrm{ber}_i(r_i+1) \gg \mathrm{ber}_i(r_i), \qquad(\text{B.6})$$
and then
$$\lim_{\mathrm{snr}_i\to+\infty} \Delta_i(r_i) = (r_i+1)\,\mathrm{ber}_i(r_i+1), \qquad(\text{B.7})$$
which proves the lemma.

In the low-SNR regime, the approximation of $\Delta_i$ by $(r_i+1)\,\mathrm{ber}_i(r_i+1)$ remains valid: the dissimilarity between the allocation using $\Delta_i$ (B.1) and the allocation using $(r_i+1)\,\mathrm{ber}_i(r_i+1)$ is null in the domain of validity given in Appendix C.

C. Range of Convexity of $r_i\,\mathrm{ber}_i(r_i)$

Let
$$f:\ \mathbb{R}_+\to\mathbb{R}_+,\qquad r_i \mapsto r_i\,\mathrm{ber}_i(r_i,\mathrm{snr}_i), \qquad(\text{C.1})$$
which equals the SER in the high-SNR regime and with Gray mapping. The function $f$ is strictly increasing, $f(r_i)<f(r_i+1)$ for all $\mathrm{snr}_i$, because $\mathrm{ber}(r_i,\mathrm{snr}_i)\le\mathrm{ber}(r_i+1,\mathrm{snr}_i)$ and $r_i<r_i+1$. Let $\Delta(r_i)=f(r_i+1)-f(r_i)$; then
$$\Delta(r_i+1)-\Delta(r_i) = f(r_i+2)-2f(r_i+1)+f(r_i) \ge (r_i+1)\bigl[\mathrm{ber}_i(r_i+2)-2\,\mathrm{ber}_i(r_i+1)\bigr]. \qquad(\text{C.2})$$
If $\mathrm{ber}_i(r_i+2)\ge 2\,\mathrm{ber}_i(r_i+1)$, then the function $f$ is locally convex or defines a convex hull. This relation is verified for BER lower than $2\times10^{-2}$ and for all $r_i\ge 0$.
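The sufficient condition of (C.2) can be checked numerically. The sketch below assumes the usual tight square-QAM Gray-mapping approximation, $\mathrm{ber}(r)\approx(2/r)(1-2^{-r/2})\,\mathrm{erfc}\sqrt{3\,\mathrm{snr}/(2(2^r-1))}$, consistent with (D.4); the SNR grid is illustrative.

```python
import math

def ber(r, snr):
    # Assumed square-QAM Gray-mapping approximation, consistent with (D.4):
    # ber(r) ~ (2/r)(1 - 2^(-r/2)) erfc(sqrt(3*snr / (2*(2^r - 1)))).
    return (2.0 / r) * (1.0 - 2.0 ** (-r / 2.0)) * math.erfc(
        math.sqrt(3.0 * snr / (2.0 * (2 ** r - 1))))

def convexity_holds(r, snr):
    # Sufficient condition of (C.2): ber(r+2) >= 2 * ber(r+1).
    return ber(r + 2, snr) >= 2.0 * ber(r + 1, snr)
```

Sweeping operating points whose BER stays below $2\times10^{-2}$, the condition holds, in line with the stated range of validity.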

D. Proof of Theorem 5

We prove that both metrics used in Lemmas 3 and 4 lead to the same subchannel SNR ordering. Let
$$f(r_i,\mathrm{snr}_i) = \frac{2^{r_i+\beta}-1}{\mathrm{snr}_i},\qquad g(r_i,\mathrm{snr}_i) = (r_i+\beta)\,\mathrm{ber}_i(r_i+\beta). \qquad(\text{D.1})$$
We then have to prove that
$$f(r_i,\mathrm{snr}_i) \le f(r_j,\mathrm{snr}_j) \iff g(r_i,\mathrm{snr}_i) \le g(r_j,\mathrm{snr}_j). \qquad(\text{D.2})$$
It is straightforward that
$$f(r_i,\mathrm{snr}_i) \le f(r_j,\mathrm{snr}_j) \iff \frac{\mathrm{snr}_j}{\mathrm{snr}_i} \le \frac{2^{r_j+\beta}-1}{2^{r_i+\beta}-1}. \qquad(\text{D.3})$$

With square QAM, in the high-SNR regime and using (26),
$$g(r_i,\mathrm{snr}_i) = 2\bigl(1-2^{-(r_i+\beta)/2}\bigr)\,\mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{2\,(2^{r_i+\beta}-1)}}, \qquad(\text{D.4})$$
and it can be approximated by the following valid expression:
$$g(r_i,\mathrm{snr}_i) = 2\,\mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{2\,(2^{r_i+\beta}-1)}}. \qquad(\text{D.5})$$
Then,
$$g(r_i,\mathrm{snr}_i) \le g(r_j,\mathrm{snr}_j) \iff \frac{\mathrm{snr}_j}{\mathrm{snr}_i} \le \frac{2^{r_j+\beta}-1}{2^{r_i+\beta}-1}, \qquad(\text{D.6})$$
which is also given by the first inequality. In the high-SNR regime and with square QAM, that is, $\beta=2$, $f(\cdot)$ and $g(\cdot)$ lead to the same subchannel SNR ordering, and then
$$\arg\min_i f(r_i,\mathrm{snr}_i) = \arg\min_i g(r_i,\mathrm{snr}_i). \qquad(\text{D.7})$$
This last equation does not hold in the low-SNR regime (the BER approximation is not valid) or when the modulations are not square, that is, when $r_i$ is odd. Note that (D.5) is not only a good approximation in the high-SNR regime; it can also be used with high modulation orders in the moderate-SNR regime defined in Appendix C.
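The ordering equivalence is easy to observe numerically: in (D.5), $g$ is $2\,\mathrm{erfc}\sqrt{3/(2f)}$, an increasing function of $f$, so sorting the subchannel states by either metric gives the same order. The $(r_i,\mathrm{snr}_i)$ pairs below are illustrative assumptions.

```python
import math

def f_metric(r, snr, beta=2):
    # Inverse SNR-gap metric of (D.1).
    return (2 ** (r + beta) - 1) / snr

def g_metric(r, snr, beta=2):
    # Approximation (D.5): g ~ 2 * erfc(sqrt(3*snr / (2*(2^(r+beta) - 1)))).
    return 2.0 * math.erfc(
        math.sqrt(3.0 * snr / (2.0 * (2 ** (r + beta) - 1))))

# Since g = 2*erfc(sqrt(3/(2f))) and erfc is decreasing, g increases
# with f: both metrics order the subchannel states identically.
states = [(0, 40.0), (2, 90.0), (4, 700.0), (6, 2500.0), (2, 60.0)]
by_f = sorted(range(len(states)), key=lambda i: f_metric(*states[i]))
by_g = sorted(range(len(states)), key=lambda i: g_metric(*states[i]))
```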

E. Proof of Theorem 6

As the infinite norm is not differentiable, we use the $\ell_k$ norm, with
$$\lim_{k\to+\infty} \Bigl(\sum_i \gamma_i^{-k}\Bigr)^{1/k} = \max_i \gamma_i^{-1}. \qquad(\text{E.1})$$
In the subset $\mathcal{I}$, the Lagrangian of (18) for all $k$ is
$$L_k\bigl(\{r_i\}_{i\in\mathcal{I}},\lambda\bigr) = \Bigl(\sum_{i\in\mathcal{I}} \frac{(2^{r_i}-1)^{k}}{\mathrm{snr}_i^{k}}\Bigr)^{1/k} + \lambda\Bigl(R_\mathcal{I}-\sum_{i\in\mathcal{I}} r_i\Bigr). \qquad(\text{E.2})$$
Let $\lambda'$ be such that
$$\lambda' = \Bigl(\sum_{i\in\mathcal{I}} \frac{(2^{r_i}-1)^{k}}{\mathrm{snr}_i^{k}}\Bigr)^{(k-1)/k} \frac{\lambda}{\ln 2}. \qquad(\text{E.3})$$
The optimality condition yields
$$2^{r_i}\,(2^{r_i}-1)^{k-1} = \mathrm{snr}_i^{k}\,\lambda'. \qquad(\text{E.4})$$
In the asymptotic regime, $r_i\gg 1$ and then $2^{r_i}-1\approx 2^{r_i}$. The optimality condition can be simplified, and
$$r_i = \log_2 \mathrm{snr}_i + \frac{1}{k}\log_2 \lambda'. \qquad(\text{E.5})$$
The Lagrange multiplier is identified using the bitrate constraint, and replacing $\lambda'$ in the above equation leads to the solution
$$r_i = \frac{R_\mathcal{I}}{|\mathcal{I}|} + \frac{1}{|\mathcal{I}|}\sum_{j\in\mathcal{I}} \log_2\frac{\mathrm{snr}_i}{\mathrm{snr}_j}. \qquad(\text{E.6})$$
Note that we do not need to take the limit of the solution as $k\to+\infty$ to obtain the result for the infinite norm: the result holds for all values of $k$ in the asymptotic regime.

With $k=1$, the problem is a sum SNR-gap maximization problem under peak-power constraint, and it can be solved without the asymptotic regime condition. Note that this sum SNR-gap maximization problem, or sum inverse-SNR-gap minimization problem, under peak-power and bitrate constraints is
$$\min \sum_{i\in\mathcal{I}} \gamma_i^{-1} = \min_{\{r_i\}_{i\in\mathcal{I}}} \sum_{i\in\mathcal{I}} \frac{(2^{r_i}-1)\,\sigma^2_{W_i}}{|h_i|^2\, p_i} \qquad(\text{E.7})$$
and is very similar to the power minimization problem under bitrate and SNR-gap constraints, exchanging $p_i$ with $\gamma_i^{-1}$:
$$\min \sum_i p_i = \min_{\{r_i\}_i} \sum_i \frac{(2^{r_i}-1)\,\sigma^2_{W_i}}{|h_i|^2\, \gamma_i^{-1}}. \qquad(\text{E.8})$$
Both problems are identical if $p_i\gamma_i = \alpha$, as stated by Lemma 1.

F. Proof of Theorem 7

To prove this theorem, the variables $I_i$ and $J_i$ are used instead of $r_i$, with
$$I_i = 2^{\lceil r_i/2\rceil},\qquad J_i = 2^{\lfloor r_i/2\rfloor}, \qquad(\text{F.1})$$
and the bitrate constraint is
$$R = \sum_{i=1}^{n} \log_2(I_i J_i). \qquad(\text{F.2})$$
In the subset $\mathcal{I}$, the Lagrangian of (25) is then
$$L\bigl(\{I_i,J_i\}_{i\in\mathcal{I}},\lambda\bigr) = \frac{1}{R}\sum_{i\in\mathcal{I}} \Bigl(2-\frac{1}{I_i}-\frac{1}{J_i}\Bigr)\,\mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{I_i^2+J_i^2-2}} + \lambda\Bigl(R_\mathcal{I}-\sum_{i\in\mathcal{I}} \log_2(I_iJ_i)\Bigr). \qquad(\text{F.3})$$

Let $X_i\in\{I_i,J_i\}$; then
$$\frac{\partial L}{\partial X_i} = X_i\,f(I_i,J_i) + \frac{1}{X_i^2}\,g(I_i,J_i) - \frac{1}{X_i}\,\frac{\lambda}{\ln 2}, \qquad(\text{F.4})$$
with
$$f(I_i,J_i) = \frac{1}{R}\Bigl(2-\frac{1}{I_i}-\frac{1}{J_i}\Bigr)\, \frac{2\sqrt{3\,\mathrm{snr}_i}\; e^{-3\,\mathrm{snr}_i/(I_i^2+J_i^2-2)}}{\sqrt{\pi}\,\bigl(I_i^2+J_i^2-2\bigr)^{3/2}},\qquad g(I_i,J_i) = \frac{1}{R}\,\mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{I_i^2+J_i^2-2}}. \qquad(\text{F.5})$$

The optimality condition yields
$$\bigl(I_i^2-J_i^2\bigr)\,I_iJ_i\,f(I_i,J_i) = \bigl(I_i-J_i\bigr)\,g(I_i,J_i),\quad \forall i\in\mathcal{I}. \qquad(\text{F.6})$$
A trivial solution is $I_i=J_i$, and any other solution must verify
$$\bigl(I_i+J_i\bigr)\,I_iJ_i\,f(I_i,J_i) - g(I_i,J_i) = 0. \qquad(\text{F.7})$$
To find the root of (F.7), let
$$h(x,y) = x\sqrt{y}\,e^{-y} - \mathrm{erfc}\sqrt{y} \qquad(\text{F.8})$$
with
$$x = \frac{2}{\sqrt{\pi}}\,\frac{(I_i+J_i)\,I_iJ_i}{I_i^2+J_i^2-2}\,\Bigl(2-\frac{1}{I_i}-\frac{1}{J_i}\Bigr),\qquad y = \frac{3\,\mathrm{snr}_i}{I_i^2+J_i^2-2}. \qquad(\text{F.9})$$
We will prove that this function is positive in a specific domain. Consider that (1) $\sqrt{y}\,e^{-y} > \mathrm{erfc}\sqrt{y}$ for $y\ge 0.334$, and thus for BER lower than $10^{-1}$; (2) $(\sqrt{\pi}/2)\,x > 1$ for $(I_i,J_i)\in[1,+\infty)^2$ with $I_i\ne 1$ or $J_i\ne 1$, and $\lim_{I_i,J_i\to 1}(\sqrt{\pi}/2)\,x = 1^+$. Then, in the domain defined by
$$(I_i,J_i)\in[1,+\infty)^2,\qquad \mathrm{ber}_i\le 0.1, \qquad(\text{F.10})$$
$h(x,y)$ is positive, and (F.7) has no solution. Thus, the only solution of (F.6) in the domain (F.10) is $I_i=J_i$. As we will see later, the domain of (F.10) is less restrictive than the asymptotic one.

The problem is now to allocate the bits with square QAM. The following upper bound is used:
$$r_i\,\mathrm{ber}_i(r_i) = 2\,\mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{2\,(2^{r_i}-1)}}. \qquad(\text{F.11})$$
Note that this upper bound is a tight approximation at high SNR and with high modulation orders. The Lagrangian is then
$$L\bigl(\{r_i\}_{i\in\mathcal{I}},\lambda\bigr) = \frac{2}{R}\sum_{i\in\mathcal{I}} \mathrm{erfc}\sqrt{\frac{3\,\mathrm{snr}_i}{2\,(2^{r_i}-1)}} + \lambda\Bigl(R_\mathcal{I}-\sum_{i\in\mathcal{I}} r_i\Bigr), \qquad(\text{F.12})$$
and its derivative is
$$\frac{\partial L}{\partial r_i} = \frac{2\ln 2}{\sqrt{\pi}\,R}\,\frac{2^{r_i}}{2^{r_i}-1}\,\sqrt{\frac{3\,\mathrm{snr}_i}{2\,(2^{r_i}-1)}}\; e^{-3\,\mathrm{snr}_i/(2\,(2^{r_i}-1))} - \lambda. \qquad(\text{F.13})$$
Let $r_i\gg 1$ for all $i$; then $2^{r_i}-1\approx 2^{r_i}$, and the optimality condition yields
$$\frac{3\,\mathrm{snr}_i}{2^{r_i}}\; e^{-3\,\mathrm{snr}_i/2^{r_i}} = \frac{\pi R^2 \lambda^2}{2\ln^2 2}. \qquad(\text{F.14})$$
With reliable communication over subchannel $i$, Shannon's relation states that $r_i\le\log_2(1+\mathrm{snr}_i)$, and $3\,\mathrm{snr}_i/2^{r_i}\ge 3/2$ because $r_i\gg 1$. The relation between $r_i$ and $\lambda$ is then bijective, and the real branch $W_{-1}$ of the Lambert function [40] can be used with no possibility of confusion:
$$r_i = \log_2(3\,\mathrm{snr}_i) - \log_2\Bigl(-W_{-1}\Bigl(-\frac{\pi R^2\lambda^2}{2\ln^2 2}\Bigr)\Bigr). \qquad(\text{F.15})$$
With the bitrate constraint $R_\mathcal{I}=\sum_{i\in\mathcal{I}} r_i$, we can write
$$\log_2\Bigl(-W_{-1}\Bigl(-\frac{\pi R^2\lambda^2}{2\ln^2 2}\Bigr)\Bigr) = -\frac{R_\mathcal{I}}{|\mathcal{I}|} + \frac{1}{|\mathcal{I}|}\sum_{i\in\mathcal{I}}\log_2(3\,\mathrm{snr}_i), \qquad(\text{F.16})$$
and, with (F.15),
$$r_i = \frac{R_\mathcal{I}}{|\mathcal{I}|} + \frac{1}{|\mathcal{I}|}\sum_{j\in\mathcal{I}} \log_2\frac{\mathrm{snr}_i}{\mathrm{snr}_j}. \qquad(\text{F.17})$$
This result is obtained with square QAM in the asymptotic regime (high modulation orders and high SNR), which is a more restrictive domain than that of (F.10).
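The core of the proof, that the square split $I_i=J_i$ minimizes the BER for a fixed constellation size $I_iJ_i$, can be observed numerically. The sketch below evaluates the rectangular-QAM weighted-BER term of (F.3), up to the common $1/R$ factor (assumed approximation), over all power-of-two splits of a 64-point constellation at an illustrative SNR.

```python
import math

def qam_ber_term(I, J, snr):
    # Rectangular I x J QAM weighted-BER term of the Lagrangian (F.3),
    # up to the common 1/R factor (assumed approximation):
    # (2 - 1/I - 1/J) * erfc(sqrt(3*snr / (I^2 + J^2 - 2))).
    return (2.0 - 1.0 / I - 1.0 / J) * math.erfc(
        math.sqrt(3.0 * snr / (I * I + J * J - 2.0)))

# For a fixed constellation size I*J = 2^6 and an illustrative snr = 500,
# the square split I = J = 8 gives the smallest BER term, as Theorem 7 states.
snr = 500.0
splits = [(2, 32), (4, 16), (8, 8), (16, 4), (32, 2)]
best = min(splits, key=lambda s: qam_ber_term(s[0], s[1], snr))
```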

Acknowledgments

The research leading to these results has received partial funding from the European Community's Seventh Framework Program FP7/2007-2013 under grant agreement no. 213311, also referred to as OMEGA.