Abstract

The space-time whitened matched filter (ST-WMF) maximum likelihood sequence detection (MLSD) architecture has been recently proposed (Maggio et al., 2014). Its objective is to reduce implementation complexity in transmissions over nonlinear dispersive channels. The ST-WMF-MLSD receiver (i) drastically reduces the number of states of the Viterbi decoder (VD) and (ii) offers a smooth trade-off between performance and complexity. In this work the ST-WMF-MLSD receiver is investigated in detail. We show that the space compression of the nonlinear channel is an instrumental property of the ST-WMF-MLSD which results in a major reduction of the implementation complexity in intensity modulation and direct detection (IM/DD) fiber optic systems. Moreover, we assess the performance of ST-WMF-MLSD in IM/DD optical systems with chromatic dispersion (CD) and polarization mode dispersion (PMD). Numerical results for a 10 Gb/s, 700 km IM/DD fiber-optic link with 50 ps differential group delay (DGD) show that the number of states of the VD in ST-WMF-MLSD can be reduced by a factor of ~4 compared to an oversampled MLSD. Finally, we analyze the impact of imperfect channel estimation on the performance of the ST-WMF-MLSD. Our results show that the performance degradation caused by channel estimation inaccuracies is low and similar to that achieved by existing MLSD schemes (~0.2 dB).

1. Introduction

Maximum likelihood sequence detection (MLSD) receivers for nonlinear channels have been extensively investigated in the literature (e.g., [1, 2] and references therein). Their ability to achieve optimal performance in the presence of additive white Gaussian noise (AWGN) has always been of great theoretical and practical interest. The theoretical interest lies in the fact that MLSD provides a performance bound for any reception scheme. The practical interest stems from its ability to actually achieve optimal or nearly optimal performance in transmissions over nonlinear channels.

Much of the early work in this area has been focused on the compensation of nonlinearities in satellite communications [1–5]. A traditional architecture for the optimal nonlinear receiver consists of a matched-filter bank (MFB) followed by a maximum likelihood sequence detector (MLSD) [1]. Owing to the correlation among the spatial noise components at the MFB outputs, an MLSD with non-Euclidean metrics must be used. The use of oversampling in combination with MLSD (OS-MLSD) has also been proposed to implement the optimal receiver in the presence of nonlinearities (see [2] and references therein). Since the complexity of both MFB and OS-MLSD-based receivers grows exponentially with the channel memory, their practical application in transmissions over highly dispersive channels has been limited. Despite this fact, MLSD-based receivers are still preferred over decision feedback equalizers (DFE) in applications such as multigigabit intensity modulation/direct detection (IM/DD) fiber optic systems for the following two reasons.

(i) Performance: DFE suffers from severe performance limitations in moderate to high dispersion single-mode fiber links as a result of its inability to compensate nonlinear ISI [6, 7]. Although nonlinear DFE (NL-DFE) structures have also been considered in the literature (e.g., [8, 9]), their performance still degrades significantly at long fiber lengths (e.g., ≥500 km). On the other hand, MLSD can operate with a constant penalty around 3 dB with respect to back-to-back (B2B) at virtually any distance [10, 11].

(ii) High-speed implementation: future generations of communication systems will operate at multigigabit per second data rates on highly dispersive channels [12]. In commercial applications, the digital receiver is often implemented as a monolithic chip in CMOS technology [12]. The maximum clock frequency of state-of-the-art complex digital signal processors in 28 nm CMOS technology is limited to frequencies lower than ~1 GHz. Therefore, in order to achieve multigigabit per second data rates, parallel processing techniques are required [12]. Although the complexity of serial implementations of DFE grows linearly with channel memory, all presently known parallel processing implementations require that the bottleneck created by the feedback loop be broken using techniques such as the ones proposed by [13–15], whose complexity grows exponentially with the channel memory. Therefore, in high-speed applications, the complexity of both DFE and MLSD increases with the channel memory in a similar way.

From the above it is clear that complexity reduction of MLSD is crucial for many practical applications. In IM/DD optical channels, the receiver must compensate the linear fiber dispersion as well as nonlinearities caused by lasers, optical modulators, the fiber Kerr effect, photo-detectors, and other components of the link. Chromatic dispersion (CD) and polarization-mode dispersion (PMD), in combination with the quadratic response of the photo-detector, are major factors that limit the reach and drive the complexity of optical-transmission systems at data rates ≥10 Gb/s [16, 17]. With traditional implementations based on oversampling techniques, an 8192-state MLSD is required to compensate 700 km of fiber at 10 Gb/s [10]. This is prohibitive in current CMOS technology (see Endnote 1).

A new MLSD receiver architecture for nonlinear channels has been proposed in [18]. The major breakthrough of this proposal consists in a novel representation of the received signal obtained by a Gram-Schmidt-like orthogonalization of the kernels of a Volterra series expansion of the channel. This procedure yields a special form of space-time whitened matched filter (ST-WMF) [19] whose baud-rate-sampled outputs are sufficient statistics with independent noise components in both space and time. Combined with the minimum phase property of the response of each branch, the ST-WMF provides an effective way to reduce the complexity of MLSD in nonlinear channels. The ST-WMF-MLSD technique offers a smooth trade-off between performance and complexity: as complexity is progressively reduced, performance degrades gracefully. Numerical results in [18] demonstrate that the number of states of the VD in ST-WMF-MLSD required on a 10 Gb/s, 700 km IM/DD fiber-optic link without PMD can be reduced by a factor of 8 compared to an oversampled MLSD.

Further contributions on the ST-WMF-MLSD receiver and its performance on IM/DD optical links are provided in this work. First, the space channel compression achieved by the ST-WMF-MLSD is analyzed in detail. Space channel compression is important because it is the property that enables the major complexity reductions of the proposed architecture. Second, the performance evaluation of the ST-WMF-MLSD receiver is extended by addressing the combined effect of CD and PMD. Numerical results confirm that the ST-WMF-MLSD remains an attractive solution in the presence of these combined impairments. Finally, the impact of channel estimation inaccuracies on the ST-WMF-MLSD performance is assessed. Accuracy and speed of channel estimation are particularly important when tracking nonstationary channels. Nonstationarity in optical channels results from PMD and random rotations of the laser state of polarization. Our results show that the performance degradation caused by an imperfect channel estimation is low and similar to that achieved by existing MLSD schemes (~0.2 dB).

This paper is organized as follows. The nonlinear channel model and the ST-WMF-MLSD architecture are described in Sections 2 and 3, respectively. Performance evaluation of the ST-WMF-MLSD under different channel conditions, as well as its robustness in the presence of imperfect channel knowledge, are presented and discussed in Sections 4 and 5. Finally, conclusions are drawn in Section 6.

2. Nonlinear Channel Model

The noisy received signal is given by where is the noise-free signal and is the noise component, which is assumed to be a white Gaussian process with power spectral density . Component can be expressed in terms of its Volterra-series expansion [20, 21]. For example, neglecting the DC term of the expansion we get where is the linear kernel, with is the th second-order kernel, is the th symbol at the input of the nonlinear channel, is the symbol rate, and is the total number of kernels (see Endnote 2).

2.1. Channel Model Orthogonalization

Next we present an alternative representation of the nonlinear signal [18]. Without loss of generality, we consider here the Volterra-series with a dominant linear kernel , and we select the first pivoting response as . Let be the signal space spanned by the set [19]. We assume that signal spaces are Hilbert spaces with inner product defined as , where superscript denotes complex conjugate. From the projection theorem, the nonlinear kernels can be uniquely expressed as where is orthogonal to the signal space ; that is, while is minimum [19], and denotes the set of all integers. We highlight that the first summation in (3) is the projection of onto . Define as the signal space spanned by . For and , from (4) note that ; therefore and are orthogonal signals [19]. Replacing (3) in (2) and operating, we obtain where with operator denoting convolution. Notice that and ; therefore signals and are orthogonal (see (4)). Next we focus on . Without loss of generality we select the second pivoting response as ; then (7) can be rewritten as Similarly to (5)–(7), can be expressed as where with chosen to satisfy Thus, note that is orthogonal to the signal spaces and , spanned by and , respectively. Repeating the processing on (11) and generalizing, we get where is the response of the th channel path and with given by with From (13) and (15) note that signal components from different paths are orthogonal; that is,

Figure 1 shows two multiple input-single output (MISO) representations of the nonlinear channel. Figure 1(a) shows the signal represented by the traditional Volterra model of (2); Figure 1(b) depicts the orthogonal representation given by (13) and (14).
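A single orthogonalization step of Section 2.1 can be illustrated numerically. The sketch below is not the implementation of [18]; it assumes oversampled kernels stored as NumPy arrays, truncates the set of symbol-period shifts to a finite window (the derivation uses all integer shifts), and uses hypothetical names (`delayed`, `split_kernel`, `sps`, `max_shift`). It splits a kernel into its projection onto the span of the shifted pivoting response plus a residual that is orthogonal to that span.

```python
import numpy as np

def delayed(x, d):
    """Return x delayed by d samples (d may be negative), zero-filled, same length."""
    out = np.zeros_like(x)
    if d == 0:
        out[:] = x
    elif d > 0:
        out[d:] = x[:-d]
    else:
        out[:d] = x[-d:]
    return out

def split_kernel(kernel, pivot, sps, max_shift):
    """Least-squares projection of `kernel` onto symbol-spaced shifts of `pivot`.

    sps       : samples per symbol period (assumed oversampling factor)
    max_shift : shifts -max_shift..max_shift symbol periods are used (a truncation)
    Returns the projection coefficients, the orthogonal residual, and the basis.
    """
    shifts = range(-max_shift, max_shift + 1)
    basis = np.stack([delayed(pivot, k * sps) for k in shifts], axis=1)
    coeffs, *_ = np.linalg.lstsq(basis, kernel, rcond=None)
    residual = kernel - basis @ coeffs
    return coeffs, residual, basis

# Toy pulses (illustrative only, not taken from the simulated fiber link).
t = np.arange(-32, 33)
pivot = np.exp(-(t / 6.0) ** 2).astype(complex)
kernel = (0.4 + 0.12j) * np.exp(-((t - 3) / 8.0) ** 2)

coeffs, residual, basis = split_kernel(kernel, pivot, sps=4, max_shift=3)
# Inner products between the residual and every shifted pivot should vanish.
max_inner = np.max(np.abs(basis.conj().T @ residual))
```

The least-squares solution minimizes the residual energy, which mirrors the minimum-energy condition invoked through the projection theorem.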

2.2. Space Channel Compression

The model orthogonalization procedure described in Section 2.1 can be performed in different ways with being the number of kernels. For example, consider a Volterra model with . One way to obtain the orthogonalized model may be realized by selecting the kernels as shown in Figure 2. In this case, the pulse is selected as the pivoting for the first orthogonalization step. Then, and are obtained as the components of and orthogonal to . For the second step, is selected as the pivoting response, so results as the component of orthogonal to . This unique ordering could be identified by the following set of indexes .

A different result is achieved by selecting the pivoting responses as illustrated in Figure 3. In this situation, is selected as the pivoting response for the first step, and and are the components of and orthogonal to . According to Figure 3, (which is different from shown in Figure 2) is selected as the pivoting response for the second orthogonalization step. Then, will be obtained as the component of orthogonal to . The index set corresponding to the procedure depicted in Figure 3 is .

Note that the set of responses and sequences resulting from the procedures described in Figures 2 and 3 are two different expansions of the same signal . Any one of the possible sets can be selected in the orthogonalization process. Space compression of the channel can be achieved concentrating the maximum of the channel energy on a reduced set of paths, independent of the distribution of their individual energy, as explained in the following discussion.

Let be a given number of paths used to model the channel (e.g., if , note that any orthogonalization process shall represent exactly the behavior of the channel). To reduce the complexity of the receiver, it is desirable for to be as small as possible. On the other hand, to minimize the performance degradation (or the inaccuracy of the channel representation), the orthogonalization process should be carried out in such a way that most of the signal energy is concentrated on these paths. From the above, for a given value of , the optimum set for space compression (i.e., minimal channel distortion modeling) should meet the following condition: where is the energy in the th path for the set , while denotes expectation. If , notice that any orthogonalization process will satisfy condition (17). Otherwise, if the criterion guarantees that the signal energy contained in the complete paths shall be maximum, independent of the individual energy distribution among them.

The optimum channel expansion that satisfies (17) can be found by an exhaustive search. Instead, we propose carrying out the orthogonalization process so as to meet the following criterion: with . Condition (19) can be achieved by selecting at each orthogonalization step (th step, e.g., with ) the pivoting response () as the response with highest energy among the remaining responses (those orthogonal to for , ). That is, at the first step we select the pivoting response as the Volterra kernel with the highest energy in . At the second step we select the pivoting response as the response with highest energy in the set of responses orthogonal to (i.e., , with ). This procedure is repeated in the same way at each step, where the best pivoting response at each stage is selected by an exhaustive search among all the pivot candidates at that stage. The minimization of the energy of the orthogonal responses (see (3), (4), and associated discussion) ensures that condition (19) is met. As we shall show later, condition (19) gives rise to space channel compression, which can be exploited to reduce complexity.
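The greedy pivot selection described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure of [18]: kernels are oversampled NumPy arrays of equal length, the symbol-period shifts are truncated to a finite window, energy is taken as the plain sum of squared magnitudes of the samples (the statistical factor that enters the path energy is ignored), and all names (`greedy_orthogonalize`, `sps`, `max_shift`) are hypothetical.

```python
import numpy as np

def delayed(x, d):
    """Return x delayed by d samples (d may be negative), zero-filled, same length."""
    out = np.zeros_like(x)
    if d == 0:
        out[:] = x
    elif d > 0:
        out[d:] = x[:-d]
    else:
        out[:d] = x[-d:]
    return out

def energy(x):
    """Sum of squared magnitudes of the samples."""
    return float(np.vdot(x, x).real)

def greedy_orthogonalize(kernels, sps, max_shift):
    """At each step pick the remaining response with the highest energy as pivot,
    then remove from every other remaining response its projection onto the
    symbol-spaced shifts of that pivot."""
    residuals = [np.asarray(k, dtype=complex).copy() for k in kernels]
    remaining = list(range(len(residuals)))
    order, paths = [], []
    while remaining:
        best = max(remaining, key=lambda i: energy(residuals[i]))
        remaining.remove(best)
        pivot = residuals[best]
        order.append(best)
        paths.append(pivot)
        basis = np.stack(
            [delayed(pivot, k * sps) for k in range(-max_shift, max_shift + 1)], axis=1
        )
        for i in remaining:
            coeffs, *_ = np.linalg.lstsq(basis, residuals[i], rcond=None)
            residuals[i] = residuals[i] - basis @ coeffs
    return order, paths

# Toy kernels (illustrative only): one dominant pulse and two weaker ones.
t = np.arange(-32, 33)
kernels = [
    np.exp(-(t / 6.0) ** 2),
    0.5 * np.exp(-((t - 2) / 7.0) ** 2),
    0.3 * np.exp(-((t + 5) / 5.0) ** 2),
]
order, paths = greedy_orthogonalize(kernels, sps=4, max_shift=3)
path_energies = [energy(p) for p in paths]
```

Because each projection can only remove energy from the remaining responses, the pivot energies returned by this sketch are non-increasing from one step to the next.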

2.3. Example

As an example, we present results for a 10 Gb/s, 700 km IM/DD fiber-optic link (details about the simulated system are given in Section 4.1).

Without loss of generality, we assume that . Let be the normalized cumulative path energy of the traditional Volterra-series expansion (see (2)) given by where is the total energy of the signal component :

On the other hand, the normalized cumulative path energy of the orthogonalized Volterra-series expansion is defined as where is given by (18) with obtained according to the criteria defined by (19). In all cases, note that   , with .

Figure 4 shows the normalized cumulative path energy for the traditional (20) and orthogonalized (22) Volterra-series representation with . For the orthogonal representation, the pivoting responses are selected according to the criteria in (19). We observe that most energy of the nonlinear signal is concentrated on the first two paths for the orthogonalized Volterra-series expansion (i.e., represents ~99.4% of the total signal energy). The traditional Volterra model requires five paths to capture a similar level of nonlinear signal energy. We highlight that the increase of from −0.33 dB to −0.12 dB at is due to the factor in (18). Notice that this factor is a measure of the correlation between all nonlinear kernels with and the linear kernel (e.g., () if and are orthogonal (same) pulses ). Thus, we conclude the following for the channel analyzed in the example.

(i) Most energy of the nonlinear kernels with the traditional Volterra model is contained in their projection onto the linear signal space spanned by the set .

(ii) A channel model with paths captures practically all the signal energy.

(iii) Since is ~97.5% of the total signal energy (see Figure 4), a receiver that considers only the first channel path should, as we shall show later, be able to achieve good performance.
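The relation between the energy fractions quoted above and a cumulative-energy curve in dB can be checked with a short calculation. The sketch below assumes that the normalized cumulative path energy is the running sum of the per-path energies divided by the total, expressed in dB; the first two cumulative fractions (97.5% and 99.4%) come from the text, while lumping the remaining 0.6% into a single third value is an assumption made only to complete the example.

```python
import numpy as np

def cumulative_energy_db(path_energies):
    """Running sum of path energies, normalized by the total, in dB."""
    e = np.asarray(path_energies, dtype=float)
    return 10.0 * np.log10(np.cumsum(e) / np.sum(e))

# 0.975 and 0.975 + 0.019 = 0.994 follow the fractions quoted in the text;
# the last entry (0.006) is a hypothetical lump of all remaining paths.
fractions = [0.975, 0.019, 0.006]
curve_db = cumulative_energy_db(fractions)
```

With these inputs the first point of the curve is about −0.11 dB and the second about −0.03 dB, which shows how small the dB-scale gap to the full-energy level is once the first path is taken.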

3. MLSD Receiver for Nonlinear Channels with AWGN

The MLSD receiver chooses the sequence that minimizes the metric Using (13) and (16) in (23), we get where denotes the real part, while From (24) notice that the computation of the cost function for every candidate symbol sequence can be achieved from the samples at the outputs of the matched filters given by (25).

From (1), equation (25) can be rewritten as where

From (15) note that if while where Since the noise is assumed Gaussian, from (28) we conclude that the noise components at the output of the proposed MFB are spatially independent. Therefore, the bank of matched filters followed by a baud rate sampler is called the space-whitened matched filter (S-WMF) bank.

3.1. Space-Time Whitened Matched Filter MLSD Receiver

From the above, we observed that the matched filter bank derived from the new expansion of the nonlinear channel gives rise to spatially independent noise components. In order to simplify the implementation of the sequence detector, a space and time whitening filter bank is derived.

Let be the impulse response of the filter with Fourier transform (FT) given by where is the FT of and and are defined by the folded spectrum factorization with being the -transform of the sequence given by (29). The set forms an orthonormal basis for the signal space spanned by . Furthermore, we choose to be minimum phase [19]. Define as the projection of onto the signal space spanned by ; that is, where . From (15) and following the procedure used in [19, Section 10.2.4], metric (23) can be expressed as where is an diagonal matrix given by while with and given by (14) and (31), respectively.

On the other hand, from (30) it is possible to show that where and are diagonal matrices given by with being the inverse FT of . From (31) notice that (35) can be expressed as where (see (31)). From (32), (36), and (39) it can be shown that MLSD reduces to minimize

Let be the -dimensional vector with the noise components of the baud rate samples . From (15) and (30), the power spectral density of results in , where is the identity matrix. Therefore, the minimization of (40) can be easily implemented by using a Viterbi detector with multidimensional Euclidean branch metrics. Figure 5 shows a block diagram of the ST-WMF-MLSD receiver with dimension (i.e., is the number of filters in the bank). If , all the paths of the nonlinear channel are used by the receiver.
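As an illustration of a Viterbi detector with multidimensional Euclidean branch metrics, the following sketch detects binary (OOK-like) symbols from vector-valued baud-rate samples. It is a generic textbook-style detector, not the receiver of Figure 5: the branch model `toy_branch_output`, with one linear path and one product (second-order) path, is hypothetical, the channel memory is one symbol, and the trellis is assumed to start from the all-zero state.

```python
import numpy as np

def viterbi_vector(y, memory, expected):
    """Viterbi detection of bits from vector observations.

    y        : array of shape (n_symbols, N), one N-dimensional sample per symbol
    memory   : channel memory in symbols (>= 1); the trellis has 2**memory states
    expected : function mapping (a_n, a_{n-1}, ..., a_{n-memory}) to the
               noise-free N-dimensional sample
    """
    n_states = 2 ** memory
    mask = n_states - 1
    cost = np.full(n_states, np.inf)
    cost[0] = 0.0  # assumed known all-zero initial state
    back = np.zeros((len(y), n_states), dtype=int)
    for n, obs in enumerate(y):
        new_cost = np.full(n_states, np.inf)
        for s in range(n_states):
            if np.isinf(cost[s]):
                continue
            # Bit j of the state index holds a_{n-1-j}.
            past = tuple((s >> j) & 1 for j in range(memory))
            for a in (0, 1):
                ns = ((s << 1) | a) & mask
                # Squared Euclidean distance summed over the N dimensions.
                metric = cost[s] + np.sum(np.abs(obs - expected((a,) + past)) ** 2)
                if metric < new_cost[ns]:
                    new_cost[ns] = metric
                    back[n, ns] = s
        cost = new_cost
    s = int(np.argmin(cost))
    bits = []
    for n in range(len(y) - 1, -1, -1):
        bits.append(s & 1)  # least significant bit is the newest symbol
        s = int(back[n, s])
    return bits[::-1]

def toy_branch_output(bits):
    """Hypothetical two-path model: a linear path and a second-order product path."""
    a_n, a_prev = bits
    return np.array([a_n + 0.5 * a_prev, 0.3 * a_n * a_prev])

rng = np.random.default_rng(1)
tx = rng.integers(0, 2, size=50).tolist()
prev = [0] + tx[:-1]
rx = np.array([toy_branch_output((a, p)) for a, p in zip(tx, prev)])  # noiseless
detected = viterbi_vector(rx, memory=1, expected=toy_branch_output)
```

Because the noise components are independent with equal variance across dimensions and time, summing squared errors over the vector components is the only change with respect to a scalar Viterbi detector.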

3.2. Complexity Considerations

The ST-WMF-MLSD consists of a filter bank followed by a Viterbi decoder (VD) and a channel estimation stage. Although the computational load of the latter may be important, its complexity is not critical since it can be implemented at a low frequency rate (see Endnote 3). On the other hand, the dimension of the front-end () is expected to be very low in general as a result of the spatial energy compression achieved by the ST-WMF-MLSD. Then, is a reasonably good estimation for applications in IM/DD optical systems and the computational complexity of the filter bank is reduced to that of 2 linear filters. Therefore, the implementation complexity of ST-WMF-MLSD is dominated by the VD. Practical aspects related to high-speed implementations of VD have been widely investigated in the past literature (e.g., see [12, 22] for more details). The computational load (usually measured in number of multiplications and comparisons) and storage requirements of the VD depend directly on the number of states. Therefore, we adopt the number of states of the VD as the measure of complexity in order to compare the MLSD architectures investigated in this work.
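Since the number of VD states is used as the complexity measure, it may help to recall how it relates to channel memory. The sketch below uses the standard relation for a full-state trellis (alphabet size raised to the memory length in symbols); the function names are hypothetical, and the mapping of the 8192-state figure quoted in the Introduction to 13 symbols of memory assumes binary modulation.

```python
import math

def viterbi_states(alphabet_size, memory_symbols):
    """Number of trellis states of a full-state Viterbi detector."""
    return alphabet_size ** memory_symbols

def binary_memory_for_states(n_states):
    """Memory (in symbols) of a binary trellis with n_states states."""
    return int(math.log2(n_states))

memory_8192 = binary_memory_for_states(8192)        # memory behind 8192 states
states_after_factor_8 = viterbi_states(2, 13) // 8  # effect of a factor-of-8 reduction
```

For binary symbols, every factor-of-2 reduction in the number of states corresponds to one fewer symbol of effective channel memory, so a factor-of-8 reduction removes three symbols of memory.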

4. Performance Evaluation in IM/DD Optical Systems

Next we analyze the proposed ST-WMF-MLSD receiver in transmissions over IM/DD fiber-optic systems with on-off keying (OOK) modulation. We focus on two key aspects of ST-WMF-MLSD: its performance (in comparison with current solutions based on OS-MLSD), and its ability to reduce complexity (e.g., number of states of VD). Complexity reduction is possible owing to (i) the minimum-phase property of the equivalent channel response provided by ST-WMF and (ii) condition (19). The latter gives rise to space compression, which reduces the ST-WMF dimension, . This is achieved by using the most significant paths of the nonlinear channel.

Figure 6 depicts the optical system under consideration. The transmitter modulates the intensity of the transmitted signal using NRZ-OOK modulation. The standard single mode fiber (SMF) introduces CD and PMD, as well as attenuation. Optical amplifiers are deployed periodically along the fiber to compensate the attenuation, also introducing amplified spontaneous emission (ASE) noise in the signal. ASE noise is modeled as AWGN in the optical domain. The received optical signal is filtered by an optical filter and then converted to a current with a PIN diode or avalanche photodetector. The resulting photocurrent is filtered by an electrical filter. The noise component after the electrical filtering is non-Gaussian and signal-dependent [16]. Therefore, the electrical signal is first processed by a memoryless nonlinear transformation. It has been found that after a square root transformation, the noise can be assumed Gaussian and signal-independent [23, 24]. Furthermore, channel nonlinearities can also be reduced by using the square root transformation [25], which improves the space compression used to reduce the receiver dimension (i.e., most of the channel energy is concentrated on the linear kernel). The split-step Fourier method [26] is used to compute the propagation of optical signals through the fiber. Oversampled linear and nonlinear kernels are extracted from the electrical signal after the square root transformation. The oversampling factor depends on various parameters of the communication system (e.g., optical power, fiber length, etc.). In our case, we have found that an oversampling factor of is good enough to accurately model the system; that is, no improvement was appreciated by increasing the sampling rate for the entire set of conditions considered for this work. Then, we compute and according to (17), and the symbol rate channel response matrix can be easily obtained from (38). 
Since the noise after the square root transformation is approximately Gaussian and signal-independent [27], the theory proposed in [28] is used to evaluate the bit error probability (see Endnote 4). All the kernels of the nonlinear channel are used to compute the error probability, independent of the receiver dimension, . Data rate is Gb/s and the transmitted pulse shape has an unchirped Gaussian envelope with ps. We use a Lorentzian optical filter and a fourth-order Butterworth electrical filter with bandwidths of and GHz, respectively. The fiber dispersion is ps/(nm-km).

4.1. IM/DD Systems with CD and PMD

Typically, the performance of equalization stages in fiber optic transmissions is evaluated by using the optical signal-to-noise ratio (OSNR) required to achieve a given BER (e.g., see [10, 11]). The target BER is around , which corresponds to the value required at the input of the forward error correction (FEC) code to achieve the error rate expected by the application (e.g., ~10^-15) [29].

Figure 7 depicts the OSNR penalty with respect to a B2B system at a bit-error rate (BER) of versus the number of states of the VD for  km. We present results for ST-WMF-MLSD with (i.e., only one filter in the bank). For OS-MLSD with 2 samples/bit, the reduction of states is achieved by truncation and optimization of the sampling phase in order to minimize BER (8 uniformly distributed phases in the interval were tested). From Figure 7, we verify that the number of states of the VD at a penalty of  dB can be reduced from to with ST-WMF-MLSD and . Notice that this performance is achieved by using a VD with one sample per bit [11]. Furthermore, we emphasize that these benefits greatly outperform the extra complexity required by the linear filter and the channel estimator (implementation details are omitted here due to space limitations). From Figure 7 we can also see that the performance degradation achieved with with respect to the ideal MLSD is only ~0.3 dB. This result can be understood from the analysis of Section 2.2, where it has been observed that of the total signal energy is contained in the first channel path.

The performance of ST-WMF-MLSD in the presence of CD and first-order PMD (i.e., differential group delay or DGD) is investigated. Figure 8 depicts the OSNR penalty versus the number of states of the VD for a  km fiber link with two values of DGD: and ps. We evaluate the OS-MLSD with 2 samples/bit and ST-WMF-MLSD with . As expected, both receivers tend to the same performance as the number of states of the VD increases. We also note that the benefits of the ST-WMF-MLSD reduce when the DGD increases (see Endnote 5). Nevertheless, from Figure 8 notice that the number of states of the VD at a penalty of  dB can be reduced and times with ST-WMF-MLSD at 25 and 50 ps DGD, respectively. These results show that the ST-WMF-MLSD is still an attractive solution to reduce complexity in transmissions over fiber-optic channels in the presence of PMD.

5. Impact of the Imperfect Channel Knowledge

The performance evaluation of the ST-WMF-MLSD presented so far assumes perfect knowledge of the channel. In the following, we analyze the impact of the channel estimation inaccuracy on the performance of the ST-WMF-MLSD architecture in transmissions over IM/DD optical systems. This study will show that the performance degradation in ST-WMF-MLSD receivers, with , caused by imperfect channel estimation, is low (~0.2 dB) and similar to that achieved by OS-MLSD receivers in the  km fiber link used in the example of Figure 7.

The estimation of the oversampled linear and nonlinear kernels is required to implement both MLSD-based receivers. Let be the oversampling factor. Based on the polyphase filter representation of the oversampled channel response (see Figure 9), the received samples can be expressed as where ,  , and with . Notice that there are different sequences sampled at the baud rate, corresponding to different sampling phases.

Equation (41) can be rewritten as Since symbols are assumed zero-mean and i.i.d. real random variables with , we can verify that

From (43), a simple estimator of the oversampled linear and nonlinear kernels can be implemented with an averaging filter as follows: where is the length of the averaging filter, is an arbitrary time index, and is the detected symbol (see Endnote 6). The accuracy of the channel estimation given by (44) depends on the precision of the decisions , the length of the averaging filter , and the channel noise power. We consider that decisions provided by the forward error correction (FEC) decoder are available; therefore the effect of decision errors can be neglected (i.e., ). We highlight that this assumption is still valid if pre-FEC decisions are used as a result of the low bit error rates experienced in this link (e.g., ~10^-3). From the above, we conclude that the quality of the estimates (44) shall mainly depend on the filter length and the channel noise power.
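The correlation-and-average idea behind this kind of estimator can be sketched for a single sampling phase. This is an illustrative sketch, not the estimator (44) itself: it assumes zero-mean i.i.d. symbols taking values ±1, error-free decisions, one second-order kernel built from products of adjacent symbols, and hypothetical tap values and names (`averaged_taps`, `h_lin`, `h_nl`). Each tap is estimated by averaging the product of the received sample and the corresponding delayed symbol (or symbol product).

```python
import numpy as np

def averaged_taps(r, b, n_taps):
    """Estimate taps g[k] of r[n] = sum_k g[k] b[n-k] + ... by averaging r[n] * b[n-k].

    Valid when b is zero-mean, unit-power, and uncorrelated with the other
    sequences that drive the received signal.
    """
    start = n_taps - 1
    n = len(r)
    return np.array([np.mean(r[start:] * b[start - k:n - k]) for k in range(n_taps)])

rng = np.random.default_rng(0)
n_sym = 20000                                  # averaging length (assumed)
a = rng.choice([-1.0, 1.0], size=n_sym)        # decided symbols, assumed error-free
b2 = a * np.concatenate(([0.0], a[:-1]))       # second-order sequence a[n] * a[n-1]

h_lin = np.array([1.0, 0.5, 0.2])              # hypothetical linear kernel taps
h_nl = np.array([0.3, 0.1])                    # hypothetical second-order kernel taps
s = np.convolve(a, h_lin)[:n_sym] + np.convolve(b2, h_nl)[:n_sym]
r = s + 0.1 * rng.standard_normal(n_sym)       # additive Gaussian noise

h_lin_hat = averaged_taps(r, a, len(h_lin))
h_nl_hat = averaged_taps(r, b2, len(h_nl))
```

In this toy setting the estimation error shrinks roughly as the inverse square root of the averaging length, which is the trade-off between accuracy and tracking speed discussed in the text.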

The precision of (44) improves as the value of increases. On the other hand, the maximum value of shall be imposed by the speed of temporal variations of the fiber optic channel. As a result of its dependence on stress and vibrations, as well as on random changes in the state of polarization of the laser, PMD is nonstationary. Fluctuations with a time scale of hundreds of microseconds have been considered in previous works (e.g., [30]). Therefore, the response time of channel estimation algorithms for PMD mitigation must be less than 1 ms (in practice a response time less than 100 μs is required [27]). This imposes, for example, that the bandwidth of the averaging filter () should be ≥20 kHz in order to efficiently track the channel variation.

The received signal seen by an MLSD receiver in the presence of imperfect knowledge of the channel dispersion can be expressed as where and is the estimation error component, while is the synthesized signal obtained from (44).

Figure 10(a) shows the SNR penalty caused by the imperfect channel estimation as a function of obtained from computer simulations. We consider GHz, , and the fiber link with  km, as used in Figure 7. The SNR penalty caused by an imperfect channel estimation is computed as where is the channel noise power required to achieve a with an unconstrained complexity OS-MLSD receiver (i.e., the VD uses as many states as required) and is the variance of the estimation error component (see Endnote 7). Notice that the penalty is ~0.4 dB for . This value of represents a BW of ~1 kHz (see Figure 10(b)). Assuming that the estimation error is white Gaussian noise with power , from Figure 10(a) we infer that the SNR penalty caused by an imperfect channel knowledge in OS-MLSD receivers with should be  dB.

Figure 11 depicts the OSNR penalty at versus the number of states of the Viterbi detector (VD) with km. We present results with perfect knowledge of the fiber dispersion (denoted as ), and for imperfect channel estimation with . We see that the mean penalty caused by inaccuracies of channel estimation agrees with that expected from Figure 10 with (i.e., ~0.14 dB dB). Furthermore, we observe that the impact of imperfect channel knowledge on the performance is similar in both MLSD receivers (i.e., ~0.14 dB and ~0.18 dB for OS and ST-WMF, resp.). This result can be understood from the fact that the filters and are computed from the samples of the estimated linear kernel . Taking into account that the energy of the linear component is significantly higher than the nonlinear kernels [25] (see Section 2.2), an accurate estimation of can be achieved for the channel considered. Then, we infer that the energy loss of the signal component at the output of will be small. Therefore, and based on (45), notice that the performance of OS-MLSD and ST-WMF-MLSD with should degrade in a similar way.

6. Conclusions

New results on the recently proposed ST-WMF-MLSD nonlinear receiver have been presented in this paper. These results are the following: (1) the space compression property of the factorization introduced in [18] has been analyzed in detail; (2) the performance of the ST-WMF-MLSD in IM/DD fiber optic systems in the combined presence of CD and PMD has been evaluated, and (3) it has been shown that the performance degradation caused by an imperfect channel estimation and tracking is low and similar to that achieved by existing MLSD schemes. These features make the ST-WMF-MLSD a good architecture for receivers for long distance IM/DD fiber-optic links.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This paper has been supported in part by Fundación Fulgor and the ANPCyT (PICT 2013, no. 2724).

Endnotes

  1. Some commercial implementations of VD at 10 Gb/s with 4, 8, and 16 states are available in 90 nm CMOS technology [31–33]. It should be possible to implement VDs with 64 or 128 states by using the available 28 nm technology.
  2. The Volterra model can be easily extended to include higher order terms (e.g., third order kernels , with input sequence and , could be included in the expansion). To keep the notation simple, only second order kernels are used in the derivations throughout this paper. However, numerical results incorporate nonlinear kernels of order higher than two.
  3. As a result of its dependence on stress and vibrations, as well as on random changes in the state of polarization of the laser, PMD is nonstationary. Fluctuations with a time scale of a few milliseconds have been observed in PMD measurements [27]. Thus, the response time of the channel estimation schemes for PMD mitigation must be less than 1 ms. In practice, a response time less than 100 μs is required. Therefore, the channel estimation stage could be easily implemented by using current technology.
  4. We did not run Monte Carlo simulations of the communication systems. The numerical results are semianalytic, in the sense that they are theoretic estimations assisted by numerical simulations. The probability of error is estimated according to the theory proposed in [28]; that is, the first terms of the union bound (those with the lowest distance) are used to estimate the probability of error. Numerical simulations are used to find the distances of the error-events, their Hamming weights, and a priori probabilities. These results are then introduced in the formulas for the probability of error estimation as reported in [28].
  5. In order to explain this result, consider a fiber channel with  ps only (i.e., no CD). Since , the IM/DD system approximately behaves as a duobinary channel; therefore the time compression achieved by the ST-WMF-MLSD will be negligible.
  6. In order to improve the channel tracking capability, an efficient implementation of the channel estimator with the well-known LMS algorithm may be preferred [12]. Nevertheless, the objective is to shed light on the impact of imperfect knowledge of the fiber dispersion on the orthogonalized Volterra model. Therefore, practical aspects of the receiver architecture (e.g., buffers, number of taps of the WF, finite precision arithmetic effects, etc.) are not considered.
  7. The mean penalty of 20 runs with different seeds of the random number generator is presented.