Abstract

The error mechanisms of iterative message-passing decoders for low-density parity-check codes are studied. A tutorial review is given of the various graphical structures, including trapping sets, stopping sets, and absorbing sets that are frequently used to characterize the errors observed in simulations of iterative decoding of low-density parity-check codes. The connections between trapping sets and deviations on computation trees are explored in depth using the notion of problematic trapping sets in order to bridge the experimental and analytic approaches to these error mechanisms. A new iterative algorithm for finding low-weight problematic trapping sets is presented and shown to be capable of identifying many trapping sets that are frequently observed during iterative decoding of low-density parity-check codes on the additive white Gaussian noise channel. Finally, a new method is given for characterizing the weight of deviations that result from problematic trapping sets.

1. Introduction

Prior to 1993, channel codes were typically designed with the goal of maximizing the minimum distance of the code [1, 2]. The combination of a code with a large minimum distance and a decoder that minimizes the probability of codeword error often resulted in good asymptotic performance. The discovery of turbo codes [3] and the rediscovery of low-density parity-check (LDPC) codes [4] revealed that codes with relatively poor minimum distance properties could achieve near-capacity performance at low bit error rates. This resulted in a reduced emphasis on maximizing minimum distance when designing codes for use on the additive white Gaussian noise (AWGN) channel. As with many other classes of codes, there are no practical performance bounds for the iterative decoders used for turbo codes and LDPC codes, and simulations are required to accurately determine the performance at practical operating points.

With the discovery of turbo codes and the various subsequent iterative decoders, the phenomenon of the error floor has become prominent in practical code design. The term error floor refers to the situation where the error rate at the output of the decoder suddenly starts to decrease at a slower rate as a function of increasing signal-to-noise ratio (SNR); that is, the performance curve flattens out. The error floors that occur with iterative decoding of turbo codes and LDPC codes are problematic in practical systems because it is difficult to predict the specific operating point at which they occur, and thus design engineers risk using codes that may have unknown error floors that limit the performance of the system. In the case of iterative decoding of turbo codes, it has been shown that the error floor is usually the result of the overall turbo code having low weight codewords that begin to limit the performance of the code after some SNR is reached [5]. The minimum distance of a turbo code can be increased, and hence the likelihood of an error floor sufficiently mitigated, through the use of various interleaver designs [6, 7].

Low-density parity-check codes with iterative decoding are also known to exhibit error floors [8, 9], albeit at much lower bit error rates than the error floors of turbo codes of similar block length. In many cases, because of the exceptionally low bit error rates at which the error floors of LDPC codes appear, it is not practical to use Monte Carlo simulations to demonstrate the existence of these error floors. The inability to run conventional computer simulations down to the error floor, combined with the lack of practical upper bounds for LDPC codes, has inhibited the deployment of LDPC codes in high-throughput applications that require near error-free performance at extremely low bit error rates.

LDPC codes are most commonly decoded using iterative message-passing decoders such as the min-sum decoder [10] and the sum-product decoder [11] due to their excellent performance and low implementation complexity. Many attempts have been made to estimate the performance of these decoders by characterizing their error mechanisms. Three of the most well-known error mechanisms are stopping sets [12], trapping sets [13], and absorbing sets [8]. Unfortunately, none of these mechanisms leads to strict upper bounds on the performance of iterative decoding over the AWGN channel, and thus they are of limited use in determining error floors. Wiberg showed that deviations on the computation trees of LDPC codes can be used to compute tight upper bounds on the performance of LDPCs with iterative decoders [10]; however, it is computationally intractable to do so after even a small number of decoder iterations. A practical method for determining the error floor of LDPCs with iterative decoding has yet to be discovered.

This paper attempts to make progress on this problem by integrating the precise, but computationally intractable, work of Wiberg with the experimental studies of the error mechanisms observed when iteratively decoding LDPC codes. The paper begins with a tutorial review of the existing methods for analyzing the performance of iterative message-passing decoders. Then, the notion of a problematic trapping set is introduced and its relationship to deviations is examined in detail, with the goal of determining what makes deviations either more or less problematic during iterative message-passing decoding. Finally, an iterative method is given for finding problematic trapping sets using the weights of deviations on the computation trees.

2. Background

The following model for channel coding is used throughout this paper. First, a vector u of k information bits is generated by a binary source. The binary source is assumed to be memoryless, which is often the result of source coding (data compression), and therefore all 2^k information sequences are equally probable. A binary k x n generator matrix G may be used by the channel encoder to map the information bits to a length-n codeword c, via the mapping c = uG. Here, it is assumed that the matrix G is full-rank, and thus the rate of the code is R = k/n.
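As a small illustration of the mapping c = uG, the following Python sketch encodes an information vector with a hypothetical generator matrix; the matrix and vector below are arbitrary examples, not taken from this paper.

import numpy as np

# Hypothetical k = 2, n = 6 generator matrix (rate R = 1/3).
G = np.array([[1, 0, 1, 1, 0, 1],
              [0, 1, 0, 1, 1, 1]])

u = np.array([1, 0])      # information bits
c = (u @ G) % 2           # codeword c = uG over GF(2)
print(c)                  # [1 0 1 1 0 1]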

Before a codeword c is transmitted over the channel, it is modulated via the transformation

x_i = 1 - 2c_i

for all i = 1, ..., n. The received signal vector is given by

y = x + η,

where η is the Gaussian noise vector. The log-likelihood ratio (LLR) vector, often used for soft-decision decoding, is given by

λ_i = ln[ p(y_i | c_i = 0) / p(y_i | c_i = 1) ]

for all i = 1, ..., n. This reduces to λ_i = 2y_i/σ² when η is a vector of AWGN with variance σ². An estimate ĉ of the transmitted codeword is derived from the received vector at the channel decoder. Finally, the information bits extracted from ĉ are passed to the sink.
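The channel model above can be sketched in a few lines of Python. The 0 -> +1, 1 -> -1 mapping and the 2y/σ² LLR scaling follow the conventions reconstructed here and should be adjusted if a different convention is intended; all values are illustrative.

import numpy as np

rng = np.random.default_rng(0)
sigma = 0.8                                # AWGN standard deviation

c = np.array([0, 1, 1, 0, 1, 0])           # codeword bits
x = 1 - 2 * c                              # BPSK modulation: 0 -> +1, 1 -> -1
y = x + sigma * rng.normal(size=x.shape)   # received vector y = x + noise
llr = 2 * y / sigma**2                     # channel LLRs (positive favors bit 0)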

As mentioned earlier, each of the 2^k information sequences is equiprobable. Since there is a one-to-one mapping between information sequences and codewords, all codewords in the code are equiprobable as well. Therefore, P(c) = 2^(-k) for every codeword c, where P(c) is the probability that codeword c is transmitted. When considering the performance of linear codes, equiprobable codewords allow for the assumption that the all-zeros codeword was transmitted. The all-zeros codeword assumption is used throughout this paper.

From the generator matrix G, it is possible to derive an m x n parity-check matrix H for the code. A parity-check matrix of a code is any matrix H such that Hc^T = 0 for all codewords c. LDPC codes are often defined by their parity-check matrix H. In particular, LDPC codes are a class of codes with sparse parity-check matrices. A sparse parity-check matrix is any binary matrix that contains more binary 0s than binary 1s. A (d_v, d_c)-regular LDPC code is one that has a fixed number d_v of binary 1s in each column of the parity-check matrix and some fixed number d_c of binary 1s in each row of the parity-check matrix. An example of a (2, 3)-regular LDPC code is given by the parity-check matrix in (4). The parity-check matrix of a length-n, dimension-k code must contain at least n - k rows, since the kernel of H has dimension k. However, it is possible for the parity-check matrix to contain more than n - k rows. Therefore, the number of rows in the parity-check matrix is denoted by m, where m >= n - k.

A Tanner graph is a bipartite graphical representation of a low-density parity-check matrix. To construct a Tanner graph from a parity-check matrix, each column in the parity-check matrix is assigned to a corresponding variable node in the Tanner graph, and each row is assigned to a corresponding check node in the Tanner graph. The set of all variable nodes is V = {v_1, ..., v_n}, and the set of all check nodes is C = {c_1, ..., c_m}. There is an edge between variable node v_j and check node c_i in the Tanner graph if and only if the entry in H at the intersection of the i-th row and j-th column is a binary 1. The Tanner graph is thus defined by the set of variable nodes V, the set of check nodes C, and the set of edges E. The Tanner graph corresponding to the parity-check matrix H (given by (4)) is shown in Figure 1.

Note that in the Tanner graphs of irregular LDPC codes variable nodes and check nodes do not all have the same number of incident edges. The number of check nodes that a specific variable node v is connected to is denoted by d(v), and the number of variable nodes that a specific check node c is connected to is denoted d(c).
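A minimal Python sketch of this construction builds neighbor lists and degrees from a parity-check matrix; the matrix below is a hypothetical (2, 3)-regular example and is not necessarily the matrix shown in (4).

import numpy as np

def tanner_graph(H):
    """Return neighbor lists N(c_i) and N(v_j) for a binary parity-check matrix H."""
    H = np.asarray(H)
    check_nbrs = [np.flatnonzero(H[i, :]) for i in range(H.shape[0])]  # N(c_i)
    var_nbrs = [np.flatnonzero(H[:, j]) for j in range(H.shape[1])]    # N(v_j)
    return check_nbrs, var_nbrs

H = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1]])
cn, vn = tanner_graph(H)
print([len(x) for x in cn])   # check-node degrees d(c_i): [3, 3, 3, 3]
print([len(x) for x in vn])   # variable-node degrees d(v_j): [2, 2, 2, 2, 2, 2]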

2.1. Iterative Message-Passing Decoding

The min-sum (MS) and sum-product (SP) decoders are low-complexity, sub-optimal iterative decoders that can be used to decode low-density parity-check codes. Given a particular parity-check matrix, the MS decoder operates by passing messages between the check nodes and the variable nodes along the edges of the Tanner graph of the code.

Before introducing the decoders, some additional notation is necessary. The set of neighbors of check node c in the Tanner graph is denoted N(c), and similarly the set of neighbors of variable node v in the Tanner graph is denoted N(v). To denote the set of neighbors of check node c excluding variable node v, the notation N(c)\v is used. Similarly, the set of neighbors of variable node v excluding check node c is denoted N(v)\c. During decoding, messages are passed between neighboring check nodes and variable nodes along the edges of the Tanner graph. Messages from check node c to variable node v are denoted by m_{c->v}, and messages from variable node v to check node c are denoted m_{v->c}. Given the transmitted codeword, the channel output available at the receiver, and a maximum number of iterations ℓ_max, the steps for MS and SP decoding are given in the following algorithm.

Algorithm 1 (Min-Sum/Sum-Product Decoding).

Step 1 (Initialization). Set the number of iterations to ℓ = 0. For all messages set m_{v->c} = λ_v.

Step 2 (Check Node Update). Set ℓ = ℓ + 1. For all messages set

Min-Sum: m_{c->v} = ( prod_{v' in N(c)\v} sign(m_{v'->c}) ) * min_{v' in N(c)\v} |m_{v'->c}|,

Sum-Product: m_{c->v} = 2 tanh^{-1}( prod_{v' in N(c)\v} tanh(m_{v'->c}/2) ).

Step 3 (Variable Node Update). For all messages set m_{v->c} = λ_v + sum_{c' in N(v)\c} m_{c'->v}.

Step 4 (Check Stop Criteria). For all v in V, set m_v = λ_v + sum_{c in N(v)} m_{c->v}. For all v set the hard decision ĉ_v = 0 if m_v >= 0 and ĉ_v = 1 otherwise. If H ĉ^T = 0 or ℓ = ℓ_max, stop decoding; else return to Step 2.
One of the primary strengths of the min-sum and sum-product decoders is the relatively small number of operations performed during each iteration. During each iteration, the messages m_{c->v} and m_{v->c} must be computed for each binary 1 in the parity-check matrix. For a (d_v, d_c)-regular LDPC code, there are n*d_v binary 1s in the parity-check matrix. When the degree of the nodes and the number of iterations is fixed, the complexity of MS decoding scales linearly with the length of the code.
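The following is a minimal, unoptimized Python sketch of the min-sum version of Algorithm 1 under the sign conventions assumed earlier; it is not the authors' implementation, and replacing the check-node rule with the tanh rule of Step 2 yields the sum-product decoder.

import numpy as np

def min_sum_decode(H, llr, max_iters=50):
    """Min-sum decoding on the Tanner graph of H (positive LLR favors bit 0)."""
    H = np.asarray(H)
    llr = np.asarray(llr, dtype=float)
    M, N = H.shape
    mask = H == 1
    m_vc = np.where(mask, np.tile(llr, (M, 1)), 0.0)   # Step 1: m_{v->c} = lambda_v
    m_cv = np.zeros((M, N))
    c_hat, m_v = (llr < 0).astype(int), llr.copy()

    for _ in range(max_iters):
        # Step 2: check-node update (min-sum rule); assumes no degree-1 check nodes.
        for i in range(M):
            cols = np.flatnonzero(mask[i])
            msgs = m_vc[i, cols]
            for idx, j in enumerate(cols):
                others = np.delete(msgs, idx)
                m_cv[i, j] = np.prod(np.sign(others)) * np.min(np.abs(others))
        # Step 3: variable-node update.
        m_v = llr + m_cv.sum(axis=0)                   # lambda_v + all incoming messages
        m_vc = np.where(mask, m_v - m_cv, 0.0)         # exclude the target check node
        # Step 4: hard decision and stopping criterion.
        c_hat = (m_v < 0).astype(int)
        if not np.any((H @ c_hat) % 2):
            break
    return c_hat, m_v

For example, calling min_sum_decode(H, llr) with the H and llr from the earlier sketches returns a hard-decision estimate together with the final costs m_v used in Step 4.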
In practice, the min-sum and sum-product decoders do not always output a codeword. It has been shown that when the MS decoder does not output a codeword after a large number of iterations (e.g., 200) has been performed, the output often cycles through a repeating sequence of two or more noncodeword outputs [14]. In Sections 2.2 through 2.4, three different characterizations are given for the noncodeword outputs of iterative message-passing decoders.

2.2. Stopping Sets

The notion of stopping sets was first introduced by Forney et al. [15] in 2001. Two years later, a formal definition of stopping sets was given by Di et al. [12]. They demonstrated that the bit and word error probabilities of iteratively decoded LDPC codes on the binary erasure channel (BEC) can be determined exactly from the stopping sets of the parity-check matrix.

Definition 1 (stopping sets [12]). A stopping set S is a subset of the set of variable nodes V, such that any check node connected to a variable node contained in S is connected to at least two variable nodes in S.

A small example of a stopping set is given in Figure 2. Consider the subset S of the set of variable nodes shown there. There are five check nodes connected to the set S, and each of them is connected to S at least two times. Note that only one of these check nodes is connected to the set S an odd number of times; if each of the check nodes were connected to S an even number of times, S would correspond to a codeword support set, where all bits in S can be flipped without changing the overall parity of any of the check nodes.
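Checking Definition 1 for a candidate set is a short computation on the parity-check matrix. The sketch below reuses the hypothetical (2, 3)-regular matrix from the earlier example; in that matrix the codeword support {0, 1, 5} is a stopping set.

import numpy as np

def is_stopping_set(H, S):
    """Definition 1: every check node touching S is connected to S at least twice."""
    counts = np.asarray(H)[:, sorted(S)].sum(axis=1)   # connections from each check into S
    return bool(np.all((counts == 0) | (counts >= 2)))

H = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1]])
print(is_stopping_set(H, {0, 1, 5}))   # True  (a codeword support)
print(is_stopping_set(H, {0, 1, 2}))   # False (three checks see the set only once)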

The intuition behind stopping sets begins with an understanding of iterative message-passing decoders. Information given to a specific variable node from a neighboring check node is derived from all other variable nodes connected to that check node. Consider two variable nodes v_1, v_2 in N(c), where both variable nodes contain an erasure. In this case, each of the sets N(c)\v_1 and N(c)\v_2 contains at least one erasure, thus making it impossible for the check node c to determine the parity of either set. For this reason, none of the check nodes connected to a stopping set is capable of resolving erasures if each variable node contained in the stopping set begins with an erasure from the channel.

Work relating linear programming (LP) pseudocodewords to stopping sets for the binary erasure channel [15], and for both the binary symmetric channel (BSC) and the additive white Gaussian noise channel [16], has revealed a relationship between linear programming pseudocodewords and the size of stopping sets. Although stopping sets have a strong relationship with LP pseudocodewords, the performance of neither the MS decoder nor the SP decoder on the BSC and AWGN channels can be predicted using stopping sets alone.

2.3. Trapping Sets

Trapping sets, also referred to as near-codewords, were first introduced by MacKay and Postol [13] to provide an explanation for the weaknesses of algebraically constructed low-density parity-check codes. They define trapping sets as follows.

Definition 2 (trapping sets [13]). Consider a length-n code with parity-check matrix H, and let T be a set containing a coordinates. Consider a length-n binary vector x with 1s in the coordinates of T and 0s elsewhere. If the syndrome Hx^T has Hamming weight b, the set T is referred to as an (a, b) trapping set.

Consider the trapping set shown in Figure 3, where the set T corresponds to a set of variable nodes in the Tanner graph of the parity-check matrix H. There are four variable nodes in the set, so a = 4, and if all four variable nodes are set to a binary 1, only two check nodes are connected to an odd number of binary 1s, so the syndrome has Hamming weight b = 2. Therefore, according to Definition 2, this set of variable nodes defines a (4, 2) trapping set.
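The (a, b) parameters of any candidate set follow directly from Definition 2, as in the short sketch below (again using the hypothetical matrix from the earlier examples).

import numpy as np

def trapping_set_params(H, T):
    """Return (a, b): a = |T| and b = Hamming weight of the syndrome H x^T."""
    H = np.asarray(H)
    x = np.zeros(H.shape[1], dtype=int)
    x[sorted(T)] = 1
    return len(T), int(((H @ x) % 2).sum())

H = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1]])
print(trapping_set_params(H, {0, 1}))   # (2, 2): two check nodes are left unsatisfied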

It is important to note that any set of variable nodes can be considered a trapping set defined by some set of parameters (a, b), and the significance of trapping sets varies greatly depending on the parameters (a, b). In much the same way that low-weight codewords are problematic to decoding, erroneous channel information is more likely to affect the majority of variable nodes in a trapping set which has low weight a. Richardson [17] shows that trapping sets with small weight a and a small number b of unsatisfied check nodes are more likely to cause errors. When a trapping set T has small b, the extrinsic information being passed into T cannot overcome the intrinsic information reinforced within T.

In [17], trapping sets are examined for different decoders on the binary erasure channel, the binary symmetric channel, and the additive white Gaussian noise channel. Whereas stopping sets can be used to precisely determine the probability of error on the BEC, trapping sets appear to cause errors on the AWGN channel. Richardson [17] uses the parameters (a, b) and the multiplicity of various problematic trapping sets to estimate the error floor of LDPC codes at bit error rates where simulations are not feasible. Unfortunately, the somewhat vague definition of problematic trapping sets makes it difficult to use them for performance analysis.

2.4. Absorbing Sets

In an attempt to clarify the ambiguity of problematic trapping sets, Zhang et al. introduced the notion of absorbing sets [8]. They define absorbing sets as follows.

Definition 3 (absorbing sets [8]). Let A be a set containing a variable nodes. Also, let O(A) be the set of check nodes such that |O(A)| = b and each check node in O(A) has an odd number of edges connected to A. If each variable node in A is connected to strictly more check nodes in C\O(A) than in O(A), the set A is referred to as an (a, b) absorbing set. A fully absorbing set also satisfies the condition that each variable node in V\A is connected to more check nodes in C\O(A) than in O(A).

Note that an (a, b) absorbing set is also an (a, b) trapping set, but the converse is not always true. Figure 4 shows an example of an absorbing set. The set of variable nodes A in this absorbing set contains four nodes, and the set of unsatisfied check nodes O(A) contains a single check node. One of the variable nodes is not connected to the unsatisfied check node, and the other three variable nodes are each connected to the unsatisfied check node and to at least two other check nodes in C\O(A). Therefore, each of the variable nodes in A is connected to more satisfied check nodes than unsatisfied check nodes. Also, note that the trapping set in Figure 3 is not an absorbing set because two of its variable nodes are connected to two unsatisfied check nodes and only one satisfied check node.
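Definition 3 can likewise be checked mechanically from the parity-check matrix; the sketch below counts each variable node's unsatisfied check-node neighbors and compares the count against half of its degree. The function and its arguments are illustrative, not part of the original text.

import numpy as np

def is_absorbing_set(H, A, fully=False):
    """Check Definition 3: nodes in A (or all nodes, if fully=True) see strictly
    more satisfied than unsatisfied check-node neighbors."""
    H = np.asarray(H)
    x = np.zeros(H.shape[1], dtype=int)
    x[sorted(A)] = 1
    unsat = (H @ x) % 2                  # indicator vector of the check nodes in O(A)
    z = H.T @ unsat                      # unsatisfied-check count for each variable node
    d = H.sum(axis=0)                    # variable-node degrees
    nodes = range(H.shape[1]) if fully else sorted(A)
    return bool(all(2 * z[k] < d[k] for k in nodes))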

Simulations show that the majority of errors encountered in the error floor region during sum-product decoding of the IEEE 802.3an low-density parity-check code can be attributed to absorbing sets [8, 9]. Although absorbing sets appear to be useful for estimating the performance of iterative message-passing decoding, they do not lead to strict upper bounds. For upper bounds, it is possible to use the concept of deviations on the computation tree.

2.5. Computation Trees and Deviations

In his 1996 dissertation, Wiberg [10] presented groundbreaking analytical results with respect to iterative decoding of low-density parity-check codes. He provided extensive analysis of both the MS and SP decoders by introducing a model of iterative decoding known as the computation tree. Wiberg showed that the MS decoder minimizes the probability of word error when decoding a code whose Tanner graph is a tree, while for the same type of code the SP decoder minimizes the probability of bit error.

In addition to introducing computation trees, Wiberg also introduced the concept of deviations. Wiberg proved that deviations on the computation tree with negative cost are required in order for errors to occur during MS and SP decoding. Because of the importance of computation trees and deviations in understanding finite tree-based decoding, they are examined in detail in this section.

Consider a low-density parity-check code represented by a Tanner graph. A computation tree rooted at variable node v after i iterations is denoted T_v^i. In order to construct a computation tree from the Tanner graph, a variable node v is placed at the top level (root) of a descending tree. To construct the next level in the tree directly below v, each of v's neighbors in N(v) is added to this level and connected to v. This process continues level-by-level, where nodes in the previous level are used to determine nodes on the next level, while maintaining that each node in the computation tree has the same set of neighbors as its corresponding node in the Tanner graph. For example, if variable node u on the last completed level is connected to check node c on the level above it, then all check nodes in N(u)\c must appear on the next level and be connected to u, thereby ensuring that u is connected to exactly one copy of each check node in N(u).

Figure 5 gives an example of a Tanner graph and its corresponding computation tree rooted at one of its variable nodes after two iterations. Nodes at the bottom level of the computation tree are referred to as leaf nodes. Notice that the leaf nodes are the only nodes in the computation tree that are not connected to a copy of each of their neighbors in the original Tanner graph.

Computation trees are precise models for analyzing the performance and behavior of min-sum and sum-product decoding for a finite number of iterations. Each of these decoders can be precisely modeled after i iterations by constructing n different computation trees, one rooted at each variable node of the original Tanner graph, each containing 2i + 1 levels of nodes including the root node. Then, for every variable node in each computation tree, the LLR cost λ_v of the corresponding variable node v in the Tanner graph is assigned to that copy. At this point, MS or SP decoding operations can be performed from the leaf nodes up to the root node. The final cost at each of the root nodes determines the binary estimate of the transmitted codeword computed by the decoder. Because the MS and SP decoders are optimal on Tanner graphs that are trees, the MS and SP decoders are optimal on each of the computation trees derived from the Tanner graph. MS chooses the least-cost valid configuration on the tree, where a valid configuration refers to any assignment of binary numbers to the variable nodes such that each check node is adjacent to an even number of variable nodes assigned to a binary 1. The SP decoder, on the other hand, chooses the value at the root node that has the highest probability over all valid configurations.

Although the computation tree model is precise, after a small number of iterations it becomes impractical to analyze the performance of specific codes by considering all valid configurations on the computation tree. The number of valid configurations on the computation tree can be computed by treating the computation tree as a Tanner graph. In order to define a Tanner graph given the computation tree, treat all check nodes and variable nodes in the computation tree separately. For example, if multiple copies of variable node v are distributed throughout the computation tree, each copy is treated as a distinct variable node. After regarding each variable node in the computation tree as distinct, one can show that each check node on the computation tree corresponds to a linearly independent parity-check equation. If there are N_v(i) variable nodes and N_c(i) check nodes on a computation tree rooted at variable node v after i iterations, then there are a total of 2^(N_v(i) - N_c(i)) valid configurations on the tree. On a (d_v, d_c)-regular LDPC code, the number of variable nodes after i iterations is given by

N_v(i) = 1 + d_v (d_c - 1) * sum_{j=0}^{i-1} [(d_v - 1)(d_c - 1)]^j,

and the number of check nodes is given by

N_c(i) = d_v * sum_{j=0}^{i-1} [(d_v - 1)(d_c - 1)]^j.

To illustrate the growth rate in the number of valid configurations on the computation tree, consider an LDPC code where each variable node has degree 3 and each check node has degree 6. These commonly used code parameters result in what are known as (3, 6)-regular LDPC codes. Table 1 shows the number of variable nodes given by

N_v(i) = 1 + 15 * sum_{j=0}^{i-1} 10^j,

the number of check nodes given by

N_c(i) = 3 * sum_{j=0}^{i-1} 10^j,

and the corresponding number of valid configurations on the computation tree after 1, 2, and 3 iterations. Note that the growth rate is not affected by the block length of the code.
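The entries of Table 1 can be reproduced from the two formulas above; the sketch below assumes the (3, 6)-regular parameters just described.

def tree_counts(dv, dc, iters):
    """Variable/check node counts on the computation tree after `iters` iterations,
    and the exponent of the number of valid configurations 2^(N_v - N_c)."""
    branches = sum(((dv - 1) * (dc - 1))**j for j in range(iters))
    n_var = 1 + dv * (dc - 1) * branches
    n_chk = dv * branches
    return n_var, n_chk, n_var - n_chk

for i in (1, 2, 3):
    n_var, n_chk, exp = tree_counts(3, 6, i)
    print(f"i={i}: N_v={n_var}, N_c={n_chk}, valid configurations = 2^{exp}")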

Table 1 illustrates the computational complexity associated with considering each valid configuration on the computation tree. In light of this, Wiberg [10] derived a simplified bound on the performance of MS decoding operating on a particular computation tree. In order to obtain this bound, Wiberg introduced the concept of deviations on the computation tree.

Definition 4 (deviation [10]). A deviation is any set of variable nodes on the computation tree satisfying the following three conditions. (1) Each check node in the computation tree is adjacent to either two or zero variable nodes in the deviation set. (2) A deviation set contains the root node of the computation tree. (3) No proper and nonempty subset of the variable nodes in the deviation forms a valid configuration on the computation tree.

Figure 6 shows an example of a deviation on the computation tree given in Figure 5(b). The larger blue variable nodes are contained in the deviation, whereas the smaller red nodes are not.

Wiberg uses the set of deviations on the computation tree to derive an upper bound on the performance of the min-sum decoder. It is necessary, but not sufficient, for at least one deviation in the set of all deviations to have negative cost in order for an error to occur at the root node. The cost of a deviation δ, denoted by G(δ), can be found by summing the LLR cost of each of the nodes in the support of the deviation:

G(δ) = sum_{v in supp(δ)} λ_v,

where copies of v are counted as many times as they appear in the deviation. A necessary, but not sufficient, condition for an error to occur on the computation tree rooted at variable node v is given by [10]

G(δ) < 0 for some deviation δ on T_v^i.

Using this condition, a bound can be derived on the probability that the minimum-cost configuration on the computation tree contains a binary 1 at the root node. This bound is

P(error at root v) <= P( there exists a deviation δ on T_v^i with G(δ) < 0 ),

which can be further loosened to

P(error at root v) <= sum_{δ} P( G(δ) < 0 )     (18)

by using the union bound.

Wiberg [10] shows that the bound given by (18) can be used to predict the performance of min-sum decoding of infinite-length codes after a specific number of iterations. Wiberg begins by assuming that the computation trees have no repeated nodes. This assumption simplifies the weight enumerators of the deviations for regular LDPC codes. Wiberg also shows that (18) can be used to bound MS decoder performance when there are multiple copies of each variable node in the tree. Thus, in theory, Wiberg's deviation bound can be used to bound the performance of MS decoding of finite-length codes. The following proposition shows that the number of deviations grows exponentially with the number of iterations, thus making it computationally intractable to enumerate the deviations even after a small number of iterations.

Proposition 1. Let T_v^i be the computation tree of a (d_v, d_c)-regular LDPC code, rooted at variable node v after i iterations. Then, the number of deviations that exist on T_v^i is

(d_c - 1)^( d_v * sum_{j=0}^{i-1} (d_v - 1)^j ).

Proof. By the definition of a deviation, we must assign the root node to a binary 1. Each of the d_v check nodes immediately below the root must assign exactly one of their d_c - 1 child variable nodes to a binary 1. Thus, there are a total of (d_c - 1)^{d_v} deviations after one iteration. In addition, there are exactly d_v leaf nodes in the support of each deviation after one iteration.
Each of the previous leaf nodes gets connected to d_v - 1 check nodes after two iterations. Each of these check nodes assigns one of its d_c - 1 child variable nodes to a binary 1. Therefore, for each deviation after one iteration there are (d_c - 1)^{d_v(d_v - 1)} different deviations after two iterations. This brings the total number of deviations to (d_c - 1)^{d_v + d_v(d_v - 1)} after two iterations. The total number of leaf nodes in the support of the deviation after two iterations is d_v(d_v - 1).
Following this pattern, the d_v(d_v - 1) variable nodes in the support of the deviation after two iterations branch out to d_v(d_v - 1)^2 check nodes. There are (d_c - 1)^{d_v(d_v - 1)^2} ways of assigning the new leaf nodes to the support of the previous deviation. This brings the total number of deviations to (d_c - 1)^{d_v + d_v(d_v - 1) + d_v(d_v - 1)^2} after three iterations.
After i iterations, the d_v(d_v - 1)^{i-2} old leaf nodes in the support of the deviation branch out to d_v(d_v - 1)^{i-1} new leaf nodes in the support of the deviation. There are (d_c - 1)^{d_v(d_v - 1)^{i-1}} ways of assigning the new support given the previous deviation, and the total number of deviations after i iterations is (d_c - 1)^( d_v * sum_{j=0}^{i-1} (d_v - 1)^j ).

The number of deviations on the computation tree of a (3, 6)-regular low-density parity-check code is given in Table 2 for iterations 1 through 5. Even after only a small number of iterations, it becomes impractical to enumerate each of the deviations in order to compute the upper bound on the probability of bit error of the root variable node of the computation tree.
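A short helper evaluates the count from Proposition 1; the (3, 6) parameters in the example call are an assumption and should be replaced by the degrees of the code of interest.

def num_deviations(dv, dc, iters):
    """Proposition 1: number of deviations after `iters` iterations."""
    return (dc - 1)**(dv * sum((dv - 1)**j for j in range(iters)))

for i in range(1, 6):
    print(i, num_deviations(3, 6, i))   # 5^3, 5^9, 5^21, 5^45, 5^93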

Using computation trees, Wiberg provided a precise model of the behavior of the min-sum and sum-product decoders. Unfortunately, the size of the computation trees and the number of configurations on them grows too large for practical analysis. Deviations provide a simplified approach to the analysis of computation trees, but the number of deviations also grows exponentially with the number of iterations.

3. Stopping Sets, Absorbing Sets, and Resulting Deviations

Deviations can be used to define a necessary condition for an error to occur during iterative decoding. The condition simply states that there must be at least one deviation with cost less than zero, assuming that the all-zeros codeword was sent. However, this condition says nothing about which deviations are more or less likely to cause errors. What is known is that at high SNRs, low-weight deviations are much more likely to cause errors than high-weight deviations. Thus it is reasonable to expect that low-weight stopping sets and low-weight deviations coincide over the BEC, since low-weight stopping sets are precisely the cause of errors for iterative decoding over the BEC [15]. For the same reason, one can expect that trapping sets, or more specifically absorbing sets, coincide with low-weight deviations over the AWGN channel, since they have been frequently observed to cause errors at high SNR during iterative decoding of LDPC codes over the AWGN channel [8, 17]. Connections between stopping/absorbing sets and deviations and their effect on decoding performance are examined in this section.

3.1. Stopping Sets as Deviations

Stopping sets consist of a subset S of the variable nodes, such that each check node connected to an element in S is connected to S at least twice. According to the definition of a deviation given in Definition 4, each check node connected to the variable nodes in a deviation is connected exactly two times. These two properties can be used to study the relationship between stopping sets and their corresponding deviations.

First, consider a computation tree where each of the variable nodes begins with an assignment of a binary 0. Then, assign all copies of variable nodes in S to a binary 1. If S does not correspond with the support of a codeword, the resulting configuration on the computation tree will not be a valid configuration. For example, consider the check node in Figure 2 that is connected to S an odd number of times. Each time this check node appears in the computation tree, it will be connected to three variable nodes assigned to a binary 1, including the parent variable node and two child variable nodes. If one of the child variable nodes of this check node, along with all of its descendants, is set to a binary 0, the check node will be satisfied. If this is done for every unsatisfied check node in the computation tree, a deviation is created that contains only variable nodes in S. Thus, this method allows one to create a deviation from a stopping set.

Using the method previously described, a deviation can be constructed using only the variable nodes contained in a stopping set. This is illustrated in Figure 7(a), where a portion of the deviation defined by the set S from Figure 2 is given. Since only one of the two child variable nodes can be included in the deviation, it is sufficient to include one and exclude the other at random, since both nodes are included in S. The effect of this deviation is now examined over the BEC and the BSC. Stopping sets are known to cause errors over the BEC. The reason for this is illustrated by Figure 7(b). When each variable node in S is received as an erasure, it only takes one erasure below a check node to cause an erasure message to be sent up from that check node to its parent variable node. However, even though two erasures are connected below the check node in Figure 7(b), the same erasure message is sent up to the parent. This example illustrates that a deviation containing all erasures is sufficient for an erasure output at the root node. Thus there is a strong connection between stopping sets and deviations over the BEC, since both cause errors using iterative decoding and deviations can be created from stopping sets.

The reason stopping sets will not cause errors as frequently over the BSC is illustrated in Figure 7(c). A deviation containing only the variable nodes in a stopping set S exists on a computation tree rooted at any one of the variable nodes in S regardless of the channel. However, the impact of deviations is not the same across all channels. For example, if two messages of -1 (representing binary 1s) are being sent up from the two child variable nodes below a check node, the message sent from that check node up to its parent is a +1 (representing a binary 0). This example shows that while a deviation created from a stopping set is sufficient to cause erasure outputs from the decoder over the BEC, this same deviation may not cause errors over the BSC.

From Figure 7, it is clear that deviations created from stopping sets have a different effect on iterative decoding over the binary erasure channel and the binary symmetric channel. While it is not as clear how deviations created from stopping sets will affect decoding over the AWGN channel, some similarities can be drawn between the BEC and the AWGN channel. In terms of channel LLR costs, an erasure over the BEC behaves like an LLR cost of zero over the AWGN channel. A real-valued LLR interpretation of the BEC can be created using real-valued costs of -1.0, +1.0, and 0.0 to represent a binary 1, a binary 0, and an erasure, respectively. Binary information is transmitted as +1 and -1 over the AWGN channel, and the probability of receiving a value near zero, which behaves like an erasure, is nonzero regardless of the channel SNR. Thus, it is reasonable to suspect that the AWGN channel behaves more like the BEC than the BSC, especially at high channel SNR.

3.2. Absorbing Sets as Deviations

Absorbing sets project to deviations on the computation tree in a different way than stopping sets. Since each check node connected to a stopping set S is connected at least twice, a deviation can easily be defined using only nodes in S on any computation tree rooted at a node in S. Unlike stopping sets, absorbing sets can have only a single connection to a check node. When an absorbing set A has a single connection to a check node, it is not possible to form a deviation on any computation tree using only the variable nodes in A, unless there is a stopping set S contained in A. If such a stopping set exists, a deviation can be formed on the computation tree by simply avoiding the variable nodes in A that are not in S.

For an absorbing set A that does not contain a stopping set, it is of interest to know how the absorbing set manifests itself as a deviation on the computation tree rooted at one of the variable nodes in A. This manifestation takes the form of a deviation with as many variable nodes in A as possible. Because it is known that each variable node in A is connected to strictly more satisfied check nodes than unsatisfied check nodes, it is possible to compute a bound on the number of variable nodes in a deviation that are contained in A for regular LDPC codes.

Consider an absorbing set A on the Tanner graph of a (d_v, d_c)-regular low-density parity-check code. Let each variable node in A be connected to at least s satisfied check nodes. A computation tree rooted at a variable node v in A after i iterations is given by T_v^i. A deviation on this computation tree can be constructed by selecting the nodes in the deviation level-by-level. The deviation construction begins by including the root node v at level 0. At the next level, one variable node connected to each of the d_v check nodes in N(v) must be included in the deviation. When possible, variable nodes in A will always be included in the deviation. Therefore, after the first level there are at least s variable nodes in the deviation that are also in A, and at most d_v - s variable nodes in the deviation that are not in A. Continuing to the next level in the computation tree, each of the variable nodes in the deviation that are in A connects to at least s - 1 new variable nodes in A and at most d_v - s new variable nodes not in A. Each of the variable nodes in the deviation that are not in A connects to d_v - 1 variable nodes that, in the worst case, are not in A. After i iterations, the number of variable nodes in the deviation that are also in A is at least

1 + s * sum_{j=0}^{i-1} (s - 1)^j.     (21)

Similarly, the total number of variable nodes in the deviation is

1 + d_v * sum_{j=0}^{i-1} (d_v - 1)^j.     (22)

Therefore, the number of variable nodes in the deviation that are not in A is at most the difference between (22) and (21).

It is clear from (21) and (22) that the lower bound on the proportion of variable nodes in A within the deviation, given by the ratio of (21) to (22), approaches zero as i approaches infinity. Thus, the bound does not appear to be an accurate method for calculating the true proportion. The bound given by (21) is computed under the worst-case scenario that, after a deviation reaches a check node with only one connection to the set A, the descendants of that check node contained in the deviation do not contain any more variable nodes in A. On a connected Tanner graph with no nodes of degree one, this assumption is most likely never true, since nodes in A will eventually (with increasing i) be included among the descendants of the failed check node, and consequently those variable nodes will also be included in any deviation which is a manifestation of A.

For finite-length, (d_v, d_c)-regular low-density parity-check codes, it is possible to determine the exact number of variable nodes from A contained in the minimum-cost deviation after a given number of iterations i. In order to find this number, each variable node in A is assigned an LLR cost of 0.0 and each variable node not in A is assigned an LLR cost of 1.0. Then, using the resulting LLR cost vector, MS decoding is performed for i iterations. The final cost m_v for any variable node v in A is the cost of the minimum-cost deviation on the computation tree rooted at v after i iterations. Deviations for (d_v, d_c)-regular LDPC codes contain a fixed number of variable nodes determined by (22), and the only way to reduce the cost of the deviation is to include variable nodes from the set A. Thus, the minimum-cost deviation on the computation tree of a (d_v, d_c)-regular LDPC code will include the maximum possible number of variable nodes in A, and the MS decoder will output the cost of this deviation. It is important to note that the cost of the deviation returned by the MS decoder corresponds to the number of variable nodes in the deviation that are not in A. Therefore, in order to determine the number of variable nodes in the deviation that are in A, it is necessary to subtract this MS decoder cost from the result given by (22).
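This procedure can be sketched on top of the min-sum routine given earlier (passed in here as the decode argument, which is an assumption of this sketch rather than part of the original text): zero LLRs inside the set and unit LLRs outside make the root cost equal to the number of deviation nodes outside the set.

import numpy as np

def deviation_composition(H, A, decode, iters=50):
    """Fraction of the minimum-cost deviation made up of nodes from the set A.

    decode(H, llr, max_iters) is assumed to return (hard_decision, final_costs),
    as in the earlier min-sum sketch, for a (d_v, d_c)-regular code.
    """
    H = np.asarray(H)
    dv = int(H[:, 0].sum())                            # column weight d_v (regular code)
    llr = np.ones(H.shape[1])
    llr[sorted(A)] = 0.0                               # free nodes inside A, cost 1 outside
    _, costs = decode(H, llr, max_iters=iters)
    total = 1 + dv * sum((dv - 1)**j for j in range(iters))   # deviation size, eq. (22)
    outside = min(costs[k] for k in sorted(A))         # cost at the best root inside A
    return (total - outside) / total                   # proportion of the deviation in A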

Example 1. Consider a short regular low-density parity-check code with parity-check matrix H, where columns 1 through 20 of H correspond to the variable nodes v_1 through v_20. A fully absorbing set A is defined by a subset of these variable nodes. After 50 iterations, each of the deviations on each of the computation trees contains the number of variable nodes given by (22). After setting the LLR costs for the variable nodes in A to 0.0, and all other LLR costs to 1.0, the output of the MS decoder after 50 iterations for a variable node in A gives the cost of the minimum-cost deviation, which equals the number of variable nodes in that deviation that lie outside A; subtracting this cost from (22) gives the number of variable nodes in the deviation that are also contained in the absorbing set A. For this code, a deviation can be formed on the computation tree with 91% of its variable nodes coming from A. In contrast, the bound given by (21) after 50 iterations only guarantees that 101 variable nodes from A will be included in the minimum-weight deviation. This huge disparity between the bound and the actual result given by MS decoding reveals the effects of the assumptions used to derive (21). It is worth noting that the proportion of variable nodes in A in the minimum-weight deviation after 51 and 52 iterations was also 91%, so this proportion appears to stabilize after a sufficient number of iterations.
Using this same method on a second fully absorbing set A', it is found that the minimum-weight deviation after 50 iterations contains copies of the variable nodes in A' amounting to 78% of the total number of variable nodes in the deviation. By comparing the properties of the minimum-weight deviations resulting from A and A', one might expect that A is more likely to cause errors than A'. This is because 91% of the cost of the deviation resulting from A is determined by the variable nodes in A, compared to only 78% for A'. Simulation results in Section 4 show that A causes errors much more frequently than A'.

4. Finding Problematic Trapping Sets

Any set of variable nodes can be interpreted as a trapping set, including stopping sets, absorbing sets, and fully absorbing sets. This is because trapping sets are defined only by the number of variable nodes in the set and the number of failed check nodes. In order to simplify the analysis, the trapping sets studied in this paper are restricted to problematic trapping sets.

Definition 5 (problematic trapping set). A problematic trapping set is a trapping set such that the number of failed check nodes connected to the trapping set is less than or equal to the number of variable nodes contained in the trapping set.

Because trapping sets with small weight and a small number of failed check nodes are often the cause of errors at high SNR [17], it is unlikely that the restriction to problematic trapping sets will eliminate error patterns of interest. In Section 3, it was shown that MS decoding can be used to determine the proportion of variable nodes that are both inside and outside an absorbing set. The same idea is used in this section to find problematic trapping sets using MS decoding. The iterative method given by Algorithm 2 below operates by forcing the trapping set to contain one variable node, and then adding, one-by-one, the variable nodes that decrease the cost of the minimum-cost deviation the most.

Algorithm 2 (Iterative Search for Problematic Trapping Sets).

for i_fixed = 1, ..., N
  Set m_min = infinity.
  Set χ = {i_fixed}.
  while m_min > 0.0
    for i = 1, ..., N
      - Set χ = χ ∪ {i}.
      - Set λ_k = 0.0 for all k ∈ χ.
      - Set λ_k = 1.0 for all k ∈ V \ χ.
      - Perform MS decoding for a fixed number of iterations.
      if min_{j=1,...,N} m_{v_j} < m_min
        - Set m_min = min_{j=1,...,N} m_{v_j}.
        - Set j_min = argmin_{j=1,...,N} m_{v_j}.
      end
      - Set χ = χ ∩ ((V \ {i}) ∪ {i_fixed}).
    end
    - Set χ = χ ∪ {j_min}.
    - Create a binary vector v with v_k = 1 if k ∈ χ, and v_k = 0 if k ∈ V \ χ.
    - Compute the integer syndrome s_int = H v^T (over the integers).
    - Compute the binary syndrome s_bin = H v^T (mod 2), with Hamming weight w_s.
    - Compute the integer vector z = H^T s_bin.
    if min_{k : s_int,k > 0} s_int,k >= 2
      χ is a (|χ|, w_s) stopping set.
    end
    χ is a (|χ|, w_s) trapping set.
    if z_k < d(v_k)/2 for all k ∈ χ
      χ is a (|χ|, w_s) absorbing set.
    end
    if z_k < d(v_k)/2 for all k = 1, ..., N
      χ is a (|χ|, w_s) fully absorbing set.
    end
  end
end

Once Algorithm 2 reaches a cost of m_min = 0, it is possible to discover more problematic trapping sets by removing, one-by-one, the nodes from the set χ whose removal results in the smallest increase in the cost m_min. In order to examine the efficacy of Algorithm 2, the short regular low-density parity-check code given in Example 1 was used. This code was chosen because the Hamming weight of all of its codewords could be enumerated, and thus the minimum distance of the code could be easily computed. Applying Algorithm 2 to this code resulted in a total of 37 trapping sets with parameters shown in Table 3. Note that the proportion of nodes in each set that are included in the minimum-weight deviation is given by "Dev. %". This code was found to have minimum distance equal to 4, so only stopping/trapping/absorbing sets of weight less than or equal to 3 are tabulated.

In order to determine how effective Algorithm 2 is at locating problematic trapping sets, the same low-density parity-check code was simulated using sum-product decoding over the additive white Gaussian noise channel at an SNR of 8.0 dB. A total of 1500 noncodeword outputs with weight less than or equal to 3 were observed during SP decoding after 1000 iterations. It is important to note that the noncodeword outputs were not simply the last quantized output given after 1000 iterations. The output of SP decoding typically changes after each iteration when it does not converge to a codeword. For this reason, the noncodeword outputs were computed by averaging the cost for each variable node over the last 200 iterations to compute a final output cost. This is similar to the method used in [14] for characterizing the changing outputs of the MS decoder.

Table 3 shows the number of observed output errors and compares them to the problematic trapping sets found using Algorithm 2. Approximately 82% of the observed errors corresponded to one of the problematic trapping sets found using Algorithm 2. Also, the number of times a particular problematic trapping set is observed indicates how problematic the set is to the SP decoder. The single most problematic set, resulting in over 60% of the output errors, was a set that satisfies the definitions of a stopping set, an absorbing set, and a fully absorbing set. Errors falling into the "Other" category were highly variable, and no specific output pattern in this category accounted for more than 6 of the total observed errors.

Among the problematic trapping sets found, the stopping set has, not surprisingly, the highest proportion of variable nodes within its corresponding deviation. While the proportions were noticeably different between sets of different sizes, the difference was minimal within sets of the same size. Furthermore, it is difficult to make any connections between the proportions within sets of the same size and their corresponding probability of causing an error. One possible reason for this might be the overlap between the different problematic trapping sets, and between the problematic trapping sets and codewords. For example, the reason that one of the fully absorbing sets did not appear in the simulations might be that two of its variable nodes overlap with the exceptionally problematic stopping set, and thus any significant channel noise received by these two variable nodes may be highly likely to cause the stopping set to be output by the decoder.

It is worth noting that the average value of the received information within the absorbing sets of weight less than or equal to three was close to the erasure value of zero. Recall that a binary 0 is modulated to +1, so the mean value of the noise within the absorbing sets was close to -1. This further justifies the earlier assertion that the AWGN channel behaves more like the BEC than the BSC at high SNR, since an erasure over the BEC can be interpreted as noise of -1 and a bit flip over the BSC can be interpreted as noise of -2.

Algorithm 2 was able to find 82% of the most problematic error patterns of weight less than 4. In order to test the algorithm on an LDPC code with a longer block length, a longer regular LDPC code was used. The resulting output of Algorithm 2 was 1019 absorbing sets, 941 fully absorbing sets, and 1 stopping set, and the sizes of the sets ranged from 3 to 9. Only sets containing fewer than 10 variable nodes were considered problematic, since the code is known to contain a codeword of weight 10. Simulations were performed using SP decoding at an SNR of 5.0 dB. Overall, 40 noncodeword errors were observed, of which 20 were error patterns of weight less than 10. Of these, all were absorbing sets and 19 were fully absorbing sets. Unlike the results given for the shorter code, only two of the 20 absorbing sets were found by Algorithm 2. However, the two that were found were the two smallest absorbing sets observed, both of which were fully absorbing sets.

As expected, the number of trapping sets grows very large when increasing the length and dimension of the code. Algorithm 2 is capable of locating the majority of problematic trapping sets for a short LDPC code, but for the longer LDPC code it was only able to identify the two smallest sets observed during simulations. This is likely due to the fact that the longer code had not yet reached its error floor, as evidenced by the fact that half of the observed error patterns had weight greater than or equal to the weight of a known codeword. Because error floors occur at such low bit error rates for large LDPC codes, it is difficult to observe problematic trapping sets using simulations. Thus, the effectiveness of Algorithm 2 at identifying problematic trapping sets remains unknown for large codes with error floors beyond the reach of simulations.

5. The Weight of Deviations Induced by Problematic Trapping Sets

In [17], Richardson characterizes trapping sets by their size and the number of associated failed check nodes. To find the impact of the trapping sets with respect to probability of error, Richardson uses simulations that force the noise in the trapping set and push the received information away from modulated 0s and towards modulated 1s. The result is an estimate of the probability of error caused by trapping sets at high SNR. In this section, a new method is used to examine the probability of error associated with trapping sets. Instead of using simulations to estimate the probability of error, deviations induced by the trapping set are created to analyze the probability of error. Since bounds on the probability of error can be derived from deviations, if one could prove that minimum-weight deviations were induced by problematic trapping sets and then compute the weights of those deviations, it may be possible to determine the probability of error associated with trapping sets without having to rely on simulations.

It may seem surprising that almost all problematic trapping sets listed in Table 3 result in a deviation where the variable nodes within the trapping set make up the majority of the variable nodes within that deviation. For example, consider a fully absorbing set consisting of only two variable nodes. While there are only two variable nodes contained in the absorbing set, they make up 76% of the variable nodes in a deviation that exists on the computation tree rooted at either variable node within the set. This implies that there might be a way of cleverly designing deviations which contain a disproportionately large number of copies of certain variable nodes. A deviation rooted at one of these variable nodes after 4 iterations is shown in Figure 8. This deviation was designed to include more copies of the two variable nodes in the absorbing set than of the other variable nodes. Although the overall configuration contains 5 different variable nodes, almost 2/3 of the copies in the deviation are copies of the two absorbing-set nodes. Now, consider the subgraph of the Tanner graph defined by these 5 variable nodes, shown in Figure 9. This subgraph defines a stopping set. It was shown in Section 3.1 that, because it is a stopping set, the deviation in Figure 8 could continue to grow indefinitely without the need for variable nodes outside the stopping set.

To construct the deviation in Figure 8, decisions were made at the two check nodes of degree greater than 2. Those decisions are expressed in the directed bipartite graph shown in Figure 10. The number of decisions for each check node is equal to the number of edges incident to that check node. Four of the check nodes each have two copies in the directed graph to preserve the fact that they have bidirectional edges. However, the two remaining check nodes do not simply have bidirectional edges connecting them to each of their incident variable nodes, as demonstrated by the deviation in Figure 8. This comes from the fact that check nodes with degree higher than 2 are still only connected to 2 variable nodes within the deviation. Thus, at each check node with degree greater than 2, a decision is made as to which variable nodes it includes in the deviation.

Using the adjacency matrix of the directed bipartite graph, it is possible to compute the number of copies, or the multiplicity, of each node at each level in the deviation. The method for computing the multiplicity of nodes in the deviation is similar to the method given in [18] for computing the multiplicity of nodes on computation trees. For the directed Tanner graph given in Figure 10, an adjacency matrix A can be written down directly.

The columns and rows of A correspond, in the same order, to the 20 nodes of the directed graph in Figure 10. Beginning with a vector that has a 1 in the position of the root node and 0s elsewhere, the number of nodes on each level of the deviation can be calculated recursively by repeatedly applying A, together with a second matrix that is used to subtract the parent check nodes at each check node level in the deviation; this subtraction matrix is likewise determined directly by the directed Tanner graph in Figure 10.

Using this recursion, the number of variable nodes at different levels in the deviation is given in Table 4. The deviation at level 2i exists on the computation tree rooted at the chosen variable node after i iterations. This deviation contains a large total number of variable nodes, over 63% of which are copies of the two favored variable nodes. Using a method similar to that given in [19], the effective weight of the deviation can be calculated. First, let n_v be the number of copies of variable node v within the deviation, and let the total number of variable nodes in the deviation be sum_v n_v. Under the all-zeros codeword assumption, the cost of the deviation is modeled by the normal distribution

N( (2/σ²) * sum_v n_v , (4/σ²) * sum_v n_v² ),

where σ² is the variance of the AWGN noise. This random variable can be rescaled, resulting in the distribution

N( w , σ² w ),   where   w = ( sum_v n_v )² / sum_v n_v²,

and the mean w is the weight of the deviation. Note that if all n_v were equal, the weight of the set would be equal to the number of distinct nodes it contains, which is consistent with the notion of Hamming weight when the deviation corresponds to a codeword. From Table 4, the weight of the deviation created from the directed Tanner graph in Figure 10 after a large number of iterations is 3.8784. This is less than the Hamming weight of the minimum-distance codeword in the code, which has weight 4.0. Thus, the minimum weight of deviations on the computation tree rooted at this variable node is probably less than the minimum distance of the code.
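Interpreting the rescaled mean above as the usual AWGN effective (pseudo)weight, the weight of a deviation follows directly from the node multiplicities, as in the sketch below; the multiplicities used are purely illustrative and are not the values from Table 4.

def effective_weight(multiplicities):
    """Effective weight of a deviation: (sum n_v)^2 / sum n_v^2.
    Equals the number of distinct nodes when all multiplicities are equal."""
    total = sum(multiplicities)
    return total**2 / sum(n * n for n in multiplicities)

print(effective_weight([10, 10, 3, 3, 3]))   # ~3.70: dominated by two nodes, weight below 4
print(effective_weight([1, 1, 1, 1]))        # 4.0: equal multiplicities, like a weight-4 codeword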

6. Conclusion

Practical methods for predicting and understanding the performance of low-density parity-check codes with iterative decoders are needed in order to avoid the use of codes with error floors. Trapping sets, which include absorbing sets and stopping sets, provide insight into the error mechanisms of iterative decoders but are too imprecise to be used to make design decisions with respect to error floors. Deviations on computation trees are precise and can be used to compute strict upper bounds on the performance of MS and SP decoding, but computing these bounds quickly becomes computationally intractable. The paper examined the connections between trapping sets and their corresponding deviations through the notion of problematic trapping sets in an attempt to find a practical and precise method for predicting the performance of LDPC codes with iterative decoding.

It was shown that the variable nodes in a stopping set can be used to define a deviation, while trapping sets and absorbing sets only define a deviation if a subset of their variable nodes forms a stopping set. When trapping sets and absorbing sets do not include a stopping set, it is necessary to include additional variable nodes in order to construct a corresponding deviation. The number and proportion of variable nodes outside the set that are needed to construct the deviation can be found experimentally using a modification of the MS decoder. This modified MS algorithm leads to an iterative method for identifying low-weight problematic trapping sets in an LDPC code. Simulation results demonstrate that this method is capable of finding many of the low-weight trapping sets that determine the performance of LDPC codes at moderate SNRs. The efficacy of this algorithm is limited by computational constraints.

Finally, an analytical approach for determining the weight of deviations induced by trapping sets on the computation tree was introduced. This approach involves determining the minimum-weight stopping set that contains a given trapping set, and then determining a directed Tanner graph from the stopping set that favors certain variable nodes within the trapping set. It was then shown that the effective weight of the deviation can be found using a recursive method for computing the multiplicity of variable nodes within the deviation. In one example, it was shown that a deviation exists on the computation tree with weight less than the minimum Hamming weight of the code. This result suggests that trapping sets give rise to deviations that satisfy the necessary condition for an error to occur during iterative decoding, and, in certain cases, this condition is satisfied with probability higher than the probability of a maximum-likelihood codeword error.

Acknowledgments

This paper was funded in part by AFOSR Contract FA9550-06-1-0375 and Department of Education Grant no. P200A070344.