Abstract

Edge computing, as an emerging computing paradigm, aims to reduce network bandwidth transmission overhead while storing and processing data on edge nodes. However, the storage strategies required for edge nodes differ from those of existing data centers. Erasure code (EC) strategies have been applied in some decentralized storage systems to ensure the privacy and security of data storage. Product-matrix (PM) regenerating codes (RGCs), a state-of-the-art EC family, are designed to minimize either the repair bandwidth overhead or the storage overhead. Nevertheless, the PM framework involves more finite-field multiplication operations than classical ECs, which heavily consumes computational resources at the edge nodes. In this paper, a theoretical derivation of each step of the PM minimum storage regeneration (PM-MSR) and PM minimum bandwidth regeneration (PM-MBR) codes is performed, and the XOR complexity over finite fields is analyzed. On this basis, a new construction called product bitmatrix (PB) is designed to reduce the complexity of XOR operations in the PM framework, and two heuristics are used to further reduce the number of XORs of the PB-MSR and PB-MBR codes, respectively. The evaluation results show that the PB construction significantly reduces the number of XORs compared to the PM-MSR, PM-MBR, Reed–Solomon (RS), and Cauchy RS codes while retaining optimal performance and reliability.

1. Introduction

Edge computing has emerged as a new paradigm that addresses local computing needs by moving data computation or storage to edge nodes near the end user [1–3]. In contrast to cloud storage built on top of the network, edge computing-based storage is distributed among the edges of the network structure [4, 5]. Consequently, these edge nodes particularly need fault-tolerant techniques to ensure system reliability and availability and, even more importantly, to ensure data privacy and security [6]. Nowadays, whether in a distributed storage system [7] or a decentralized system [8, 9], the storage reliability of the source data must be guaranteed, and these systems have used an erasure code (EC) strategy as opposed to a replication strategy, which has a higher storage overhead [10]. The well-known Reed–Solomon (RS) codes, or more generally, maximum distance separable (MDS) codes, have been applied to many storage systems [11]. In a word, however, an EC is essentially a particular linear combination of data symbols that involves a large number of finite field matrix product operations, and the computational complexity over finite fields is high. There have been many studies on this issue [12–14]. In addition to the problem of high computational complexity over $GF(2^w)$, another problem of ECs is that their repair bandwidth is equal to the size of the entire source data [15]. It has been reported that, based on RS coding, an average of 95,500 coded blocks per day needs to be recovered, and more than 180 TB of data per day must be transmitted through the top-of-rack switches for data recovery [16]. Recently, a new class of ECs called regenerating codes (RGCs) has emerged that trades off storage overhead against repair bandwidth [17]. RGCs maintain the same reliability as ECs, and both are calculated over a finite field. There are two principal RGC classes, namely, minimum storage regenerating (MSR) and minimum bandwidth regenerating (MBR) codes, which correspond to the two extreme points of a trade-off known as the storage-repair bandwidth trade-off. Many frameworks have been designed for MSR and MBR codes, respectively [7, 18–20], but as far as we know, the product-matrix (PM) framework proposed by Rashmi et al. is the only framework that constructs both codes in a unified way [21], and the PM framework provides exact repair of the PM-MSR and PM-MBR codes. While there has been some research on optimizations and extensions based on this framework, such as reducing the disk I/O overhead or dynamically changing the number of helper nodes according to system requests and network changes, these studies have not analyzed and optimized the framework itself [22–24]. In this paper, we focus on the PM framework, analyzing the computational complexity of the encoding, decoding, and repair processes over $GF(2^w)$. The PM framework requires a large number of XORs at each step; in particular, in the decoding process of the PM-MSR code, the computational complexity of inverting the Vandermonde matrix is $O(n^3)$. Therefore, we propose a new construction called product bitmatrix (PB).
Our contributions include the following:
(i) The computational complexity of the PM-MSR and PM-MBR codes is strictly theoretically derived in combination with the finite field arithmetic operations.
(ii) A new construction called product bitmatrix is designed for the MSR and MBR codes, and we elaborate on the encoding, decoding, and repair processes in detail.
(iii) Two new heuristic algorithms are presented that find a locally optimal PB with few XORs, thereby reducing the computational complexity. The experimental results show that the number of XORs drops by 11.73% for the PB-MSR code and 20.17% for the PB-MBR code.

The remainder of this paper is organized as follows. The background of the two codes and the arithmetic complexity analysis over the finite field are presented in Section 2. Section 3 analyzes the computational complexity of the encoding, decoding, and repair processes of the PM framework. In Section 4, the new PB construction is presented and implemented. In Section 5, two different heuristic algorithms are designed to find the locally optimal product bitmatrix with the least number of XORs. The results of the evaluation experiments conducted to verify the performance of the optimization are presented in Section 6. Finally, Section 7 reviews related work, and Section 8 concludes the paper.

2. Foundations

2.1. MSR and MBR Codes

RGCs are state-of-the-art ECs that aim to reduce the amount of data transferred during the repair process, which means a new replacement node needs to connect to any $d$ helper nodes and download $\beta$ symbols from each node. As shown in Figure 1, a comparison between the reconstruction and repair processes is presented. Any $k\alpha$ symbols in an $n$-node stripe are transferred to reconstruct the original $B$ symbols, while the $d\beta$ symbols downloaded for repair are much smaller than the total size $B$. This is shown in the following formula:

$$\gamma = d\beta \ll B \le k\alpha.$$
Since both storage overhead and bandwidth overhead are costs, it is desirable to minimize both $\alpha$ and $\gamma = d\beta$. In [17], when $(n, k, d, B)$ are fixed values, there is a trade-off between $\alpha$ and $\gamma$; the two extreme points in this trade-off are termed the MSR and MBR points. At the minimum storage point, $(\alpha_{\mathrm{MSR}}, \gamma_{\mathrm{MSR}})$ is given by $\left(\frac{B}{k}, \frac{Bd}{k(d-k+1)}\right)$, and at the minimum bandwidth point of the trade-off, $(\alpha_{\mathrm{MBR}}, \gamma_{\mathrm{MBR}})$ is given by $\left(\frac{2Bd}{k(2d-k+1)}, \frac{2Bd}{k(2d-k+1)}\right)$. An RGC over $GF(2^w)$ is associated with a set of parameters $(n, k, d, \alpha, \beta, B)$. A comparison of these metrics between replication, EC, MSR, and MBR is shown in Table 1.

As shown in Table 1, the upper bound (1/2) of the storage efficiency $R_{\mathrm{MBR}}$ of the MBR code is reached when $k = d = n - 1$, but there is no such limitation on $R_{\mathrm{MSR}}$. For the sake of simplicity, if the size of a file is $B = 6$ and $(n, k, d) = (6, 3, 5)$, the repair bandwidth of EC, MSR, and MBR is 6, 3.33, and 2.5, respectively, and the total storage overhead of EC, MSR, and MBR is 12, 12, and 15, respectively. Although the EC and MSR codes use the same storage, the repair bandwidth of EC is higher than that of MSR. The MBR code allows the storage space to expand but keeps the repair bandwidth at the lowest bound. The MSR codes minimize $\gamma$ on the basis of the optimal $\alpha$, while the MBR codes minimize $\alpha$ under the optimal $\gamma$.
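As a worked check of these example values, assuming $(n, k, d) = (6, 3, 5)$ and $B = 6$ as above, the two repair bandwidths follow directly from the trade-off formulas:

$$\gamma_{\mathrm{MSR}} = \frac{Bd}{k(d-k+1)} = \frac{6 \cdot 5}{3 \cdot 3} = \frac{10}{3} \approx 3.33, \qquad \gamma_{\mathrm{MBR}} = \frac{2Bd}{k(2d-k+1)} = \frac{2 \cdot 6 \cdot 5}{3 \cdot 8} = \frac{60}{24} = 2.5,$$

and the total storage overheads are $n\alpha_{\mathrm{MSR}} = 6 \times (6/3) = 12$ and $n\alpha_{\mathrm{MBR}} = 6 \times 2.5 = 15$, matching the values quoted from Table 1.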

2.2. Finite Field Arithmetic Complexity

Since a finite field of prime size is not convenient for computers, a finite field whose size is a power of 2 is generally preferred. In $GF(2^w)$, each element is represented by a formal power series, which is a way of writing sequences of bits. For example, if $a$ and $b$ are bit sequences over the finite field $GF(2^w)$, the addition of $a$ and $b$ can easily be done by bitwise XOR, but multiplication is more complicated, taking $O(l^2)$ bit operations, where $l$ is the length of the sequence. The notation $a(z)$ is used as a formal power series (polynomial) to represent a sequence:

$$a(z) = \sum_{i=0}^{w-1} a_i z^i.$$
The multiplication over $GF(2^w)$ is performed by multiplying two polynomials with coefficients in the binary field and then reducing the product by an irreducible polynomial [25]. Hence, a polynomial represents a field element, and the set of polynomials is denoted by $GF(2)[z]$. The highest power term determines the degree, so $\deg a(z)$ of $a(z)$ is at most $w - 1$, and the number of nonzero terms of a polynomial, the weight of $a(z)$, is denoted as $\mathrm{wt}(a(z))$, which equals the number of ones in the corresponding bit sequence. The addition is a bitwise XOR operation whose complexity is $O(w)$.

For multiplication $c(z) = a(z)b(z)$, where $a(z), b(z) \in GF(2)[z]$, there are two steps whose complexity must be computed: compute the product and reduce. Step 1: in the \emph{product} stage, the cost of this stage is the \emph{coefficients-XOR}; each nonzero term of $a(z)$ contributes one shifted copy of $b(z)$, so the product stage needs at most $(\mathrm{wt}(a(z)) - 1) \cdot \mathrm{wt}(b(z))$ XORs. Step 2: in the second stage, if $\deg c(z) \ge w$, $c(z)$ is reduced by an irreducible polynomial $p(z)$ until $\deg c(z) < w$. Let $l = \deg c(z)$, and convert the term $z^l$ to $z^{l-w}h(z)$. Because the degree of the irreducible polynomial $p(z)$ is $w$, $h(z) = p(z) - z^w$, $\deg h(z) \le w - 1$, and the maximum weight of $h(z)$ is $w - 1$.

Divide $c(z)$ into three polynomials: the no-reduce part $d(z)$, the once-reduce part $e(z)$, and the twice-reduce part $f(z)$, as shown in Table 2. When the degree of the once-reduced terms stays below $w$, the second reduction does not exist.

Because c(z) = d(z) + e(z) + f(z), the reduction part is as follows:

Therefore, the number of XORs of the once-reduce part $e(z)$ is obtained. Then, we compute the number of XORs of the twice-reduce part.

Let the once-reduced terms form a new polynomial $m(z)$. If $\deg m(z) \ge w$, $f(z)$ must also be reduced, yielding a polynomial $n(z)$ with $\deg n(z) < w$. Thus, the twice-reduce part $f(z)$ needs a corresponding number of XORs.

Consequently, the sum of XORs of the field multiplication over $GF(2^w)$, including the two steps and the addition XORs among the three parts, is

Generally, a multiplication operation over $GF(2^w)$ takes $O(w^2)$ bit operations. For example, when $w = 3$, the multiplication results are as shown in Table 3. Because a finite field of size $2^8$ corresponds to a byte in the computer, $w$ is usually set to be an integral multiple of 8 to make encoding operations convenient.
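To make the two stages concrete, the following is a minimal C sketch of multiplication over $GF(2^8)$; the choice of the reduction polynomial $z^8 + z^4 + z^3 + z^2 + 1$ (0x11D) is an assumption for illustration, and the function name gf256_mul is ours rather than from any library. The loop interleaves the product stage (XORing shifted copies of one operand) with the reduce stage (folding the overflow bit back by the irreducible polynomial):

#include <stdint.h>

/* Sketch of multiplication over GF(2^8), assuming the irreducible
 * polynomial z^8 + z^4 + z^3 + z^2 + 1 (0x11D). */
static uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    uint8_t product = 0;
    for (int i = 0; i < 8; i++) {
        if (b & 1)                 /* if the current coefficient of b is 1, */
            product ^= a;          /* XOR in the shifted copy of a          */
        uint8_t carry = a & 0x80;  /* bit about to overflow past degree 7   */
        a <<= 1;
        if (carry)
            a ^= 0x1D;             /* reduce: z^8 = z^4 + z^3 + z^2 + 1     */
        b >>= 1;
    }
    return product;
}

Every iteration costs only shifts and XORs, which is exactly the coefficients-XOR accounting described above.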

Matrix inversion is often used in EC decoding and repair processes. Suppose $A$ is an $n \times n$ matrix; then $A^{-1} = \frac{1}{\det(A)}\mathrm{adj}(A)$, and the determinant of $A$ is solved by reduction to a triangular matrix. The elements in the solution process include additions and multiplications, which require $O(n^3 w^2)$ XORs. The adjoint matrix $\mathrm{adj}(A)$ is an $n \times n$ matrix composed of algebraic cofactors, in which each element equals the determinant of $A$ with row $i$ and column $j$ excluded; computing it therefore requires a further number of XORs for the $n^2$ cofactor determinants.

Consequently, the total number of XORs of matrix inversion is the sum of the determinant and adjoint computations above.
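As a concrete illustration of where these XORs arise, the following is a minimal C sketch of Gauss-Jordan inversion over $GF(2^8)$ (an alternative to the determinant/adjoint route above; in practice a library routine would be used). The helpers gf256_mul (the sketch above) and gf256_inv (a multiplicative-inverse routine) are assumed; subtraction over $GF(2^w)$ equals XOR, so the elimination updates use ^=:

#include <stdint.h>
#include <string.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);  /* sketch from above      */
uint8_t gf256_inv(uint8_t a);             /* assumed helper routine */

/* Invert the n x n matrix a (row-major) into inv by Gauss-Jordan
 * elimination over GF(2^8). Returns 0 on success, -1 if singular. */
int gf256_matrix_invert(uint8_t *a, uint8_t *inv, int n)
{
    memset(inv, 0, (size_t)n * n);
    for (int i = 0; i < n; i++)
        inv[i * n + i] = 1;                /* inv starts as the identity */

    for (int col = 0; col < n; col++) {
        int pivot = -1;                    /* find a nonzero pivot entry */
        for (int r = col; r < n; r++)
            if (a[r * n + col]) { pivot = r; break; }
        if (pivot < 0)
            return -1;                     /* matrix is singular */
        if (pivot != col)                  /* swap the pivot row up */
            for (int j = 0; j < n; j++) {
                uint8_t t = a[col * n + j];
                a[col * n + j] = a[pivot * n + j]; a[pivot * n + j] = t;
                t = inv[col * n + j];
                inv[col * n + j] = inv[pivot * n + j]; inv[pivot * n + j] = t;
            }
        uint8_t s = gf256_inv(a[col * n + col]);
        for (int j = 0; j < n; j++) {      /* scale pivot row to make pivot 1 */
            a[col * n + j] = gf256_mul(a[col * n + j], s);
            inv[col * n + j] = gf256_mul(inv[col * n + j], s);
        }
        for (int r = 0; r < n; r++) {      /* eliminate column from other rows */
            if (r == col || !a[r * n + col]) continue;
            uint8_t f = a[r * n + col];
            for (int j = 0; j < n; j++) {
                a[r * n + j] ^= gf256_mul(f, a[col * n + j]);
                inv[r * n + j] ^= gf256_mul(f, inv[col * n + j]);
            }
        }
    }
    return 0;
}

The triple loop performs $O(n^3)$ field multiplications, each costing $O(w^2)$ XORs, consistent with the count above.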

3. The Computational Complexity of PM-MSR and PM-MBR

The famous PM-RGC construction provides MBR constructions for all feasible values of $(n, k, d)$ and MSR constructions for all $d \ge 2k - 2$ [21]. A detailed description of the complexity of the PM-MSR and PM-MBR codes is presented in this section, and the matrices used in the framework are listed in Table 4.

For an MSR code with $\beta = 1$, we have $d = 2k - 2$, $\alpha = k - 1$, and $B = k(k - 1)$ over $GF(2^w)$. The original symbols are arranged in the form of the $d \times \alpha$ message matrix $M = \begin{bmatrix} S_1 \\ S_2 \end{bmatrix}$, where $S_1$ and $S_2$ are symmetric $\alpha \times \alpha$ matrices whose upper-triangular parts are filled by the symbols.

Encoding is constructed using the $n \times d$ encoding matrix $\Psi = [\Phi \;\; \Lambda\Phi]$, where each element is chosen to satisfy the following properties: (1) any $d$ rows of $\Psi$ are linearly independent; (2) any $\alpha$ rows of $\Phi$ are linearly independent; and (3) the $n$ diagonal elements of $\Lambda$ are distinct. Let the $i$-th row of $\Psi$ be $\psi_i^t$, the $i$-th row of $\Phi$ be $\phi_i^t$, and the $i$-th diagonal element of $\Lambda$ be $\lambda_i$. The $\alpha$ coded symbols stored by the $i$-th node are given by $\psi_i^t M$. The complexity of encoding equals the sum of the addition and multiplication XORs of the $n \times d$ by $d \times \alpha$ matrix product; each node's data encoding needs $d\alpha$ multiplications and $(d-1)\alpha$ additions.
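Since the encoding step is a plain matrix product over the field, the following minimal C sketch (using the hypothetical gf256_mul from Section 2.2) computes $C = \Psi M$, so that row $i$ of $C$ holds the $\alpha$ coded symbols $\psi_i^t M$ stored on node $i$:

#include <stdint.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);  /* sketch from Section 2.2 */

/* PM encoding sketch: psi is n x d, m is d x alpha, c is n x alpha,
 * all row-major over GF(2^8). Addition over GF(2^w) is XOR. */
void pm_encode(const uint8_t *psi, const uint8_t *m, uint8_t *c,
               int n, int d, int alpha)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < alpha; j++) {
            uint8_t acc = 0;
            for (int t = 0; t < d; t++)
                acc ^= gf256_mul(psi[i * d + t], m[t * alpha + j]);
            c[i * alpha + j] = acc;       /* j-th symbol stored on node i */
        }
}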

Decoding is to reconstruct the $B$ original symbols. This process has three steps. Let $\Psi_{DC} = [\Phi_{DC} \;\; \Lambda_{DC}\Phi_{DC}]$ be the submatrix of $\Psi$ corresponding to the $k$ connected nodes; the DC then obtains the encoded data $\Psi_{DC}M$. (1) Right-multiply by $\Phi_{DC}^{t}$; the result $\Psi_{DC}M\Phi_{DC}^{t}$ is a $k \times k$ matrix. This step has the XORs of one matrix product. (2) Set $P = \Phi_{DC}S_1\Phi_{DC}^{t}$ and $Q = \Phi_{DC}S_2\Phi_{DC}^{t}$. Because $S_1$ and $S_2$ are symmetric, $P$ and $Q$ are symmetric, and $\Psi_{DC}M\Phi_{DC}^{t} = P + \Lambda_{DC}Q$. Since the $(i, j)$-th element is $P_{ij} + \lambda_i Q_{ij}$ and the $(j, i)$-th element is $P_{ij} + \lambda_j Q_{ij}$, with $\lambda_i \neq \lambda_j$, each off-diagonal position yields a solvable pair of equations. This step involves a $2 \times 2$ matrix inversion; the number of XORs to compute a pair $(P_{ij}, Q_{ij})$ is 16 + 8. Consequently, the DC solves the values of $P_{ij}$ and $Q_{ij}$ for all $i \neq j$. (3) After the off-diagonal elements of $P$ are known, the $i$-th row of $P$ (excluding the diagonal element) is given by $\phi_i^{t}S_1\Phi_{DC}^{t}$, and $\phi_i^{t}S_1$ is recovered through the right product with the corresponding inverse; stacking these rows recovers $S_1$. $S_2$ is similarly recovered from $Q$. The number of XORs in this stage is counted accordingly.

For example, if $k = 3$, the matrices are as follows:

The sum of XORs during decoding is the total of the three steps above.

Repair is to regenerate the coded symbols on a failed node $f$. Thus, the aim is to reconstruct $\psi_f^t M$, where $\psi_f^t = [\phi_f^t \;\; \lambda_f\phi_f^t]$.

There are two roles in this stage: helper ($h$) and replacement ($r$) nodes, respectively. At first, the $i$-th helper node computes the inner product $\psi_i^t M \phi_f$ and transmits this value to the $r$ node. The $r$ node then obtains the vector $\Psi_{rep}\begin{bmatrix} S_1\phi_f \\ S_2\phi_f \end{bmatrix}$, where $\Psi_{rep}$ is the $d \times d$ submatrix of $\Psi$ corresponding to the helpers. Because $\Psi_{rep}$ is invertible, the vector $[S_1\phi_f; S_2\phi_f]$ is recovered. Since $S_1$ and $S_2$ are symmetric matrices, $(S_1\phi_f)^t = \phi_f^t S_1$ and $(S_2\phi_f)^t = \phi_f^t S_2$.

The content $\phi_f^t S_1 + \lambda_f\phi_f^t S_2$ is eventually recovered. The process of repair is not complicated. The number of XORs on each $h$ node is that of one length-$\alpha$ inner product, and the total follows over the $d$ helpers. The computational complexity on $r$ includes the matrix inversion, the matrix product, and the coefficient product and is represented as follows:

All feasible values of $(n, k, d)$ are supported by MBR codes with $\beta = 1$, $\alpha = d$, and $B = kd - \binom{k}{2}$ over $GF(2^w)$. The original symbols are arranged in the form of the $d \times d$ symmetric message matrix $M = \begin{bmatrix} S & T \\ T^t & 0 \end{bmatrix}$, where the upper-triangular part of the $k \times k$ symmetric matrix $S$ is filled by $\binom{k+1}{2}$ symbols, and the remaining symbols are used to construct the $k \times (d - k)$ matrix $T$.

Encoding is to obtain the coding matrix $C$ by the matrix product of $\Psi$ and $M$. Define the encoding matrix in the form $\Psi = [\Phi \;\; \Delta]$, where each element is chosen in such a way that (1) any $d$ rows of $\Psi$ are linearly independent and (2) any $k$ rows of $\Phi$ are linearly independent. Then, $C$ is given by $\Psi M$. During this process, the sum of XORs is that of the $n \times d$ by $d \times d$ matrix product.

Decoding means the DC recovers the $B$ original symbols by connecting to any $k$ nodes. Because the $i$-th node stores the vector $\psi_i^t M$, the DC gets the matrix $\Psi_{DC}M = [\Phi_{DC}S + \Delta_{DC}T^t \;\; \Phi_{DC}T]$. Thus, multiplying $\Phi_{DC}^{-1}$ on the left of $\Phi_{DC}T$ first recovers $T$, with the number of XORs of one inversion and one product. Subsequently, compute $\Phi_{DC}S = \Psi_{DC}M - \Delta_{DC}T^t$ and then multiply $\Phi_{DC}^{-1}$ on the left to recover $S$; the XOR count of this step is obtained in the same way.

Repair recovers the symbols stored on the failed node $f$ to a replacement node by connecting to an arbitrary set of $d$ helper nodes. First, each helper $i$ computes the inner product $\psi_i^t M \psi_f$ and sends the result to the replacement node. The replacement node thus obtains the $d$ symbols $\Psi_{rep}M\psi_f$ and then multiplies on the left by $\Psi_{rep}^{-1}$ to recover the vector $M\psi_f$. Since $M$ is symmetric, $(M\psi_f)^t = \psi_f^t M$. The XOR count of repair on the $i$-th helper node is that of one length-$d$ inner product, and the replacement node additionally pays for the $d \times d$ inversion and product.
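The per-helper work in this repair is a single length-$d$ inner product over the field; a minimal C sketch, again using the assumed gf256_mul from Section 2.2:

#include <stdint.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);  /* sketch from Section 2.2 */

/* Helper node i stores the d symbols psi_i^t * M; for the repair of
 * node f it sends the single symbol psi_i^t * M * psi_f, i.e., the
 * inner product of its stored vector with psi_f (addition is XOR). */
uint8_t mbr_helper_symbol(const uint8_t *stored, const uint8_t *psi_f, int d)
{
    uint8_t acc = 0;
    for (int t = 0; t < d; t++)
        acc ^= gf256_mul(stored[t], psi_f[t]);
    return acc;
}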

The computational complexity of the PM-MSR and PM-MBR codes is shown in Table 5. As seen from Table 5, the PM framework requires multiple matrix inversions, and the number of XORs in the decoding process is very high.

4. The New Product Bitmatrix Constructed by Cauchy Matrix

In this section, we first introduce the process of converting finite field elements into bitmatrices. Then, we construct a new encoding matrix $\Psi$ with the Cauchy matrix and transform this matrix into a bitmatrix called the PB. The encoding, decoding, and repair processes of the MSR and MBR codes based on the PB are introduced in detail.

4.1. Transforming Finite Field Elements Using Bitmatrix

Through the analysis of the finite field arithmetic in the previous section, each element $e$ in $GF(2^w)$ is represented as a formal power series, and the length of these polynomials is $w$. In [26], the authors described how a row vector $V$ or a matrix $M$ represents an element over $GF(2^w)$ in a new representation over $GF(2)$. For any element $e$, use $M(e)$ as the $w \times w$ matrix whose $i$-th column is the bit vector of $e \cdot z^i$; $M(1)$ is the identity matrix, and $M(0)$ is the all-zero matrix. For example, Figure 2 shows the bitmatrices over $GF(2^3)$. The bitmatrix of $e = 3$ has 011 as its 1st column, 110 as its 2nd column, and 111 as its 3rd column. There are two forms of multiplication of bitmatrices, as shown in Figure 3. Using the bitmatrix representation, the encoding and decoding of PB-MSR and PB-MBR are accomplished by XOR operations, together with some copying operations.
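A minimal C sketch of this expansion over $GF(2^8)$ follows; the function name is ours, and column $c$ of $M(e)$ is computed as the bit vector of the field product $e \cdot z^c$ using the gf256_mul sketch from Section 2.2:

#include <stdint.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);  /* sketch from Section 2.2 */

/* Expand a GF(2^8) element e into its 8 x 8 bitmatrix M(e): column c
 * is the bit vector of e * z^c. bits[r][c] holds row r, column c. */
void element_to_bitmatrix(uint8_t e, uint8_t bits[8][8])
{
    uint8_t col = e;                      /* e * z^0 */
    for (int c = 0; c < 8; c++) {
        for (int r = 0; r < 8; r++)
            bits[r][c] = (col >> r) & 1;  /* bit r of this column */
        col = gf256_mul(col, 2);          /* next column: multiply by z */
    }
}

With this representation, multiplying a data word by $e$ becomes, row by row, XORs of the data bits selected by the ones of $M(e)$.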

In paper [21], the authors adopted the classical Vandermonde construction for $\Psi$, but the inverse of an $n \times n$ Vandermonde-based matrix needs time of complexity $O(n^3)$. One well-known choice is to use the Cauchy matrix, whose inverse has a time complexity of $O(n^2)$. Let $X = \{x_1, \ldots, x_k\}$ and $Y = \{y_1, \ldots, y_m\}$, where the $x_i$ and $y_j$ are distinct elements over $GF(2^w)$ and $X \cap Y = \emptyset$. Then, the Cauchy matrix defined by $X$ and $Y$ has $\frac{1}{x_i + y_j}$ in element $(i, j)$. It is clear that any submatrix of a Cauchy matrix is still a Cauchy matrix. If a Cauchy matrix is used as $\Psi$ and converted to a bitmatrix, the number of ones in the bitmatrix determines the number of XOR operations in encoding, and $o$ is the average number of ones per row in the matrix. Choosing different $X$ and $Y$ will give different $o$. Shown in Figure 4 are the instances over the finite field GF(8) of the PB-MSR code with $(n, k, d) = (5, 3, 4)$ and the PB-MBR code with $(n, k, d) = (4, 2, 3)$. When $\Psi$ is constructed by a Cauchy matrix, it is transferred to the bitmatrix form.
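Constructing the Cauchy matrix itself costs one field inversion per entry; a minimal C sketch, assuming a gf256_inv helper for the multiplicative inverse:

#include <stdint.h>

uint8_t gf256_inv(uint8_t a);  /* assumed multiplicative-inverse helper */

/* Build the k x m Cauchy matrix with entry (i, j) = 1 / (x_i + y_j)
 * over GF(2^8); addition is XOR, so X and Y must be disjoint sets of
 * distinct elements to keep every denominator nonzero. */
void cauchy_matrix(const uint8_t *x, int k, const uint8_t *y, int m,
                   uint8_t *out)
{
    for (int i = 0; i < k; i++)
        for (int j = 0; j < m; j++)
            out[i * m + j] = gf256_inv(x[i] ^ y[j]);
}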

4.2. New Product Bitmatrix of MSR Codes

For the PB-MSR code in the upper part of Figure 4, $X = \{1, 2\}$ and $Y = \{0, 3, 4, 5, 6\}$ are used to construct the $n \times \alpha$ matrix $\Phi$, and $\Lambda$ is an $n \times n$ diagonal matrix. Because $\Phi$ is a Cauchy matrix, any $d$ rows of $\Psi$ are linearly independent, any $\alpha$ rows of $\Phi$ are linearly independent, and the $n$ diagonal elements of $\Lambda$ are distinct. According to these three conditions, the encoded matrix $C = \Psi M$ is generated, and each row of coded symbols in $C$ is stored on the corresponding node, which means the symbols of the $i$-th row are stored on the $i$-th node.

In the reconstruction scenario, the DC links any three nodes and downloads $k\alpha = 6$ symbols.

According to the decoding process introduced in Section 3, the content of the original symbols is decoded. When node 5 fails, a replacement node $r$ can exactly regenerate its symbols by connecting to any $d = 4$ helpers (nodes 1, 2, 3, 4 in the example). Each helper computes its inner product in bitmatrix form, as shown in Figure 5, and sends the result to the replacement node $r$. Then $r$ calculates equation (17) to obtain the intermediate results:

Then, $r$ performs the calculation according to the number of helper nodes, as follows.

Use $\Psi_{rep}^{-1}$ to get the vector $[S_1\phi_5; S_2\phi_5]$.

After recalculating $\phi_5^t S_1 + \lambda_5\phi_5^t S_2$, the content of node 5 is recovered.

However, finding a suitable $\Lambda$ by random enumeration based on the third condition is not reasonable. Thus, we present a simple heuristic to create a better $\Lambda$.

Lemma 1. In a finite field, any element $a$ (except 0) has a unique multiplicative inverse $a^{-1}$, so the formula $a \cdot a^{-1} = 1$ holds.

If $X = \{x_1, \ldots, x_\alpha\}$ and $Y = \{y_1, \ldots, y_n\}$, where the $x_i$ and $y_j$ are distinct elements over $GF(2^w)$, then a Cauchy matrix is used as $\Phi$:

Because every element $\frac{1}{x_i + y_j}$ of $\Phi$ is distinct and the multiplicative inverses are also distinct according to Lemma 1, the sums $x_i + y_j$ are all distinct as well. Then, the square of the multiplicative inverse of the element in the $i$-th row and $j$-th column of $\Phi$ is used to construct the diagonal matrix $\Lambda$ and $\Psi$ as follows:

Each column of $\Lambda\Phi$ is linearly independent. Nevertheless, for every operation over the finite field $GF(2^w)$, when $w$ and the primitive polynomial change, the results of all operations change. Hence, matrix inversion is needed to determine whether the current structure is invertible. If the current $\Psi$ has an irreversible submatrix, each element of $\Lambda$ is changed by multiplying by a power of the primitive element to obtain a completely different new element. Since matrix inversions occur only a few times during encoding, decoding, and repair, while matrix products occur countless times, it is important to optimize the number of XORs of the matrix. The other details are covered in the next section.

4.3. New Product Bitmatrix of MBR Codes

For the lower part of Figure 4, $X = \{0, 1, 2\}$ and $Y = \{3, 4, 5, 6\}$ are combined to construct the matrix $\Psi = [\Phi \;\; \Delta]$ of PB-MBR. Because $\Psi$ is a Cauchy matrix, where $\Phi$ and $\Delta$ are $4 \times 2$ and $4 \times 1$ matrices, any $d$ rows of $\Psi$ are linearly independent and any $k$ rows of $\Phi$ are linearly independent. As shown in the example, each node stores the vector $\psi_i^t M$, such as

When decoding the original symbols, the DC connects any two nodes and downloads their data, where

Then, $T$ is recovered by multiplying $\Phi_{DC}^{-1}$ on the left.

Subsequently, S is recovered.

For example, if node 4 fails, to accurately regenerate the coded symbols on the $r$ node, it is required to connect to any three helpers (nodes 1, 2, 3 in the example). Each helper computes its inner product in bitmatrix form and sends the result to the replacement node. The $r$ node obtains $d$ vectors and multiplies on the left by $\Psi_{rep}^{-1}$, where

Then, the vector $\psi_4^t M$, which represents the lost symbols on node 4, is recovered.

5. Optimization of PB Framework Based on Minimizing the Number of Ones

Arbitrary Cauchy matrices with different $X$ and $Y$ bring various numbers of ones, and the impact on performance is significantly different [27]. It is necessary to find a better bitmatrix, that is, one with fewer ones. Therefore, in this section, we optimize the two new constructions of $\Psi$ by two new heuristic algorithms.

5.1. Optimizing the Number of Ones in Bitmatrix of PB-MSR

It is easy to know the number of ones of each element over $GF(2^3)$. Figure 2 shows that elements 1–7 have numbers of ones represented by the array [3, 4, 7, 5, 4, 7, 6]. A bitmatrix with fewer ones reduces the encoding computation, and element 1 has the fewest ones.
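Counting the ones of an element's bitmatrix is the primitive that both heuristics below rely on; a minimal C sketch over $GF(2^8)$ (the $GF(2^3)$ array above is the same computation at $w = 3$):

#include <stdint.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);  /* sketch from Section 2.2 */

/* Ones in the 8 x 8 bitmatrix of e: column c of the bitmatrix is the
 * bit vector of e * z^c, so summing the popcounts of the successive
 * field products e, e*z, e*z^2, ... gives the total. */
int bitmatrix_ones(uint8_t e)
{
    int ones = 0;
    uint8_t col = e;
    for (int c = 0; c < 8; c++) {
        for (uint8_t b = col; b; b >>= 1)  /* popcount of this column */
            ones += b & 1;
        col = gf256_mul(col, 2);           /* next column: multiply by z */
    }
    return ones;
}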

For each column of the bitmatrix $\Phi$ of PB-MSR: (1) count the number of ones; (2) divide each element of the column by the element in row $i$ to get a new column; (3) repeat the first two steps $n$ times and obtain the best column, which has the minimum number of ones.

Then, repeat these steps for all columns to get a new matrix, called $\Phi'$, which has a minimal number of ones as well as the invertibility property (a sketch of this column step is given below).
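A minimal C sketch of this column step follows, using the hypothetical helpers gf256_inv and bitmatrix_ones introduced above and assuming $n \le 256$ for the stack buffers; dividing a column by one of its nonzero entries preserves the linear independence of the columns:

#include <stdint.h>
#include <string.h>

uint8_t gf256_mul(uint8_t a, uint8_t b);
uint8_t gf256_inv(uint8_t a);
int bitmatrix_ones(uint8_t e);

/* Divide the column by each of its nonzero entries in turn and keep
 * the variant whose bitmatrix form has the fewest ones. */
void optimize_column(uint8_t *col, int n)
{
    uint8_t best[256], cand[256];
    int best_ones = 0;
    for (int r = 0; r < n; r++)
        best_ones += bitmatrix_ones(col[r]);   /* cost of current column */
    memcpy(best, col, (size_t)n);

    for (int i = 0; i < n; i++) {
        if (!col[i]) continue;                 /* cannot divide by zero */
        uint8_t inv = gf256_inv(col[i]);
        int ones = 0;
        for (int r = 0; r < n; r++) {
            cand[r] = gf256_mul(col[r], inv);  /* divide row r by col[i] */
            ones += bitmatrix_ones(cand[r]);
        }
        if (ones < best_ones) {                /* keep the sparsest variant */
            best_ones = ones;
            memcpy(best, cand, (size_t)n);
        }
    }
    memcpy(col, best, (size_t)n);
}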

Next, it is necessary to find a $\Lambda$ that yields a minimal number of ones. Let $D$ denote the set of candidate diagonal matrices, where each candidate is formed from the $n$ multiplicative inverses of a column of $\Phi'$. For each $\Lambda$ in $D$, do the following: (1) construct $\Psi = [\Phi' \;\; \Lambda\Phi']$; (2) determine whether the new $\Psi$ is invertible: if it is, continue to the next step; if it is irreversible, discard this candidate; (3) count the number of ones of $\Lambda\Phi'$; (4) repeat the first three steps, and whichever candidate gives the minimal number of ones, form $\Psi$ with that $\Lambda$.

Algorithm 1 outlines the procedure to generate the set of candidates and the process to find the best matrix.

(1) ones(·) is the number of ones of a matrix;
(2) colones(·) is the number of ones of a column;
(3) Use the Cauchy matrix to construct Φ and count ones(Φ);
(4) for each column c of Φ do
(5)  for each row i of Φ do
(6)   for each row j of Φ do
(7)    Calculate Φ(j, c)/Φ(i, c);
(8)    Count colones of this new column;
(9)   end for
(10) end for
(11) Choose the column with min colones as the new column c of Φ′;
(12) end for
(13) Construct Φ′ in which each column has min colones;
(14) Count ones(Φ′);
(15) for each column c of Φ′ do
(16) for each row i of Φ′ do
(17)  λ_i = Φ′(i, c)^{-2};
(18)  for each column j of Φ′ do
(19)   Multiply column j by Λ to form ΛΦ′;
(20)  end for
(21) end for
(22) if Ψ = [Φ′ ΛΦ′] is invertible then
(23)  Count ones(ΛΦ′);
(24) else
(25)  repeat steps 16–26 after multiplying by a power of the primitive element;
(26) end if
(27) end for
(28) Choose the best Λ which has the min ones;

For example, as shown in Figure 6, $\Psi$ has 22 + 22 + 22 + 24 = 90 ones, but the optimized $\Psi$ has 18 + 18 + 18 + 18 = 72 ones; measured in XORs per row, this is a 23% performance improvement.

5.2. Optimizing the Number of Ones in Bitmatrix of PB-MBR

For the bitmatrix $\Psi$ of PB-MBR, let $D$ ($d$ is the number of helper nodes) denote a set of $d$ diagonal matrices, where each one is formed from the $n$ multiplicative inverses of one column of $\Psi$. For each $D_c$, do the following: (1) compute $D_c\Psi$; (2) count the number of ones of the new $D_c\Psi$; (3) repeat the first two steps $d$ times and obtain the best $D_c\Psi$.

After one column of $\Psi$ has become all element 1, Algorithm 2 does the same for the remaining $d - 1$ columns: each column is optimized by dividing by a value in that column so that the number of ones in the current column is minimized. The detailed process is written in Algorithm 2.

(1) ones(·) is the number of ones of a matrix
(2) colones(·) is the number of ones of a column
(3) Use the Cauchy matrix to construct Ψ and count ones(Ψ)
(4) D is the set of diagonal matrices, and each matrix is formed from the multiplicative inverses of one column of Ψ
(5) for each D_c ∈ D do
(6)  Compute D_cΨ
(7)  Count ones(D_cΨ)
(8) end for
(9) Choose the best D_cΨ which has the min ones
(10) The column of the matrix whose entries are all 1 is column c
(11) The set R includes all columns except column c
(12) for each column in R do
(13)  for each row i do
(14)   for each row j do
(15)    Calculate Ψ(j, ·)/Ψ(i, ·)
(16)   end for
(17)   Count colones of this new column
(18)  end for
(19)  Choose the column with min colones as the new column
(20) end for
(21) Choose the columns with the min colones to construct Ψ′
(22) Count ones(Ψ′)

For example, as shown in Figure 7, the original $\Psi$ has 24 + 19 + 21 = 64 ones, but the optimized $\Psi$ only has 12 + 18 + 18 = 48 ones; measured in XORs per row, this is a 30.7% performance improvement.

6. Experiment

In this section, we have implemented the new PB-MSR and PB-MBR codes in C/C++ and employed the Jerasure 2.0 [28] and GF-Complete [29] libraries for finite field arithmetic operations. All evaluations over $GF(2^w)$ concern the reduction of XORs and the encoding, decoding, and repair performances. All tests were conducted on macOS Catalina with an 8-core Intel Core i9 at a 2.3 GHz clock speed and 16 GB of 2667 MHz memory. In Section 6.1, we analyze the improvement in the number of XORs achieved by the new PB construction. In Sections 6.2–6.4, we report on experiments about encoding, decoding, and repair performances. The encoding performance = total encoded file size/encoding time, and the decoding performance = total decoded file size/decoding time. Finally, the influence of the finite field size is analyzed. All experimental results are average values after the maximum and minimum values have been removed.

6.1. Analyzing New Product Bitmatrix

To better understand the effects of the new PB construction in speeding up the encoding computation, evaluation experiments were designed on reducing the number of XORs, and the results are reported in Table 6. The baseline is based on the original Cauchy matrix without any additional optimization. The original PB-MSR and original PB-MBR columns each have two subcolumns of data, which are the sums of XORs of Cauchy matrices formed by different combinations of $X$ and $Y$. Since the original PB-MBR code is constructed from the original Cauchy matrix, the two subcolumns before PB-MBR are the actual values of the original Cauchy matrices. The latter two columns are the reduction of PB-MSR and the reduction of PB-MBR, respectively. Improvement is measured in terms of the decrease in XORs. The last row of the table gives the average improvement over all the tested parameters.

Since the combinations of $X$ and $Y$ differ and the construction process of the PB-MSR product matrix is also more complicated, each change brings completely different results, so this experiment only observes the overall trend. The size of the finite field depends on the size of computer bytes, i.e., $2^8$, $2^{16}$, and $2^{32}$. And the choice of $(n, k, d)$ needs to satisfy $k \le d \le n - 1$ with $d = 2k - 2$. It can be seen from Table 6 that PB-MSR provides an 11.73% reduction in the number of XORs and PB-MBR provides a 20.17% reduction. Because the optimization of PB-MBR is an overall optimization of the matrix $\Psi$, its degree of optimization is greater. Figure 8 shows the XOR change curves. The two line graphs are the original PB-MSR code and PB-MBR code, which are the baselines, and the optimized numbers of XORs of the two codes are lower than the values of the line graphs.

6.2. Encoding Performance Evaluation

The RS and Cauchy codes use the Vandermonde and Cauchy matrices, which are the most classic experimental benchmarks for erasure codes. Figure 9 shows the comparison results of the encoding performance of the PB-MSR, PM-MSR, PB-MBR, PM-MBR, RS, and Cauchy codes when the total number of coded blocks is fixed and the redundancy rate r equals 2, 2.5, or 3. Cauchy-good is the optimized Cauchy-matrix encoding scheme provided by the Jerasure 2.0 library. Since encoding performance is directly related to the number of XORs, when the redundancy rate r is at its maximum, the coding matrix of any coding scheme is smallest. Therefore, when r = 3, all encoding performances are optimal. With a decrease of r, the multiplicative operations in finite fields increase gradually, and all coding rates decrease. It can be seen from the figure that as r becomes smaller, the PB becomes larger and more optimization time is required. Therefore, the coding rate of the PB-MSR and PB-MBR codes decreases faster than that of RS or Cauchy. According to the analysis in the previous section, different Cauchy matrices generate different optimization results, so the rise of the PB-MBR code does not have the same proportions in Figures 9(a) and 9(b). However, the PB-MSR encoding rate is not as fast as that of the PM-MSR code, which uses the Vandermonde matrix directly, when the coded object is small. This is because it requires more time to generate the optimized bitmatrix. The encoding rate of the PB-MBR code is lower than that of PB-MSR because more redundancy needs to be written to disk during the encoding process of the PB-MBR code. From Figure 9(b), it can be seen that with an increase of coding objects, the proportion of time to generate or optimize the matrix is relatively small, so the rates of all coding strategies increase.

6.3. Decoding Performance Evaluation

In the decoding/reconstruction experiments, the decoding performances for various values of r and fixed n were tested for all coding schemes. The decoding performance during reconstruction was measured in terms of the amount of data decoded per unit time. Figure 10 shows the comparison results. It can be observed that the decoding performances of all codes improve with increasing redundancy r. From Figure 10(a), we see that the decoding performance of the PB-MSR code increased from 186.72 MB/s when r = 2 to 300.27 MB/s when r = 3. The PB-MBR code increased by 11.95% from r = 2 to r = 3. From Figure 10(b), it can be seen that from r = 2 to r = 3, the decoding performance of the PB-MSR code increased by 48.26% and that of the PB-MBR code increased by 14.78%. Since the computational complexity of PB-MSR and PB-MBR decoding is closely related to k, the smaller k is, the lower the computational complexity and the faster the decoding performance. Although the PB-MBR codes are similar to the PB-MSR codes, they are simpler and therefore have a higher decoding rate. As can be seen from Figure 10, the PB-MSR decoding performance is about 17.35% higher than that of PM-MSR, and PB-MBR is about 15.29% higher than PM-MBR.

6.4. Repair Performance Evaluation

Figure 11 shows the data transferred across the network to repair one failed node. It can be seen that both the PB-MSR and PB-MBR codes have significantly lower data transfer than RS codes. As can be seen from the bar chart in Figure 11(a), when the encoding object is 5 MB, the classical RS code needs 1.5 times the data transfer overhead of the PB-MSR code and 2.25 times that of the PB-MBR code. It can also be observed from Figure 11(a) that when the redundancy ratio r is fixed, increasing the number of helpers d results in a smaller amount of data transferred across the network. When d is increased in Figure 11(a), the data transfer overheads of the PB-MSR code and PB-MBR code decreased by 35.74% and 16.22%, respectively, compared to the smaller d.

Additionally, the repair performances were tested as shown in Figure 12. From the overall trend, while the value of n is fixed, as the redundancy rate increases, k gradually decreases and the unit storage capacity of each node increases; disk I/O overhead thus increases, and the overall repair rate decreases. In contrast to Figure 11, although PB-MSR has a larger network transmission overhead than PB-MBR, its unit storage capacity $\alpha$ is always the smallest, so the data that need to be written to disk are also the smallest. Besides, based on the calculation and analysis of the repair complexity in Table 5, it can be seen that PB-MSR has lower computational complexity than PB-MBR, so its repair rate is slightly higher.

6.5. The Effect of Finite Field on Coding Performance

Based on the analysis thus far, we know that the complexity of the encoding, decoding, or repair process is related to the finite field size $2^w$, which means every symbol is in $GF(2^w)$. Therefore, the experiment in this section analyzes the influence of the finite field size on the performance of ECs. The values of (n, k, d) were fixed, and the performances at w = 8, w = 16, and w = 32 were tested as shown in Figure 13. Figures 13(a), 13(b), and 13(c) are the comparison diagrams of encoding, decoding, and repair performances, respectively. From these figures, we see that as the finite field size gradually increases, the computational complexity increases and the rates decline to varying degrees.

7. Related Work

Regenerating codes are state-of-the-art ECs with optimal repair bandwidth that have been used by some storage systems. The NCCloud storage system was one of the earliest implementable designs for 2-parity functional MSR (F-MSR) codes, which maintain the same data redundancy level and same storage requirement as traditional ECs but use less repair traffic [30]. In [31], Pamies-Juarez et al. presented the evaluation of a novel MSR code known as the Butterfly code in both Ceph and HDFS. Their analysis shows that Butterfly codes are capable of reducing network traffic and read I/O access during repairs. In [20], the Coupled-Layer (Clay) code is an MSR code that offers a simplified construction for decoding/repair by using pairwise coupling across multiple stacked layers of any single MDS code and has been evaluated over an Amazon AWS cluster. The Clay code is simultaneously optimal in terms of storage overhead, repair bandwidth, access, and sub-packetization level. In [22], Rashmi et al. aimed to minimize the disk I/O consumed while simultaneously retaining optimality in terms of storage, reliability, and network bandwidth. They presented an algorithm to transform PM-MSR codes into I/O-optimal codes (which they refer to as PM-RBT codes). In addition to MSR codes, there is some literature on MBR codes. In [32], Hu et al. presented NCFS, a distributed file system based on E-MBR codes [33] under a real network setting. In [34], Shah et al. presented (n = d + 1, k, d) exact-repair (ER) MBR codes without arithmetic operations in the repair process. In [35], the authors introduced a family of repair-by-transfer (RBT) codes, a class of (n, k, d = n - 1) ER-MBR codes constructed by applying congruent transformations to a skew-symmetric matrix of message symbols. In this construction, the encoding complexity is reduced, and based on the new coding matrix for PM-MBR, the minimum size of the finite field is reduced from n - k + d to n. In [18], the authors first introduced a natural extension to the classical PM-MBR framework, modified to provide flexibility in the choice of the number of helpers during node repair and to permit a certain number of error-prone nodes during repair. This was achieved by proving the nonsingularity of a family of matrices over large enough finite fields. To reduce the high coding and repair complexity that involves expensive multiplication operations in a finite field, another new class of RGCs was proposed. In [36–38], the authors proposed a new framework of linear codes with binary parity-check codes, named Binary Addition and Shift Implementable Cyclic-convolutional (BASIC) codes, which enable coding and repair by XOR and bitwise cyclic shift.

Although some systems currently use RGCs, they are designed or optimized for coding structure, bandwidth overhead, storage overhead, or I/O overhead. There is still a lack of complexity analysis of the classical coding frameworks from the perspective of the finite field.

ECs rely on finite field operations, which can be performed using XOR operations. Many acceleration techniques have been proposed: optimizing bitmatrix design, optimizing the computation schedule, common XOR operation reduction, cache management, and vectorization techniques [12]. In [13], two new heuristics called Uber-CHRS and X-Sets were derived to schedule encoding and decoding bitmatrices by reducing the number of XORs, and several further techniques using different heuristic algorithms were introduced in the same work. In addition to smart scheduling and matching algorithms for reducing the number of XORs, other solutions are based on improving hardware performance. In [14], Plank et al. vectorized finite field operations directly based on single-instruction-multiple-data (SIMD) instructions.

8. Conclusion

In this paper, we have combined finite field arithmetic operations to elaborate the computational complexity of the famous PM framework in the encoding, decoding, and repair processes and proposed a new construction called the product bitmatrix (PB). Based on two heuristic algorithms, the PB-MSR and PB-MBR codes find locally minimal numbers of XORs, which improves the encoding, decoding, and repair performances. Although PB improves on the original framework in computation and has advantages in the decoding and repair processes, when the encoding object is small, the optimization time occupies a noticeable portion of the total encoding time; compared with the PM framework built on the Vandermonde matrix, its advantage is then insufficient.

In future research, in addition to theoretical work, we will conduct in-depth research on fault-tolerant techniques for edge computing nodes in combination with real environments, for instance, improving network transmission performance [39], adapting data reliability storage algorithms recommended for edge computing [40, 41], or designing new nonlinear coding methods with deep neural networks [42, 43].

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Key Research and Development Project (2019YFB2102600) and the National Natural Science Foundation of China (NSFC) (61572194 and 61672233).