Research Article

Block-Split Array Coding Algorithm for Long-Stream Data Compression

Algorithm 3

MLB Coding (k = 3).
/* This encoding algorithm uses the following procedures:
  input( buffer + offset, stream, length ) : reads data from the character stream into the buffer and returns the number of characters read;
  output( length, index ) : writes a length/index code-word;
  output_Char( character ) : writes a single-character code-word.
*/
function encodeLZ_MLB( stream ) { /* stream is the input stream */
  buf = array( 0..M-1 ); /* buf is the data stream buffer; Eq. (5) always holds */
  pos = N; /* pos tracks the current position of the data window (M = 2N) */
  define: s = buf + pos; /* s[0..N-1] is the data window; Eq. (6) always holds */
  link = array( 0..M-1 ); /* link stores the matching links for buf */
  bucket = array( 0..256^3-1 ); /* bucket stores the link headers of Eq. (3) */
  link[ 0..M-1 ] = null; bucket[ 0..256^3-1 ] = null; /* initialize the matching links and link headers */
  repeat do { /* Hint: positions of buf form a congruent group */
   remain = input( buf + N - pos, stream, N ); /* remain gets the length of the current input data */
   if ( remain = 0 ) do { return; } /* the algorithm ends */
   for n = 0..remain-1 do { /* Phase (a): build matching links */
    i = N - pos + n; j = buf[ i..i+2 ]; p = bucket[ j ];
    if not ( p = null ) and not ( buf[ p..p+2 ] = j ) do { bucket[ j ] = null; } /* clear the expired header value */
    if ( bucket[ j ] = null ) do { link[ i ] = null; }
    else do { link[ i ] = i - bucket[ j ]; } /* link stores the distance (i.e. relative position) of Eq. (4) */
    bucket[ j ] = i; /* bucket stores the plain position of Eq. (3) */
   }
   while ( remain > 0 ) do { /* Phase (b): LZ77 string match and encode */
    if ( remain < 3 ) do {
     for i = 1..remain do { output_Char( s[N] ); pos = pos+1; } /* output the remaining characters */
     break; /* jump out of the “while” loop */
    }
    n = pos + N; dist = 0; /* dist tracks the distance between s[N] and the current matching point */
    len = 1; index = 0; /* len and index track the current maximum matching length and its relative position */
    for i = 1..sight do { /* sight limits the number of match attempts */
     d = link[ n - dist ]; /* d is the link distance stored at the current matching point */
     if ( d = null ) or ( dist + d >= N ) do { break; } /* jump out of the “for” loop */
     dist = dist + d; x = N; y = N - dist;
     for ( l = 0..remain-1 ) and ( s[x] = s[y] ) do { x = x+1; y = y+1; } /* get the matching length l */
     if ( l > len ) do { len = l; index = dist; } /* found a better matching point */
    }
    if ( len > 2 ) do { output( len, index ); } /* output a length/index code-word */
    else do { output_Char( s[N] ); len = 1; } /* output a single character */
    pos = pos + len; remain = remain - len;
   }
  }
}
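
The two phases of the listing can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the authors' implementation: the whole input is held in memory instead of a circular buffer of size M = 2N, a dictionary stands in for the 256^3-entry bucket array, a link value of 0 plays the role of null, and the chain is followed from the most recently visited matching point. The names encode_lz_chain, decode_lz_chain, window, and the token tuples are hypothetical and chosen only for this example.

```python
def encode_lz_chain(data: bytes, window: int = 4096, sight: int = 16, k: int = 3):
    """Simplified hash-chain LZ77 encoder (illustrative sketch only).

    Returns a list of tokens: ('char', byte) or ('match', length, distance).
    """
    n = len(data)
    link = [0] * n   # link[i]: distance to the previous position with the same k-byte prefix (0 = none)
    bucket = {}      # k-byte prefix -> most recent position (stand-in for the bucket array)

    # Phase (a): build the matching links.
    for i in range(n - k + 1):
        key = data[i:i + k]
        prev = bucket.get(key)
        link[i] = i - prev if prev is not None else 0
        bucket[key] = i

    # Phase (b): follow the links to find the longest match, then emit a token.
    tokens = []
    pos = 0
    while pos < n:
        best_len, best_dist = 1, 0
        dist = 0
        if n - pos >= k:                      # fewer than k characters left: emit them as literals
            for _ in range(sight):            # sight bounds the number of candidates examined
                d = link[pos - dist]
                if d == 0 or dist + d >= window:
                    break
                dist += d
                length = 0
                while pos + length < n and data[pos + length] == data[pos + length - dist]:
                    length += 1
                if length > best_len:
                    best_len, best_dist = length, dist
        if best_len >= k:
            tokens.append(('match', best_len, best_dist))
        else:
            tokens.append(('char', data[pos]))
            best_len = 1
        pos += best_len
    return tokens


def decode_lz_chain(tokens) -> bytes:
    """Inverse of encode_lz_chain, used here only to check the round trip."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == 'char':
            out.append(tok[1])
        else:
            _, length, dist = tok
            for _ in range(length):           # byte-by-byte copy allows overlapping matches
                out.append(out[-dist])
    return bytes(out)
```

For instance, encode_lz_chain(b"abcabcabcabc") yields three single-character tokens followed by one match of length 9 at distance 3, and decode_lz_chain restores the original bytes. A dictionary keeps the sketch short; the fixed-size bucket array of the listing avoids hashing overhead at the cost of memory.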