Research Article

Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor

Algorithm 2

Idea of block decomposition: (a) basic scheme of the MPDATA code; the code after applying (b) loop tiling, (c) loop fusion at the tile-management level, and (d) loop fusion at the intratile level. S1, …, S17 are exemplary MPDATA stages.
(a)
for i-dim
 for j-dim
  for k-dim
   S1 = …
for i-dim
 for j-dim
  for k-dim
   S3 = …
for i-dim
 for j-dim
  for k-dim
   S14 = …
for i-dim
 for j-dim
  for k-dim
   S17 = …
(b)
for BlockOff tiles
 for BlockOff tiles
  for BlockOff tiles {
   for i-dim
    for j-dim
     for k-dim
      S1 = …
  }
for BlockOff tiles
 for BlockOff tiles
  for BlockOff tiles {
   for i-dim
    for j-dim
     for k-dim
      S3 = …
  }
for BlockOff tiles
 for BlockOff tiles
  for BlockOff tiles {
   for i-dim
    for j-dim
     for k-dim
      S17 = …
  }
(c)
for BlockOff tiles
 for BlockOff tiles
  for BlockOff tiles {
   for i-dim
    for j-dim
     for k-dim
      S1 = …

   for i-dim
    for j-dim
     for k-dim
      S3 = …

   for i-dim
    for j-dim
     for k-dim
      S14 = …

   for i-dim
    for j-dim
     for k-dim
      S17 = …
  }
(d)
for BlockOff tiles
 for BlockOff tiles
  for BlockOff tiles {
   for i-dim
    for j-dim
     for k-dim {
      S1 = …
      S2 = …
      S3 = …
     }

   for i-dim
    for j-dim
     for k-dim {
      S14 = …
      S15 = …
      S16 = …
     }

   for i-dim
    for j-dim
     for k-dim {
      S17 = …
     }
  }
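
The four schemes of Algorithm 2 can be illustrated with a small runnable sketch. It is not the MPDATA code: the two stages below are hypothetical pointwise stand-ins (real MPDATA stages are stencils, and fusing them across tiles requires handling the cells shared between neighboring tiles), and the grid size, tile size, and all function names are assumptions chosen for illustration. Each call to `product(ri, rj, rk)` stands for the three nested i-, j-, and k-dimension loops of the listing.

```python
from itertools import product

N, M, L = 6, 5, 4   # grid extents (hypothetical sizes)
TILE = (4, 2, 3)    # tile extents (hypothetical; need not divide the grid)

def stage1(x):
    """Stand-in for an early stage such as S1 (pointwise, not real MPDATA)."""
    return 2.0 * x + 1.0

def stage2(s1):
    """Stand-in for a later stage that consumes the output of stage1."""
    return s1 * s1

def make_input():
    """Build a small 3D input field keyed by (i, j, k)."""
    return {(i, j, k): float(i + 10 * j + 100 * k)
            for i, j, k in product(range(N), range(M), range(L))}

def tiles():
    """Yield one (i-range, j-range, k-range) triple per tile."""
    for i0, j0, k0 in product(range(0, N, TILE[0]),
                              range(0, M, TILE[1]),
                              range(0, L, TILE[2])):
        yield (range(i0, min(i0 + TILE[0], N)),
               range(j0, min(j0 + TILE[1], M)),
               range(k0, min(k0 + TILE[2], L)))

def scheme_a(x):
    """(a) Basic scheme: one full-grid loop nest per stage."""
    grid = list(product(range(N), range(M), range(L)))
    s1 = {p: stage1(x[p]) for p in grid}
    return {p: stage2(s1[p]) for p in grid}

def scheme_b(x):
    """(b) Loop tiling: every stage sweeps all tiles on its own."""
    s1, s2 = {}, {}
    for ri, rj, rk in tiles():
        for p in product(ri, rj, rk):
            s1[p] = stage1(x[p])
    for ri, rj, rk in tiles():
        for p in product(ri, rj, rk):
            s2[p] = stage2(s1[p])
    return s2

def scheme_c(x):
    """(c) Fusion of the tile loops: all stages finish a tile before the next."""
    s1, s2 = {}, {}
    for ri, rj, rk in tiles():
        for p in product(ri, rj, rk):
            s1[p] = stage1(x[p])
        for p in product(ri, rj, rk):
            s2[p] = stage2(s1[p])
    return s2

def scheme_d(x):
    """(d) Intratile fusion: stages share one loop nest inside the tile."""
    s2 = {}
    for ri, rj, rk in tiles():
        for p in product(ri, rj, rk):
            s1 = stage1(x[p])   # intermediate value never leaves the loop body
            s2[p] = stage2(s1)
    return s2
```

Under these assumptions all four functions return the same field. What changes is the traversal order and the lifetime of the intermediate result: in `scheme_d` the output of the first stage is a scalar consumed immediately, whereas in `scheme_a` it is a full-grid array.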