col_buf[N];
for (i=0; i<=N; i++)
{ // pragma hls loop pipeline
  t0(1..N, i, col_buf);   // write one column of array A[N][N]
  t1(i, N-1..1, col_buf); // read one column of array A[N][N]
}
Listing 23: Generated HW module after task-level scheduling.