Design Space Exploration of Deeply Nested Loop 2D Filtering and 6 Level FSBM Algorithm Mapped onto Systolic Array
Table 4
(a) Heuristic search results for 2D filtering
NPE  Ncyc  Matrix                  Reg. cost
12   12    1 0 1 3; 0; 0; 1 1 1 2  10
12   14    1 0 1 3; 0; 0; 1 1 1 4  14
12   18    1 0 1 3; 0; 0; 1 1 1 3  12
12   16    1 0 1 3; 0; 0; 1 1 1 1  8
12   12    1 0 1 3; 0; 0; 1 1 2 1  10
12   15    1 0 1 3; 0; 0; 1 1 2 2  12
12   17    1 0 1 3; 0; 0; 1 1 2 0  8
12   13    1 0 1 3; 0; 0; 1 1 2 4  16
12   21    1 0 1 3; 0; 0; 1 1 2 3  14
12   19    1 0 1 3; 0; 0; 1 1 2 1  10
12   15    1 0 1 3; 0; 0; 1 1 0 4  12
12   15    1 0 1 3; 0; 0; 1 1 0 3  10
12   13    1 0 1 3; 0; 0; 1 1 4 1  14
12   21    1 0 1 3; 0; 0; 1 1 4 2  16
12   23    1 0 1 3; 0; 0; 1 1 4 0  12
12   19    1 0 1 3; 0; 0; 1 1 4 4  20
12   27    1 0 1 3; 0; 0; 1 1 4 3  18
12   25    1 0 1 3; 0; 0; 1 1 4 1  14
12   21    1 0 1 3; 0; 0; 1 1 3 1  12
(b) Mapping results obtained using the modified heuristic search process for 2D filtering
Window size = 3 × 3; 2D result obtained using Step 11    Window size = 4 × 3
[pe_arr, Ncyc_arr, or Tmat]                              [pe_arr, Ncyc_arr, or Tmat]
NPE  Ncyc  matrix = [; ]                                 NPE  Ncyc  matrix = [; ]
9    9     1 0 0 4; 1 1 2 1                              12   12    1 0 1 4; 1 1 3 1
9    9     1 0 0 4; 1 3 0 4                              12   12    1 0 1 4; 1 3 1 4
9    9     1 0 0 4; 1 3 2 1                              12   12    1 0 1 4; 1 3 3 1
9    9     1 0 0 4; 1 2 0 4                              12   12    1 0 1 4; 1 2 1 4
9    9     1 0 0 4; 1 2 2 1                              12   12    1 0 1 4; 1 2 3 1
9    9     1 0 0 4; 1 4 0 4                              12   12    1 0 1 4; 1 4 1 4
9    9     1 0 0 4; 1 4 2 1                              12   12    1 0 1 4; 1 4 3 1
9    9     1 0 0 4; 1 1 0 4                              12   12    1 0 1 4; 1 1 1 4
9    9     1 0 0 4; 1 1 2 1                              12   12    1 0 1 4; 1 1 3 1
9    9     1 0 0 4; 0 1 0 4                              12   12    1 0 1 4; 0 1 1 4
9    9     1 0 0 4; 0 1 2 1                              12   12    1 0 1 4; 0 1 3 1
9    9     1 0 0 4; 0 3 0 4                              12   12    1 0 1 4; 0 3 1 4
9    9     1 0 0 4; 0 3 2 1                              12   12    1 0 1 4; 0 3 3 1
9    9     1 0 0 4; 0 2 0 4                              12   12    1 0 1 4; 0 2 1 4
9    9     1 0 0 4; 0 2 2 1                              12   12    1 0 1 4; 0 2 3 1
9    9     1 0 0 4; 0 4 0 4                              12   12    1 0 1 4; 0 4 1 4
9    9     1 0 0 4; 0 4 2 1                              12   12    1 0 1 4; 0 4 3 1
9    9     1 0 0 4; 0 1 0 4
9    9     1 0 0 4; 0 1 2 1
9    9     1 0 0 4; 2 1 0 4
9    9     1 0 0 4; 2 1 2 1
*The search space for the matrix was explored without using the scheduling vector; obtaining Table 4(a) this way takes more execution time than the search used to obtain Table 4(b), which uses the projection direction for reassignment of the PE plane.
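The entries under `matrix = [; ]` in Table 4(b) separate matrix rows with a semicolon. As a minimal sketch for readers who want to process these entries programmatically, the following Python reads one entry into a list of integer rows. The function name `parse_matrix` is hypothetical and is not part of the original search code; the only assumption is that rows are semicolon-separated and values are whitespace-separated integers.

```python
def parse_matrix(entry):
    """Split a 'row; row' table entry into a list of integer rows.

    Assumption: rows are separated by ';' and values by whitespace,
    as in the matrix column of Table 4(b).
    """
    rows = [row.split() for row in entry.split(";")]
    matrix = [[int(value) for value in row] for row in rows]
    # All rows of one entry should have the same number of columns.
    if len({len(row) for row in matrix}) != 1:
        raise ValueError("ragged matrix entry: %r" % entry)
    return matrix


# First entries of the 3 x 3 and 4 x 3 window columns in Table 4(b).
first_3x3 = parse_matrix("1 0 0 4; 1 1 2 1")
first_4x3 = parse_matrix("1 0 1 4; 1 1 3 1")
print(first_3x3)  # [[1, 0, 0, 4], [1, 1, 2, 1]]
print(first_4x3)  # [[1, 0, 1, 4], [1, 1, 3, 1]]
```

Both example entries parse to two rows of four integers each; an entry whose rows differ in length raises `ValueError` rather than being silently accepted.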