Research Article  Open Access
Network Partitioning Domain Knowledge Multiobjective Application Mapping for Large-Scale Network-on-Chip
Abstract
This paper proposes a multiobjective application mapping technique targeted at large-scale network-on-chip (NoC). As the number of intellectual property (IP) cores in a multiprocessor system-on-chip (MPSoC) increases, NoC application mapping, which seeks the optimum core-to-topology mapping, becomes more challenging. Moreover, the conflicting cost and performance trade-off makes multiobjective application mapping even more complex. This paper proposes an application mapping technique that incorporates domain knowledge into a genetic algorithm (GA). The initial population of the GA is initialized with network partitioning (NP), while the crossover operator is guided by knowledge of communication demands. NP reduces the complexity of large-scale application mapping and provides the GA with a promising mapping search space. The proposed genetic operator is compared with state-of-the-art genetic operators in terms of solution quality. In this work, multiobjective optimization of energy and thermal balance is considered. In simulation, knowledge-based initial mapping shows a significant improvement in the Pareto front compared to the widely used random initial mapping. The proposed knowledge-based crossover also yields a better Pareto front than the state-of-the-art knowledge-based crossover.
1. Introduction
The advancement in submicron technology allows more intellectual property (IP) cores to be integrated into a single chip, which increases system complexity. Multiprocessor system-on-chip (MPSoC) size will increase from several cores to hundreds of cores per chip in the future. Current on-chip communication architectures that rely on a shared bus or a hierarchical bus will become the performance bottleneck as the number of cores increases. Implementation of a large MPSoC needs more flexible communication resources. Network-on-chip (NoC) has emerged as a new communication architecture that provides modularity and flexibility for MPSoC. NoC architectures are based on traditional interconnection network concepts [1]. Each IP core is connected to one of the routers on the NoC, and messages are forwarded through routers to destination cores. However, a number of NoC-based system design problems are still under research. These problems have been identified and categorized in [2]. A major challenge in NoC design is the placement of IP cores onto the associated routers of the network.
Application mapping determines the placement of IP cores onto routers in the network such that the performance or cost metrics of interest are optimized [2]. In this paper, it is assumed that application tasks have already been assigned and scheduled on IP cores; task scheduling is not examined. The input for application mapping is therefore a core graph instead of a task graph. The placement of source cores and destination cores affects the cost and performance of the NoC. Without a proper application mapping algorithm, NoC performance may suffer from traffic congestion, hotspots, and higher energy consumption. Application mapping is an NP-hard problem, so exhaustive algorithms cannot be applied. Hence, there is a need for an effective mapping algorithm that cuts down the large search space and obtains an optimum mapping.
Optimization search with refinement, such as simulated annealing (SA) [3, 4], genetic algorithm (GA) [5, 6], and particle swarm optimization (PSO) [7], has been used for application mapping in NoC. GA is the predominant algorithm for application mapping. GA is good at searching and optimizing problems with limited information: it needs only a representation of possible solutions and a fitness function to evaluate the goodness of a solution. However, increasing the number of IP cores in an MPSoC results in a factorial increase in the number of possible mappings. A large search space slows GA convergence. Thus, knowledge-based information may guide the GA to converge faster and provide better solution quality.
Regardless of the size of the initial population, choosing a proper initialization method is vital for solving large-scale problems [8]. For a large-scale NoC problem, a proper initialization method is needed to speed up convergence and improve solution quality. Large-scale MPSoCs are mostly combinations of a few subsystems, and an IP core may communicate with only several cores in such a large system. Network partitioning (NP) decomposes a large system into several smaller subsystems in which highly communicating cores are grouped in the same partition. However, thermal balance then becomes an issue. A hotspot in the NoC may cause faulty network resources and erroneous packets. The thermal balance of a network should therefore be another concern for a reliable NoC.
This paper proposes an application mapping technique that incorporates domain knowledge into a genetic algorithm (NP-DKGA) to minimize energy consumption and obtain thermal balance on the NoC. The initial population of the GA is initialized with network partitioning knowledge, while the crossover operator is guided by knowledge of communication demands. The NP-DKGA application mapping technique operates in two phases. The first phase performs k-way partitioning of a large MPSoC application and maps all the cores into their assigned partitions in the mesh-based network to form the knowledge-based initial population. The second phase performs multiobjective optimization using a domain-knowledge genetic algorithm (DKGA) to search for Pareto-optimum mappings. We tested the effectiveness of NP-DKGA on several real benchmarks, and the results show an overall improvement in final solution quality and convergence speed. The proposed techniques are implemented and verified using UniMap, a unified framework for NoC application mapping [9].
The rest of this paper is organized as follows. Section 2 briefly discusses related work on application mapping algorithms, focusing mainly on network partitioning and genetic algorithms. Section 3 presents the proposed application mapping technique, which combines network partitioning with a GA using domain-knowledge (DK) crossover in a multiobjective environment, together with the formal definitions. Section 4 describes the tools and simulation parameters used in the experimental work and discusses the experimental results. Finally, Section 5 concludes the paper and suggests future work.
2. Related Works
Owing to the high potential of NoC application mapping, many algorithms have been proposed. A detailed survey on application mapping has been published in [11]. The first mapping algorithm based on the bit energy model was proposed by Hu and Marculescu, who used a branch-and-bound technique to minimize energy consumption under bandwidth reservation [12]. The NMAP [13] mapping algorithm uses a traffic splitting technique to minimize communication delay. Reference [14] compares several application mapping algorithms using the bit energy model for low energy consumption. Simulated annealing (SA) [3, 4, 14] and GA [5] were also proposed as application mapping techniques to optimize energy consumption using the bit energy model. In [1, 7, 13, 15–17], application mapping optimization is based on communication cost in terms of the distance among communicating cores. These application mapping techniques consider only energy minimization. Application mapping, however, involves many issues, and optimizing only one objective may worsen the others. Therefore, a multiobjective technique is needed.
Reference [18] solved the multiobjective problem by aggregating several objectives into one objective with applied weights. However, it is hard to decide the importance of each objective and to change the weights accordingly, and a small change in weight can give a totally different solution [19]. A multiobjective evolutionary algorithm with random initial population mapping was proposed to optimize execution time and power consumption using SPEA2 [20]. Its genetic operator remaps hotspots in a random fashion, as the choice of an effective genetic operator has a great impact on the final mapping [20]. In [18], a crossover based on swapping communicating cores with neighbouring cores was proposed.
There are a few crossover techniques, such as remap hotspot [20–22], shift crossover [23], and cycle crossover [24]. None of these crossover techniques includes useful NoC mapping knowledge, so convergence is slow, especially for a large-scale NoC. Domain knowledge has been proposed for faster convergence. In the domain-knowledge evolutionary algorithm [5], mapping similarity (MS) crossover was proposed to maintain the genes that are common to both parents and to place the rest of the genes using greedy mapping. The mapping similarity approach is able to handle the symmetry problem in mesh topology, but its computation time increases drastically as the NoC size increases.
Largest communication first (LCF) initializes the mapping by placing the cores with the largest communication at the center and then places the rest one by one to reduce the communication cost. LCF can generate a good initial mapping, especially for widely varying traffic [14]. However, as the NoC size increases, placement based on communication ranking hardly obtains a good mapping. A large MPSoC system can be divided into several clusters (partitions). Cluster-based application mapping has been proposed in [1, 15]. The author of [1] proposed a cluster-based relaxation of the integer linear programming (ILP) formulation for application mapping in order to reach the optimum result within tolerable time limits. HMMap [25] employs the nondominated sorting genetic algorithm-II (NSGA-II) to decide the relative location of partition groups and then maps the cores inside each group before combining the hierarchical mapping into the final mapping. The authors of [24] proposed a partition-based application mapping with near-convex region core placement for large NoC. However, these three techniques map cores without improving cross-partition movement. Although they show shorter runtime, the final mapping quality is affected [24].
A mapping algorithm based on Kernighan-Lin (KL) partitioning, called LMAP, has been proposed to explore the search space by flipping partitions and groups in a hierarchical fashion [17]. References [15, 16] proposed cluster-based initial mapping for simulated annealing (CSA) to speed up convergence to a near-optimal solution. These works show an advantage in runtime without compromising solution quality compared to the pure SA approach. Given a random initial mapping, optimized simulated annealing (OSA) [4] improves SA by clustering communicating cores implicitly during the swapping process. OSA shows better mapping quality than CSA. However, the authors of [5] have shown that an evolutionary algorithm performs better than OSA. Particle swarm optimization (PSO) has been proposed with a deterministic initial mapping to explore the search space [7]. The domain knowledge applied to the initial mapping is greedy: IP cores are placed on the NoC topology in descending order of total communication cost in the application graph. The shortcoming of this initial mapping technique is similar to that of LCF: it hardly obtains a good mapping as the NoC size increases.
3. Application Mapping Using NP Knowledge-Based GA
The proposed work targets large-scale NoC. This paper proposes an application mapping technique that incorporates domain knowledge into a genetic algorithm (NP-DKGA) to minimize energy consumption with thermal balance for NoC communication. Figure 1 shows the overall flow of the proposed technique. Network partitioning minimizes the traffic between partitions by keeping highly communicating cores in the same partition. The NP knowledge reduces mapping complexity and identifies a promising mapping space. The GA evolution is then guided by genetic operators that are based on knowledge of communication demands. Some definitions used in this paper are listed next.
3.1. Problem Formulation
Definition 1. An application characteristic graph (APCG) is a directed graph in which each vertex represents an IP core and each directed edge characterizes the total communication volume in bits from its source vertex to its destination vertex. The vertex weight represents the power consumption of the IP core, and its default value is 1 for all IP cores. Application tasks are assumed to be already assigned and scheduled to each vertex.
Definition 2. A mesh-based NoC network is a labelled graph in which each vertex denotes a router and each edge denotes a channel. Each router has at most 5 ports: 4 ports connect to neighbouring routers via channels and one port connects to the processing core. The routers are placed on a grid in the plane with unit distance between adjacent routers, and each router is identified by its x and y coordinates.
Definition 3. Given an input APCG, network partitioning decomposes the APCG into smaller subsystems depending on the size of the mesh-based topology. The APCG is divided into k partitions. Each partitioning is characterized by the number of cores in each partition and by the interpartition traffic. The objective of network partitioning is to minimize the interpartition traffic (min-cut partitioning), subject to the constraint that the number of cores is balanced across all partitions.
Definition 4. The mapping of the partitioned APCG involves partition placement and core placement, given a partitioned graph and a mesh topology. Partition placement assigns a region of the mesh-based topology to each partition. Core placement then associates each vertex of a partition with a router in the region assigned to that partition.
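The two quantities that Definition 3 trades off, interpartition traffic and core balance, can be illustrated with a short Python sketch. The edge dictionary, the core labels, and the `tolerance` parameter are hypothetical choices made for this illustration and do not come from the paper.

```python
from collections import Counter

def cut_traffic(edges, part_of):
    """Sum the traffic (bits) on edges whose endpoints lie in different partitions."""
    return sum(bits for (src, dst), bits in edges.items()
               if part_of[src] != part_of[dst])

def is_balanced(part_of, tolerance=1):
    """True if partition sizes differ by at most `tolerance` cores (assumed criterion)."""
    sizes = Counter(part_of.values())
    return max(sizes.values()) - min(sizes.values()) <= tolerance

# Hypothetical 4-core APCG: edge (src, dst) -> communication volume in bits.
edges = {(0, 1): 70, (1, 2): 10, (2, 3): 60, (3, 0): 5}
good = {0: 0, 1: 0, 2: 1, 3: 1}   # heavily communicating cores share a partition
bad = {0: 0, 1: 1, 2: 0, 3: 1}    # heavily communicating cores are split

print(cut_traffic(edges, good), cut_traffic(edges, bad))  # 15 145
```

Both assignments are balanced, but the first one cuts far less traffic, which is the property a min-cut partitioner looks for.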
3.2. Genetic Algorithm for Large-Scale NoC Application Mapping
Genetic algorithm mimics the processes of biological evolution. It consists of a few important components [26]: (1) problem representation, (2) population, (3) fitness function, (4) parent and survivor selection mechanism, and (5) genetic operators (crossover and mutation).
Genetic algorithm optimization is based on the evolution of a population of chromosomes toward a better solution. To optimize the problem, the representation of possible solutions is crucial. A permutation chromosome is used to represent the application mapping problem. It consists of a series of genes, where each gene corresponds to a tile in the mesh topology, so the length of a chromosome equals the number of tiles in the mesh. Each gene is assigned an integer that represents the IP core in the APCG attached to the router of the corresponding tile. Figure 2 shows an example of an integer-encoded chromosome for a mesh topology. A gene is assigned a null value if no IP core is assigned to its router. A valid permutation chromosome cannot have two genes with the same integer, because that would represent one core connected to two routers.
In the application mapping problem, GA mostly starts with a population of randomly generated chromosomes. This population is evaluated for goodness based on a predefined fitness function. The fitness function is based on the optimization objectives, for either single-objective or multiobjective optimization. Then, chromosomes are selected based on fitness using binary tournament selection: two chromosomes are chosen randomly, and the fitter one is allowed to undergo crossover and mutation, each with a fixed probability, to produce new offspring. Crossover and mutation enable the GA to explore and exploit the search space. The combination of the newly generated offspring and the previous population becomes a mating pool. Fitter chromosomes have a higher chance of surviving to the next generation. The GA operates iteratively until a fixed number of iterations is reached or a termination criterion is met.
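As a minimal sketch of this encoding, the following Python snippet checks the validity rule stated above. It assumes that cores are numbered from 0, that an empty tile is stored as `None`, and that genes are listed in row-major tile order; these conventions and the function names are illustrative assumptions, not details taken from UniMap.

```python
def is_valid_chromosome(chromosome, num_cores):
    """Check a permutation chromosome: one gene per tile, None for an empty tile."""
    mapped = [gene for gene in chromosome if gene is not None]
    no_duplicates = len(mapped) == len(set(mapped))
    all_cores_placed = set(mapped) == set(range(num_cores))
    return no_duplicates and all_cores_placed

def tile_of(chromosome, core, width):
    """Return the (x, y) tile of a core, assuming row-major gene order."""
    index = chromosome.index(core)
    return index % width, index // width

# Hypothetical 2x2 mesh with 3 cores; the third tile is left empty.
chromosome = [2, 0, None, 1]
print(is_valid_chromosome(chromosome, 3))    # True
print(is_valid_chromosome([2, 0, 2, 1], 3))  # False: core 2 appears twice
print(tile_of(chromosome, 1, width=2))       # (1, 1)
```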
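Binary tournament selection, as described above, can be sketched in a few lines of Python. The sketch assumes a single scalar cost where lower is better; in a multiobjective setting such as SPEA2 the comparison would use the algorithm's own fitness assignment instead. The population and cost values are hypothetical.

```python
import random

def binary_tournament(population, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one.

    `fitness` maps an individual to a cost; lower is better (assumption).
    """
    a, b = rng.sample(population, 2)
    return a if fitness(a) <= fitness(b) else b

rng = random.Random(0)
# Hypothetical chromosomes (as tuples) with made-up costs.
population = [(3, 1, 2, 0), (0, 1, 2, 3), (1, 0, 3, 2)]
cost = {population[0]: 9.0, population[1]: 4.0, population[2]: 6.0}
winner = binary_tournament(population, cost.get, rng)
```

Because the two contestants are distinct, the worst individual in the population can never win a tournament, while every other individual still has a chance of being selected.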
3.3. Network Partitioning as Initial Mapping in GA
Network partitioning decomposes a large NoC system into a few smaller partitions. In the proposed NoC application mapping, NP is implemented in two stages: mesh topology partitioning and application partitioning. In the first stage, the mesh topology is divided into a few smaller regions, where each region represents one partition. The number of partitioning levels depends on the size of the topology. For mesh sizes that cannot be bipartitioned, k-way partitioning can be implemented. The mesh topology is partitioned so that every partition has the same number of tiles. If the number of tiles per partition is imbalanced, a larger NoC network may be needed. Figure 2 shows a mesh topology partitioning. The partitioning starts with a vertical partition followed by a horizontal partition. The number of application partitions needed in the second stage equals the number of mesh partitions generated in the first stage.
In the second stage, the multilevel-KL (Kernighan-Lin) algorithm decomposes the IP cores in the APCG into halves and refines the partitions at each subsequent level. This algorithm is available in Chaco [27]. It is chosen because it produces high-quality partitions and scales to large networks [27]. The application is partitioned according to the number of mesh partitions and the available tiles in each partition. Each partition must have at least four available tiles; if the partition size is too small, the role of NP in grouping highly communicating cores becomes insignificant. The objective of NP is to achieve a min-cut with the lowest interpartition traffic. There is a single constraint: the number of cores must be balanced across partitions. Figure 2 shows an example of 2-level partitioning on a mesh topology for the VOPD application [10]. The dashed lines show the first-level partitioning, while the dash-dotted lines show the second-level partitioning for the VOPD application.
The outcome of the two-stage NP is used to generate an initial population for the GA. Instead of a detailed hierarchical mapping of all partitions and cores, placement is done randomly within the assigned region of the mesh topology. The random placement of partitions and cores provides population diversity to the GA. Figure 3 shows two individuals of the NP initial population for the VOPD [10] application after random partition and core placement. The min-cut partitioning technique, which groups communicating cores within the same partition, provides a potentially low-energy mapping. Research has shown that the initial population may affect the best fitness value and that these effects may last for several generations [28]. A genetic algorithm is expected to converge to an equilibrium independent of its initial state [28]. However, for a large-scale NoC, the space of possible mappings is extremely large, which slows down convergence. Hence, a good initial population may result in faster convergence and better solution quality.
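The first stage can be pictured with a small recursive bisection that cuts vertically first and then horizontally, as described above. This Python sketch is only an illustration: the function name, the `levels` parameter, and the 4x4 example are assumptions, and it presumes that the dimension being cut is even at every level.

```python
def partition_mesh(x0, y0, width, height, levels, vertical=True):
    """Recursively bisect a mesh region, alternating vertical and horizontal cuts.

    Returns a list of regions, each a list of (x, y) tiles.
    Assumes the dimension being cut is even at every level.
    """
    if levels == 0:
        return [[(x, y) for y in range(y0, y0 + height)
                 for x in range(x0, x0 + width)]]
    if vertical:
        half = width // 2
        left = partition_mesh(x0, y0, half, height, levels - 1, False)
        right = partition_mesh(x0 + half, y0, width - half, height, levels - 1, False)
        return left + right
    half = height // 2
    top = partition_mesh(x0, y0, width, half, levels - 1, True)
    bottom = partition_mesh(x0, y0 + half, width, height - half, levels - 1, True)
    return top + bottom

# Hypothetical 4x4 mesh, two levels of bisection: four regions of four tiles each.
regions = partition_mesh(0, 0, 4, 4, levels=2)
print(len(regions), sorted(regions[0]))  # 4 [(0, 0), (0, 1), (1, 0), (1, 1)]
```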
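A minimal sketch of how one NP-based individual could be generated is given below. It assumes that the core groups come from an external partitioner, that every region has at least as many tiles as any group has cores, and that genes follow row-major tile order. The function name and the 4x2 example are hypothetical and are not taken from UniMap.

```python
import random

def np_individual(core_groups, regions, width, rng):
    """Build one chromosome: shuffle which region each partition gets,
    then scatter that partition's cores over the region's tiles at random."""
    chromosome = [None] * sum(len(region) for region in regions)
    shuffled = regions[:]
    rng.shuffle(shuffled)                        # random partition placement
    for cores, region in zip(core_groups, shuffled):
        tiles = rng.sample(region, len(cores))   # random core placement
        for core, (x, y) in zip(cores, tiles):
            chromosome[y * width + x] = core     # row-major gene order (assumption)
    return chromosome

# Hypothetical 4x2 mesh split into a left and a right region, with 7 cores.
regions = [[(0, 0), (1, 0), (0, 1), (1, 1)],
           [(2, 0), (3, 0), (2, 1), (3, 1)]]
groups = [[0, 1, 2], [3, 4, 5, 6]]
rng = random.Random(1)
population = [np_individual(groups, regions, 4, rng) for _ in range(5)]
```

Each call keeps the cores of a group inside one region while varying both the region and the tile order, which is the source of diversity in the initial population.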
3.4. Knowledge-Based Genetic Operator
Crossover is used to produce offspring, from which fitter chromosomes are selected to form a new population. Mapping similarity crossover has been proposed, in which offspring keep the common characteristics of their parents in terms of the sum of distances among communicating cores [5]. The genes are evaluated one by one to check for common characteristics, which is time-consuming, especially for large-scale and highly communicating applications.
The NP-based initial mapping provides a promising mapping. Thus, we propose retaining the characteristics common to both parents in terms of locus in the mesh topology to exploit the search space. The remaining cores, which have no similarity, are then mapped greedily. This crossover algorithm is energy-biased, so a proper mutation algorithm is needed to explore the search space. We do not propose a new mutation algorithm; instead, we use the mutation algorithms available in UniMap: swap between cores (SWAP) and knowledge-based mutation using simulated annealing (OSA).
In this paper, knowledge-based GA optimization is proposed as described in Algorithm 1. Crossover points are set randomly, in keeping with the randomized nature of GA. Two child chromosomes are generated from two selected parents. After the crossover between parents, if the same core index is assigned to two genes, the latter gene in the resulting chromosome is labelled as InvalidGene. Cores that are not assigned to any gene are labelled as UnmappedCores.

In this work, we apply a domain-knowledge (DK) crossover technique. For each core in UnmappedCores, its communication with the cores on the routers adjacent to each InvalidGene (NeighborCore) is determined. The core is then remapped to the InvalidGene that has the highest communication with its NeighborCore. This crossover is repeated until the number of generated child chromosomes reaches the population size. This implicit clustering approach helps the GA to explore the mapping space efficiently for low-power mapping.
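The repair step described above can be sketched in Python as follows. This is not a reproduction of Algorithm 1: a plain one-point crossover is swapped in as the recombination step, every tile is assumed to hold a core (so chromosomes are full permutations), genes are assumed to follow row-major tile order, and the function names and traffic values are hypothetical.

```python
import random

def neighbours(index, width, height):
    """Yield gene indices of the mesh tiles adjacent to the tile at `index`."""
    x, y = index % width, index // width
    for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
        if 0 <= nx < width and 0 <= ny < height:
            yield ny * width + nx

def dk_style_crossover(parent_a, parent_b, traffic, width, rng):
    """One-point crossover followed by a communication-guided repair."""
    height = len(parent_a) // width
    cut = rng.randrange(1, len(parent_a))
    child = parent_a[:cut] + parent_b[cut:]

    # The later copy of a duplicated core becomes an invalid (free) gene.
    seen, invalid_genes = set(), []
    for i, core in enumerate(child):
        if core in seen:
            child[i] = None
            invalid_genes.append(i)
        else:
            seen.add(core)
    unmapped_cores = [core for core in parent_a if core not in seen]

    # Place each unmapped core on the free gene whose neighbours it talks to most.
    for core in unmapped_cores:
        def affinity(gene):
            return sum(traffic.get((core, child[n]), 0) + traffic.get((child[n], core), 0)
                       for n in neighbours(gene, width, height)
                       if child[n] is not None)
        best = max(invalid_genes, key=affinity)
        child[best] = core
        invalid_genes.remove(best)
    return child

# Hypothetical 2x2 mesh and traffic (bits) between core pairs.
traffic = {(0, 1): 100, (2, 3): 40}
child = dk_style_crossover([0, 1, 2, 3], [3, 2, 1, 0], traffic, 2, random.Random(0))
```

Because the number of duplicated genes equals the number of missing cores when chromosomes are full permutations, the repaired child is always a valid permutation.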
3.5. Multiobjective Optimization
Multiobjective optimization is optimization that involves more than one objective. In application mapping, highly communicating cores are kept together to shorten the packet transmission path. However, this may cause hotspots in the network, which in turn cause faults in packets or routers. An optimum mapping should therefore not only minimize energy but also balance the two conflicting objectives. Designers need to make decisions based on the trade-offs among the set of Pareto mappings obtained. A Pareto-optimum mapping is a mapping that is not dominated by any other mapping across all objective functions.
In multiobjective application mapping, the objectives are better treated independently. The SPEA2 and NSGA2 (Nondominated Sorting Genetic Algorithm 2) techniques are available in UniMap to obtain Pareto mappings. Both techniques search for the best nondominated solutions, and either technique gives good results for NoC application mapping [5].
Energy and thermal models for fitness evaluation are available in UniMap. The bit energy model is widely used in application mapping for evaluating energy consumption, whereas the thermal model uses the HotSpot tool [29]. The bit energy model in UniMap optimizes the bit energy, that is, the energy required to send one bit of data from a source core to a destination core. The bit energy is determined by the number of hops on the path taken from the source core to the destination core under deterministic routing (one hop is the distance between two adjacent routers), the energy consumed by a link between adjacent routers, and the energy consumed by a router. The link and router energy values are given in UniMap and are used in this paper. The overall energy consumption is the sum of the bit energies over all bit transmissions, that is, the bit energy of each source-destination pair weighted by the total communication traffic in bits from the source core to the destination core. If a placement does not fulfil the bandwidth constraint, a penalty is added to the energy consumption.
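The notion of a nondominated mapping can be made concrete with a short Python sketch. Both objectives are assumed to be minimized, and the (energy, peak temperature) pairs are made-up values used only for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (energy, peak temperature) pairs for five candidate mappings.
mappings = [(10.0, 80.0), (12.0, 70.0), (11.0, 85.0), (15.0, 65.0), (12.0, 72.0)]
front = pareto_front(mappings)
print(front)  # [(10.0, 80.0), (12.0, 70.0), (15.0, 65.0)]
```

The three surviving mappings trade energy against temperature, and none of them is better than another in both objectives at once.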
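The following Python sketch illustrates one plausible form of the bit energy calculation. It rests on several assumptions that are not confirmed by the text: XY routing (so the hop count is the Manhattan distance), a path of n hops crossing n links and n + 1 routers, and placeholder energy values that are not UniMap's. The bandwidth penalty is omitted.

```python
def hops_xy(src, dst):
    """Hop count under XY routing on a mesh: the Manhattan distance (assumption)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

def bit_energy(hops, e_link, e_router):
    """Energy to move one bit: `hops` links and `hops + 1` routers (assumed form)."""
    return hops * e_link + (hops + 1) * e_router

def total_energy(traffic, position, e_link, e_router):
    """Weight the bit energy of each source-destination pair by its traffic in bits."""
    return sum(bits * bit_energy(hops_xy(position[s], position[d]), e_link, e_router)
               for (s, d), bits in traffic.items())

# Hypothetical traffic (bits), placement (x, y), and placeholder energy values.
traffic = {(0, 1): 100, (1, 2): 50}
position = {0: (0, 0), 1: (1, 0), 2: (1, 2)}
print(total_energy(traffic, position, e_link=1.0, e_router=2.0))  # 900.0
```

Moving core 2 next to core 1 shortens the second path from two hops to one and lowers the total, which is why energy-driven mapping pulls communicating cores together.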
The thermal model used in UniMap is the HotSpot tool [29]. Thermal balance is achieved by minimizing the maximum sum taken over the subnetworks of the NoC.
The NoC topology is partitioned into smaller subnetworks, where each subnetwork overlaps its neighbouring subnetworks. The maximum temperature of each subnetwork is estimated from the power and area provided. The power of each core is proportional to its execution time, and the area is available in the UniMap framework for different NoC sizes.
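The idea of scanning overlapping subnetworks for the worst case can be sketched with a sliding window over a grid of per-tile power values. This is only a rough proxy written for illustration: it sums power instead of estimating temperature with HotSpot, and the window size, function name, and grids are assumptions.

```python
def max_window_power(power_grid, window):
    """Largest total power found in any window x window block of tiles.

    Blocks slide one tile at a time, so neighbouring blocks overlap.
    """
    rows, cols = len(power_grid), len(power_grid[0])
    best = float("-inf")
    for top in range(rows - window + 1):
        for left in range(cols - window + 1):
            total = sum(power_grid[r][c]
                        for r in range(top, top + window)
                        for c in range(left, left + window))
            best = max(best, total)
    return best

# Hypothetical 3x3 power maps with the same total power.
clustered = [[4, 4, 0], [4, 4, 0], [0, 0, 0]]
spread = [[4, 0, 4], [0, 0, 0], [4, 0, 4]]
print(max_window_power(clustered, 2), max_window_power(spread, 2))  # 16 4
```

A mapping that spreads the power-hungry cores scores lower on this measure, which matches the intuition behind the thermal balance objective.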
4. Simulation Results and Discussion
This section discusses the simulation setup, tools, and application benchmarks used for verification. We then analyse the effectiveness of knowledge-based initial mapping in a multiobjective environment. We also compare the knowledge-based genetic operator with the state-of-the-art genetic operators available in UniMap. The proposed technique is verified using several benchmarks [30].
4.1. Simulation Setup
The MCSL traffic benchmark suite [30], which supports several NoC architectures, provides the real traffic traces in this experiment. Three real applications using mesh-based architecture are included in MCSL: Fpppp, Sparse, and Robot. The chosen network sizes represent large-scale NoC. Additionally, we implement a 215-core benchmark that is available in UniMap and was also used in [5]. The application mapping is evaluated on a mesh-based NoC with deterministic routing. A mesh-based NoC is chosen for its scalability and its simplicity of implementation.
A mesh-based NoC architecture is used for all MCSL benchmarks, whereas a separate NoC size is used for the 215-core benchmark. All tasks in each application have been scheduled and mapped onto the IP cores. The MCSL benchmarks provide information on packet size, execution time, memory, and transmission dependency. Dynamic information such as transmission dependency increases the simulation time drastically, especially for large-scale NoC. Thus, only packet size and execution time are considered. The HotSpot thermal model requires the power consumption of each IP core, which is not available in MCSL. Therefore, the power of each core is generated according to the ratio of the core's execution time to the total system execution time. Power for the 215-core benchmark is available in UniMap.
For all benchmarks, network partitioning is performed using Chaco [27] or hMetis [31] before the application mapping stage. Chaco performs bisection partitioning, whereas hMetis performs k-way partitioning. The purpose of partitioning is to group highly communicating cores in the same partition while performing the min-cut operation; any partitioning tool that fulfils this purpose can be used. The network partitioning information is used to generate the initial population. For each benchmark, every simulation with a given initialization method, either the proposed NP-based or the random initial mapping, starts from an identical initial population.
We implemented our proposed technique in the UniMap framework. UniMap is a unified framework for the evaluation and optimization of application mapping algorithms for NoC architectures. We used the multiobjective GA environment available in UniMap, which integrates SPEA2 from the jMetal library, a multiobjective metaheuristics library. Several GA parameters are fixed: the crossover probability is 0.9 and the mutation probability is 0.3. The mutation probability is set according to our analysis of the OSA mutation technique. This work does not analyse the optimal GA parameters; rather, it assesses the effectiveness of the knowledge-based initial population and genetic operator in a multiobjective environment. The population size is set to 100 for all benchmarks, and the GA terminates after 500 generations. The SPEA2 archive, which stores the Pareto front of each generation, has a size of 10. Other parameters follow the default settings in UniMap.
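The power derivation described above amounts to a simple normalization, sketched below in Python. The sketch assumes that the total system execution time is the sum of the per-core execution times; the core names and times are hypothetical.

```python
def core_power(exec_time):
    """Scale each core's power by its share of the total execution time.

    Assumes the total system execution time is the sum over all cores.
    """
    total = sum(exec_time.values())
    return {core: t / total for core, t in exec_time.items()}

# Hypothetical per-core execution times.
exec_time = {"c0": 30.0, "c1": 50.0, "c2": 20.0}
power = core_power(exec_time)
print(power["c1"])  # 0.5
```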
4.2. Results and Discussion
We first analyse the effectiveness of NP initial mapping in a multiobjective environment using the SPEA2 genetic algorithm. The proposed DK crossover is implemented in the multiobjective environment of the UniMap framework. Besides our proposed crossover, the mapping similarity (MS, also knowledge-based) and partially matched crossover (PMX, random-based) algorithms available in UniMap are chosen to assess the effectiveness of knowledge-based initial mapping. SWAP and OSA, also available in UniMap, are chosen as the mutation techniques. Table 1 shows all the combinations of initial mapping and genetic operators used to analyse our proposed technique.

Figure 4 shows the Pareto fronts obtained in the final generation by different combinations of initial mapping and genetic operators. Figure 4(a) shows a significant improvement, especially in energy consumption, when NP-based initial mapping is applied. We then evaluate the effectiveness of NP-based initial mapping with the other crossover algorithms available in UniMap. Figures 4(b) and 4(c) also show a significant improvement in solution quality when NP-based initial mapping is applied. The improvement is not limited to energy consumption but extends to thermal balance. NP-based initial mapping gives better mappings regardless of the genetic operators applied for optimization. It provides a promising search space for the multiobjective GA.
Figure 4: Pareto fronts in the final generation for different initial mappings. (a) DK crossover. (b) MS crossover. (c) PMX crossover.
Figure 5 shows the combined Pareto fronts obtained by merging the results of all the evaluated algorithms. Figure 5(a) shows the combined Pareto front, which consists of the nondominated solutions from all the merged Pareto fronts in Figure 4. All mappings on the combined Pareto front are obtained from NP-based initial mapping. NP-based initial mapping benefits this benchmark, which is a combination of a few smaller applications. Thus, cluster-based applications are expected to obtain a better-quality Pareto front with NP-based initial mapping. With knowledge-based genetic operators, NP-OSA-PMX and NP-OSA-DK give good energy-biased mappings. MS crossover cannot reach the combined Pareto front in this application. NP is needed for cluster-based applications to reduce mapping complexity and improve solution quality.
Figure 5: Combined Pareto fronts. (a) 215-core benchmark. (b) Sparse benchmark. (c) Fpppp benchmark. (d) Robot benchmark.
Figures 5(b)–5(d) show the combined Pareto fronts obtained for the Sparse, Fpppp, and Robot benchmarks. Both random-based and NP-based initial mappings appear on the combined Pareto fronts. However, random-based initial mapping gives only good energy-biased mappings with imbalanced thermal profiles. The random-based initial mappings that reach the Pareto front are those using either OSA mutation or DK crossover, both of which implicitly cluster highly communicating cores together. NP-based initial mapping can give thermal balance, but at a cost in energy consumption. Most multiobjective solutions are found using the OSA mutation technique.
For all the benchmarks evaluated, only DK and PMX crossover appear on the Pareto front. MS never reaches the combined Pareto front within the same number of generations. MS always shows the fastest convergence in energy minimization, but it cannot reach the Pareto front. Overall, DK crossover gives a higher number of good energy-biased mappings than PMX. When faster convergence is required, DK performs better than PMX in multiobjective optimization. However, if the maximum number of generations is increased, PMX may give a better Pareto mapping.
5. Conclusions
This paper presented NP-DKGA, which uses network partitioning as the initial mapping and a multiobjective genetic algorithm with DK crossover for NoC application mapping. The algorithm targets large-scale NoC. We analysed the effectiveness of network partitioning as the initial mapping, as well as the proposed DK crossover, in a multiobjective environment on different benchmarks. Knowledge-based initial mapping shows a significant improvement in the Pareto front compared to random-based initial mapping. Our proposed DK crossover gives a better Pareto front than the state-of-the-art MS crossover. If no limit is imposed on simulation time, PMX can provide a good Pareto front. If simulation time is restricted, NP initial mapping is preferred, especially for large-scale NoC. Our experiments show that knowledge-based initial mapping works well with all the genetic operators evaluated. Not only does it reduce mapping complexity, but it also gives better-quality Pareto front mappings. This work can be extended to a more accurate evaluation using a cycle-accurate NoC simulator.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Copyright
Copyright © 2014 Yin Zhen Tei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.