Research Article  Open Access
Yuanhui Ni, Zhiyao Gong, Weiwen Chen, Chengmo Yang, Keni Qiu, "State-Transition-Aware Spilling Heuristic for MLC STT-RAM-Based Registers", VLSI Design, vol. 2017, Article ID 1030249, 9 pages, 2017. https://doi.org/10.1155/2017/1030249
State-Transition-Aware Spilling Heuristic for MLC STT-RAM-Based Registers
Abstract
Multilevel Cell Spin-Transfer Torque Random Access Memory (MLC STT-RAM) is a promising nonvolatile memory technology for building registers because of its natural immunity to electromagnetic radiation in rad-hard space environments. Unlike traditional SRAM-based registers, MLC STT-RAM exhibits unbalanced write state transitions because the magnetization directions of the hard and soft domains cannot be flipped independently. This feature leads to nonuniform write costs in terms of latency and energy. However, current SRAM-targeting register allocators do not account for the different write state-transition costs. As a result, those approaches heuristically select variables to be spilled without considering the spilling priority imposed by MLC STT-RAM. To address this limitation, this paper proposes a state-transition-aware spilling cost minimization (SSCM) policy that saves power when MLC STT-RAM is employed in register design. Specifically, the spilling cost model is first constructed as a linear combination of the different state-transition frequencies. Directed by the proposed cost model, the compiler picks spilling candidates to achieve lower power and higher performance. Experimental results show that the proposed SSCM technique can save energy by 19.4% and improve the lifetime by 23.2% for MLC STT-RAM-based register design.
1. Introduction
Electromagnetic radiation effects can cause several types of errors in traditional SRAM-based registers and DRAM-based memory, such as single event upset (SEU) and single event functional interrupt (SEFI). Especially in aerospace, where radiation is quite intense, the stability and correctness of systems are strongly affected. It is therefore essential to make electronic components and systems resistant to damage or malfunctions caused by ionizing radiation. Previous studies have shown that nonvolatile memories (NVMs) such as Spin-Torque Random Access Memory (STT-RAM), Phase Change Memory (PCM), Domain Wall Memory (DWM), and Flash memories [1–3] exhibit the appealing feature of soft-error immunity. Unlike charge-based memories such as SRAM, these NVMs store data as a change in physical state. Since write operations involve changing the physical state, NVMs are resilient to radiation in the harsh space environment. Among these technologies, STT-RAM has the shortest access latency and can potentially be used to build registers. Multilevel cell STT-RAM (MLC STT-RAM) offers high storage density, and recent studies have shown that the write latency of STT-RAM can be greatly reduced by modifying the bit-cell structure or increasing the write current [4]. In this paper, we consider building a fully electromagnetic-immune memory hierarchy consisting of MLC STT-RAM-based registers and nonvolatile main memory. The goal of this work is to effectively allocate MLC STT-RAM-based registers.
During compilation, the decision of which variables are kept in registers at each point in the generated code is called register allocation. Typically, register allocation is modeled as a graph coloring problem that aims to find a k-coloring solution for the interference graph. In Chaitin’s coloring [5], when the physical registers are insufficient to hold all the variables, that is, when a node in the graph cannot be provably colored, several live ranges must be selected to spill. Since the cost of writing different values in SRAM is uniform, traditional register allocators [5, 6] heuristically select potential spilling candidates without considering state-transition costs. When applied to STT-RAM-based registers, however, those techniques produce inferior spilling decisions.
For MLC STT-RAM-based registers, the programming costs of variables with different state transitions vary significantly [7]. To minimize the overall programming energy, we propose a write state-transition-aware spilling cost minimization (SSCM) technique. First, a spilling cost model is built. Then, the spilling priority order is derived from the cost model to make a better allocation decision. In particular, the proposed approach preferentially selects the potential spilling nodes with larger spilling costs. The main contributions of our paper are summarized as follows.
(i) To the best of our knowledge, this is the first work which integrates the write state-transition cost of MLC STT-RAM into the spilling policy of register allocation.
(ii) A cost model is proposed to quantify the spilling cost of variables in the potential spilling list.
(iii) An SSCM algorithm is proposed to select the best spilling candidate with the goal of reducing the overall programming energy of MLC STT-RAM.
(iv) Experiments are conducted to quantitatively evaluate the effectiveness of the proposed approach.
The rest of this paper is organized as follows. The background of STT-RAM and register allocation is introduced in Section 2, and Section 2.3 presents the motivation of this work. Section 3 derives the spilling cost model and presents the algorithm of SSCM-aware register allocation. A set of experiments is conducted to evaluate the proposed methods in Section 4. Finally, Section 5 concludes the paper.
2. Background Information
This section first describes the resistance state transitions of MLC STT-RAM and its natural immunity to electromagnetic radiation and then presents the traditional graph coloring algorithm for register allocation. Finally, previous spilling heuristics are discussed.
2.1. MLC STT-RAM Preliminaries
Among all the emerging NVMs, the spin-transfer torque RAM (STT-RAM) is considered a promising candidate for on-chip memory because of its advantages, such as low leakage, high density, fast read speed, non-volatility, and immunity to radiation-induced soft errors [8]. It features much better endurance and performance than other magnetic memory technologies. Compared to SRAM, it is up to 4 times denser and has much lower leakage energy. This enables the implementation of very large on-chip memories with near-zero static consumption, which alleviates both main memory stress and power consumption. A high tunneling magnetoresistance ratio (TMR) motivated the research on multilevel cell (MLC) STT-RAM. In an MLC STT-RAM cell, multiple bits are represented by resistance states. By doing so, MLC technology can effectively improve the memory density and power efficiency.
In a single-level cell (SLC) STT-MRAM device, the spin of the electrons is changed using a spin-polarized current. This effect is achieved in a magnetic tunnel junction (MTJ). An MTJ device consists of a reference layer and a free layer. The magnetization direction (MD) of the reference layer is fixed, while the MD of the free layer can be flipped by applying a current through the MTJ. This work adopts a 2-bit MLC STT-RAM cell, in which two MTJs with different sizes are stacked vertically atop an NMOS transistor. The four resistance states are defined by the four combinations of the MDs of the two MTJs [9].
For comparison, Table 1 shows the parameters of SRAM, SLC (single-level cell) STT-RAM, and MLC (multilevel cell) STT-RAM [10]. Registers are among the most frequently written components in a system. When architecting STT-RAM for registers, the long write latency has a great impact on both the performance and the energy of architectural components.

In conventional random access memory (RAM) technologies, data are stored as electric charge or current flows. In STT-RAM, data are stored by magnetic storage elements, namely, magnetic tunnel junctions (MTJs). Since an STT-RAM cell does not carry electric charge, it is resilient to radiation. Such natural immunity to electromagnetic radiation makes it an ideal candidate to replace the traditional SRAM technology and be used as registers in the harsh space environment [11]. Samples were exposed to 2 MeV and 220 MeV protons and showed no changes in bit state or write performance. Radiation testing results show that STT-RAM will not suffer SEUs when used in space [12]. Thanks to its easy integration with CMOS and infinite endurance, STT-RAM has been proposed for wide use in order to overcome the power challenge of conventional CMOS circuits [13]. Therefore, in many harsh environments like aerospace, STT-RAM is an ideal candidate to build registers. In fact, STT-RAM-based register files have been used in [14–16] to achieve lower dynamic and leakage energy consumption. Recently, IBM researchers in collaboration with Samsung researchers demonstrated an 11 nm STT-RAM junction, which is a significant achievement on the way to substituting DRAM with STT-RAM [17]. This work proposes to build STT-RAM-based registers for embedded systems in rad-hard environments.
The resistance of an MTJ can be changed by injecting a switching current. In particular, MLC STT-RAM has two domains, a hard domain and a soft domain. The magnetic direction of the soft domain can be changed by a small current, while applying a larger current to the MTJ affects both the hard and soft domains. In this paper, the first bit of a 2-bit datum indicates the magnetization direction of the hard domain and the second bit indicates the magnetization direction of the soft domain. State transitions of the MTJ resistance are presented in Table 2 with the following four types [18], where “R00” represents that the soft-bit and hard-bit are both low resistance. Similarly, “R01” stands for the soft-bit with low resistance while the hard-bit is high resistance, “R10” represents the soft-bit with high resistance while the hard-bit is low resistance, and “R11” represents the soft-bit and hard-bit being both high resistance.
(i) Zero transition (ZT): neither bit is changed.
(ii) Soft transition (ST): only the magnetic orientation of the soft domain is switched.
(iii) Hard transition (HT): the magnetic orientation of the hard domain is switched, and the two domains have the same orientation.
(iv) Two-step transition (TT): the transition completes in two steps, one HT followed by one ST.
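The four transition types can be expressed as a small classifier. The following is a minimal sketch, assuming the bit convention stated above (first bit for the hard domain, second bit for the soft domain) and reading "the two domains have the same orientation" as a condition on the state after the write; the function name `classify_transition` is hypothetical, and the result has not been checked against Table 2.

```python
STATES = ("00", "01", "10", "11")


def classify_transition(old: str, new: str) -> str:
    """Classify one 2-bit cell write as ZT, ST, HT, or TT.

    Assumption: character 0 is the hard-domain bit and character 1 is the
    soft-domain bit, following the convention stated in the text.
    """
    if old == new:
        return "ZT"  # neither domain is switched
    if old[0] == new[0]:
        return "ST"  # only the soft domain is switched
    if new[0] == new[1]:
        return "HT"  # hard domain switched; both domains end up aligned
    return "TT"      # one hard transition followed by one soft transition


# Full 4x4 transition map, old state -> new state.
TRANSITION_TABLE = {
    (old, new): classify_transition(old, new)
    for old in STATES
    for new in STATES
}
```

Under these assumptions each of the four types occurs four times among the sixteen possible old/new pairs.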

Table 3 presents the rated current required to switch the state of MLC STT-RAM for each transition [18]. When the current is larger than the rated current, the state can switch to the other. A negative value sets the current in the reverse direction, and “—” represents that a state cannot be directly converted into the other state. It can be seen that switching a hard domain requires a larger current than switching a soft domain. For a two-step transition, the required current is the sum of the absolute currents of both steps.

It can be seen from Table 3 that changing states has a significant impact on the energy consumption of MLC STT-RAM. It is therefore preferable to spill variables with higher programming energy to save register access energy during program execution. To achieve this goal, a spilling policy taking state-transition costs into account is proposed in this paper for MLC STT-RAM-based registers.
2.2. Graph Coloring Based Register Allocation
A graph coloring based register allocation approach was designed by Chaitin et al. [5]. Its basic data structure is the interference graph G [19]. Each node in G represents a live range, and an edge between two nodes corresponds to an interference. Adjacent nodes are live simultaneously and therefore cannot share the same physical register. The k-coloring problem assigns one of k colors (physical registers) to each node of G. The phases of the process are described as follows.
Build. Construct the interference graph by scanning the entire program.
Simplify. After the build phase, the nodes in G are examined in turn. Each node with degree < k (fewer than k neighbors) is removed from G and pushed onto the stack. Its edges are also removed from graph G.
Spill. If there exists a node with degree ≥ k, it is chosen as a potential spill candidate. Once a node is marked for spilling, it is deleted from the graph G and pushed onto the stack.
Select. Repeatedly pop a node n from the stack and reinsert it into G. If n is not a potential spilling candidate, n can be assigned a free color. If n is a potential spilling candidate, n may still be trivially colorable; in that case, it is assigned a color. Otherwise, the node is marked for an actual spill and remains uncolored.
Start Over. If a node n is marked for an actual spill, an additional store is inserted after every definition of n, and a load is inserted before every use of n. The whole graph coloring process is then started all over again.
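The Simplify, Spill, and Select phases can be sketched in a few lines. This is an illustrative sketch only, not the allocator of [5]: the graph representation (a dict from node to neighbor set), the function name `chaitin_color`, and the placeholder spill choice (highest remaining degree) are assumptions, and the Start Over phase with spill-code insertion is not modeled.

```python
def chaitin_color(graph, k):
    """One Simplify/Spill/Select pass over an interference graph.

    graph: dict mapping each node to the set of its neighbors.
    Returns (colors, spilled): a node -> color index map and the list of
    nodes marked for an actual spill.
    """
    work = {n: set(adj) for n, adj in graph.items()}  # mutable copy
    stack = []
    while work:
        # Simplify: prefer a node with fewer than k neighbors.
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            # Spill: no trivially colorable node is left, so pick a
            # potential spill candidate (placeholder rule: highest degree).
            node = max(sorted(work), key=lambda n: len(work[n]))
        for neighbor in work.pop(node):
            work[neighbor].discard(node)
        stack.append(node)

    colors, spilled = {}, []
    while stack:
        # Select: pop nodes and give each the lowest free color, if any.
        node = stack.pop()
        used = {colors[nb] for nb in graph[node] if nb in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[node] = free[0]
        else:
            spilled.append(node)  # marked for an actual spill
    return colors, spilled
```

For a triangle `a`, `b`, `c` with a fourth node `d` attached to `a`, two colors force one actual spill, while three colors suffice for all four nodes.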
A critical issue in register allocation is which node should be selected as a potential spilling candidate. Several approaches have been proposed that make this decision according to the order in which nodes are registered, the node degree, or the number of operations (uses or definitions) on the variable [19]. However, these spilling policies assume a uniform write distribution and hence fail to choose the most energy-efficient node from the potential spilling list if MLC STT-RAM is employed as the register. In this paper, considering the unbalanced write distribution of MLC STT-RAM, a cost model estimating the node spilling cost is proposed to derive a highly efficient register allocation approach.
2.3. A Motivational Example
In this section, a motivational example is presented to show how the unbalanced costs of different write state transitions impact the spilling decision for MLC STT-RAM-based registers.
The example in Figure 1 shows a 2-coloring problem handled by conventional register allocation. It is assumed that four variables must be allocated to two registers. In the Simplify phase, one node is first deleted from the interference graph and pushed onto the stack. After that, no node with a degree less than two remains. In this case, any one of the nodes a, b, and c can be selected for potential spilling. In the conventional approach, the three nodes are all added to the spilling list, and the compiler chooses the node to be spilled without any priority. In the example in Figure 1, one of these nodes is chosen as the potential spilling target in the Simplify phase and becomes the actually spilled node in the coloring phase.
In this work, since we consider registers built from MLC STT-RAM, where writes with different state transitions cost different amounts of energy, the conventional approach is no longer appropriate. Table 4 presents an example of programming a 16-bit MLC STT-RAM. It is assumed that the old value of node a is “00 01 00 01 00 01 01 10” and the new value to be written is “10 10 11 01 00 01 11 10”. The old and new values of nodes b and c are given in Table 4 as well. We also collect the numbers of the aforementioned state transitions. As presented above, a TT implies one ST and one HT, and a ZT implies no transition. As such, we can convert the above transitions by counting soft transitions and hard transitions, as shown in the lower right part of Table 4. The results identify the node whose write costs the highest energy; it is therefore preferable to spill that node.
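The counting step can be reproduced for the single 16-bit write quoted above. The sketch below assumes the first-bit-hard, second-bit-soft convention from Section 2.1 and the reading that an HT ends with both domains aligned; the helper names are hypothetical, and the figures it produces come from this sketch rather than from Table 4.

```python
from collections import Counter


def classify(old: str, new: str) -> str:
    """Classify one 2-bit cell write (bit 0 = hard domain, bit 1 = soft domain)."""
    if old == new:
        return "ZT"
    if old[0] == new[0]:
        return "ST"
    return "HT" if new[0] == new[1] else "TT"


def count_transitions(old_word: str, new_word: str) -> Counter:
    """Count transition types over space-separated 2-bit cells."""
    cells = zip(old_word.split(), new_word.split())
    return Counter(classify(old, new) for old, new in cells)


counts = count_transitions(
    "00 01 00 01 00 01 01 10",  # old value quoted in the text
    "10 10 11 01 00 01 11 10",  # new value quoted in the text
)

# A TT is one HT followed by one ST, so fold it into both domain counts.
soft_switches = counts["ST"] + counts["TT"]
hard_switches = counts["HT"] + counts["TT"]
```

Under these assumptions the write contains 4 ZT, 0 ST, 2 HT, and 2 TT, which converts to 2 soft-domain and 4 hard-domain switches.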

This observation indicates the impact of different state-transition costs on the potential spilling decision during register allocation. Different from conventional register allocation policies, the spilling costs with different state transitions are nonuniform in MLC STT-RAM-based registers. Motivated by this consideration, a spilling policy guided by state-transition cost analysis is proposed so as to reduce energy consumption in MLC STT-RAM.
3. A State-Transition-Aware Spilling Heuristic
This section first gives an overview of the proposed framework and then presents the spilling cost model driven by the state transitions of MLC STT-RAM. Finally, the algorithm for SSCM-based register allocation is presented.
3.1. Framework Overview
Previous heuristics, as described in Section 2.2, usually employ simple spilling principles. Due to the lack of a formal cost model, these heuristics fail to estimate the impact of a spilling decision on program code quality. Furthermore, since they all target SRAM-based registers, where the write cost of different values is uniform, none of them examine the write operation state. In other words, their spilling decisions are independent of the actual write cost.
In this paper, we propose a cost-based method to choose spilling variables when MLC STT-RAM is employed as the register. In order to build a formal spilling cost model, we explore the unbalanced writes to the hard domain and soft domain of MLC STT-RAM cells and the exact state-transition cost to identify the spilling cost of each node. Then, spilling candidates are selected according to their spilling costs in the spill phase. In the following subsections, a qualitative state-transition model is first constructed for cost assessment. Then, the heuristic of SSCM-based register allocation is described. This algorithm extends Chaitin’s algorithm [5] with spilling optimization. Compared to traditional Chaitin-style register allocation, SSCM-based register allocation can retain more cost-efficient variables in registers, thus delivering a promising reduction in energy consumption.
3.2. A Spilling Cost Model
In this subsection, a spilling cost model is presented to illustrate the spilling priority, which is determined from state-transition profiling information of MLC STT-RAM.
We assume that the write frequency, that is, the number of transitions between states, can be obtained through profiling. Considering an MLC STT-RAM with 2 bits per cell, the state set $S$ contains $2^2 = 4$ states. The write frequency of the state set $S$ can be calculated as follows:
$$F = \sum_{i \in S} \sum_{j \in S} N_{i \to j}, \qquad (1)$$
where $N_{i \to j}$ represents the number of transitions from state $i$ to state $j$.
The number of each of the four state transitions can be collected by the following model:
$$F_{ZT} = \sum_{i \in S} N_{i \to i}. \qquad (2)$$
The other three counts, $F_{ST}$, $F_{HT}$, and $F_{TT}$, can be obtained in a similar way.
Subsequently, the cost model of a variable $v$ can be constructed as the linear combination of $F_{ZT}$, $F_{ST}$, $F_{HT}$, and $F_{TT}$, represented by
$$\mathrm{Cost}(v) = \alpha F_{ZT} + \beta F_{ST} + \gamma F_{HT} + \delta F_{TT}, \qquad (3)$$
where $\alpha$, $\beta$, $\gamma$, and $\delta$ are defined as the weights of the state-transition frequencies. In this paper, since we focus on dynamic energy saving, each weight is defined as the execution energy of the corresponding state transition. The dynamic energy of a state transition is calculated in direct proportion to the product of the square of the transition’s average switching current and the pulse duration:
$$E_{XT} \propto I_{XT}^{2} \cdot t. \qquad (4)$$
Here, $I_{XT}$ denotes the required average switching current of state transition XT and can be obtained from Table 3, while $t$ denotes the pulse duration. The weights $\alpha$, $\beta$, $\gamma$, and $\delta$ can then be obtained by normalizing $E_{XT}$ to the range 0–1.
We calculate the write energy of every state transition in MLC STT-RAM based on the technology data reported in [18, 20], assuming a fixed write pulse duration. The frequencies of the four transition events are obtained by profiling, and $\alpha$, $\beta$, $\gamma$, and $\delta$ are obtained by normalizing the average energies $E_{ZT}$, $E_{ST}$, $E_{HT}$, and $E_{TT}$ to the range 0–1. In this way, the spilling cost model can be constructed according to (3).
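The weight derivation and the linear cost model can be illustrated numerically. The sketch below is hedged in several ways: the currents are placeholder values in arbitrary units, not the rated currents of Table 3; normalization is assumed to mean division by the largest transition energy; and the function names are hypothetical.

```python
def transition_weights(avg_current, pulse):
    """Derive per-transition weights from energy proportional to I**2 * t.

    avg_current: dict mapping transition type to average switching current.
    pulse: write pulse duration (same unit for every transition).
    Assumption: weights are scaled so that the largest energy maps to 1.
    """
    energy = {xt: current ** 2 * pulse for xt, current in avg_current.items()}
    peak = max(energy.values())
    if peak == 0:
        raise ValueError("at least one transition must draw current")
    return {xt: e / peak for xt, e in energy.items()}


def spilling_cost(freq, weights):
    """Linear combination of profiled transition frequencies and weights."""
    return sum(w * freq.get(xt, 0) for xt, w in weights.items())


# Placeholder currents (arbitrary units). The TT current is taken as the sum
# of the ST and HT magnitudes, as described for two-step transitions.
PLACEHOLDER_CURRENT = {"ZT": 0.0, "ST": 1.0, "HT": 2.0, "TT": 3.0}

weights = transition_weights(PLACEHOLDER_CURRENT, pulse=1.0)
cost = spilling_cost({"ZT": 4, "ST": 0, "HT": 2, "TT": 2}, weights)
```

Because the energy grows with the square of the current, a two-step transition dominates the cost in this toy setting, which is why variables with many such writes rank high in the spilling order.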
Once the parameters have been finalized, we can obtain the cost of each node in graph G according to (3). Then the nodes with degree not less than k are sorted by their write cost in descending order. Finally, the node with the highest cost is selected as the spilling candidate. In this way, the model for spilling cost minimization can be constructed. We use the same cost model as the measurement of spilling priority for every remaining node. If the node with the highest priority is spilled, the register energy pressure can be reduced. In this way, the allocator can make a better decision on register assignment based on the exact state-transition usage information of the STT-RAM registers.
Overall, the procedure of building spilling cost model is shown in Figure 2, while the entire implementation process is depicted in Algorithm 1.

The spilling cost model provides a sound basis for selecting potential spilling nodes. By keeping the nodes (variables) with less transition energy in registers instead of memory, it helps avoid expensive spills when considering the state-transition costs of MLC STT-RAM.
3.3. Algorithm Description
This subsection describes the proposed SSCM-based register allocation algorithm. The basic idea is to choose the potential spilling candidate with the highest spilling priority, which is determined by the variable’s write transition cost. The goal is to spill the node with a relatively expensive write cost to memory so as to relieve the register pressure and maximize energy saving during program execution. The SSCM-based register allocation mainly consists of four steps.
Step 1. An interference graph G is employed as the basic data structure for graph coloring. Then, repeatedly, a variable with degree < k is deleted from the interference graph G, until no node with degree < k remains.
Step 2. Let G′ be the graph resulting from G by successively deleting nodes with degree less than k. If G′ is empty, the variables are colored in the reverse order of deletion.
Step 3. The numbers of different write state transitions of the remaining nodes are counted through profiling. Then the cost of each variable can be obtained by (3). Subsequently, the variables are sorted in descending order of spilling cost. The variable with the greatest spilling cost is marked for a potential spill. Then the allocator attempts to color the variable.
Step 4. If every variable marked for a potential spill receives a color, the allocation stops. Otherwise, the allocator inserts spill code for the uncolored variable, rebuilds the interference graph, and starts over.
The algorithm is shown in detail in Algorithm 2. When the algorithm cannot find a variable that is trivially colorable, some variables need to be spilled. The algorithm chooses the variable with the highest spilling cost as the potential spilling candidate. If the variable cannot be colored, it is marked for an actual spill.
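The four steps can be sketched as a single coloring pass in which the spill choice is driven by a per-variable cost. This is a hedged illustration, not Algorithm 2 itself: the graph representation, the function name `sscm_allocate`, and the example costs are assumptions, and spill-code insertion with a rebuild of the interference graph is left out.

```python
def sscm_allocate(graph, cost, k):
    """One coloring pass with cost-driven selection of spill candidates.

    graph: dict mapping each variable to the set of interfering variables.
    cost:  dict mapping each variable to its spilling cost (for instance a
           weighted sum of profiled state-transition frequencies).
    Returns (colors, spilled).
    """
    work = {v: set(adj) for v, adj in graph.items()}
    stack = []
    while work:
        trivial = [v for v in sorted(work) if len(work[v]) < k]
        if trivial:
            var = trivial[0]  # Steps 1 and 2: remove a low-degree variable
        else:
            # Step 3: every remaining variable has degree >= k, so the one
            # with the greatest spilling cost becomes the potential spill.
            var = max(sorted(work), key=lambda v: cost[v])
        for neighbor in work.pop(var):
            work[neighbor].discard(var)
        stack.append(var)

    colors, spilled = {}, []
    while stack:
        var = stack.pop()
        used = {colors[nb] for nb in graph[var] if nb in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[var] = free[0]
        else:
            spilled.append(var)  # Step 4: actual spill; a rebuild would follow
    return colors, spilled


# Toy interference graph: a, b, c interfere pairwise and d interferes with a.
example_graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
example_cost = {"a": 1.0, "b": 5.0, "c": 2.0, "d": 0.5}  # hypothetical costs
example_colors, example_spilled = sscm_allocate(example_graph, example_cost, 2)
```

With two registers, the variable `b`, which carries the highest hypothetical cost, is the one that ends up spilled, while the cheaper variables stay in registers.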

As discussed previously, the proposed optimistic coloring can lead to more energy-efficient register allocation by considering the nonuniform state transitions of MLC STT-RAM-based registers.
3.4. Discussion Regarding Input-Dependence
One typical concern with most profiling-based optimizations is input-dependence, that is, whether the optimizations made for a specific set of inputs will be preserved for other inputs of the same application. For the proposed SSCM scheme, the spilling cost model is fixed for a given programming strategy, while the write state-transition frequencies vary across different applications and across different inputs. However, the optimality of SSCM depends not on the values of F but only on the descending order of the node write costs. In other words, once a spilling decision is made based on one set of inputs, this decision preserves the maximal cost reduction for other inputs as long as the descending order of the nodes remains the same, even with different frequency values. In addition, previous work on workload characterization [21] suggests that such strategies have the potential to improve the accuracy of offline prediction for the proposed SSCM policy.
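The order-preservation argument can be checked mechanically for any two profiling runs. The following minimal sketch uses hypothetical helper names and made-up cost values; ties are broken by variable name, which is an assumption added here to make the comparison deterministic.

```python
def spill_order(costs):
    """Variables sorted by descending cost; ties broken by name (assumption)."""
    return sorted(costs, key=lambda v: (-costs[v], v))


def same_spill_order(costs_a, costs_b):
    """True if two profiling runs rank the variables identically."""
    return spill_order(costs_a) == spill_order(costs_b)


# Made-up costs for the same program profiled on three different inputs.
run_1 = {"a": 3.0, "b": 2.0, "c": 1.0}
run_2 = {"a": 30.0, "b": 11.0, "c": 10.5}  # different values, same ranking
run_3 = {"a": 1.0, "b": 2.0, "c": 3.0}     # ranking reversed
```

Here `run_1` and `run_2` lead to the same spilling decisions even though their absolute costs differ, whereas `run_3` would require profiling again.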
In the experimental evaluation, this paper, like the work in [22], assesses all the benchmarks with various inputs and studies the differences in cost reduction. Regarding the proposed SSCM, two cases are evaluated: SSCM_ideal and SSCM_practical. SSCM_ideal customizes the spilling decision for different inputs of the same program, while SSCM_practical makes the spilling decision for one input and applies it to other input configurations. A comparison between the two cases shows that the impact of input variations on the optimality of SSCM is negligible, thus confirming that profiling can be done on one specific input and SSCM_practical can be employed.
4. Experiment
In this section, the experimental setup is introduced first. Then, the experimental results for evaluating the efficacy of proposed SSCM methods are presented.
4.1. Experimental Setup
We evaluate how the proposed SSCM affects the dynamic energy and lifetime of MLC STT-RAM. The architectural parameters of the MLC STT-RAM registers are listed in Table 5 [23].

Benchmarks are selected from DSP programs and Livermore benchmarks in the experiments. Using LLVM [24], the corresponding assembly code and the register write state-transition profile can be obtained. Then the cost model can be built to guide the proposed state-transition-aware spilling heuristic in register allocation. All the experiments are implemented with the SSCM_practical deployment.
4.2. Experimental Results
Typically, a register file is accessed in a single cycle. The cycle length is sized for the worst case. Thus, all accesses take the same amount of time. In this section, the proposed SSCM-MLC STT-RAM scheme is evaluated against MLC STT-RAM with traditional register allocation in terms of energy efficiency and lifetime.
4.2.1. Dynamic Energy
The consumed energy is accumulated over each 2-bit state transition in the register. Each register is 64 bits long, and the bits in the same register can be programmed simultaneously [7]. For every register, the overall energy consumption is determined by the number of state transitions of each type multiplied by the energy of that type. The energy improvement therefore depends on the number and type of state transitions. Figure 3 presents the energy consumption of the SSCM scheme (SSCM-MLC) compared with conventional register allocation applied to MLC STT-RAM without considering the spilling priority (CMLC). The results shown in Figure 3 are normalized to the CMLC scheme. As shown in Figure 3, wdf achieves the highest energy reduction among all benchmarks. The reason is that the variables of wdf with hard/two-step transitions are spilled to memory, while the low-energy variables with zero/soft transitions are kept in registers. It can be seen that the value for the benchmark livermore12 is smaller than for the others. The underlying reason is that livermore12 has more soft and zero transitions, and zero/soft transitions consume less energy than hard/two-step transitions. The proposed SSCM policy spills a large number of the zero/soft-transition variables of livermore12. As a result, the overall energy consumption of livermore12 is minimal. On average, the proposed SSCM saves energy by 19.4% over CMLC. This is mainly because the proposed SSCM policy is able to retain the energy-efficient variables in registers, thus saving more write energy.
4.2.2. Lifetime Evaluation
The best endurance test result reported for SLC STT-RAM devices so far remains bounded in the number of write cycles [25]. For MLC STT-RAM, the larger write current exponentially degrades the lifetime of a register as a result of dielectric breakdown. Furthermore, frequent accesses to registers also contribute to lifetime reduction. For two registers with the same physical properties, the lifetimes are decided by the number of writes (switch times under a write operation). The switch times of an MLC cell under a write operation are counted in the hard domain and the soft domain separately. In Figure 4, the total number of switches represents the sum over the soft domain and the hard domain. The results show that the proposed SSCM design achieves a greater switch reduction than CMLC. Specifically, the total number of switches to the soft and hard domains is reduced by 9.35% on average. This is mainly because the SSCM scheme spills more variables with two-step state transitions to memory, thus reducing the total number of switches. Overall, the MLC STT-RAM lifetime is improved by 23.2% compared to the CMLC design. As shown in Figure 4, the switching count of the benchmark floyed is smaller than that of the others. This is mainly because there are more two-step state transitions in floyed, so the SSCM scheme spills more variables with two-step state transitions to memory, thus reducing the total number of switches. It can also be observed in Figure 4 that the switching count of the benchmark livermore11 is larger than that of the others. The reason is that there are more zero state transitions in livermore11. The proposed SSCM scheme spills more variables with zero state transitions to memory and thus removes fewer switches than for the other benchmarks.
5. Conclusions
This paper has proposed a state-transition-aware spilling cost minimization (SSCM) scheme for energy reduction in MLC STT-RAM-based register design. First, an energy cost model is built to quantitatively calculate the spilling cost of each variable whose degree is not less than the number of colors k. Then the algorithm for SSCM-based register allocation is presented, which chooses the variable with the highest write cost to be spilled and assigns the physical registers to the other variables. Experimental results show that the proposed SSCM scheme can achieve a promising cost reduction in terms of register energy consumption and extend the MLC STT-RAM lifetime as well.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work is supported by the grants of Beijing Advanced Innovation Center for Imaging Technology, National Natural Science Foundation of China [Project no. 61502321], and the Project of Beijing Municipal Education Commission [Project no. KM201710028016].
References
[1] M. Ranjbar Pirbasti, M. Fazeli, and A. Patooghy, “Phase Change Memory lifetime enhancement via online data swapping,” Integration, the VLSI Journal, vol. 54, pp. 47–55, 2016.
[2] S. Mittal, “A survey of techniques for architecting processor components using domain-wall memory,” ACM Journal on Emerging Technologies in Computing Systems, vol. 13, no. 2, article no. 29, 2016.
[3] Q. Li, L. Shi, C. J. Xue et al., “Access characteristic guided read and write cost regulation for performance improvement on flash memory,” in Proceedings of the USENIX Conference on File and Storage Technologies (FAST 16), pp. 125–132, Santa Clara, Calif, USA, 2016.
[4] R. Bishnoi, M. Ebrahimi, F. Oboril, and M. B. Tahoori, “Improving Write Performance for STT-MRAM,” IEEE Transactions on Magnetics, vol. 52, no. 8, 2016.
[5] G. J. Chaitin, M. A. Auslander, A. K. Chandra, J. Cocke, M. E. Hopkins, and P. W. Markstein, “Register allocation via coloring,” Computer Languages, vol. 6, no. 1, pp. 47–57, 1981.
[6] K. D. Cooper and A. Dasgupta, “Tailoring graph-coloring register allocation for runtime compilation,” in Proceedings of the 4th International Symposium on Code Generation and Optimization, CGO 2006, pp. 39–49, USA, March 2006.
[7] M. Zhao, Y. Xue, C. Yang, and C. J. Xue, “Minimizing MLC PCM write energy for free through profiling-based state remapping,” in Proceedings of the 2015 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015, pp. 502–507, Japan, January 2015.
[8] C. J. Xue, Y. Zhang, Y. Chen, G. Sun, J. J. Yang, and H. Li, “Emerging nonvolatile memories: Opportunities and challenges,” in Proceedings of the Embedded Systems Week 2011, ESWEEK 2011 - 9th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS'11, pp. 325–334, Taiwan, October 2011.
[9] W. Wen, Y. Zhang, M. Mao, and Y. Chen, “State-restrict MLC STT-RAM designs for high-reliable high-performance memory system,” in Proceedings of the 51st Annual Design Automation Conference, DAC 2014, USA, June 2014.
[10] X. Chen, N. Khoshavi, J. Zhou et al., “AOS: Adaptive overwrite scheme for energy-efficient MLC STT-RAM cache,” in Proceedings of the 53rd Annual ACM IEEE Design Automation Conference, DAC 2016, USA, June 2016.
[11] D. Chabi, W. Zhao, J.-O. Klein, and C. Chappert, “Design and analysis of radiation hardened sensing circuits for Spin transfer torque magnetic memory and logic,” IEEE Transactions on Nuclear Science, vol. 61, no. 6, pp. 3258–3264, 2014.
[12] G. Tsiligiannis, L. Dilillo, A. Bosio et al., “Testing a commercial MRAM under neutron and alpha radiation in dynamic mode,” IEEE Transactions on Nuclear Science, vol. 60, no. 4, pp. 2617–2622, 2013.
[13] Y. Lakys, W. S. Zhao, J.-O. Klein, and C. Chappert, “Hardening techniques for MRAM-based nonvolatile latches and Logic,” IEEE Transactions on Nuclear Science, vol. 59, no. 4, pp. 1136–1141, 2012.
[14] N. Goswami, B. Cao, and T. Li, “Power-performance co-optimization of throughput core architecture using resistive memory,” in Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, HPCA 2013, pp. 342–353, China, February 2013.
[15] J. Wang and Y. Xie, “A write-aware STT-RAM-based register file architecture for GPGPU,” ACM Journal on Emerging Technologies in Computing Systems, vol. 12, no. 1, article no. 6, 2015.
[16] H. Zhang, X. Chen, N. Xiao, and F. Liu, “Architecting energy-efficient STT-RAM based register file on GPGPUs via delta compression,” in Proceedings of the 53rd Annual ACM IEEE Design Automation Conference, DAC 2016, USA, June 2016.
[17] http://www.mraminfo.com/tags/companies/ibm.
[18] Y. Chen, X. Wang, W. Zhu et al., “Access scheme of multi-level cell spin-transfer torque random access memory and its optimization,” in Proceedings of the 53rd IEEE International Midwest Symposium on Circuits and Systems, MWSCAS 2010, pp. 1109–1112, USA, August 2010.
[19] H. Falk, “WCET-aware register allocation based on graph coloring,” in Proceedings of the 2009 46th ACM/IEEE Design Automation Conference, DAC 2009, pp. 726–731, USA, July 2009.
[20] X. Lou, Z. Gao, D. V. Dimitrov, and M. X. Tang, “Demonstration of multilevel cell spin transfer switching in MgO magnetic tunnel junctions,” Applied Physics Letters, vol. 93, no. 24, Article ID 242502, 2008.
[21] P. Bogdan, “Mathematical modeling and control of multifractal workloads for data-center-on-a-chip optimization,” in Proceedings of the 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015, Canada, September 2015.
[22] M. Zhao, Y. Xue, J. Hu et al., “State Asymmetry Driven State Remapping in Phase Change Memory,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 36, no. 1, pp. 27–40, 2017.
[23] X. Liu, M. Mao, X. Bi, H. Li, and Y. Chen, “An efficient STT-RAM-based register file in GPU architectures,” in Proceedings of the 2015 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015, pp. 490–495, Japan, January 2015.
[24] C. Lattner and V. Adve, “LLVM: a compilation framework for lifelong program analysis & transformation,” in Proceedings of the International Symposium on Code Generation and Optimization (CGO '04), pp. 75–86, March 2004.
[25] H. Luo, J. Hu, L. Shi, C. J. Xue, and Q. Zhuge, “Two-step state transition minimization for lifetime and performance improvement on MLC STT-RAM,” in Proceedings of the 53rd Annual ACM IEEE Design Automation Conference, DAC 2016, USA, June 2016.
Copyright
Copyright © 2017 Yuanhui Ni et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.