Abstract

This paper introduces new approaches for the analysis of frequent statement and dereference elimination for imperative and object-oriented distributed programs running on parallel machines equipped with hierarchical memories. The paper uses languages whose address spaces are globally partitioned. Distributed programs allow defining data layout and allow threads to write to and read from the memories of other threads. Three type systems (for imperative distributed programs) are the tools of the proposed techniques. The first type system defines, for every program point, a set of calculated (ready) statements and memory accesses. The second type system uses an enriched version of the types of the first type system and determines which of the ready statements and memory accesses are used later in the program. The third type system uses the information gathered so far to eliminate unnecessary statement computations and memory accesses (the analysis of frequent statement and dereference elimination). Extensions to these type systems are also presented to cover object-oriented distributed programs. Our work has two advantages over related work. First, the memory model used in this paper resembles the hierarchical style of concurrent parallel computers. Second, each analysis result is assigned a type derivation, which serves as a correctness proof.

1. Introduction

Distributed programming is about building software in which concurrent processes cooperate to achieve some task. For a problem specification, the type, number, and way of interaction of the processes needed to solve the problem are decided beforehand. Then a supercomputer can be computationally simulated by a group of workstations that carry out the different processes. A group of supercomputers can in turn be combined to provide computing power greater than that provided by any single machine. This enormous computing power provided by distributed systems is why the distributed programming style [13] is important and attractive. Examples of distributed programming languages (DPLs), based on machines having multicore processors and using the partitioned global address space model, include Unified Parallel C (UPC), Chapel, Titanium (which is based on Java), and X10.

Among the advantages of object-oriented programming (OOP) is that it combines other styles such as imperative, functional, and relational programming. The concepts of class, procedure, and inheritance are basic to OOP. These concepts result in dynamic behavior in various implementations of object-oriented programming languages.

Recomputing a nontrivial statement and reaccessing a memory location are a waste of time and power if the value of the statement and the content of the location have not changed. The purpose of frequent statement and dereference elimination analysis is to save such wasted power and time. This is an interesting analysis because it involves connecting statement and dereference calculations to program points where the calculated values may be reused. The analysis also requires changing program points at the ends of these connections. Such changes to program points have to be done carefully so that they do not destroy compositionality. Our approach to this analysis is a type system [4, 5] built on a combination of two analyses, one of which builds on the results of the other.

In previous work [4, 5], we showed for different programming languages that the type-system style is an adaptable approach for achieving many static analyses. This paper shows that this style also applies flexibly to the involved and important problem of frequent statement and dereference elimination for imperative and object-oriented distributed programs.

This paper introduces new techniques for frequent statement and dereference elimination for imperative and object-oriented distributed programs running on hierarchical memories. Simply structured type systems are the main tools of this paper’s techniques presented using the languages of Figure 2 and OODP of Figure 3. These languages are equipped with basic commands for distributed execution of programs and for pointer manipulations. The single program multiple data (SPMD) model is the execution archetype used in this paper. This archetype runs the same program on different data on different machines. The analysis of frequent statement and dereference elimination for distributed programs is achieved in three steps, each of which is done using a type system. The first of these steps achieves ready statement and memory access analysis. The second step deals with semiexpectation analysis and builds on the type system of the first step. The third type system takes care of the analysis of frequent statement and dereference elimination and is built on the type system of the second step. The paper also illustrates how these type systems can be generalized to cover object-oriented distributed languages.

This paper is an extended and revised version of [6], which treats imperative distributed programs. The work of [6] is generalized in Section 5 of the current paper to cover object-oriented distributed programs. The soundness theorems of the current paper are stated using the memory model and operational semantics given in the appendix of [6].

Motivation. The left-hand-side of Figure 1 presents a motivating example of our work. We note that lines 4 and 6 dereference which has already been dereferenced in line 2 with no changes to values of and in the path from to . This is a waste of computational power and time (accessing a secondary storage). One objective of the research in this paper is to avoid such waste by transforming the program into that in the right-hand-side of the algorithm. This is not all; we need to do that in a way that provides a correctness proof for each transformation. We adopt a style (type systems) that provides these proofs (type derivations).

Contributions. Contributions of this paper are new techniques, in the form of type systems, for achieving the following analyses for imperative and object-oriented distributed programs.
(1) The analysis of ready statement and memory access.
(2) The analysis of semiexpectation.
(3) The analysis of frequent statement and dereference elimination.

Organization. The rest of the paper is organized as follows. Section 2 presents the type system achieving the analysis of ready statement and memory access for imperative distributed programs. The analysis of semiexpectation as an enrichment of the type system presented in Section 2 is outlined in Section 3. The main type system carrying the analysis of frequent statement and dereference elimination is contained in Section 4. Type systems of Sections 2, 3, and 4 are generalized in Section 5 to cover object-oriented distributed programs. Related and future works are discussed in Section 6.

2. Ready Statement and Memory Access Analysis of

If the value of a statement and the content of a memory location have not been changed, then the compiler should not recompute the statement or reaccess the location. The purpose of frequent statement and dereference elimination is to save the wasted power and time involved in these repeated computations. This is not a trivial task; compared to other program analyses, it is a bit complex. This task is done in stages. The first stage is to analyze the given program to recognize ready statements and memory locations.

The analysis of ready statements and memory locations calculates for every program point the set of statements and memory locations that are ready at that point in the sense of Definition 1. This section presents a type system (ready type system) to achieve this analysis for imperative distributed programs.

Definition 1. (1) At a program point $p$, a statement $S$ is ready if each computational path to $p$ (a) contains an evaluation of $S$ at some point (say $p'$) and (b) does not modify $S$ (that is, does not change the value of any of the variables of $S$) between $p'$ and $p$.
(2) At a program point $p$, a memory location $l$ is ready if each computational path to $p$ (a) reads $l$ at some point (say $p'$) and (b) does not modify the content of $l$ between $p'$ and $p$.

The ready analysis is a forward analysis that takes as an input a set of statements and memory locations (the ready set of the first program point). It is sensible to let this set be the empty set. The set of types of our ready type system has the form: , where(1) is the set of nontrivial statements (Figure 2),(2) is the set of global addresses. This set is defined precisely in the appendix of [6], and(3)points-to-types is a set of points-to-types (typically have the form of maps from the union of variables and global addresses to the power set of global addresses [4, 7]).

The subtyping relation has the form , where is the order relation on the points-to-types and is the order relation on . A state on an execution path is of type if all elements of are ready at this state according to Definition 1. Judgments of the ready type system have the form . The symbols and denote the points-to-types of the before and after states of executing . The set denotes the set of addresses that may evaluate. We assume that all such pointer information is given along with the statement . Techniques like [4, 7] are available to compute the pointer information. For a given statement along with pointer information and a ready pretype rs, we present a type system to calculate a post ready-type such that . The type derivation of this typing process is a proof for the correctness of the ready information. The meaning of the judgment is that if elements of are ready before executing , then elements of are ready after executing .

The inference rules of the ready type system are presented in Algorithm 1. Comments on the inference rules are in order. We note that numbers, variables, and the allocating statement (new) do not affect the ready pretype. In line with semantic rules and [6], nontrivial arithmetic and Boolean statements and their nontrivial substatements are made ready. The direct assignment rule expresses that after executing the assignment the substatements of r.h.s. become ready and that all statements involving become unready as the value of may become different. The rule reflects the fact that the statement becomes ready after executing the dereference. Moreover if evaluates a single address according to the underlying pointer analysis, then this address becomes ready as well. However if evaluates a large set of addresses (more than one), then we are not sure which of these addresses is the concerned one and hence cannot conclude any readiness information about addresses. The rule adds the substatements of and to the ready pretype. Since the content of address referenced by is possibly changed after executing the statement, all statements involving dereferencing this address are removed from the set of ready items. Remaining rules are self-explanatory. The Boolean statements and have inference rules similar to that of .
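
The comments above can be illustrated with a small executable sketch. It is not the type system of Algorithm 1 itself: the tuple encoding of statements, the `Addr` class, the `pts` points-to map, and the function names are all assumptions made for illustration, and loops and distributed commands are omitted. It also assumes that a store removes the written addresses themselves from the ready set, in addition to the statements that dereference them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Addr:
    """A global address tracked as a ready item (hypothetical representation)."""
    name: str

def subexprs(e):
    """Nontrivial subexpressions of e; numbers and variable names are trivial."""
    if not isinstance(e, tuple):
        return set()
    found = {e}
    for child in e[1:]:
        found |= subexprs(child)
    return found

def variables(item):
    """Variable names occurring in an expression (empty for Addr and numbers)."""
    if isinstance(item, str):
        return {item}
    if isinstance(item, tuple):
        return set().union(*(variables(c) for c in item[1:]))
    return set()

def may_read(item, addrs, pts):
    """True if item contains a dereference ("deref", p) that may read one of addrs."""
    if not isinstance(item, tuple):
        return False
    if item[0] == "deref" and pts.get(item[1], set()) & addrs:
        return True
    return any(may_read(c, addrs, pts) for c in item[1:])

def gen(e, pts):
    """Items made ready by evaluating e: its nontrivial subexpressions, plus the
    target address of a dereference when the pointer analysis gives exactly one."""
    items = set(subexprs(e))
    for s in subexprs(e):
        if s[0] == "deref":
            targets = pts.get(s[1], set())
            if len(targets) == 1:
                items.add(Addr(next(iter(targets))))
    return items

def ready_post(stmt, ready, pts):
    """Ready set after stmt, given the ready set before it (forward analysis)."""
    kind = stmt[0]
    if kind == "assign":                      # x := e
        _, x, e = stmt
        out = ready | gen(e, pts)
        return {i for i in out if x not in variables(i)}
    if kind == "store":                       # *p := e
        _, p, e = stmt
        written = pts.get(p, set())
        out = ready | gen(e, pts)
        return {i for i in out
                if not (isinstance(i, Addr) and i.name in written)
                and not may_read(i, written, pts)}
    if kind == "seq":                         # s1; s2
        _, s1, s2 = stmt
        return ready_post(s2, ready_post(s1, ready, pts), pts)
    if kind == "if":                          # if b then s1 else s2
        _, b, s1, s2 = stmt
        after_guard = ready | gen(b, pts)
        return ready_post(s1, after_guard, pts) & ready_post(s2, after_guard, pts)
    raise ValueError(f"unknown statement kind: {kind}")
```

In this sketch, an item is ready after a conditional only if both branches make it ready, which mirrors the requirement in Definition 1 that every computational path contain an evaluation.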

Algorithm 1: Inference rules of the ready type system.

All in all, the information provided by type derivations obtained using this and the following type system is classified into two sorts. The first sort is about knowing the program point at which a particular statement becomes ready. The second sort is about the program point at which a ready statement can be replaced with its precomputed value.

Now we recall the assumption that our distributed system consists of machines. For a given statement and a given machine , the type system of Algorithm 1 calculates for each program point of , the set of ready items. The following rule can be used to combine the information calculated for each machine to get new ready information for each program point. The new ready information is valid on any of the machines.

Consider

The rule (main-rs) supposes a suitable notion for the join of pointer types. The soundness of the ready type system is stated as follows.

Theorem 2. Suppose that , and the items of are ready at the point corresponding to on the execution path. Then the items of are ready at the point corresponding to on the execution path.
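
As an illustration of the per-machine combination step described before Theorem 2, the following sketch merges ready information computed separately for each machine. It assumes that the combined ready set at a program point is the intersection of the per-machine sets, which is one natural reading of "valid on any of the machines"; the rule (main-rs) itself is not reproduced here, and the join of points-to types is omitted. The function name and the dictionary layout are hypothetical.

```python
def combine_ready(per_machine):
    """Combine per-machine ready information.

    per_machine: one dict per machine, mapping a program point to the set of
    items found ready at that point on that machine.
    Returns a dict mapping each program point to the items ready on every machine.
    (Assumption: intersection is the combining operator.)
    """
    points = set().union(*(m.keys() for m in per_machine))
    combined = {}
    for pt in points:
        sets = [m.get(pt, set()) for m in per_machine]
        combined[pt] = set.intersection(*sets)
    return combined
```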

3. Semiexpectation Analysis of

The aim of frequent statement elimination is to introduce new variables that hold the values of frequent statements and to reuse these values rather than recompute the statements. Analogously, the aim of frequent dereference elimination is to introduce new variables that hold the values of frequent dereferences and to reuse these values rather than reaccess the memory. The information gathered so far by the ready type system introduced in the previous section is not enough to achieve frequent statement and dereference elimination. We need to enrich the ready information, assigned to each program point, with new information called semiexpectable information.

Definition 3. (1) At a program point $p$, a statement $S$ is semiexpectable if there is a computational path from $p$ that (a) contains an evaluation of $S$ at some point (say $p'$), where $S$ is ready at $p'$, and (b) does not evaluate $S$ between $p$ and $p'$.
(2) At a program point $p$, a memory location $l$ is semiexpectable if each computational path to $p$ (a) reads $l$ at some point (say $p'$) where $l$ is ready at $p'$, and (b) does not read $l$ between $p$ and $p'$.

The semiexpectation analysis is a backward analysis that takes as an input a set of statements and memory locations (the semiexpectable set of the last program point). It is sensible to let this set be the empty set. The following example gives an intuition for the previous definition: Neither the statement nor the statement is ready after the if statement because they are not computed in all branches. Hence it is not true to replace these statements with variables towards optimizing the last statement of the example. The job of the type system presented in this section is to provide us with this sort of information. More precisely, as the statements and are not ready after the if statement, the second statement of the example does not make them semiexpectable.

The semiexpectation analysis assigns for each program point the set of items that are semiexpectable. The analysis is based on the readiness analysis and is backward. The set of types of the semiexpectation type system has the form: . The subtyping relation has the form . A state on an execution path is of type if all elements of are semiexpectable according to Definition 3. Judgments of the semiexpectation type system have the form . For a given statement along with pointer information, readiness information, and a semiexpectation type , we present a type system to calculate a pre-semiexpectable-type such that . The type derivation of this typing process is proof for the correctness of the semiexpectable information. The meaning of the judgment is that if elements of are semiexpectable after executing , then elements of must have been semiexpectable before executing .

The inference rules of the semiexpectation type system are shown in Algorithm 2. Some comments on the inference rules are in order. In the rule , given the posttype , we calculate the pretype for the statement . Then the resulting pretype is used as a posttype for the statement to calculate the pretype . In line with Definition 3, the arithmetic statement is added to only if it belongs to . Similar explanations illustrate the rule . The remaining rules mimic the rules of the ready type system.
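
The backward flow of the sequencing rule and the side condition on readiness can be sketched for straight-line code as follows. This is a simplified illustration, not Algorithm 2: programs are assumed to be lists of assignments `(x, e)` without pointers or branching, and the transfer function (remove every subexpression evaluated here, then add back those that were already ready) is one reading of Definition 3. All names are hypothetical.

```python
def subexprs(e):
    """Nontrivial subexpressions of e; numbers and variable names are trivial."""
    if not isinstance(e, tuple):
        return set()
    found = {e}
    for child in e[1:]:
        found |= subexprs(child)
    return found

def variables(e):
    """Variable names occurring in expression e."""
    if isinstance(e, str):
        return {e}
    if isinstance(e, tuple):
        return set().union(*(variables(c) for c in e[1:]))
    return set()

def ready_pre_sets(program):
    """Forward pass: the ready set before each assignment (x, e)."""
    ready, pres = set(), []
    for x, e in program:
        pres.append(set(ready))
        ready = {s for s in ready | subexprs(e) if x not in variables(s)}
    return pres

def semiexpectable_post_sets(program, pres):
    """Backward pass: the semiexpectable set after each assignment.

    Starts from the empty set at the last program point. An expression
    evaluated by an assignment is semiexpectable before it only if it is
    ready there (it belongs to the ready pretype).
    """
    se, posts = set(), [None] * len(program)
    for i in range(len(program) - 1, -1, -1):
        _, e = program[i]
        posts[i] = set(se)
        evaluated = subexprs(e)
        se = (se - evaluated) | (evaluated & pres[i])
    return posts
```

For the program `u := a + b; c := 1; v := a + b`, the sketch reports `a + b` as semiexpectable after the first and second assignments, since the third assignment evaluates it while it is still ready.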

Algorithm 2: Inference rules of the semiexpectation type system.

Now we recall the assumption that our distributed system consists of machines. For a given statement and a given machine , the type system given above calculates for each program point of the set of semiexpectable items. Now the following rule can be used to combine the information calculated for each machine to get new semiexpectable information for each program point. The new semiexpectable information is valid on any of the machines.

Consider

The difference in the way that this rule treats the semiexpectable information and the way ready information is treated is explained by the fact that the ready analysis is forward while the semiexpectation analysis is backward.

It is not hard to prove the soundness of the above type system.

Theorem 4. Suppose that and the items of are semiexpectable at the point corresponding to on the execution path. Then the items of are semiexpectable at the point corresponding to on the execution path.

4. Frequent Statement and Dereference Elimination of

This section presents a type system that is an enrichment of the type system presented in the previous section. The type system of this section achieves the frequent statement and dereference elimination. The type system uses a function that assigns each nontrivial statement a name. These names are meant to carry values of frequent statements and dereferences. The judgments of our type system have the form . The type information and were calculated by the previous type system. is the optimization of and is a sequence of assignments that links optimized statements with the names of their unoptimized versions.

Algorithms 3 and 4 present inference rules for frequent statement and dereference elimination. We note the following on the inference rules. A great deal of optimization is achieved by the three rules for . These rules are , and . The rule takes care of the case where is ready and is replaceable by its name under the function . The rule treats the case where is semiexpectable and is not ready before calculating the statement. In this case, a statement name of is used. The rule considers the case where is neither semiexpectable at the program point after execution nor ready before calculating the statement. In this case, the statement does not get changed. Similarly, the three rules , and treat different cases for arithmetic statements. The Boolean statements are treated with rules quite similar to those for arithmetic statements. The rule reuses frequent substatements of the guard. This is done via adding in the positions clarified in the rule. The remaining rules of the system are self-explanatory.
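
The three cases described above can be sketched as a rewrite over straight-line assignments. This is a minimal illustration under stated assumptions, not the rules of Algorithms 3 and 4: the ready pretypes and semiexpectable posttypes are taken as given inputs, only the top-level right-hand side is considered, and in the second case the sketch emits an explicit assignment to the name followed by a use of that name. The parameter `name_of` stands in for the naming function and is hypothetical.

```python
def eliminate(program, ready_pre, se_post, name_of):
    """Rewrite a list of assignments (x, e) using precomputed analysis results.

    ready_pre[i]: items ready before assignment i.
    se_post[i]:   items semiexpectable after assignment i.
    name_of(e):   the variable name reserved for expression e (assumed given).
    """
    optimized = []
    for (x, e), ready, se in zip(program, ready_pre, se_post):
        if not isinstance(e, tuple):
            optimized.append((x, e))           # trivial right-hand side
        elif e in ready:
            optimized.append((x, name_of(e)))  # case 1: reuse the stored value
        elif e in se:
            optimized.append((name_of(e), e))  # case 2: compute once and store
            optimized.append((x, name_of(e)))
        else:
            optimized.append((x, e))           # case 3: leave unchanged
    return optimized
```

A possible usage, with the analysis results written out by hand for the program `u := a + b; c := 1; v := a + b`:

```python
E = ("+", "a", "b")
result = eliminate([("u", E), ("c", 1), ("v", E)],
                   [set(), {E}, {E}],
                   [{E}, {E}, set()],
                   lambda e: "t1")
# result == [("t1", E), ("u", "t1"), ("c", 1), ("v", "t1")]
```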

Algorithms 3 and 4: Inference rules for frequent statement and dereference elimination.

For expressing the soundness, we introduce the following definition.

Definition 5. Suppose that is a state defined on the set of locations, Loc ([6, Definition 4]). Suppose also that is a state defined on . The expression denotes the fact that and are equivalent with respect to the semiexpectation type se. More precisely if and only if (1), and(2).

The soundness of frequent statement and dereference elimination means that the original and optimized programs are equivalent in the following sense: (i) the states of the two programs coincide on Loc, and (ii) if a statement is both ready and semiexpectable, then its semantics in the original-program state equals the value of its corresponding name in the optimized-program state. This gives an intuition for the previous definition. The following soundness theorem is proved by structural induction.

Theorem 6. Suppose that and . Then (i);(ii).
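
The equivalence between original and optimized states that underlies this soundness statement can be illustrated with a small check. The sketch below assumes states are dictionaries from locations (and, for the optimized program, also names) to integers, and that statements are arithmetic expressions over `+`, `-`, and `*`; the function names and this encoding are hypothetical and much simpler than the memory model of [6].

```python
def evaluate(e, state):
    """Value of an arithmetic expression in a state (dict from names to ints)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return state[e]
    op, left, right = e
    a, b = evaluate(left, state), evaluate(right, state)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

def equivalent(orig, opt, locs, ready, se, name_of):
    """Check the two conditions sketched in the text:
    (i)  orig and opt coincide on every location in locs;
    (ii) each statement that is both ready and semiexpectable has, in orig,
         the value stored under its name in opt.
    """
    same_on_locs = all(orig[l] == opt.get(l) for l in locs)
    names_agree = all(evaluate(s, orig) == opt.get(name_of(s))
                      for s in ready & se)
    return same_on_locs and names_agree
```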

5. Frequent Statement and Dereference Elimination of OODP Programs

This section generalizes the type systems of previous sections to cover object-oriented distributed programs. Hence, a new model for object-oriented distributed programs and necessary changes to proposed type systems for the analysis of frequent statement and dereference elimination are presented in this section. Object-oriented concepts such as subtyping and inheritance are included in the model language (dubbed OODP) whose syntax is shown in Figure 3.

In line with OOP concepts, local variables are contained in functions and live while their functions are live. While parameters of function are represented using local variables, a class’s internal state is contained in its instance variables. A class is a container for a set of function definitions. Each function has parameter , a main statement , and a statement representing value returned by the function. Hence an OODP program is a set of classes followed by a “main” function. Figure 4 presents semantic spaces and naming conventions used in the rest of the paper.

As shown in the previous sections, the analysis of frequent statement and dereference elimination for imperative distributed programs is achieved in three steps. In the following, we show necessary changes to the three type systems presented so far to cover object-oriented distributed programs.

For each program point, ready statements and memory locations (Definition 1) are computed by the analysis of ready statements and memory locations. Adding the rules of Algorithm 5 to those of Algorithm 1 results in a type system that carries out this analysis for object-oriented distributed programs of Figure 3. Using the semantic notions of Figure 4, Definitions 1, 3, and 5 are applicable and convenient for the analyses in this section for the language OODP.

Algorithm 5: Inference rules of the ready type system for OODP.

Comments on the inference rules are in order. The rules of Algorithm 5 suppose the existence of a class analysis that calculates the set of classes that a statement may reference. The judgments of the proposed analysis have the form . The intuition of such judgments is that the pointer information are used to calculate the set . In the rule , ready substatements of and are added to to produce . Then for any class that may reference, statements involving are removed from . In the rule includes classes that may reference. For all functions named in classes of , the body and return statements are enumerated in the set . Ready substatements of these statements are added to to produce . Then all statements involving are removed from .

Using semantics notations of Figure 4, soundness of the type system of Algorithm 5 is stated as follows.

Theorem 7. Suppose that and the items of are ready at the point corresponding to on the execution path. Then the items of are ready at the point corresponding to on the execution path.

The goal of the main analysis of this section for OODP is as follows:

Introducing new variables to maintain values of frequent statements and dereferences and then reusing these values instead of recomputing the statements and reaccessing the memory.

To achieve this goal, the ready information needs to be enriched with semiexpectable information.

Adding rules of Algorithm 6 to that of Algorithm 2 results in a type system that calculates the analysis of semiexpectation for object-oriented distributed programs of Figure 3. Some comments on the inference rules of Algorithm 6 are in order. In the rule , starting with the posttype , the pretype is calculated for the statement . Then is used as a posttype for to get the main pretype . Similarly to , the rule enumerates body and return statements of convenient functions. Then sequentially is calculated starting from . The remaining rules mimic the rules of the ready type system.

Algorithm 6: Inference rules of the semiexpectation type system for OODP.

Using semantics notations of Figure 4, soundness of the type system of Algorithm 6 is stated as follows.

Theorem 8. Suppose that , and the items of are semiexpectable at the point corresponding to on the execution path. Then the items of are semiexpectable at the point corresponding to on the execution path.

Adding rules of Algorithm 7 to that of Algorithm 3 results in the main type system achieving the analysis of frequent statement and dereference elimination for object-oriented distributed programs of Figure 3. We note the following on the inference rules. Optimization is based on rules for ; , , and . The case that is ready and is replaceable by its name under the function is treated by . The case is semiexpectable but not ready before calculating the statement is treated by . The rule takes care of the case, where is neither ready before the calculation nor semiexpectable after execution.

Algorithm 7: Inference rules for frequent statement and dereference elimination for OODP.

The following definition generalizes Definition 5 and is necessary to express soundness.

Definition 9. Suppose that is a state defined on the set of locations Loc ([6, Definition 4]). Suppose also that is a state defined on . The expression denotes the fact that and are equivalent with respect to the semiexpectation type se. More precisely if and only if (1)for all, and(2)for all .

Using semantics notations of Figure 4, soundness of the type system of Algorithm 7 is stated as follows.

Theorem 10. Suppose that and . Then (i). and ;(ii). and .

6. Related Work

The techniques of common subexpression elimination (CSE) [8, 9] are close to our work. In [10], a type system for CSE of the while language is introduced. The work presented in our paper can be seen as a generalization of that presented in [10]. The generality of our work is evident in our language models, which are much richer with distributed, pointer, and object-oriented commands. Consequently, the operational semantics against which we measure the soundness of our system is much more involved than that used in [10]. Using new opportunities appearing while scheduling control-intensive designs, the work in [11] introduces a technique that dynamically eliminates common subexpressions. To optimize polynomial expressions (important for application domains like computer graphics and signal processing), the paper [12] generalizes algebraic techniques originally designed for multilevel logic synthesis. The generalization in [12] uses factoring to eliminate common subexpressions of polynomial expressions.

There are many analyses for optimizing object-oriented programs. In [13], evolutionary multiobjective optimization methods are used to present a Class-Based Elitist Genetic Algorithm (CBEGA) for testing OOP. A new method for optimizing field access in concurrent object-oriented programs is presented in [14]. This work relies on the correctness notion that programmers must use concurrency control. A new concurrency abstraction model is presented in [15]. This model has the advantage of separating the specification of the synchronization code from the method bodies.

The association of a correctness proof with each result of the static analysis is important and needed by applications like proof-carrying code and certified code. The work presented in this paper has the advantage over most related work of constructing these proofs. Adding to the value of using type systems, the proofs constructed in our proposed approach have the form of type derivations. The work in [4, 16, 17] presents many examples of other static analyses that are in the form of type systems.

In [18], a technique for flow-insensitive pointer analysis of programs that run on parallel and hierarchical machines and that share memory is introduced. Via a two-level hierarchy, [19, 20] present constraint-based approaches to evaluate locality information and sharing attributes of references. Our language model is a generalization of models presented in [18, 19].

Much research activity [18, 21] has been devoted to analyzing distributed programs. This is motivated by the importance of distributed programming as a mainstream programming style today. Examining and capturing causal and concurrent relationships are among the important issues for many distributed-systems applications. In [22], an analysis that examines the source code of each process constructs an inclusive graph, POG, of the possible behaviors of systems. Data race bugs [23] can be a side effect of the parallel access of the cores of a multicore processor to a physically distributed memory. In [23] a technique, called DRARS, is proposed for the avoidance and replay of such data races. Parallel programs on DSM or multicore systems can be debugged using DRARS. The classical problems of satisfiability decidability and algorithmic decidability are approached in [24] on the message-sending model of distributed programs. In this work, distributed programs are represented by communicating via buffers.

7. Conclusion

This paper introduces new techniques for the analysis of frequent statement and dereference elimination for imperative and object-oriented distributed programs running on parallel machines equipped with hierarchical memories. Type systems are the tools of the techniques presented in this paper. The first sort of proposed type systems defines for program points of a distributed program sets of calculated (ready) statements and memory accesses. The second sort determines which of the ready statements and memory accesses are used later in the program. The final sort eliminates unnecessary statement computations and memory accesses.

Disclosure

This is an extended and revised version of [6].

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.