#### Abstract

Ejectors are commonly used to extract gases in the petroleum industry, where an electric pump cannot be used because the gases are flammable and there is a risk of explosion. The steam ejector is important in creating and holding a vacuum in a system. The goal of this work is to develop an object-oriented parallel numerical code to investigate the unsteady behavior of the supersonic flow in the ejector diffuser, providing an efficient computational tool for modeling different diffuser designs. The first step is the construction of a proper transformation of the solution space that generates a regular computational space on which an explicit scheme can be applied. The second step consists of developing the numerical code with an object-oriented parallel methodology. Finally, the results obtained for the flow are satisfactory compared with the physical sensors, and the parallel paradigm used not only reduces the computational time but also improves the maintainability, reusability, and extensibility of the code.

#### 1. Introduction

Steam jet ejectors offer a simple, reliable, low-cost way to produce vacuum. They are especially effective in the chemical industry, where an on-site supply of the high-pressure motive gas is available. In operation, a high-pressure motive gas enters the steam chest at low velocity and expands through the converging-diverging nozzle. This results in a decrease in pressure and an increase in velocity. Meanwhile, the suction fluid enters at the suction inlet. The motive fluid, which is now at high velocity, entrains and combines with the suction fluid [1, 2].

Ejector operation is classified as critical or noncritical flow mode; in the critical mode, the incoming flow is supersonic. Related experimental and CFD work can be found in [3, 4]. This paper investigates the behavior of the supersonic flow in the diffuser for a Mach number of 2.0. The ejector diffuser is not rectangular; therefore, it is necessary to transform the physical plane into a computational plane where the grid is rectangular.

The implementation uses JPVM to run the program on a collection of computing systems interconnected by one or more networks and treated as a single logical computational resource, which reduces the computational time. Figure 1 shows the ejector parts commonly used in the petroleum industry. Ejectors have no moving parts: a high-pressure incoming stream entrains air and other vapors at a lower pressure into the moving stream, thereby removing them from the process system at an intermediate pressure.

#### 2. The Governing Equation

We consider the two-dimensional compressible nonlinear Navier-Stokes equation written in the conservation form as

where , and are column vectors defined as,

For a calorically perfect gas, it is possible to eliminate the energy in favor of , , , and as follows: . For clarity in calculations, the elements of the vector could be denoted as

the elements of the column vector are denoted by

also, the elements of the column vector are denoted by

the viscous stress terms are written in terms of velocity gradients as

Likewise, the components of the heat flux vector (from Fourier's law of heat) are defined as
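For reference, a common textbook form of the two-dimensional compressible Navier-Stokes system in conservation form reads as follows. The symbols $U$, $E$, $F$, $E_t$, $\lambda$, and $k$ are assumed notation and may differ from the one used by the authors; Stokes' hypothesis $\lambda = -\tfrac{2}{3}\mu$ is also an assumption.

```latex
\begin{align}
&\frac{\partial U}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} = 0, \\
&U = \begin{pmatrix} \rho \\ \rho u \\ \rho v \\ E_t \end{pmatrix},\quad
E = \begin{pmatrix} \rho u \\ \rho u^2 + p - \tau_{xx} \\ \rho u v - \tau_{xy} \\ (E_t + p)u - u\tau_{xx} - v\tau_{xy} + q_x \end{pmatrix},\quad
F = \begin{pmatrix} \rho v \\ \rho u v - \tau_{xy} \\ \rho v^2 + p - \tau_{yy} \\ (E_t + p)v - u\tau_{xy} - v\tau_{yy} + q_y \end{pmatrix}, \\
&E_t = \frac{p}{\gamma - 1} + \frac{\rho}{2}\left(u^2 + v^2\right)
  \quad\text{(calorically perfect gas)}, \\
&\tau_{xx} = \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + 2\mu\frac{\partial u}{\partial x},\quad
\tau_{yy} = \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + 2\mu\frac{\partial v}{\partial y},\quad
\tau_{xy} = \mu\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right), \\
&q_x = -k\,\frac{\partial T}{\partial x},\qquad q_y = -k\,\frac{\partial T}{\partial y}.
\end{align}
```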

Properties such as the dynamic viscosity and the thermal conductivity are considered temperature dependent and are computed by Sutherland's law as
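A minimal sketch of Sutherland's law for air is given below. The reference constants (`MU_REF`, `T_REF`, `S_AIR`), the specific heat, and the constant Prandtl number used to derive the conductivity are commonly quoted values assumed here for illustration; they are not taken from the paper.

```python
# Assumed reference constants for air (not given in the text).
MU_REF = 1.716e-5   # reference dynamic viscosity at T_REF, kg/(m s)
T_REF = 273.15      # reference temperature, K
S_AIR = 110.4       # Sutherland's constant for air, K
CP_AIR = 1004.5     # specific heat at constant pressure, J/(kg K)
PRANDTL = 0.71      # Prandtl number, assumed constant


def sutherland_viscosity(temperature):
    """Dynamic viscosity in kg/(m s) at the given temperature in K."""
    ratio = temperature / T_REF
    return MU_REF * ratio ** 1.5 * (T_REF + S_AIR) / (temperature + S_AIR)


def thermal_conductivity(temperature):
    """Thermal conductivity in W/(m K), from k = mu * cp / Pr."""
    return sutherland_viscosity(temperature) * CP_AIR / PRANDTL
```

Both functions would be evaluated at every grid point after the temperature has been decoded from the flux variables.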

#### 3. The Numerical Scheme

An explicit predictor-corrector scheme for (2.1) is formulated as follows.

*Predictor:*

where the artificial viscosity is given by

where the and are two parameters; typical values of and range from to , for this application = = .
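As an illustration, one widely used form of artificial viscosity scales the second difference of a flux variable by a normalized second difference of the pressure. The sketch below applies that form along a single grid line; whether the authors use exactly this pressure sensor is an assumption, and the function and argument names are hypothetical.

```python
def artificial_viscosity(pressure, values, coefficient):
    """Smoothing term at the interior points of one grid line.

    pressure    : pressure at each point of the line
    values      : flux variable at each point of the line
    coefficient : tuning parameter of the artificial viscosity
    """
    n = len(values)
    smoothing = [0.0] * n
    for j in range(1, n - 1):
        # Normalized second difference of the pressure acts as a sensor.
        sensor = abs(pressure[j + 1] - 2.0 * pressure[j] + pressure[j - 1]) / (
            pressure[j + 1] + 2.0 * pressure[j] + pressure[j - 1]
        )
        # The sensor scales the second difference of the flux variable.
        smoothing[j] = coefficient * sensor * (
            values[j + 1] - 2.0 * values[j] + values[j - 1]
        )
    return smoothing
```

The term vanishes where the pressure varies linearly and grows near sharp pressure changes, which is where the damping is needed.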

*Corrector:*
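The predictor-corrector structure (forward differences, then backward differences on the predicted values, then an average) can be illustrated on a much simpler model problem. The sketch below applies it to the one-dimensional linear advection equation on a periodic grid; the model equation, the periodic boundary, and the omission of artificial viscosity are simplifications assumed here, not the authors' solver.

```python
def maccormack_step(u, courant):
    """Advance u_t + a u_x = 0 by one time step on a periodic grid.

    courant is a * dt / dx with a > 0.
    """
    n = len(u)
    # Predictor: forward difference in space.
    predicted = [u[j] - courant * (u[(j + 1) % n] - u[j]) for j in range(n)]
    # Corrector: backward difference on the predicted values, then average.
    # predicted[j - 1] wraps around for j = 0, which gives the periodic boundary.
    return [
        0.5 * (u[j] + predicted[j] - courant * (predicted[j] - predicted[j - 1]))
        for j in range(n)
    ]
```

With a Courant number of exactly 1 the scheme shifts the profile by one grid point per step, which makes a convenient sanity check.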

#### 4. Validating the Numerical Code

Before handling the more complex geometry of the ejector, the code is tested on a simpler case that has an exact analytical solution, so that the numerical result can be compared against it. The case is an expansion corner, as sketched in Figure 2. This comparison gives a reasonable feeling for the accuracy of the numerical results for the ejector shape. In this case the vector and the viscous stress terms in (2.1) are suppressed [5].

It is necessary to establish some details of the particular problem to be solved. The physical plane drawn in Figure 3 is considered. The flow at the upstream boundary is at Mach 2.0 with a pressure, density, and temperature equal to N/m^{2}, 1.23 kg/m^{3}, and 286 K, respectively. The supersonic flow is expanded through an angle of ; this choice is made so that an analytical solution is available. As shown in Figure 3, the calculations are made in the domain from to in and from the wall from to in.

The location of the expansion corner is at 10 in, for this geometry; the variation of is given by

We can construct a proper transformation as follows (Figure 4): let denote the local height from the lower to the upper boundary in the physical plane, clearly . Denote the location of the solid surface (the lower boundary in the physical plane) by , where .

We define the transformation as

With this transformation, in the computational plane varies from 0 to and varies from 0.0 to 1.0; corresponds to the surface in the physical plane, and corresponds to the upper boundary. The lines of constant and form a rectangular grid in the computational plane (Figure 4). The lines of constant and are also sketched in the physical plane; they form a rectangular grid upstream of the corner and a network of diverging lines downstream of the corner.

The partial differential equations for the flow are numerically solved in the rectangular space and therefore must be appropriately transformed for use in the computational plane. That is, the governing equation must be transformed into terms dealing with and . The derivative transformation is given by the following equations:
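For a general transformation from the physical coordinates $(x, y)$ to computational coordinates, written here as $(\xi, \eta)$ (assumed names), the chain rule gives the derivative transformation in the following standard form.

```latex
\begin{align}
\frac{\partial}{\partial x} &= \frac{\partial \xi}{\partial x}\,\frac{\partial}{\partial \xi}
  + \frac{\partial \eta}{\partial x}\,\frac{\partial}{\partial \eta}, \\
\frac{\partial}{\partial y} &= \frac{\partial \xi}{\partial y}\,\frac{\partial}{\partial \xi}
  + \frac{\partial \eta}{\partial y}\,\frac{\partial}{\partial \eta}.
\end{align}
```

The four coefficients $\partial\xi/\partial x$, $\partial\xi/\partial y$, $\partial\eta/\partial x$, and $\partial\eta/\partial y$ are the metrics of the transformation.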

The metrics , , , and , in (4.3), are obtained from the transformation given by (4.2), that is, and :

if then

if , then and ; if and , so the metrics could be finally expressed as :
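The boundary-fitted transformation for the expansion corner can be sketched as below, assuming the usual form in which the streamwise coordinate is kept and the vertical coordinate is the height above the wall divided by the local height. The function names, the flat upper boundary at `upper_y`, and the wall that slopes downward after the corner are assumptions made for illustration.

```python
import math


def wall_height(x, corner_x, angle):
    """Lower wall location: flat up to the corner, sloping down after it."""
    if x <= corner_x:
        return 0.0
    return -(x - corner_x) * math.tan(angle)


def local_height(x, corner_x, angle, upper_y):
    """Distance between the lower wall and the flat upper boundary."""
    return upper_y - wall_height(x, corner_x, angle)


def to_computational(x, y, corner_x, angle, upper_y):
    """Map a physical point (x, y) to computational coordinates (xi, eta)."""
    eta = (y - wall_height(x, corner_x, angle)) / local_height(x, corner_x, angle, upper_y)
    return x, eta


def metrics(x, y, corner_x, angle, upper_y):
    """Metrics of the transformation at the physical point (x, y)."""
    _, eta = to_computational(x, y, corner_x, angle, upper_y)
    h = local_height(x, corner_x, angle, upper_y)
    # Upstream of the corner the grid is rectangular, so eta does not vary with x.
    eta_x = 0.0 if x <= corner_x else (1.0 - eta) * math.tan(angle) / h
    return {"xi_x": 1.0, "xi_y": 0.0, "eta_x": eta_x, "eta_y": 1.0 / h}
```

A quick way to check such metrics is to compare them with finite differences of `to_computational`.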

The mesh for the expansion corner and for the ejector is generated automatically, using the CFL criterion to avoid numerical stability problems, where the values of are given by

where and .

The mesh obtained is shown in Figure 5. The size of the ejector grid is discrete points.

The results for the Mach number, pressure, and density of the expansion corner problem are shown in the results section. The analytical solution for the Mach number of the expansion corner can be obtained with the Prandtl-Meyer function (4.8):

The analytical solution of the expansion corner is shown in Figure 6.

The leading edge of the expansion fan makes an angle with respect to the upstream flow direction, and the trailing edge of the wave makes an angle with respect to the downstream flow direction. The angles and are Mach angles, defined as and .
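The analytical reference solution can be evaluated with a few lines of code. The sketch below implements the Prandtl-Meyer function and the Mach angle and inverts the former by bisection to obtain the downstream Mach number; the value of `GAMMA`, the bisection bracket, and the function names are assumptions, and the turning angle must be supplied by the caller.

```python
import math

GAMMA = 1.4  # ratio of heat capacities for air (assumed)


def prandtl_meyer(mach, gamma=GAMMA):
    """Prandtl-Meyer angle nu(M) in radians, valid for M >= 1."""
    ratio = (gamma + 1.0) / (gamma - 1.0)
    beta = math.sqrt(mach * mach - 1.0)
    return math.sqrt(ratio) * math.atan(beta / math.sqrt(ratio)) - math.atan(beta)


def mach_angle(mach):
    """Mach angle mu = asin(1 / M) in radians."""
    return math.asin(1.0 / mach)


def mach_after_expansion(mach_in, turn_angle, gamma=GAMMA):
    """Downstream Mach number after turning through turn_angle (radians)."""
    target = prandtl_meyer(mach_in, gamma) + turn_angle
    low, high = mach_in, 50.0  # nu(M) increases monotonically with M
    for _ in range(100):
        mid = 0.5 * (low + high)
        if prandtl_meyer(mid, gamma) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

The leading and trailing edges of the fan then follow from `mach_angle` evaluated at the upstream and downstream Mach numbers.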

The numerical solution of the expansion corner is depicted in Figure 7.

The numerical solution matches the analytical solution with an error of less than 1% at all points, and the expansion fan is well formed in the numerical solution.

#### 5. The Transformation of the Physical Ejector Diffuser

The appropriate transformation to generate a boundary-fitted coordinate system for the ejector diffuser is defined as follows:

where and are defined as

The metrics in (4.3) given by (5.1) and (5.2) are calculated as follows: , , , by (5.2), , so . The metric can be obtained differentiating and :

Therefore

Figures 8 and 9 depict the distribution of the metrics and , respectively. The size of the ejector grid is discrete points, and it was generated using the CFL marching criterion.

A Neumann boundary condition is applied at the outlet and the no-slip condition at the walls. The flow field is initialized with a pressure, density, and temperature equal to N/m^{2}, 1.23 kg/m^{3}, and 286 K, respectively; the velocity field is initialized with m/s. At the upstream boundary a velocity of Mach 2.0 ( m/s and m/s) is imposed. The simulation is performed for 5,000,000 time steps, with a time step calculated using the CFL criterion (see Figure 10).
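A CFL-limited time step for a two-dimensional compressible flow can be estimated as sketched below. This is the common inviscid estimate; the gas constants, the Courant number of 0.5, and the omission of any viscous correction are assumptions, and the exact criterion used by the authors may differ.

```python
import math

GAMMA = 1.4      # ratio of heat capacities for air (assumed)
R_AIR = 287.0    # specific gas constant for air, J/(kg K) (assumed)


def speed_of_sound(temperature):
    """Speed of sound in m/s for a perfect gas at the given temperature in K."""
    return math.sqrt(GAMMA * R_AIR * temperature)


def cfl_time_step(u, v, temperature, dx, dy, courant=0.5):
    """Largest stable time step at one grid point, scaled by the Courant number."""
    a = speed_of_sound(temperature)
    rate = abs(u) / dx + abs(v) / dy + a * math.sqrt(1.0 / dx ** 2 + 1.0 / dy ** 2)
    return courant / rate
```

In a solver the global time step would be the minimum of `cfl_time_step` over all grid points, and an inlet velocity at Mach 2.0 would be `2.0 * speed_of_sound(286.0)`.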

##### 5.1. Viscous Stress Terms

The metrics of the transformation must also be applied to the viscous stress terms. Once the metrics are applied, the terms are expanded as follows:

For the heat flux vector the transformation is

for this application the values of , and are used

#### 6. Parallelizing the Numerical Scheme

The JPVM (Java Parallel Virtual Machine) library is a software system for explicit message passing-based distributed memory MIMD parallel programming in Java. The library supports an interface similar to the C and FORTRAN interfaces provided by the Parallel Virtual Machine (PVM) system, but with syntax and semantics enhancements afforded by Java and better matched to Java programming styles. The similarity between JPVM and the widely used PVM system supports a quick learning curve for experienced PVM programmers, thus making the JPVM system an accessible, low-investment target for migrating parallel applications to the Java platform. At the same time, JPVM offers novel features not found in standard PVM such as thread safety, multiple communication end-points per task, and default-case direct message routing. JPVM is implemented entirely in Java and is thus highly portable among platforms supporting some version of the Java Virtual Machine. This feature opens up the possibility of utilizing resources commonly excluded from network parallel computing systems such as Mac-, Windows- and Linux-based systems [6, 7].

The method used is an explicit finite-difference technique which is second-order accurate in both space and time with artificial viscosity.

In the predictor step the governing equation is written in terms of forward differences; in the corrector step it is written in terms of backward differences. Following this scheme, we can divide the computation into tasks, each of which works on only a part of the solution space along the axis . If we have discrete points that divide the space over , then the computation can be performed by tasks, where the number of points on which each task operates is ; if the division is not exact, the remaining points are distributed among the tasks. For example, if we have 40 points on and we want to divide them among 6 tasks, then 4 tasks will work on 7 points each and 2 tasks on 6 points each. If the calculation is to be divided among 4 tasks, then each task will work on 10 points.

If and , then the iterators of each task will work on the following points: , , , .
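The distribution rule described above can be sketched as a small function that returns the index range handled by each task. The zero-based, inclusive ranges and the choice to give the extra points to the first tasks are assumptions made for illustration.

```python
def partition(points, tasks):
    """Split `points` grid points among `tasks` tasks as evenly as possible.

    Returns a list of (first, last) index pairs, inclusive and zero-based.
    """
    base, extra = divmod(points, tasks)
    ranges = []
    start = 0
    for task in range(tasks):
        # The first `extra` tasks take one additional point each.
        size = base + (1 if task < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges
```

For 40 points and 6 tasks this yields four ranges of 7 points and two ranges of 6 points, matching the example in the text.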

Figure 11 shows the setup of four tasks working over the same solution space; the arrows leaving the tasks indicate the messages, and the arrows ahead of each task indicate the direction of the calculation.

The main issue in a traditional parallel algorithm for a finite-difference problem is the intensive message passing, which gives poor performance in distributed memory systems, where the latency is higher. Suppose that we want to advance over 300 discrete points along the axis and we have 4 tasks; in every iteration it is necessary to send 6 messages, so there will be 300 × 6 = 1800 messages, each holding just one value, and this is only for the variable . Additionally, we have to create 3 tasks for the other flux variables , which gives a total of 4 × 1800 = 7200 messages.
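The message count can be checked with a short calculation. The sketch assumes that the tasks form a chain in which each pair of neighbours exchanges one boundary value in each direction per iteration, which reproduces the 6 messages quoted for 4 tasks; this reading of the text is an interpretation.

```python
def boundary_messages(tasks):
    """Messages per iteration when neighbouring tasks swap one value each way."""
    return 2 * (tasks - 1)


def total_messages(iterations, tasks, variables=1):
    """Total messages over all iterations and flux variables."""
    return iterations * boundary_messages(tasks) * variables
```

The total grows linearly with the number of marching steps, which is why the latency of each small message dominates on a distributed memory system.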

To reduce the number of messages it is necessary to increase the task granularity by assigning one task per flux variable. Figure 12 depicts four tasks, each working on its own flux variable.


With four tasks created, one for every term of the flux , , , and , each task carries out the calculations needed to obtain the new value of at . The procedure is as follows.

(i) Calculate the forward differences of the predictor step.

(ii) Calculate the artificial viscosity of the predictor step.

(iii) Calculate the predicted value of .

(iv) Calculate the rearward differences of the corrector step, using the predicted values of .

(v) Calculate the artificial viscosity for the corrector step.

(vi) Calculate the average derivative.

(vii) Calculate the value of the flux variable at .

In addition to the four tasks (slave tasks), a master task is needed to control the execution of the slave tasks and to carry out the following calculations.

(i) Calculate at to obtain the predicted values of the flux variables .

(ii) Apply the boundary conditions.

(iii) Adjust the flux variables at the boundary.

This procedure indicates that every task has to work over the collection of points that make up the two-dimensional mesh. Every task calculates the new values of the flux variables at the locations .
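The master and slave structure listed above can be sketched with one worker per flux variable. The sketch uses Python threads in place of JPVM tasks and a decoupled one-dimensional advection update in place of the real, coupled flux variables; both are simplifications assumed for illustration, and the names `advance_variable` and `master_step` are hypothetical. It shows the structure of the computation, not its speed.

```python
from concurrent.futures import ThreadPoolExecutor


def advance_variable(values, courant):
    """Worker: predictor-corrector update of one flux variable (toy 1-D case)."""
    n = len(values)
    # Predictor with forward differences.
    predicted = [values[j] - courant * (values[(j + 1) % n] - values[j]) for j in range(n)]
    # Corrector with backward differences on the predicted values, then average.
    return [
        0.5 * (values[j] + predicted[j] - courant * (predicted[j] - predicted[j - 1]))
        for j in range(n)
    ]


def master_step(flux_variables, courant):
    """Master: hand one flux variable to each worker and gather the new values."""
    with ThreadPoolExecutor(max_workers=len(flux_variables)) as pool:
        futures = {
            name: pool.submit(advance_variable, values, courant)
            for name, values in flux_variables.items()
        }
        updated = {name: future.result() for name, future in futures.items()}
    # In a real solver the master would now apply the boundary conditions
    # and decode the primitive variables before the next step.
    return updated
```

Each worker receives its whole variable once per step, so the number of messages no longer depends on the number of grid points.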

We follow a similar reasoning if we want to increase the granularity of tasks.

Figure 13 depicts the setup of four tasks, each working on its own flow field; in this case every task carries out the calculations to find the new values of the variables at each grid point at the times , , , .


The advantage of this paradigm for parallelism is that hybrids between coarse and fine granularity can be built quickly, and parallel tasks are easily created recursively.

#### 7. Simulation Results

In this section we show the simulation results for the expansion corner and the ejector.

##### 7.1. Expansion Corner Results

The steady-state expansion corner results for the Mach number, density, and temperature are shown in Figures 14 and 15.

##### 7.2. Ejector Results

Figures 16, 18, 20, and 22 show contour plots of the density, and Figures 17, 19, 21, and 23 show contour plots of the Mach number, after 0.2 s, 0.9 s, 1.6 s, and 5.0 s of simulated real time, when the flow is stable.

Figure 24 depicts the profile of the Mach number in the sections of the diffuser where compression, transfer, and expansion occur. Three numerical probes are taken at 5 inches, 21 inches, and 37 inches in the vertical and compared with the processed experimental results when the flow is stable [1].

#### 8. Performance Results

To gain a better perspective on the performance options, JPVM was compiled with the native compiler gcj (GNU Compiler for Java) for Linux and Windows, which avoids the overhead of the virtual machine, in order to generate a native version and compare it with the byte-code version. Task creation and communication take place on the same machine; a slow Pentium IV processor was used because it makes the times easier to measure. Table 1 shows the creation time of 1, 2, 4, 8, and 16 tasks.

In Table 1 we can see that the time of creation is practically the same in both versions.

In Table 2 we can observe that there is no significant difference between the native version and the byte-code version in the communication time of the tasks. Nevertheless, the advantage of the native version is the saving in memory. Regarding execution time, gcj can produce programs that are around 30% faster than the byte-code version.

The cluster configuration used to execute the simulation is

(i) 14 nodes, each with two Xeon DP 2.8 GHz processors and 4 GB RAM, for a total of 28 cores.

For this application a serial part of 10% is assumed, that is, the part that cannot be parallelized (when the primitive variables are decoded); the maximum *speed-up* for 16 tasks is therefore 14.5. Table 3 shows the performance metrics for 16 tasks.
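The quoted bound can be reproduced with a short calculation. With a serial fraction of 10% and 16 tasks, the value 14.5 matches the scaled speed-up of Gustafson's law, whereas Amdahl's law would give 6.4; which of the two laws the authors applied is an inference from the number, not something stated in the text.

```python
def gustafson_speedup(tasks, serial_fraction):
    """Scaled speed-up: the problem size grows with the number of tasks."""
    return tasks - serial_fraction * (tasks - 1)


def amdahl_speedup(tasks, serial_fraction):
    """Fixed-size speed-up: the problem size stays constant."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / tasks)


def efficiency(speedup, tasks):
    """Fraction of the ideal linear speed-up that is achieved."""
    return speedup / tasks
```

Dividing a measured speed-up by the number of tasks with `efficiency` gives the usual companion metric for a performance table.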

#### 9. Conclusions

A parallel numerical code has been developed to simulate the supersonic flow in the ejector diffuser on a parallel distributed system with an object-oriented methodology. The model was validated with cases for which an exact solution exists. A parallel design reduces the execution time; nevertheless, designing and building a parallel program is complex because of its nondeterministic execution, and in a scientific computing environment it is even more difficult to build and debug because a large amount of data must be managed. It is therefore necessary to use a good methodology to minimize the risk of bugs during program construction.

We have discussed the implementation of an object-oriented design using Java in computational fluid dynamics simulations. This design also provides the benefits of better maintainability, reusability, and extensibility of the code. For example, it is possible to create new parallel tasks recursively to control the granularity easily. For this reason, Java is a serious language suitable for demanding applications in science and engineering, and JPVM is a good PVM-style message-passing tool for parallel cluster computing.

Finally, this program is a good tool to investigate the behavior of the flow in the ejector diffuser and shows how to transform the solution space in an easy way.

#### Nomenclature

: | Component velocity in the direction (m/s) |

: | Component velocity in the direction (m/s) |

: | Temperature (K) |

: | Pressure (N/m^{2}) |

: | Density (kg/m^{3}) |

: | Energy |

: | Mach number |

: | Specific heat capacity at constant pressure (J/(kg K)) |

: | Ratio of heat capacities |

Pr: | Prandtl number |

: | Heat flux along direction (W/m^{2}) |

: | Ideal gas constant (J/(kg K)) |

: | Sutherland’s constant (K) |

: | Thermal conductivity (W/(m K)) |

: | Kinematic viscosity (m^{2}/s). |

#### Acknowledgment

This research was supported by the Mexican Petroleum Institute by Grant 204005.