Abstract

This study examines the available experimental data for a copper bromide vapor laser (CuBr laser) emitting in the visible spectrum at two wavelengths, 510.6 and 578.2 nm. Laser output power is estimated on the basis of 10 independent input parameters. The CART method is used to build a binary regression tree with respect to output power. The linear model describes 98% of the data, and the model with interactions between predictors up to the second order describes 99%, with a relative error under 5%. The resulting CART tree shows which input quantities influence the formation of classification groups and in what manner. This makes it possible to estimate which of them are significant from an engineering point of view for the development and operation of this type of laser, thus assisting in the design and improvement of laser technology.

1. Introduction

Metal vapor lasers, including copper and copper halide lasers, have long been recognized to possess unique properties and capabilities with a wide range of applications [1, 2]. They are known as the most powerful sources of coherent radiation in the visible range (510.6 nm–578.2 nm), with high beam convergence, high repetition rates, and high average output power. Devices of this type continue to be a subject of laser technology research aimed at improved performance and innovation. The main aspect of their development is the enhancement of the average output laser power.

This paper examines a copper bromide vapor laser, which belongs to the class of copper halide vapor lasers. There is continuing interest in further improving the output characteristics of this laser and its applications [1, 2]. Alongside engineering design, the mathematical modeling (analytical, numerical, statistical, simulation, or other types) of laser devices is also widely applied in practice. Standard mathematical modeling includes systems of differential and integral equations, optimization, and other mathematical methods that describe the system, allow the processes occurring within it to be calculated, and support simulations. Here, the most widely used models are kinetic models, which describe the particles and processes occurring in the operating laser medium. There is a large number of such publications for metal vapor lasers, including copper bromide vapor lasers [3–5]. Although kinetic models describe the major processes within the laser medium and the interactions between particles using hundreds of equations, their general drawback is that they cannot provide a direct overall estimate of output characteristics such as the average output power, laser efficiency, and service life. Moreover, the results from kinetic models are in the form of calculated numerical data, which need additional computer processing.

In contrast, during the last few years, statistical models have been developed and applied on the basis of accumulated experimental data. The models take the form of explicit statistical relationships, dependencies, and classifications of the basic laser parameters. They make it possible to estimate the strength and the form of the relationships between the laser parameters. All this makes it possible to direct experiments towards increased output laser parameters and to make a preliminary estimate of experimental results using the models. Traditional parametric models of metal vapor lasers have been developed and analyzed in [6–10]. Multivariate regression with principal components analysis, hierarchical cluster analysis, factor analysis, and other statistical techniques have been used. A nonlinear model of output power has been built in [11]. Nonparametric models were obtained using the Multivariate Adaptive Regression Splines (MARS) method in [6, 11]. In the recent paper [12], the models describe over 98% of the experimental data with a relative accuracy comparable to that of measurements, making it possible to predict the output power of future lasers.

In this paper, another powerful nonparametric modeling method, CART (classification and regression trees), is applied to available data for a copper bromide vapor laser. This method separates all observations, on the basis of the considered independent variables (predictors), into nonoverlapping groups in the form of a binary tree according to the degree of influence on the dependent variable, in this case laser output power.

The objective of this study is to determine the influence of 10 input laser characteristics (supplied electric power, geometric design of the tube, neon pressure, reservoir temperature, etc.) on the average output power based on available experimental data. For the first time, the powerful nonparametric technique CART, described in [13, 14], is applied to metal vapor lasers. The following basic problems are solved: (i) building an optimal regression tree; (ii) determining adequate linear models on the basis of this tree; (iii) building a tree with independent variables up to the second degree; (iv) using the models to estimate known experiments; (v) applying the models for experiment prediction; (vi) validation of the models; (vii) comparison of the results with previous parametric and nonparametric models of the same type of laser. The obtained models describe more than 98% of the data and demonstrate excellent predictive qualities. They are used to direct the construction and design of new copper bromide vapor lasers with increased output power.

The results have been obtained using the CART software package [15].

2. Subject of Investigation

The copper bromide vapor laser is an improved version of a pure copper vapor laser. It is the most powerful and effective laser in the visible spectrum demonstrating high coherence and convergence of the laser beam. We are investigating variations of this laser invented and developed at the Laboratory of Metal Vapor Lasers at the Georgi Nadjakov Institute of Solid State Physics of the Bulgarian Academy of Sciences, Sofia. The first patents related to this type of laser are [16, 17]. The copper bromide vapor laser is one of the 12 laser sources which have a wide range of applications and are commercially viable [1, 2]. The development and improvement of CuBr lasers is seen as a fundamental step in the study of copper lasers as a whole.

Copper bromide vapor lasers are sources of pulsed radiation in the visible spectrum (400–720 nm) emitting at two wavelengths: green, 510.6 nm, and yellow, 578.2 nm. They are considered to be high-pulse lasers. Neon is used as a buffer gas. In order to improve efficiency, small quantities of hydrogen are added. Unlike the high-temperature pure copper vapor laser, the copper bromide vapor laser is a low-temperature one, with an active zone temperature of about 500°C. The laser tube is made of quartz glass without high-temperature ceramics, as a result of which it is significantly cheaper and easier to manufacture. The discharge is heated by electric current (self-heating laser). It produces light pulses tens of nanoseconds long. Its main advantages are a short initial heating period, stable laser generation, relatively long service life, high output power, and high laser efficiency. A simple scheme of the laser is given in Figure 1.

The specific technical parameters of the investigated copper bromide vapor lasers are given in Table 1.

3. Description of the Data

This paper takes into account the following 10 independent input variables (predictors) and one dependent variable (response), the laser output power Pout (W). The independent variables are D (mm): inner diameter of the laser tube; DR (mm): inner diameter of the ring (without rings, DR = D); L (cm): length of the active zone (distance between the electrodes); PIN (kW): electric power supplied to the discharge; the electric power per unit length with 50% losses (kW/cm); PRF (kHz): electric pulse repetition frequency; PNE (torr): buffer gas pressure (neon); PH2 (torr): pressure of the added gas (hydrogen); C (nF): equivalent capacitance of the capacitor bank; and TR (°C): temperature of the copper bromide reservoirs.

The study uses the values of these variables taken from experiments published in [18–25]. It needs to be noted that the maximum output power achieved is  W in an experiment where the following values were measured for the input parameters as given previously: (58, 58, 200, 5, 12.5, 0.6, 17.5, 20, 1.3, and 490) [24].

The statistical summary for the whole set is given in Table 2.

It should be noted that the variables are not normally distributed, as can be seen from the values of skewness (asymmetry) and kurtosis (excess). The same holds for the multivariate distribution of the data. For this reason, nonparametric methods, which impose no requirements on the type of data distribution either for the whole sample or for subsets, are more suitable.
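The normality check mentioned above rests on two moment ratios. The following is a minimal sketch of how skewness and excess kurtosis could be computed from population moments; the function name and the sample values are illustrative assumptions and are not taken from the laser dataset.

```python
def skewness_and_excess(values):
    """Return (skewness, excess kurtosis) from population central moments.

    Both quantities are close to 0 for normally distributed data.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess = m4 / m2 ** 2 - 3.0
    return skew, excess


# Illustrative (made-up) samples: one symmetric, one with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(skewness_and_excess(symmetric))
print(skewness_and_excess(right_tailed))
```

A strongly nonzero skewness or excess for a predictor is the kind of evidence that motivates a distribution-free method such as CART.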

4. Short Description of the CART Method

The CART algorithm, as indicated by its name, solves classification and regression problems. It was developed between 1974 and 1984 by Breiman et al. [13].

CART is a nonparametric decision tree technique which builds classification or regression trees depending on whether the dependent variable is categorical or numerical. In our case, this is a regression tree.

The algorithm builds a binary decision tree. The initial set of observations is divided into groups at the terminal nodes (leaves) of the tree. The goal is to find a tree which partitions the data well with the lowest possible relative prediction error. Each branch of the tree ends with one or two terminal nodes, and each observation falls into exactly one terminal node, defined by a unique set of rules.

More specifically, the objective of the regression tree approach is to distribute the data into relatively homogeneous terminal nodes (with minimum sum of squares or minimum standard deviation) and to use the mean observed value at each node as the predicted value. The building of a tree starts from a root node containing all observations. At each step (at each current node), a rule is applied that divides the set of observations within the node into two subsets (two children) according to a condition on an independent variable (predictor) $X$ of the type

$$X \le \theta, \qquad X > \theta, \tag{1}$$

where $\theta$ is the threshold value. If a given observation from the current node meets the left inequality in (1), it is assigned to the left child node; if not, it goes to the right child node. In this way, the separation by nodes is repeated until a terminal node is reached. The general criterion for selecting the predictor variable and its threshold value at each node is the minimum of the least squares (or the minimum standard deviation) over all possible predictors and all possible threshold values for the data subset in the current node. A node is declared terminal when a preset stopping criterion is met, such as a minimum error, a minimum number of observations, or some other type of restriction [26, 27]. The observations which find their way to a given tree node are defined by a series of rules of the type (1), starting at the root of the tree.
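The split search described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used by the CART software: candidate thresholds are taken as midpoints between consecutive distinct predictor values, the helper names `sse` and `best_split` are hypothetical, and the four observations are made up.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, response, min_leaf=1):
    """Search all predictors and thresholds for the least-squares split.

    rows: list of dicts mapping predictor name -> value
    response: list of response values aligned with rows
    Returns (predictor, threshold, total_sse), or None if no split is allowed.
    """
    best = None
    for name in rows[0]:
        levels = sorted({row[name] for row in rows})
        # Assumed candidate thresholds: midpoints between distinct values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2.0
            left = [y for row, y in zip(rows, response) if row[name] <= threshold]
            right = [y for row, y in zip(rows, response) if row[name] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (name, threshold, total)
    return best


# Made-up observations with two predictors and an output power response.
rows = [
    {"PIN": 1.0, "C": 1.0},
    {"PIN": 1.5, "C": 2.0},
    {"PIN": 4.0, "C": 1.0},
    {"PIN": 5.0, "C": 2.0},
]
response = [10.0, 12.0, 100.0, 110.0]

print(best_split(rows, response))  # ('PIN', 2.75, 52.0)
```

Applying `best_split` recursively to the left and right subsets, until a stopping rule such as a minimum node size is met, yields the maximal tree that is later pruned.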

Validation is usually applied when building regression trees, since they may be sensitive to random errors in the data. It helps reduce this sensitivity by “pruning” the initial tree while maintaining its regression characteristics and accuracy. For smaller numbers of observations and variables, V-fold cross-validation is recommended. This validation technique in CART allows for the construction of very reliable models superior to standard regression models. In the general case, CART applies the least squares splitting rule to build the maximal tree and a cross-validation procedure to select the optimal tree.

In this study, we have used the standard 10-fold cross-validation, recommended for small samples. The data have been randomly divided into 10 equal nonintersecting subgroups, each containing approximately 10% of the dataset. The tree has been built using 9/10 of the data (learn sample) and the remaining 1/10 (test sample) have been used for prediction and to determine the level of the error. The tree construction process is repeated 10 times and the average error of the 10 series is taken as a general estimate. This procedure ensures accurate estimation of the dependent variable and allows for the tree to be used for the classification or regression of another dataset.
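The partitioning step of this procedure can be sketched as follows. This is only an illustration under assumptions: the function names, the seed, and the sample size of 103 are hypothetical, and the learn-sample mean stands in for a fitted regression tree.

```python
import random


def learn_test_splits(n_obs, k=10, seed=0):
    """Yield (learn, test) index lists for k-fold cross-validation."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        learn = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield learn, test


def cross_validated_mse(response, k=10, seed=0):
    """Average test MSE over k folds.

    The learn-sample mean is a stand-in predictor; a real run would fit
    a regression tree on the learn sample instead.
    """
    fold_errors = []
    for learn, test in learn_test_splits(len(response), k, seed):
        prediction = sum(response[i] for i in learn) / len(learn)
        mse = sum((response[i] - prediction) ** 2 for i in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


# Hypothetical sample size; each test fold holds roughly 10% of the data.
for learn, test in learn_test_splits(103):
    print(len(learn), len(test))
```

Each observation appears in exactly one test fold, so every measurement is predicted once by a model that did not see it.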

The predicted value in a terminal node $\tau$ is the mean of all measurements $y_i$ of the dependent variable which fall within that node:

$$\hat{y}(\tau) = \frac{1}{n_\tau} \sum_{i \in \tau} y_i, \tag{2}$$

where $n_\tau$ is the number of observations in node $\tau$.
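The node estimate described above, the mean response of the observations that reach a terminal node, can be illustrated with a short sketch. The function name, the node numbers, and the response values below are made up for illustration.

```python
from collections import defaultdict


def leaf_means(leaf_ids, response):
    """Map each terminal node id to the mean response of its observations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for leaf, y in zip(leaf_ids, response):
        sums[leaf] += y
        counts[leaf] += 1
    return {leaf: sums[leaf] / counts[leaf] for leaf in sums}


# Made-up example: five observations falling into two terminal nodes.
leaf_ids = [1, 1, 2, 2, 2]
response = [10.0, 14.0, 100.0, 110.0, 120.0]

print(leaf_means(leaf_ids, response))  # {1: 12.0, 2: 110.0}
```

Every new observation routed to a given terminal node receives that node's mean as its predicted output power.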

5. Linear CART Model of Output Power

First we will build and analyze a linear model, that is, where the predictors are the independent variables participating only with their first degree, as described in Section 3.

A CART model has been built in order to determine the relationship between laser output power and the 10 basic input laser variables. The minimum number of observations has been set at 10 for parent nodes and 5 for terminal nodes. These settings were established using the Battery ATOM feature of the CART software [15, 28]. The comparative diagram of the relative error of the models with a given number of terminal nodes is shown in Figure 2. It can be seen that minimums of 2 and 5 cases in the terminal node give almost equal relative errors of less than 2.5%.

A further specific objective of our investigation is to build a tree which classifies and predicts well the experiments with high values of output power. For this reason, we will concentrate below on the node which contains the highest values of output power Pout.

In order to grow the tree and then prune it back so as to find a tree with an optimally small relative error for the data, we apply the 10-fold cross-validation procedure described in Section 4.

By setting the minimum number of cases to 5 for the terminal nodes and 10 for the parent nodes, and setting the classification/regression criterion to least squares, an optimal regression tree is found. In practice, there exists a subset of trees whose accuracy is statistically indistinguishable from that of the optimal tree. All of these models are also candidates for the optimal model. The rule used to identify these trees is called the “1 standard error” or 1 SE rule [28]. In this study, we choose the 1 SE tree that has the same performance as the optimal tree in the subtree with the maximum output power and has the simplest structure, with the minimum number of terminal nodes.
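The 1 SE selection can be sketched as below. This is a hedged illustration: the function names are hypothetical, the standard errors are made-up values, and only the generic part of the rule is shown (the additional requirement of equal performance in the maximum-power node is not modeled).

```python
def one_se_candidates(trees):
    """Return trees whose CV error is within 1 SE of the minimum-error tree.

    trees: list of (n_terminal_nodes, cv_relative_error, cv_standard_error)
    """
    optimal = min(trees, key=lambda t: t[1])
    limit = optimal[1] + optimal[2]
    return [t for t in trees if t[1] <= limit]


def simplest_one_se_tree(trees):
    """Among the 1 SE candidates, pick the tree with the fewest terminal nodes."""
    return min(one_se_candidates(trees), key=lambda t: t[0])


# Illustrative pruning sequence; the standard errors are assumed values.
trees = [
    (62, 0.032, 0.004),
    (49, 0.030, 0.004),
    (27, 0.031, 0.004),
    (12, 0.050, 0.006),
]

print(simplest_one_se_tree(trees))  # (27, 0.031, 0.004)
```

The design choice behind the rule is to trade a statistically negligible loss of accuracy for a much smaller and more interpretable tree.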

The curve of relative errors of the generated models, including the optimal model with the smallest error, is shown in Figure 3. It can be seen that the optimal tree is the one with 49 terminal nodes and a relative error of 3.0%. After examining all other models satisfying the 1 SE rule (visualized in green), we find a tree with 27 terminal nodes, which has the fewest terminal nodes and the same performance in the hot spot nodes with the maximum output power Pout.

The selected regression CART model with 27 terminal nodes accounts for % of the sample following the 10-fold cross-validation procedure. It has a relative error of 3.1%.

Detailed information about the hot spot node is shown in Figure 4. This node contains the highest power values with a standard deviation and a local root mean square error . The value predicted by the regression using formula (2) is the average value of the response This approximation is within 6% relative error with respect to the experimental maximum, and the STD is relatively high. This is not fully satisfactory, since the STD is comparable to, but still high relative to, the unavoidable experimental error, which is considered to be within 5–10%.

Figure 5 shows all splitters used to build the selected tree with 27 terminal nodes. For all terminal nodes, the corresponding local splitting rules are given in Table 3. For node 22, which is of special interest, intersecting the local rules gives three variables, PIN, C, and PRF, limited as follows:

The overall quality of approximation with the regression tree is given in Figure 6, showing the experimental values of output power Pout against those predicted by the linear model. It can be added that the residuals of the selected model are normally distributed and no heavy tails were detected.

6. CART Model of Output Power Using up to Second Degree of Predictors

In order to build a CART tree including up to second-degree polynomials, from the 10 independent variables we form 65 predictors of the following type:

$$X_i, \qquad X_i X_j, \qquad 1 \le i \le j \le 10,$$

where the variables $X_1, \dots, X_{10}$, for ease of use, denote the input laser parameters given in Section 3. This gives 10 first-degree terms, 10 squares, and 45 pairwise products. Analogously to the linear case, we construct the binary decision tree under the restrictions of a minimum of 10 observations per parent node and a minimum of 5 per terminal node. The distribution of the relative error for all obtained trees is given in Figure 7. It can be seen that the optimal tree, with 3.8% relative error, has 62 terminal nodes.
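The count of 65 predictors follows from taking the 10 variables together with all their squares and pairwise products. The sketch below enumerates them; the function names are hypothetical, and the name `PL` is only a placeholder assumed here for the power-per-unit-length variable.

```python
from itertools import combinations_with_replacement


def second_degree_terms(names):
    """Return all first-degree terms plus all squares and pairwise products."""
    linear = [(name,) for name in names]
    quadratic = list(combinations_with_replacement(names, 2))
    return linear + quadratic


def evaluate(term, row):
    """Value of a term (a tuple of variable names) for one observation."""
    value = 1.0
    for name in term:
        value *= row[name]
    return value


# "PL" is an assumed placeholder name, not a symbol confirmed by the text.
names = ["D", "DR", "L", "PIN", "PL", "PRF", "PNE", "PH2", "C", "TR"]
terms = second_degree_terms(names)

print(len(terms))  # 65
print(evaluate(("L", "PIN"), {"L": 200.0, "PIN": 5.0}))  # 1000.0
```

Each of the 65 columns is then offered to the tree as an ordinary predictor, so a split on a product term expresses an interaction between two input parameters.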

For comparison with the linear model, we again chose a tree with 27 terminal nodes. It satisfies the same selection criteria as in the linear case. More precisely, this model has a 4.1% relative error (see Figure 7). The statistics and rules of the selected tree are given in Table 4. The significant predictor variables in the model are 30 predictors of first and second degrees: D, DR, L, PIN, PRF, PNE, C, and 23 others.

A detailed view of the hot spot nodes with the maximum values of Pout is presented in Figure 8.

The node with the highest values is number 20 (see Figure 8). The following approximation and accuracy values are achieved: the average value predicted for the leaf is , the standard deviation is 6.22, and within the leaf. The model describes % of the sample. The approximation (6) is within 4% relative error with respect to the experimental maximum, and the STD is also admissible. Thus, the indices of this model are satisfactory relative to the experimental error, which is considered to be within 5–10%.

The splitting rules for node 20 are as follows:

The general distribution diagram of the tree splitters according to variables is shown in Figure 9.

Figure 10 compares the values of Pout to the ones predicted by the regression tree in variable PredPout_quadratic. It can be added that as in the linear case the residuals of the selected model are normally distributed and no heavy tails were detected.

7. Discussion of Results and Model Comparison

Of the 10 independent physical parameters, only 6 participate in the regression tree constructed for the linear model. These defining parameters are

$$\text{PIN},\ \text{DR},\ \text{PH2},\ \text{PNE},\ \text{C},\ \text{PRF}. \tag{8}$$

As shown in Figure 5, when the cases (experiments) are separated, three main third-level branches form, corresponding to a large degree to the three types of physical classification of copper lasers—small, medium, and large bore lasers [1]. Of the parameters (8), PIN is the most important quantity. It is the root of the tree and subsequently participates in 4 more nodes related to the classification of medium and high laser power values Pout. For lower power values (along the left end branch in Figure 5), the defining parameters are PIN, DR, PH2, and PNE. For medium power values—PIN and C. For high power, these are PIN, C, and PRF, respectively.

The analysis of the second-degree tree model (Figure 9) shows that on level three there are 4 groups, 3 of which are large, as is the case in Figure 5, but the second of these groups has no continuation, so in practice there are 3 basic groups. In this case, the defining parameters besides (8) also include D, L, and TR. Of these, D is the most significant, as it participates together with PIN in the root of the tree as well as in another node, but not on its own. In view of the weaker prediction offered by the second-degree model, it can be concluded that these 3 parameters are ancillary and therefore secondary in significance with regard to the classification of the sample.

After reviewing the predictive capabilities of both models, we can conclude that the linear and second-degree models are almost equivalent. They describe the various groups of classified cases quite well and predict the values for the nodes with maximum output power within a relative error of less than 5%. Since the second-degree model is the same in structure and better at predicting the group of higher output power values, it is recommended for engineering applications which aim at increasing output power. However, the results of both models can be combined for experiment planning. Another important comparison can be made with the models obtained using another powerful nonparametric technique, MARS. For the same data, second-degree MARS models also describe 98–99% of the data but are more precise in predicting the output laser power than the CART models (see [12]). The advantage of CART models is that they provide more accurate criteria for the classification of individual experiment groups, which are of special practical use.

8. Physical Interpretation and Application of the Models

We will also discuss the influence within the models of the main parameters which define high Pout values, namely, PIN, C, and PRF.

Influence of PIN: when the supplied electric power PIN is increased, the energy of the electrons rises. This leads to a higher probability of the upper laser level being populated. Laser generation Pout increases.

Influence of C: when C goes up, the electric power supplied to the discharge increases according to the formula $P_{IN} = \tfrac{1}{2} C U^2 \cdot PRF$, where U is the voltage between electrodes. This leads to an increase of the supplied electric power PIN in the tube and subsequently of laser generation.

Influence of PRF: when the frequency of the supply increases, the emission frequency of laser generation also goes up. The number of laser pulses per unit time (1 second) is higher, which facilitates the increase of the average laser generation power.

Ensuring the combined action of these basic processes under the set conditions (8) finds practical application in planning and conducting new experiments aimed at increasing the output power of a CuBr laser.

9. Conclusion

Regression models based on a CART tree, which classifies groups of similar experiments, have been built for a copper bromide vapor laser. The variables which play the main role in increasing laser output power have been identified for classified groups, as well as the intervals these should be within when conducting future studies and developing laser sources of the same type for improving laser technology.

Acknowledgment

This paper is published in cooperation with project of the Bulgarian Ministry of Education, Youth and Science, BG051PO001/3.3-05-0001 “Science and business” and financed under Operational program “Human Resources Development” by the European Social Fund.