Abstract

Bay laurel leaves, also known as bay leaves, are an important herb in many cuisines around the world. In addition to their culinary use, bay leaves have been used for their medicinal properties and are thought to have anti-inflammatory and antimicrobial effects. A gas chromatography/mass spectrometry (GC-MS) device was used to determine the secondary metabolites in the essential oil of bay laurel leaf samples kept at different temperatures (−22, −20, −18, 2, 4, 6, and 22°C) for different storage times (1, 2, and 3 months). Temperature (°C) and storage time (month) were used as input parameters in the neural network, and alpha-pinene, beta-pinene, sabinene, 1,8-cineole, gamma-terpinene, cymenol, linalool, borneol, 4-terpineol, caryophyllene, sabinene, alpha-terpineol, germacrene-D, alpha-selinene, methyl eugenol, caryophyllene oxide, spathulenol, eugenol, and beta-selinenol were used as output parameters. The artificial neural network analysis yielded R² values of 0.97156 for testing, 0.98978 for training, 0.98998 for validation, and 0.98831 for all data.

1. Introduction

Laurus nobilis L., commonly known as bay laurel or sweet bay, is a tree or shrub in the Lauraceae family. It can grow to be 3–10 meters tall and has yellow flowers with two petals. Some sources indicate that under optimal conditions, it can grow up to 15–20 meters tall. It is a characteristic species of the Mediterranean maquis vegetation. The leaves of bay laurel have traditionally been used to treat certain illnesses. They are reported to have a soothing effect on rheumatic pain and stomach complaints, and to have diuretic, antiseptic, and stomachic properties. Its essential oils and high lauric acid content make it a valuable ingredient in soap making and in the cosmetics and perfume industry, where it is counted among the fragrant woody plants. Recently, there has been increasing interest in natural dyestuffs as a replacement for synthetic dyestuffs. The anthocyanin in the laurel (Laurus nobilis L.) fruit is used as a natural dye in the food, pharmaceutical, and cosmetic industries [1, 2].

The growth and development of a plant, as well as the synthesis and accumulation of secondary metabolites, are greatly influenced by various environmental factors, such as pathogens, herbivores, light, altitude, temperature, irrigation, soil fertility, and salinity. Secondary metabolites, which are produced in response to changes in the environment, such as external stimuli or stressors, play a crucial role in the plant’s adaptation to its environment [3]. For instance, a wide array of phenolic compounds, which are considered to be key defense compounds, is generated via the phenylpropanoid pathway [4–6]. Optimizing the relevant growing conditions and using germplasm potentials can improve the production and quality of medicinal plant raw materials [7]. Among the external factors, mineral elements have various effects on primary metabolism and the production of secondary metabolites, which can greatly impact plant growth and development and the quality of raw materials [8]. Researchers studying the secondary metabolites of Laurus nobilis L., which is a medicinal, aromatic, and perfume plant, stated that 1,8-cineole is the main component and listed the other components as linalool, trans-sabinene hydrate, α-terpinyl acetate, methyl eugenol, sabinene, eugenol, α-pinene, β-ocimene, and β-pinene [9–11].

Artificial neural networks (ANNs) are a nonlinear, data-driven, self-adaptive approach to modeling complex relationships [11, 12]. ANNs can identify and learn the correlation patterns between independent variables and corresponding target variables when the underlying relationship is unknown and can then use this knowledge to predict the dependent variables based on new independent variable data sets [13]. In essence, ANNs perform nonlinear mapping or pattern recognition [14–16]. If a set of input data corresponds to a specific pattern, the network can be trained to produce a corresponding desired pattern at the output [17, 18]. The network has the ability to learn and estimate the output.

The aim of this study is to use an artificial neural network (ANN) to estimate and compare the secondary metabolite content of laurel (Laurus nobilis L.).

2. Materials and Methods

2.1. Test Material

Bay laurel, also known as Laurus nobilis L., is an aromatic evergreen tree or large shrub with green, glossy leaves. It is native to the Mediterranean region and has been used in cooking and medicine for thousands of years. The leaves are often used as a culinary herb, and they can be used fresh or dried. Bay leaves are used to add flavor to soups, stews, and other dishes, and they are a key ingredient in many traditional dishes, such as bouquet garni and pork adobo. In medicine, bay leaves have been used to treat a variety of ailments, including headaches, infections, and digestive issues. In this research, bay laurel leaves were used as material (Figure 1).

Properties and chemical formulas of secondary metabolites obtained from bay leaf are shown in Table 1.

3. Method

3.1. Essential Oil Extraction and Analysis of Gas Chromatography/Mass Spectrometry (GC-MS)

A Clevenger apparatus was used to obtain the essential oil. Bay laurel leaves (100 g), which had been dried in the shade, and 1000 ml of distilled water were added to the 2000 ml glass flask of the device. The essential oil was obtained by boiling the material at 100°C for three hours. A gas chromatography/mass spectrometry (GC-MS) device was used to determine the secondary metabolites in the essential oil. GC-MS is based on chromatographic separation. The sample was analyzed in an Agilent 5975 gas chromatography (GC) system equipped with an HP-Innowax silica capillary column (30 m × 250 μm ID, 0.25 μm film thickness) and a mass spectrometer detector with triple-axis detection, using helium as the carrier gas. The flow rate was 1.50 ml/min, the injector and column oven temperatures were 60°C and 240°C, and the injector was operated in split mode at a ratio of 20 : 1. The GC-MS consists of an injection port at one end of the column and a detector at the other end. It comprises a mobile phase, which is the helium gas, and a stationary phase, which is held in the column. The syringe picked up the sample and injected it into the injector liner, and the mobile phase (helium gas) propelled the sample from the liner down the column, where it separated into its components. The injection port was maintained at 60°C for 10 min, and the temperature was then increased at 10°C per min to 240°C and held for 20 min. As the sample moved through the column, the molecular characteristics of each substance determined how it interacted with the column surface. The column allowed the various substances to partition themselves, so that the components of the sample separated before eluting from the column. The column was 60 m long, with an internal diameter of 0.25 mm and a film thickness of 0.25 μm. The time for which each component is retained in the GC column before eluting is known as the retention time; it helps to differentiate between components, because two samples with unequal retention times are not the same substance.
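The temperature program described above can be turned into a short calculation of the run time. The following is a minimal sketch, assuming the program is read as a 10 min hold at 60°C, a linear ramp of 10°C per min, and a 20 min hold at 240°C; the function name `program_temperature` and the resulting total run time are illustrative and are not stated in the text.

```python
# Temperature program as read from the text (assumed interpretation):
# initial hold, linear ramp, final hold.
INITIAL_TEMP_C = 60.0     # starting temperature (°C)
INITIAL_HOLD_MIN = 10.0   # hold at the starting temperature (min)
RAMP_C_PER_MIN = 10.0     # heating rate (°C/min)
FINAL_TEMP_C = 240.0      # final temperature (°C)
FINAL_HOLD_MIN = 20.0     # hold at the final temperature (min)

# Time needed to heat from the initial to the final temperature.
ramp_duration_min = (FINAL_TEMP_C - INITIAL_TEMP_C) / RAMP_C_PER_MIN
total_run_min = INITIAL_HOLD_MIN + ramp_duration_min + FINAL_HOLD_MIN


def program_temperature(t_min: float) -> float:
    """Programmed temperature (°C) at t_min minutes into the run."""
    if t_min <= INITIAL_HOLD_MIN:
        return INITIAL_TEMP_C
    if t_min <= INITIAL_HOLD_MIN + ramp_duration_min:
        return INITIAL_TEMP_C + RAMP_C_PER_MIN * (t_min - INITIAL_HOLD_MIN)
    return FINAL_TEMP_C


print(f"ramp: {ramp_duration_min} min, total run: {total_run_min} min")
```

Under this reading, the ramp takes 18 min and a complete run lasts 48 min, which sets the window within which retention times can fall.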

3.2. Artificial Neural Networks (ANNs)

Artificial neural networks (ANNs) are created by bringing together artificial neural cells [19–21]. These cells do not come together randomly, but rather follow a specific architecture that consists of three types of neuron layers: input, hidden, and output layers. The input layer is responsible for taking in information from the outside world and transferring it to the hidden layers [22–24]. Some networks do not process any information in the input layer at all. The hidden layers process the information received from the input layer and pass it on to the output layer; a network may have more than one hidden layer. The output layer is where the artificial neural cells process the information coming from the hidden layers to produce the desired output for a given set of inputs [25].
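The layered architecture described above can be sketched as a single forward pass. This is an illustrative example only, not the trained model of this study: the weights are random, the hidden-layer activation and the input scaling ranges are assumptions, and the names `dense`, `random_layer`, and `forward` are hypothetical. The layer sizes (2 inputs, 10 hidden neurons, 19 outputs) follow the inputs, the reported optimal model, and the list of output compounds.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_out, n_in, rng):
    """Random weights and zero biases (placeholders for trained parameters)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)  # arbitrary seed
N_INPUTS, N_HIDDEN, N_OUTPUTS = 2, 10, 19
w_hidden, b_hidden = random_layer(N_HIDDEN, N_INPUTS, rng)
w_out, b_out = random_layer(N_OUTPUTS, N_HIDDEN, rng)


def forward(sample):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(sample, w_hidden, b_hidden, sigmoid)  # hidden sigmoid is assumed
    return dense(hidden, w_out, b_out, sigmoid)          # sigmoid output layer


# Example input: 4 °C and 2 months, rescaled to [0, 1] (assumed scaling ranges).
sample = [(4 - (-22)) / (22 - (-22)), (2 - 1) / (3 - 1)]
outputs = forward(sample)
print(len(outputs), "normalized outputs")
```

Each output lies between 0 and 1 because of the sigmoid, so measured contents would have to be normalized to that range before training and rescaled afterwards.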

The structure of an ANN, including the artificial neural cells and the connections between the layers, is shown in Figure 2. In this figure, the round artificial neural cells are shown parallel to each other in each layer, with lines connecting them to represent the network connections. To train the network, the Levenberg-Marquardt (LM) algorithm was used with 2 inputs, 19 hidden layers, and 1 target layer. The LM algorithm is a commonly employed optimization technique in training artificial neural networks, especially in backpropagation-based models. It merges the advantages of two optimization techniques, gradient descent and Gauss-Newton optimization, to offer a quick and effective method for determining the optimal parameters for the model. Backpropagation is a supervised learning algorithm used for training artificial neural networks. In backpropagation, the error in the model’s predictions is transmitted backwards through the network to adjust the weights and biases of the neurons, so as to reduce the error and enhance the accuracy of the model. The LM algorithm serves as the optimization technique in this training procedure. It starts from a Gauss-Newton approximation of the Hessian matrix of the cost function and then adjusts the approximation with a damping factor. This enables the algorithm to manage cases where the Gauss-Newton approximation is not appropriate and makes it more robust than the pure Gauss-Newton method.
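The damped Gauss-Newton update at the heart of the LM algorithm can be illustrated on a toy problem. The sketch below is not the network training used in this study: it fits a two-parameter curve y = a·exp(b·x) to synthetic, noise-free data, and the function names, starting values, and damping schedule are assumptions chosen for illustration. In ANN training, the parameters would be the weights and biases, and the Jacobian would be obtained by backpropagation.

```python
import math


def model(params, x):
    a, b = params
    return a * math.exp(b * x)


def residuals(params, xs, ys):
    return [model(params, x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    a, b = params
    # Partial derivatives of each residual with respect to a and b.
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]


def sse(params, xs, ys):
    return sum(r * r for r in residuals(params, xs, ys))


def lm_fit(xs, ys, params, damping=1e-2, iterations=50):
    for _ in range(iterations):
        r = residuals(params, xs, ys)
        J = jacobian(params, xs)
        # Gauss-Newton approximation of the Hessian (J^T J) and gradient (J^T r).
        h00 = sum(row[0] * row[0] for row in J)
        h01 = sum(row[0] * row[1] for row in J)
        h11 = sum(row[1] * row[1] for row in J)
        g0 = sum(row[0] * ri for row, ri in zip(J, r))
        g1 = sum(row[1] * ri for row, ri in zip(J, r))
        # Solve (J^T J + damping * I) step = -J^T r with Cramer's rule (2 x 2).
        a00, a11 = h00 + damping, h11 + damping
        det = a00 * a11 - h01 * h01
        step = [(-g0 * a11 + g1 * h01) / det, (-g1 * a00 + g0 * h01) / det]
        trial = [params[0] + step[0], params[1] + step[1]]
        if sse(trial, xs, ys) < sse(params, xs, ys):
            params, damping = trial, damping / 10  # accept: closer to Gauss-Newton
        else:
            damping *= 10  # reject: closer to a short gradient-descent step
    return params


xs = [0.5 * k for k in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]  # synthetic data with a = 2, b = 0.5
a_fit, b_fit = lm_fit(xs, ys, [1.5, 0.4])
print(a_fit, b_fit)
```

A small damping factor makes the step behave like Gauss-Newton, while a large one shrinks it towards a gradient-descent step, which is how the algorithm combines the two methods.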

The optimal model obtained in this research consisted of 1 hidden layer with 10 neurons and a sigmoid function in the output layer. The performance of the ANN was evaluated using the mean square error of prediction (MSE) and the coefficient of determination (R²). The calculation of these criteria is given in equations (1) and (2), where N is the number of samples, yai and ya are actual secondary metabolite values, and yi and ypi are the measured and predicted secondary metabolite values. The MSE is a popular metric for evaluating the performance of regression models. It is the average of the squared differences between the actual output and the predicted output. A lower value indicates a better fit between the model’s predictions and the actual data, so the optimization problem seeks the model parameters that minimize the MSE (equation (1)). The coefficient of determination, R², is a metric used to assess the strength of the relationship between the dependent and independent variables in a regression model. It is expressed as a value between 0 and 1, where 1 indicates that the model completely explains the variation in the dependent variable and 0 means that the model explains none of it. R² is used to judge the quality of fit of a regression model and to compare different models. Generally, a higher R² value implies a better fit of the model to the data. However, a high R² value alone does not guarantee a good model, as other factors such as overfitting or omitted crucial variables can also affect its value (equation (2)).
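Equations (1) and (2) are not reproduced in the text above. The standard definitions of the two criteria, written with the symbols of the preceding paragraph, would be as follows. The reading of the symbols is an assumption: y_{ai} is taken as the actual (measured) value of sample i, y_{pi} as the corresponding predicted value, and \bar{y}_{a} as the mean of the actual values.

```latex
\begin{align}
\mathrm{MSE} &= \frac{1}{N}\sum_{i=1}^{N}\left(y_{ai}-y_{pi}\right)^{2} \tag{1}\\
R^{2} &= 1-\frac{\sum_{i=1}^{N}\left(y_{ai}-y_{pi}\right)^{2}}{\sum_{i=1}^{N}\left(y_{ai}-\bar{y}_{a}\right)^{2}} \tag{2}
\end{align}
```

In this form, the numerator of the fraction in (2) equals N times the MSE of (1), so minimizing the MSE on a fixed data set also maximizes R².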

Every process element in an artificial neural network (ANN) has five main components: inputs, weights, a sum function, an activation function, and outputs. During the learning process, each neuron adjusts its weights individually, and as a weight decreases or increases, it changes the strength of the neuron’s signal. In this study, 21 parameters were used to estimate the secondary metabolite contents of laurel: temperature (°C), storage time (month), alpha-pinene, beta-pinene, sabinene, 1,8-cineole, gamma-terpinene, cymenol, linalool, borneol, 4-terpineol, caryophyllene, sabinene, alpha-terpineol, germacrene-D, alpha-selinene, methyl eugenol, caryophyllene oxide, spathulenol, eugenol, and beta-selinenol.
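The five components of a process element can be shown for a single neuron. This is a minimal sketch with made-up numbers; the bias term and the choice of a sigmoid activation are assumptions that go beyond the five components listed above, and `neuron_output` is a hypothetical name.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Inputs and weights -> sum function -> activation function -> output."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias  # sum function
    return 1.0 / (1.0 + math.exp(-weighted_sum))                       # sigmoid activation


inputs = [0.5, 1.0]  # illustrative values only
baseline = neuron_output(inputs, weights=[0.8, -0.4])
stronger = neuron_output(inputs, weights=[1.6, -0.4])  # first weight increased
print(baseline, stronger)
```

Increasing the first weight raises the weighted sum and therefore the output, which is the sense in which a weight change alters the strength of the neuron’s signal.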

4. Result and Discussion

In the ANN model, subcriteria and their weights were used as input data, and normalized fuzzy weights of the parameters were used as target data. The Levenberg-Marquardt (LM) algorithm was used to train the network. The ANN was implemented in MATLAB® R2012a (7.14.0.739), 32-bit (win32). A total of 570 samples were used in the ANN model, with 390 samples used for training, 90 samples used for testing, and 90 samples used for validation (Table 2).
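The partition of the samples can be sketched as follows. How the samples were assigned to the three subsets is not stated in the text; the random shuffle and the seed used here are assumptions for illustration only.

```python
import random

N_TOTAL, N_TRAIN, N_TEST, N_VAL = 570, 390, 90, 90  # counts given in the text

indices = list(range(N_TOTAL))
random.Random(42).shuffle(indices)  # arbitrary seed, assumed random assignment

train_idx = indices[:N_TRAIN]
test_idx = indices[N_TRAIN:N_TRAIN + N_TEST]
val_idx = indices[N_TRAIN + N_TEST:]

for name, part in [("training", train_idx), ("testing", test_idx), ("validation", val_idx)]:
    print(f"{name}: {len(part)} samples ({100 * len(part) / N_TOTAL:.1f}%)")
```

With these counts, roughly 68% of the samples are used for training and about 16% each for testing and validation.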

An error histogram is a graphical representation of the distribution of errors in a set of data. It is typically plotted as a bar chart, with the height of each bar representing the number of errors within a given range of error values. The overall shape of the distribution indicates the general trend in the data and reveals any outliers or unusual patterns. An error histogram can also be used to compare the error distribution across different groups or to evaluate the performance of a model or algorithm. In general, a well-behaved error histogram has a symmetrical, bell-shaped distribution, with most errors falling within a small range and relatively few outliers. This indicates that the errors are centered near zero and not overly influenced by a few exceptional cases. In Figure 3, the histogram shows the errors in 20 bins for the validation, training, and testing of the normalized data. For the hidden layer, the best performance was recorded at 1000 epochs. An error value close to zero indicates that the predicted value is close to the real data. The error histogram confirms the effectiveness of the validation, testing, and training phases.
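The binning behind such a histogram can be sketched in a few lines. The data below are synthetic and serve only to show the computation; the error is taken as target minus prediction, which is an assumed convention, and `error_histogram` is a hypothetical helper.

```python
import random


def error_histogram(targets, predictions, n_bins=20):
    """Count errors (target - prediction) in n_bins equal-width bins."""
    errors = [t - p for t, p in zip(targets, predictions)]
    lo, hi = min(errors), max(errors)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width if all errors are equal
    counts = [0] * n_bins
    for e in errors:
        k = min(int((e - lo) / width), n_bins - 1)  # largest error goes in the last bin
        counts[k] += 1
    edges = [lo + k * width for k in range(n_bins + 1)]
    return counts, edges


rng = random.Random(1)  # arbitrary seed
targets = [rng.uniform(0.0, 50.0) for _ in range(570)]      # synthetic "measured" values
predictions = [t + rng.gauss(0.0, 0.5) for t in targets]    # synthetic "predicted" values

counts, edges = error_histogram(targets, predictions)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:+.2f}, {right:+.2f}): {'#' * (count // 5)} {count}")
```

For a well-behaved model, the tallest bars sit in the bins that contain or adjoin zero error.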

The statement “best validation performance is 0.29378 at epoch 2” refers to the performance of the model on the validation dataset: the model achieved its best validation performance, measured here by the mean square error (MSE), at the second epoch of training. In Figure 4, it can be seen that the MSE values of the testing and training performance exceed that of the validation performance. In order to improve the results, the Levenberg-Marquardt (LM) algorithm was implemented in the hidden layer, where backpropagation was applied. The results, also shown in Figure 4, supported our choice of 10 hidden layers and 10 neurons, which achieved a lower MSE at the end of the proposed ANN training. The best fit in Figure 4 was obtained with one hidden layer and 10 neurons, which gave an MSE of 2.86606e−1 for training, 2.93776e−1 for testing, and 2.25790e−1 for validation at 1000 epochs.

The correlation between the real values plotted on the x-axis and the predicted values plotted on the y-axis for all data, testing data, training data, and validation data can be seen in Figure 5. The coefficient of determination, denoted by R², is a measure of how well a regression line or model fits the data. It is a number between 0 and 1, where a value of 1 indicates a perfect fit and a value of 0 indicates that the model does not explain any of the variance in the data. The coefficient of determination is calculated by taking the square of the correlation coefficient, which measures the strength and direction of the relationship between two variables. It is commonly used in statistics to evaluate the performance of a regression model and is a useful indicator for assessing the forecast performance of the proposed artificial neural network. Therefore, the accuracy of the predicted secondary metabolite contents was assessed against the contents of laurel measured by GC/MS.
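The relation between R² and the squared correlation coefficient can be checked numerically. The sketch below uses made-up values, not data from this study, and the helper names are hypothetical. It also illustrates a caveat: the squared correlation coefficient equals the 1 − SS_res/SS_tot form of R² only for a least-squares linear fit, so the two can differ when predictions are systematically offset.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination in the form 1 - SS_res / SS_tot."""
    mean_actual = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


actual = [1.0, 2.0, 3.0, 4.0, 5.0]        # illustrative values only
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]
biased = [p + 1.0 for p in predicted]      # same pattern, constant offset

print(r_squared(actual, predicted), pearson_r(actual, predicted) ** 2)
print(r_squared(actual, biased), pearson_r(actual, biased) ** 2)
```

For the unbiased predictions the two measures nearly coincide, whereas the constant offset leaves the squared correlation unchanged but lowers the 1 − SS_res/SS_tot value considerably.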

In this research, temperature (°C) and storage time (month) were used as input parameters in the neural network, and alpha-pinene, beta-pinene, sabinene, 1,8-cineole, gamma-terpinene, cymenol, linalool, borneol, 4-terpineol, caryophyllene, sabinene, alpha-terpineol, germacrene-D, alpha-selinene, methyl eugenol, caryophyllene oxide, spathulenol, eugenol, and beta-selinenol were used as output parameters. The artificial neural network analysis yielded R² values of 0.97156 for testing, 0.98978 for training, 0.98998 for validation, and 0.98831 for all data. On the basis of these values, the artificial neural network model predicts these output parameters with 98% accuracy (Table 3). When Table 3 was examined, the agreement between the actual values and the ANN results was found to be 98%, which shows that the ANN model produces results close to the actual values.

5. Conclusion

Secondary metabolites are a type of chemical produced by plants, animals, and other organisms. They are typically not involved in the primary metabolic processes of the organism, such as growth and development, but instead serve other functions. Some examples of secondary metabolites include pigments, flavonoids, and alkaloids. Secondary metabolites are important for a number of reasons. For one, they can give plants and other organisms distinctive colors, flavors, and scents, which can help them to attract pollinators or deter predators. In addition, many secondary metabolites have beneficial effects on human health and are used in a variety of medicines and other products. For example, many drugs used to treat cancer and other diseases are derived from secondary metabolites. Overall, secondary metabolites play a key role in the biology and ecology of plants and animals and have many practical applications for humans.

The aim of this study was to use an ANN model to predict the secondary metabolite contents of bay leaves, analyzed by GC/MS, at different temperatures and storage times. The ANN results were found to be quite close to the results obtained by GC/MS. In other words, the ANN model can accurately predict the secondary metabolite contents of bay leaves stored at different temperatures and for different storage times. This suggests that the ANN model was effective in modeling the relationship between the variables in the study and could potentially be used in similar contexts to predict the contents of other substances.

Data Availability

The datasets generated and analyzed during the current study are not publicly available but are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Emel Karaca Öner and Meryem Yeşil carried out the experiment. Mehmet Serhat Odabas designed the model and the computational framework and analyzed the data. All authors reviewed the manuscript.