Abstract

Artificial neural network (ANN) models can reduce the need for expensive experimental investigation in various manufacturing processes, including casting. An understanding of the interrelationships between input variables is essential for interpreting sensitivity data and optimizing design parameters. Silver nanoparticles (Ag-NPs) have attracted considerable attention for chemical, physical, and medical applications because of their exceptional properties. Here, nanocrystalline silver was synthesized in the interlamellar space of montmorillonite (MMT) by the chemical reduction technique. This method has the advantage of size control, which is essential in the synthesis of nanometals; nanosized silver particles free of aggregation are favorable for many applications. In this investigation, an artificial neural network training algorithm was applied to study the effects of different parameters, namely the AgNO3 concentration, reaction temperature, UV-visible wavelength, and MMT d-spacing, on the predicted size of the silver nanoparticles. Analysis of variance showed that the AgNO3 concentration and the reaction temperature were the most significant factors affecting the size of the silver nanoparticles. Using the best-performing artificial neural network, the predicted optimum conditions were an AgNO3 concentration of 1.0 M, an MMT d-spacing of 1.27 nm, a reaction temperature of 27°C, and a wavelength of 397.50 nm.

1. Introduction

Nanoparticles are important materials for fundamental studies and diversified technical applications because of their size-dependent properties and high activity arising from their large surface areas. The synthesis of well-dispersed pure metal nanoparticles is difficult because they have a high tendency to form agglomerates [1, 2]. Therefore, to overcome agglomeration, one of the most effective solutions is the preparation of nanoparticles on clay compounds, in which the nanoparticles are supported within the interlamellar spaces of the clay and/or on its external surfaces [3, 4]. Similarly, the synthesis of metal nanoparticles on solid supports such as smectite clays is a suitable way to prepare practically applicable supported particles as well as to control the particle size [5].

Smectite clays have an excellent swelling and adsorption ability, which is especially interesting for the impregnation of antibacterial active nanosize metals in the interlamellar space of clay [6, 7]. Montmorillonite (MMT) as lamellar clay has intercalation, swelling, and ion exchange properties. Its interlayer space has been used for the synthesis of material and biomaterial nanoparticles [8].

Due to its properties and areas of use, Ag is one of the most studied metals. However, certain factors that are significant in the controlled synthesis of Ag-NPs, such as stability, morphology, particle size distribution, and surface charge/modification, have attracted considerable attention, since this task is far from being fully understood [5–8]. For the synthesis of Ag-NPs in a solid substrate, AgNO3 is often used as the primary source of Ag+. There are numerous ways of reducing Ag+: γ-ray [4] or UV irradiation [5], glucose [2], reducing chemicals such as hydrazine [9] and sodium borohydride [1, 3, 6–8], plant extracts [10, 11], and so forth.

The artificial neural network (ANN) is a nonlinear statistical analysis technique and is especially suitable for simulating systems that are hard to describe with physical and chemical models. It provides a way of linking input data to output data using a set of nonlinear functions [12]. The ANN is a highly simplified model of the structure of a biological neural network [13]. The fundamental processing element of an ANN is the artificial neuron, or simply neuron. Like a biological neuron, it receives inputs from other sources, combines them, generally performs a nonlinear operation on the result, and then outputs the final result [14]. The basic advantage of an ANN is that it does not need any mathematical model: an ANN learns from examples and recognizes patterns in a series of input and output data without any prior assumptions about their nature and interrelations [13]. The ANN overcomes the limitations of classical approaches by extracting the desired information directly from the input data. Applying an ANN to a system requires sufficient input and output data rather than a mathematical equation [15].

Selecting the optimum architecture of the network is one of the challenging steps in ANN modelling. The term “architecture” refers to the number of layers in the network and the number of neurons in each layer. A training algorithm is also required to solve a problem with a neural network model, and many different algorithms can be applied in the training procedure. During the training process, the weights and biases of the network are modified to minimize the error and obtain a high-performance solution. During and at the end of training, the mean absolute percentage error between the network outputs and the target outputs is computed [16, 17]. The ANN is a very useful numerical technique in modern engineering analysis and has been employed to study problems in heat transfer, electrical systems, mechanical properties, solid mechanics, rigid-body dynamics, fluid mechanics, and many other fields. In recent years, ANNs have been introduced in nanotechnology applications as techniques to model data showing nonlinear relationships [12, 17–20] and/or to estimate particle size in a variety of nanoparticle samples [21].
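The training-error measure mentioned above, the mean absolute percentage error, can be sketched as follows (a minimal NumPy sketch; the sample values are hypothetical, not data from this study):

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error between target and network outputs."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

# Hypothetical target vs. predicted particle sizes (nm)
print(mape([4.0, 8.0, 10.0], [4.4, 7.6, 10.0]))  # -> 5.0
```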

Employing neural network models would lead to saving time and cost by predicting the results of the reactions so that the most promising conditions can then be verified [22].

The aim of the present research is to investigate the effects of different parameters, including the AgNO3 concentration, reaction temperature, UV-visible wavelength, and MMT interlayer d-spacing, on the predicted size of Ag-NPs. The ANN model was used to predict the optimum size of Ag-NPs.

2. Experimental

To prepare stable Ag-NPs via the chemical reduction method, it is important to choose a suitable stabilizer and reducing agent. In this research, an MMT suspension was used as the support, and the AgNO3/MMT suspension was reduced with NaBH4 as a strong reducing agent. The surfaces of the MMT layers assist the nucleation of Ag-NPs during the reduction process. The schematic illustration of the synthesis of Ag/MMT nanocomposites from the AgNO3/MMT suspension using sodium borohydride is shown in Figure 1. As shown in Figure 1(a), the AgNO3/MMT suspension was colourless; after addition of the reducing agent it turned brown, indicating the formation of Ag-NPs in the MMT suspension (Figure 1(b)).

2.1. Data Sets

Table 1 shows the experimental data used for the ANN design. The experimental data were randomly divided into three sets of 26, 10, and 6 samples, used for training, validation, and testing, respectively. The training data were used to compute the network parameters, while the validation data were used to ensure the robustness of these parameters. If a network “learns too well” from the training data, the learned rules may not fit the remaining cases well. To avoid this overfitting phenomenon, the validation set was used to control the error: when the validation error began to increase, training was stopped [23]. The testing data set was used to assess the predictive ability of the generated model.
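The random 26/10/6 split described above can be sketched as follows (a minimal NumPy sketch; the seed and the use of an index permutation are illustrative assumptions, not the authors' documented procedure):

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed for reproducibility (arbitrary)

n_samples = 42                      # 26 + 10 + 6 experiments
idx = rng.permutation(n_samples)    # random shuffle of the sample indices

train_idx = idx[:26]                # 26 samples for training
val_idx = idx[26:36]                # 10 samples for validation / early stopping
test_idx = idx[36:]                 # 6 samples held out for testing

print(len(train_idx), len(val_idx), len(test_idx))  # -> 26 10 6
```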

2.2. ANN Description

An artificial neural network learns general rules by performing calculations on numerical data [24]. Neurons are the smallest units of an ANN and are arranged in three types of layers: input, hidden, and output. Several hidden layers can be stacked with the same layer structure; in other words, the output of one hidden layer serves as the input to the next hidden layer [25]. Multilayer perceptron (MLP) networks are the most widely used neural networks and consist of one input layer, one output layer, and one or more hidden layers. In the ANN methodology, the sample data are often subdivided into training, validation, and test sets; the distinctions among these subsets are crucial. Ripley [26] defines them as follows: “Training set: A set of examples used for learning, that is, to fit the parameters (weights) of the classifier. Validation set: A set of examples used to tune the parameters of a classifier, for example, to choose the number of hidden units in a neural network. Test set: A set of examples used only to assess the performance (generalization) of a fully specified classifier.” In this research, an MLP-based feed-forward ANN using the back-propagation learning algorithm was applied to model the prediction of the Ag-NP size. The inputs to the network are the AgNO3 concentration, reaction temperature, UV-visible wavelength, and MMT d-spacing; the output is the size of the Ag-NPs. Figure 2 shows a diagram of a typical MLP neural network with the one-hidden-layer structure of the proposed ANN. The input to node j of the hidden layer is given by

net_j = Σ_{i=1}^{n} w_{ij} x_i + b_j.    (1)

Each neuron has a transfer function expressing an internal activation level. The output from a neuron is determined by transforming its input using a suitable transfer function [27]. Generally, the transfer functions used for function approximation (regression) are the sigmoidal function, the hyperbolic tangent, and the linear function [28]. The most popular transfer function for nonlinear relationships is the sigmoidal function [19, 29]. The output from neuron j of the hidden layer is given by

y_j = f(net_j) = f(Σ_{i=1}^{n} w_{ij} x_i + b_j).    (2)

In (1) and (2), n is the number of neurons in the input layer, m is the number of neurons in the hidden layer (j = 1, …, m), b_j is the bias term, w_{ij} is the weighting factor, and f is the activation function of the hidden layer, such as the tan-sigmoid transfer function [18, 30].
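The hidden-layer computation in (1) and (2) can be sketched in NumPy, assuming a MATLAB-style tan-sigmoid transfer function (numerically identical to tanh); the weights and the unnormalized input values below are purely illustrative:

```python
import numpy as np

def tansig(x):
    # MATLAB-style tan-sigmoid, 2/(1 + exp(-2x)) - 1, identical to tanh
    return np.tanh(x)

def hidden_layer(x, W, b):
    # Weighted sum of the inputs plus bias, passed through the transfer function
    return tansig(W @ x + b)

# Hypothetical 4-input, 4-hidden-neuron network (weights are illustrative)
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 4))          # w_ij: one row per hidden neuron
b = rng.standard_normal(4)               # b_j: one bias per hidden neuron
x = np.array([1.0, 27.0, 397.5, 1.27])   # [AgNO3], T, wavelength, d-spacing
h = hidden_layer(x, W, b)
print(h.shape)  # (4,)
```

In practice the inputs would be normalized before training; the tan-sigmoid function saturates quickly for raw values of this magnitude.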

The output of the k-th neuron in the output layer is given by

o_k = Σ_{j=1}^{m} w_{jk} y_j + b_k,    (3)

where w_{jk} is the weighting factor, b_k is the bias term, and k = 1, …, l, with l the number of neurons in the output layer. The values of the interconnection weights are determined by the training or learning process using a set of data; the aim is to find the values of the weights that minimize the error [28]. A popular measure for evaluating the prediction ability of ANN models is the root mean square error (RMSE):

RMSE = [ (1/N) Σ_{i=1}^{N} (y_{p,i} − y_{a,i})² ]^{1/2},    (4)

where N is the number of points, y_{p,i} is the predicted value obtained from the neural network model, and y_{a,i} is the actual value. The coefficient of determination R² reflects the degree of fit of the mathematical model [31]; the closer its value is to 1, the better the model fits the actual data [32]:

R² = 1 − Σ_{i=1}^{N} (y_{p,i} − y_{a,i})² / Σ_{i=1}^{N} (y_{a,i} − ȳ_a)²,    (5)

where ȳ_a is the average of the actual values. The absolute average deviation (AAD) is another important index for evaluating the error between the actual and predicted outputs [14]:

AAD = (1/N) Σ_{i=1}^{N} |(y_{p,i} − y_{a,i}) / y_{a,i}| × 100,    (6)

where y_{p,i} and y_{a,i} are the predicted and actual responses, respectively, and N is the number of points. The network with the minimum RMSE, the minimum AAD, and the maximum R² is considered the best neural network model [33].
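The three evaluation indices defined above (RMSE, the coefficient of determination, and AAD) can be sketched in NumPy; the sample data are hypothetical:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((p - a) ** 2)))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(1.0 - np.sum((p - a) ** 2) / np.sum((a - a.mean()) ** 2))

def aad(actual, predicted):
    """Absolute average deviation, in percent of the actual values."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((p - a) / a)) * 100.0)

# Hypothetical actual vs. predicted particle sizes (nm)
y_actual = np.array([4.3, 8.1, 12.0, 6.5])
y_pred = np.array([4.5, 8.0, 11.8, 6.6])
print(rmse(y_actual, y_pred), r_squared(y_actual, y_pred), aad(y_actual, y_pred))
```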

3. Results and Discussion

3.1. ANN Modeling Results

In this research, many different structures with two and three hidden layers and different numbers of neurons in each layer were tested to obtain the best ANN configuration. Choosing the number of neurons in the hidden layers is very important, as it affects the training time and the generalization ability of the network. Too many hidden neurons may force the network to memorize (rather than generalize) the patterns it has seen during training, whereas too few neurons waste a great deal of training time in the search for an optimal representation [34]. There is no general rule for selecting the number of neurons in a hidden layer; it depends on the complexity of the system being modeled [33]. The most popular approach for finding the optimal number of hidden neurons is trial and error [35]. The tan-sigmoid function was used as the transfer function of the first (hidden) layer, and a linear function was applied as the transfer function of the second (output) layer.

In this paper, the trial-and-error approach was used to determine the optimum number of neurons in the hidden layers. A feed-forward neural network generally has one or more hidden layers, which enable the network to model nonlinear and complex functions [28]. Nevertheless, the number of hidden layers is difficult to decide [19]. It has been reported in the literature that one hidden layer is normally sufficient to provide an accurate prediction and can be the first choice for any practical feed-forward network design [36].

Therefore, a single-hidden-layer network was applied in this paper [12, 20]. The RMSE was used as the error function, and R² and AAD were used as measures of the predictive ability of the network. Each topology was repeated five times to avoid chance correlations due to the random initialization of the weights [37]. Many types of learning algorithms that can be applied to train the network are described in the literature; however, it is difficult to know in advance which learning algorithm will be most efficient for a given problem [38].
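The trial-and-error topology search described above (1 to 20 hidden neurons, five repetitions per topology) can be sketched as follows; `train_and_eval` is a hypothetical stand-in for an actual training run, mocked here with a synthetic error surface so that the selection logic is runnable:

```python
import numpy as np

def train_and_eval(n_hidden, seed):
    """Hypothetical stand-in: train an ANN with `n_hidden` neurons and
    return its validation RMSE. Mocked with a synthetic error surface
    whose minimum lies at 4 neurons."""
    rng = np.random.default_rng(seed)
    return abs(n_hidden - 4) * 0.1 + 0.005 + rng.uniform(0.0, 0.01)

best = None
for n_hidden in range(1, 21):  # topologies with 1..20 hidden neurons
    # Repeat each topology five times to average out the random
    # initialization of the weights.
    rmses = [train_and_eval(n_hidden, seed) for seed in range(5)]
    mean_rmse = float(np.mean(rmses))
    if best is None or mean_rmse < best[1]:
        best = (n_hidden, mean_rmse)

print(best[0])  # -> 4 (by construction of the mock error surface)
```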

The Levenberg-Marquardt (LM) algorithm is often the fastest back-propagation algorithm and is highly recommended as a first-choice supervised algorithm, even though it requires more memory than other algorithms. LM back-propagation is a network training function that updates the weight and bias values according to LM optimization. The LM algorithm is an approximation to Newton's method [39] and is well suited to the training of neural networks [40]. The algorithm uses second-order derivatives of the mean squared error between the desired and actual outputs, so that better convergence behaviour can be obtained [41]. Therefore, various topologies (from 1 to 20 hidden neurons) were examined using the LM algorithm. The results showed that a network with 4 hidden neurons gave the best performance.
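The core LM update, delta = -(JᵀJ + μI)⁻¹ Jᵀr, can be illustrated on a toy least-squares fit; the linear model and data below are purely illustrative, not the paper's network:

```python
import numpy as np

def lm_step(w, x, y, mu):
    """One Levenberg-Marquardt update for fitting f(x) = w0 + w1*x:
    delta = -(J^T J + mu*I)^(-1) J^T r, with residuals r = f(x) - y."""
    J = np.column_stack([np.ones_like(x), x])  # Jacobian of f w.r.t. (w0, w1)
    r = (w[0] + w[1] * x) - y                  # residual vector
    A = J.T @ J + mu * np.eye(2)               # damped Gauss-Newton matrix
    delta = np.linalg.solve(A, -J.T @ r)
    return w + delta

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 0.5 * x                  # data generated with w = (2, 0.5)
w = np.zeros(2)
for _ in range(20):
    w = lm_step(w, x, y, mu=0.01)  # small mu: close to Gauss-Newton behaviour
print(np.round(w, 3))
```

Large μ makes the step resemble gradient descent, small μ makes it resemble Gauss-Newton; practical implementations adapt μ between iterations.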

Figure 3 presents scatter plots of the ANN-predicted versus actual values using the LM algorithm for the training, validation, testing, and complete data sets; the predictions for all data sets fit the actual values well. The results of this study show that a network consisting of three layers (input, hidden, and output) with 4 nodes in the hidden layer gives the best performance. The RMSE and R² between the actual and predicted values were 0.0055 and 0.9999 for the training set, 0.01529 and 0.9994 for the validation set, and 0.7917 and 0.955 for the testing set. The RMSE and R² for all data sets were 0.0071 and 0.9753, respectively. These results show that the predictive accuracy of the model is high. Table 2 shows the values of the connection weights (the parameters of the model) for the completed ANN model.

3.2. Sensitivity Analysis

In this research, a data analysis was performed to determine the effectiveness of each variable using the ANN model suggested in this work [20]. In this analysis, the performance of different possible interactions of the variables was evaluated. The performances of four groups of variables (one, two, three, and four variables) were therefore studied using the optimal ANN model, trained with the LM algorithm and 4 neurons in the hidden layer.

The groups of input vectors were defined as follows: x1, AgNO3 concentration; x2, reaction temperature; x3, UV-visible wavelength; and x4, MMT d-spacing. The results are summarized in Table 3 and show x1 (the AgNO3 concentration) to be the most effective parameter in the one-variable group, owing to its lower RMSE of 0.301.

As shown in Table 3, the RMSE decreased significantly when x1 was used in combination with the other variables. The minimum RMSE in the two-variable group was 0.011, obtained for the best combination involving x1. The minimum RMSE in the three-variable group was 0.009, again for a combination including x1. The RMSE decreased further, from 0.009 to 0.007, when all four variables were used together.
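The group-wise sensitivity screening described above can be sketched as a loop over input subsets; `subset_rmse` is a hypothetical stand-in for retraining and evaluating the optimal ANN on each subset, mocked here so the ranking logic is runnable:

```python
from itertools import combinations

inputs = ("x1_AgNO3", "x2_temperature", "x3_wavelength", "x4_d_spacing")

def subset_rmse(subset):
    """Hypothetical stand-in: retrain the optimal ANN using only the given
    inputs and return its RMSE. Mocked so that larger subsets, and subsets
    containing x1, score lower, mimicking the trend in Table 3."""
    score = 1.0 / len(subset)
    if "x1_AgNO3" in subset:
        score *= 0.1
    return score

results = []
for k in range(1, len(inputs) + 1):          # groups of one..four variables
    for subset in combinations(inputs, k):
        results.append((subset_rmse(subset), subset))

best_rmse, best_subset = min(results)
print(best_subset)  # -> the full four-variable set, by construction
```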

3.3. Comparison of Experimental Data and ANN Output

The optimal conditions for the prediction of the size of Ag-NPs are presented in Table 4 along with the predicted and actual sizes. For this purpose, the LM-based ANN was used to predict the size of Ag-NPs under optimal conditions, and the experiment was then carried out under the recommended conditions [42]. The resulting response was compared with the predicted value. The optimum parameters were 1 M, 27°C, 397.5 nm, and 1.27 nm for the AgNO3 concentration, reaction temperature, UV-visible wavelength, and MMT d-spacing, respectively. As shown in Table 4, the concentration was the parameter with the strongest effect on the size of Ag-NPs. The experiment gave a reasonable Ag-NP size of 4.3 nm. This result confirmed the validity of the model: the experimental value was quite close to the ANN-predicted value of 4.5 nm (a relative deviation of about 4%), implying that the empirical model derived from the ANN adequately describes the relationship between the independent variables and the response.

4. Conclusion

In this research, artificial neural network models for the size prediction of Ag-NPs were presented. The analysis confirms that the ANN is a powerful tool for analysis and modeling. Based on the results obtained, it can be concluded that the LM neural network model with 4 neurons in one hidden layer is the fastest training algorithm tested and gives very good performance for ANN modeling of nanocomposite behaviour. The data analysis showed that the AgNO3 concentration (1.0 M) and the reaction temperature (27°C) are the two most sensitive parameters. Therefore, employing neural network models can save time and cost by predicting the results of reactions before they are carried out.