This work aims to develop a robust machine learning model for predicting the relative viscosity of nanoparticles (NPs), including Al2O3, TiO2, SiO2, CuO, SiC, and Ag, based on the most important input parameters: the size, concentration, thickness of the interfacial layer, and intensive properties of the NPs. Sixty-nine data samples were collected to develop the model. The Gaussian process regression (GPR) approach was applied with four kernel functions (Matern, squared exponential, exponential, and rational quadratic). The Matern kernel outperformed the other kernels, with R2 = 0.987, MARE (%) = 6.048, RMSE = 0.0577, and STD = 0.0574. This precise yet simple model can be a good alternative to the complex thermodynamic and mathematical-analytical models of the past.

1. Introduction

Nanoscience researchers have recently taken an interest in the viscosity and thermal conduction of nanofluids [1, 2]. The lubrication and thermal performance of a nanofluid depends on its viscosity [3, 4]. To use a nanofluid for thermal management, a trade-off must be struck between a low viscosity and a high thermal conduction [5–7]. Temperature, the fluid form, and the shape, size, and loading of the nanoparticles determine this trade-off [8, 9].

Research has shown that viscosity strongly influences the use of nanofluids in solar energy systems through its direct effect on pump work and pressure drop [10–12]. Detailed knowledge of their viscosity allows these fluids to be used more efficiently in solar energy systems [3, 13]. Accurate experimental studies have been conducted on the viscosity of hybrid nanofluids [14–16]. However, experimental evaluation is expensive and time-consuming. Researchers have therefore introduced approaches to estimate nanofluid viscosity [17, 18]. Additionally, in recent years, new modeling methods based on artificial intelligence, such as ANFIS, SVM, and ANN, have been used in a wide variety of sciences [19–22]. These approaches are mostly based on soft computing and theoretical calculations [23]. Einstein theoretically developed a framework to estimate nanofluid viscosity at small volume fractions [24].

Traditional correlation-based methodologies have also been developed for nanofluid viscosity prediction [25]. However, such methodologies have been found to underestimate nanofluid viscosity because they lack important parameters that play key roles in nanofluid rheology [18, 26]. Data mining and machine learning have been widely employed to estimate the relative viscosity of hybrid nanofluids under a variety of empirical conditions [23, 27, 28]. Artificial neural networks (ANNs), support vector machines (SVMs), and ANFIS-GA are among the common machine learning techniques [29–32]. In recent years, researchers have introduced generic machine learning algorithms that estimate the viscosity of nanofluids by mining nanofluid synthesis data. Alrashed et al. introduced the ANN and ANFIS algorithms for the viscosity estimation of C-based nanofluid [33]. A total of 129 experimental data samples were used to implement optimized viscosity estimation through the ANN.

Likewise, Bahrami et al. proposed twenty-four ANN structures to estimate the viscosity of non-Newtonian hybrid Fe-Cu nanofluids in a mixed water-ethylene glycol base fluid [34]. Bayesian regularization (BR) outperformed the other methods in the prediction of viscosity. They argued that increasing the number of neurons in the hidden layer led to a slight performance improvement. Ahmadi et al. compared a number of machine learning algorithms for predicting the dynamic viscosity of the CuO-water nanofluid [35]. They proposed ANN-MLP, MARS, MPR, M5-tree, and GMDH algorithms based on the nanofluid concentration, temperature, and nanostructure size. The ANN-MLP was found to have the highest predictive performance. Amin et al. developed a GMDH-ANN method to estimate the viscosity of Fe2O3 nanoparticles and obtained an RMSE of 0.0018 [35, 36].

This study aims to describe an artificial intelligence-based model for accurately predicting the relative viscosity of nanoparticles. For this purpose, the Gaussian process regression (GPR) model was used with four kernel functions: Matern, squared exponential, exponential, and rational quadratic. These kernel functions were selected because of their high ability to predict and model the various data observed in the literature [37–41]. This model was proposed because it is newer and less complicated than analytical mathematical models. Furthermore, an accurate model helps to overcome limitations such as the cost and time associated with accurate measurement and monitoring of laboratory data. This study employed these strategies and used various statistical methods to analyze and predict the target data.

2. GPR

GPR is an efficient probabilistic model based on kernels [42]. A Gaussian process is a collection of random variables with a joint multivariate Gaussian distribution [43, 44]. Let X and Y denote the input and output domains, from which n pairs $(x_i, y_i)$ are drawn. The pairs are assumed to be independent and identically distributed. A Gaussian process f on X is defined by a mean function $\mu: X \to \mathbb{R}$ and a covariance function $k: X \times X \to \mathbb{R}$ [45, 46]. For a supplied predictor x, GPR treats f(x) as a random variable, that is, the value at x of a random function f [47, 48]. The present work assumed an independent observation error with zero mean and variance $\sigma_n^2$ (i.e., $\varepsilon \sim N(0, \sigma_n^2)$) and a zero-mean Gaussian process f(x) with covariance function k [49–51]:

$$Y = (y_1, \ldots, y_n)^{T} \sim N\left(0,\; K(X, X) + \sigma_n^2 I\right), \tag{1}$$

where I is the identity matrix, and $[K(X, X)]_{ij} = k(x_i, x_j)$. Because the training and test labels are jointly normal, the conditional distribution of the test labels given the training pairs and the test inputs is $Y_* \mid Y, X, X_* \sim N(\mu_*, \Sigma_*)$. As a result [52, 53],

$$\mu_* = K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} Y, \tag{2}$$

$$\Sigma_* = K(X_*, X_*) + \sigma_n^2 I - K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} K(X, X_*), \tag{3}$$

where $K(X_*, X)$ is the matrix of covariances evaluated for each testing-training pair. The other matrices, $K(X, X)$, $K(X, X_*)$, and $K(X_*, X_*)$, are defined similarly [54–56]. Also, X denotes the training input vectors, Y stands for the training labels, and $X_*$ represents the testing inputs [57]. The covariance function must produce a positive semidefinite covariance matrix $K_{ij} = k(x_i, x_j)$. Once the kernel k and the noise level $\sigma_n^2$ are specified, equations (2) and (3) are sufficient for inference.

Efficient GPR training requires the selection of a suitable covariance function and its parameters; the actual GPR model function is determined by the covariance function [58, 59], which encodes the geometric structure of the training samples. Thus, the hyperparameters of the mean and covariance functions should be estimated from the data so that prediction can be performed accurately [60].
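For reference, the four kernel families named in this work are commonly written as follows. This is a sketch of the standard textbook forms, not necessarily the exact parameterization used by the authors: the symbols $r$, $\sigma_f$, $\ell$, and $\alpha$ are notation assumed here, and the source does not state which Matern smoothness was used, so both common choices are listed.

```latex
% r = \lVert x - x' \rVert is the distance between two inputs,
% \sigma_f is the signal standard deviation, \ell is the length scale,
% and \alpha > 0 is the scale-mixture parameter (all assumed notation).
\begin{align*}
k_{\mathrm{SE}}(r)  &= \sigma_f^2 \exp\!\left(-\frac{r^2}{2\ell^2}\right)
  && \text{(squared exponential)} \\
k_{\mathrm{Exp}}(r) &= \sigma_f^2 \exp\!\left(-\frac{r}{\ell}\right)
  && \text{(exponential)} \\
k_{\mathrm{M32}}(r) &= \sigma_f^2 \left(1 + \frac{\sqrt{3}\,r}{\ell}\right)
  \exp\!\left(-\frac{\sqrt{3}\,r}{\ell}\right)
  && \text{(Matern, } \nu = 3/2\text{)} \\
k_{\mathrm{M52}}(r) &= \sigma_f^2 \left(1 + \frac{\sqrt{5}\,r}{\ell}
  + \frac{5 r^2}{3\ell^2}\right) \exp\!\left(-\frac{\sqrt{5}\,r}{\ell}\right)
  && \text{(Matern, } \nu = 5/2\text{)} \\
k_{\mathrm{RQ}}(r)  &= \sigma_f^2 \left(1 + \frac{r^2}{2\alpha\ell^2}\right)^{-\alpha}
  && \text{(rational quadratic)}
\end{align*}
```

All four depend on the inputs only through the distance $r$, so they differ mainly in how quickly the correlation decays and how smooth the fitted function is assumed to be.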
As this model has been used in many recent studies in different fields of science, more details are available elsewhere [61–65], so there is no need to repeat them here.
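As a concrete illustration of the GPR posterior mean and covariance (equations (2) and (3)), the following minimal Python sketch predicts at test points with a Matern 5/2 kernel. It is not the authors' MATLAB implementation: the function names, the fixed hyperparameters (`length_scale`, `sigma_f`, `noise_var`), the choice of Matern smoothness, and the toy data are all assumptions made for illustration, and hyperparameter fitting is omitted.

```python
import math

def matern52(x1, x2, length_scale=1.0, sigma_f=1.0):
    """Matern 5/2 covariance between two input vectors (assumed kernel choice)."""
    s = math.sqrt(5.0) * math.dist(x1, x2) / length_scale
    return sigma_f ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def chol_solve(low, b):
    """Solve (low @ low^T) x = b by forward and back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gpr_predict(x_train, y_train, x_test, kernel=matern52, noise_var=1e-6):
    """Posterior mean and noisy-label variance at each test point (zero prior mean)."""
    n = len(x_train)
    # K(X, X) + sigma_n^2 I
    k_noisy = [[kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
                for j in range(n)] for i in range(n)]
    low = cholesky(k_noisy)
    alpha = chol_solve(low, y_train)  # [K + sigma_n^2 I]^-1 Y
    means, variances = [], []
    for xs in x_test:
        k_star = [kernel(xs, xt) for xt in x_train]  # one row of K(X*, X)
        means.append(sum(ks * a for ks, a in zip(k_star, alpha)))
        v = chol_solve(low, k_star)
        variances.append(kernel(xs, xs) + noise_var
                         - sum(ks * vi for ks, vi in zip(k_star, v)))
    return means, variances

# Toy, made-up data (not the paper's dataset): one normalized input, three samples.
x_train = [[-1.0], [0.0], [1.0]]
y_train = [0.0, 0.5, 1.0]
means, variances = gpr_predict(x_train, y_train, [[0.0], [0.5]])
print(means, variances)
```

With a small noise variance the prediction at a training input nearly reproduces its label, while the predictive variance grows between training points. Only the per-point variances (the diagonal of the covariance in equation (3)) are computed here.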

3. Preprocessing Procedure

As mentioned, GPR was used to estimate the relative viscosity of nanoparticles from the size, concentration, thickness of the interfacial layer, and intensive properties of the NPs. A total of sixty-nine data samples were used [66]. MATLAB 2014 was used to model these data. The data were split into a training subset (75%) and a testing subset (25%). Data normalization was carried out as [67–69]

$$D_n = 2\,\frac{D - D_{\min}}{D_{\max} - D_{\min}} - 1,$$

where D denotes the parameter, and the subscripts n, max, and min represent the normalized, maximum, and minimum values, respectively. The normalized data varied from −1 to 1. The relative viscosity of nanoparticles was the output, obtained from the size, concentration, thickness of the interfacial layer, and intensive properties of the NPs.
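The preprocessing steps described here can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names, the random seed, and the use of a shuffled split are assumptions, since the source states the 75%/25% proportions and the −1 to 1 range but not how samples were assigned to each subset.

```python
import random

def normalize_column(values):
    """Min-max scale one parameter to the range [-1, 1]."""
    lo, hi = min(values), max(values)
    # Assumes hi > lo; a constant column would need separate handling.
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def train_test_split(n_samples, train_fraction=0.75, seed=0):
    """Return shuffled index lists for the training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical seed, for repeatability
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]

# Made-up values for one input parameter (not taken from the paper's dataset).
scaled = normalize_column([10.0, 20.0, 30.0])
train_idx, test_idx = train_test_split(69)
print(scaled, len(train_idx), len(test_idx))
```

One design question the source leaves open is whether the minimum and maximum were taken over the whole dataset or over the training subset only; the latter avoids leaking information from the testing data into the model.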

4. Models’ Evaluation

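Model performance can be evaluated using the percentage of average relative deviation (ARD%, reported as MARE (%) in the abstract and conclusions), mean squared error (MSE), coefficient of determination (R2), root mean square error (RMSE), and standard deviation (STD) [70–73]. These evaluation indices are written as

$$\mathrm{ARD\%} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{\mu_{\mathrm{cal},i} - \mu_{\mathrm{exp},i}}{\mu_{\mathrm{exp},i}}\right|,$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\mu_{\mathrm{exp},i} - \mu_{\mathrm{cal},i}\right)^2,$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(\mu_{\mathrm{exp},i} - \mu_{\mathrm{cal},i}\right)^2}{\sum_{i=1}^{N}\left(\mu_{\mathrm{exp},i} - \bar{\mu}_{\mathrm{exp}}\right)^2},$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mu_{\mathrm{exp},i} - \mu_{\mathrm{cal},i}\right)^2},$$

$$\mathrm{STD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(e_i - \bar{e}\right)^2}, \qquad e_i = \mu_{\mathrm{cal},i} - \mu_{\mathrm{exp},i},$$

where N denotes the number of data samples, while the subscripts cal and exp represent the calculated and experimental quantities, respectively [74]. Also, $\mu_{\mathrm{exp},i}$ denotes the experimental relative viscosity of nanoparticles for sample i, $\bar{\mu}_{\mathrm{exp}}$ is its mean over the N samples, and $\bar{e}$ is the mean error.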
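The evaluation indices can be computed directly from paired experimental and calculated values. The sketch below assumes the usual definitions (absolute relative deviation for ARD%, and the sample standard deviation of the errors for STD); the function name and the three toy values are made up for illustration and are not results from this study.

```python
import math

def evaluate(y_exp, y_cal):
    """Return ARD%, MSE, RMSE, R2, and STD for paired experimental/calculated values."""
    n = len(y_exp)
    errors = [c - e for e, c in zip(y_exp, y_cal)]
    # Mean absolute relative deviation, in percent.
    ard = 100.0 / n * sum(abs(err / e) for err, e in zip(errors, y_exp))
    mse = sum(err ** 2 for err in errors) / n
    rmse = math.sqrt(mse)
    mean_exp = sum(y_exp) / n
    ss_res = sum(err ** 2 for err in errors)
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    r2 = 1.0 - ss_res / ss_tot
    # Sample standard deviation of the errors about their mean.
    mean_err = sum(errors) / n
    std = math.sqrt(sum((err - mean_err) ** 2 for err in errors) / (n - 1))
    return {"ARD%": ard, "MSE": mse, "RMSE": rmse, "R2": r2, "STD": std}

# Toy, made-up values (not the paper's data).
metrics = evaluate([1.0, 2.0, 4.0], [1.1, 1.9, 4.2])
print(metrics)
```

RMSE and STD are close whenever the mean error is near zero, which is consistent with the nearly equal RMSE and STD values reported for the Matern kernel.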

5. Results and Discussion

The models were evaluated using a variety of graphical techniques. Figure 1 shows the evaluation results of the models. As can be seen, all kernel functions of the GPR model showed high accuracy in the estimation of the relative viscosity of nanoparticles.

Figure 2 shows the regression diagrams, obtained by linear regression between the experimental data and the model estimates.

Figure 3 shows the errors of the models in the estimation of the relative viscosity of nanoparticles (i.e., the difference between the estimates and experimental data). As can be seen, the errors were small, as the majority of the data samples were distributed around the zero line. According to our calculations, all kernels had an average relative deviation of less than 30%.

Moreover, the predictive performance of the models in the estimation of the relative viscosity of nanoparticles was evaluated statistically. Table 1 compares the statistical errors of the models for the training data, the testing data, and the entire dataset.

5.1. Outlier Detection

The experimental data used to develop a model strongly influence its reliability. Outliers must be detected and excluded because they behave differently from the other data samples; removing them enhances the reliability of the model. To detect outliers, standardized residuals and leverage analysis were employed. The candidate outliers were evaluated using the Williams plot [75, 76], which plots the standardized residuals versus the hat values. Furthermore, to identify the feasible region, the hat values are obtained as the diagonal elements of the hat matrix [76]:

$$H = X\left(X^{T}X\right)^{-1}X^{T},$$

where X is a matrix with a size of $n \times k$, where n is the number of data samples, and k is the number of inputs. The feasible region is the square bounded by the residual cutoff and the warning leverage value. The warning leverage value is quantified as [77, 78]

$$H^{*} = \frac{3(k+1)}{n}.$$

It is worth mentioning that the cutoff for standardized residuals is typically set to 3 [79, 80]. Data samples positioned outside the feasible region are regarded as outliers. Figure 4 shows the Williams plot. According to this figure, Matern, exponential, squared exponential, and rational quadratic were found to have only two outliers.
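The leverage analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the standardized residual is taken as the residual divided by the sample standard deviation of the residuals, the warning leverage is taken as 3(k + 1)/n, and the toy design matrix (a constant column plus one made-up input) and toy outputs are invented to trigger one leverage outlier and one residual outlier.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def hat_values(x_rows):
    """Diagonal of H = X (X^T X)^-1 X^T, one leverage value per sample."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)]
           for i in range(k)]
    return [sum(xi * vi for xi, vi in zip(row, solve(xtx, row))) for row in x_rows]

def standardized_residuals(y_exp, y_cal):
    """Residuals divided by their sample standard deviation (assumed definition)."""
    res = [c - e for e, c in zip(y_exp, y_cal)]
    n = len(res)
    mean = sum(res) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in res) / (n - 1))
    return [r / sd for r in res]

def flag_outliers(x_rows, y_exp, y_cal, residual_cutoff=3.0):
    """Indices of samples outside the feasible region of the Williams plot."""
    n, k = len(x_rows), len(x_rows[0])
    h_star = 3.0 * (k + 1) / n  # warning leverage value
    h = hat_values(x_rows)
    sr = standardized_residuals(y_exp, y_cal)
    return [i for i in range(n) if h[i] > h_star or abs(sr[i]) > residual_cutoff]

# Toy data: sample 19 has an extreme input, sample 5 has a large residual.
x_rows = [[1.0, float(i)] for i in range(19)] + [[1.0, 100.0]]
y_exp = [1.0 + 0.1 * i for i in range(20)]
y_cal = [y + (0.01 if i % 2 == 0 else -0.01) for i, y in enumerate(y_exp)]
y_cal[5] = y_exp[5] + 1.0
flagged = flag_outliers(x_rows, y_exp, y_cal)
print(flagged)
```

A sample can leave the feasible region in two ways: through an unusual input combination (high leverage) or through a poor prediction (large standardized residual). The sketch reports both kinds without distinguishing them.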

6. Conclusions

This study adopted the GPR approach to estimate the relative viscosity of nanoparticles based on the size, concentration, thickness of the interfacial layer, and intensive properties of the NPs. The Matern kernel was found to outperform the exponential, squared exponential, and rational quadratic kernels in the estimation of outputs. MARE was calculated to be 6.048%, 7.059%, 7.211%, and 8.078% for the Matern, exponential, squared exponential, and rational quadratic kernels, respectively. Moreover, the dependence of the target values on the inputs was measured using a sensitivity analysis. The proposed model could be significantly helpful in mechanical and chemical applications, particularly in heat transfer evaluation for heat exchangers where a nanofluid (e.g., CNT-water nanofluid) is employed.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.