Abstract

The application of artificial neural networks to adsorption modeling has increased significantly during the last decades. These artificial intelligence models have been utilized to correlate and predict kinetics, isotherms, and breakthrough curves of a wide spectrum of adsorbents and adsorbates in the context of water purification. Artificial neural networks can overcome some drawbacks of traditional adsorption models, especially by providing better predictions at different operating conditions. However, these surrogate models have been applied mainly to adsorption systems with only one pollutant, which indicates the importance of extending their application to the prediction and simulation of adsorption systems with several adsorbates (i.e., multicomponent adsorption). This review analyzes and describes the data modeling of adsorption of organic and inorganic pollutants from water with artificial neural networks. The main developments and contributions on this topic are discussed considering the results of a detailed search and interpretation of more than 250 papers published in the Web of Science® database. A general overview of the training methods, input and output data, and numerical performance of artificial neural networks and related models utilized for adsorption data simulation is therefore provided. Some remarks for the reliable application and implementation of artificial neural networks in adsorption modeling are also discussed. Overall, the studies on adsorption modeling with artificial neural networks have focused mainly on the analysis of batch processes (87%) in comparison to dynamic systems (13%) like packed bed columns. Multicomponent adsorption has not been extensively analyzed with artificial neural network models: this literature review indicated that 87% of the references published on this topic covered adsorption systems with only one adsorbate. Results reported in several studies indicated that this artificial intelligence tool has significant potential to develop reliable models for multicomponent adsorption systems where antagonistic, synergistic, and noninteraction adsorption behaviors can occur simultaneously. The development of reliable artificial neural networks for the modeling of multicomponent adsorption in batch and dynamic systems is fundamental to improve the process engineering in water treatment and purification.

1. Introduction

The removal of pollutants from industrial process streams, groundwater, and wastewaters is undoubtedly important in terms of sustainability and human health protection [1, 2]. Adsorption is a key treatment method for facing the current challenges of water depollution. In particular, it is a proven and well-known technology for water purification due to both its technical and economic advantages [3–7]. The recent advances on adsorption for water treatment have mainly focused on the preparation and evaluation of new materials with outstanding adsorption capacities for the removal of different pollutants like dyes, heavy metals, geogenic compounds, pharmaceuticals, and other emerging toxic chemicals [8–20]. Currently, there is a wide spectrum of adsorbents that have been prepared and assessed to remove inorganic and organic compounds from aqueous solutions. Adsorption properties of these novel adsorbents have been determined experimentally using batch adsorbers and/or packed bed columns, which are the typical operating modes of this purification method.

Experimental studies with batch adsorbers allow the quantification of adsorption kinetics and isotherms as well as other important thermodynamic parameters associated with the adsorbent performance. Batch adsorbers are useful for establishing the maximum adsorption capacities for the adsorbate(s)-adsorbent system under ideal and controlled conditions since the experimental data are obtained at the thermodynamic equilibrium where the mass transfer resistances are reduced [21, 22]. Note that batch adsorption processes are not commonly employed for the treatment of real fluids at large scale since equipment with significant dimensions and long operating times is required. Packed bed adsorption columns are the most appropriate option for treating real fluids at industrial scale, including the adsorbent regeneration [23]. Breakthrough curves obtained in packed bed columns are fundamental to determine the maximum adsorption capacities at dynamic operating conditions and to analyze the impact of mass transfer phenomena on the adsorbent performance.

Process systems engineering of adsorption for water treatment requires the development of reliable models to predict the corresponding kinetic, thermodynamic, and mass transfer parameters of the system at hand. Adsorption modeling offers valuable data for the operation, control, optimization, and design of water purification equipment. For instance, the modeling of adsorption processes is fundamental to estimate the adsorbent performance at both dynamic and batch operating conditions, to optimize the adsorption process variables, to perform a sensitivity analysis of process conditions on the adsorption performance, and to analyze other design issues that are required to improve the operating costs and removal efficacy in water treatment [24–27]. It is necessary to highlight that adsorption processes in liquid phase are highly dependent on the type, variety, and concentration of adsorbate(s) contained in the fluid, the fluid physicochemical characteristics (e.g., ionic strength, temperature, and pH), the operational conditions of the adsorber (e.g., stirring rate, adsorbent dosage, bed height, flow rate, and residence time), and the adsorbent physicochemical properties (e.g., particle size, surface chemistry, and textural parameters). Therefore, the modeling of adsorption processes is a multivariable problem that involves nonlinear relationships between the input and output variable(s). These mathematical characteristics imply that the reliable correlation and prediction of adsorption processes are challenging, especially for multicomponent systems [28–30].

Overall, the available adsorption models can be classified as theoretical, semitheoretical, and empirical, and they can take the form of analytical or differential equations. Some reviews have analyzed specific adsorption equations [31–34], and results reported in a number of studies have also illustrated their limitations and advantages [28, 29, 33, 35–38]. In particular, the drawbacks of adsorption models are magnified when they are applied to multicomponent solutions. The simultaneous presence of several compounds to be adsorbed from the fluid can affect the adsorbent behavior due to their antagonistic, synergistic, and noninteraction effects [39–41]. These adsorption effects depend significantly on the properties of the adsorbates dissolved in the fluid and their concentrations. Multicomponent adsorption models derived from the traditional equations of Langmuir, Freundlich, or Sips are regarded as empirical approaches that can fail to simulate adsorption systems with several adsorbates. Consequently, it is important to develop and improve the available modeling tools for analyzing the multicomponent adsorption involved in the treatment and purification of real-life fluids.

Artificial intelligence-based models are an alternative to improve the simulation of adsorption processes for water treatment. Several authors have recognized the contribution of this type of model to obtain better correlations and estimations of the adsorption of inorganic and organic adsorbates in single and multicomponent solutions [42–47]. Artificial neural networks (ANN) have been introduced as an effective and reliable approach to overcome the problems associated with the simulation of adsorption systems, especially those corresponding to fluids with more than one adsorbate at different operating conditions [43, 46, 48, 49]. ANN are based on human brain structures and are capable of representing the nonlinear interactions between a set of input and output variable(s) of a given system without requiring a sophisticated theory [50]. They have been employed to resolve engineering problems such as fault detection, prediction of materials properties, soil degradation analysis, water treatment modeling, data reconciliation, process modeling, and control [50–54]. The advantages of ANN (e.g., reliable correlation, simplicity, versatility, and prediction capabilities) to handle multivariable problems with nonlinear behavior have justified their application in the analysis and simulation of adsorption processes [50, 52, 55–58].

In this direction, this review covers the ANN-based modeling of adsorption processes in dynamic and batch operating schemes. Its objective is to provide the readers with a general perspective of the developments, contributions, and opportunities on the modeling of adsorption data with artificial neural networks. A brief description of the theory and basis of ANN is provided in the first section of this review. The modeling of kinetics, isotherms, and breakthrough curves with ANN is then analyzed and discussed. This manuscript also covers some important guidelines concerning the parameter estimation problem to be resolved for ANN training, the selection of the input and output variables to be modeled with ANN, the details of its numerical implementation in terms of adsorption data correlation and prediction, and some challenges and perspectives on this topic.

2. Brief Introduction of Artificial Neural Networks

ANN were initially developed using the concept of artificial intelligence with the aim of simulating the activities and functions of the nervous system and human brain in terms of memorizing and learning [48, 59]. They came from the analogy made between the human brain and computer processing. Basically, an ANN is a computational system that replicates the function of the brain to carry out a specific task [60]. Input value(s) (i.e., independent variables of the system under study) are provided to the network and are manipulated via internal mathematical operations to produce output value(s) (i.e., dependent variables of the system under study) [60]. ANN are considered black-box models that are useful when a mathematical relationship between the output and input variables is not available to describe the phenomenon to be analyzed and/or where the traditional models may fail [61, 62]. ANN contain multiple interconnected nonlinear processing elements that “learn” to represent and extrapolate the nonlinear relationships between the dependent and independent variables of the case study [48].

Mathematically, ANN are composed of simple elements that perform the calculations, and these elements are interconnected with a certain topology or structure. The perceptron (neuron) is the simplest element of a network. The basic model of a neuron is illustrated in Figure 1(a) and comprises the following components [63, 64]: (1) a set of synapses, which are the inputs of a neuron given by a weighted vector; (2) an adder that simulates the neuron body and obtains the level of arousal; and (3) an activation function that generates the output if it reaches the level of excitement and restricts the output level, thus avoiding network congestion. Formally, the neuron output ($y_j$) in ANN is given by the expression

$$y_j = \varphi\left(\sum_{i=1}^{n} w_{ji} x_i\right) \quad (1)$$

where $n$ indicates the number of inputs to the neuron $j$, and $\varphi$ denotes the excitation or activation function [65, 66]. The argument of the activation function is the linear combination of the neuron inputs. Considering the set of entries $X$ and weights $W$ of the neuron as vectors of dimension $n$, Equation (1) can be defined as follows

$$y_j = \varphi\left(W^{T} X\right) \quad (2)$$
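The neuron model described above (weighted inputs, an adder, and an activation function) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `neuron_output`, the choice of the hyperbolic tangent as activation, and the numerical inputs and weights are assumptions made for the example and do not come from any of the reviewed studies.

```python
import math

def neuron_output(inputs, weights, activation=math.tanh):
    """Output of a single neuron: activation applied to the weighted sum of inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Adder: linear combination of the inputs (argument of the activation function)
    arousal = sum(w * x for w, x in zip(weights, inputs))
    # Activation function: bounds the output level of the neuron
    return activation(arousal)

# Hypothetical example with three scaled inputs (e.g., pH, temperature, dosage)
x = [0.5, -1.0, 0.25]
w = [0.8, 0.1, -0.4]
y = neuron_output(x, w)
```

Here the weighted sum is 0.2, so the neuron returns tanh(0.2); any other activation function can be passed through the `activation` argument.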

ANN can be classified into static and dynamic networks. Static networks have a broad range of applications mainly because they do not change with respect to time, while dynamic networks are applied to problems that change with respect to time [67, 68]. Multilayer ANN are widely utilized because they resemble the structures of the human brain and can be configured with forward and backward propagation, where the selection depends on the case study [69, 70]. In particular, the multilayer ANN with forward propagation has been successfully applied in the correlation and prediction of batch and dynamic adsorption processes [6, 11, 27, 41, 57, 71–80].

A multilayer ANN structure includes an input layer, one or more hidden layers, and an output layer, see Figure 1(b). The input layer contains the independent variables of the case study, while the output layer contains the corresponding dependent variables. The structure definition of a multilayer ANN seeks to reduce the problems associated with the prediction of the nonlinear behavior of a multivariable system. Therefore, an important issue is to establish a suitable ANN architecture (i.e., the number of hidden layers and their neurons). This task is commonly based on a trial-and-error approach. In this sense, the theorem of Kolmogorov [81] indicates that “the number of neurons in the hidden layer need not be larger than twice the number of entries.” Hecht-Nielsen et al. [82] proposed an equation (Equation (3)) that estimates the number of hidden layer neurons from the number of entries and the number of hidden layers used in the ANN [83]. For the case of a multilayer structure with a single hidden layer, it has been recommended that the number of neurons should be 2/3 of the corresponding number of entries [84, 85].
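To make the layered structure concrete, the following sketch propagates a set of inputs through one hidden layer and one output layer. It is a minimal illustration under stated assumptions: the 3-2-1 architecture, the bias terms, the tanh hidden activation with a linear output, and all weight values are hypothetical choices for the example, not parameters reported in the literature covered here.

```python
import math

def identity(z):
    """Linear activation, often used at the output layer."""
    return z

def layer_forward(inputs, weights, biases, activation):
    """Propagate inputs through one fully connected layer.

    weights[j][i] is the weight from input i to neuron j of the layer.
    """
    return [
        activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Forward pass: input layer -> tanh hidden layer -> linear output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b, math.tanh)
    return layer_forward(hidden, output_w, output_b, identity)

# Hypothetical 3-2-1 network: 3 scaled inputs, 2 hidden neurons, 1 output
hidden_w = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.6]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]
output_b = [0.5]
prediction = forward([0.0, 0.0, 0.0], hidden_w, hidden_b, output_w, output_b)
```

Changing the number of rows in `hidden_w` changes the number of hidden neurons, which is the quantity that the trial-and-error search over architectures adjusts.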

The next step for building an ANN model is the selection of the excitation or activation function(s). These functions are required to propagate the information and are used for ANN training (i.e., the adjustment of the corresponding synaptic weights to model the system at hand) [86, 87]. There are different excitation/activation functions, and the most common ones are the tangential sigmoidal (Equation (4)), logarithmic sigmoidal (Equation (5)), and radial basis (Equation (6)) functions:

$$\varphi(x) = \frac{2}{1 + e^{-2x}} - 1 \quad (4)$$

$$\varphi(x) = \frac{1}{1 + e^{-x}} \quad (5)$$

$$\varphi(x) = e^{-x^{2}} \quad (6)$$
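The three activation functions named above can be written compactly in Python. The sketch assumes their commonly used definitions (hyperbolic tangent sigmoid, logistic sigmoid, and Gaussian radial basis); the function names `tansig`, `logsig`, and `radbas` are borrowed from common software conventions and are used here only as labels.

```python
import math

def tansig(z):
    """Hyperbolic tangent sigmoid: output in (-1, 1)."""
    return math.tanh(z)

def logsig(z):
    """Logistic sigmoid: output in (0, 1), written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def radbas(z):
    """Gaussian radial basis: output in (0, 1], with its maximum at z = 0."""
    return math.exp(-z * z)

# Evaluate each function at a few points to compare their output ranges
values = {f.__name__: [f(z) for z in (-2.0, 0.0, 2.0)]
          for f in (tansig, logsig, radbas)}
```

The bounded output ranges are the reason why input and output data are usually scaled before training a network with these functions.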

The radial basis function is commonly used for dynamic systems and can be utilized in nondynamic processes but at the expense of increasing the computation and data processing time [88–91]. Note that there are different radial functions, for example:

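Gaussian: $\varphi(r) = e^{-(\varepsilon r)^{2}}$

Multiquadric: $\varphi(r) = \sqrt{1 + (\varepsilon r)^{2}}$

Inverse multiquadric: $\varphi(r) = 1 / \sqrt{1 + (\varepsilon r)^{2}}$

Polyharmonic: $\varphi(r) = r^{k}$ for odd $k$ and $\varphi(r) = r^{k} \ln r$ for even $k$

where $r$ is the distance to the center of the function and $\varepsilon$ is a shape parameter.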
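The radial functions listed above can be sketched as follows, assuming their standard textbook forms written in terms of a distance `r` to the kernel center. The shape parameter `eps`, the order `k`, and the helper `distance` are assumptions introduced for illustration; the reviewed studies may use other parametrizations.

```python
import math

def gaussian(r, eps=1.0):
    """Gaussian kernel: exp(-(eps * r)**2)."""
    return math.exp(-(eps * r) ** 2)

def multiquadric(r, eps=1.0):
    """Multiquadric kernel: sqrt(1 + (eps * r)**2)."""
    return math.sqrt(1.0 + (eps * r) ** 2)

def inverse_multiquadric(r, eps=1.0):
    """Inverse multiquadric kernel: 1 / sqrt(1 + (eps * r)**2)."""
    return 1.0 / math.sqrt(1.0 + (eps * r) ** 2)

def polyharmonic(r, k=1):
    """Polyharmonic spline: r**k for odd k, r**k * ln(r) for even k (k >= 1)."""
    if k % 2 == 1:
        return r ** k
    # The limit of r**k * ln(r) as r -> 0 is 0 for even k >= 2
    return 0.0 if r == 0.0 else r ** k * math.log(r)

def distance(x, center):
    """Euclidean distance between an input vector and a kernel center."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, center)))

r = distance([3.0, 4.0], [0.0, 0.0])
responses = [gaussian(r), multiquadric(r), inverse_multiquadric(r), polyharmonic(r, k=3)]
```

The Gaussian and inverse multiquadric kernels decay with distance, whereas the multiquadric and polyharmonic kernels grow, which affects how locally each hidden unit responds.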

After selecting the excitation/activation function(s), it is necessary to train the ANN model. This training can be performed with different approaches, but the most common one is back-propagation (BP) training, which has been the basis for applying other numerical methods like Levenberg-Marquardt (LM) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) [92–95], or even more sophisticated optimization algorithms like metaheuristics, also known as stochastic optimizers [96, 97]. The BP algorithm is used to define the parameters of a multilayer ANN with a fixed architecture with the aim of “learning” the system behavior. An optimization algorithm is required to minimize the sum of errors between the ANN output values and the given target values of the system to be modeled. Readers interested in advanced topics of ANN, its characteristics, and developments are encouraged to consult the reviews of Basheer and Hajmeer [73], Abraham [66], Poznyak et al. [98], Alam et al. [99], Gopinath et al. [29], Chong et al. [97], and Aani et al. [100].
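The idea of adjusting the weights to minimize the error between the ANN outputs and the target values can be illustrated with a small back-propagation sketch. It uses plain stochastic gradient descent rather than LM or BFGS, a hypothetical 1-3-1 network with a tanh hidden layer, bias terms, and a synthetic saturating curve as training data; all of these are assumptions chosen only to keep the example short.

```python
import math
import random

def train_bp(samples, n_hidden=3, lr=0.05, epochs=2000, seed=0):
    """Train a 1-n_hidden-1 network (tanh hidden layer, linear output) with
    gradient-descent back-propagation on the squared error."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # input -> hidden
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # hidden -> output
    b2 = 0.0
    history = []  # sum of squared errors accumulated over each epoch
    for _ in range(epochs):
        sse = 0.0
        for x, target in samples:
            # Forward pass
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            y = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = y - target
            sse += err * err
            # Backward pass: gradients of 0.5 * err**2
            for j in range(n_hidden):
                # tanh'(z) = 1 - tanh(z)**2; uses w2[j] before it is updated
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(sse)
    return (w1, b1, w2, b2), history

# Synthetic saturating curve (not experimental data): q = C / (1 + C)
samples = [(0.25 * k, (0.25 * k) / (1.0 + 0.25 * k)) for k in range(9)]
params, history = train_bp(samples)
```

Comparing `history[0]` with `history[-1]` shows how the sum of squared errors evolves during training; in practice a validation subset is monitored as well to detect overtraining.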

Finally, ANN have been combined and/or hybridized with other numerical approaches to resolve complex engineering problems. Stochastic global optimization methods (e.g., particle swarm optimization, genetic algorithm, cuckoo search, and ant colony optimization), fuzzy logic, and principal component analysis have been employed to improve the performance of ANN modeling in several fields including adsorption [29, 96–102].

3. Applications of ANNs to Model the Adsorption of Water Pollutants

A wide variety of theoretical and empirical models have been proposed to analyze, correlate, and predict adsorption processes. However, these models are generally based on restrictive assumptions and theories, which can significantly limit their application [7]. For instance, traditional adsorption isotherms like Langmuir and Freundlich have adjustable parameters that neglect the impact of solution temperature or pH on the adsorption capacities at equilibrium. These traditional models have been extended to handle multicomponent adsorption, but their errors are significant for systems where antagonistic and synergistic adsorption effects are simultaneously present [31, 34, 38, 103, 104]. Other examples are the statistical physics models, which are theoretical equations utilized to estimate physicochemical parameters of adsorption processes but with the limitation of neglecting the role of solution pH or other fluid characteristics. Similar remarks can be formulated for the conventional kinetic equations (e.g., pseudofirst and pseudosecond order models) or even mass transfer models. Therefore, ANN are an alternative to overcome these disadvantages and also to develop improved versions with better correlation and prediction capabilities. However, it should be noted that ANN are black-box (i.e., empirical) models that are effective for correlation and prediction but do not provide additional theoretical understanding of the system under analysis. This drawback of ANN can be partially resolved via its hybridization with theoretical adsorption models [105, 106].

Mathematically, the performance of an adsorption system is a nonlinear function that depends on the adsorbent properties, chemistry of the adsorbate(s), operating conditions, fluid properties, and equipment configuration. This nonlinear functionality can be modeled using ANN because there is no limitation on incorporating all the independent variables affecting the adsorption system, see Figure 2. ANN can also predict the performance of multicomponent adsorption systems where the adsorption capacities or other performance metrics of all adsorbates, like the concentration profiles of breakthrough curves, are incorporated as output variables.

For the preparation of the current manuscript, a literature review was performed in the Web of Science® database using the keywords “adsorption,” “water,” and “artificial neural network(s).” All articles found with these keywords were scrutinized to identify papers out of the scope of this review. Several references lacked significant information about the type and characteristics of the ANN, the input and output data, and other relevant points, and they were discarded from the analysis and discussion. This review covers more than 250 papers related to ANN modeling of adsorption data. For illustration, Figure 3 provides an overview of the papers published on ANN and adsorption of water pollutants from 1999 to 2021 (July) according to the Web of Science® database using these keywords. It is clear that the number of publications about this topic has continuously increased, and a diversity of adsorption systems (i.e., adsorbents, adsorbates, process configurations, and operating conditions) has been analyzed via ANN with different topologies, activation functions, and training methods including hybrid approaches. This set of publications is briefly described and discussed to provide an overview of the advantages, limitations, and current challenges of adsorption modeling using ANN. Consequently, this section summarizes the main findings on the application of ANN for the modeling of kinetics, isotherms, and breakthrough curves obtained in the adsorption of different water pollutants.

3.1. Kinetics and Isotherms

Batch adsorption tests are required to quantify kinetics and isotherms, thus characterizing the performance of adsorption processes. The first applications of ANN in adsorption modeling were associated with the correlation and prediction of kinetics and isotherms. Tables 1 and 2 summarize the ANN modeling of kinetic and equilibrium studies for the adsorption of several pollutants from water. For instance, the adsorption data of arsenic, dyes, fluorides, heavy metals, pesticides, and organic compounds using activated carbons, bone char, lignocellulosic biomasses, clays, nanocomposites, hydrogels, and metal-organic frameworks have been modeled with ANN. These experimental studies have covered different operating conditions (e.g., 20–60 °C and pH 1–11) and a broad spectrum of adsorption capacities (3–270 mg/g). Several input variables have been considered in the ANN modeling such as pH, temperature, adsorbent dosage, contact time, initial concentration, and physicochemical properties of the pollutant(s) and adsorbent, among others. Adsorption systems with one pollutant (i.e., adsorbate) dominate in the literature (~87%), and a limited number of multicomponent adsorption studies with two or more pollutants have been reported despite the recognized capabilities of ANN to handle multiresponse processes. A brief description of representative studies on ANN modeling of kinetics and isotherms for different water pollutants is provided below.

Brasquet and Le Cloirec [107] were pioneers in the modeling of batch adsorption data with ANN. These authors formulated the question “why use neural networks in adsorption processes?” and determined that ANN can be excellent predictors for this separation process if properly implemented. They studied the adsorption of 368 organic compounds on three activated carbons and used an ANN with four input variables: molecular size and flexibility with the variable 3Xp (0–10.33), molecular volume and topology of unsaturation and heteroatoms with the variable 2Xvp (0–9.58), the critical dimension with the variable 6Xvp (0–5.06), and a dummy variable “D” (-1.15–1.08). ANN with three neurons in the hidden layer were applied, and the output variable was defined from $q_e$ and $C_e$, where $q_e$ is the equilibrium adsorption capacity and $C_e$ is the adsorbate equilibrium concentration [108]. These authors used quantitative structure-activity relationship (QSAR) data from Blum et al. [109], with 333 data for learning and 35 for testing of the ANN. They used a classical neural network with the BP algorithm as the training method and the hyperbolic tangent sigmoid as the activation function. This study concluded that an excessive number of neurons in the hidden layer was not necessary to achieve satisfactory modeling results. The impact of the number of neurons on ANN overtraining was also analyzed.

Chu and Kim [110] compared the modified Langmuir model and a feed-forward ANN for the prediction of competitive adsorption of cadmium and copper by a plant biomass. Equilibrium adsorption data at pH 4–5 and 25 °C were taken from Pagnanelli et al. [111], where a mutual suppression of the adsorption of both metals occurred in the binary metallic system due to the competition for the binding sites of this adsorbent. Input variables for ANN modeling were the copper and cadmium equilibrium concentrations (0.124–2.243 mmol/L) and pH, while the ANN outputs were the copper and cadmium adsorption capacities (0.006–0.165 mmol/g). ANN training was performed with 83.3% of the data set, while the remaining 16.7% was utilized for testing. The logistic sigmoid function was applied for neuron activation. The best ANN configuration consisted of 1 hidden layer with 10 neurons and BP training. The two models were compared using the relative errors, and the ANN achieved the lowest values, thus outperforming the data correlation with the modified Langmuir model.

Singh et al. [112] employed an adapted neural fuzzy model and a BP-ANN for the prediction of cadmium adsorption by hematite. Specifically, a 3-layered feed-forward BP-ANN was employed where the input variables were the cadmium concentration (44.48–88.96 μmol/L), agitation rate (50–125 rpm), pH (9.2), temperature (20.5–40.5 °C), and contact time (29–222 min), while the output variable was the final cadmium concentration (43–103 μmol/L). The training database consisted of 15 datasets. The activation functions were the logistic sigmoid and symmetric Gaussian for the classical ANN model and the hybrid neural fuzzy model, respectively. Results showed that the cadmium adsorption depended on the five input variables. The hybrid neuro-fuzzy model provided better predictions of the cadmium adsorption than the BP-ANN.

ANN were used by Aber et al. [113] for modeling the kinetic adsorption data of the acid orange 7 dye using powdered activated carbon. The kinetic experiments evaluated the effect of initial concentration and pH. Input variables were the initial concentration (150–350 mg/L), pH (2.8–10.5), and contact time (75–600 min), while the final concentration after adsorption (5.48–178 mg/L) was the output variable of the ANN model. A conventional adsorption kinetic equation (i.e., pseudosecond order) and a feed-forward BP-ANN with 3-2-1 neurons and logistic sigmoid and hyperbolic tangent sigmoid functions were used for modeling the experimental data. A total of 219 experimental data were employed, with 146 for training and 73 for prediction. The performance of these models was assessed, and ANN achieved the lowest mean relative error (5.81%). This study concluded that ANN was a predictive approach that could replace conventional kinetic models.
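The mean relative error used to compare models can be computed as sketched below. The exact definition employed in each study may differ (e.g., normalization or use of absolute values), so this form is an assumption; the function name and the concentration values are hypothetical and are not data from Aber et al.

```python
def mean_relative_error(experimental, predicted):
    """Mean relative error in percent; experimental values must be nonzero."""
    if not experimental or len(experimental) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((p - e) / e) for e, p in zip(experimental, predicted))
    return 100.0 * total / len(experimental)

# Hypothetical final dye concentrations (mg/L): measured vs. model output
c_exp = [100.0, 50.0, 20.0]
c_model = [95.0, 52.0, 20.0]
mre = mean_relative_error(c_exp, c_model)
```

For these illustrative values the individual relative errors are 5%, 4%, and 0%, giving a mean of about 3%.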

Yetilmezsoy and Demirel [114] proposed the use of a three-layer ANN for predicting the removal of lead with antep pistachio shells. The input variables were the adsorbent dosage (2–16 g/L), contact time (5–120 min), temperature (30–60 °C), pH (2–9), and lead initial concentration (5–100 mg/L), while the output variable was the lead removal (26.45–98.70%). The data were divided into 34, 16, and 16 points for training, validation, and testing, respectively. A tangent sigmoid function at the hidden layer and a linear function at the output layer were used, while the LM algorithm was the best alternative for ANN modeling. This ANN model fitted the adsorption data with a low mean squared error. Sensitivity analysis revealed that pH was the most influential variable on the metal adsorption, where a maximum lead removal of 99% was obtained.

A three-layer feed-forward ANN was used to model the adsorption kinetics of auramine O by activated carbon [115]. The ANN was trained using the parameters obtained from the pseudosecond order kinetic equation. The LM method was employed to train the ANN with the following input variables: initial dye concentration (85–200 mg/L), contact time (1–120 min), agitation speed (400–800 rpm), temperature (305–333 K), initial solution pH (3–8), and activated carbon mass (0.3–1.8 g). The output variable was the dye adsorption capacity. The best ANN architecture was 6-7-1 with linear and hyperbolic tangent sigmoid activation functions. Overall, the mean squared errors of the ANN and the pseudosecond order kinetic model differed by less than 2%. This study was among the early attempts to combine ANN and a traditional kinetic adsorption equation to improve the adsorption modeling.

Garza-González et al. [116] proposed an approach to compare ANN and conventional isotherm models in the methylene blue adsorption by Spirulina sp. Simulated annealing and genetic algorithms were applied with ANN with two hidden layers and the hyperbolic tangent sigmoid function. Temperature (25–50 °C), pH (2–8), and adsorbent dosage (1.2–10 g/L) were used as input neurons, while the adsorption capacities (3.09–66.98 mg/g) or the removal efficiency (23.56–86.89%) were considered as the output neurons. Results showed that the genetic algorithm outperformed simulated annealing. A sensitivity analysis was used to rank the impact of the operating variables on both the removal efficiency and the adsorption capacity. Finally, the experimental isotherms indicated a maximum adsorption capacity of 900 mg/g. The optimized ANN model significantly outperformed the Fritz-Schlunder equation.

Yang et al. [96] proposed the application of ANN and genetic algorithm to determine the importance of adsorption variables (e.g., initial dye concentration, time, temperature, and pH) on the adsorption of the dyes congo red and acid black 172 by Penicillium YW01 biomass. Experimental results showed that the maximum adsorption capacities of this biomass were 411.53 mg/g for congo red and 225.38 mg/g for acid black 172. This dye separation process was endothermic and pH dependent. Adsorption kinetics were modeled with the pseudosecond order and Weber-Morris models, and the isotherms were fitted with the Langmuir equation. ANN modeling was performed with 129 experimental data divided into 77, 26, and 26 for training, validation, and testing, respectively. The input variables were the contact time (5–360 min), initial dye concentration (50–800 mg/L), pH (1–10), and temperature (20–40 °C), while the output variable was the dye adsorption capacity (21.45–411.53 mg/g). $R^2$ values > 0.99 were obtained for the prediction of congo red and acid black 172 adsorption using ANN and genetic algorithm. This combined approach was more effective than ANN alone. The authors concluded that the initial adsorbate concentration and temperature showed the highest impact on the adsorption of both dyes.
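The division of a data set into training, validation, and testing subsets can be sketched as follows. The random shuffling, the fixed seed, and the function name `split_dataset` are assumptions for illustration; the source does not state how the 129 data were assigned to each subset, and the integer records below are placeholders rather than measurements.

```python
import random

def split_dataset(data, n_train, n_val, seed=0):
    """Shuffle a copy of the data and split it into training, validation,
    and testing subsets (the test subset receives the remaining records)."""
    if n_train + n_val > len(data):
        raise ValueError("subset sizes exceed the number of records")
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# 129 placeholder records standing in for experimental data points
records = list(range(129))
train, val, test = split_dataset(records, n_train=77, n_val=26)
```

Keeping the testing subset untouched during training is what allows the reported prediction metrics to reflect performance on unseen operating conditions.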

Response surface methodology (RSM) and ANN were used to model the lead removal from industrial sludge leachate using red mud [117]. pH (3–7), contact time (5–60 min), and adsorbent mass (1.25–10 g/L) were the input variables, and the lead removal was the output variable. A Box–Behnken design (BBD) was utilized for RSM and to obtain the data involved in ANN training. From this experimental design, the lead removal ranged from 38.84 to 96.82%, where adsorbent dosage was the main operating variable followed by the contact time and pH. A feed-forward multilayer ANN with hyperbolic tangent sigmoid and logistic sigmoid functions and a 3-12-1 architecture was used to predict the lead removal. $R^2$ and root mean squared error were used as the statistical metrics to assess the model performance. Both ANN and RSM models were satisfactory to correlate the experimental data of this adsorption system but with an evident advantage of ANN for predictive purposes.

Masood et al. [118] applied an ANN to predict the removal of total chromium using Bacillus sp. Experimental results showed that this removal process was pH dependent, achieving a maximum adsorption capacity of 50 mg/g according to the equilibrium data, which were fitted to the Freundlich equation. A feed-forward BP-ANN with three layers and logistic sigmoid activation function was employed in data analysis. Solution pH (4–9), contact time (2–6 h), and initial adsorbate concentration (100–400 mg/L) were the input layer variables, while the adsorption capacity (16.5–50 mg/g) was the output variable. A total of 360 data from adsorption experiments were utilized for training (80%), testing (10%), and validation (10%). Modeling errors and $R^2$ were utilized to test the ANN accuracy. A minimum root mean squared error of 0.0001 was obtained for the ANN with 10 neurons. It was identified that pH was the most influential factor to model the chromium removal followed by the adsorbate concentration and contact time.
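The root mean squared error, together with the determination coefficient that commonly accompanies it in this literature, can be computed as sketched below. The definition of the determination coefficient shown here (one minus the ratio of residual to total sum of squares) is an assumption, since some studies report the squared correlation coefficient instead; the adsorption capacities are hypothetical values.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical adsorption capacities (mg/g): experimental vs. ANN prediction
q_exp = [10.0, 20.0, 30.0, 40.0]
q_ann = [12.0, 18.0, 32.0, 38.0]
error = rmse(q_exp, q_ann)
r2 = r_squared(q_exp, q_ann)
```

Note that the root mean squared error carries the units of the output variable, so values from studies with different outputs or data scaling are not directly comparable.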

Savic et al. [119] proposed a comparative study of a central composite design (CCD) and multilayer ANN to model and optimize the adsorption of iron on bentonite clay. This experimental design consisted of 16 tests, and results showed that the adsorption efficiency ranged from 71.24 to 89.85% at pH 7 and room temperature. For the ANN modeling, the training sample was 80%, and the test sample was 20%. The initial metal concentration (17.09–51.91 mg/L), contact time (10–120 min), and adsorbent concentration (1000–7000 mg/L) corresponded to the input layer, and the metal removal was the output layer, where the ANN architecture was 3-9-1 with a radial basis activation function. 3D and contour plots for CCD and ANN were obtained. The multilayer ANN showed higher $R^2$ and lower errors than CCD, thus confirming its better prediction performance.

The performance of 9 adsorbents obtained from dead fungal biomass was analyzed in the adsorption of reactive black 5 from aqueous solution [120]. Adsorption isotherms and kinetics were quantified to study the adsorption mechanisms. ANN were utilized to predict the impact of adsorbent textural parameters and physicochemical properties on the dye adsorption capacities. Experimental adsorption capacities of these adsorbents were 34.18-179.26 mg/g. The pseudosecond order and Langmuir equations were suitable to fit the experimental kinetics and isotherms, respectively. BP-ANN was used with the next input variables: pH (1–9), contact time (5–360 min), initial dye concentration (50–250 mg/L), BET area (0.0698–0.7656 m2/g), pore volume (), pore diameter (4.21–4.70 nm), nitrogen content (2.29–4.70%), carbon content (45.78–60.21%), and hydrogen content (9.18–7.20%), while the output variable was the adsorption capacities (0.65–172.67 mg/g). 135, 45, and 45 experimental data were utilized for training, validation, and testing. LM algorithm was the training method with a feed-forward BP-ANN with 3 layers. A sensitivity analysis was performed via the Garson method obtaining the next tendency for tested input variables: . The authors concluded that the adsorption capacities were affected by the chemical composition and not by the surface area of these adsorbents.
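The Garson method mentioned above ranks the input variables from the trained connection weights. The sketch below follows the weight-partitioning form of the Garson equation that is commonly quoted in the adsorption literature; it assumes a single hidden layer and a single output neuron, and the weights in the example are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance of each input from the absolute connection weights.

    w_ih[i][j] is the weight from input i to hidden neuron j.
    w_ho[j] is the weight from hidden neuron j to the single output neuron.
    Assumes every hidden neuron has at least one nonzero incoming weight.
    """
    n_in = len(w_ih)
    n_hid = len(w_ho)
    contribution = [0.0] * n_in
    for j in range(n_hid):
        # Share of each input in hidden neuron j, weighted by |hidden-to-output weight|.
        column_sum = sum(abs(w_ih[i][j]) for i in range(n_in))
        for i in range(n_in):
            contribution[i] += abs(w_ih[i][j]) / column_sum * abs(w_ho[j])
    total = sum(contribution)
    return [c / total for c in contribution]


# Hypothetical weights for a 2-2-1 network (illustration only).
w_ih = [[0.5, -1.0],
        [1.5, 1.0]]
w_ho = [2.0, -1.0]
importance = garson_importance(w_ih, w_ho)
print([round(value, 4) for value in importance])
```

Because only absolute values are used, this ranking indicates the magnitude of each input's influence but not whether the effect is positive or negative.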

Bingöl et al. [121] carried out a comparison between the multiple linear regression and the adaptive neuro-fuzzy inference system (ANFIS) for the cadmium adsorption with date palm seeds. This analysis considered 20 experiments to assess the effect of adsorbent mass (0.05–0.5 g), initial adsorbate concentration (5–100 mg/L), and pH (2–6) on the adsorption capacity (0.01-4.18 mg/g). ANFIS was trained with 50% of the experimental data, and the remaining 50% was utilized in testing. Results showed for ANFIS and for the multiple linear regression. These authors concluded that the multiple linear regression could not represent the nonlinearity of this adsorption process, and ANFIS was a better modeling alternative.

The application of ANN and gene expression programming (GEP) was studied by Çelekli et al. [122] in the prediction of the adsorption of lanaset red G dye using low-cost lentil straw. They applied a three-layer BP-ANN with an input layer consisting of 4 input neurons, namely, adsorbent particle size (125–500 μm), pH (1–4), contact time (0–360 min), and adsorbate concentration (50–800 mg/L), and an output layer corresponding to the adsorption capacity (30.57–271.12 mg/g). The training algorithm was the quick propagation method with the logistic sigmoid function, where the data were divided into training (784), validation (184), and testing (184) sets. The maximum adsorption capacity of this adsorbent was 271.12 mg/g. $R^2$ values were 0.999, 0.989, and 0.989 for ANN, pseudosecond order, and GEP models, respectively. Therefore, ANN provided the best fit of the experimental data. Solution pH and initial dye concentration were the operating variables with a significant impact on the adsorption of this organic pollutant.

Khajeh and Hezaryan [120] employed a hybrid ant colony optimization and ANN for the simulation and optimization of manganese and cobalt adsorption on SiO2 nanoparticles. Feed-forward multilayer ANN was utilized where pH (7.5–10.5), adsorbent dosage (0.05–0.015 g), contact time (10–30 min), and the concentration of 1-(2-pyridylazo)-2-naphthol (0.5–1.5 mol/L) were the input neurons, while the removal of manganese and cobalt (29–99%) was the output neuron. Tangent sigmoid and linear activation functions were used. LM algorithm was employed in ANN training where 57 experimental data were split into 64, 18, and 18% for training, validation, and testing, respectively. The experimental conditions optimized with the ant colony optimization were well predicted with ANN thus obtaining and 0.98 and a root mean square error of 0.0979 and 0.04 for manganese and cobalt, respectively.

Multilayer feed-forward ANN and genetic algorithm were applied to analyze the effect of several operating parameters on the adsorption of eosin Y dye by Co2O3-activated carbon [123]. The experimental maximum adsorption capacity was 555.56 mg/g at 25 °C and pH 3. A three-layer ANN with linear and tangent sigmoid functions was tested. LM method was the training algorithm. Input neurons included the adsorbent dosage (0.005–0.02 g), initial adsorbate concentration (30–80 mg/L), and contact time (0.5–30 min), while the removal percentage (%) was the output neuron. 70% of experimental data was used for training, 15% for validation, and 15% for testing. The lowest mean squared error (0.00015) and highest $R^2$ (0.9991) of ANN and genetic algorithm confirmed their suitability to model this adsorption system.

Çoruh et al. [124] proposed the use of a nonlinear autoregressive model with exogenous input (NARX) neural network for predicting the zinc adsorption on activated almond shell. This model was developed considering as input variables the adsorbent dosage (0.125–4.0 g), pH (2–10), particle size (0.23–2.0 mm), and initial metal concentration (15–100 mg/L), where the output layer consisted of 2 neurons, i.e., adsorption capacity (mg/g) and removal percentage. These authors indicated that NARX was a dynamic recurrent model that converged faster and generalized better than other ANN. NARX architecture was 4-10-2 with a tangent sigmoid function and BP algorithm with a gradient descent momentum optimization. The performance of this model was tested thus obtaining a , and numerical results showed that NARX successfully modeled this batch adsorption system.

Mendoza-Castillo et al. [125] implemented a classical BP-ANN for modeling the adsorption isotherms and kinetics of four heavy metals (i.e., lead, cadmium, zinc, and nickel) on several lignocellulosic wastes (i.e., jacaranda fruit, plum kernels, and nut shells). These authors discussed that the heavy metal adsorption on lignocellulosic biomasses was a complex process with highly nonlinear interactions among the adsorbent characteristics, the physicochemical properties of adsorbates, and the removal operating conditions. The input data were the biomass specific surface area (23–33 m2/g), the biomass contents of cellulose (29.54–50.16%), hemicellulose (21.46–25.87%), and lignin (26.58–20.50%), the concentration of acidic groups (0.87–1.14 mmol/g), the molecular weight (58.69–207.20 g/mol), hydrated ionic radii (4.01–3.30 Å), electronegativity (1.60–1.90), and hydration energy (−1485 to −2106 kJ/mol) of tested heavy metals, and either the initial metal concentration (40 and 100 mg/L) or the equilibrium metal concentration (20–250 mg/L), depending on whether kinetics or isotherms were analyzed. The experimental adsorption capacities (1–7 mg/g) of all heavy metals were considered as the ANN outputs. Different structures of ANN were assessed in terms of input variables, where 70% of experimental data were used for training, 15% for testing, and 15% for validation. Linear and tangent sigmoid activation functions were used with one hidden layer and 10 neurons to avoid model overfitting. Results of the mean relative errors and $R^2$ showed that this ANN properly fitted the experimental data. The lignin content, acidic group amount, molecular weight, and hydration energy of heavy metals were the main factors affecting the adsorption process.

Nia et al. [123] reported the reactive orange 12 adsorption on gold nanoparticle-activated carbon and its modeling with ANN using an imperialist competitive algorithm. Neural Network Toolbox of MATLAB R2011a was utilized in this study. LM and BP algorithm were utilized. The input variables for ANN modeling were the adsorbent amount, contact time, and dye initial concentration, while the output variable was the dye removal (%). 168 experimental data were used for training and 72 for testing. ANN model with 9 hidden neurons showed and a mean squared error of 0.0007 for this adsorption system.

A hybrid approach using principal component analysis and ANN was proposed by Zeinali et al. [43] for modeling the competitive adsorption of brilliant green and methylene blue by graphite oxide nanoparticles. The experimental results indicated that the dye adsorption was pH dependent where the maximum adsorption capacities were 410 and 129.41 mg/g for methylene blue and brilliant green, respectively. Dye adsorption was inhibited by the presence of the second dye molecule in the aqueous solution. Adsorption data were divided in 100 for training and 40 for testing of ANN model. Input variable was the equilibrium concentration (mg/L) of dye mixture, and the output variable was concentrations (mg/L). Tangent sigmoid and linear activation functions with BP algorithm were used for ANN. The optimal ANN architecture included 10 neurons with and . The competitive isotherms were fitted with the conventional equations and the extended Freundlich model adjusted properly the data. Finally, the principal component analysis and ANN were effective for the simultaneous modeling of the adsorption capacity of brilliant green and methylene blue in binary solutions.

The ternary adsorption of three dyes (i.e., methylene blue, crystal violet, and brilliant green) on MnO2-loaded activated carbon was optimized and predicted with RSM and ANN [126]. Specifically, CCD was used for RSM and a three-layer feed-forward structure was used for ANN. Different ANN training algorithms were tested, where the LM method was the most suitable. Hyperbolic tangent sigmoid and linear functions were used for hidden and output layers, respectively. 90 experimental data were divided into 70% for training, 15% for testing, and 15% for validation. $R^2$ and modeling errors were used to test the performance of the ANN model. Results indicated that ANN outperformed RSM with .

A novel quantum BP multilayer ANN was implemented by Bhattacharyya et al. [127] to predict the adsorption of iron by calcareous soil. Specifically, the quantum computing is based on the principles of quantum mechanics with operations like superposition and entanglement. Superposition is the characteristic of dynamical equations, while the entanglement is the property that produces a nonlocal interaction among bipartite correlated states. 6-6-1 topology was used for the multilayer ANN and the quantum-based ANN. The input variables were the initial adsorbate concentration (1.5–15 mg/L), adsorbent amount (0.01–0.11 g/mL), pH (2-10), contact time (20–180 min), stirring rate (100–300 rpm), and temperature (303-330 K), while the output variable was the iron removal (39.56–97.34%). Tangent and sigmoid activation functions were evaluated. Calculations demonstrated that the architecture of quantum ANN was superior to multilayer ANN for describing this adsorption process. The adsorbent achieved a maximum adsorption capacity of 2.475 mg/g, and the removal process depended on solution pH and temperature.

Darajeh et al. [128] carried out a comparative study between wavelet ANN and RSM to optimize the adsorption of copper, nickel, and lead onto a magnetic/talc nanocomposite. This ANN used wavelet functions as an alternative to the conventional sigmoid activation function. The initial adsorbate concentration (32–368 mg/L), adsorbent dosage (0.07–0.13 g), and adsorption time (13–147 s) were the input variables, and the removal percentages (21.6–98.5%) of these adsorbates were the output variables. This ANN was trained with 13 data, and the incremental BP, batch BP, quick-propagation, genetic algorithm, and LM were applied and assessed to obtain the best network. The best architecture was the incremental BP with 3-14-3 with . It was concluded that the initial adsorbate concentration was the most influential factor (35.16%) on the heavy metal adsorption followed by the adsorbent dosage. This alternative ANN was more suitable than RSM to predict the adsorption process.

The adsorption of cadmium on rice straw was modeled with ANFIS [129]. As the authors stated, this model combined the advantages of both fuzzy systems and ANN. The influence of initial cadmium concentration (10 and 100 mg/L), solution pH (2 and 7), and adsorbent mass (0.1 and 0.5 g/L) was analyzed. These operating conditions were the ANFIS input variables, and the output variable was the removal efficiency (%). LM method was utilized for ANN training where the data were distributed in 70% for training and 30% for validation. Hyperbolic tangent activation function was used with an architecture of 3-6-1. ANFIS showed that the initial cadmium concentration had the highest impact on the adsorption followed by pH and adsorbent dose. This model achieved for training, 0.82 for validation, and 0.97 for testing, respectively.

The individual and simultaneous ultrasonic-assisted removal of malachite green and methylene blue dyes by a magnetic γ-Fe2O3-loaded activated carbon was studied by Asfaram et al. [130], including its modeling with RSM and ANN. A feed-forward BP ANN was used with the following input variables: pH (4.5–7.5), initial dye concentration (10–20 mg/L), sonication time (3–5), and adsorbent mass (0.01–0.02 g). Dye removal (%) was the output variable. Hyperbolic tangent sigmoid function was used in the hidden layer, and linear activation function was applied in the output layer. 50 data were divided for training, testing, and validation (70/15/15%), where LM was the training method. $R^2$ and different error functions were applied to test the ANN performance. Both RSM and ANN were capable of predicting the dye adsorption with high values of $R^2$, but ANN outperformed RSM.

Esfandian et al. [8] tested ANN using the experimental data of the removal of the pesticide diazinon using acid-treated zeolite and zeolite modified with Cu2O nanoparticles. Experimental results indicated that these zeolites showed adsorption capacities of 15.10 and 61.73 mg/g, respectively. Adsorption depended on pH and temperature, where an exothermic process was identified. Data modeling was performed considering pH (3–8), initial adsorbate concentration (50–120 mg/L), adsorbent dosage (0.05–0.35 g), and contact time (10–105 min) as input variables, and the target variable was the removal efficiency (%). Experimental data were divided into training (70%), validation (15%), and testing (15%). In this study, a multilayer feed-forward ANN with 7 hidden layer neurons and sigmoid function was utilized. This ANN showed the lowest modeling errors and was suitable to fit the experimental data of this adsorption system.

Fawzy et al. [131] also proposed the use of ANFIS to establish the impact of operational parameters on the nickel and cadmium adsorption by Typha domingensis biomass. Five variables were analyzed: pH (2–8), adsorbent dosage (2.5–40 g/L), particle size (0.25–1.0 mm), contact time (5–150 min), and metal concentration (25–300 mg/L). The output variable was the metal removal efficiency (%). Experimental data showed that this biomass achieved a maximum adsorption capacity of 4.51 and 28.49 mg/g for nickel and cadmium at pH 6 and , respectively. ANFIS training was carried out with a hybrid methodology consisting of a combination of the least-squares method and the BP gradient descent method where the Sugeno-type fuzzy inference system was applied. Results indicated that the initial concentration and pH had a significant influence on the metal adsorption. ANFIS was useful to identify the role of these operational parameters.

Ghaedi et al. [132] studied the application of ANN-particle swarm optimization approach for the modeling of methyl orange removal on lead oxide nanoparticles-loaded activated carbon. The input ANN variables were the contact time, adsorbent dosage, and dye concentration, while the output variable was the removal of methyl orange (%). ANN training was performed with LM algorithm using 270 data and 90 data for testing. ANN-PSO modeling with 6 neurons in the hidden layer offered the best results ().

Gomez-Gonzalez et al. [133] utilized ANN to model the lead adsorption by coffee grounds. Its performance was compared with traditional equations such as Langmuir and Freundlich. Specifically, pattern search, simulated annealing, and genetic algorithm were used to adjust the parameters of Langmuir and Freundlich equations, and the resulting fits were compared with ANN. A three-layer ANN with tangent sigmoid activation function and LM training was used. The input neuron was the equilibrium concentration, and the adsorption capacity (mg/g) was the output neuron. The architecture used was 1-13-1 for pH 3 and 1-4-1 for pH 4 and 5. Experimental data were distributed in 70% for training, 15% for validation, and 15% for testing. A maximum adsorption capacity of 22.9 mg/g was obtained with coffee grounds at pH 5 and 30 °C. These authors concluded that pattern search was the best optimization method, and ANN outperformed the conventional isotherm equations used in adsorption.
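For reference, the two conventional isotherms used as benchmarks in this kind of comparison can be written in a few lines. The sketch below defines the Langmuir and Freundlich equations and estimates the Langmuir parameters by ordinary least squares on the linearized form $C_e/q_e = 1/(q_{max} K_L) + C_e/q_{max}$. This linearized fit is a simple stand-in and not the pattern search, simulated annealing, or genetic algorithm used in the reviewed study. The data are synthetic, generated from the reported maximum capacity of 22.9 mg/g and a hypothetical $K_L$ of 0.05 L/mg.

```python
def langmuir(ce, q_max, k_l):
    """Langmuir isotherm: adsorption capacity (mg/g) at equilibrium concentration ce (mg/L)."""
    return q_max * k_l * ce / (1.0 + k_l * ce)


def freundlich(ce, k_f, n):
    """Freundlich isotherm: adsorption capacity (mg/g) at equilibrium concentration ce (mg/L)."""
    return k_f * ce ** (1.0 / n)


def fit_langmuir_linearized(ce_values, qe_values):
    """Least-squares line through (ce, ce/qe); slope = 1/q_max, intercept = 1/(q_max*k_l)."""
    xs = list(ce_values)
    ys = [ce / qe for ce, qe in zip(ce_values, qe_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, slope / intercept  # (q_max, k_l)


# Synthetic, noise-free data (k_l = 0.05 L/mg is a hypothetical value).
ce_data = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
qe_data = [langmuir(ce, 22.9, 0.05) for ce in ce_data]
q_max_fit, k_l_fit = fit_langmuir_linearized(ce_data, qe_data)
print(f"q_max = {q_max_fit:.2f} mg/g, K_L = {k_l_fit:.3f} L/mg")
```

With real, noisy data the linearized fit weights the points unevenly, which is one reason nonlinear optimizers such as those tested in the study are often preferred.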

Podstawczyk and Witek-Krowiak [134] studied the malachite green adsorption using a novel composite. Specifically, the rapeseed meal was modified with magnetic nanoparticles. Adsorption kinetic data were modeled with the surface diffusion, pseudosecond order, and pseudofirst order models as well as ANN. These authors proposed a feed-forward ANN with 2-3-1 topology that was trained with LM method. The input variables were the adsorption time (0–270 min) and pH (4–6), while the adsorption capacity (0–40 mg/g) was the output variable. ANN outperformed the conventional kinetic equations showing . Dye adsorption isotherm indicated a maximum adsorption capacity of 836.2 mg/g.

Ahmadi et al. [135] tested random forest, radial basis function ANN, and CCD polynomial model to simulate and optimize the ultrasonic-assisted removal of brilliant green with ZnS nanoparticles loaded on activated carbon. In particular, the random forest is based on decision trees and uses voting for classification and averaging for regression and predictions. The effect of several operational conditions such as adsorbent dosage (10–30 mg), initial adsorbate concentration (4–20 mg/L), and sonication time (2–6 min) on the removal efficiency (15.4–100%) was evaluated. Experimental data were divided into 70% for training and 30% for validation. Results showed that these approaches were suitable for data fitting. However, the random forest outperformed the other models. Under the optimized adsorption conditions, 98% of brilliant green removal was achieved.

Asfaram et al. [136] applied RSM, ANN, and radial basis function neural network (RBFNN) to model and predict the efficiency of Mn@CuS/ZnS nanocomposite-loaded activated carbon to remove malachite green and methylene blue dyes in binary adsorption systems assisted by ultrasound. The effect of pH (4-8), initial dye concentration (5-25 mg/L), sonication time (1-5 min), and adsorbent mass (0.01-0.03 g) on the dye removal percentage was tested. For ANN modeling, 32 experiments were used and randomly divided in 70% (training), 15% (testing), and 15% (validating) where LM algorithm was the best training method. Hyperbolic tangent sigmoid and linear functions with a BP algorithm were applied. For RBFNN, the Kernel stone algorithm was used as training method with 70% of data for training and 30% for testing. The results demonstrated the effectiveness of these models to predict the binary adsorption with the next tendency with values of 0.9984-0.9997, 0.9787-0.9997, and 0.917-0.9850, respectively.

The removal of indigo carmine and safranin-O using nanowires loaded on activated carbon was analyzed by Dastkhoon et al. [137]. Models based on RSM, multilayer ANN, and Doolittle factorization algorithm were tested for this adsorption system. A CCD experimental design of 4 factors and 5 levels with a total of 30 experiments was employed. The ANN model consisted of a three-layer feed-forward network with hyperbolic tangent sigmoid and linear activation functions. Input neurons were the indigo carmine concentration (4–16 mg/L), safranin-O concentration (4–16 mg/L), adsorbent mass (20–40 mg), and sonication time (1–5 min), while the output neuron was the removal of these dyes (71.91–96.32%). Doolittle factorization algorithm consisted of a factorized matrix that contained all the experimental data. Modeling results indicated that ANN offered a better precision in comparison to the other models, although Doolittle factorization algorithm was faster. The sensitivity analysis showed that the sonication time was the most important parameter. The maximum adsorption capacities were 29.09 and 37.85 mg/g for indigo carmine and safranin-O, respectively.

Parveen et al. [138] evaluated the support vector regression, multiple linear regression, and ANN models to predict the chromium adsorption on maize bran waste. The effect of adsorption time (10–180 min), initial adsorbate concentration (200–300 mg/L), pH (1.4–8.5), and temperature (20–40 °C) on the adsorption capacity (mg/g) was analyzed. For the support vector regression model, the Gaussian radial basis function was selected as the kernel function. 124 data were utilized: 80% for training and 20% for testing. ANN with a 4-10-1 topology was used where the experimental data were divided into 65% for training, 15% for validation, and 20% for testing, and the kernel function was used as activation function. Results indicated that the support vector regression model was the best to predict the chromium adsorption capacity with the highest $R^2$ (i.e., 0.9986), followed by ANN () and multiple linear regression (), respectively.

Natural and modified clinoptilolite were tested in the fluoride adsorption from aqueous solutions, and the modeling was performed via the hybridization of ANN with Langmuir and pseudosecond order equations [105]. Specifically, ANN was employed to calculate the parameters of pseudosecond order and Langmuir equations, and the adsorption capacities were determined with these parameters and the corresponding adsorption equation. A feed-forward ANN was used where the input layer contained the temperature, time, and initial fluoride concentration for the adsorption kinetics and the initial fluoride concentration, pH, and temperature for the adsorption isotherms. The output layer corresponded to the adjustable parameters of tested adsorption kinetic and isotherm equations. Experimental data were divided into 70% for training and 30% for validation and testing, where a logistic sigmoid activation function was also utilized. This hybrid ANN model outperformed the classical adsorption equations showing $R^2$ from 0.95 to 0.99. These authors also indicated that the classical equations failed to predict the experimental data in some particular operating conditions. The maximum experimental adsorption capacities of these zeolites were 5.3 and 12.4 mg/g at 40 °C and pH 6, respectively.

Yildiz [139] reported the use of ANN for the modeling of zinc adsorption on peanut shells. Input variables were the initial solution pH, initial zinc concentration, and adsorbent dosage, and the output variable was the adsorbed amount of zinc. ANN with an architecture 3-16-1 and BP were utilized where Matlab® was the software employed in these calculations. 12, 4, and 4 data were used for training, testing, and validation of ANN. Overall, this ANN showed satisfactory results in adsorption data modeling.

Ghosal and Gupta [140] studied the application of ANN and Pareto front analysis for fluoride removal using Al/olivine. The impact of solution pH, agitation rate, temperature, contact time, initial fluoride concentration, and adsorbent dosage was studied. ANN modeling was performed with these input variables, and the output variables were the adsorption capacity and removal efficiency. LM was selected as the training algorithm. Finally, the results of ANN showed and mean square errors of 2.035 and 0.018 for the removal efficiency and adsorption capacity, respectively.

Karri and Sahu [141] tested the use of palm kernel shell-derived activated carbon for the zinc removal. RSM and particle swarm optimization-ANN were compared to obtain the optimal removal. First, RSM and CCD were utilized to correlate the zinc removal with the independent variables: pH (2–8), adsorbent mass (2–20 g/L), initial adsorbate concentration (10–100 mg/L), contact time (15–75 min), and temperature (30–70 °C). Different training algorithms such as LM-BP, gradient descent, resilient BP, and gradient descent with adaptive linear regression were assessed. A feed-forward ANN and PSO were employed to obtain better estimations of this adsorption system. Several learning methods and topologies were analyzed, and the optimal ANN model was obtained with LM-BP training and 5-6-1 topology. Particle swarm optimization and ANN outperformed the RSM approach.

Mendoza-Castillo et al. [28] studied and discussed the advantages and limitations of ANN for the modeling of multicomponent adsorption of heavy metals on bone char. Experimental isotherms of single, binary, ternary, and quaternary solutions of copper, nickel, cadmium, and zinc were quantified experimentally and employed in ANN modeling. A multilayer feed-forward ANN was utilized with 141 data divided in 70% for training, 20% for testing, and 10% for validation. Input layer included the initial concentration of the metals in the solution, while the equilibrium concentration and adsorption capacity were analyzed as the output layer. Experimental results showed that the heavy metal adsorption in single solutions followed the tendency: . The adsorption in multimetallic systems showed an antagonistic effect caused by the presence of other coions. ANN showed a proper fit of multicomponent systems with . However, these results depended on the activation function and selected output variable. Specifically, the use of equilibrium concentration was not recommended because this extensive variable can generate wrong predictions (i.e., desorption behavior not observed in the experimental data) for this adsorption system. These authors concluded that intensive variables such as the adsorption capacity must be utilized in ANN modeling with the aim of generating reliable predictions. Results of this study also revealed that a proper ANN training and architecture are fundamental for a reliable prediction of the complex adsorption behavior in multicomponent systems.
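The unphysical "desorption" predictions mentioned above can be made concrete with the usual batch mass balance, $q = (C_0 - C_e)V/m$: if a model returns an equilibrium concentration higher than the initial one, the implied adsorption capacity becomes negative. The sketch below is only an illustration of that check; the solution volume, adsorbent mass, and concentrations are hypothetical values and are not taken from the reviewed study.

```python
def adsorption_capacity(c0, ce, volume_l, mass_g):
    """Batch mass balance: capacity (mg/g) from initial and equilibrium concentrations (mg/L)."""
    return (c0 - ce) * volume_l / mass_g


def implies_desorption(c0, ce_predicted):
    """True when a predicted equilibrium concentration exceeds the initial concentration."""
    return ce_predicted > c0


# Hypothetical batch conditions: 50 mL of solution and 0.1 g of adsorbent.
volume_l, mass_g = 0.05, 0.1

# A physically consistent prediction: Ce below C0 gives a positive capacity.
q_ok = adsorption_capacity(50.0, 20.0, volume_l, mass_g)

# A small overshoot in predicted Ce gives a negative capacity, i.e., apparent desorption.
q_bad = adsorption_capacity(50.0, 52.0, volume_l, mass_g)

print(q_ok, implies_desorption(50.0, 20.0))
print(q_bad, implies_desorption(50.0, 52.0))
```

Screening predictions with a check of this kind is one simple way to detect when a network trained on equilibrium concentrations has left the physically meaningful region.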

Naderi et al. [142] applied a hybrid model consisting of simulated annealing and ANN to optimize and predict the crystal violet dye removal on centaurea stem. RSM was used to find the best experimental conditions. The maximum adsorption capacity was 476.19 mg/g. ANN with 6-10-1 topology and tangent sigmoid and linear activation functions was employed to model the adsorption data. This network was trained with the feed-forward BP algorithm where 32 data were divided into 80% for training and 20% for validation and testing. Input neurons were pH (5–13), temperature (20–40 °C), contact time (5–25 min), initial dye concentration (20–300 mg/L), and adsorbent dosage (3–15 mg), while the dye removal (%) was the output variable. The $R^2$ values of RSM (0.9942) and simulated annealing-ANN (0.9968) were very similar, but the lowest prediction errors were obtained with the approach based on ANN.

The ultrasonic-assisted binary adsorption of sunset yellow and disulfine blue dyes on oxide nanoparticles loaded on activated carbon was optimized and modeled with RSM and ANN [143]. A total of 26 experiments were performed where the effects of sonication time (6–12 min), adsorbent dosage (0.016–0.030 g), pH (7), and initial dye concentration (8–16 mg/L) on the dye removal percentage were tested. 17 BP algorithms were evaluated with ANN where LM and resilient methods were the best. The performance of these models was statistically compared by considering $R^2$, root mean squared error, mean absolute error, and absolute average deviation. Results showed that ANN () outperformed RSM ().
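The statistical metrics used throughout this section to compare ANN with other models can be computed directly from paired experimental and predicted values. The following sketch uses commonly adopted definitions of the coefficient of determination, root mean squared error, mean absolute error, and absolute average deviation (expressed as a percentage); the reviewed papers may use slightly different conventions, and the data in the example are made-up numbers for illustration.

```python
import math


def r_squared(y_exp, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def rmse(y_exp, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((e - p) ** 2 for e, p in zip(y_exp, y_pred)) / len(y_exp))


def mae(y_exp, y_pred):
    """Mean absolute error."""
    return sum(abs(e - p) for e, p in zip(y_exp, y_pred)) / len(y_exp)


def aad_percent(y_exp, y_pred):
    """Absolute average deviation (%), relative to the experimental values (must be nonzero)."""
    return 100.0 * sum(abs(e - p) / abs(e) for e, p in zip(y_exp, y_pred)) / len(y_exp)


# Made-up removal percentages for illustration only.
experimental = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 30.0]
print(r_squared(experimental, predicted))
print(rmse(experimental, predicted))
print(mae(experimental, predicted))
print(aad_percent(experimental, predicted))
```

Reporting these metrics separately for the training and testing subsets, rather than for the pooled data, gives a clearer picture of predictive capability.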

The modeling of adsorption of salicylic acid on SiO2/Al2O3 nanoparticles was performed by Arshadi et al. [144] with ANN. In this study, the input variables were the initial salicylic acid concentration (5–1000 mg/L), initial solution pH (1–12), contact time (0.25–30 min), temperature (15–80 °C), and adsorbent dosage (0.25–10). The output variable was the adsorption capacity of salicylic acid (mg/g). ANN architecture of 5-12-1 was utilized. Results indicated that the ANN-based simulation of the adsorption of this compound was satisfactory obtaining .

Gadekar and Ahammed [145] tested a hybrid RSM and ANN model in the prediction of blue 79 dye removal using aluminum-based water treatment residuals. RSM was used to identify the optimum experimental conditions to achieve a high dye removal, and these data were employed to train ANN. ANN with 4-4-1 topology, tangent sigmoid, and linear activation functions was used. For ANN training, LM, gradient descent, and scaled conjugate BP algorithms were utilized. 45 data were employed in training (60%), validation (20%), and testing (20%). ANN input layer contained the adsorbent dose (10–30 g/L), initial pH (3–5), initial dye concentration (25–75 mg/L), and final pH (3.01–5.80). Dye removal (31.2-52%) was the ANN output neuron. Results indicated that ANN and RSM were a reliable alternative to predict the removal of this dye.

Ghaedi et al. [146] modeled the simultaneous ultrasonic-assisted ternary adsorption of rose bengal, safranin O, and malachite green dyes on copper oxide nanoparticles supported on activated carbon. ANN with 3 layers was applied where the initial dye concentrations (8–12 mg/L), pH (6–8), adsorbent dosage (0.05–0.025 g), and sonication time (2–4 min) were the input variables, while the output variable was the dye removal percentage (18.2–92.67%). LM and BP were employed as learning methods. Hyperbolic tangent sigmoid and linear functions were utilized at hidden and output layers. High $R^2$ values of ANN (>0.99) revealed a satisfactory fitting of tested experimental data.

The multicomponent adsorption of nitrobenzene, phenol, and aniline from a ternary aqueous system using granulated activated carbon was studied by Jadhav and Srivastava [147]. ANN was tested with BP and different activation functions. The equilibrium concentrations of nitrobenzene (0.003–0.8 mmol/L), aniline (0.01–1.6 mmol/L), and phenol (0.01–1.8 mmol/L) were the input variables, and the adsorption capacities were the output variables. Adsorption data were divided into 50% for training and 50% for testing. $R^2$ and mean squared errors were used to verify the model performance. ANN model accurately predicted () the ternary adsorption in comparison to other models.

Similarly, Nasab et al. [148] proposed a hybrid model consisting of genetic algorithm and ANN to predict the adsorption of crystal violet on chitosan/nanodiopside. CCD with 5 levels and 4 factors (30 experiments) was chosen to obtain the optimal dye removal. The input variables for the ANN model were pH (4.5–8.5), contact time (15–55 min), initial dye concentration (15–35 mg/L), and adsorbent amount (0.001–0.01 g), while the output variable was the dye removal (%). Feed-forward ANN with hyperbolic tangent sigmoid and linear activation functions and LM algorithm were employed. 70, 15, and 15% of experimental data were used in training, validation, and testing, respectively. The maximum dye removal was 99.5%. Genetic algorithm was applied to identify the optimal factors for obtaining the maximum adsorption. Results of ANN-genetic algorithm showed a higher $R^2$ (0.9708) than that of RSM (0.9652). Overall, both approaches provided accurate dye removal percentages.

Sharafi et al. [149] reported the phenol adsorption from aqueous solution using scoria stone modified with different acids (e.g., nitric, acetic, and phosphoric). Modeling of adsorption data was performed with RSM and ANN. Clonal selection algorithm was used with ANN modeling where the input variables were the phenol concentration, adsorbent dosage, and contact time. The output variable was the phenol removal. Overall, both RSM and ANN showed satisfactory results in data correlation.

Sadeghizadeh et al. [150] used ANFIS to predict the lead adsorption with a hydroxyapatite/chitosan nanocomposite. This adsorbent showed a maximum adsorption capacity of 225 mg/g, and the removal process was endothermic. Concerning the data modeling, the input variables were temperature (25–55 °C), adsorption time (15–360 min), shaker velocity (80–400 rpm), adsorbent amount (0.05–1.5 g), initial lead concentration (0–5000 mg/L), pH (2–6), and hydroxyapatite concentration (15–75%). The output variable was the lead adsorption capacity (mg/g), and 57 experimental data were modeled (38 for training and 19 for testing). ANFIS was able to predict the lead adsorption.

Takdastan et al. [6] used ANN to model the cadmium adsorption on modified oak waste. Kinetics and isotherms were quantified to characterize the effect of the adsorption operating conditions. Experimental results revealed that the adsorption increased with temperature, initial concentration, adsorbent dosage, and pH. Isotherms were modeled with the Liu, Temkin, Redlich-Peterson, Freundlich, and Langmuir equations, while the kinetics were fitted to the intraparticle diffusion, pseudosecond order, pseudofirst order, Elovich, and Avrami fractional order equations. The raw and NaOH-modified adsorbents had maximum adsorption capacities of 155.9 and 771.4 mg/g, respectively. A feed-forward BP-ANN was applied using pH (2–8), contact time (5–240 min), adsorbent dosage (0.1–10 g/L), cadmium initial concentration (25–100 mg/L), and temperature (10–40 °C) in the input layer with a hidden layer of 8 neurons, and the cadmium removal (16–92.4% for the raw adsorbent, ROW, and 26–99.5% for the modified adsorbent, AOW) was in the output ANN layer. A total of 219 experimental data were employed in training (153), validation (33), and testing (33). According to the ANN modeling, pH had the highest impact on cadmium removal, while the adsorption temperature showed only a slight effect.
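The 153/33/33 partition reported for this study corresponds to an approximate 70/15/15% division of the 219 data. A minimal Python sketch of such a random partition is given below. The function name, the fixed seed, and the rounding rule are assumptions made for illustration and are not taken from the cited study.

```python
import random

def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training, validation, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed only for reproducibility
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to testing
    return train, val, test

train, val, test = split_indices(219)
print(len(train), len(val), len(test))
```

Assigning the remainder to the testing set guarantees that every datum is used exactly once, whatever the rounding of the first two fractions.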

ANN were used to model the lead adsorption on rice husks treated with HNO3 [151]. Specifically, a feed-forward BP-ANN and LM training were used. The input variables of ANN were the adsorbent dosage, initial lead concentration, and contact time, and the output variable was the lead adsorption capacity. These authors concluded that the adsorption modeling with ANN was effective.

The adsorption of 6 heavy metals (arsenic, nickel, cadmium, lead, zinc, and copper) on 44 biochars obtained from lignocellulosic feedstocks was modeled using a multilayer ANN and random forest [152]. A total of 353 adsorption data were collected from the literature, and 14 input variables were studied and divided into 4 sets (adsorbent properties, initial heavy metal concentration, operational conditions, and heavy metal properties), while the removal efficiency was the output variable. The ANN architecture included 14 input neurons, 8–28 hidden neurons, and 1 output neuron with a sigmoid activation function. Results showed that random forest outperformed the ANN by 28%. It was concluded that biochar characteristics were the most important variables in heavy metal adsorption. Surface area did not show a significant impact on the metal removal.

Afolabi et al. [47] reported the use of ANN to model the pseudosecond order kinetics of the paracetamol adsorption using orange peel-activated carbon. The experimental conditions used as input variables were the initial paracetamol concentration (10–50 mg/L), contact time (0–330 min), and temperature (30–50 °C), and the output variable was the pseudosecond order kinetics. ANN with different hidden neurons, training algorithms, and activation functions were used for the data modeling. A total of 495 data were used (i.e., 330 for training and 165 for testing). Results showed the impact of training algorithms and activation functions on the ANN performance.

The removal of lead, cadmium, nickel, and zinc using a natural zeolite was modeled with ANN, multivariate nonlinear regression, particle swarm optimization-adaptive neuro-fuzzy inference system, genetic programming (GP), and the least squares support-vector machine [10]. The input modeling variables were the initial and equilibrium solution pH, silica concentration, molecular weight, first ionization energy, hydrated ionic radii, and electronegativity of the tested metals. The heavy metal adsorption capacity of the zeolite was the output variable. Results showed that ANN outperformed traditional adsorption equations. Other tested models also offered a satisfactory data correlation.

Gopinath et al. [29] proposed the use of ANN with a homogeneous surface diffusion model to analyze the single, binary, and ternary adsorption kinetics of acid orange, acid blue, caffeine, acetaminophen, and benzotriazole on activated carbon. The mass transfer model considered bulk diffusion in the fluid phase and surface diffusion via the internal adsorbent structure. Note that these phenomena are not considered by the conventional pseudosecond and pseudofirst order models. A feed-forward ANN with 5 layers was utilized. Input variables were the type of adsorbent (carbon labelled and active char products), pH (3–8), temperature (25–45 °C), initial concentration (100–300 mg/L), and ratio of mass/volume (0.8–2 g/L), while the output variable was the removal efficiency (%). Tangent sigmoid and linear activation functions were used. Datasets were distributed in 90, 5, and 5% for training, testing, and validation, respectively, and ANN was trained with the LM algorithm. R2 values of 0.999, 0.986, and 0.993 were obtained using the mass transfer model and ANN for single, binary, and ternary systems, respectively. Results of this study proved the advantages of ANN in the simulation of multicomponent adsorption kinetics considering more complex models based on mass transfer phenomena.

The treatment of water polluted with atenolol, ciprofloxacin, and diazepam in the presence of COD and ammonia was performed in a sequencing batch reactor with a composite adsorbent consisting of bentonite, zeolite, biochar, and cockleshell mixed with Portland cement [11]. Contact time (2–24 h) and initial pharmaceutical concentration (1–5 mg/L) were the input variables for ANN modeling, while the output variable was the pharmaceutical removal (90.3% for atenolol, 95.5% for ciprofloxacin, and 95.6% for diazepam). An ANN with three layers (2-5-1) was utilized, and the model performance was assessed using the mean squared sum errors and R2. Data were divided into training (60%), validation (20%), and testing (20%), and LM was used for ANN training.

The single and competitive adsorption of acid blue 9 and allura red AC on chitosan-based hybrid hydrogels were modeled with ANN [7]. Experimental data indicated that acid blue 9 was better adsorbed than allura red AC on five adsorbents. In binary dye solutions, an antagonistic adsorption was observed. The input layer of the ANN comprised the initial concentrations of both dyes (0–0.126 mmol/L and 0–0.201 mmol/L for acid blue 9 and allura red AC, respectively), carbonaceous mass percentage of the adsorbent (0–10% g/g), adsorbent porosity (0.724–0.880), and contact time (0–200 min). The experimental data were divided into training (70%), validation (15%), and testing (15%). Several topologies were investigated, and the best ANN architecture was 5-10-10-10-2 with the tangent sigmoid activation function. The best ANN showed a root mean square error of 0.119, thus indicating that ANN could be an effective model to predict the adsorption of dyes by these hybrid hydrogels.
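The size of a topology such as 5-10-10-10-2 can be compared with the amount of available data by counting its adjustable parameters. The following Python sketch performs this count under the assumption of a fully connected feed-forward network with one bias per neuron, which is a common convention but is not stated in the cited study. The function name is illustrative.

```python
def count_parameters(topology):
    """Count weights and biases of a fully connected feed-forward network."""
    total = 0
    for n_in, n_out in zip(topology[:-1], topology[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per neuron
    return total

n_params = count_parameters([5, 10, 10, 10, 2])
print(n_params)  # 60 + 110 + 110 + 22 = 302
```

A count of this type is a simple check against overfitting, since the number of training data should be comfortably larger than the number of adjustable parameters.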

Franco et al. [153] applied the ANN and ANFIS to analyze the indium adsorption on 10 adsorbents: commercial activated carbon, multiwalled carbon nanotubes, chitin, chitosan, and other lignocellulosic agroindustrial wastes. The indium adsorption capacities of these materials ranged from 8.20 to 1000 mg/g. Modeling was performed considering the following input variables: specific surface area (0.85–200.40 m2/g), pH of point of zero charge (4.5–7), adsorbent dosage (0.05–2.0 g/L), and contact time (5–120 min). The output variable was the indium adsorption capacity. A total of 1200 data were employed in the modeling, with 70% for training and 30% for testing and/or validation. ANFIS utilized the Sugeno type with 4 hidden layers, while ANN was used with a 4-4-1 topology. Both models were capable of predicting the adsorption data.

Nayak and Pal [154] employed ANN for the prediction of nile blue A dye adsorption with overripe Abelmoschus esculentus seeds. A CCD with 31 experiments was utilized to optimize the dye adsorption where the effect of adsorbent dosage (1–9 g/L), initial adsorbate concentration (140–750 mg/L), pH (2–9), and contact time (5–125 min) was tested. The maximum dye adsorption capacity was 71.78 mg/g. ANN with 3 layers and the BP algorithm was used. The ANN architecture included 4 input neurons (contact time, pH, initial dye concentration, and adsorbent dosage), one output neuron (adsorption capacity), and 12 hidden neurons. The 31 experiments (including 16 factorial points, 8 axial points, and 7 replicates) were divided into training (70%), validation (15%), and testing (15%), and the tangent sigmoid activation function was used. R2 and modeling errors were the metrics to analyze the ANN performance. Sensitivity analysis demonstrated that pH and contact time were the most important parameters in this adsorption system. This adsorbent showed a maximum adsorption capacity of 105 mg/g according to the experimental isotherms.
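The composition of this 31-run design (16 factorial points, 8 axial points, and 7 replicates) follows directly from the structure of a CCD with 4 factors. The Python sketch below generates the design in coded units. The axial distance of 2 (the rotatable value for 4 factors) and the function name are assumptions for illustration, since the cited study does not report them here.

```python
from itertools import product

def central_composite_design(n_factors, n_center, alpha=2.0):
    """Return coded CCD points: factorial corners, axial points, and center replicates."""
    # Two-level full factorial: every combination of -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial (star) points: one factor at -alpha or +alpha, the others at the center.
    axial = []
    for i in range(n_factors):
        for level in (-alpha, alpha):
            point = [0.0] * n_factors
            point[i] = level
            axial.append(point)
    # Replicated center points, used to estimate the experimental error.
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center

design = central_composite_design(n_factors=4, n_center=7)
print(len(design))  # 16 + 8 + 7 = 31
```

Each coded level would then be mapped linearly onto the tested range of the corresponding factor before the experiments are run.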

Thirunavukkarasu and Nithya [155] reported the removal of acid orange 7 using CaO/CeO2 and its modeling via RSM and ANN. The input variables for ANN were the adsorption temperature (301–338 K), contact time (0–300 min), initial concentration of acid orange 7 (10–50 mg/L), adsorbent dose (0.02–0.2 g), and initial solution pH (2–12). The output variable was the dye removal (%). LM and BP were used in ANN training where the best architecture was 5-10-1. ANN results indicated a satisfactory modeling with a root mean square error of 0.3020.
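The statistical metrics cited throughout this section (mean squared error, root mean square error, and the coefficient of determination) can be computed as in the following Python sketch. The observed and predicted values are hypothetical dye removal percentages chosen only to illustrate the calculation, and they are not data from any cited study.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, expressed in the units of the output variable."""
    return math.sqrt(mse(y_true, y_pred))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical removal percentages (assumed values for illustration only).
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [22.0, 38.0, 61.0, 79.0]
print(mse(observed, predicted), rmse(observed, predicted), r_squared(observed, predicted))
```

Because the error metrics depend on the scale of the output variable, values reported for removal percentages and for normalized outputs are not directly comparable between studies.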

Qi et al. [156] employed RSM, ANN-genetic algorithm, and ANN-particle swarm optimization to analyze the methylene blue adsorption on mesoporous rGO/Fe/Co nanohybrids. The effect of pH (2–6), temperature (20–40 °C), contact time (3–15 min), and initial dye concentration (200–600 mg/L) on the dye adsorption was analyzed using a CCD consisting of 30 experiments. The experimental results showed that the nanohybrids achieved a maximum dye removal of 89.41%, while a maximum adsorption capacity of 909.1 mg/g was obtained from the Langmuir isotherm. ANN with 3 layers, the BP algorithm, and linear and tangent sigmoid activation functions was utilized. The absolute errors between the experimental and predicted values were 2.88, 0.52, and 1.35 for CCD, ANN-particle swarm optimization, and ANN-genetic algorithm, respectively. Therefore, ANN-particle swarm optimization was the best option for this adsorption system.

In the study of Samadi-Maybodi and Nikou [79], ANN was used to predict the sarafloxacin adsorption on the magnetized metal-organic framework Fe3O4/MIL-101(Fe). RSM with a CCD of 30 experiments was used to optimize the removal efficiency, obtaining a maximum value of 88.26%. A multilayer ANN with three layers was employed where the input variables were the solution pH (3–11), initial concentration (10–50 mg/L), adsorbent dosage (5–25 mg), and contact time (15–45 min), while the removal percentage (35.73–88.26%) was the output variable. LM was the training method with the hyperbolic tangent sigmoid function for input to hidden layers and the linear transfer function for hidden to output layers, while ANN was assessed using R2 and mean squared error. Overall, ANN was reliable for predicting the sarafloxacin removal.

Netto et al. [157] applied ANFIS and ANN to model the adsorption equilibrium of silver, cobalt, and copper on three zeolites: ZSM-5, ZHY, and Z4A. Adsorption experiments were conducted at different temperatures. The input variables for the models were the Si/Al ratios of zeolites (50:50, 71:29, 90:3), molecular weights of metal ions (58.93–107.87 g/mol), temperature (298–328 K), and initial adsorbate concentration (0–300 mg/L), while the equilibrium adsorption capacities of these metals were the output variables. ANN was tested with two training functions (LM-BP and Bayesian regularization BP). The linear and hyperbolic tangent sigmoid functions were utilized. For the case of ANFIS, the Gaussian curve was the input function, and the tune Sugeno-type was used for training. A total of 324 experimental data were divided into 85% for training and 15% for testing. The performance of ANN and ANFIS was analyzed with different statistical metrics. Overall, both models accurately predicted the adsorption data, with ANFIS being slightly better. Z4A zeolite showed the best adsorption capacities, and silver was more adsorbed than cobalt and copper.

Other recent studies on the ANN modeling of adsorption isotherms and kinetics include the fluoride adsorption on rice husk-derived biochar modified with Fe or Zn [158], the removal of brilliant green dye using mesoporous Pd–Fe magnetic nanoparticles immobilized on reduced graphene oxide [15], the adsorption of diazinon pesticide on a magnetic composite clay/graphene oxide/Fe3O4 [159], the removal of crystal violet and methylene blue on magnetic iron oxide nanoparticles loaded with cocoa pod carbon composite [160], the arsenide removal employing mesoporous CoFe2O4/graphene oxide nanocomposites [161], the adsorption of perfluorooctanoic acid on copper nanoparticles and fluorine-modified graphene aerogel [17], the uptake of dicamba (3,6-dichloro-2-methoxy benzoic acid) by MIL-101(Cr) metal-organic framework [16], the phosphorus adsorption on polyaluminum chloride water treatment residuals [162], the use of iron doped-rice husk for the chromium adsorption/reduction [163], the removal of methyl orange dye by an activated carbon derived from Acalypha indica leaves [164], the lead adsorption by a hydrochar obtained from the KOH activated Crocus sativus petals [165], the adsorption of the cefixime antibiotic using magnetic composite beads of reduced graphene oxide-chitosan [13], the use of graphene oxide-cyanuric acid nanocomposite for the lead adsorption [14], the arsenic removal by an adsorbent consisting of iron oxide incorporated carbonaceous nanomaterial derived from waste molasses [12], the fluoride adsorption by chemically activated carbon prepared from industrial paper waste [18], the methylene blue adsorption with polyvinyl alcohol/carboxymethyl cellulose-based hydrogels [166], the modeling of adsorption properties of biochar and resin for the removal of organic compounds [167], and the removal of lead from water with a magnetic nanocomposite [168].

3.2. Breakthrough Curves

The dynamic adsorption experiments provide important engineering information about the adsorption process, especially for real-scale applications. Breakthrough curves are commonly represented via the ratio of effluent adsorbate(s) concentration(s) to feed adsorbate(s) concentration(s) (i.e., C/C0) versus the operating time or treated volume. These curves characterize the adsorbent performance at dynamic operating conditions. Overall, the breakthrough curves of water pollutants can correspond to symmetric and asymmetric profiles depending on the process operating conditions (e.g., feed flow, residence time, and column length) and the impact of mass transfer phenomena. The modeling of asymmetrical breakthrough curves is more challenging because conventional models like the Thomas and Yan equations were developed to handle the ideal “S” profile expected and desired for adsorption columns. Therefore, ANN have been utilized to improve the correlation and prediction of breakthrough curves of the adsorption of water pollutants. ANN modeling of breakthrough curves has covered the adsorption of fluoride, dyes, heavy metals, pesticides, organic compounds, and phosphates with bone char, activated carbon, graphene, biochar, zeolites, biomasses, composites, nanomaterials, and other adsorbents like agroindustrial wastes. Different operating conditions such as temperature (15–50 °C), feed flow (0.5–30 mL/min), and pH (2–9) have been tested in the modeling of breakthrough curves in aqueous solutions with one or more adsorbates. Details of several studies on the ANN modeling of dynamic adsorption processes for different pollutants and adsorbents are shown in Tables 3 and 4. A brief description of the main findings and representative studies of the ANN-based breakthrough adsorption modeling is provided in this subsection.
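The limitation of the conventional equations can be illustrated with the Thomas model in its usual form, C/C0 = 1/(1 + exp(kTh·q0·m/Q − kTh·C0·t)), where C/C0 is the effluent-to-feed concentration ratio. This expression is a logistic function of time, so it can only generate a symmetric "S" profile centered at t = q0·m/(Q·C0). The Python sketch below evaluates this profile. All parameter values are hypothetical and serve only to show the symmetry, and the function and variable names are illustrative.

```python
import math

def thomas_ratio(t, k_th, q0, mass, flow, c0):
    """Return C/C0 predicted by the Thomas equation at operating time t."""
    exponent = k_th * q0 * mass / flow - k_th * c0 * t
    return 1.0 / (1.0 + math.exp(exponent))

# Hypothetical parameters: k_th in L/(mg h), q0 in mg/g, mass in g,
# flow in L/h, and c0 in mg/L (assumed values, not taken from any study).
params = dict(k_th=0.01, q0=5.0, mass=10.0, flow=0.2, c0=20.0)

# Time at which the Thomas profile reaches C/C0 = 0.5.
t_half = params["q0"] * params["mass"] / (params["flow"] * params["c0"])

for t in (0.0, 6.25, t_half, 18.75, 25.0):
    print(f"t = {t:5.2f} h   C/C0 = {thomas_ratio(t, **params):.3f}")
```

The values at equal distances before and after the midpoint always add up to one, which is the symmetry that experimental breakthrough curves with long tails do not satisfy.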

Texier et al. [169] proposed the application of a multilayer ANN to model the breakthrough curves of the adsorption of lanthanide ions (La, Eu, and Yb) using an immobilized Pseudomonas aeruginosa in polyacrylamide gel and a fixed-bed adsorber. The effect of superficial liquid velocity (0.76–2.29 m/h), particle size (125–500 μm), influent concentration (2–6 mM), and bed depth (250–400 mm) on the adsorption capacities was analyzed. Experimental breakthrough curves showed that the maximum bed adsorption capacities were 208 μmol/g for La, 219 μmol/g for Eu, and 192 μmol/g for Yb in single aqueous solutions. ANN modeling was performed considering the following input variables: initial concentration (2–6 mM), bed depth (250 and 330 mm), operating time (min), and the modified Reynolds number. The effluent-to-feed concentration ratio (C/C0) was the output variable. The BP algorithm was used in ANN training where the activation function of the hidden layer was the hyperbolic tangent. Training and validation were carried out with 392 adsorption data, and 40 additional data were used for testing the ANN performance. Root mean square error was used as the statistical metric to analyze the calculations with ANN. Results showed that the prediction ability of ANN was satisfactory for the first zone of the breakthrough curve, which corresponded to the zone before the breakthrough point. These authors also concluded that it would be necessary to extend the experimental column database to improve the ANN performance with the objective of reliably predicting all the zones of breakthrough curves.

Park et al. [170] modeled the breakthrough curves of chromium adsorption using a column packed with brown seaweed Ecklonia biomass. These authors discussed the effect of the operating parameters (feed concentration, initial concentration, pH, flow rate, and temperature) on the adsorption of this priority water pollutant. The experimental results showed that this biomass achieved an adsorption capacity of 50.2 mg/g by the 274th bed. Chromium adsorption reduced with pH decrements, and this removal process was endothermic. ANN modeling was done with the following input variables: influent chromium concentration (100–200 mg/L), biomass concentration (70–140 g/L), pH (2), flow rate (10–20 mL/min), temperature (25–45 °C), and the bed number (i.e., flow rate × operating time/total column volume). The chromium concentration of the treated fluid (0–200 mg/L) was utilized in the output ANN layer. A total of 127 data were utilized for obtaining the ANN model with the hyperbolic tangent function in the hidden layers and a linear function in the output layer. The performance of the feed-forward BP-ANN was assessed. Root mean square errors ranged from 2.52 to 3.20, thus indicating that ANN successfully modeled the breakthrough curves of chromium adsorption.

The adsorption breakthrough curves of 3 pesticides (namely, atrazine, atrazine-desethyl, and triflusulfuron-methyl) using 5 commercial activated carbon filters were modeled by Faur et al. [171]. Experimental isotherms of pesticides in aqueous solutions and natural waters were obtained for single adsorption and for competitive adsorption between pesticide and natural organic matter. In a second stage, the breakthrough curves of pesticides were quantified for solutions with only one adsorbate. A total of 15 variables were identified and ranked, in order of decreasing relevance and impact on the pesticide adsorption, by Gram-Schmidt orthogonalization. A static feed-forward ANN was used with the following input variables: micropore volume (%), mesopore volume (cm3/g), solubility (g/L), molecular weight (g/mol), initial concentration (mg/L), initial total organic carbon, flow velocity (m/h), time (min), the Freundlich constants, and elimination of natural organic matter (%). Also, a recurrent ANN was used with the following inputs: solubility (g/L), molecular weight (g/mol), initial concentration (mg/L), initial total organic carbon, secondary micropore volume (%), and the isotherm constants. In both models, the output variable was the effluent-to-feed concentration ratio (C/C0). A total of 9749 data were employed and distributed in 67% for training and model selection and 33% for the final testing. ANN provided reliable predictions. However, the recurrent ANN outperformed the static ANN, particularly for the breakthrough and saturation zones. Note that this behavior was expected since the dynamic character of the adsorption process was considered in the recurrent ANN. Operating conditions and pesticide properties exhibited a significant impact on this adsorption process according to both ANN models, while the adsorbent properties showed a low impact on the pesticide removal.

Balci et al. [172] used the bed depth service time model and multilayer ANN for the correlation of breakthrough curves of the adsorption of reactive black 5 and basic blue 41 on Eucalyptus camaldulensis barks. The input variables were the volume of water treated (0.04–32.4 L), bed depth (5–20 cm), and dye concentration (100-400 mg/L), while the output variable was the final concentration of treated water (mg/L). These authors proposed a multilayer perceptron ANN with 3-5-1 architecture to model this adsorption process. This ANN was able to fit the adsorption data showing low modeling errors. They concluded that ANN was effective in the modeling, prediction, and estimation of adsorption processes.

Cavas et al. [173] applied the Thomas equation and ANN for modeling the breakthrough curves of methylene blue adsorption via Posidonia oceanica dead leaves. These authors used a multilayer feed-forward ANN with the LM algorithm. Data modeling was performed considering the following input variables: bed height (3–9 cm), flow rate (3.64–7.28 mL/min), and time (min), while the effluent methylene blue concentration was the output. A total of 1215 experimental data were used to train and test the performance of ANN with a hyperbolic tangent sigmoid activation function. Results showed that this ANN outperformed the Thomas equation with a mean square error lower than 0.001453.

Tovar-Gómez et al. [57] applied a hybrid neural network model and conventional adsorption models to predict the breakthrough curves of fluoride adsorption on two commercial bone chars. This study introduced the development of a hybrid approach based on ANN to improve the prediction of breakthrough curves with traditional adsorption equations. In particular, ANN was used to estimate the parameters of the Thomas equation, and these estimated parameters were used to calculate the corresponding concentration profiles of asymmetric breakthrough curves. Input data for the hybrid ANN were the feed fluoride concentration (9–40 mg/L), column operating time (12.7–178.0 h), and feed flow rate (0.198–0.396 L/h). The output variable was the effluent-to-feed concentration ratio (C/C0) for fluoride adsorption. The experimental data of the two bone chars (186 and 198 data, respectively) were divided into 75% for training and the remaining 25% for verification and testing. A feed-forward ANN with two hidden layers with 18 neurons was chosen. This study utilized the classical BP algorithm for ANN training and the sigmoid activation function. This hybrid Thomas-ANN model outperformed the traditional Thomas equation, showing the lowest mean square error and thus reflecting its better accuracy for correlating and predicting the fluoride adsorption breakthrough curves. This study opened the possibilities of improving the performance of traditional breakthrough equations via their hybridization with ANN.
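One possible reading of this hybrid scheme is sketched below in Python: a small network maps the operating conditions to the two Thomas parameters, which are then inserted into the Thomas equation to obtain C/C0. This is only an illustration of the idea and not the implementation of the cited study. The single hidden layer with 4 neurons, the untrained random weights, the softplus transformation used to keep the parameters positive, the adsorbent mass, and all function names are assumptions.

```python
import math
import random

def thomas_ratio(t, k_th, q0, mass, flow, c0):
    """Thomas equation for C/C0; the exponent is clipped to avoid overflow."""
    exponent = k_th * q0 * mass / flow - k_th * c0 * t
    exponent = max(-700.0, min(700.0, exponent))
    return 1.0 / (1.0 + math.exp(exponent))

def softplus(x):
    """Smooth positive mapping log(1 + exp(x)), written in a numerically stable form."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def tiny_ann(inputs, w_hidden, b_hidden, w_out, b_out):
    """One sigmoid hidden layer and two outputs mapped to positive (k_th, q0)."""
    hidden = []
    for row, b in zip(w_hidden, b_hidden):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        hidden.append(1.0 / (1.0 + math.exp(-z)))
    raw = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w_out, b_out)]
    return softplus(raw[0]), softplus(raw[1])

# Untrained random weights (hypothetical); a real model would fit them by BP.
rng = random.Random(0)
n_hidden = 4  # assumed size; the cited study used two hidden layers with 18 neurons
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(2)]
b_out = [rng.uniform(-1, 1) for _ in range(2)]

# One operating point inside the reported ranges; the mass is an assumed value.
c0, t, flow, mass = 20.0, 50.0, 0.3, 10.0  # mg/L, h, L/h, g
scaled = [c0 / 40.0, t / 178.0, flow / 0.396]  # inputs scaled by the reported upper bounds
k_th, q0 = tiny_ann(scaled, w_hidden, b_hidden, w_out, b_out)
ratio = thomas_ratio(t, k_th, q0, mass, flow, c0)
print(k_th, q0, ratio)
```

Because the estimated parameters may change with the operating time and conditions, the resulting profile is no longer restricted to the symmetric shape of the Thomas equation with constant parameters.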

A comparison between a hybrid Freundlich and wave propagation model and ANN was carried out to simulate the breakthrough curves of cesium and strontium on a montmorillonite-iron oxide composite [42]. For ANN modeling, the input variables were the column operating time (min), feed concentration (2–50 mg/L), bed height (5–15 cm), and feed flow rate (0.5–8 mL/min), while the output variable was the outlet concentration. The LM algorithm was applied in ANN training. ANN showed root mean square error values of 0.321–0.561 for the modeling of breakthrough curves of cesium and strontium adsorption.

A three-layer feed-forward BP-ANN was applied to model the adsorption of phosphate by a hydrated ferric oxide-based nanocomposite in a fixed bed column [174]. ANN with tangent sigmoid and linear activation functions was able to predict the performance of this adsorption system. Input variables were pH (3–9), sulfate concentration (0.42–1.68 mM), phosphate (0.042–0.084 mM), and temperature (15–35 °C). Removal efficiency (%) was the output variable. A feed-forward BP-ANN with 3 layers, 20 neurons, quasi-Newton training algorithm, and logistic sigmoid activation function was utilized. Overall, this surrogate model was suitable to predict the breakthrough curves, but it failed to follow the trend of some experimental data. The authors concluded that high-quality experimental data were required to obtain reliable predictions of dynamic adsorption systems. However, the characteristics of the studied breakthrough curves were well described by this ANN.

Masomi et al. [175] studied the dynamic adsorption of 4-nitrophenol, 2-chlorophenol, and phenol using activated carbon obtained from pulp and paper mill sludge, and ANN was also applied to model this removal process. The experimental variables were bed height (2, 4, and 6 cm equivalent to 0.1, 0.2, and 0.3 g of activated carbon), feed flow rate (2, 3.5, and 5 mL/min), feed concentration (50–400 mg/L), and temperature (20, 35, and 50 °C). The effluent-to-feed concentration ratio (C/C0) from the breakthrough curves was the output variable for ANN. An architecture with several hidden layers and neurons was employed. A total of 106 data were utilized for training, 20 for testing, and 20 for validation. The authors concluded that ANN satisfactorily predicted the dynamic adsorption of phenol compounds.

Rojas-Mayorga et al. [176] performed a comparative study of the prediction of asymmetric breakthrough curves of fluoride adsorption on a modified bone char. The traditional models of Yan and Thomas, a mass transfer model, and ANN were assessed. ANN input variables were the operating time (8–24 h), fluoride feed concentration (10–100 mg/L), and flow rate (0.18–0.36 L/h), while the output variable was the effluent-to-feed concentration ratio (C/C0) for fluoride removal. A total of 948 experimental data were divided into 50, 25, and 25% for training, validation, and testing. Modeling results showed that ANN outperformed the other models in predicting these breakthrough curves. Due to the asymmetry of fluoride breakthrough curves, the Thomas and Yan equations showed the worst fitting. In fact, the main advantage of ANN relied on its capabilities to model asymmetric breakthrough curves that commonly occur during water treatment.

Reynel-Avila et al. [38] applied an ANN model to analyze and characterize the adsorption of anionic dyes (i.e., reactive blue 4, acid blue 74, and acid blue 25) using fixed-bed columns packed with bone char. ANN modeling was performed considering the dye feed concentration (50–300 mg/L), column operating time (2–448 min), molecular weight of the dye (g/mol), adsorption temperature (30–40 °C), and dye molecular dimensions (Å) as the input neurons. The output neuron was associated with the profile of the breakthrough curves. Adsorption experimental data and the BP algorithm were used for training (70%), validation (15%), and testing (15%). Experimental results indicated that the maximum adsorption capacities of bone char were 34.91, 32.2, and 27.9 mg/g for acid blue 25, acid blue 74, and reactive blue 4 molecules, respectively. ANN was reliable to correlate the adsorption profile of packed bed columns. In particular, the molecular dimensions of dyes were relevant in the dynamic adsorption with this adsorbent.

A stratified adsorption column packed with bone char was used for the binary adsorption of cadmium and zinc where the data modeling was performed with ANN [177]. Results showed that the use of this adsorber configuration reduced the antagonistic effects present in binary metallic systems and outperformed the conventional fixed-bed columns. A feed-forward BP-ANN was applied to model the binary breakthrough adsorption curves. Input variables were the molecular weight (g/mol), electronegativity, and hydrated ionic radius (Å) of heavy metals, feed concentration of both adsorbates (100–200 mg/L), feed flow rate (4–6 mL/min), stratified bed length (5–15 g), and the column operating time (0–750 min), while the concentration profiles of both metals were the output variables. ANN was able to fit the highly asymmetric behavior of cadmium and zinc breakthrough curves. These authors indicated that the breakthrough zone was challenging because ANN showed the highest modeling errors in this zone. This study highlighted a limitation of ANN to model asymmetric breakthrough curves in multicomponent adsorption systems.

Gordillo et al. [178] reported the study of dynamic fuzzy ANN for the simulation of the fixed bed adsorption of zinc, nickel, and cadmium on bone char in single and bimetallic systems. Experimental dynamic adsorption studies were performed at pH 5 and 30 °C with feed concentrations of 2–60 mg/L in single and binary systems. Breakthrough curves were employed to calculate several parameters of fixed-bed columns. The modeling of concentration profiles via ANN considered the following input variables: initial feed concentration, hydration energy, electronegativity, hydrated ionic radii, and molecular weight of the tested metals besides the column operating time. The output variables were the effluent-to-feed concentration ratios (C/C0) of both metals. A total of 3 hidden layers were employed in the ANN architecture where 70% of experimental data were utilized for training, 15% for validation, and 15% for testing. Results of this study indicated that this ANN was effective to represent the main characteristics and behavior of the breakthrough curves in the heavy metal adsorption in single and binary systems with antagonistic adsorption.

Liu et al. [179] performed the ANN modeling of a collection of experimental data reported in the literature for the adsorption of copper, chromium, and methylene blue on different waste residues (i.e., rice husk, tamarind fruit shell, and catla fish scales) using a rotating packed bed. Cascade-forward BP-ANN, Elman BP-ANN, and feed-forward BP-ANN were employed in this study. Experimental data were divided into 82 and 18% for training and testing, respectively. The input variables were the Reynolds number, ratio of contact time to maximum contact time, average high gravity factor, ratio of particle size to bed height, and ratio of feed concentration to packing density. The ratio of adsorption capacity at a given time to the maximum adsorption capacity was the output variable. The hyperbolic tangent sigmoid function and a topology with 5 neurons in the hidden layer were utilized. Feed-forward BP-ANN showed the highest R2 values and the best accuracy, followed by cascade-forward BP-ANN and Elman BP-ANN.

Moreno-Pérez et al. [41] analyzed and discussed the capabilities and limitations of feed-forward BP-ANN, feed-forward BP-ANN with distributed time delay, cascade forward ANN, and Elman ANN for the modeling of multicomponent adsorption of heavy metals on bone char. The dynamic adsorption of these heavy metals generated asymmetric breakthrough curves, which were difficult to model with traditional adsorption equations. Twenty breakthrough curves were obtained for the adsorption of zinc, nickel, copper, and cadmium and their combinations in multimetallic solutions. The initial concentration of the column feed (0.52–0.85 mmol/L) and the column operating time (0–8 h) were the input variables, while the concentration profiles were the output variable. A total of 1420 experimental data were divided into 70, 15, and 15% for training, validation, and testing of these ANN models. LM, Bayesian regularization, and scaled conjugate gradient were used and assessed as training algorithms. Experimental results showed that the highest adsorption capacities were obtained for copper in single and multicomponent solutions, which were 2.15–5.14 mmol/g. An antagonistic adsorption was identified in the solutions containing two or more heavy metals, which competed for the binding sites of the adsorbent surface. ANN performance depended on the hidden layers and their neurons, activation function, and training algorithm. Cascade forward ANN outperformed the other tested ANN models. Note that feed-forward BP-ANN is the most used ANN in adsorption literature, but it could fail in the modeling of highly asymmetric breakthrough curves of both single and multicomponent systems.
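The structural difference between two of the compared architectures can be illustrated with a short Python sketch. In its usual definition, a cascade forward network adds direct connections from the input to the later layers, so the output receives both the hidden activations and the raw inputs. The example below uses two scaled inputs (feed concentration and operating time), one hidden layer of 3 neurons, and arbitrary weights. All of these values and names are assumptions for illustration and do not reproduce the models of the cited study.

```python
import math

def feed_forward(x, w_h, b_h, w_o, b_o):
    """Single hidden layer (tanh) followed by a linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    return sum(w * h for w, h in zip(w_o, hidden)) + b_o

def cascade_forward(x, w_h, b_h, w_o, b_o, w_direct):
    """Same network plus direct input-to-output weights (cascade connection)."""
    direct = sum(w * xi for w, xi in zip(w_direct, x))
    return feed_forward(x, w_h, b_h, w_o, b_o) + direct

# Hypothetical scaled inputs and arbitrary (untrained) weights.
x = [0.6, 0.25]
w_h = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.9]]
b_h = [0.1, -0.2, 0.0]
w_o = [0.8, -0.4, 0.6]
b_o = 0.05

y_ff = feed_forward(x, w_h, b_h, w_o, b_o)
y_cf_zero = cascade_forward(x, w_h, b_h, w_o, b_o, w_direct=[0.0, 0.0])
y_cf = cascade_forward(x, w_h, b_h, w_o, b_o, w_direct=[0.3, -0.2])
print(y_ff, y_cf_zero, y_cf)
```

With the direct weights set to zero, the cascade forward network reduces to the feed-forward one, so the former can be viewed as a feed-forward model with an additional linear path from the inputs to the output.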

Shanmugaprakash et al. [180] developed an ANN model and optimized the zinc adsorption using Pongamia oil cake in both batch and dynamic systems. CCD was employed to improve both batch (31 experiments) and dynamic adsorption (20 experiments). A multilayer ANN with a topology of 3-7-1 and tangent sigmoid and linear activation functions was used. LM algorithm was the training method. For the column modeling, the input variables were the feed flow rate (5–15 mL/min), feed concentration (50–500 mg/L), and bed height (4–12 cm), while the output variable was the adsorption capacity (13.58–66.29 mg/g). This adsorbent had an adsorption capacity of 66.29 mg/g, and ANN outperformed RSM modeling with values of 0.99 and 0.84, respectively.

Cadmium adsorption on green adsorbents (i.e., jackfruit, mango, and rubber leaves) in down-flow fixed-bed columns was studied by Nag et al. [181]. They used a hybrid ANN-genetic algorithm model for the simulation and optimization of this adsorption process where the influence of bed height, flow rate, and initial concentration was determined. The ANN model used the type of adsorbent, bed height (3–9 cm), feed flow rate (10–25 mL/min), column operating time (5–600 min), and feed concentration (20–80 mg/L) as input variables. Cadmium percentage removal (6–99.95%) was the output variable. A total of 556 experimental data were divided into 70, 20, and 10% for training, validation, and testing, respectively. ANN modeling was done with the hyperbolic tangent activation function. Cadmium adsorption capacities followed the next trend: . The adsorption of this metal depended on the operating parameters, achieving a maximum removal of 98.26% at optimized conditions. This ANN model showed .

Vakili et al. [77] applied ANN to model the removal of organic micropollutants (tonalide, ketoprofen, carbamazepine, and bisphenol) with fixed-bed columns packed with chitosan/zeolite. Thirty experiments from CCD were employed to optimize the removal of these pollutants. A three-layer feed-forward ANN with 2-4-1 topology was used where the input variables were pH (4–8) and adsorbate concentration (0.5–2 mg/L), and the removal percentage (47.3–96.1%) was the output variable. LM algorithm was used for ANN training with several activation functions such as linear, hyperbolic tangent, and logistic sigmoid. This ANN showed high accuracy, indicating that it can be used to optimize the adsorption process of organic micropollutants.

Anbazhagan et al. [182] analyzed the application of ANFIS and ANN for the methylene blue adsorption using activated carbon from leaves of Calotropis gigantea (CGLAC) in a fixed-bed column. Different experimental conditions were tested including the initial concentration of methylene blue (100–500 mg/L), bed height (1–2 cm), solution pH (2–10), flow rate (3.5–6.5 mL), and temperature (303–333 K). These operating conditions were also used as inputs in the ANN analysis, while the methylene blue removal was the output variable. Sixty experimental data were used for ANN where 40 were employed for training and 20 for prediction. Bed Depth Service Time, Yoon-Nelson, Wolborska, Adams-Bohart, and Thomas models were also employed to model this adsorption column. Results showed that ANN was effective to predict the adsorption of methylene blue in dynamic operating conditions.

Yusuf et al. [183] predicted the copper and manganese adsorption on surfactant-decorated graphene packed in a down-flow bed column using ANN. Breakthrough curves were determined to identify the saturation time and bed adsorption capacity. The optimum adsorption capacities were 48.83 and 45.62 mg/g for copper and manganese, respectively, at a bed height of 3 cm. A multilayer feed-forward ANN with hyperbolic tangent sigmoid function and a quick propagation algorithm were used. An ANN model with 4-5-1 topology was obtained for both heavy metals where the input variables were the adsorbent dosage (0.01–0.1 g), initial concentration (25–250 mg/L), temperature (25–50 °C), and pH (2–5), and the output variable was the heavy metal removal percentage.

ANN modeling of the adsorption of phenolic compounds on activated date palm biochar in a down-flow fixed-bed column was studied by Dalhat et al. [49]. Breakthrough curves at several operating conditions were determined and modeled with ANN and a nonlinear regression generalized decay function model. A two-layer feed-forward ANN was used with the following input data: bed height (10–40 cm), adsorbent mass (21.8–87.0 g), feed initial concentration (10–100 mg/L), feed flow rate (5–30 mL/min), and column operating time (0–720 min). Output data were the final effluent concentration or the ratio . Adsorption data were divided into 70% for training, 15% for validation, and 15% for testing. LM algorithm was used for ANN training with hyperbolic tangent sigmoid activation function. Adsorption capacities were 560.55 and 647.28 mg/g for ortho-cresol and phenol, respectively. ANN outperformed the nonlinear regression model to fit the dynamic adsorption data. The use of final effluent concentration as output variable offered better fits than those with . Sensitivity analysis revealed that column operating time and feed initial concentration were the most relevant parameters for the adsorption of these pollutants.

Finally, the modeling of simultaneous dynamic adsorption of organic pollutants (e.g., phenol, toluene, benzene, caffeine, ciprofloxacin, flumequine, and diclofenac) on activated carbon was carried out with ANN [184]. A set of 15 systems with 5951 data collected from published papers was used to build the ANN model where the input variables were the molar mass (78.11–361.37 g/mol), initial concentration (0.00019–500 mg/L), feed flow rate (0.05–456.62 mL/min), bed height (2–200 cm), adsorbent particle diameter (0.1–2.4 mm), specific surface area (678–2869 m²/g), average pore diameter (1.29–3015 nm), and operating time (0–57170 min), while the ratio was the output variable. ANN was implemented using BP algorithm for learning with logistic sigmoid and hyperbolic tangent sigmoid as activation functions for hidden and output layers, respectively. The data were divided into 80% for learning, 10% for testing, and 10% for validation. ANN architecture was 8-45-1. Modeling results demonstrated the applicability of ANN to predict these dynamic adsorption systems with , root mean square error of 0.029, and absolute deviation of 1.81%. The sensitivity analysis showed that all variables impacted the system performance where the flow rate and specific surface area were the most relevant.
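The numerical performance of the models discussed in this section is commonly reported with metrics such as the root mean square error and the determination coefficient. The following minimal sketch shows how these two metrics are usually computed from observed and predicted outputs. The data lists are hypothetical normalized effluent concentrations introduced only for illustration, and the function names are assumptions. The absolute deviation quoted above is not reproduced because its exact definition is not given in the text.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))

def r_squared(observed, predicted):
    """Determination coefficient: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical points of a breakthrough curve (not taken from any cited study).
observed = [0.0, 0.1, 0.4, 0.8, 1.0]
predicted = [0.0, 0.2, 0.4, 0.7, 1.0]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R2   = {r_squared(observed, predicted):.4f}")
```

Both metrics should be reported on the testing subset as well as on the training subset, since a good value on the training data alone says little about predictive capability.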

4. Remarks on the Application of ANN for a Reliable Adsorption Modeling

This literature review indicates that the application of ANN for the modeling and correlation of adsorption data has been successfully adopted as an alternative approach to overcome the limitations of traditional models. Unfortunately, several published papers contain common mistakes in the ANN implementation that affect the quality and reliability of the developed models and of the conclusions drawn from them.

First, the output variables used in ANN modeling must be intensive variables, especially for multicomponent adsorption systems. Several studies have reported the ANN training with removal percentages or final adsorbate concentrations as the output variables, especially in batch adsorption systems. These are extensive variables whose values are directly related to the adsorbent amount and, consequently, the ANN-based models can learn the system behavior incorrectly and provide wrong estimations. In particular, Mendoza-Castillo et al. [28] have demonstrated that the use of extensive variables in the ANN modeling of binary, ternary, and quaternary antagonistic adsorption of heavy metal ions generated inaccurate estimations of the adsorption isotherms where a desorption process was erroneously predicted. It should be noted that the selection of proper output variables for ANN training will be more relevant for systems with multiple adsorbates that could simultaneously display different multicomponent adsorption behaviors. One example is the simultaneous adsorption of heavy metals and acid dyes, where both synergistic and antagonistic adsorption are present. Overall, the adsorption capacities should be the output variable used in the ANN modeling, especially for real fluids (e.g., industrial effluents and groundwater) where several adsorbates could interact with the adsorbent utilized as separation medium.
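The dependence of the removal percentage on the adsorbent amount can be illustrated with a short numerical sketch. It assumes a batch system at equilibrium, the standard mass balance q = (C0 - Ce)V/m, and a single-solute Langmuir isotherm with hypothetical parameters (`q_max`, `k`). None of these values come from the cited studies, and the function names are chosen only for this illustration.

```python
def langmuir(ce, q_max=50.0, k=0.05):
    """Hypothetical Langmuir isotherm: capacity (mg/g) at concentration ce (mg/L)."""
    return q_max * k * ce / (1.0 + k * ce)

def equilibrium_concentration(c0, volume, mass, tol=1e-10):
    """Solve (c0 - ce) * volume / mass = langmuir(ce) for ce by bisection."""
    low, high = 0.0, c0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if (c0 - mid) * volume / mass > langmuir(mid):
            low = mid   # the root lies at a higher concentration
        else:
            high = mid
    return 0.5 * (low + high)

c0, volume = 100.0, 0.1           # mg/L and L, illustrative values
rows = []
for mass in (0.05, 0.10, 0.20):   # g of adsorbent
    ce = equilibrium_concentration(c0, volume, mass)
    capacity = (c0 - ce) * volume / mass   # mg/g, normalized by adsorbent mass
    removal = 100.0 * (c0 - ce) / c0       # %, changes with the adsorbent dose
    rows.append((mass, ce, capacity, removal))
    print(f"m = {mass:.2f} g  Ce = {ce:6.2f} mg/L  "
          f"q = {capacity:6.2f} mg/g  removal = {removal:5.1f} %")
```

In this sketch, every (Ce, q) pair falls on the same isotherm regardless of the adsorbent mass, whereas the removal percentage changes from one dose to another for the same adsorbent and solution. A model trained on removal percentages therefore mixes the adsorbent behavior with the experimental dose.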

After defining the ANN architecture (i.e., the number of neurons and layers), ANN training should be performed to determine the corresponding model parameters. This training relies on the solution of a parameter estimation problem that is characterized by the presence of multiple solutions (i.e., a global optimization problem should be solved). Training methods used in ANN-based adsorption modeling are commonly based on local optimization methods, which are effective to find a set of ANN parameters, but their numerical performance is strongly related to the initial estimates, and there is no guarantee of finding the global optimal solution that corresponds to the best data modeling. Under this scenario, the ANN parameters obtained with conventional training methods generally correspond to a local optimum of the objective function used. The identification of the ANN parameters via the traditional training algorithms (e.g., LM method) should involve several calculations with different initial estimates to identify the best solution based on results from proper statistical metrics. Therefore, ANN training should be recognized as a global optimization problem that requires reliable optimizers for its solution. Stochastic global optimization methods like differential evolution, particle swarm optimization, genetic algorithm, and other recent metaheuristics are an alternative to solve the parameter identification of ANN training. Some studies on adsorption research have applied these optimizers as already discussed in Section 3 of this review. However, it is important to remark that ANN training based on these optimizers will imply a significant increase in the computing time of the adsorption data modeling.
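The recommendation of repeating a local training method from several initial estimates can be sketched as follows. This is a minimal illustration under stated assumptions: plain batch gradient descent stands in for the local optimizer (it is not the LM method), the network is a 1-2-1 model with hyperbolic tangent hidden neurons, and the target is a synthetic sigmoidal curve that mimics the shape of a breakthrough curve. All names, the learning rate, and the number of restarts are hypothetical choices.

```python
import math
import random

def predict(params, x):
    """Output of a 1-H-1 network with tanh hidden units and a linear output."""
    w1, b1, w2, b2 = params
    return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(len(w1))) + b2

def mse(params, xs, ts):
    return sum((predict(params, x) - t) ** 2 for x, t in zip(xs, ts)) / len(xs)

def train_once(xs, ts, n_hidden, rng, epochs=2000, lr=0.05):
    """One local training run (batch gradient descent) from random initial weights."""
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b2 = rng.uniform(-1.0, 1.0)
    step = 2.0 * lr / len(xs)  # factor 2/N comes from the derivative of the MSE
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * n_hidden, [0.0] * n_hidden, [0.0] * n_hidden, 0.0
        for x, t in zip(xs, ts):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            err = sum(w2[j] * h[j] for j in range(n_hidden)) + b2 - t
            g_b2 += err
            for j in range(n_hidden):
                back = err * w2[j] * (1.0 - h[j] ** 2)  # error propagated through tanh
                g_w2[j] += err * h[j]
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(n_hidden):
            w1[j] -= step * g_w1[j]
            b1[j] -= step * g_b1[j]
            w2[j] -= step * g_w2[j]
        b2 -= step * g_b2
    return w1, b1, w2, b2

# Synthetic sigmoidal target on a scaled time axis (not experimental data).
xs = [i / 10.0 for i in range(-10, 11)]
ts = [1.0 / (1.0 + math.exp(-6.0 * x)) for x in xs]

rng = random.Random(0)
runs = [train_once(xs, ts, n_hidden=2, rng=rng) for _ in range(5)]
losses = [mse(p, xs, ts) for p in runs]
best_index = min(range(len(losses)), key=losses.__getitem__)
best_params = runs[best_index]
print(["%.2e" % v for v in losses], "best run:", best_index)
```

The spread of the final errors across restarts gives a rough indication of how sensitive the training is to the initial estimates. A multi-start strategy of this kind does not guarantee the global optimum either; it only reduces the chance of reporting a poor local solution.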

Another common failure identified in several adsorption studies is the ANN training with a limited number of experimental data. Overall, increasing the number of hidden layers and their neurons will improve the ANN fitting performance, thus reducing the modeling errors and increasing the determination coefficient (R²). However, the number of ANN parameters to be determined in the model training should be significantly lower than the number of available experimental data in order to obtain a reliable ANN model from a statistical point of view. With this in mind, the verification of ANN overtraining is a relevant issue that should also be analyzed in adsorption modeling. This step is usually not considered in the papers reported on ANN-based adsorption modeling. As a general rule, the authors should select the ANN architecture with the least number of parameters that offers the best data fitting and modeling errors. These remarks also apply to hybrid models obtained from the combination of adsorption equations and ANN.
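The comparison between the number of ANN parameters and the number of experimental data can be made explicit with a simple count. The sketch below assumes a fully connected feed-forward network with one bias per neuron in each hidden and output layer. The function name `count_parameters` is hypothetical. The two cases use topologies and data counts quoted earlier in this review ([77] and [184]), under the assumption made here for illustration that all the quoted data were available for model building.

```python
def count_parameters(topology):
    """Weights plus biases of a fully connected feed-forward network.

    topology lists the neurons per layer, e.g. (8, 45, 1) for an 8-45-1 ANN.
    Assumption: every neuron after the input layer has one bias term.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(topology, topology[1:]))

cases = {
    "2-4-1 topology, 30 experiments [77]": ((2, 4, 1), 30),
    "8-45-1 topology, 5951 data [184]": ((8, 45, 1), 5951),
}
for label, (topology, n_data) in cases.items():
    n_params = count_parameters(topology)
    print(f"{label}: {n_params} parameters, "
          f"{n_data / n_params:.1f} data per parameter")
```

Under these assumptions, the 2-4-1 network has 17 adjustable parameters and the 8-45-1 network has 451. The ratio of data to parameters is a quick first check before training, although it does not replace the verification of overtraining with validation and testing subsets.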

5. Conclusions

Artificial neural networks have proved to be a useful numerical approach to develop new models for adsorption analysis. Several studies have demonstrated that ANN-based models can outperform the traditional equations for the correlation and prediction of isotherms, kinetics, and, to a lesser extent, breakthrough curves. ANN-based models have been widely applied in the analysis and modeling of adsorption systems with one water pollutant. There are few studies on multicomponent adsorption modeling with ANN, which are mainly related to the removal of heavy metals, dyes, and a few other organic pollutants. Therefore, the application of ANN-based models in the analysis and simulation of multicomponent adsorption systems is an interesting topic to be studied in forthcoming papers. Also, the studies related to the modeling of dynamic adsorption systems involving several adsorbates should be increased to complement the characterization of the capabilities and limitations of ANN in this configuration mode, which is fundamental for industrial and real-life applications. The literature review also indicated that several authors have reported the utilization of ANN with extensive variables (e.g., removal percentages or final adsorbate concentrations) as the output variables, thus generating models that could wrongly predict the performance of the adsorption system under analysis. ANN training with intensive adsorption variables is mandatory to obtain reliable models for the process design of fluids with multiple adsorbates. The overtraining and the application of global optimization methods in the training stage are key issues to be analyzed and resolved during the adsorption data modeling via ANN. Data sets with a suitable amount of experimental information on adsorption systems are also required to obtain reliable ANN models from a statistical perspective.
In this context, it should be remarked that the main drawback of ANN-based models is their limited ability to provide a theoretical understanding of the physical and chemical phenomena present in the systems to be modeled. These models are considered as black-box and empirical approaches that are effective for both data correlation and prediction and, consequently, they can be employed as surrogate models when the theoretical models are not suitable to simulate the adsorption system at hand. The hybridization of ANN with theoretical-based adsorption equations is an option to face this drawback and to develop advanced models. Overall, this artificial intelligence tool has a significant potential to overcome the limitations of traditional adsorption models for real fluids where several adsorbates are present, thus causing different removal behaviors.

Data Availability

Data of this paper are available on request to the corresponding author.

Conflicts of Interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.