Journal of Sensors
Volume 2018, Article ID 6025381, 11 pages
Research Article

The Development of an Intelligent Monitoring System for Agricultural Inputs Basing on DBN-SOFTMAX

1School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
2Guangdong Provincial Water Environment and Aquatic Products Security Engineering Technology Research Center, Guangzhou Key Laboratory of Aquatic Animal Diseases and Waterfowl Breeding, Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Sciences and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, Guangdong 510225, China

Correspondence should be addressed to Ting Wu; 405684932@qq.com and Li Lin; linli@zhku.edu.cn

Received 7 May 2018; Revised 13 August 2018; Accepted 30 August 2018; Published 28 October 2018

Academic Editor: Marco Grossi

Copyright © 2018 Ling Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Abstract

To address the unreliability of traceability information in traceability systems, we developed an intelligent monitoring system that acquires the physicochemical parameters of agricultural inputs online in real time and accurately predicts the variety of each input. First, self-developed monitoring equipment performs real-time acquisition, format conversion, and preprocessing of the physicochemical parameters of the inputs and communicates with the cloud platform server in real time. In this process, LoRa technology was adopted to provide long-distance, low-power wireless communication among multiple nodes. Second, a deep belief network (DBN) model was trained without supervision on the physicochemical parameters of the inputs to extract features. Finally, these features were fed to a softmax classifier to establish the classification model, which could accurately predict the varieties of agricultural inputs. The results showed that when six kinds of agricultural inputs, including pesticides and chemical fertilizers, were predicted by the system, the prediction accuracy reached 98.5%. Therefore, the system can effectively monitor the varieties of agricultural inputs in real time and help ensure the authenticity and accuracy of the traceability information.

1. Introduction

The traceability system of agricultural products is a powerful tool for addressing food safety issues [1]. Information on farm inputs, such as the pesticides and fertilizers used in cultivation, is among the chief concerns of consumers. Currently, some companies have established their own traceability systems for agricultural products [2]. However, it is understandable that consumers do not trust information recorded by the producers themselves, because of the lack of supervision. Therefore, a traceability system that can record the information in a timely, accurate, and complete manner is urgently needed.

There have been many reports on rapid techniques for detecting agricultural inputs. To name a few, Deng et al. established a liquid chromatography-tandem mass spectrometry (LC-MS/MS) method for the simultaneous determination of benzoylurea pesticide residues in vegetables [3]. Zheng et al. developed an LC-MS/MS method for the detection of pesticide residues in milk [4]. Selisker et al. used a competitive enzyme-linked immunosorbent assay (ELISA) to detect paraquat [5]. Alcocer et al. developed a polyclonal antibody for the detection of organophosphorus pesticides [6]. Kumaran and Tran-Minh used a cholinesterase electrode to detect pesticides [7]. Chough et al. used a carbon electrode to identify organophosphorus insecticides [8]. Many rapid detection techniques for agricultural inputs have thus been established. However, most current monitoring of agricultural inputs is still residue detection at the postproduction stage, and it remains difficult to monitor the data online in real time. Furthermore, existing traceability systems rely mainly on manually entered traceability information, so the information is neither timely nor accurate.

It is desirable to seek an alternative method that overcomes these drawbacks. In this report, we developed an intelligent monitoring system for agricultural inputs based on sensors and the DBN-SOFTMAX algorithm. Unlike the chemical detection methods described above, this paper proposes using sensors placed in the soil to monitor and predict agricultural inputs. Sensors are generally employed in agriculture for environmental monitoring, such as of moisture and temperature [9, 10], or for precise agricultural control [11]. In this paper, the sensors placed in the soil were used to collect the physicochemical characteristics of the inputs, such as pH and EC values, and an artificial intelligence algorithm was then used to analyze the sensor data, realizing the intelligent monitoring and prediction of agricultural inputs.

2. Monitoring System Design

2.1. Working Principles and Overall Architecture

The overall structure of the intelligence-monitoring platform for agricultural inputs is shown in Figure 1.

Figure 1: The overall system architecture.

The monitoring equipment samples data every 15 seconds to obtain, in real time, the physicochemical parameters of agricultural inputs, such as pH value, electrical conductivity (EC), and temperature. After data preprocessing, analog-to-digital conversion, and conversion to the RS485 [12] format, the LoRa (long range) module transmits the data to the LoRa gateway [13], which forwards them over an RJ45 (Ethernet) connection. The data are then received and stored on the cloud server, where data cleaning and reduction are performed to obtain useful data for modeling and classification. During the modeling process, newly collected data are continuously added to the training samples, and the model is updated once a week to obtain more accurate prediction results.

2.2. Hardware Design

The monitoring system mainly consists of a sensor module, a low-power digital processor, a multichannel AD/DA conversion module, an RS485 serial communication module, a LoRa wireless communication module, and a solar power module. The sensor module includes a pH sensor, an EC sensor, and a temperature sensor. The RS485 serial communication module provides the multisensor data fusion service: it polls the different sensors at the same monitoring point through the RS485 interface and merges their data. The LoRa wireless communication module is built on the SX1278 LoRa spread-spectrum chip; its transmission distance and penetration ability are more than double those of traditional FSK [14]. LoRa wireless communication also has stronger error correction capability because it adopts cyclic interleaving error correction coding. The maximum continuous error correction is 64 bits, which reduces the retransmission of erroneous data packets and thereby improves anti-interference performance and transmission distance. The hardware structure of the monitoring system is shown in Figure 2.

Figure 2: The hardware structure of monitor terminal.

2.3. Software Design

2.3.1. The LoRa Node Software Design

There are three kinds of nodes in the LoRa network, namely, sensor, routing, and aggregation nodes. The routing node is responsible for forwarding data. The aggregation node does not collect data; as a control center, it sends synchronization information to the monitoring network and forwards the received data to the local and remote monitoring centers. Dedicated node software was designed to perform the functions of each node. In this paper, the sensor node is used as an example to introduce the software design method. The software was developed in C, and the flow chart of the program is shown in Figure 3.

Figure 3: The node program flow.

The program uses a modular design, mainly comprising equipment initialization, data acquisition and processing, serial communication, and wireless communication. The acquisition cycle and acquisition commands are controlled by the PC monitoring center software. When the node software receives an acquisition command from the PC monitoring center program, it responds immediately and returns the data collected from the corresponding sensor, as selected by the Modbus protocol command.
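As an illustration of the kind of Modbus acquisition command mentioned above, the following Python sketch builds a request frame and checks a response. It assumes Modbus RTU framing over RS485 and the "read holding registers" function (0x03); the slave address, register address, and function names are hypothetical and are not taken from the node software described in this paper.

```python
import struct

def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def build_read_request(slave: int, start: int, count: int) -> bytes:
    """Build a 'read holding registers' (function 0x03) request frame."""
    body = struct.pack(">BBHH", slave, 0x03, start, count)
    # Modbus RTU appends the CRC low byte first.
    return body + struct.pack("<H", crc16_modbus(body))

def parse_read_response(frame: bytes) -> list:
    """Verify the CRC of a function-0x03 response and return its registers."""
    body, received = frame[:-2], struct.unpack("<H", frame[-2:])[0]
    if crc16_modbus(body) != received:
        raise ValueError("CRC mismatch")
    byte_count = body[2]
    return list(struct.unpack(f">{byte_count // 2}H", body[3:3 + byte_count]))

# Hypothetical addressing: sensor at slave address 1, one register at 0x0000.
request = build_read_request(slave=1, start=0x0000, count=1)
print(request.hex(" "))  # 01 03 00 00 00 01 84 0a
```

Polling several sensors at one monitoring point then amounts to sending one such request per slave address and collecting the parsed register values.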

2.3.2. The Monitoring Center Software Design

The software workflow is shown in Figure 4. The monitoring center software was developed in C#. It communicates with the LoRa gateway module through TCP/IP network programming and obtains the monitoring data that the LoRa gateway transmits over the Internet. The monitoring software mainly provides parameter setting, real-time monitoring, data processing, and other functions. The parameter setting function sets the acquisition cycle, the Modbus acquisition commands, and other parameters. The real-time monitoring function collects sensor data in real time. The data processing function calls the DBN-SOFTMAX prediction model code, analyzes the received data, matches them against the established model, and predicts the varieties of inputs.

Figure 4: The software workflow.

3. DBN-SOFTMAX Algorithm and Modeling

3.1. The DBN-Based Feature Extraction Method

The restricted Boltzmann machine (RBM) [15], the building block of the DBN [16, 17], can extract more abstract features and significantly improve the generalization ability of a neural network [18]. Each RBM is a two-layer model with a single hidden layer, and the output of each trained RBM is used as the input of the next RBM.

3.1.1. Restricted Boltzmann Machine (RBM)

Let $\theta = \{W_{ij}, a_i, b_j\}$ denote the parameters of the RBM, where $W_{ij}$ represents the connection weight between visible unit $i$ and hidden unit $j$, and $n_h$ and $n_v$ are the numbers of hidden and visible units, respectively. Both the visible and hidden units are binary variables, that is, $v_i \in \{0, 1\}$ and $h_j \in \{0, 1\}$. Here $a_i$ is the offset of visible unit $v_i$, $b_j$ is the offset of hidden unit $h_j$, $T$ is the number of samples, $h$ is the state of the hidden layer, and $v$ is the state of the visible layer.

The RBM is an undirected graphical model [19, 20] (Figure 5). Training it means solving for the value of the parameter $\theta$ that fits the given training data; the trained model is then used to extract features.

Figure 5: The RBM model.

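The task of the RBM is to fit the input training data, find the optimal parameter $\theta$, and thereby complete the feature extraction. The parameter $\theta$ can be learned by maximizing the log-likelihood function on the training set:

$$\theta^{*} = \arg\max_{\theta} L(\theta),$$

where

$$L(\theta) = \sum_{t=1}^{T} \log P(v^{(t)} \mid \theta).$$

The key to finding the optimal parameter is to obtain the partial derivatives of $\log P(v^{(t)} \mid \theta)$ with respect to $W_{ij}$, $a_i$, and $b_j$. Letting $\theta$ stand for any one of these parameters, the gradient of the log-likelihood function with respect to $\theta$ is

$$\frac{\partial L(\theta)}{\partial \theta} = \sum_{t=1}^{T} \left( \left\langle \frac{\partial \left(-E(v^{(t)}, h \mid \theta)\right)}{\partial \theta} \right\rangle_{P(h \mid v^{(t)}, \theta)} - \left\langle \frac{\partial \left(-E(v, h \mid \theta)\right)}{\partial \theta} \right\rangle_{P(v, h \mid \theta)} \right).$$

Since the number of samples $T$ is known, the partial derivatives of the log-likelihood function with respect to the connection weight $W_{ij}$, the offset $a_i$ of the visible layer units, and the offset $b_j$ of the hidden layer units can be expressed through expectations under $P(h \mid v^{(t)}, \theta)$ and $P(v, h \mid \theta)$. Here $P(h \mid v^{(t)}, \theta)$ is the probability distribution of the hidden layer given training sample $v^{(t)}$, and $P(v, h \mid \theta)$ is the joint probability of a given state $(v, h)$:

$$P(v, h \mid \theta) = \frac{\exp\left(-E(v, h \mid \theta)\right)}{Z(\theta)},$$

where

$$E(v, h \mid \theta) = -\sum_{i=1}^{n_v} a_i v_i - \sum_{j=1}^{n_h} b_j h_j - \sum_{i=1}^{n_v} \sum_{j=1}^{n_h} v_i W_{ij} h_j$$

is the energy function of the RBM and $Z(\theta) = \sum_{v, h} \exp\left(-E(v, h \mid \theta)\right)$ is the normalization factor.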
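To make the joint probability and the normalization factor concrete, the following Python sketch evaluates them by brute force for a very small RBM. It assumes the standard energy function of a binary RBM, $E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j$; the parameter values and function names are hypothetical. The sum inside the normalization factor has $2^{n_v + n_h}$ terms, which is why it cannot be evaluated this way for networks of realistic size.

```python
import itertools
import math

def energy(v, h, W, a, b):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i W_ij h_j."""
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    pair_term = sum(v[i] * W[i][j] * h[j]
                    for i in range(len(v)) for j in range(len(h)))
    return -(visible_term + hidden_term + pair_term)

def partition_function(W, a, b):
    """Sum exp(-E) over all 2**(n_v + n_h) binary states (tiny models only)."""
    states_v = itertools.product((0, 1), repeat=len(a))
    return sum(math.exp(-energy(v, h, W, a, b))
               for v in states_v
               for h in itertools.product((0, 1), repeat=len(b)))

def joint_probability(v, h, W, a, b):
    """P(v, h) = exp(-E(v, h)) / Z."""
    return math.exp(-energy(v, h, W, a, b)) / partition_function(W, a, b)

# Hypothetical parameters: 3 visible units and 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
a = [0.0, 0.1, -0.1]
b = [0.2, -0.3]
total = sum(joint_probability(v, h, W, a, b)
            for v in itertools.product((0, 1), repeat=3)
            for h in itertools.product((0, 1), repeat=2))
print(round(total, 10))  # the 32 joint probabilities sum to 1.0
```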

3.1.2. CD Algorithm

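It has been shown that the normalization factor $Z(\theta)$ is difficult to compute [21]. Therefore, the joint probability function $P(v, h \mid \theta)$ is also difficult to calculate. To solve this problem, the fast learning algorithm based on contrastive divergence (CD) was used to train on the data. The steps were as follows:

(1) The visible layer was initialized with a training sample, $v^{(0)} = v^{(t)}$.

(2) The RBM has connections between layers, no connections within a layer, and a symmetric structure. Hence, when the state of the visible units is fixed, the activation probability of the $j$th hidden unit is

$$P(h_j = 1 \mid v, \theta) = \sigma\left(b_j + \sum_{i} v_i W_{ij}\right), \tag{6}$$

where $\sigma(x) = 1 / (1 + e^{-x})$ is the logistic function. When the state of the hidden units is fixed, the activation probability of the $i$th visible unit is

$$P(v_i = 1 \mid h, \theta) = \sigma\left(a_i + \sum_{j} W_{ij} h_j\right). \tag{7}$$

The binary states of all hidden layer units are determined from equation (6). Once the states of all hidden layer units are determined, the probability that the $i$th visible unit takes the value 1 is obtained from equation (7), which yields a reconstruction of the visible layer.

(3) The parameter update formulas used during training are

$$\Delta W_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right), \quad \Delta a_i = \varepsilon \left( \langle v_i \rangle_{\text{data}} - \langle v_i \rangle_{\text{recon}} \right), \quad \Delta b_j = \varepsilon \left( \langle h_j \rangle_{\text{data}} - \langle h_j \rangle_{\text{recon}} \right),$$

where $\varepsilon$ is the learning rate, $\langle \cdot \rangle_{\text{data}}$ is the expectation over the training data, and $\langle \cdot \rangle_{\text{recon}}$ is the expectation over the distribution defined by the reconstructed model.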
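The three steps can be sketched in Python as a single contrastive divergence step (CD-1) on one training vector. This is a minimal illustration rather than the implementation used in this work: it assumes one Gibbs step, per-sample updates, and the common variant that uses hidden-unit probabilities instead of sampled states when accumulating the statistics. The layer sizes, toy data, and function names are hypothetical; the learning rate of 0.1 matches the value reported for training.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def visible_probs(h, W, a):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

def sample(probs, rng):
    """Draw a binary state from independent Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]

def cd1_update(v0, W, a, b, eps, rng):
    """One CD-1 step on training vector v0; updates W, a, b in place."""
    ph0 = hidden_probs(v0, W, b)               # statistics driven by the data
    h0 = sample(ph0, rng)                      # binary hidden states
    v1 = sample(visible_probs(h0, W, a), rng)  # reconstruction of the visible layer
    ph1 = hidden_probs(v1, W, b)               # statistics driven by the reconstruction
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += eps * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += eps * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += eps * (ph0[j] - ph1[j])
    return v1

rng = random.Random(0)
n_v, n_h = 6, 3                                # hypothetical layer sizes
W = [[rng.gauss(0.0, 0.01) for _ in range(n_h)] for _ in range(n_v)]
a, b = [0.0] * n_v, [0.0] * n_h
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]  # toy binary patterns
for _ in range(200):
    for v0 in data:
        cd1_update(v0, W, a, b, eps=0.1, rng=rng)
print(hidden_probs(data[0], W, b))             # learned features of the first pattern
```

In a DBN, the hidden probabilities produced by one trained RBM would serve as the visible input of the next RBM.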

3.2. The Softmax Classifier

The process above yields the feature values of $x^{(i)}$. To predict the input varieties, however, the output must be categorized, so a softmax classifier [22] was added to the output layer to classify the learned feature values. A schematic of the softmax classifier is presented in Figure 6. The labeled training set is $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, where $y^{(i)} \in \{1, 2, \ldots, k\}$ and $k$ is the number of classes. Given a test input $x$, the classification model calculates the probability $p(y = j \mid x)$ that it belongs to each category $j$.

Figure 6: The softmax classifier.

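Thus, for a sample set with $k$ classes, the classifier outputs a $k$-dimensional probability vector. The $j$th element of this vector represents the probability that the sample belongs to the $j$th class, and the elements sum to 1. Specifically, the hypothesis function is

$$h_{\theta}(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} \begin{bmatrix} e^{\theta_1^{T} x^{(i)}} \\ e^{\theta_2^{T} x^{(i)}} \\ \vdots \\ e^{\theta_k^{T} x^{(i)}} \end{bmatrix}, \tag{9}$$

where $\theta_1, \theta_2, \ldots, \theta_k$ are the model parameters and the factor $1 / \sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}$ normalizes the outputs so that each probability lies between 0 and 1 and the probabilities sum to 1.

From equation (9), the probability that the classifier assigns sample $x^{(i)}$ to class $j$ is

$$p(y^{(i)} = j \mid x^{(i)}; \theta) = \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}.$$

The likelihood function for the $m$ training samples is

$$L(\theta) = \prod_{i=1}^{m} \prod_{j=1}^{k} \left( \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} \right)^{1\{y^{(i)} = j\}},$$

where $1\{\cdot\}$ is the indicator function.

The parameter $\theta$ that maximizes the likelihood function is taken as the optimal parameter of the softmax classifier. The cost function of the softmax regression model is

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}.$$

The cost function is minimized by the gradient descent method; the gradient is

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left( 1\{y^{(i)} = j\} - p(y^{(i)} = j \mid x^{(i)}; \theta) \right).$$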

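The softmax classifier has an unusual feature: it has a “redundant” set of parameters [23]. To illustrate this, suppose a fixed vector $\psi$ is subtracted from every parameter vector $\theta_j$, so that each $\theta_j$ becomes $\theta_j - \psi$. The class probability then becomes

$$\frac{e^{(\theta_j - \psi)^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^{T} x^{(i)}}} = \frac{e^{\theta_j^{T} x^{(i)}} e^{-\psi^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}} e^{-\psi^{T} x^{(i)}}} = \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}. \tag{15}$$

Equation (15) shows that the parameters $\theta_j$ and $\theta_j - \psi$ give the same result. In other words, if $(\theta_1, \ldots, \theta_k)$ is an optimal parameter set, then $(\theta_1 - \psi, \ldots, \theta_k - \psi)$ is equally optimal. This is the disadvantage of having redundant parameters in the softmax classifier. The cost function of the softmax classifier is convex but not strictly convex: a minimum exists, but it lies in a “flat” region rather than at a single point, so every point in that region attains the minimum value. To make the cost function strictly convex, a weight decay term is added, as follows:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j} \theta_{ij}^{2},$$

where $\lambda > 0$ is the weight decay coefficient and $\theta_{ij}$ is the $j$th component of $\theta_i$.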
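The softmax probabilities, the cost with weight decay, and its gradient can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used in this work: class labels are 0-based for indexing convenience, the parameter values are hypothetical, and the function names are invented for the example. Subtracting the largest score before exponentiating is itself an application of the redundancy property, since it shifts all scores by the same amount without changing the probabilities.

```python
import math

def softmax_probs(theta, x):
    """Return p(y = j | x; theta) for each class; theta is a list of k vectors."""
    scores = [sum(t * xi for t, xi in zip(theta_j, x)) for theta_j in theta]
    top = max(scores)                      # shift scores for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cost(theta, X, y, lam):
    """Negative mean log-likelihood plus the weight decay term."""
    m = len(X)
    data_term = -sum(math.log(softmax_probs(theta, x)[label])
                     for x, label in zip(X, y)) / m
    decay = 0.5 * lam * sum(t * t for theta_j in theta for t in theta_j)
    return data_term + decay

def gradient(theta, X, y, lam):
    """Gradient of the cost with respect to every parameter vector theta_j."""
    m = len(X)
    grad = [[lam * t for t in theta_j] for theta_j in theta]
    for x, label in zip(X, y):
        p = softmax_probs(theta, x)
        for j in range(len(theta)):
            indicator = 1.0 if j == label else 0.0
            for d, xd in enumerate(x):
                grad[j][d] -= xd * (indicator - p[j]) / m
    return grad

# Hypothetical parameters for k = 3 classes and 2 features.
theta = [[0.2, -0.1], [0.0, 0.3], [-0.5, 0.4]]
psi = [1.5, -2.0]
shifted = [[t - s for t, s in zip(theta_j, psi)] for theta_j in theta]
x = [1.0, 2.0]
print(softmax_probs(theta, x))
print(softmax_probs(shifted, x))  # same probabilities, up to rounding
```

One gradient descent step then replaces each `theta[j][d]` with `theta[j][d] - rate * grad[j][d]`, where `rate` is a step size chosen by the user.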

3.3. Modeling

In establishing the model, the collected data were first normalized, and then DBN was used for unsupervised training to extract features. However, these features were not directly applicable to classification [24–26], so the softmax classifier was added to the output to perform supervised classification training. The flow diagram is shown in Figure 7.

Figure 7: The flow diagram of modeling.
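The normalization step is not specified further in the text. As one plausible choice, the sketch below applies min-max scaling, which maps every feature to the interval [0, 1] and therefore suits RBM units whose values are interpreted as probabilities. The scaling method, the sample readings, their units, and the function names are all assumptions made for illustration.

```python
def fit_min_max(rows):
    """Return per-column (min, max) pairs from a list of equal-length rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]

def apply_min_max(rows, ranges):
    """Scale every column to [0, 1]; constant columns map to 0."""
    scaled = []
    for row in rows:
        scaled.append([(value - lo) / (hi - lo) if hi > lo else 0.0
                       for value, (lo, hi) in zip(row, ranges)])
    return scaled

# Hypothetical readings: moisture (%), EC (mS/cm), pH.
readings = [[21.0, 0.40, 6.9], [25.0, 1.20, 6.1], [23.0, 0.80, 6.5]]
ranges = fit_min_max(readings)
print(apply_min_max(readings, ranges))
```

The ranges should be fitted on the training samples only and then reused for new data, so that held-out samples do not influence the scaling.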

4. Experiment Design

In this experiment, we used six agricultural inputs: phosphate (P2O5, SinoChem, China), potassium (K2O, SinoChem, China), compound fertilizer (carbamide, nitrogen phosphorus potassium, SouthRanch, China), Podol pesticide (TaoChun, China), imidacloprid (Bayer, Germany), and oxamoxime (HeYi, China), all purchased from local stores in Guangzhou, China. The first three inputs were chemical fertilizers and the latter three were pesticides; each was prepared as an aqueous solution at a dilution ratio of 500 : 1. Eighteen soil-filled pots with bottom drainage were prepared and placed in an open-air environment. The EC, pH, and moisture sensors were inserted into the soil and powered on to collect sensor data in real time. From October 2016 to March 2017, 200 ml of the aqueous solution of each input was sprayed into three pots of soil. Over 50 experiments, the soil parameters before and after the input were recorded: moisture proportion, conductivity, and pH value before the input, and moisture proportion, conductivity, and pH value after the input. In total, 150 records were collected for each type of input, giving 900 records overall.

5. Results and Discussion

5.1. Data of Sensors

Because the sensor readings before input differed from one experiment to the next, the difference between the readings after and before input characterizes the input better than the raw readings do. The six agricultural inputs were each sprayed into the soil, and pH, conductivity, and moisture data were collected before and after the input. In this paper, data from 20 randomly selected experiments are shown for observation.

Before input, the collected pH values were close to 7, that is, neutral. After each of the six agricultural inputs was applied, the pH value decreased. As shown in Figure 8, the changes in pH for the different inputs were similar and overlapped with each other, indicating that the inputs did not differ significantly in their effect on pH.

Figure 8: Changes in pH before and after input.

Changes in electrical conductivity (EC) were examined next. Because the EC value is very sensitive to moisture content, observing the change in EC alone introduced a significant error. The EC value was therefore divided by the moisture content, and the resulting ratio is shown in Figure 9. In general, the EC changes for the pesticides (Podol pesticide, imidacloprid, and oxamoxime) were smaller, while those for the fertilizers were comparatively larger. Potassium showed the largest change in EC, with a value greater than 3, whereas the EC changes for the pesticides were less than 1.5.
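The two quantities examined in this section, the change in pH and the change in the EC-to-moisture ratio, can be computed as in the following Python sketch. The readings are hypothetical values chosen for easy arithmetic, not measurements from this experiment, and the function and key names are illustrative.

```python
def input_response(before, after):
    """Return the change in pH and in the EC/moisture ratio caused by an input.

    `before` and `after` are dicts with the keys "moisture", "ec", and "ph".
    """
    ratio_before = before["ec"] / before["moisture"]
    ratio_after = after["ec"] / after["moisture"]
    return {
        "delta_ph": after["ph"] - before["ph"],
        "delta_ec_per_moisture": ratio_after - ratio_before,
    }

# Hypothetical readings for one application, not measured values.
before = {"moisture": 20.0, "ec": 10.0, "ph": 7.0}
after = {"moisture": 25.0, "ec": 50.0, "ph": 6.5}
print(input_response(before, after))
# {'delta_ph': -0.5, 'delta_ec_per_moisture': 1.5}
```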

Figure 9: Changes in EC/moisture before and after input.

5.2. Modeling and Analysis

The experimental data used to train and test the agricultural input prediction models were collected by the sensors. Each data sample contains the input product category, the moisture proportion (before input), conductivity (before input), pH value (before input), moisture proportion (after input), conductivity (after input), and pH value (after input). In establishing the model, the leave-one-out method [27] was used for cross-validation to test the model’s performance. Each of the 900 samples was held out in turn, and the remaining 899 samples were used to build the model. The model was then tested on the held-out sample, and the results were averaged to obtain the average performance of the method.
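The leave-one-out procedure can be sketched in Python as follows. The loop is generic; the nearest-centroid classifier is only a stand-in used to make the example run and is not the DBN-SOFTMAX model, and the toy data and function names are hypothetical.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        hits += predict(model, samples[i]) == labels[i]
    return hits / len(samples)

# Stand-in classifier (nearest class centroid), used only for illustration.
def fit_centroids(xs, ys):
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict_centroid(centroids, x):
    def squared_distance(c):
        return sum((p - q) ** 2 for p, q in zip(x, c))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Toy data: two well-separated classes.
samples = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [1.0, 0.9], [0.9, 1.0], [1.1, 1.0]]
labels = [0, 0, 0, 1, 1, 1]
print(leave_one_out_accuracy(samples, labels, fit_centroids, predict_centroid))  # 1.0
```

With 900 samples the model is built 900 times, which is affordable for a data set of this size but grows linearly with the number of samples.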

When the DBN was used for feature extraction, a four-layer neural network was established with 300, 100, 20, and 6 neurons in the respective layers. The activation function of the hidden layers was “logsig” (the logistic sigmoid), the training method was the L-BFGS algorithm [28], and the output layer function was a softmax function. In DBN feature extraction, each RBM layer was trained separately by unsupervised learning, ensuring that the feature information was preserved as much as possible during the mapping process. Because each RBM layer only adjusted the weights of its own layer, this did not guarantee that the feature mapping of the entire DBN was optimal. Therefore, after the layer-wise training was completed, a classifier was set at the last level of the DBN, and supervised fine-tuning was used to obtain the best training results. In this view, RBM training can be regarded as the initialization of the weight parameters of a deep neural network, which enables the DBN to overcome the disadvantages of the traditional BP network with randomly initialized weights, namely, a tendency to fall into local optima and a long training time.

During the training, the number of iterations was 400, the learning rate was 0.1, and the training error target was set to 0.001. The feature data extracted after training are shown in Table 1.

Table 1: The characteristic data table.

Through the unsupervised training and nonlinear mappings of the DBN, features were obtained from the input data (pH, moisture, and conductivity). After feature extraction, the cohesion within each type of agricultural input and the separation between different types were more evident. After dimensionality reduction by principal component analysis (PCA) [29], the three-dimensional distributions of the feature values and of the original values are shown in Figures 10 and 11, respectively. The feature values extracted by the DBN were separable, whereas the original input values without feature extraction were scattered and partly mixed together. Therefore, it was evident that using the extracted feature values for classification could achieve better prediction results.

Figure 10: Three-dimensional distribution of feature values.
Figure 11: Three-dimensional distribution of original values.

We used this model to predict the agricultural inputs. First, unsupervised training on the raw data was carried out with the RBM-based DBN model to improve the robustness of the network. Second, the feature data were obtained and a softmax classifier was appended to the DBN, with the feature data as its input and the input categories as its output. Third, the feature data and the labeled samples were combined to fine-tune the softmax classifier. Finally, the established model was used for prediction and its accuracy was evaluated. The results are shown in Figure 12. Of the 900 samples predicted, thirteen were predicted wrongly, giving a model accuracy of 98.5%.

Figure 12: The predicted results of DBN-SOFTMAX for test sets. In the ordinate, 1: potassium fertilizer; 2: compound fertilizer; 3: imidacloprid; 4: Podol liquid; 5: oxamoxime; and 6: phosphate fertilizer.

To evaluate the performance of the DBN-softmax model, a BP neural network and a DBN-BP model were also established, and their prediction accuracies for the input products were compared.

As shown in Table 2, the accuracy of DBN-BP was higher than that of the BP neural network, because the DBN adopted unsupervised layer-wise training [16]. The weights obtained by the DBN were learned from the structure of the input data and were close to the global optimum. In contrast, the BP neural network, whose initial values were set randomly, was prone to problems such as local optima and vanishing gradients, so its parameters required manual adjustment [30]. DBN-BP and DBN-SOFTMAX used the same DBN structure to extract features and differed only in the classifier, and their prediction accuracy was the same. This indicated that the prediction accuracy depended mainly on the quality of feature extraction and that the choice of classifier made little difference.

Table 2: The forecast accuracy comparison table.

The coefficient of determination (R-square) and the root mean square error (RMSE) were further examined to test model performance. The DBN model was compared with conventional modeling methods, namely, BP-NN and the support vector machine (SVM) [31]. For the SVM model, the nu-SVM algorithm was selected with the radial basis function (RBF) as the kernel function. The error penalty coefficient γ and the kernel function parameter nu were 0.255 and 1, respectively, as determined by a grid search [32]. In the modeling process, the calibration sets were used to build the model, and the leave-one-out method was used for cross-validation to further test the robustness and adaptability of the model.
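The two metrics can be computed as in the following Python sketch. The text does not state which definition of R-square was used; the sketch assumes the common form $R^2 = 1 - SS_{res} / SS_{tot}$ applied to numeric class codes, and the sample values and function names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical class codes (1 to 6) with one misclassified sample.
actual = [1, 2, 3, 4, 5, 6]
predicted = [1, 2, 3, 4, 5, 5]
print(round(rmse(actual, predicted), 4), round(r_squared(actual, predicted), 4))
# 0.4082 0.9429
```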

As shown in Table 3, the BP-NN model performed the same as SVM on the calibration sets. However, its cross-validation coefficient of determination was smaller and its root mean square error larger than those of SVM, revealing that the BP-NN model was not as accurate and stable as the SVM model. For the DBN model, after feature extraction, the coefficients of determination for the calibration sets and cross-validation were the largest among the three models, both reaching 0.99. Meanwhile, the RMSEs of the calibration sets and cross-validation for the DBN model were the smallest, at 0.03 and 0.15, respectively. Overall, compared with SVM and BP-NN, DBN was the best-performing modeling method.

Table 3: DBN model performance comparison with BP-NN and SVM.

6. Conclusions

Based on the self-developed monitoring equipment and the DBN-SOFTMAX model, we have developed a platform for intelligent, online, real-time monitoring of agricultural inputs on farms. When agricultural inputs are applied on a farm, the detected types of inputs and application times can be compared with the data entered by the administrators in the traceability system. If the producers fail to record the traceability information or enter wrong information, our system captures the related data in a timely and accurate manner and automatically issues a safety warning to the producers, to ensure that the traceability information is true and accurate. The intelligent monitoring platform paves a new way for the development of traceability systems.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Authors’ Contributions

Ling Yang, Ting Wu, and Li Lin conceived and designed the experiments; Ling Yang performed the experiments; Ting Wu, Juan Zhou, Xu Can Cai, and V Sarath Babu analyzed the data; Ling Yang, Ting Wu, and Li Lin wrote and finalized the manuscript.


Acknowledgments

This study was jointly supported by the National Natural Science Fund (61501531); fund for Science and Technology from Guangdong Province (2015A020209173, 2017A020225007); fund from Guangzhou Science and Technology Bureau (201704020030, 201803020033); and “Innovation and Strong Universities” special funds (KA170500G) from the Department of Education of Guangdong Province. V. Sarath Babu was supported by the Chinese Postdoctoral Science Foundation.


References

  1. A. Regattieri, M. Gamberi, and R. Manzini, “Traceability of food products: general framework and experimental evidence,” Journal of Food Engineering, vol. 81, no. 2, pp. 347–356, 2007.
  2. T. Bosona and G. Gebresenbet, “Food traceability as an integral part of logistics management in food and agricultural supply chain,” Food Control, vol. 33, no. 1, pp. 32–48, 2013.
  3. H. Y. Deng, W. Xie, Z. Q. Zhou, B. Li, and Q. T. Jiang, “Determination of 11 benzoylurea insecticides residues in vegetable by LC-MS/MS,” Journal of Instrumental Analysis, vol. 28, no. 8, pp. 970–974, 2009.
  4. J. H. Zheng, G. F. Pang, C. L. Fan, and M. L. Wang, “Simultaneous determination of 128 pesticide residues in milk by liquid chromatography-tandem electrospray mass spectrometry,” Chinese Journal of Chromatography, vol. 27, no. 3, pp. 254–263, 2009.
  5. M. Y. Selisker, D. P. Herzog, R. D. Erber, J. R. Fleeker, and J. Itak, “Determination of Paraquat in fruits and vegetables by a magnetic particle based enzyme-linked immunosorbent assay,” Journal of Agricultural and Food Chemistry, vol. 43, no. 2, pp. 544–547, 1995.
  6. M. J. Alcocer, P. P. Dillon, B. M. Manning et al., “Use of phosphonic acid as a generic hapten in the production of broad specificity anti-organophosphate pesticide antibody,” Journal of Agricultural and Food Chemistry, vol. 48, no. 6, pp. 2228–2233, 2000.
  7. S. Kumaran and C. Tran-Minh, “Insecticide determination with enzyme electrodes using different enzyme immobilization techniques,” Electroanalysis, vol. 4, no. 10, pp. 949–954, 2010.
  8. S. H. Chough, A. Mulchandani, P. Mulchandani, W. Chen, J. Wang, and K. R. Rogers, “Organophosphorus hydrolase–based amperometric sensor: modulation of sensitivity and substrate selectivity,” Electroanalysis, vol. 14, no. 4, pp. 273–276, 2015.
  9. J. Burrell, T. Brooke, and R. Beckwith, “Vineyard computing: sensor networks in agricultural production,” IEEE Pervasive Computing, vol. 3, no. 1, pp. 38–45, 2004.
  10. M. Srbinovska, C. Gavrovski, V. Dimcev, A. Krkoleva, and V. Borozan, “Environmental parameters monitoring in precision agriculture using wireless sensor networks,” Journal of Cleaner Production, vol. 88, pp. 297–307, 2015.
  11. D. D. Chaudhary, S. P. Nayse, and L. M. Waghmare, “Application of wireless sensor networks for greenhouse parameter control in precision agriculture,” International Journal of Wireless & Mobile Networks, vol. 3, no. 1, pp. 140–149, 2011.
  12. H. J. Jia and Z. H. Guo, “Research on the technology of rs485 over ethernet,” in 2010 International Conference on E-Product E-Service and E-Entertainment, pp. 1–3, Henan, China, November 2010.
  13. M. Aref and A. Sikora, “Free space range measurements with Semtech Lora™ Technology,” in 2014 2nd International Symposium on Wireless Systems within the Conferences on Intelligent Data Acquisition and Advanced Computing Systems, pp. 19–23, Offenburg, Germany, September 2014.
  14. N. Azmi, S. Sudin, L. M. Kamarudin et al., “Design and development of multi-transceiver Lorafi board consisting LoRa and ESP8266-Wifi communication module,” IOP Conference Series: Materials Science and Engineering, vol. 318, article 012051, 2018.
  15. G. E. Hinton, “A practical guide to training restricted Boltzmann machines,” in Neural Networks: Tricks of the Trade, G. Montavon, G. B. Orr, and K. R. Müller, Eds., vol. 7700 of Lecture Notes in Computer Science, pp. 599–619, Springer, Berlin Heidelberg, 2012.
  16. Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.
  17. B. Schölkopf, J. Platt, and T. Hofmann, “In Greedy layer-wise training of deep networks,” International Conference on Neural Information Processing Systems, pp. 153–160, 2006.
  18. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
  19. N. Li, J. Shi, and M. Gong, “Change detection in synthetic aperture radar images based on fuzzy restricted Boltzmann machine,” in Bio-inspired Computing – Theories and Applications. BIC-TA 2016, M. Gong, L. Pan, T. Song, and G. Zhang, Eds., vol. 681 of Communications in Computer and Information Science, pp. 438–444, Springer, Singapore, 2016.
  20. G. V. Tulder and M. D. Bruijne, “Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted Boltzmann machines,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1262–1272, 2016.
  21. P. M. Long and R. A. Servedio, “Random classification noise defeats all convex potential boosters,” Machine Learning, vol. 78, no. 3, pp. 287–304, 2010.
  22. L. Chen, M. Zhou, W. Su, M. Wu, J. She, and K. Hirota, “Softmax regression based deep sparse autoencoder network for facial emotion recognition in human-robot interaction,” Information Sciences, vol. 428, pp. 49–61, 2018.
  23. K. Duan, S. S. Keerthi, W. Chu, S. K. Shevade, and A. N. Poo, “Multi-category classification by soft-max combination of binary classifiers,” in Multiple Classifier Systems. MCS 2003, T. Windeatt and F. Roli, Eds., vol. 2709 of Lecture Notes in Computer Science, pp. 125–134, Springer, Berlin, Heidelberg, 2003.
  24. B. Liao, J. Xu, J. Lv, and S. Zhou, “An image retrieval method for binary images based on DBN and Softmax classifier,” IETE Technical Review, vol. 32, no. 4, pp. 294–303, 2015.
  25. A. Nagabandi, G. Kahn, R. S. Fearing, and S. Levine, “Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 7559–7566, Brisbane, Australia, May 2018.
  26. J. Yang, Y. Bai, F. Lin, M. Liu, Z. Hou, and X. Liu, “A novel electrocardiogram arrhythmia classification method based on stacked sparse auto-encoders and softmax regression,” International Journal of Machine Learning and Cybernetics, vol. 9, no. 10, pp. 1733–1740, 2017.
  27. M. Kearns and D. Ron, “Algorithmic stability and sanity-check bounds for leave-one-out cross-validation,” Neural Computation, vol. 11, no. 6, pp. 1427–1453, 1999.
  28. D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming, vol. 45, no. 1–3, pp. 503–528, 1989.
  29. S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1–3, pp. 37–52, 1987.
  30. B. Yang, X. H. Su, and Y. D. Wang, “Bp neural network optimization based on an improved genetic algorithm,” in Proceedings. International Conference on Machine Learning and Cybernetics, vol. 61, pp. 64–68, Beijing, China, November 2002.
  31. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  32. L. I. Bing, Q. Z. Yao, Z. M. Luo, and Y. Tian, “Gird-pattern method for model selection of support vector machines,” Computer Engineering & Applications, vol. 44, no. 15, pp. 136–138, 2008.