Abstract

A home-auxiliary robot system based on characteristics of the electrooculogram (EOG) and tongue signals is developed in this study to provide daily-life assistance for people with physical mobility disabilities. It relies on five simple actions of the human head (blinking twice in a row, tongue extension, upward tongue rolling, and left and right eye movements) to perform the motions of a mouse on the system screen (moving up/down/left/right and double-click). In this paper, the brain network and backpropagation (BP) neural network algorithms are used to identify these five types of actions. The results show that, across all subjects, the average recognition rates of eye blinks and the two tongue movements (tongue extension and upward tongue rolling) were 90.17%, 88.00%, and 89.83%, respectively, and that, after training, subjects could complete the five types of movements in sequence within 12 seconds. This means that people with physical disabilities can use the system to quickly and accurately perform daily self-help tasks, which brings great convenience to their lives.

1. Introduction

The loss of a person’s limb function brings many troubles to his/her life [1, 2]. Take elderly people with physical disability as an example: as they grow older, limb movement becomes more and more difficult, which brings inconvenience to their daily life. To solve this problem, much research has been carried out on using human physiological electrical signals to control auxiliary equipment that assists human life [3–6]. Šumak et al. successfully used eye winks, eyebrow motion, clenching of teeth, and smirking to control different keyboard functions [7]. Fernandez-Fraga et al. used different hand movements to achieve effective control of cursor movement up, down, left, and right on the screen [8]. Vinoj et al. used human motor imagery combined with visual stimulation to identify basic human motion characteristics such as sitting, standing, moving forward, turning right, and turning left; these actions could be used as instructions to control the movement of exoskeletons, thus enabling exoskeletons to assist normal human movement [9]. Zhang et al. and Kong et al. applied tongue electrical signal features to the rehabilitation of paralyzed patients [10, 11]. Sahadat et al. successfully applied tongue electrical signal features to complete four computer access tasks without using the hands [12]. Additionally, there is much research on controlling external auxiliary equipment to serve human beings based on brain-computer interface (BCI) technology. The BCI is a technology that directly reads brain information and identifies human intentions through modern mathematical algorithms, which makes it possible to use human brain signals directly to control external devices that serve human beings [3–6].

For human motor imagery signals, the motion characteristics based on two paradigms, the event-related potential (ERP) [13–16] and the steady-state visual evoked potential (SSVEP) [17, 18], are relatively distinct, so both are widely used in BCI research. Many research teams have studied applications of BCI technology, for example, in assistive exoskeletons [19], flying robots [20, 21], navigation control of humanoid robots [22–29], robotic wheelchairs [20, 30, 31], and wheeled robots [32–34].

Research has shown that many methods can be applied to recognize human brain motion information, and the brain network algorithm is one of them. Research by Stam and Reijneveld showed that modern complex network theory is widely used to model human brain function [35]. Other studies have shown that brain network connectivity can effectively express brain function and related neural activities [36–40]. In this paper, the authors use brain network features to recognize eye movements.

The eye movement signal, which contains important information about thinking, is very easy to observe. Niu used eye movement signals to distinguish pilots’ different cognitive levels of driving [41]. The research conducted by Brookings and Wilson shows that EOG signals can be used to estimate the cognitive demands of different tasks [42, 43]. Fuwang et al. used eye movements to control the left and right movements of the cursor on a computer screen [44].

In this paper, we found that when using Emotiv equipment to collect signals, human eye movement and tongue movement signals can be easily detected in the time domain. So we used the five kinds of human head movements (blinking twice in a row, tongue extension, upward tongue rolling, and left and right eye movements) to control the home-auxiliary robot system and used this system for taking articles for daily use. The final experimental results show that this kind of household auxiliary robot system using simple signals from the human head can realize efficient control after simple training.

2. Experiment

2.1. Subjects

In our research, we randomly selected 12 subjects (6 males and 6 females; aged 28 ± 1.6 (SD) years) from volunteers to participate in the experiment. To participate, they were required to have no history of neurological disease or visual illness. In addition, the subjects were prohibited from consuming any type of stimulating drink (such as coffee, tea, or alcohol) for at least 48 hours before the experiment. In accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki), the Ethics Committee at the Northeast Electric Power University Hospital approved the study protocol.

2.2. Experimental Process
2.2.1. The EEG Acquisition Device

In the experiment, we used Emotiv as the EEG acquisition device. The electrodes of the device were attached to the scalp according to the international 10-20 system (14 channels = AF3, AF4, F3, F4, FC5, FC6, F7, F8, T7, T8, P7, P8, O1, and O2). In addition, the EEG acquisition device (Emotiv), which is convenient to carry, is widely used to collect EEG signals online.

Research by Ji et al. shows that AF3-AF4 (left ahead frontal-right ahead frontal brain areas) had a higher spectral correlation in successful putts than in unsuccessful putts [45]. Additionally, when we use EEG equipment such as Emotiv and Neuroscan to collect signals, the tongue signals are easily detected in the time domain. Thus, AF3 and AF4 signals were used to find tongue and eye movements in this study.

2.2.2. The Home-Auxiliary Robot System

Figure 1 shows the block diagram of the home-auxiliary robot system. The system mainly consists of a warehouse, a 6-DOF manipulator, a wireless carrying trolley, a human-computer interaction system, a signal acquisition unit (Emotiv), and two communication modules (TC35 and NRF905). It should be pointed out that the subjects operate the system only with their own EOG and tongue electrical signals. The subjects’ EOG signals included blinking twice in a row and eye movements to the left and right, which performed, respectively, double-click confirmation of the cursor and cursor movements to the left and right. The tongue electrical signals included tongue extension and upward tongue rolling, which performed, respectively, upward and downward movements of the cursor.

In the experiment, the subjects focused on the “cursor waiting area” of the human-computer interaction interface and controlled the cursor movement using EOG or tongue electrical signals. Take grabbing bottled water as an example: the subject quickly moved his/her eyes from the “cursor waiting area” to the “drink area,” gazed at that area for 1 second, and then blinked twice in a row to confirm the selected drink. After these operations were completed, the system executed the bottled water-grabbing operations in turn according to preprogrammed instructions. First, the bottled water in the vertical warehouse was selected and sent to the wireless carrying trolley waiting at the exit of the warehouse. Then, the wireless trolley moved the bottled water to the position of the manipulator for the necessary handling. Finally, the wireless trolley transported the bottled water to the subject’s location.

In our experiment, among the five recognized actions (blinking twice in a row, tongue extension, upward tongue rolling, and left and right eye movements), only blinking twice in a row may be interfered with by unintended blinks. Therefore, when the subject completes an eye movement or a tongue movement, the cursor moves from the initial position (cursor waiting area) to an instruction area (food/drink/daily medicine/emergency call); the subject then blinks twice in a row to complete the double-click confirmation instruction. The entire execution process must be completed within 3 seconds; otherwise, the cursor returns to the initial position (cursor waiting area). This effectively reduces the probability of incorrectly executed confirmation instructions caused by unintended blinks. Additionally, we analyzed the characteristics of the five types of motion signals (blinking twice in a row, tongue extension, upward tongue rolling, and left and right eye movements) in the time domain. These five signal characteristics are very obvious in the time domain and are quite different from other muscle artifact signals, so the noise caused by muscle artifacts was not studied in this experiment. To overcome the problem of electrodes drying during experiments, we re-applied conductive liquid (normal saline) to the electrodes every other hour.
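The 3-second confirmation rule described above can be sketched as a small state machine. The class and method names below are illustrative only, not taken from the system's actual implementation:

```python
import time


class CursorController:
    """Sketch of the confirmation rule: after the cursor moves to an
    instruction area, a double blink must arrive within TIMEOUT seconds,
    otherwise the cursor resets to the waiting area and the blink is
    ignored. All names here are hypothetical."""

    TIMEOUT = 3.0  # seconds allowed between selection and confirmation

    def __init__(self):
        self.position = "waiting_area"
        self.moved_at = None

    def move(self, area):
        # An eye or tongue movement pushed the cursor to an instruction area.
        self.position = area
        self.moved_at = time.monotonic()

    def double_blink(self):
        # A double blink confirms the selection only inside the time window.
        if self.position == "waiting_area":
            return None  # nothing selected; unintended blink is harmless
        if time.monotonic() - self.moved_at > self.TIMEOUT:
            self.position = "waiting_area"  # timed out: reset, ignore blink
            return None
        selected, self.position = self.position, "waiting_area"
        return selected


ctl = CursorController()
ctl.move("drink_area")
print(ctl.double_blink())  # confirmed within 3 s, so "drink_area"
```

Because an unintended double blink while the cursor sits in the waiting area selects nothing, this design confines false confirmations to the short window after a deliberate movement.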

During the operation of the system, the communication between the wireless trolley and the upper computer is through the NRF905 module, and the emergency call for help is through the TC35 module. Additionally, during the operation of the wireless trolley, it runs along a fixed track, which enables it to accurately reach the manipulator position besides the track. Figure 1 shows the home-auxiliary robot system.

2.3. Data Preprocessing

In the experiment, the collected human physiological signals were easily disturbed by noise, especially the EEG and EOG, so the source signals had to be denoised first. Wavelet packet decomposition (WPD) can separate human biological signals from the source signals and remove high-frequency noise and artifacts. From the decomposed original signal, we retained the θ (4–8 Hz) subband, which was used to analyze the characteristics of brain nerve activity during motor imagery.
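As an illustration of this preprocessing step, the following pure-NumPy sketch performs a level-4 wavelet packet decomposition with Haar filters and follows the a→a→a→d node path, which nominally isolates the 4–8 Hz band at a 128 Hz sampling rate. The Haar basis and the 128 Hz rate are assumptions for illustration; the paper does not state which wavelet or sampling rate was used:

```python
import numpy as np


def haar_step(x):
    # One level of Haar analysis: approximation (low-pass) and
    # detail (high-pass) coefficients, each half the input length.
    x = x[: len(x) // 2 * 2]
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d


def wpd_theta(signal):
    # Follow the packet-tree path a -> a -> a -> d; at fs = 128 Hz the
    # nominal band of this node is roughly 4-8 Hz (theta).
    node = signal
    for step in "aaad":
        a, d = haar_step(node)
        node = a if step == "a" else d
    return node


fs = 128
t = np.arange(0, 4, 1 / fs)
# 6 Hz theta component plus 30 Hz interference
sig = np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 30 * t)
theta = wpd_theta(sig)
print(len(theta))  # 512 samples / 2**4 = 32 coefficients
```

In practice a smoother wavelet (e.g. a Daubechies family member) would give better band separation than Haar; this sketch only shows the mechanics of descending the packet tree.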

3. Algorithm

3.1. Brain Network

Previous studies have shown that when the human brain processes complex information, many different cortical and subcortical regions of the brain are activated [46]. In this experiment, the brain network characteristics of the subjects are used to express the differences in neural activities between them. When constructing a brain network, each major functional area of the brain is regarded as a network node, and the connections between different nodes are called edges. A complete brain network consists of nodes and edges. In this study, we chose two important network parameters (clustering coefficient and global efficiency) to analyze the characteristics of the brain network. Details of these two parameters are as follows.

3.1.1. Clustering Coefficient

The degree of connectivity of a node in a network expresses the importance of the node in the network, which can be quantified by the number of edges connected to the node. In order to represent the connectivity characteristics of the nodes of the entire network, the clustering coefficient C is adopted, which can be expressed as the ratio of the number of existing edges to the maximum possible number of edges [47, 48]:

C_i = \frac{E_i}{D_i(D_i - 1)/2},  (1)

where E_i represents the number of existing edges between the neighbors of node i and D_i represents the degree of connectivity of node i. D_i(D_i − 1)/2 represents the maximum possible number of edges between the neighbors of node i [48].
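The clustering coefficient can be computed directly from a binary adjacency matrix. The sketch below is a minimal illustration, not the study's code:

```python
import numpy as np


def clustering_coefficient(A):
    # A: symmetric binary adjacency matrix with zero diagonal.
    # Returns C_i = E_i / (D_i * (D_i - 1) / 2) for every node i.
    N = A.shape[0]
    C = np.zeros(N)
    for i in range(N):
        neighbors = np.flatnonzero(A[i])
        D = len(neighbors)
        if D < 2:
            continue  # fewer than two neighbors: coefficient is 0
        # Edges actually present among the neighbors of node i
        # (the submatrix double-counts each undirected edge).
        E = A[np.ix_(neighbors, neighbors)].sum() / 2
        C[i] = E / (D * (D - 1) / 2)
    return C


# Toy network: triangle 0-1-2 plus a pendant node 3 attached to node 0
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
print(clustering_coefficient(A))  # [0.333..., 1.0, 1.0, 0.0]
```

Node 0 has three neighbors but only one of the three possible edges among them exists, giving 1/3; nodes 1 and 2 sit in a closed triangle, giving 1.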

3.1.2. Global Efficiency

The degree of integration of a network can be expressed by the network parameter G, which represents the speed with which the human brain processes information. The path length L_{i,j} between node i and node j, which is inversely proportional to the nodal efficiency, is the minimum number of edges connecting them. The characteristic path length is mathematically defined as [47, 48]

L = \frac{1}{N(N-1)} \sum_{i \neq j} L_{i,j},  (2)

where N is the number of nodes in the network. The global efficiency G, which can be estimated as the average of the nodal efficiencies of the nodes in the network, can be defined by

G = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{L_{i,j}}.  (3)

From equation (3), it can be concluded that a network with a short minimum path length between any pair of regional nodes has high global efficiency [49, 50]. Combined with equation (1), it can be concluded that the larger G value and C value mean the faster information transfer between one node and other nodes.
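A minimal sketch of the global efficiency of equation (3): shortest path lengths are found by breadth-first search on the binary network, and their inverses are averaged (unreachable pairs contribute zero):

```python
import numpy as np


def global_efficiency(A):
    # A: symmetric binary adjacency matrix. Compute shortest path
    # lengths from every source by BFS, then average 1 / L_{i,j}.
    N = A.shape[0]
    inv_sum = 0.0
    for s in range(N):
        dist = np.full(N, np.inf)
        dist[s] = 0
        frontier = [s]
        d = 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.flatnonzero(A[u]):
                    if dist[v] == np.inf:
                        dist[v] = d
                        nxt.append(v)
            frontier = nxt
        inv_sum += sum(1.0 / dist[t] for t in range(N)
                       if t != s and dist[t] < np.inf)
    return inv_sum / (N * (N - 1))


# Fully connected triangle: every pair at distance 1, so G = 1
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
print(global_efficiency(A))  # 1.0
```

Removing an edge lengthens some shortest paths and lowers G, which is exactly the "denser network transfers information faster" intuition used in the text.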

The relationship between main brain regions was determined by the synchronization likelihood (SL), and the algorithm is described as follows.

Assume a time series x_{k,i} (k = 1, …, M; i = 1, …, N), where k indexes channels and i indexes samples. With embedding dimension m and lag l, the embedded vectors can be expressed as

X_{k,i} = (x_{k,i}, x_{k,i+l}, x_{k,i+2l}, \ldots, x_{k,i+(m-1)l}).  (4)

The probability that the distance between pairs of embedded vectors is less than ε is

P_{k,i}^{\varepsilon} = \frac{1}{2(\omega_1 - \omega_2)^{-1}}? — stated precisely,

P_{k,i}^{\varepsilon} = \frac{1}{2(\omega_2 - \omega_1)} \sum_{\omega_1 < |j-i| < \omega_2} \theta(\varepsilon - |X_{k,i} - X_{k,j}|),  (5)

where θ is the Heaviside step function, |·| denotes the Euclidean distance, and ω_1 and ω_2, which satisfy ω_1 ≪ ω_2 ≪ N, are two window variables.

For each k and each i, a critical distance ε_{k,i} is determined by

P_{k,i}^{\varepsilon_{k,i}} = P_{\mathrm{ref}},  (6)

where P_{\mathrm{ref}} ≪ 1. The number of channels in which the distance between vectors X_{k,i} and X_{k,j} is less than the critical distance is

H_{i,j} = \sum_{k=1}^{M} \theta(\varepsilon_{k,i} - |X_{k,i} - X_{k,j}|),  (7)

where ω_1 < |j − i| < ω_2. The synchronization likelihood contribution of channel k for the vector pair (i, j) can be expressed as

S_{k,i,j} = \frac{H_{i,j} - 1}{M - 1} \text{ if } |X_{k,i} - X_{k,j}| < \varepsilon_{k,i}, \quad S_{k,i,j} = 0 \text{ otherwise}.  (8)

The average over all j values can be calculated by the following formula:

S_{k,i} = \frac{1}{2(\omega_2 - \omega_1)} \sum_{\omega_1 < |j-i| < \omega_2} S_{k,i,j},  (9)

where ω_1 < |j − i| < ω_2.
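The SL computation described above can be sketched for a single channel pair as follows. The embedding and window parameters below (m, lag, ω1, ω2, P_ref) are illustrative placeholders, not the values used in the study, and the critical distance is found via a distance quantile rather than an explicit root search:

```python
import numpy as np


def embed(x, m, lag):
    # Time-delay embedding: row i is (x_i, x_{i+lag}, ..., x_{i+(m-1)lag}).
    n = len(x) - (m - 1) * lag
    return np.stack([x[i: i + n] for i in range(0, m * lag, lag)], axis=1)


def synchronization_likelihood(x, y, m=3, lag=2, w1=10, w2=100, p_ref=0.05):
    X, Y = embed(x, m, lag), embed(y, m, lag)
    n = len(X)
    sl = []
    for i in range(n):
        j = np.arange(n)
        mask = (np.abs(j - i) > w1) & (np.abs(j - i) < w2)
        if not mask.any():
            continue
        dx = np.linalg.norm(X[mask] - X[i], axis=1)
        dy = np.linalg.norm(Y[mask] - Y[i], axis=1)
        # Critical distance per channel: the p_ref quantile of distances,
        # so roughly a fraction p_ref of vectors count as recurrences.
        ex, ey = np.quantile(dx, p_ref), np.quantile(dy, p_ref)
        hit_x = dx < ex
        if hit_x.sum() == 0:
            continue
        # Fraction of x-recurrences that are simultaneously y-recurrences
        sl.append(float((hit_x & (dy < ey)).sum() / hit_x.sum()))
    return float(np.mean(sl))


rng = np.random.default_rng(0)
t = np.arange(600)
shared = np.sin(0.1 * t)
a = shared + 0.05 * rng.standard_normal(600)
b = shared + 0.05 * rng.standard_normal(600)  # synchronized with a
c = rng.standard_normal(600)                  # independent of a
print(synchronization_likelihood(a, b),
      synchronization_likelihood(a, c))  # first value markedly higher
```

For independent signals the coincidence rate stays near P_ref, while for synchronized signals it approaches 1, which is what makes SL a usable edge weight for the brain network.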

The relationship between the EEG signals of each pair of the 14 channels (AF3, AF4, F3, F4, FC5, FC6, F7, F8, T7, T8, P7, P8, O1, and O2) can be calculated by equation (9). After calculating the synchronization likelihood (SL) value between each pair of the 14 channels, we need to select a reasonable threshold T to construct the brain network. The threshold T is determined as follows.

The SL value ranges from P_ref to 1, where P_ref is close to 0. In order to make the contrast between different human brain motion characteristics obvious, it is necessary to determine a reasonable threshold for constructing the brain network. For the subjects, the SL values between their respective brain electrodes fell in the range 0.01 to 0.51. We selected different T values in increments of 0.025 and then established the corresponding brain networks to analyze the differences in the networks when a human performs motor imagery. Figure 2 shows the difference in brain network parameters (C and G) when different T values are chosen.

From Figure 2, we can conclude that there is a significant difference in C between the two different brain hemispheres when T is chosen in the range 0.28 < T < 0.48. At the same time, this obvious difference in G also exists between the two different brain hemispheres when T is chosen in the range 0.29 < T < 0.43. In our study, we chose the mean value of T (T = 0.36) for the correlation calculation.
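The threshold sweep can be sketched as follows. For brevity this toy version reports only the edge density of the thresholded network at each T, whereas the actual analysis computed C and G; the SL matrix here is random and purely illustrative:

```python
import numpy as np


def network_density(sl_matrix, T):
    # Binarize the SL matrix at threshold T (edge iff SL > T) and
    # report the fraction of possible edges that survive.
    A = (sl_matrix > T).astype(int)
    np.fill_diagonal(A, 0)
    N = A.shape[0]
    return A.sum() / (N * (N - 1))


rng = np.random.default_rng(1)
S = rng.uniform(0.01, 0.51, size=(14, 14))
S = (S + S.T) / 2  # symmetric, like a matrix of pairwise SL values

for T in np.arange(0.01, 0.51, 0.025):
    print(f"T={T:.3f}  density={network_density(S, T):.2f}")
```

As T rises the network thins out monotonically; the study's choice of the midpoint T = 0.36 sits inside the range where the hemispheric contrast in C and G was significant.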

3.2. BP Neural Network

The BP (backpropagation) neural network algorithm [51], which has a wide range of applications, such as physiology, psychology, anatomy, and brain science, was used to recognize human blink signals and tongue electrical signals in this paper.

In this study, a three-layer BP neural network was used to analyze the characteristics of human eye and tongue electrical signals. The number of hidden layer nodes can be determined by the following empirical formula:

n_1 = \sqrt{n + m} + a,  (10)

in which n, n_1, and m represent the unit numbers of the input layer, hidden layer, and output layer of the BP neural network, respectively, and a is a natural number in the range [1, 10]. In our study, the values of n and m are 5 and 3, respectively, so equation (10) yields the interval [3.83, 12.83], and n_1 should be a natural number in the range [4, 12]. The BP network was then trained, and the network training errors are shown in Table 1.
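The candidate range for n1 follows directly from formula (10). A quick check, assuming n1 must be a natural number inside the interval [sqrt(n+m)+1, sqrt(n+m)+10]:

```python
import math

n, m = 5, 3  # input and output unit counts used in the study

# Natural numbers inside [sqrt(n+m)+1, sqrt(n+m)+10] = [3.83, 12.83]
lo = math.ceil(math.sqrt(n + m) + 1)
hi = math.floor(math.sqrt(n + m) + 10)
print(lo, hi)  # 4 12
print(list(range(lo, hi + 1)))  # the hidden-layer sizes to try
```

Each size in this range is then trained and the one with the smallest training error (9, per Table 1) is kept.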

From Table 1, it can be concluded that when the number of hidden layer nodes is 9, the training error of the network is the smallest, which means that the output of the BP network is closer to the expected value. The BP neural network is shown in Figure 3.

3.3. Correlation Coefficient

In statistics, the Pearson correlation coefficient is a method of measuring the relationship between two variables. In our study, this method is used to analyze the relationship between the EEG signals of any two channels. The correlation coefficient is calculated by the following formula:

r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N \sigma_x \sigma_y},  (11)

where \bar{x} and \bar{y} are the series means and σ_x and σ_y are the standard deviations.
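A minimal implementation of the Pearson coefficient, applied to two mirrored waveforms of the kind observed in channels AF3 and AF4 during eye movements; the signal shapes here are synthetic stand-ins:

```python
import numpy as np


def pearson(x, y):
    # Pearson correlation: covariance of the centered series divided by
    # the product of their standard deviations.
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))


t = np.linspace(0, 1, 200)
af3 = np.sin(2 * np.pi * 3 * t)  # stand-in for the AF3 waveform
af4 = -af3 + 0.01 * np.random.default_rng(0).standard_normal(200)  # mirror
r = pearson(af3, af4)
print(r < -0.85)  # mirrored waveforms: strong negative correlation
```

A threshold such as r < −0.85 then cleanly separates the mirrored (eye movement) case from uncorrelated background activity.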

4. Results

In this study, five simple motion characteristic signals (blinking twice in a row, tongue extension, upward tongue rolling, and left and right eye movements) of human beings were analyzed. For left and right eye movement feature recognition, we used brain network features and correlation coefficient algorithm for analysis. For tongue extension, upward tongue rolling, and blinking motion, we used the BP neural network algorithm for identification and analysis.

4.1. EOG
4.1.1. Eye Movement

In the experiment, we found that the left and right eye movement signals of subjects fluctuated obviously in the time domain, especially in channels AF3 and AF4, and the eye movement waveforms in the two channels show negative correlation. The eye movement signals in the two channels are as shown in Figure 4.

With the disappearance of the eye movement wave signals in the AF3 and AF4 channels, the difference between the left and right hemispheres in the brain topographic map gradually becomes significant, which indicates that there is a difference in neural activity between the two hemispheres during left and right eye movements. In the brain topography, low activity is indicated by blue-shaded areas, whereas high activity is indicated by red-shaded areas. As shown in Figure 5(a), the color of the left brain region is obviously lighter than that of the right brain region, which means that neural activity in the right brain is more active than in the left brain when a subject moves his/her eyes to the left. The complementary pattern appears in Figure 5(b) when a subject moves his/her eyes to the right. In order to express this difference in brain nerve activity, we used the brain network method to analyze it.

The correlations between EEG signals of pairs of 14 channels were calculated using (9). In combination with the determined fixed threshold value (T = 0.36), the brain networks were formed. The brain networks corresponding to a subject performing left and right eye movements are shown in Figure 5.

As can be clearly seen from Figure 5, when the subject turns left, the connection density of the right brain network is significantly higher than that of the left brain network. On the contrary, when the subject turns right, the connection density of the left brain network is significantly higher than that of the right brain network. In order to quantify the density of brain networks, the parameters C and G were used to calculate and analyze brain network characteristics. Figure 6 shows the contrast difference in brain network parameters C and G when subjects perform left-to-right eye movements.

From Figure 6, we can see that there are obvious differences in the network parameter values of the corresponding left and right brain hemispheres when subjects perform left and right eye movements. Taking the left eye movement as an example, the connection density of the right hemisphere is higher than that of the left hemisphere, and the brain network parameter values (C and G) in the right hemisphere are larger than those in the left hemisphere. This suggests that groups of neurons in the relevant brain regions of the right hemisphere cooperate to complete the action, with little involvement of the neurons in the left hemisphere.

From the characteristics of human eye movement signals shown in Figures 4 and 6, the negative correlation fluctuation in the two channels (AF3 and AF4) can be identified by equation (11), and the characteristics of motion imagination signals in eye movement signals can be identified by the brain network algorithm. Taking the eye movement to the right as an example, the discrimination logic is as shown in Figure 7.

Figure 7 shows the logical relationship of eye movement feature recognition. It should be especially pointed out that, when subjects perform eye movements, the time-domain signal waveforms in the AF3 and AF4 channels are negatively correlated, with a correlation coefficient less than −0.85. The 12 subjects each performed 50 eye movement trials to the left and 50 to the right, and the recognition accuracy is shown in Table 2.

Table 2 shows the recognition accuracy of eye movements by the two methods. The recognition accuracy is relatively low when either algorithm is used alone to detect eye movement signals; however, a very high recognition rate can be obtained by combining the two methods to recognize eye movements.

4.1.2. Blinking Twice in a Row

In the experiment, we found that when the subjects performed the motion of blinking twice in a row, the signal waveforms in the two channels (AF3 and AF4) were very similar, and the fluctuation was different from the ordinary fluctuation signal, which is shown in Figure 8.

For this special wave signal, a sliding window of length 150 was used, and the BP neural network was applied to identify it. At the same time, considering that the characteristic signal shows a very strong positive correlation in the two channels (AF3 and AF4), the Pearson correlation coefficient algorithm was also used to identify the motion. In this paper, each subject performed 50 trials of blinking twice in a row, and this method was used to identify the motion. The recognition rate of this method is shown in Figure 9.
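Cutting the continuous stream into fixed-width frames for the classifier can be sketched as follows; the step size is an assumption, since the paper specifies only the window length of 150 samples:

```python
import numpy as np


def sliding_windows(signal, width=150, step=25):
    # Cut the 1-D stream into overlapping fixed-width frames; each frame
    # becomes one input vector for the classifier.
    return np.stack([signal[i: i + width]
                     for i in range(0, len(signal) - width + 1, step)])


x = np.random.default_rng(0).standard_normal(600)
frames = sliding_windows(x)
print(frames.shape)  # (19, 150): 19 frames of 150 samples each
```

Overlapping frames ensure that a blink or tongue waveform is never split across a frame boundary in every window, at the cost of classifying partially redundant inputs.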

Figure 9 shows that there is a high recognition rate for recognizing the blinking motion using the BP neural network and Pearson correlation coefficient algorithm. The recognition rate is not less than 80%, and the highest recognition rate is close to 100%. And such recognition accuracy enables the blink action to be used as a confirmation function when selecting system instructions.

4.1.3. Tongue Signals

The tongue electrical signal is an obvious human physiological electrical signal. When we use EEG equipment such as Emotiv and Neuroscan to collect signals, the tongue signals are easily detected in the time domain, such as a human’s tongue extension and upward tongue rolling movements, as shown in Figure 10.

In the experiment, for the tongue electrical signals, the width of the moving window was set to 150, and the BP neural network was used to identify them. Each subject performed 50 trials of each tongue movement. The recognition rate of tongue electrical signals is shown in Figure 11.

Figure 11 shows that there is a high recognition rate for recognizing the tongue signal using the BP neural network. The recognition rate is not less than 80%, and the highest recognition rate is close to 98%. And such recognition accuracy enables the tongue action signal to be used as a choice for up-and-down movement when selecting system instructions.

4.2. Training Effect

In order to improve the working efficiency of the robot system, the subjects underwent repeated action training. Take completing the five types of actions in sequence as an example to illustrate the training effect. Each subject completed a series of movements (moving the eyes to the left, moving the eyes to the right, upward tongue rolling, tongue extension, and blinking twice in a row) in sequence, as quickly as possible. Each subject trained 50 times a day, and the training lasted for 7 days. Figure 12 shows the change in the average time required to correctly identify the five sequential movements on each day of training.

Figure 12 shows that, with the increase in training time, the time required for subjects to complete the five movements (moving to the left, moving to the right, upward tongue rolling, tongue extension, and blinking two times in a row) correctly and sequentially decreases. This means that as long as the subjects have enough time to train, they can control the robot’s auxiliary system with their EOG and tongue signals.

4.3. Comprehensive Identification Effect

After training, we conducted a comprehensive identification analysis of the four basic tasks (Figure 1(c)) in the experiment. The recognition effect is shown in Table 3.

Table 3 shows that the auxiliary robot system in the experiment has high accuracy for identification of four tasks; additionally, the time required to complete the four tasks is short, which can ensure that the auxiliary robot system can quickly complete the auxiliary tasks for human beings.

5. Discussion

In this paper, the authors used the five kinds of human head movements to control the home-auxiliary robot system and used this system to fetch articles for daily use. The experimental results showed that these five movements can be quickly and accurately identified, which makes the system convenient for use in the field of home care.

5.1. Previous Studies

There are many studies on using human physiological signals to control external devices [51–54]. Roy et al. used a genetic algorithm (GA) to recognize human left and right arm movements, with a recognition accuracy of 75.77% [55]. Šumak et al. successfully used eye winks, eyebrow motion, clenching of teeth, and smirking to control different keyboard functions [7]. Fernandez-Fraga et al. used different hand movements to achieve effective control of cursor movement up, down, left, and right on the screen [8]. Filho et al. used a graph method to recognize several kinds of human hand movements, with a recognition accuracy of 98% [56]. Vinoj et al. used human motor imagery combined with visual stimulation to identify basic human motion characteristics such as sitting, standing, moving forward, turning right, and turning left; these actions could be used as instructions to control the movement of exoskeletons, thus enabling exoskeletons to assist normal human movement [9].

5.2. Novel Findings of This Study

The recognition rate of left and right eye movements can reach more than 96%, and the recognition rates of the tongue electrical signals and blinks can reach more than 80%. Moreover, for all subjects, the average recognition rates of eye blinks and the two tongue movements (tongue extension and upward tongue rolling) were 90.17%, 88.00%, and 89.83%, respectively. This level of accuracy is acceptable for real-time control equipment of this kind. In addition, when a recognition error occurs, the cursor returns to the initial position (cursor waiting area) after 3 s, ensuring that the system does not act on a misjudgment. These head physiological signals, which are all time-domain signals, are obvious and easy to detect. After training, the subjects could complete the five types of movements in sequence within 12 seconds. This makes it possible for people with physical disabilities to use this auxiliary system to serve themselves, which will bring great convenience to their lives.

5.3. Limitations and Future Research Lines

In this study, only two kinds of obvious tongue signals were studied; other tongue movements, such as left and right tongue movements, were not thoroughly investigated. In addition, we considered only the functions that a future auxiliary robot should complete and did not consider the high cost of building the auxiliary system.

In future research, a relatively inexpensive robot assistance system may be developed, and more head physiological signals will be effectively classified and recognized. The auxiliary system will be applied not only to life assistance for patients with limb function loss but also to the field of smart home.

6. Conclusion

A home-auxiliary robot system based on characteristics of human physiological and motion signals is developed in the current study. It relies on five simple actions of the human head itself to complete the motions (moving up/down/left/right and double-click) of a mouse in the system screen. The research results include two aspects. On the one hand, using the brain network and Pearson correlation coefficient algorithm analysis, the recognition rate of eye movement signal features is higher than 96%. On the other hand, using the BP neural network algorithm analysis, the recognition rate of tongue electrical signals and blink signals is higher than 80%. Additionally, after training, the subjects could complete the five types of movements in sequence within 12 seconds. Thus, one can conclude that people with physical disabilities can use the system to quickly and accurately complete life self-help, which brings great convenience to their lives.

Data Availability

Readers can obtain the research data of this paper through this email: [email protected].

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors gratefully acknowledge the financial support by the National Natural Science Foundation of China (51605419), Northeast Electric Power University (BSJXM-201521), and Jilin City Science and Technology Bureau (20166012).