Abstract

This paper proposes a novel neural network architecture based on adaptive resonance theory (ART), called ARTgrid, that can perform both online and offline clustering of 2D object structures. The main novelty of the proposed architecture is a two-level categorization and search mechanism that enhances computation speed while maintaining high performance at higher vigilance values. ARTgrid is developed for specific robotic applications involving work in unstructured environments with diverse work objects. For that reason, simulations are conducted on randomly generated data representing actual manipulation objects, that is, their respective 2D structures. ARTgrid is verified through a comparison of clustering speed with the fuzzy ART algorithm and the adaptive fuzzy shadowed (AFS) network. Simulation results show that at higher vigilance values the clustering performance of ARTgrid is considerably better, while at lower vigilance values it produces results comparable with the original fuzzy ART algorithm.

1. Introduction

Adaptive Resonance Theory (ART) [1] is a cognitive neural theory that attempts to explain how the human brain autonomously learns, categorizes, recognizes, and predicts events in a dynamic and changing environment. ART comprises a series of artificial neural networks (ANNs) used for supervised and unsupervised learning. ART neural networks solve the stability-plasticity dilemma defined by Grossberg [2]. Plasticity of a learning algorithm denotes successful adaptation to changing environmental conditions and the ability to code new input patterns. Stability of a learning algorithm is characterized by the ability to learn new input patterns without catastrophic forgetting [3]. The mechanisms and main principles of Adaptive Resonance Theory can be observed in many areas of the human brain, including the visual cortex, as noted in significant experiments over the previous decades [4, 5]. The main principles are based on the assumption that, apart from knowledge update, learning utilizes two major mechanisms: categorization and expectation.

Recent ART-based clustering architectures mostly utilize the search, choice, and resonance mechanisms from fuzzy ART. Fuzzy ART [6] enables fast categorization and learning of analog input patterns. Long-term connection weight values can only decrease over time, which provides fuzzy ART with high clustering stability. The complement coding mechanism ensures stable normalization of input vectors. A fuzzy ART variant algorithm [7] uses high choice parameter values; that research filled a theoretical gap by covering a wider range of choice parameters while the clustering performance remained comparable with the original fuzzy ART. A generalized ART architecture for learning by matching, association, instruction, and reinforcement is described in [8]. Fusion ART provides a learning mechanism that accounts for how an autonomous agent acquires knowledge of its environment in real time and in an incremental manner. Fusion ART is able to learn multidimensional mappings simultaneously; furthermore, it utilizes multimodal pattern channels. A modified fuzzy ART architecture is proposed in [9]. Its main feature is a distinct vigilance parameter for each category; these parameters are continuously modified and updated according to the size of the respective categories. A novel ART-based neural network architecture, ART-C (ART under constraints), is introduced in [10]. ART-C is capable of online clustering of pattern sequences subject to constraints on the recognition category representation. Apart from the orienting and attentional subsystems, ART-C introduces a novel constraining subsystem which adapts the vigilance parameter value online, delivering a user-predefined number of clusters. Mechanisms from game theory [11] can improve the categorization performance of ART networks. The main problems of ART networks are the exponential increase of memory requirements over time and the difficulty of specifying learning task accuracy. The authors propose an adaptive classification mechanism based on the Nash equilibrium, in which the vigilance parameter is adaptively optimized through the learning process with a direct impact on the size and number of output categories. All previously mentioned ART architectures employ the fuzzy ART learning rule and are unable to change connection weights in a positive direction.
While they provide stable clustering results and solve specific learning problems, they do not offer a novel search mechanism compared to the original fuzzy ART. A previously developed adaptive fuzzy shadowed (AFS) network based on the original fuzzy ART is described in [12], where a clustering scenario with a number of diverse 2D space structures is presented. The AFS network [12] uses a learning rule described in [13], which is different from the one used in fuzzy ART. Within the AFS network a new match function was developed and implemented.

The family of ART networks utilizes a linear search mechanism which first computes all category choice scores and chooses the one with the highest value. In the following phase a resonance test is conducted. If the vigilance criterion is met, an update of long-term memory connection weights takes place. The long-term memory (LTM) traces, or connection weights, correspond to top-down expectations of the network when a new input pattern is obtained. The LTM connection weights represent categories that the network has learned in previous iterations. When there is no appropriate stored category in the LTM traces, the orienting subsystem resets all output neurons and a new category is formed. In applications where high match tracking scores are needed, that is, where the similarity between the input and an already stored category needs to be high, a substantially large number of output categories is formed. The number of output categories has a direct influence on network performance. The main search process in ART is linear; that is, all output neurons need to go through the category choice mechanism prior to the resonance test, as noted above.
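A minimal sketch of this linear search, assuming the standard fuzzy ART choice and match functions, is given below; the function and variable names (fuzzyARTsearch, I, W, alpha, rho) are illustrative and do not come from any published implementation.

function [J, resonance] = fuzzyARTsearch(I, W, alpha, rho)
% I     - complement-coded input vector (row vector)
% W     - weight matrix, one row of LTM traces per committed category
% alpha - choice parameter, rho - vigilance parameter
T = zeros(size(W, 1), 1);
for j = 1:size(W, 1)
    T(j) = sum(min(I, W(j, :))) / (alpha + sum(W(j, :)));   % category choice score
end
[~, J] = max(T);                             % winner-takes-all over all categories
match = sum(min(I, W(J, :))) / sum(I);       % match function of the winner
resonance = (match >= rho);                  % vigilance (resonance) test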

In this paper a novel neural network architecture, ARTgrid, is presented. ARTgrid can perform both online and offline clustering of complex 2D object structures. When large sets of complex input data [12] are applied to the network, a considerable amount of processing time is spent on the category search process. For that reason a two-layer category field and search mechanism utilizing a forward search strategy [14] was developed. The main advantage, as well as the novelty, of the proposed architecture in comparison with existing ART-based neural architectures is its two-level categorization and search mechanism. This mechanism enhances clustering speed while maintaining high performance, which can be observed when the vigilance parameter is set to higher values. ARTgrid is mainly developed for robotic applications involving work in unstructured environments with diverse work objects.

The rest of the paper is organized as follows. Section 2 presents the developed ARTgrid architecture. In Section 3 an overview and discussion of the clustering results on a randomly generated 2D object structure set are given. The final section highlights future work.

2. ARTgrid

2.1. Use Case Scenario

The use case scenario for implementing ARTgrid consists of a blocks world used for assembly, as depicted in Figures 1 and 2. The ARTgrid neural network is developed for the purpose of learning object relationships and object space structures in the previously established robotic framework [15, 16]. An object space structure can be defined at two levels of granularity. First, the morphology, that is, the general shape, can be recorded without regard to individual objects in the structure. At a more detailed level, individual objects and their respective information (position, orientation) are obtained to provide finer detail. All this information describes a certain step in the sequence displayed in Figure 1. With the developed architecture it is possible to obtain a generalized concept for learning these sequences and for creating similar categories. A space structure is recognized in two-dimensional space, in which objects form different spatial relationships that are expected to carry different meanings.

In Figure 2 a randomly generated set of the final assembly phase is depicted. Each of the presented space structures includes distinctive assembly steps, as noted in Figure 1.

2.2. ARTgrid Architecture

A standard ART network mainly consists of a category field, an input field, a reset node, and an orienting and attentional subsystem. These are the main building blocks of the ART1, ART2, and fuzzy ART networks. As noted earlier, the main purpose of this research is to introduce a novel search and categorization mechanism for improved clustering in cases of high vigilance parameter values. In Figure 3 the architecture of ARTgrid is presented. It comprises a dual network system with an additional resonance adaptation subsystem (RAS) and a segmentation filter. The segmentation filter is utilized to improve the visibility of distinct structure features by dividing the input structure into two channels. The first channel carries the structure morphology, which only takes into account the relative shape of the space structure. The second channel includes an object matrix (MTO) mechanism which acts as a parallel match tracking process for finding space structure resemblances. The MTO consists of the identified objects and their respective positions and orientations in the workspace. The RAS is used for additional control of category choices at both category levels, as can be seen in Figure 3. The RAS takes the object matrix (MTO) into account and controls an additional resonance gain that can either increase or decrease the resonance value based on object matrix matching. This process ensures that stable output categories are created at both levels.

In Figure 4 an overview of the category field hierarchy is presented. After the segmentation filter and preprocessing phase, the input structure is recognized at the output neurons of the first category field. The dual input channel ensures that both the structure morphology and the MTO are passed through the LTM traces. When a certain output neuron of the first category field is in resonance, it triggers an output to an associated second category field. A secondary match tracking and resonance process follows, in which the attentional and orienting subsystems of the second category field are activated. The resonance adaptation subsystem is active in both category fields and ensures an appropriate gain with respect to the input MTO.
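An illustrative sketch of this two-level forward search is given below; it reuses the single-level search routine sketched in the Introduction, and the cell array W2groups holding the level-2 weight matrices associated with each level-1 node is an assumed data layout, not the authors' implementation.

function [j1, j2] = artgridTwoLevelSearch(I, W1, W2groups, alpha, rho1, rho2)
% W1       - weight matrix of the level-1 (morphology) categories
% W2groups - cell array; W2groups{j} holds the level-2 weights linked to level-1 node j
[j1, res1] = fuzzyARTsearch(I, W1, alpha, rho1);    % coarse search at level 1
j2 = 0;                                             % 0 marks "no level-2 winner"
if res1
    % only the level-2 categories associated with the level-1 winner are searched
    [j2cand, res2] = fuzzyARTsearch(I, W2groups{j1}, alpha, rho2);
    if res2
        j2 = j2cand;
    end                                             % otherwise a new level-2 category is created
end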

Figures 3 and 4 both depict the architecture and hierarchy of the ARTgrid neural network. The main components of ARTgrid, including the two category fields, are indicated to emphasize their corresponding parts in both figures. In Figure 3 emphasis is given to the information flow through the ARTgrid network, with a detailed explanation of the orienting, attentional, and RAS subsystems and their interaction. Figure 4 outlines the idea of the two-level categorization mechanism, directly showing the proportion of categories at both levels.

2.3. The ARTgrid Algorithm

Let I denote the acquired input structure, with each component of I in the interval [0, 1]. Applying complement coding, the input is transformed to its complement I^c, where I^c = 1 - I. Let W_j denote the connection weight matrix associated with the jth node of the first category level, and likewise for the nodes of the second category level. Initially, both category levels contain only one uncommitted node and their respective weight matrices contain all 1's. The ARTgrid parameters include a learning rate parameter, a vigilance parameter, and a resonance adaptation parameter. For every output neuron the choice similarity function is computed as given in (1). The fuzzy AND operator is defined as the componentwise minimum, the fuzzy OR operator as the componentwise maximum, and the norm of a matrix as the sum of the absolute values of its components. The choice similarity functions are calculated for the first and the second category level, respectively. Both values are calculated from the input structure, where each choice similarity function uses a connection weight matrix specific to its category level. In the code competition process, as stated in (1), the nodes with the highest choice function value at each level are activated. The winner neuron is identified as the node with the maximal choice similarity value.
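For completeness, the fuzzy operators and a choice function of the standard fuzzy ART form are written out below; the exact choice similarity function used by ARTgrid in (1) may differ, so this is a hedged sketch rather than the authors' equation:
\[
(\mathbf{X}\wedge\mathbf{Y})_{ij}=\min(x_{ij},y_{ij}),\qquad
(\mathbf{X}\vee\mathbf{Y})_{ij}=\max(x_{ij},y_{ij}),\qquad
|\mathbf{X}|=\sum_{i,j}|x_{ij}|,
\]
\[
T_j=\frac{|\mathbf{I}\wedge\mathbf{W}_j|}{\alpha+|\mathbf{W}_j|},\qquad
J=\arg\max_j T_j,
\]
where $\alpha>0$ is the choice parameter and $\mathbf{W}_j$ is the level-specific weight matrix of node $j$.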

In the template matching phase a resonance test is performed, as given in (3).
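Assuming the standard fuzzy ART match criterion (an assumption; the exact match function used by ARTgrid may differ), the resonance test in (3) can be sketched as
\[
m_J=\frac{|\mathbf{I}\wedge\mathbf{W}_J|}{|\mathbf{I}|}\;\geq\;\rho,
\]
where resonance occurs if the inequality holds and a mismatch reset is triggered otherwise.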

In the offline learning strategy a modified CCS (center cluster searching) algorithm is initiated to find an initial set of the most distinct input patterns. As its main mechanism it uses the relative dissimilarity of input structures. On average, 15% of the input structures are set as cluster centers and are applied to the network first; the remaining input structures are applied in random order. The input structure with minimum density is used as the first cluster center matrix (CCS), as given in (4). In the following step the input matrix with maximum density is identified, as given in (5).

The input structure with minimum density becomes the first CCS matrix and the input structure with maximum density becomes the second CCS matrix; these two matrices are applied to the network first. One parameter denotes the maximum number of computed CCS matrices, while another denotes the current number of initialized CCS matrices. The rest of the process is as follows. The next input matrix, the one with minimum relative similarity to the already initialized CCS matrices, is computed using equations (6) and (7).

The cluster center matrices computed so far are indexed accordingly. Next, the input matrix with minimum relative similarity to the already initialized CCS matrices needs to be found, as given in (7).

Steps (6) and (7) are repeated until the maximum number of CCS matrices is reached. By applying the CCS input matrices to the ARTgrid network first, a more stable distribution of categories can be formed in the offline training phase.
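As a hedged sketch of the CCS selection, assuming that density is measured by the matrix norm and relative similarity by the normalized fuzzy AND overlap (both are assumptions and may differ from the forms used in (4)-(7)):
\[
\mathbf{C}_1=\arg\min_{\mathbf{I}_k}|\mathbf{I}_k|,\qquad
\mathbf{C}_2=\arg\max_{\mathbf{I}_k}|\mathbf{I}_k|,\qquad
\mathbf{C}_{n+1}=\arg\min_{\mathbf{I}_k}\;\max_{m\le n}\frac{|\mathbf{I}_k\wedge\mathbf{C}_m|}{|\mathbf{I}_k|},
\]
where the last step is repeated until the predefined maximum number of CCS matrices is reached.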

ARTgrid uses the Moore [13] learning rule for both category levels, as noted in (8). Under this learning law the long-term connection weights of the output neurons can both increase and decrease in proportion to the similarity with the applied input structure.
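A rule consistent with this description, in which the winner's weights move toward the applied input and can therefore both grow and shrink, is sketched below; this is an assumed form and may differ in detail from the rule of [13] used in (8):
\[
\mathbf{W}_J^{\text{new}}=(1-\beta)\,\mathbf{W}_J^{\text{old}}+\beta\,\mathbf{I},
\]
where $\beta\in[0,1]$ is the learning rate.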

The object matrix test function contributes to the increase or decrease of the resonance value at either category level through the RAS subsystem. The mechanism utilizes previous object information stored within the long-term connection weights and the MTO of the applied input structure. Four parameters are calculated in order to initialize the test vector. The percentage of identical objects within the input and stored object matrices is calculated as given in (9); the function uniqueobjects() counts the number of all unique objects in a set of input matrices. The next parameter denotes the proportion of identical objects located at resembling positions in the input and stored matrices; it measures structure similarity based on identical objects lying within a predefined region. For the purpose of calculating it, an elasticity factor is introduced: as the distance between two identical objects in the input and stored matrices increases, the activity decreases, as noted in (10). The distance is calculated as the Euclidean distance between the identified object centers in the two matrices. Two distance thresholds are set to 6 and 14 pixels, respectively, providing a predefined tolerance gap.
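One plausible reading of the elasticity factor in (10), assuming full activity up to the inner threshold, zero activity beyond the outer threshold, and a linear decrease in between (an assumption, not necessarily the authors' exact formulation), is
\[
e(d)=\begin{cases}
1, & d\le 6\ \text{px},\\[2pt]
\dfrac{14-d}{14-6}, & 6\ \text{px}<d<14\ \text{px},\\[2pt]
0, & d\ge 14\ \text{px},
\end{cases}
\]
where $d$ is the Euclidean distance between the centers of two identical objects in the compared structures.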

The remaining two parameters account for the deficit and excess of the total number of objects in the input structure with respect to the stored one. They are calculated as given in (11), where the function numobjects() counts the total number of identified objects in the object matrix (MTO) of the corresponding structure. The formed test vector is multiplied by a 4-dimensional weight vector, and the resulting scalar is applied as a gain to the actual resonance value. The resonance value previously calculated in the resonance test on the structure morphology (3) is modified as given in (12), where a resonance adaptation parameter is introduced. Resonance occurs if the adapted match function value satisfies the vigilance criterion, whereas a mismatch reset occurs otherwise.
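As a hedged sketch of this resonance adaptation, assuming the test vector is combined with the weight vector by an inner product and applied as an additive gain scaled by the resonance adaptation parameter (the additive form is an assumption; (12) may use a different combination):
\[
g=\boldsymbol{\psi}\cdot\mathbf{k},\qquad
m'_J=m_J+\lambda\,g,
\]
where $m_J$ is the match value from (3), $\lambda$ is the resonance adaptation parameter, and resonance occurs if $m'_J\geq\rho$, with a mismatch reset otherwise. The symbols $\boldsymbol{\psi}$, $\mathbf{k}$, and $\lambda$ are introduced here only for illustration.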

3. Results and Discussion

We implemented ARTgrid, fuzzy ART, and the AFS neural network in the MATLAB programming language. A total of 30 core functions were developed for the new ARTgrid network. The implementation of fuzzy ART and AFS followed their original algorithms, described in [6] for fuzzy ART and in [12] for AFS. We used the same programming language and hardware to enable a cross comparison of training times for the three neural networks considered in this paper. Input structures were generated by a random generator; that is, the position, rotation, and scale of individual objects were varied. The random generator was also used to change the number and type of specific objects in the structure. Figure 5 shows an example two-level cluster output of the ARTgrid network: in the first column the first-level categories are displayed, and next to each of them all respective second-level categories are indicated. The size of the input structure, that is, the dimension of the input image, is 200 × 200 pixels, where the activity of each pixel is analog and can be 1 (fully active), 0 (not active), or a value between 0 and 1 indicating the activity level. The activity of a particular pixel in the image is associated with the presence of a located object. The activity color bar is displayed next to an enlarged sample structure in Figure 6.
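As an illustration of this input format (an assumed preparation step, not the authors' code), a 200 × 200 analog input structure can be clipped to [0, 1] and complement-coded before being presented to the network:

S = rand(200, 200);          % hypothetical analog pixel activities of a structure
S = min(max(S, 0), 1);       % activity of each pixel constrained to [0, 1]
I = [S(:).', 1 - S(:).'];    % complement coding of the flattened structure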

Seventeen simulation series were generated, containing a variable number of input structures; the initial series contains 5 inputs and the final one 100 random inputs. For each series a number of simulation runs were conducted. The vigilance parameter was chosen through empirical experiments on the specific data set, as noted in the literature [17]. Six distinct vigilance parameter values were set for the first category level: (0.98, 0.95, 0.9, 0.85, 0.8, 0.7). The corresponding values for the second category level were set to (0.58, 0.58, 0.5, 0.5, 0.45, 0.45). A comparison between the clustering performance of fuzzy ART (fART), the adaptive fuzzy shadowed network (AFS), and ARTgrid is obtained from the simulations, and the results are shown in Figures 7, 8, and 9. The vigilance parameter of AFS and fART and the learning rate parameters of ARTgrid, fART, and AFS were likewise fixed across the simulation runs.

The obtained results indicate a vigilance parameter threshold of approximately 0.85: below this value fuzzy ART performs better, and above it ARTgrid performs better. The AFS algorithm performed the slowest clustering in both respects, as it uses a different match function and a linear one-level search mechanism. Larger deviations, that is, fluctuations of the learning curve, are due to the effect of the random sequence of input patterns. As none of the three architectures has information about the inputs in advance, there is a wider range of possible category solutions that a network can generate for these sequences. Our implementation of ARTgrid, fuzzy ART, and AFS within MATLAB should not contribute to the obtained results, that is, to the differences among training times for the three neural networks. In conclusion, the developed ARTgrid clustering algorithm provides two main benefits:
(i) the search space is structured, and the search for a corresponding cluster given an arbitrary input is faster when higher vigilance parameter values are utilized;
(ii) the generalization capability of the network is enhanced, providing generalization at multiple levels of granularity.

4. Future Work

In future ARTgrid development a few distinctive features and mechanisms should be implemented. One possibility for making the network output more stable is to implement an active category self-reorganization strategy. When an input structure is applied at a certain category level, the output neurons with the maximum choice values are chosen by default; resonance within these neurons can then be met or a new category can be formed. A backward search algorithm over a variable number of highest category choice values should be implemented. Under the WTA (winner-takes-all) strategy, the activation of all other neurons is inhibited regardless of a possible resonance match. If the resonance match condition is met across multiple output neurons, there is a possibility of a “duplicate” category at either or both category levels. We expect that such categories could be merged to form a new category that would account for all similar inputs in the future. This can be seen as a step toward category self-reorganization. The problem occurs because there is no structured world that would give ARTgrid structured inputs and predefined input sequences. Inputs are acquired in random order, they are not known in advance, and the learning process is incremental. The network needs to perform an active memory search, and if there are sufficient similarities between the input structure and multiple output categories, the self-reorganization strategy should be applied.

In future research we plan to test the influence of single-iteration and multiple-iteration learning on classification accuracy, and to compare the classification accuracy of the ARTgrid neural network with that of both the fuzzy ART and AFS neural networks.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to acknowledge the support of the Croatian Science Foundation through the research project ACRON—A New Concept of Applied Cognitive Robotics in Clinical Neuroscience. The authors would also like to thank the anonymous reviewer for valuable comments and suggestions which helped improve the quality of the paper.