Advances in Artificial Neural Systems
Volume 2012 (2012), Article ID 517234, 12 pages
Methodological Triangulation Using Neural Networks for Business Research
The Business School, University of Colorado Denver, Denver, CO 80202, USA
Received 6 October 2011; Revised 7 December 2011; Accepted 8 December 2011
Academic Editor: Ping Feng Pai
Copyright © 2012 Steven Walczak. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Artificial neural network (ANN) modeling methods are becoming more widely used as both a research and application paradigm across a much wider variety of business, medical, engineering, and social science disciplines. The combination or triangulation of ANN methods with more traditional methods can facilitate the development of high-quality research models and also improve output performance for real world applications. Prior methodological triangulation that utilizes ANNs is reviewed and a new triangulation of ANNs with structural equation modeling and cluster analysis for predicting an individual's computer self-efficacy (CSE) is shown to empirically analyze the effect of methodological triangulation, at least for this specific information systems research case. A new construct, engagement, is identified as a necessary component of CSE models and the subsequent triangulated ANN models are able to achieve an 84% CSE group prediction accuracy.
1. Introduction
Artificial neural networks (ANNs) have been a popular research and implementation paradigm in multiple domains for several decades now [1–9]. Recent literature advocates the further usage of ANNs as a research methodology, especially in previously untried or underutilized domains [10, 11]. However, due to the early premise that ANNs are black boxes (i.e., it is difficult to evaluate the contribution of the independent variables), the demonstration of rigor and generalization of results from neural network research has been problematic.
Similarities between ANNs and various statistical methods (which have been shown to be both rigorous and generalizable) have been described for potential adopters [10, 12]. A common research paradigm for ANN researchers is to compare results obtained using an ANN to other more traditional statistical methods, including regression [13–16], discriminant analysis [17–21], other statistical methods [22–24], and multiple statistical methods [25–28]. Of the 16 articles just referenced, the majority of these results show ANNs being either similar to (with 2 being similar) or better than (with 12 outperforming) the compared statistical methods within the specific application domain.
While ANNs have a history, though short, their black box nature has led to adoption resistance by numerous business-related disciplines. Methodological triangulation may help to overcome these adoption and usage reservations as well as provide a means for improving the overall efficacy of ANN applications. Methodological triangulation is the utilization of multiple methods on the same problem (empirical data) to gain confidence in the results obtained and to improve external validity [30, 31]. ANNs and traditional statistical methods are both quantitative in nature. A quantitative method is hereby defined as a specific tool, procedure, or technique that is used to analyze the data of a specific problem to produce a corresponding model or results to answer a business research question.
The comparative analysis of ANNs versus standard statistical methods previously mentioned is an example of concurrent or parallel methodological triangulation, which is performed extensively to demonstrate performance improvements obtained through the utilization of neural network modeling. This paper will focus on nonconcurrent methodological triangulation techniques. Nonconcurrent methodological triangulation occurs when a statistical or other machine learning method is used in combination with an ANN, but the other method is applied either to the data prior to the ANN, to refine the input vector and gain confidence in the reliability of the independent variables, or after the ANN has produced its results, to improve the overall performance and/or interpretation of those results. The definition of nonconcurrent methodological triangulation used in this paper is similar to the sequential and parallel development mixed method and the sequential elaboration mixed method described by Petter and Gallivan.
The research presented in this paper will assess the efficacy of utilizing nonconcurrent triangulation of method with ANNs, specifically the preselection of variables with recognized statistical techniques. The triangulated ANN will be applied to the case of estimating an individual’s computer self-efficacy (CSE) without relying on self-evaluation, since self-evaluation may be subject to numerous biases [33, 34]. The next section provides a brief background on methodological triangulation that has been previously applied with ANNs followed by a section that describes the CSE estimation problem in more detail, which serves as a classification problem to demonstrate the results of the new methodology. The fourth section will present the triangulation methodology and describe the developed ANN models for CSE estimation. The penultimate section presents the results and a discussion of these results for CSE estimation utilizing the triangulated ANN.
2. Background
This section describes ANNs and the literature on triangulation with ANNs.
2.1. Brief Description of ANNs
Before describing previous research that has either advocated or demonstrated the triangulation of various statistical and other machine learning methods with ANNs, a brief description of ANNs is provided. The following description is best suited to backpropagation trained ANNs, but can be generalized for other types of ANNs as well, especially other supervised learning ANNs. An ANN is a collection of processing elements typically arranged in layers (see Figure 1). The input layer requires some type of numeric data value. These values are then multiplied by weights (another numeric value) and aggregated for each hidden layer processing element. Various aggregation functions may be used, but commonly either a standard summation or maximizing function is used, producing a value $h_j = \sum_i w_{ij} x_i$, which is the aggregated input value over all input nodes $i$ for hidden processing node $j$, where $x_i$ is the value of input node $i$ and $w_{ij}$ is the weight on the connection from input node $i$ to hidden node $j$. The hidden layer elements then transform the aggregated input values using a nonlinear function, typically a sigmoid function, such that the output of each hidden node looks like $o_j = 1 / (1 + e^{-h_j})$. The outputs of each hidden layer node are then aggregated to the next layer, which may be the output layer or another hidden layer.
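The layer-by-layer computation described above can be sketched in a few lines. This is a minimal illustration assuming summation as the aggregation function and a logistic sigmoid as the transfer function; the function names, the layer sizes, and the weight values are illustrative assumptions and are not taken from the study.

```python
import math

def sigmoid(z):
    """Logistic transfer function applied by each processing element."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights):
    """Return one output per node; weights[j][i] links input i to node j."""
    outputs = []
    for node_weights in weights:
        # Aggregate the weighted inputs for this node by summation.
        aggregated = sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(sigmoid(aggregated))
    return outputs

# Three inputs feed two hidden nodes, which feed one output node.
# All weight values here are made up for illustration.
x = [0.5, 0.1, 0.9]
hidden = layer_forward(x, [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]])
output = layer_forward(hidden, [[1.0, -1.0]])
```

The same function is reused for the hidden and output layers because each layer simply aggregates the outputs of the layer before it.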
Learning, and hence the development of an accurate model, may be performed in a supervised or unsupervised manner. Supervised learning will be emphasized in this paper and utilizes historic examples of the problem being modeled. Examples of the independent variable sets are presented to the ANN, which then produces an output value or values as just described. The output value is compared to the known value from the historic training example and if an error above the error threshold exists, then the values of the weighted connections are adjusted to better approximate the observed output value. This type of learning is nonparametric and makes no assumptions about population distributions or the behavior of the error term.
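The error-driven adjustment of weights can be illustrated with a single sigmoid node trained by the delta rule, which is the one-node special case of backpropagation. This is a sketch under stated assumptions: the learning rate, the error threshold of 0.05, the epoch limit, and the logical-OR training set are all illustrative choices, not values or data from the study.

```python
import math

def predict(weights, inputs):
    """Sigmoid output of a single node; the last weight acts as a bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))

def train_node(examples, n_inputs, rate=0.5, threshold=0.05, max_epochs=5000):
    """Adjust weights whenever an example's error exceeds the threshold."""
    weights = [0.0] * (n_inputs + 1)
    for _ in range(max_epochs):
        adjusted = False
        for inputs, target in examples:
            out = predict(weights, inputs)
            error = target - out
            if abs(error) > threshold:
                # Gradient of the squared error through the sigmoid.
                grad = error * out * (1.0 - out)
                for i, x in enumerate(inputs):
                    weights[i] += rate * grad * x
                weights[-1] += rate * grad
                adjusted = True
        if not adjusted:
            break  # every training example is within the error threshold
    return weights

# Toy historic examples (logical OR), used only to exercise the loop.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights = train_node(examples, n_inputs=2)
```

No distributional assumption appears anywhere in the loop; the weights are driven only by the observed errors, which is what makes the procedure nonparametric.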
ANNs have been shown to be able to accurately approximate almost any type of model for both classification and forecasting (regression), including both linear and nonlinear models [36–38]. Evaluation of supervised learning ANN models is performed by withholding a portion of the historic data sets and using this as an out-of-sample verification of the generalization of the ANN solution model to the real world. When comparing ANNs against other methods, it is the error on these out-of-sample results that should be compared, where out of sample implies data that was not used for development of the ANN model.
2.2. Previous Work with Triangulating NNs
As already mentioned in the previous section, comparison of ANN classification or forecasting results to standard statistical methods following a model selection approach [35, 39] is a common method used by researchers to attempt to demonstrate both the validity of using an ANN model and also to demonstrate a methodological improvement gained from the use of the ANN modeling paradigm. However, the focus of the research reported in this paper is on nonconcurrent method triangulation and as such this section will focus on previous research that has utilized statistical and other methods in a nonconcurrent manner with ANNs.
A summary of prior research that implemented some form of triangulation that was not used for comparison of results is shown in Table 1. Very early work in improving ANN architecture design employed genetic algorithms (GAs) or genetic programming (GP) prior to instantiation of the ANN. ANNs have also been triangulated concurrently with other ANNs in an ensemble to improve classification performance for problems where a single classifier cannot perform adequately. All other existing work utilizing triangulation occurs following the production of results by the ANN model to reduce output error from the ANNs or to improve the explanation of the output results.
The proper selection of independent variables to utilize in the input vector for any ANN is required for performance optimization [45, 59–61]. Various techniques have been utilized in a nonconcurrent triangulation prior to training the ANN to eliminate correlated (mutually dependent) input variables and variables that have minimal or negative impact (noise) on the ANN results. These techniques include GAs to pre-select training data that will lead to faster convergence [43, 62], correlation matrices, regression, principal component analysis, and discriminant analysis.
An interesting example of the power of preprocessing of possible input data comes from a study by Durand et al. They develop two models concurrently: the first is a partial least squares (PLS) regression model that has data preprocessed with a GA and the second is an ANN model that has data preprocessed with a mutual information algorithm. The original problem space contained 480 variables. The GA was able to reduce the number of variables down to 11 for the PLS regression model and the mutual information algorithm was able to reduce the number of variables down to 12 for the ANN model, thus utilizing only 2.5 percent of the total available variables. The ANN model ultimately produced the best generalization performance between the two compared models.
The preprocessing of input/training data or the postprocessing of ANN output data to improve its accuracy or understandability are advantageous techniques to improve ANN performance, at least within the limited domains where these techniques have been previously applied. A combination of both preprocessing and postprocessing is explored in the remainder of this paper for the case of classifying individual CSE.
3. Methodology for ANN Triangulation
As described in the Background section, method triangulation is already widely used for neural network research. However, the triangulation is normally mentioned in passing and the effect of the triangulation is not typically evaluated.
One of the goals of the current research is to promote the ideal of utilizing method triangulation whenever ANNs are used in research or real world applications and to formalize to the extent possible a methodology for performing triangulation with ANNs. Another goal is to provide empirical evidence to demonstrate the efficacy of utilizing triangulation in ANN research.
A flowchart for implementing triangulation is shown in Figure 2. The methodology is focused on methodological triangulation that utilizes ANNs as one of 2 or more processes used nonconcurrently to develop robust research models or domain applications. The flowchart and proposed methodology do not include data preparation/cleansing, testing of multiple ANN architectures, or cross-comparison of different ANN learning methods, all of which are standard ANN model development practices [60, 61].
The alternate processes specified in the flowchart are meant to indicate that the researcher or developer has several choices here for which method to use in the triangulation process. Selection of a specific statistical method or other method is typically constrained by the form and qualities of the data to be analyzed.
The proposed methodology emphasizes two significant issues with ANN development: improving models through noise reduction and improving the interpretation of results (to overcome the black-box nature of ANNs). A side benefit of the methods advocated for triangulation prior to the ANN is that any reduction in the independent variable set will reduce the overall costs of the model [13, 35].
4. The CSE Problem: A Case Study for Evaluating Methodological Triangulation with ANNs
To demonstrate the benefit of the proposed triangulation method paradigm, a case study to predict CSE using ANNs is shown. The technology acceptance model (TAM) introduced by Davis has long been used to predict the adoption of new technology by users. CSE is strongly linked, even as a determinant, with the perceived ease of use component of the TAM model. Prior research on CSE is highlighted in Table 2.
In a technology environment, determining CSE is important because low CSE may hinder learning [66, 67], while participants scoring high in CSE perform significantly better in computer software mastery [68–70]. However, CSE is difficult to measure rigorously. Contradictory CSE results from prior research may be due to weaknesses in existing measures of the construct as well as the need for control of antecedent and consequent factors directly associated with CSE.
Variables that measure the four constructs shown in Table 2 (prior experience, computer anxiety, organizational support, and engagement) will make up the independent variable vector to the ANN.
Although computer anxiety has similar measurement problems to CSE, prior experience and organizational support are easier to measure than CSE because they are more objective, can be validated, and are less dependent on perceptions. Even when perceptual measures are used for organizational support, as in this study, they are probably less emotionally laden than CSE and anxiety, which are tied to ego and self-assessment. Engagement, although a perceptual measure, is also less emotionally laden than CSE and computer anxiety and is observable. The nonparametric and nonlinear nature of ANNs makes them ideal for modeling a problem that may have inaccurate or noisy data, such as in the evaluation of computer anxiety and engagement.
This section introduced the CSE problem, which is solved using triangulation in the next section.
5. Triangulation to Improve ANN Performance
CSE variables to measure the four constructs: prior experience (PE), computer anxiety (CA), organizational support (OS), and engagement (E) are collected using a survey that is administered to undergraduate and graduate students at a large southwestern state university in the United States. Students in both undergraduate classes and graduate classes were given the survey over a four-semester period. A total of 239 surveys were returned. Questions for the survey were derived from previously validated research and are shown in the appendix, Table 7. Three responses were dropped because of incomplete information on prior experience yielding 236 fully completed surveys.
The sample consisted of about 50% each of graduates and undergraduates and approximately two-thirds male. Almost 50% were of age 23 years or older. 93% had some work experience and 99% had more than one year of PC experience, thus minimizing differences in the prior experience variable values for the model.
Outlier analysis using the box plot method was also performed, although this is more of a data cleansing operation than a triangulation step that improves performance through noise elimination by reducing the variable data set. Two responses were identified as outliers, thus producing a working data set of 234 responses.
As shown in Table 7 in the appendix, a total of 14 independent variables were collected to predict an individual’s CSE. As specified in the flowchart (see Figure 2), a correlation matrix of all independent variables was calculated and high-correlation values indicated that all 4 of the prior experience variables were interdependent. Computer-based training and business application experience were dropped as having the highest correlation values. This is actually an interesting finding as a corollary result from the triangulation process, namely, that as prior work experience increases so does computer application experience. Eliminating the correlated variables reduced the independent variable set size from 14 variables down to 12 variables.
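A correlation-based screening step of this kind can be sketched as follows. This is an illustration only: the greedy keep-or-drop rule, the 0.8 cutoff, the variable names, and the data values are all assumptions made for the example, whereas the study dropped the two variables with the highest correlation values after inspecting the full matrix.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length lists (non-constant values)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_correlated(columns, cutoff=0.8):
    """Keep a variable only if |r| with every already kept variable is below cutoff."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical survey columns; names and numbers are invented.
survey = {
    "work_experience": [1, 2, 3, 4, 5, 6],
    "application_experience": [2, 4, 6, 8, 10, 13],
    "anxiety_item": [5, 1, 4, 2, 6, 3],
}
kept = drop_correlated(survey)
```

In this toy data the second column rises almost in lockstep with the first, so it is screened out, while the weakly correlated third column is retained.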
The next triangulation step to occur prior to the implementation of the ANN model is dependent on both the type of problem being solved (e.g., prediction or classification) and the variable data. The student population is a subset of and meant to be demonstrative of what could be achieved with the ERP training tool population at large and thus the distribution of the general population is unknown. Many of the parametric statistical models require the assumption of a normally distributed population of answers.
Structural equation modeling (SEM) is a methodology for analyzing latent variables, which cannot be measured directly. CSE and computer anxiety are examples of latent variables. However, SEM using LISREL, AMOS, and other similar packages makes assumptions of a normal distribution of the data and requires a relatively large sample size. On the other hand, the partial least squares (PLS) method does not assume a normal distribution of the data and does not require as large a sample size as LISREL and similar statistical software packages. Therefore, PLS-based SEM serves as a triangulation statistical method to analyze the antecedent constructs for the CSE case study data.
PLS-SEM is used to analyze a measurement model and a structural model. The measurement model determines whether the measures used are reliable and whether the discriminant validity is adequate. The reliability of an indicator is assessed by its loading on its construct. Loading values should be at least 0.60 and ideally 0.70 or above, indicating that each measure accounts for roughly 50% or more of the variance of the underlying latent variable. Two indicators for CSE were dropped because loadings on the construct were lower than the threshold. This helps to reduce the overall cost of the subsequent model through reduction in the number of variables required.
In Table 3, composite reliability scores show high reliability for the final constructs. All values for composite reliability are greater than 0.8 (except prior experience which is measured with unrelated formative indicators that are not expected to highly correlate with each other). The diagonal of the correlation matrix shows the square root of the average variance extracted (AVE). The AVE square root values are greater than the correlations among the constructs supporting convergent and discriminant validity. The means and standard deviations of the construct scales are also listed.
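The reliability quantities discussed here can be computed directly from standardized loadings. The sketch below uses the commonly cited formulas for composite reliability and AVE; the paper does not state which formulas its PLS software applied, so treating these as the ones used is an assumption, and the loadings and the construct correlation are invented numbers.

```python
import math

def variance_share(loading):
    """Share of an indicator's variance attributed to its construct."""
    return loading ** 2

def composite_reliability(loadings):
    """(sum of loadings)^2 divided by itself plus the summed error variances."""
    explained = sum(loadings) ** 2
    error = sum(1.0 - l ** 2 for l in loadings)
    return explained / (explained + error)

def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical loadings for one construct and one inter-construct correlation.
loadings = [0.82, 0.78, 0.74]
correlation_with_other_construct = 0.45

cr = composite_reliability(loadings)
ave_root = math.sqrt(average_variance_extracted(loadings))
# Discriminant validity check: sqrt(AVE) should exceed the correlation.
discriminant_ok = ave_root > correlation_with_other_construct
```

A loading of 0.70 gives a variance share of 0.49, which is where the "roughly 50% of the variance" reading of the loading threshold comes from.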
Table 4 shows the loadings and cross-loadings to the constructs of the measures, which had adequate reliability. All loadings are greater (in an absolute value) than 0.7, and are greater (in an absolute value) than cross-loadings showing again strong convergent and discriminant validity.
In the SEM model (see Figure 3), the path coefficients produced by PLS show that engagement (0.492) and organizational support (0.112) are statistically significant. Prior experience (−0.091) and computer anxiety (0.016) are not statistically significant. Demographic variables for age and gender were also evaluated in the original SEM model, but neither was statistically significant (0.012 and 0.029, resp.) and both were subsequently removed from the model. The antecedents for the model shown in Figure 3 explained 32% ($R^2 = 0.32$) of the variance in CSE.
From the PLS-SEM model, it appears that either the engagement (E) construct variables or the organizational support (OS) variables or perhaps the combination of these two constructs will produce the best performing ANN model to predict an individual’s CSE. As with most ANN research, various hidden node architectures are constructed for the three possible input vectors (E, OS, and E and OS combined) [35, 39]. Each ANN starts with a quantity of hidden nodes equal to half the number of input nodes; this value is incremented by one and the ANN retrained until the performance starts to decay, indicating overlearning. All architectures are trained utilizing the backpropagation (BP) learning algorithm and training is halted when an RMSE of less than 0.05 is reported by the training algorithm. Architectures with the number of hidden nodes equal to the number of input nodes almost universally outperformed the other architectures and samples of these architectures are shown in Figure 4.
Additional tests are performed evaluating ANNs trained using the radial basis function (RBF) training methodology, which should be more noise resistant and operates more efficiently in cases of extrapolation (versus interpolation) than BP. As mentioned previously, comparing multiple ANN models and different supervised learning algorithms is a form of concurrent triangulation. The BP ANNs consistently outperformed the corresponding RBF ANNs and thus only the BP ANN results are reported.
Since the research is interested in investigating the improvement to ANN performance through the utilization of statistical method triangulation with ANNs, other combinations of constructs are also evaluated to determine if the E, OS, or E and OS input vectors do indeed produce the optimal prediction performance. All combinations of E, OS, and computer anxiety (CA) are developed as ANN models and an additional ANN model that includes prior experience (X) is also implemented. Each of the different construct-combination ANNs follows the same multiple architecture development and training protocol as that used for the three different ANN models recommended by the PLS-SEM method. Each construct combination uniformly used all of the construct variables from the survey that were not eliminated by the correlation matrix, whenever that corresponding construct was part of the input vector.
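The incremental hidden-node search can be sketched as a simple loop. In this illustration the `evaluate` callback is a hypothetical stand-in for training a BP network with a given number of hidden nodes and returning its validation error; rounding half the input count down, stopping at the first non-improving size, and the upper limit are also assumptions rather than details given in the text.

```python
def search_hidden_nodes(n_inputs, evaluate, max_hidden=None):
    """Grow the hidden layer from n_inputs // 2 until the error stops improving.

    evaluate(h) must return a validation error for h hidden nodes
    (lower is better).
    """
    hidden = max(1, n_inputs // 2)
    best_hidden, best_error = hidden, evaluate(hidden)
    limit = max_hidden if max_hidden is not None else 4 * n_inputs
    while hidden < limit:
        hidden += 1
        error = evaluate(hidden)
        if error >= best_error:
            break  # performance has started to decay (possible overlearning)
        best_hidden, best_error = hidden, error
    return best_hidden, best_error

# Stand-in error curve with its minimum at 6 hidden nodes (invented).
def toy_error(h):
    return abs(h - 6) * 0.1 + 1.0

best = search_hidden_nodes(6, toy_error)
```

With the toy error curve the search starts at 3 hidden nodes, improves through 4, 5, and 6, and stops when 7 nodes gives a worse error.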
Additionally, an ANN model that utilized all 14 variables is developed and compared with the other results to further examine the benefit gained from triangulation. Post-ANN result triangulation to increase ANN model performance and understanding is discussed in the next section.
6. Results and Discussion
In this section, the results of the pre-ANN model specification triangulation are examined and post-ANN triangulation to improve and explain ANN results is demonstrated.
6.1. PLS-SEM Triangulation Identifies Optimal Input Constructs
The prediction performance for the best performing architecture for each of the various ANNs evaluated is highlighted in Table 5. Table 5 also shows the results of utilizing all variables, including the correlated variables eliminated earlier in the triangulation process. The evaluation is performed using a 12-fold cross-validation technique, which should approximate the obtainable results from utilizing a model that would have been trained on the full population set, but maintains the integrity of the data as all validation samples are withheld from the training data set for the 12 individual cross-validation ANN models. Each ANN attempted to predict the composite CSE score, which was the summation of the three retained CSE variables (following the PLS analysis) and had a range from 3 to 21.
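The mechanics of a 12-fold cross-validation over 234 responses can be sketched as below. The interleaved fold assignment and the mean-predicting placeholder model are assumptions made so the example stays self-contained; the paper does not describe how responses were assigned to folds, and a real run would pass functions that train and query an ANN.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k disjoint, interleaved folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]

def cross_validated_mae(inputs, targets, k, fit, predict):
    """Mean absolute error over all held-out predictions."""
    abs_errors = []
    for fold in k_fold_indices(len(inputs), k):
        held_out = set(fold)
        # Validation samples never appear in the training data for this fold.
        train_x = [x for i, x in enumerate(inputs) if i not in held_out]
        train_y = [y for i, y in enumerate(targets) if i not in held_out]
        model = fit(train_x, train_y)
        abs_errors.extend(abs(predict(model, inputs[i]) - targets[i]) for i in fold)
    return sum(abs_errors) / len(abs_errors)

# Placeholder "model": always predicts the mean training target.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

folds = k_fold_indices(234, 12)
```

For 234 responses this yields six folds of 20 and six folds of 19, so every response is predicted exactly once by a model that never saw it during training.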
As shown in Table 5, the E only construct ANN, which was the most significant construct according to the PLS-SEM preprocessing, produced the smallest mean absolute error (MAE) term and also had the largest quantity of perfect predictions for predicting CSE. The combination of E and OS had the second smallest MAE. An additional column is presented in Table 5 that represents near misses for the ANN model CSE predictions. From the near-miss column, the E and CA combination model performs best on the near-miss evaluation, with the E only ANN model coming a close second (not statistically different) and the E and OS model in third place.
Additionally, it can be seen that utilizing all 14 of the collected variables to predict the PLS-SEM modified CSE-dependent variable has a much worse performance than any of the reduced variable sets. This provides strong empirical evidence for the need to triangulate using correlation matrices to reduce variables and consequent noise [45, 60, 61].
The question of how to evaluate this particular ANN prediction model arises from the various construct combinations shown in Table 5. Based on the MAE and also correct predictions, the PLS-SEM preprocessing was able to correctly identify the best set of independent variables for the ANN input vector. Exact matches of the CSE value may not be necessary and near misses (predictions within one of the actual value) may be just as useful. Interpreting the data in this way by recalculating the results based on an approximate match is a form of post-ANN method triangulation based on a heuristic, which is conceptually similar to cluster analysis. Expanding the analysis with this posttriangulation heuristic shows that the addition of the CA construct variables enables the ANN to achieve optimal performance for predicting CSE within 1, though the E only ANN model was a very close second.
Bansal et al. make a strong case for simplifying ANN models to reduce cost and improve the interpretation and usability of the resulting model. Since the CA construct did not significantly improve the within one performance of the ANN, the reduced cost of the E only ANN may be sufficient to outweigh the small gains achieved through the addition of the CA construct. The utility of including the computer anxiety construct in an ANN CSE prediction model must be weighed against the cost of obtaining reliable measurements for this construct.
The preceding example empirically demonstrates that utilizing traditional statistical methods can significantly improve ANN performance through the identification of the optimal set of independent variables, or in this case optimal antecedent constructs. The correlation matrix was able to eliminate 2 variables from the independent variable input set. Due to the population and data constraints, the selected PLS-SEM triangulation was able to further reduce the input vector size and accurately identified the optimal constructs for inclusion in the ANN model to predict the exact CSE value from 3 to 21.
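The three evaluation measures discussed here (MAE, exact matches, and near misses within one point) can be computed as in the following sketch. Rounding the ANN output to the nearest integer before comparing it with the actual score is an assumption, and the scores in the example are invented rather than taken from Table 5.

```python
def prediction_summary(actual, predicted, tolerance=1):
    """Return MAE plus the fractions of exact and near-miss predictions."""
    n = len(actual)
    mae = sum(abs(p - a) for a, p in zip(actual, predicted)) / n
    # Assumed: continuous ANN outputs are rounded to the integer CSE scale.
    rounded = [round(p) for p in predicted]
    exact = sum(r == a for a, r in zip(actual, rounded)) / n
    near = sum(abs(r - a) <= tolerance for a, r in zip(actual, rounded)) / n
    return {"mae": mae, "exact": exact, "within_tolerance": near}

# Invented composite CSE scores (range 3 to 21) and ANN outputs.
actual_scores = [10, 12, 15, 7, 18]
ann_outputs = [10.2, 13.1, 17.4, 7.4, 16.6]
summary = prediction_summary(actual_scores, ann_outputs)
```

In this toy case only two of five predictions are exact, but four of five fall within one point, which shows how the near-miss heuristic can change the apparent ranking of models.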
6.2. Post ANN Triangulation to Improve Results and Interpretation
As noted earlier, another utilization of methodological triangulation is to improve the performance or interpretation of ANN output. The results from Table 5 already demonstrate that a posttriangulation heuristic method can improve results by 150 to 185 percent.
For the case of predicting an individual’s CSE, an exact numeric value may not be necessary to utilize an ANN CSE prediction model’s output, since CSE is generally more broadly classified into levels or groups such as very high CSE, high CSE, moderate CSE, low CSE, and very low CSE. The output of the BP-trained ANN CSE prediction model is therefore triangulated further to determine group identification and to determine whether this additional processing improves the performance of the ANN CSE prediction models.
Values for delineating different levels of CSE are applied to the three aggregated CSE variables from the survey to distinguish five different CSE levels for the population from very low to very high (very low, low, moderate, high, and very high). For example, an individual with a CSE score between 3 and 7 inclusive is placed in the very low CSE group. This may be performed statistically using a k-means clustering algorithm with the number of clusters set to 5. The predicted CSE output values are also converted into a group classification using the same cutoffs that are applied to the user responses. The new five-group category classification results are displayed in Table 6 for each of the ANN construct models reported previously in Table 5, with the highest prediction accuracy values for each column highlighted.
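The conversion of scores into five CSE levels can be sketched with fixed cutoffs. Only the 3 to 7 "very low" band is stated in the text; the remaining cutoffs, the rounding and clamping of ANN outputs, and the example scores are hypothetical, and in practice the boundaries could instead come from a k-means clustering with five clusters.

```python
import bisect

LEVELS = ["very low", "low", "moderate", "high", "very high"]
# Inclusive upper bound of each level. Only the first bound (7) comes
# from the text; the other cutoffs are hypothetical.
UPPER_BOUNDS = [7, 10, 13, 16, 21]

def cse_level(score):
    """Map a composite CSE score (or ANN output) to a level index 0..4."""
    clamped = min(max(round(score), 3), 21)  # assumed handling of ANN outputs
    return bisect.bisect_left(UPPER_BOUNDS, clamped)

def group_accuracy(actual, predicted):
    """Fractions of predictions in the correct group and within one group."""
    pairs = [(cse_level(a), cse_level(p)) for a, p in zip(actual, predicted)]
    exact = sum(a == p for a, p in pairs) / len(pairs)
    within_one = sum(abs(a - p) <= 1 for a, p in pairs) / len(pairs)
    return exact, within_one

# Invented scores and ANN outputs for illustration.
accuracy = group_accuracy([5, 9, 12, 15, 20], [6.3, 11.2, 12.4, 13.8, 15.1])
```

The same cutoffs are applied to the survey scores and to the predicted values, mirroring the procedure described above.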
The E-construct-only ANN CSE group classification prediction places the user in the correct CSE group 67.95% and within one group 93.59% of the time. The remaining predictions for all of the ANN CSE group classification models are within two groups of the correct classification (meaning that a very-low or low CSE user is never categorized as a very high CSE user and very high and high CSE users are never classified as very low CSE users). It should be noted that the traditional antecedents of CSE, CA and a combination of OS and CA also produce an identical perfect CSE group classification compared to the E-only construct ANN model.
The E-only ANN CSE prediction model with triangulated output did not produce the highest within-one-group predictions, similar to the case with the exact CSE value predictions, but had the second highest performance and again was not statistically different from the E and CA and also the E, OS and CA ANN models. As mentioned before, the additional cost associated with collecting these variables versus the minimal performance increase needs to be evaluated to determine if the PLS-SEM constructs selection triangulation is merited.
CSE level classifications of individuals may also be viewed as fuzzy sets [92, 93], with the specific cutoffs for placement within a group (i.e., the boundary conditions) being indeterminate, but contained. For example, an individual with an aggregated CSE score of 17 working in a typical setting may be considered to belong to the very high CSE group, but that same individual employed at IBM, Texas Instruments, or Lockheed Martin might only be placed in the high or even medium CSE group. An overlap of 1 to 2 points between groups is enabled to simulate the application of a triangulated fuzzy algorithm that transforms the results into a more compatible fuzzy set notation, meaning that boundary members may be viewed as belonging to two possible groups, but still maintaining high homogeneity of members within a group. The fuzzy group classification results are also displayed in Table 6, again with the highest classification accuracy highlighted. The fuzzy classification results compared to the perfect nonfuzzy classification results indicate a 10 to 15 percent increase in classification accuracy for most of the ANN models and may ultimately be a more realistic approach for organizations in trying to determine the CSE level of employees.
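One way to read the overlapping-boundary idea is sketched below: each group's band is widened by the overlap, and a prediction counts as correct when the predicted group is one of the groups the actual score may belong to. This interpretation, the band cutoffs other than 3 to 7, the use of integer scores, and the example data are all assumptions rather than the algorithm applied in the study.

```python
# (low, high) inclusive band per group, from very low to very high.
# Only the 3-7 band is stated in the text; the rest are hypothetical.
BANDS = [(3, 7), (8, 10), (11, 13), (14, 16), (17, 21)]

def crisp_group(score):
    """Index of the single band containing an integer score."""
    return next(g for g, (lo, hi) in enumerate(BANDS) if lo <= score <= hi)

def fuzzy_groups(score, overlap=1):
    """All groups whose band, widened by the overlap, contains the score."""
    return {g for g, (lo, hi) in enumerate(BANDS)
            if lo - overlap <= score <= hi + overlap}

def fuzzy_accuracy(actual, predicted, overlap=1):
    """Share of predictions whose group is admissible for the actual score."""
    hits = sum(crisp_group(p) in fuzzy_groups(a, overlap)
               for a, p in zip(actual, predicted))
    return hits / len(actual)

# Invented integer scores: actual survey values and rounded ANN predictions.
actual_scores = [7, 12, 17, 5]
predicted_scores = [8, 12, 16, 9]
crisp = fuzzy_accuracy(actual_scores, predicted_scores, overlap=0)
fuzzy = fuzzy_accuracy(actual_scores, predicted_scores, overlap=1)
```

With a one-point overlap a score of 17 is treated as a member of both the high and very high groups, so a prediction of 16 for that individual is no longer counted as a miss.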
The fuzzy classification results displayed in Table 6 lend further empirical support for inclusion of the E construct in the CSE prediction model, with the E-only model performing second highest and not statistically different from the highest prediction percentage. The E, OS, and CA fuzzy interpretation model and the CA-only fuzzy interpretations are the highest. This lends partial support for the earlier findings that E and possibly OS are the two most significant constructs. Another corollary finding is the potential for CA to be included as a required construct if optimal fuzzy model performance is desired. The PLS-SEM results displayed in Figure 3 show that CA is the only other construct to have a net positive impact on CSE, though this is fairly small.
This paper has presented evidence that triangulation of methods can improve the performance of ANN classification and prediction models. A case study of an ANN solution that predicts an individual’s CSE was reported to provide further empirical evidence of the efficacy of methodological triangulation when employing ANNs.
The CSE prediction models utilized two different types of triangulation of methods: (1) preprocessing triangulation including both a correlation matrix and the statistical method (PLS-SEM) for sequential development to identify the optimal set of independent variables for developing the ANN prediction model and (2) a post-ANN clustering heuristic and possibly an additional fuzzy set algorithm for sequential elaboration to improve the classification performance of the ANN.
The preprocessing triangulation effectively reduced the independent variable set and also succeeded in identifying the most relevant construct, engagement, or E. The E-construct-only BP ANNs achieved optimal performance for perfect predictions of either the exact value or the group value of an individual’s CSE and had the lowest MAE. The triangulated clustering method may also help improve the understanding of the ANN’s output by transforming it into a more meaningful representation of the classification strategies that would be followed by a human resources department to identify computer application training needs of employees.
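As a rough illustration of the preprocessing idea, the sketch below screens candidate input variables by their Pearson correlation with the target before any ANN is trained. It shows only a simple correlation filter, not the PLS-SEM analysis used in the study; the threshold of 0.3, the function names, and all data values are assumptions made up for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def screen_inputs(candidates, target, threshold=0.3):
    """Keep candidates whose absolute correlation with the target
    meets the (assumed) threshold; return {name: correlation}."""
    selected = {}
    for name, values in candidates.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


# Made-up toy data for five respondents, for illustration only.
cse = [10, 12, 15, 17, 20]
candidates = {
    "E": [9, 11, 14, 18, 19],   # engagement
    "CA": [6, 5, 4, 2, 1],      # computer anxiety
    "X1": [3, 1, 4, 2, 3],      # prior PC experience
}

selected = screen_inputs(candidates, cse)
print(sorted(selected))  # variables that pass the screen
```

Only the variables that pass such a screen would then be offered to the ANN as inputs; in practice the choice of screening statistic and threshold depends on the type and constraints of the data.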
The CSE prediction problem utilized to provide empirical results for the recommended triangulation methods is not a trivial case. Directly measuring CSE is problematic. Utilizing antecedents to CSE that are measurable, such as engagement, provides reliable inputs to a CSE prediction model. This research has also demonstrated the need to include engagement in future research models of CSE. Subsequent output of a reliable CSE prediction may then be used as input to more complex models of technology acceptance and end-user training requirement models. Future research can investigate the use of ANN predicted CSE in the TAM and for more accurately predicting perceived ease of use.
From the review of the literature and the results presented here, future research involving the development of ANN classification or prediction models should utilize appropriate statistical methods to determine input variables for the ANN model. Additionally, when appropriate, the demystification of ANN output through posttriangulation methods can only serve to improve adoption of ANN methodologies.
While this paper has focused on how to triangulate statistical and other methods with ANNs, where the ANN serves as the primary modeling paradigm, ANNs themselves may also be utilized as a triangulating method to improve statistical and other modeling methods. Various researchers [95–97] have advocated the utilization of ANNs to assist in determining when a solution set has nonlinear properties and thus should not be modeled using strictly linear statistical modeling methods.
Additionally, ANNs may be used to estimate posterior probabilities in classification problems [98]. Rustum and Adeloye [99] claim that ANNs may also be used to reliably fill in missing data in domains that typically have very low quantities of data and where data may be noisy or missing. This would then enable other research modeling techniques to be applied to larger and cleaner sets of data.
The methodology proposed in this paper has been shown to be effective, at least for the domain of CSE. Prior research has already utilized method triangulation, but without analyzing the effect of the triangulation. Some final precautions should be noted. The selection of both pre- and posttriangulation statistical tools and other methods to incorporate into any ANN research or development process is highly dependent on the type of data and goals of the research model. As noted in Figure 3, alternate processes are available and selection of the appropriate statistical method is reliant on the type and constraints of the data. However, as demonstrated, if appropriate statistical and other methods (e.g., heuristic) are implemented, the results of the corresponding ANN models can be improved by 750 percent or more (difference between the full variable basin BP ANN model (bottom row of Table 5) and the perfect fuzzified predictions (Table 6)).
In Table 7, all survey questions are measured using a 7-point scale from 1 (highly disagree) to 7 (highly agree). Each question is adapted from the indicated sources to fit the experiment context (CSE = computer self-efficacy; OS = organizational support; CA = computer anxiety; and E = engagement).
Additional questions asked on the survey to determine prior experience (X) utilized a 4-point scale (1 = none; 2 = less than 1 year; 3 = 1 to 3 years; and 4 = more than 3 years).
How many years prior experience have you had with computer-based training?
How many years prior experience have you had with personal computers? (X1)
How many years prior experience have you had with business application software?
How many years prior work experience have you had? (X2).
The author would like to thank the reviewers of AANS for their insights that helped make this work more understandable and valuable as a research methodology guide. Additional acknowledgement is given to Associate Professor Judy Scott at the University of Colorado Denver. She provided the initial idea for the case study presented in this paper as well as all of the data for the case study.
- R. Dybowski and V. Gant, “Artificial neural networks in pathology and medical laboratories,” The Lancet, vol. 346, no. 8984, pp. 1203–1207, 1995.
- E. Y. Li, “Artificial neural networks and their business applications,” Information and Management, vol. 27, no. 5, pp. 303–313, 1994.
- S. H. Liao and C. H. Wen, “Artificial neural networks classification and clustering of methodologies and applications—literature analysis from 1995 to 2005,” Expert Systems with Applications, vol. 32, no. 1, pp. 1–11, 2007.
- G. Montague and J. Morris, “Neural-network contributions in biotechnology,” Trends in Biotechnology, vol. 12, no. 8, pp. 312–324, 1994.
- K. A. Smith and J. N. D. Gupta, “Neural networks in business: techniques and applications for the operations researcher,” Computers and Operations Research, vol. 27, no. 11-12, pp. 1023–1044, 2000.
- B. Widrow, D. E. Rumelhart, and M. A. Lehr, “Neural networks: applications in industry, business and science,” Communications of the ACM, vol. 37, no. 3, pp. 93–105, 1994.
- B. K. Wong, T. A. Bodnovich, and Y. Selvi, “Neural network applications in business: a review and analysis of the literature (1988-95),” Decision Support Systems, vol. 19, no. 4, pp. 301–320, 1997.
- B. K. Wong, V. S. Lai, and J. Lam, “A bibliography of neural network business applications research: 1994–1998,” Computers and Operations Research, vol. 27, no. 11-12, pp. 1045–1076, 2000.
- F. Zahedi, “A meta-analysis of financial applications of neural networks,” International Journal of Computational Intelligence and Organization, vol. 1, no. 3, pp. 164–178, 1996.
- K. B. DeTienne, D. H. DeTienne, and S. A. Joshi, “Neural networks as statistical tools for business researchers,” Organizational Research Methods, vol. 6, no. 2, pp. 236–265, 2003.
- G. P. Zhang, “Avoiding pitfalls in neural network research,” IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 37, no. 1, pp. 3–16, 2007.
- B. Warner and M. Misra, “Understanding neural networks as statistical tools,” American Statistician, vol. 50, no. 4, pp. 284–293, 1996.
- A. Bansal, R. J. Kauffman, and R. R. Weitz, “Comparing the modeling performance of regression and neural networks as data quality varies: a business value approach,” Journal of Management Information Systems, vol. 10, no. 1, pp. 11–32, 1993.
- U. A. Kumar, “Comparison of neural networks and regression analysis: a new insight,” Expert Systems with Applications, vol. 29, no. 2, pp. 424–430, 2005.
- S. W. Palocsay and M. M. White, “Neural network modeling in cross-cultural research: a comparison with multiple regression,” Organizational Research Methods, vol. 7, no. 4, pp. 389–399, 2004.
- S. Sakai, K. Kobayashi, S. I. Toyabe, N. Mandai, T. Kanda, and K. Akazawa, “Comparison of the levels of accuracy of an artificial neural network model and a logistic regression model for the diagnosis of acute appendicitis,” Journal of Medical Systems, vol. 31, no. 5, pp. 357–364, 2007.
- S. Ghosh-Dastidar, H. Adeli, and N. Dadmehr, “Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection,” IEEE Transactions on Biomedical Engineering, vol. 54, no. 9, pp. 1545–1551, 2007.
- K. E. Graves and R. Nagarajah, “Uncertainty estimation using fuzzy measures for multiclass classification,” IEEE Transactions on Neural Networks, vol. 18, no. 1, pp. 128–140, 2007.
- R. C. Lacher, P. K. Coats, S. C. Sharma, and L. F. Fant, “A neural network for classifying the financial health of a firm,” European Journal of Operational Research, vol. 85, no. 1, pp. 53–65, 1995.
- R. Sharda, “Neural networks for the MS/OR analyst: an application bibliography,” Interfaces, vol. 24, no. 2, pp. 116–130, 1994.
- V. Subramanian, M. S. Hung, and M. Y. Hu, “An experimental evaluation of neural networks for classification,” Computers and Operations Research, vol. 20, no. 7, pp. 769–782, 1993.
- J. Farifteh, F. Van der Meer, C. Atzberger, and E. J. M. Carranza, “Quantitative analysis of salt-affected soil reflectance spectra: a comparison of two adaptive methods (PLSR and ANN),” Remote Sensing of Environment, vol. 110, no. 1, pp. 59–78, 2007.
- B. A. Jain and B. N. Nag, “Performance evaluation of neural network decision models,” Journal of Management Information Systems, vol. 14, no. 2, pp. 201–216, 1997.
- L. M. Salchenberger, E. M. Cinar, and N. A. Lash, “Neural networks: a new tool for predicting thrift failures,” Decision Sciences, vol. 23, no. 4, pp. 899–916, 1992.
- I. Kurt, M. Ture, and A. T. Kurum, “Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease,” Expert Systems with Applications, vol. 34, no. 1, pp. 366–374, 2008.
- T. S. Lim, W. Y. Loh, and Y. S. Shih, “Comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms,” Machine Learning, vol. 40, no. 3, pp. 203–228, 2000.
- K. Y. Tam and M. Y. Kiang, “Managerial applications of neural networks: the case of bank failure predictions,” Management Science, vol. 38, no. 3, pp. 926–947, 1992.
- G. K. F. Tso and K. K. W. Yau, “Predicting electricity energy consumption: a comparison of regression analysis, decision tree and neural networks,” Energy, vol. 32, no. 9, pp. 1761–1768, 2007.
- L. Yang, C. W. Dawson, M. R. Brown, and M. Gell, “Neural network and GA approaches for dwelling fire occurrence prediction,” Knowledge-Based Systems, vol. 19, no. 4, pp. 213–219, 2006.
- J. Mingers, “Combining IS research methods: towards a pluralist methodology,” Information Systems Research, vol. 12, no. 3, pp. 240–259, 2001.
- A. Tashakkori and C. Teddlie, Mixed Methodology: Combining Qualitative and Quantitative Approaches, Sage, London, UK, 1998.
- S. C. Petter and M. J. Gallivan, “Toward a framework for classifying and guiding mixed method research in information systems,” in Proceedings of the 37th Hawaii International Conference on System Sciences, pp. 4061–4070, IEEE Computer Society, Los Alamitos, Calif, USA, 2004.
- P. M. Podsakoff, S. B. MacKenzie, J. Y. Lee, and N. P. Podsakoff, “Common method biases in behavioral research: a critical review of the literature and recommended remedies,” Journal of Applied Psychology, vol. 88, no. 5, pp. 879–903, 2003.
- P. M. Podsakoff and D. W. Organ, “Self-reports in organizational research: problems and prospects,” Journal of Management, vol. 12, no. 4, pp. 531–554, 1986.
- S. Walczak, “Evaluating medical decision making heuristics and other business heuristics with neural networks,” in Intelligent Decision Making: An AI-Based Approach, G. Phillips-Wren and L. C. Jain, Eds., chapter 10, Springer, New York, NY, USA, 2008.
- K. Hornik, “Approximation capabilities of multilayer feedforward networks,” Neural Networks, vol. 4, no. 2, pp. 251–257, 1991.
- K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
- H. White, “Connectionist nonparametric regression: multilayer feedforward networks can learn arbitrary mappings,” Neural Networks, vol. 3, no. 5, pp. 535–549, 1990.
- N. R. Swanson and H. White, “A model-selection approach to assessing the information in the term structure using linear models and artificial neural networks,” Journal of Business & Economic Statistics, vol. 13, no. 3, pp. 265–275, 1995.
- S. A. Billings and G. L. Zheng, “Radial basis function network configuration using genetic algorithms,” Neural Networks, vol. 8, no. 6, pp. 877–890, 1995.
- J. N. D. Gupta, R. S. Sexton, and E. A. Tunc, “Selecting scheduling heuristics using neural networks,” INFORMS Journal on Computing, vol. 12, no. 2, pp. 150–162, 2000.
- M. Rocha, P. Cortez, and J. Neves, “Evolution of neural networks for classification and regression,” Neurocomputing, vol. 70, no. 16-18, pp. 2809–2816, 2007.
- R. Sexton, “Identifying irrelevant variables in chaotic time series problems: using the genetic algorithm for training neural networks,” Journal of Computational Intelligence in Finance, vol. 6, no. 5, pp. 34–42, 1998.
- L. Yi-Hui, “Evolutionary neural network modeling for forecasting the field failure data of repairable systems,” Expert Systems with Applications, vol. 33, no. 4, pp. 1090–1096, 2007.
- A. Tahai, S. Walczak, and J. T. Rigsby, “Improving artificial neural network performance through input variable selection,” in Applications of Fuzzy Sets and The Theory of Evidence to Accounting II, P. Siegel, K. Omer, A. deKorvin, and A. Zebda, Eds., pp. 277–292, JAI Press, Stamford, Conn, USA, 1998.
- Z. Hua, Y. Wang, X. Xu, B. Zhang, and L. Liang, “Predicting corporate financial distress based on integration of support vector machine and logistic regression,” Expert Systems with Applications, vol. 33, no. 2, pp. 434–440, 2007.
- R. J. Kuo, Y. L. An, H. S. Wang, and W. J. Chung, “Integration of self-organizing feature maps neural network and genetic K-means algorithm for market segmentation,” Expert Systems with Applications, vol. 30, no. 2, pp. 313–324, 2006.
- R. J. Kuo, L. M. Ho, and C. M. Hu, “Integration of self-organizing feature map and K-means algorithm for market segmentation,” Computers and Operations Research, vol. 29, no. 11, pp. 1475–1493, 2002.
- T. Marwala, “Bayesian training of neural networks using genetic programming,” Pattern Recognition Letters, vol. 28, no. 12, pp. 1452–1458, 2007.
- D. Dancey, Z. A. Bandar, and D. McLean, “Logistic model tree extraction from artificial neural networks,” IEEE Transactions on Systems, Man, and Cybernetics B, vol. 37, no. 4, pp. 794–802, 2007.
- K. L. Hsieh and Y. S. Lu, “Model construction and parameter effect for TFT-LCD process based on yield analysis by using ANNs and stepwise regression,” Expert Systems with Applications, vol. 34, no. 1, pp. 717–724, 2008.
- R. Setiono, J. Y. L. Thong, and C. S. Yap, “Symbolic rule extraction from neural networks: an application to identifying organizations adopting IT,” Information and Management, vol. 34, no. 2, pp. 91–101, 1998.
- T. G. Dietterich, “Ensemble methods in machine learning,” in Proceedings of the 1st International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, pp. 1–15, Springer Verlag, Cagliari, Italy, 2000.
- L. K. Hansen and P. Salamon, “Neural network ensembles,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993–1001, 1990.
- G. P. Zhang and V. L. Berardi, “Time series forecasting with neural network ensembles: an application for exchange rate prediction,” Journal of the Operational Research Society, vol. 52, no. 6, pp. 652–664, 2001.
- D. Chetchotsak and J. M. Twomey, “Combining neural networks for function approximation under conditions of sparse data: the biased regression approach,” International Journal of General Systems, vol. 36, no. 4, pp. 479–499, 2007.
- Z. H. Zhou and Y. Jiang, “NeC4.5: neural ensemble based C4.5,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 6, pp. 770–773, 2004.
- S. Walczak and M. Parthasarathy, “Modeling online service discontinuation with nonparametric agents,” Information Systems and e-Business Management, vol. 4, no. 1, pp. 49–70, 2006.
- R. Pakath and J. S. Zaveri, “Specifying critical inputs in a genetic algorithm-driven decision support system: an automated facility,” Decision Sciences, vol. 26, no. 6, pp. 749–771, 1995.
- M. Smith, Neural Networks for Statistical Modeling, Van Nostrand Reinhold, New York, NY, USA, 1993.
- S. Walczak and N. Cerpa, “Heuristic principles for the design of artificial neural networks,” Information and Software Technology, vol. 41, no. 2, pp. 107–117, 1999.
- H. Z. Huang, R. Bo, and W. Chen, “An integrated computational intelligence approach to product concept generation and evaluation,” Mechanism and Machine Theory, vol. 41, no. 5, pp. 567–583, 2006.
- S. Walczak, “An empirical analysis of data requirements for financial forecasting with neural networks,” Journal of Management Information Systems, vol. 17, no. 4, pp. 203–222, 2001.
- A. Durand, O. Devos, C. Ruckebusch, and J. P. Huvenne, “Genetic algorithm optimisation combined with partial least squares regression and mutual information variable selection procedures in near-infrared quantitative analysis of cotton-viscose textiles,” Analytica Chimica Acta, vol. 595, no. 1-2, pp. 72–79, 2007.
- F. D. Davis, “Perceived usefulness, perceived ease of use, and user acceptance of information technology,” MIS Quarterly: Management Information Systems, vol. 13, no. 3, pp. 319–339, 1989.
- G. M. Marakas, M. Y. Yi, and R. D. Johnson, “The multilevel and multifaceted character of computer self-efficacy: toward clarification of the construct and an integrative framework for research,” Information Systems Research, vol. 9, no. 2, pp. 126–163, 1998.
- J. J. Martocchio, “Effects of conceptions of ability on anxiety, self-efficacy, and learning in training,” Journal of Applied Psychology, vol. 79, no. 6, pp. 819–825, 1994.
- R. T. Christoph, G. A. Schoenfeld Jr, and J. W. Tansky, “Overcoming barriers to training utilizing technology: the influence of self-efficacy factors on multimedia-based training receptiveness,” Human Resource Development Quarterly, vol. 9, no. 1, pp. 25–38, 1998.
- M. E. Gist, C. Schwoerer, and B. Rosen, “Effects of alternative training methods on self-efficacy and performance in computer software training,” Journal of Applied Psychology, vol. 74, no. 6, pp. 884–891, 1989.
- J. E. Mathieu, J. W. Martineau, and S. I. Tannenbaum, “Individual and situational influences on the development of self-efficacy: implications for training effectiveness,” Personnel Psychology, vol. 46, no. 1, pp. 125–147, 1993.
- R. Agarwal, V. Sambamurthy, and R. M. Stair, “Research report: the evolving relationship between general and specific computer self-efficacy—an empirical assessment,” Information Systems Research, vol. 11, no. 4, pp. 418–430, 2000.
- W. Hong, J. Y. L. Thong, W. M. Wong, and K. Y. Tam, “Determinants of user acceptance of digital libraries: an empirical examination of individual differences and system characteristics,” Journal of Management Information Systems, vol. 18, no. 3, pp. 97–124, 2001.
- V. Venkatesh, “Determinants of perceived ease of use: integrating control, intrinsic motivation, and emotion into the technology acceptance model,” Information Systems Research, vol. 11, no. 4, pp. 342–365, 2000.
- M. Igbaria and J. Iivari, “The effects of self-efficacy on computer usage,” Omega, vol. 23, no. 6, pp. 587–605, 1995.
- D. R. Compeau and C. A. Higgins, “Computer self-efficacy: development of a measure and initial test,” MIS Quarterly: Management Information Systems, vol. 19, no. 2, pp. 189–210, 1995.
- B. Hasan, “The influence of specific computer experiences on computer self-efficacy beliefs,” Computers in Human Behavior, vol. 19, no. 4, pp. 443–450, 2003.
- R. D. Johnson and G. M. Marakas, “Research report: the role of behavioral modeling in computer skills acquisition—toward refinement of the model,” Information Systems Research, vol. 11, no. 4, pp. 402–417, 2000.
- R. W. Stone and J. W. Henry, “The roles of computer self-efficacy and outcome expectancy in influencing the computer end-user's organizational commitment,” Journal of End User Computing, vol. 15, no. 1, pp. 38–53, 2003.
- S. Taylor and P. Todd, “Assessing IT usage: the role of prior experience,” MIS Quarterly: Management Information Systems, vol. 19, no. 4, pp. 561–568, 1995.
- R. Torkzadeh, K. Pflughoeft, and L. Hall, “Computer self-efficacy, training effectiveness and user attitudes: an empirical study,” Behaviour and Information Technology, vol. 18, no. 4, pp. 299–309, 1999.
- J. B. Thatcher and P. L. Perrewé, “An empirical examination of individual traits as antecedents to computer anxiety and computer self-efficacy,” MIS Quarterly: Management Information Systems, vol. 26, no. 4, pp. 381–396, 2002.
- S. Taylor and P. A. Todd, “Understanding information technology usage: a test of competing models,” Information Systems Research, vol. 6, no. 2, pp. 144–176, 1995.
- D. S. Staples, J. S. Hulland, and C. A. Higgins, “A self-efficacy theory explanation for the management of remote workers in virtual organizations,” Organization Science, vol. 10, no. 6, pp. 758–776, 1999.
- R. Agarwal and J. Prasad, “A conceptual and operational definition of personal innovativeness in the domain of information technology,” Information Systems Research, vol. 9, no. 2, pp. 204–215, 1998.
- J. Webster and H. Ho, “Audience engagement in multimedia presentations,” Data Base for Advances in Information Systems, vol. 28, no. 2, pp. 63–76, 1997.
- J. Webster and J. J. Martocchio, “Microcomputer playfulness: development of a measure with workplace implications,” MIS Quarterly: Management Information Systems, vol. 16, no. 2, pp. 201–224, 1992.
- F. D. Davis, R. P. Bagozzi, and P. R. Warshaw, “Extrinsic and intrinsic motivation to use computers in the workplace,” Journal of Applied Social Psychology, vol. 22, no. 14, pp. 1111–1132, 1992.
- J. F. Hair, W. C. Black, B. J. Babin, and R. E. Anderson, Multivariate Data Analysis, Prentice Hall, Upper Saddle River, NJ, USA, 7th edition, 2010.
- W. W. Chin, “The partial least squares approach for structural equation modeling,” in Modern Methods for Business Research, G. A. Marcoulides, Ed., pp. 295–336, Lawrence Erlbaum Associates, Hillsdale, NJ, USA, 1998.
- E. Barnard and L. Wessels, “Extrapolation and interpolation in neural network classifiers,” IEEE Control Systems, vol. 12, no. 5, pp. 50–53, 1992.
- B. Efron, The Jackknife, the Bootstrap, and Other Resampling Plans, SIAM, Philadelphia, Pa, USA, 1982.
- F. Höppner, F. Klawonn, R. Kruse, and T. Runkler, Fuzzy Cluster Analysis, Wiley, New York, NY, USA, 1999.
- L. A. Zadeh, “Fuzzy sets,” Information and Control, vol. 8, no. 3, pp. 338–353, 1965.
- M. Gist, “Self-efficacy: implications for organizational behavior and human resource management,” Academy of Management Review, vol. 12, no. 3, pp. 472–485, 1987.
- T. H. Lee, H. White, and C. W. J. Granger, “Testing for neglected nonlinearity in time series models. A comparison of neural network methods and alternative tests,” Journal of Econometrics, vol. 56, no. 3, pp. 269–290, 1993.
- D. Scarborough and M. J. Somers, Neural Networks in Organizational Research, American Psychological Association, Washington, DC, USA, 2006.
- M. J. Somers, “Thinking differently: assessing nonlinearities in the relationship between work attitudes and job performance using a Bayesian neural network,” Journal of Occupational and Organizational Psychology, vol. 74, no. 1, pp. 47–61, 2001.
- M. S. Hung, M. Y. Hu, M. S. Shanker, and B. E. Patuwo, “Estimating posterior probabilities in classification problems with neural networks,” International Journal of Computational Intelligence and Organization, vol. 1, no. 1, pp. 49–60, 1996.
- R. Rustum and A. J. Adeloye, “Replacing outliers and missing values from activated sludge data using kohonen self-organizing map,” Journal of Environmental Engineering, vol. 133, no. 9, pp. 909–916, 2007.