Abstract

The COVID-19 pandemic, a public health crisis of worldwide importance that the World Health Organization (WHO) announced as an outbreak in January 2020, has made distance education through the E-learning system an urgent and irreplaceable requirement. This study assessed the factors affecting students’ online learning outcomes during the COVID-19 pandemic through questionnaire-based interviews with 404 students selected by the convenience sampling method. The study applied reliability analysis through Cronbach’s Alpha and Bayesian Exploratory Factor Analysis (BEFA). The evaluation of the research scales showed that 28 observed variables were used to measure 7 research concepts. The hypothesis tests showed that students’ online learning outcomes are affected by 6 factors, listed in descending order of impact: learner characteristics, perceived usefulness, course content, course design, ease of use, and faculty capacity.

1. Introduction

Recently, advances in modern computer and network technology have driven the development of distance education [1]. In addition, the COVID-19 pandemic, a public health crisis of worldwide importance, announced by the World Health Organization (WHO) in January 2020 as an outbreak, has made distance education through the E-learning system an urgent and irreplaceable requirement. Despite the current pandemic that is hindering education worldwide, online learning based on Internet services has become available and universal, facilitating the learning system. Colleges and universities use online resources to continue their educational journey through software applications such as Zoom and Microsoft Teams.

As a result, the effectiveness of E-learning and students’ online learning outcomes have become a matter of concern for universities in particular and society in general. In fact, there has been a significant increase in research on factors affecting students’ online learning outcomes. According to [2], improved communication technologies make learning systems easier to use, since access to social media is a beneficial source of information and communication. Online technology is seen as an active element of both students’ and lecturers’ learning systems. During the pandemic era, several nations used television broadcasts and online sources to promote distance education. Prioritizing distance education primarily through online systems is a “model change in education.” The disruption of education creates uncertainty about learners’ futures and underlines the importance of technology in our lives. Online learning is a useful tool to overcome the challenges of the pandemic crisis in particular and other difficulties in general [3]. However, many have argued that online learning itself amounts to an education crisis. Most learners are not interested in online learning because of limited interaction, unstable sound and visual quality caused by dependence on Internet quality, and technological equipment that does not meet demand. Therefore, this study aimed to explore the factors that affect students’ outcomes during the online learning process.

Previous studies on the factors affecting students’ online learning outcomes used the traditional exploratory factor analysis (EFA) method to identify the representative factors. This study contributes to the existing empirical literature by integrating the Bayesian approach into traditional EFA, so that the dimension of the factor model, the allocation of manifest variables to factors, and the factor loadings are selected simultaneously. Theoretically, traditional EFA is divided into four steps: (i) choosing the dimension of the factor model; (ii) allocating manifest variables to factors; (iii) estimating factor loadings; and (iv) discarding measurements that load on multiple factors. There are several methods for selecting the number of latent factors and for extracting and rotating factors [4–6]. However, each choice made by the analyst at each stage of a traditional EFA has substantial consequences for the estimated factor structure [7]. To overcome this problem, Conti et al. [7] proposed not to fix the number of factors in the first step but to select it together with the other parameters using the Bayesian approach. Besides, with this approach, the allocation of manifest variables to factors is based on the model with the highest posterior probability. These are the fundamental ideas that prompted us to conduct this study.

The study is structured as follows: A literature review is provided in Section 2, followed by research methodology in Section 3, while empirical results are described in Section 4. Finally, the conclusion and policy implications are reported in Section 5.

2. Literature Review

The theory of factors affecting online learning outcomes of students in particular and the effectiveness of using technology, in general, is derived from the technology acceptance model (TAM) proposed in [8]. Davis proposed TAM to explain people’s attitudes and behaviors in adopting technology in the presence of other external variables. This model is often applied in the study of technology use behavior to understand the reasons for accepting or rejecting information systems. Information technology plays a prominent role in teaching as it can encourage innovation, provide new learning spaces, and transform teaching activities [9, 10], all associated with the ease of IT operations. Ease of operation, user experience convenience, and proficiency in information technologies directly affect users’ perception and motivation to learn [11]. Studies have proven that factors in TAM such as perceived ease of use and perceived usefulness positively impact student learning outcomes.

2.1. Perceived Ease of Use

Online learning platforms are designed for the purpose of knowledge sharing and learning. Today, as we live in a globalized world, using technology to obtain knowledge, acquire information, and learn has become a daily need [12]. These platforms are easy to use and accessible, facilitating knowledge-sharing processes. Many studies have shown that the ease of use, accessibility, and transmission speed of online media and mobile devices are an important part of the learning process. Easier access increases the adaptability of online learning, thus resulting in positive outcomes [13, 14]. Based on these rationales, the following hypothesis is designed for this study.

H1: perceived ease of use has a positive effect on students’ online learning outcomes.

2.2. Perceived Usefulness

Perceived usefulness is the degree to which learners believe that the use of online learning will help improve their performance [8]. The usefulness of online learning is demonstrated by helping learners save travel time and travel costs and access a variety of methods [15]. Many studies have shown that perceived usefulness positively impacts learners’ attitudes and motivation, thereby improving learning outcomes [2, 13]. Based on these rationales, the following hypothesis is designed for this study.

H2: perceived usefulness has a positive effect on students’ online learning outcomes.

2.3. Faculty Capacity

The approach in the online learning process is learner-centered rather than teacher-centered as in traditional education [16]. Pedagogical methods, professional competence, science and technology application level, the ability to form and combine different ideas, and practices in developing online course contents in higher education help students achieve better learning outcomes [17–20]. Based on these rationales, the following hypothesis is designed for this study.

H3: faculty capacity has a positive effect on students’ online learning outcomes.

2.4. Course Content

Engaging course content encourages participation and proactiveness among students, thereby influencing learning outcomes [21, 22]. The E-learning content includes the structure and content of the chapters of the learning materials. Besides, the E-learning content also includes additional materials that help students understand the knowledge more clearly and deeply [23]. This factor facilitates the improvement of students’ analytical and critical thinking and problem-solving skills [24]. Based on these rationales, the following hypothesis is designed for this study.

H4: course content has a positive effect on students’ online learning outcomes.

2.5. Course Design

E-learning course design includes structure, course design interface, testing and evaluation methods, and exchange forums between lecturers and learners. A good course design will attract students and make it easier for them to learn through online classes [25]. The course design interface is used to introduce course content, is designed according to students’ competence and level of understanding, and is appropriate in terms of time and space to promote and support the self-study process [26–28]. Based on these rationales, the following hypothesis is designed for this study.

H5: course design has a positive effect on students’ online learning outcomes.

2.6. Learner Characteristics

Social interaction with lecturers and with co-learners is imperative to achieve better online learning quality. Through strong interaction and consistent practice, the effectiveness of online learning can be achieved [29–31]. In addition, proactiveness, self-study ability, and a sense of compliance are important requirements for achieving better learning outcomes, since the regulations and requirements of online learning are more relaxed and the process is more difficult to control than in traditional methods. Based on these rationales, the following hypothesis is designed for this study.

H6: learner characteristics have an effect on students’ online learning outcomes.

3. Research Methodology

3.1. Research Model

The theoretical framework denoting the study hypotheses as presented in Figure 1 was derived based on the literature discussed above.

3.2. Research Process

This study is conducted in two phases, as shown in Figure 2.

3.2.1. Phase 1

Preliminary research is conducted through qualitative research and preliminary quantitative research methods. Specifically, qualitative research is used to discover, adjust, and supplement observed variables in each scale of the model. Qualitative research is conducted through group discussion techniques with experts and managers in the research field. The scales built from qualitative research are then retested through preliminary quantitative research. Preliminary quantitative research is conducted with a small sample (70 students) to test the reliability of each scale with Cronbach’s Alpha coefficient and exploratory factor analysis. The purpose of preliminary quantitative research is to test and adjust the scales to suit the actual research data. Preliminary research has formed the scales of the factors in the research model, as shown in Table 1.

The five-point Likert scale was used in this study for all observed variables of each factor. The 5-point Likert scale is used in the ascending order of magnitude. Specifically, 1 indicates “strongly disagree,” 2 indicates “disagree,” 3 indicates “normal,” 4 indicates “agree,” and 5 indicates “strongly agree.”

3.2.2. Phase 2

Official research is also conducted using quantitative research methods. The official research is conducted to test the scale’s reliability using Cronbach’s Alpha coefficient, exploratory factor analysis by the Bayesian method, and multivariate regression analysis (OLS). The purpose of the official research is to test the model and the research hypotheses. Data for official quantitative research were collected through direct and indirect interviews with students at universities by questionnaires from official scales after preliminary research.

Testing the Scale’s Reliability with Cronbach’s Alpha Coefficient. According to [32], a scale is considered reliable when its observed variables have a corrected item-total correlation greater than 0.3 and its Cronbach’s Alpha is greater than 0.6.
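The two reliability criteria can be illustrated with a minimal sketch. The responses below are hypothetical and are not taken from the survey data; the function names are chosen for illustration only, and the computation uses the standard textbook definitions of Cronbach's Alpha and the corrected item-total correlation.

```python
from statistics import mean, variance

def cronbach_alpha(items):
    """items: one list of responses per observed variable (same respondents, same order)."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]           # total score per respondent
    item_var_sum = sum(variance(col) for col in items)   # sum of the item variances
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(items):
    """Correlation of each item with the sum of the other items of the scale."""
    result = []
    for j, col in enumerate(items):
        rest = [sum(row) - row[j] for row in zip(*items)]
        result.append(pearson(col, rest))
    return result

# Hypothetical 5-point Likert responses: 4 items answered by 6 respondents.
scale = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [5, 5, 2, 4, 3, 4],
    [4, 5, 3, 4, 2, 4],
]

alpha = cronbach_alpha(scale)
item_total = corrected_item_total(scale)
# Thresholds as stated in the text: alpha > 0.6 and item-total correlation > 0.3.
reliable = alpha > 0.6 and all(r > 0.3 for r in item_total)
print(round(alpha, 3), [round(r, 3) for r in item_total], reliable)
```

An item whose corrected item-total correlation falls below 0.3 would be a candidate for removal before the factor analysis.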

Bayesian Exploratory Factor Analysis (BEFA). Bayesian analysis consists of specifying a posterior model. Based on the observed data and some prior information, Bayesian analysis produces the posterior distribution of all parameters. The posterior distribution therefore has two parts: a likelihood, which contains information about the model parameters based on the observed data, and a prior distribution, which includes information about the model parameters before the data are observed. The Bayes rule is used to combine the likelihood function and the prior distribution to create the posterior distribution:

$$p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)} \propto p(X \mid \theta)\, p(\theta),$$

where $\theta$ denotes the model parameters and $X$ denotes the observed data.

Simulations were employed to estimate the posterior distribution. Markov chain Monte Carlo (MCMC) may be used to simulate potentially complex posterior models with arbitrary accuracy. However, specifying an effective sampling algorithm and verifying the convergence of the MCMC to the posterior distribution are typically difficult.
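To make the idea of simulating a posterior distribution concrete, the following is a minimal sketch of a random-walk Metropolis sampler for a toy problem (the mean of normally distributed data with a normal prior). It is not the sampler used for BEFA; the data, the prior variance, and the proposal step size are assumptions chosen for illustration. The toy posterior has a closed form, so the simulated posterior mean can be compared with the exact one.

```python
import math
import random

random.seed(0)
# Hypothetical data: 20 draws from a normal distribution with unit variance.
y = [random.gauss(1.5, 1.0) for _ in range(20)]

def log_posterior(mu):
    """Unnormalised log posterior: N(mu, 1) likelihood times N(0, 10^2) prior."""
    log_lik = -0.5 * sum((v - mu) ** 2 for v in y)
    log_prior = -0.5 * mu ** 2 / 100.0
    return log_lik + log_prior

n_iter, burn_in = 27500, 2500   # same run length and burn-in as reported in the text
mu, lp = 0.0, log_posterior(0.0)
draws = []
for t in range(n_iter):
    proposal = mu + random.gauss(0.0, 0.5)            # random-walk proposal
    lp_prop = log_posterior(proposal)
    accept_prob = math.exp(min(0.0, lp_prop - lp))    # Metropolis acceptance probability
    if random.random() < accept_prob:
        mu, lp = proposal, lp_prop
    if t >= burn_in:                                  # discard the burn-in draws
        draws.append(mu)

post_mean_mcmc = sum(draws) / len(draws)
# Closed-form posterior for this conjugate toy model.
post_var = 1.0 / (len(y) + 1.0 / 100.0)
post_mean_exact = post_var * sum(y)
print(len(draws), round(post_mean_mcmc, 3), round(post_mean_exact, 3))
```

In realistic models no closed form is available, which is why convergence has to be checked with diagnostics such as trace plots.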

In addition, prior distributions for all model parameters in a Bayesian model must be specified. In a Bayesian model, prior distributions or priors are considered key components, so they must be selected carefully.

The basic factor analysis model is written as

$$X_i = \Lambda F_i + \varepsilon_i,$$

where $X_i$ is a vector consisting of M observed variables for individual i, i = 1, 2, …, N. The residual idiosyncratic terms (“uniquenesses”) are denoted by $\varepsilon_i$. The latent common factors of the model are denoted by $F_i$. $\Lambda$ is the factor loading matrix that indicates the relation between the observed variables X and the latent common factors F.

As given in [7], to perform the allocation of observed variables to each factor, we also use a matrix of binary indicators ∆ with the same size as the factor loading matrix $\Lambda$. Each row of ∆ indicates the latent factor on which the corresponding variable loads. For example, if the m-th variable loads on the k-th factor, then the m-th row of ∆ is the indicator vector $e_k$:

$$e_k = (0, \ldots, 0, 1, 0, \ldots, 0), \quad \text{with the 1 in the } k\text{-th position}.$$

When a variable does not load on any factor, the corresponding row of ∆ only contains zeros. We assume that no variable may load on more than one factor. This means that $\sum_{k=1}^{K} \Delta_{mk} \le 1$ for every m = 1, 2, …, M.

According to [7], to perform BEFA, it is necessary to determine the prior distributions for $\tau_k$ (the probability that a variable loads on factor k), $\sigma^2$ (the idiosyncratic variances), $\Lambda$ (the factor loadings), and $R$ (the correlation matrix of the factors). In this study, we use the prior distributions for these parameters, as suggested in [7].
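The role of the binary indicator matrix ∆ described above can be sketched as follows. The allocation used here is hypothetical (7 variables, at most 3 factors) and the function name is chosen for illustration; the sketch only shows how each row is either an indicator vector or a row of zeros, and how a column of zeros means that the corresponding factor is not used.

```python
def indicator_matrix(allocation, n_factors):
    """Build the binary matrix Delta.

    allocation[m] is the factor (1..n_factors) on which variable m loads,
    or 0 when the variable loads on no factor.
    """
    return [[1 if k == a else 0 for k in range(1, n_factors + 1)]
            for a in allocation]

# Hypothetical allocation: variables 1-3 load on factor 1, variables 4-6 on
# factor 3, and variable 7 loads on no factor.
allocation = [1, 1, 1, 3, 3, 3, 0]
delta = indicator_matrix(allocation, n_factors=3)

# Each variable loads on at most one factor, so every row sums to 0 or 1.
row_sums = [sum(row) for row in delta]

# Factors with an all-zero column are inactive, which reduces the effective
# number of latent factors.
active_factors = [k + 1 for k in range(3) if any(row[k] for row in delta)]

for row in delta:
    print(row)
print("row sums:", row_sums)
print("active factors:", active_factors)
```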

The number of latent factors K is determined according to the Ledermann boundary [33]. However, during MCMC sampling, random search on the factor loading matrix can produce 0 columns, thus reducing the number of latent factors. The number of MCMC iterations is 27500. The burn-in period of the MCMC sampler is 2500. Hence, the number of MCMC iterations saved for posterior inference (after burn-in) is 25000.
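As a rough illustration, the Ledermann bound can be computed in one common form, namely the largest integer K satisfying (M − K)² ≥ M + K, where M is the number of observed variables. Applying it to M = 24 (six scales with four observed variables each) is an assumption made here for illustration; software implementing BEFA may impose additional identification constraints that lower the maximum further.

```python
import math

def ledermann_bound(n_variables):
    """Largest integer K such that (M - K)^2 >= M + K, with M = n_variables."""
    m = n_variables
    return math.floor((2 * m + 1 - math.sqrt(8 * m + 1)) / 2)

M = 24                      # assumed: six scales with four observed variables each
K_max = ledermann_bound(M)

# Bookkeeping of the MCMC run length stated in the text.
n_iter, burn_in = 27500, 2500
kept_draws = n_iter - burn_in

print(K_max, kept_draws)
```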

Multivariate Regression Analysis (OLS). We used multivariate regression analysis based on the least squares method (OLS) to evaluate the factors affecting students’ online learning outcomes and test the hypotheses. The specific model is as follows:

$$Y_i = \beta_0 + \sum_{k=1}^{K} \beta_k F_{ik} + u_i,$$

where $Y_i$ denotes students’ online learning outcomes for individual i, i = 1, 2, …, N; $F_{ik}$ denotes the k-th factor obtained from BEFA, calculated as the average of the observed variables allocated to that factor; $u_i$ denotes the error term; and $\beta = (\beta_0, \beta_1, \ldots, \beta_K)$ denotes the vector of coefficients in the model.
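A minimal sketch of the least squares estimation is given below. It solves the normal equations for a hypothetical data set with only two factor scores (labelled EOU and PU after the scales in this study), whereas the model in this study contains six factors. The data are constructed without noise so that the estimated coefficients can be checked against the values used to generate them; none of the numbers come from the survey.

```python
def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y.

    X is a list of rows, each starting with 1.0 for the intercept.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical factor scores (means of Likert items) for 6 respondents.
eou = [3.0, 4.25, 2.5, 5.0, 3.75, 4.0]
pu = [3.5, 4.0, 3.0, 4.5, 2.75, 4.75]
# Hypothetical outcome generated without noise: SP = 0.5 + 0.3*EOU + 0.4*PU.
sp = [0.5 + 0.3 * a + 0.4 * b for a, b in zip(eou, pu)]

X = [[1.0, a, b] for a, b in zip(eou, pu)]
beta = ols(X, sp)
print([round(b, 6) for b in beta])
```

With real survey data the residuals are not zero, and the coefficients are accompanied by standard errors and significance tests.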

3.3. Sampling and Data Collection

The sample was selected by the convenience sampling method. According to [34], the sample size needs to be considered in relation to the number of parameter estimates, and if the maximum likelihood (ML) method is used, the sample size must be at least 100 to 150. Besides, the study in [35] suggested that the ratio required for sample design is a minimum of 5 observations per parameter estimate (5 : 1 ratio). This study has a total of 28 parameter estimates, so the minimum sample size must reach 28 × 5 = 140 observations. According to [36], in practical research applications, a sample size of 150 or larger is often needed to obtain parameter estimates with sufficiently small standard errors. Thus, a sample size larger than 150 is acceptable.

Besides, the study in [37] developed an equation that yields a representative sample for a large population. Since the student population in Ho Chi Minh City is large, we use the equation developed in [37] as follows:

$$n = \frac{Z^2\, p\,(1 - p)}{e^2},$$

where n is the sample size, Z is the abscissa of the normal curve that cuts off an area α at the tails (1 − α equals the desired confidence level, e.g., 95%), e is the desired level of precision, and p is the estimated proportion of an attribute present in the population. The value of Z is found in statistical tables that contain the area under the normal curve. In this study, we chose the 95% confidence level, so Z = 1.96. The estimated proportion was chosen to be p = 0.5, and the desired level of precision was chosen to be e = 5%. Therefore, the minimum sample size in this study was

$$n = \frac{1.96^2 \times 0.5 \times (1 - 0.5)}{0.05^2} \approx 384.16,$$

that is, at least 385 students after rounding up.
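The two sample-size rules discussed in this section can be checked with a few lines of arithmetic. The function name is chosen for illustration; the inputs (Z = 1.96, p = 0.5, e = 0.05, 28 parameter estimates, 5 observations per estimate) are the values stated in the text.

```python
import math

def min_sample_size(z, p, e):
    """Sample size for a large population: n = Z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / e ** 2

n_formula = min_sample_size(z=1.96, p=0.5, e=0.05)
n_required = math.ceil(n_formula)        # round up to whole respondents

# Rule of thumb of 5 observations per parameter estimate.
n_rule_of_thumb = 28 * 5

print(round(n_formula, 2), n_required, n_rule_of_thumb)
```

The 404 valid questionnaires used in the official research exceed both thresholds.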

In practice, we surveyed 430 university students in Ho Chi Minh City through both face-to-face interviews using QR-coded questionnaires and indirect interviews with questionnaires sent via e-mail. The survey was conducted from February 2021 to June 2021. During this period, 415 questionnaires were returned, giving a response rate of 96.51%. We then removed 11 questionnaires because of missing response information. Finally, we used 404 questionnaires for the official research. Details of the sample are shown in Table 2.

The demographics showed that 47% of respondents were male. 20.5% of total respondents were first-year students, 27% of total respondents were second-year students, 26.7% of total respondents were third-year students, and 25.7% of total respondents were fourth-year students.

4. Empirical Results

4.1. Correlation Matrix

First, we take a look at the correlations among observed variables to determine if factor analysis is appropriate.

From Figure 3, we can see that most items have some correlation with each other. This would be a good candidate for factor analysis due to the relatively high correlations among items. We should remember that the goal of factor analysis is to model the interdependence of items using fewer (latent) variables. These interrelationships can be divided into several components.

4.2. Reliability Test

Cronbach’s Alpha was calculated to test the reliability of the scales. Cronbach’s Alpha measures the internal consistency of the observed variables on the same scale. Scales with Cronbach’s Alpha greater than 0.6 are satisfactory. In addition, each observed variable must have a corrected item-total correlation greater than 0.3. The reliability test results of the scales are shown in Table 3.

The results showed that all scales and their observed variables achieved the reliability values and were further analyzed for exploratory factors.

4.3. Bayesian Exploratory Factor Analysis (BEFA)

In this study, the number of MCMC iterations is 27500. The burn-in period of the MCMC sampler is 2500. Hence, the number of MCMC iterations saved for posterior inference (after burn-in) is 25000.

First, the results of the BEFA method for observed variables that represent ease of use, perceived usefulness, faculty capacity, course content, course design, and learner characteristics are shown and explained in Figure 4 and Table 4.

The trace plot in Figure 4, based on the 25000 MCMC draws retained after burn-in, shows that the posterior mean of the number of factors is 6. Besides, the posterior distribution also indicates that the probability that BEFA extracts 6 factors is 100%.

The allocation of observed variables to each factor is shown in Table 4. The results show that the posterior mean of the factor loading coefficient of each observed variable has a value greater than 0.5. Figure 5 shows a visualization of the allocation of observed variables to each factor.

Hence, BEFA extracted 6 factors, and the observed variables in each factor had a factor loading coefficient greater than 0.5. The specific factors are as follows:
(i) The first factor includes observed variables EOU1, EOU2, EOU3, and EOU4 representing ease of use. Name this factor as EOU, and calculate it as the mean of the component observed variables.
(ii) The second factor includes observed variables PU1, PU2, PU3, and PU4 representing perceived usefulness. Name this factor as PU, and calculate it as the mean of the component observed variables.
(iii) The third factor includes observed variables FC1, FC2, FC3, and FC4 representing faculty capacity. Name this factor as FC, and calculate it as the mean of the component observed variables.
(iv) The fourth factor includes observed variables CC1, CC2, CC3, and CC4 representing course content. Name this factor as CC, and calculate it as the mean of the component observed variables.
(v) The fifth factor includes observed variables CD1, CD2, CD3, and CD4 representing course design. Name this factor as CD, and calculate it as the mean of the component observed variables.
(vi) The sixth factor includes observed variables LC1, LC2, LC3, and LC4 representing learner characteristics. Name this factor as LC, and calculate it as the mean of the component observed variables.
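The construction of each factor as the mean of its component observed variables can be sketched as follows. The responses belong to a single hypothetical respondent and only two of the six factors are shown; the item codes follow the naming used above, and the function name is chosen for illustration.

```python
from statistics import mean

# Hypothetical 5-point Likert answers of one respondent.
respondent = {
    "EOU1": 4, "EOU2": 5, "EOU3": 4, "EOU4": 3,
    "PU1": 5, "PU2": 4, "PU3": 4, "PU4": 5,
}

def composite_scores(answers, factors, n_items=4):
    """Mean of the items named <factor>1 .. <factor>n_items for each factor."""
    return {
        factor: mean(answers[f"{factor}{j}"] for j in range(1, n_items + 1))
        for factor in factors
    }

scores = composite_scores(respondent, ["EOU", "PU"])
print(scores)
```

Repeating this for every respondent and every factor yields the variables EOU, PU, FC, CC, CD, and LC that enter the regression.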

Second, we use the traditional EFA method for observed variables representing students’ online learning outcomes. This method is used because these observed variables only measure one factor, students’ online learning outcomes. The results are explained in Figure 6 and Table 5.

The scree plot in Figure 6 shows that one factor is retained, with an eigenvalue of 2.710, which is greater than 1. Besides, Table 5 shows that the KMO coefficient of 0.802 lies between 0.5 and 1, indicating that the EFA method is appropriate for the actual data. Bartlett’s test shows that the observed variables are correlated with one another within the factor. This factor includes observed variables SP1, SP2, SP3, and SP4 representing students’ online learning outcomes. Name this factor as SP, and calculate it as the mean of the component observed variables.

4.4. Multivariate Regression Analysis (OLS)

We used multivariate regression analysis based on the least squares method (OLS) to evaluate the factors affecting students’ online learning outcomes and test the hypotheses. The results are shown in Table 6.

Table 6 shows that the model does not have multicollinearity because the corresponding VIF values for the independent variables in the model are less than 5 [38]. Besides, the Durbin–Watson d has a value of 2.020, which is close to 2, so the model does not have autocorrelation. Finally, the Breusch–Pagan/Cook–Weisberg test has a p-value of 0.1709, which is greater than the 5% significance level, so the model does not have heteroskedasticity.
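Two of these diagnostics are simple enough to sketch directly. The Durbin–Watson statistic is the sum of squared successive differences of the residuals divided by their sum of squares, and for a model with only two predictors the VIF reduces to 1 / (1 − r²), where r is the correlation between the two predictors. The residuals and predictor values below are hypothetical and unrelated to Table 6; with six predictors, each VIF would instead use the R² from regressing one predictor on all the others.

```python
from statistics import mean

def durbin_watson(residuals):
    """Durbin-Watson d: values near 2 indicate no first-order autocorrelation."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF in the special case of exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical regression residuals and predictor values.
residuals = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3]
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]

d = durbin_watson(residuals)
vif = vif_two_predictors(x1, x2)
print(round(d, 4), round(vif, 4), vif < 5)
```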

Table 6 also shows that the regression coefficients of the variables EOU, PU, FC, CC, CD, and LC are all statistically significant at the 5% significance level. Thus, the variables EOU, PU, FC, CC, CD, and LC all have an impact on the dependent variable SP. In other words, ease of use, perceived usefulness, faculty capacity, course content, course design, and learner characteristics affect students’ online learning outcomes. In addition, the regression coefficients of these variables are all positive. These results support hypotheses H1, H2, H3, H4, H5, and H6.

Finally, the standardized coefficients in Table 6 show that the order of impact of these factors on students’ online learning outcomes from strong to weak is as follows: learner characteristics, perceived usefulness, course content, course design, ease of use, and faculty capacity. The impact of each factor on students’ online learning outcomes is shown in Figure 7.
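The reason for ranking by standardized rather than unstandardized coefficients can be sketched with a small example. A standardized coefficient rescales the unstandardized one by the ratio of the predictor's standard deviation to that of the outcome, so that predictors measured with different spreads become comparable. All numbers and the labels A and B below are hypothetical and are not taken from Table 6.

```python
def standardized_coefficients(b, sd_x, sd_y):
    """beta_k = b_k * sd(x_k) / sd(y) for every predictor k."""
    return {name: b[name] * sd_x[name] / sd_y for name in b}

# Hypothetical unstandardized coefficients and standard deviations.
b = {"A": 0.20, "B": 0.50}
sd_x = {"A": 2.0, "B": 0.5}
sd_y = 1.0

beta = standardized_coefficients(b, sd_x, sd_y)
# Rank predictors from strongest to weakest standardized impact.
ranking = sorted(beta, key=lambda name: abs(beta[name]), reverse=True)
print(beta, ranking)
```

In this example predictor B has the larger unstandardized coefficient, but predictor A has the stronger standardized impact because it varies more across respondents.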

5. Conclusion and Policy Implications

The official research was conducted using quantitative research methods with 404 respondents who are students in Ho Chi Minh City, selected by the convenience sampling method and surveyed with detailed questionnaires. The study applied reliability analysis through Cronbach’s Alpha and the BEFA method. Our empirical results showed that students’ outcomes during the online learning process are affected by 6 factors, listed in descending order of impact: learner characteristics, perceived usefulness, course content, course design, ease of use, and faculty capacity (see Table 7). This result is similar to that of the studies in [2, 13, 14, 28, 31].

The study helped educators, lecturers, and students understand the importance of factors affecting students’ outcomes during the online learning process, thereby forming policies that focus on organizing, designing, and conducting online courses in particular and higher education in general. First, for students’ online learning to be successful, the university must hold training sessions to improve students’ initiative, encourage students to actively interact with lecturers and classmates, and improve students’ self-study ability. Besides, through training sessions, schools need to help students realize the usefulness of online learning, especially in the context of the COVID-19 pandemic. The online learning system should be built with a friendly and easy-to-use interface and diverse learning programs through the E-learning system, should improve system accessibility, should allow students to actively register, and should be flexible about the time to use.

Although this study accomplished its original goal, it does have some limitations. To begin with, because the study was conducted on a small scale, its generalizability may be limited. Second, the study focuses primarily on factors related to the online learning system, but it does not assess factors outside the system, such as the school’s incentive policy, communication quality, student support, and family circumstances. These limitations should be addressed in future research.

Data Availability

The primary data used to support the findings of this study have been deposited in the GitHub repository (https://github.com/anhle32/A-BAYESIAN-EXPLORATORY-FACTOR-ANALYSIS.git).

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

Dr. Hoang Anh Le conceived the idea and wrote the Introduction, Literature Review, and Research Methodology sections. Thi Tinh Thuong Pham, MSc., wrote the Empirical Results and the Conclusion and Policy Implications sections. Dr. Doan Trang Do wrote the Research Methodology section.